Title: Data Manipulation
1Data Manipulation
- "Computer Science is no more about computers than
astronomy is about telescopes." -- E. W. Dijkstra
2Inside a typical PC
3Typical PC Subsystems
- Processing: the "thinking" parts of your PC that process data and pass them back and forth as needed
- CPU, RAM, cache memory, along with other subsystems
- Input/Output (I/O)
- Monitor, video interface, printer, modem, etc., usually external to the machine
- Storage: holds all the data for easy access
- Hard drive, floppy, CD-ROM drive, Zip disk and other removable media
4Computer Architecture
- Typical components of a computer system
- Central Processing Unit (CPU)
- Primary Storage (Main Memory)
- Secondary Storage
- I/O Devices
- System Bus
5Components of CPU
6CPU Components
- ALU: Arithmetic Logic Unit
- Circuits for arithmetic and logic data manipulation
- CU: Control Unit
- Circuits for coordinating and controlling activities
- Registers
- General-purpose registers: store intermediate results temporarily for the CPU's use
- Special-purpose registers: store the current instruction and the program counter
7Primary Storage
- Known as Main Memory
- Sequence of contiguous, adjacent cells
- Each cell has a unique address corresponding to its physical location in the sequence
- Typically random-access (RAM)
- Random access (or direct access) allows the CPU to read or write a specific address location in memory
8Main Memory
- Storage area for bits
- Bits are grouped into (typically 8-bit) chunks called bytes
- Bytes are grouped into (typically 4-byte) chunks called words
- Each word has a memory address
- Memory capacity is measured in bytes, for example, 256 megabytes of RAM
9Main Memory Word Size
- Word is loosely defined as the amount of data that the CPU processes at one time
- Word size usually matches the size of the CPU registers
- Typically, word size is 32 bits in modern computers
- Word size is a fundamental CPU design decision that has implications for the design of other components, like the bus width
10Processor Registers
- Temporary storage areas in CPU
- Two categories
- General purpose
- Special purpose
- Total number of registers in a CPU varies by design
- Each register is referenced by a unique register number
- For example, in 4-bit notation, register numbers would range from 0000 (base 2) to 1111 (base 2), a total of 16 registers
- For purposes of notation, register numbers and memory addresses are usually expressed in hex
- So, 16 registers in hex would be numbered 0 (base 16) through F (base 16)
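The register numbering above can be sketched in a few lines of Python. This is only an illustration: the function name `register_numbers` is made up for this example, and the 4-bit field width is the one used on the slide.

```python
# List every register number that fits in a 4-bit field,
# written both in binary and in hexadecimal.
REGISTER_BITS = 4  # width of the register-number field (from the slide)


def register_numbers(bits):
    """Return (binary, hex) label pairs for all registers addressable with `bits` bits."""
    count = 2 ** bits
    return [(format(n, f"0{bits}b"), format(n, "X")) for n in range(count)]


labels = register_numbers(REGISTER_BITS)
for binary, hexa in labels:
    print(f"{binary} (base 2) = {hexa} (base 16)")
```

Changing `REGISTER_BITS` shows how quickly the register count grows with the width of the field.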
11General purpose registers
- Used in a variety of ways by programs
- Hold data, intermediate results and other values that are frequently accessed during execution, rather than fetching them from main memory every time
- More general-purpose registers can increase the execution speed of a program
- Unfortunately, registers are very expensive, so we need to establish a trade-off
12Special purpose registers
- Content and use are specified as part of CPU design
- Two typical special-purpose registers are
- The Instruction Register
- Used by the Control Unit to hold the instruction just loaded from memory, before decode and execute
- The Program Counter (or instruction pointer)
- Pointer to the next instruction to be executed by the CPU
13System Bus
- A set of parallel communication lines connecting the CPU with main memory and other system components
- Dedicated buses: control bus, address bus, data bus
14Summary CPU Architecture
15Secondary Storage
- Before we talk about Machine Language and Machine Instructions, let us take a brief look at Secondary or Mass Storage devices.
- (Refer to Chapter 1, Section 1.3)
16Memory Organization
- Memory hierarchy in modern computers
17Memory and Storage
- Basic Principles
- Make the common case fast
- Temporal and spatial locality
- Secondary storage contains all the data and instructions not currently needed by the CPU
- It should be easy to access, should hold data indefinitely, and should be both readable and writable
18Basic Memory Organization
- Registers (CPU)
- General-purpose, Special-purpose
- High-speed, quick response times
- Cache Memory
- Often located within CPU
- High-speed response time similar to CPU registers
- Holds a copy of the portion of Main Memory in current use
- Main Memory (RAM)
- Collection of adjacent cells
- Accessed via system bus by CPU
- Secondary Memory, I/O devices
19Main Memory
- Memory cells are usually byte-sized and arranged by their addresses, which makes random access (RAM) easy
- If you know the cell address, you can get the contents of that cell
20Byte-size memory cell
There is no physical demarcation for cell boundaries. Chunks of 8 bits form an imaginary grouping for our convenience. Data stored in a memory cell is in the form of 0s and 1s, which we envision as arranged in a row, left to right.
Figure 1.7, Page 27: The organization of a byte-size memory cell
21RAM, the live memory
- RAM comes in a few different forms based on chip layout (hardware design)
- DRAM: Dynamic RAM, soldered to the motherboard
- SIMM: Single Inline Memory Module, mounted 32-bit RAM chips that plug into a special socket
- DIMM: Dual Inline Memory Module, similar to SIMM but has more pins and allows for a wider (64-bit) data path
22Memory Capacity Units
- One byte = 8 bits
- 2^10 bytes = 1024 bytes = 1 kilobyte
- 2^20 bytes = 1,048,576 bytes = 1 megabyte
- 2^30 bytes = 1 gigabyte
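As a quick check of these units, the short Python sketch below computes them as powers of two and converts the 256-megabyte RAM size mentioned earlier into bytes and bits. The helper name `ram_size` is hypothetical and exists only for this illustration.

```python
# Capacity units as powers of two (binary interpretation used on the slide).
BITS_PER_BYTE = 8
KILOBYTE = 2 ** 10
MEGABYTE = 2 ** 20
GIGABYTE = 2 ** 30


def ram_size(megabytes):
    """Return (bytes, bits) for a RAM size given in megabytes."""
    size_bytes = megabytes * MEGABYTE
    return size_bytes, size_bytes * BITS_PER_BYTE


ram_bytes, ram_bits = ram_size(256)
print(f"256 MB = {ram_bytes:,} bytes = {ram_bits:,} bits")
```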
23Secondary Storage Devices
- On-line storage
- Devices that are connected and readily available to the machine without human intervention
- Off-line storage
- Devices that need human intervention to make them available to the machine
- Mass Storage Devices
- can be on-line or off-line devices
- with direct access or sequential access
- Examples
- Magnetic Disks (direct access)
- Compact Disks (direct access)
- Magnetic Tapes (sequential access)
24Magnetic Disks Hard Drive
25Hard Disk Drive
- Refers to magnetic disks with hard metal substrates (bases)
- Disk or Platter size is about 14 in diameter
- Several platters mounted on disk drive devices
- Data is recorded in concentric tracks
- Read/Write heads do the actual data access
- Capacity ranges from 50 MB to 1.5 GB
- Direct access device
- Performance evaluated in terms of access time and
transfer rate
26Access Time of Hard Drives
- Access time refers to the average amount of time (in milliseconds) taken by the hard drive to locate a particular piece of data on the hard disk.
- Older drives had access times of about 50 ms, whereas newer ones are around 10 ms.
- CD-ROM access times are usually in the hundreds of ms
- Transfer rate refers to the rate at which data can be transferred to RAM once it has been located by the hard drive.
27Disk Storage System
Figure 1.9, Page 31: A disk storage system
- Access time = seek time + rotational delay
- Seek time: time required to position the read/write head over the correct track
- Rotational delay: time required to rotate the platter to position the correct sector underneath the read/write head
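A small worked example may help with the two components of access time. The sketch below assumes that, on average, the wanted sector is half a revolution away from the head. The 6 ms seek time and the 6000 rpm rotation speed are made-up figures, not values from the slides.

```python
def average_rotational_delay_ms(rpm):
    """Average rotational delay: half of one full revolution, in milliseconds."""
    ms_per_revolution = 60_000 / rpm
    return ms_per_revolution / 2


def access_time_ms(seek_ms, rpm):
    """Access time = seek time + (average) rotational delay."""
    return seek_ms + average_rotational_delay_ms(rpm)


# Hypothetical drive: 6 ms average seek, platters spinning at 6000 rpm.
example = access_time_ms(seek_ms=6.0, rpm=6000)
print(f"Estimated access time: {example} ms")
```

With these assumed figures one revolution takes 10 ms, so the average rotational delay is 5 ms and the estimated access time is 11 ms.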
28Magnetic Tape Storage
Figure 1.11, Page 33: A magnetic tape storage mechanism
- Like an audio tape, this is a sequential access device
- Data must be accessed in the physical order in which they were stored on the tape
- Very slow, but appropriate and cost-effective for archival copies for low-demand applications
- Capacity measured in terms of recording density, in bytes per inch (Bpi)
- Maximum recording density can range as high as 36,000 Bpi
29Compact Disks
- Like LPs, but spiraling inside to outside
- Has tracks and sectors just like magnetic disks
- Laser beam reads the data, just like the read/write head does in magnetic disks
- Typical capacity: 600 MB to 700 MB
30Logical vs. Physical Records
Figure 1.12: Logical records versus physical records on a disk
- File Storage/Retrieval: if each of you writes your first name, last name, and favorite sports/games on separate sheets of paper, how can we identify all pieces of info about each one of you?
- Physical records are on Paper1, Paper2 and Paper3
- How can I get the logical records?
- First name, last name, favorite sports/games per student
- Logical record size need not match the physical record size of the disk
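To see why logical and physical record sizes need not match, here is a minimal Python sketch. It assumes a hypothetical 16-byte physical record (sector), newline-terminated logical records, and invented student data. Real disks use much larger sectors and file systems use more elaborate layouts.

```python
PHYSICAL_RECORD_SIZE = 16  # bytes per physical record; hypothetical, kept small for readability


def to_physical(logical_records, size=PHYSICAL_RECORD_SIZE):
    """Join newline-terminated logical records and cut the stream into fixed-size blocks."""
    stream = b"".join(record.encode("ascii") + b"\n" for record in logical_records)
    return [stream[i:i + size] for i in range(0, len(stream), size)]


def to_logical(physical_records):
    """Reassemble the byte stream and split it back at the record terminators."""
    stream = b"".join(physical_records)
    return [line.decode("ascii") for line in stream.split(b"\n") if line]


# Invented logical records: first name, last name, favorite sport.
students = ["Ana,Lee,soccer", "Raj,Patel,chess", "Mei,Chen,tennis"]
blocks = to_physical(students)
recovered = to_logical(blocks)

for block in blocks:
    print(block)      # a logical record may straddle two physical records
print(recovered)
```

Printing the blocks shows a logical record spilling from one physical record into the next, which is the situation the figure describes.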
31Summary Memory Hierarchy
32Storage Options at a Glance
33Summary Memory Access
- Memory of a computer stores data and programs for processing
- Memory Addressing
- Organizing memory as a sequence of addressable cells, each with a unique address
- Memory Allocation
- Assignment of specific memory addresses to data and other elements of system software
34Back to Data Manipulation
- We'll try to understand certain terminology we
will be using from here on
35Clock rate
- If the CPU is the heart of a computer system, then the clock rate is like its heartbeat.
- The system clock generates timing pulses (clock ticks) and sends them over a separate line on the control bus to all other components of the computer
- CPU actions are timed according to this system clock
- The CPU is designed to do several actions in one tick of this system clock
- Typical CPU actions are "fetch instruction or data" and "execute instruction"
36Clock Rate Megahertz (MHz)
- Definition: the number of cycles per second is known as the frequency, expressed in units of Hertz, after the physicist Hertz
- Frequency of the system clock is measured in millions of cycles per second, or Megahertz (MHz for short).
- Typical desktops have a clock speed of a few hundred MHz
- Typically, CPU performance is measured in terms of Millions of Instructions per Second (MIPS)
37Measuring CPU Performance
- So, is a 500 MHz machine better than a 200 MHz machine?
- Cannot compare unless we specify that they have the same CPU architecture design
- Total number of clock cycles needed to complete an instruction differs between CPU designs, so clock speed alone does not help in comparing performance
- A meaningful comparison is via benchmarking, that is, comparing how different machines execute the same program
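The point about clock speed can be illustrated with the usual estimate: execution time = instruction count x cycles per instruction / clock rate. The numbers below are invented for the sketch. The cycles-per-instruction figures are not measurements of any real CPU, and the sketch also assumes, for simplicity, that both machines need the same number of instructions.

```python
def execution_time_s(instruction_count, cycles_per_instruction, clock_hz):
    """Estimated run time in seconds for a program on a given CPU design."""
    return instruction_count * cycles_per_instruction / clock_hz


PROGRAM_LENGTH = 1_000_000  # instructions executed; hypothetical

# Hypothetical design A: 500 MHz clock, but 5 cycles per instruction on average.
time_500mhz = execution_time_s(PROGRAM_LENGTH, 5, 500e6)
# Hypothetical design B: 200 MHz clock, 1.5 cycles per instruction on average.
time_200mhz = execution_time_s(PROGRAM_LENGTH, 1.5, 200e6)

print(f"500 MHz design: {time_500mhz * 1000:.2f} ms")
print(f"200 MHz design: {time_200mhz * 1000:.2f} ms")
```

Under these assumed figures the 200 MHz design finishes first, which is why benchmarking the same program is more informative than comparing clock rates.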
38Stored Program Concept
- Early computers were designed to perform a single task at a time and needed to be re-wired for different tasks
- Code breaking
- Artillery ballistics
- The instructions were hard-wired
- More like a music box that plays the same music every time
- To be more useful, they needed the flexibility of a CD-changer
39Program vs. Data
- Flexibility in early computers was achieved by realizing that programs (i.e., instructions to manipulate data) can be treated the same way as data
- Programs can be encoded and stored in memory, just like data
- Rather than fetching just data and executing the hard-wired instructions, store both instructions and data and fetch them both as needed to execute the task
40Von Neumann Machines
- Though credited to von Neumann, the stored-program concept was apparently developed by researchers led by J. P. Eckert at the Moore School of Electrical Engineering, UPenn.
- Essential features
- Single general-purpose processor
- Stored programs
- Sequential processing of instructions
- Alternating instruction and execution cycle
41Back to Data Manipulation
- A closer look at what happens inside a machine
42Machine Language Objectives
- Machine Language
- Instruction Set architecture
- RISC, CISC
- Op-code, Operands
- Machine Cycle
- Von Neumann Machine Primitives
- Data-Copying (Transfer) operations
- Data Transformation operations
- Sequence Control
43Machine Language
- Instruction set is the basic set of operations that a computer can perform
- Processor hardware implements a pre-defined instruction set
- Any operation not defined in the processor instruction set must be implemented in software as a sequence of these pre-defined processor instructions
- Fundamental design trade-off: size and complexity of the instruction set
44Basic Instructions
- Data Transfer
- Copy data from one location in memory to another
- Store data in a specified location in memory
- Examples: Load, Store, Move
- Involves registers, main memory, system bus
- Data Transformation
- Perform arithmetic/logic operations on data
- Examples: AND, OR, XOR, ADD, SHIFT, ROTATE
- Involves ALU, registers
- Sequence Control
- Change the order in which instructions are executed
- Don't really deal with data manipulation directly, but rather with instruction execution
- Examples: Branch (JUMP), Halt
- Involves special-purpose registers, control unit
45Machine Instructions CISC
- CISC: Complex Instruction Set Computing
- Early machines had limited memory as memory was very expensive, so programs had to be kept short
- One response to memory issues: design a complex set of instructions that combine several simpler instructions
- Example, floating-point arithmetic: rather than handling whole and fraction parts separately and combining the result using a sequence of simple operations, implement a complex operation
- Such implementations might tend to be slow, depending on their design
- Intel's Pentium series of processors use CISC
46Machine Instructions RISC
- RISC: Reduced Instruction Set Computing
- No complex instructions
- Fixed-length instructions that are shorter than CISC
- Reduced circuitry, compared to CISC, means lower cycle times, therefore higher speeds
- PowerPC series of processors use RISC
- Uses registers for data transformations rather than relying on main memory
47Trade-offs RISC vs. CISC
- CISC
- Typically, combines data copying and data transformation operations in the same instruction
- Complex instructions implemented as a microprogram
- Example: Add(addr1, addr2, addr3)
- Simply means
- Get contents from memory addr1
- Get contents from memory addr2
- Add the two contents
- Store the result in memory addr3
- One instruction does it all!
- Advantage
- Might use only one processor cycle to complete executing the above instruction
- Other similar processor cycle savings possible in other instructions
- Saves memory space
- RISC
- Separate data copying and data transformation instructions
- Simple instructions implemented in hardware
- Same CISC example
- LOAD addr1
- LOAD addr2
- ADD contents of addr1 and addr2
- STORE result in addr3
- FOUR simple instructions to get the same job done
- Advantage
- Simple circuitry means a smaller physical machine size
- Lower cycle times mean higher speeds
- Uses registers, which have faster response time than accessing main memory via the bus
48Hypothetical Question
- Question: which of the two CPU instruction set designs (RISC or CISC) would take more time to finish the same task, that of adding two numbers and storing the result?
- Assume
- a RISC processor cycle takes 1 second to complete each one of its instructions
- a CISC processor also takes 1 second to complete each one of its instructions
- one CISC instruction, as in the example we just saw, takes one processor cycle to complete
- viz., Add(addr1, addr2, addr3)
- each one of the RISC instructions, as in the example we just saw in the previous slide, takes one processor cycle to complete
- viz., LOAD addr1, LOAD addr2, ADD (addr1, addr2), STORE addr3
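Under exactly the assumptions stated in the question (one second per instruction on both designs), the comparison reduces to counting instructions. The Python sketch below does only that. The instruction strings are copied from the slides, and `run_time` is a hypothetical helper name.

```python
SECONDS_PER_INSTRUCTION = 1  # assumption stated in the question, for both designs

cisc_program = ["Add(addr1, addr2, addr3)"]
risc_program = ["LOAD addr1", "LOAD addr2", "ADD (addr1, addr2)", "STORE addr3"]


def run_time(program, seconds_per_instruction=SECONDS_PER_INSTRUCTION):
    """Total time when every instruction takes the same fixed time."""
    return len(program) * seconds_per_instruction


cisc_time = run_time(cisc_program)
risc_time = run_time(risc_program)
print(f"CISC: {cisc_time} s, RISC: {risc_time} s")
```

The result depends entirely on the equal-cycle-time assumption. If the simpler design has a shorter cycle, the balance can shift.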
49Summary Instruction Set Architecture
- Cannot compare performance without knowing the specific CPU instruction set
- CISC uses less memory than RISC but involves complex circuitry
- RISC uses simpler circuitry than CISC but more memory
- Is program execution faster with complex instructions but a relatively slow processor?
- Or, is program execution faster with simpler instructions and a faster processor?
50Machine Instruction Format
- Just like data and programs, an instruction is represented as a string of bits (0s and 1s)
- Different computers use different formats to specify machine instructions
- Each instruction consists of two parts
- Op-code, short for operation code
- Operands
- Op-code indicates which operation to perform, like AND, XOR, STORE, LOAD, SHIFT
- Operand indicates all supplementary details needed for executing the op-code, like the data location or data value
51Machine Cycle Fetch-Decode-Execute
- Machine Cycle is a three-step process that goes on and on until instructed to stop
- Fetch
- Decode
- Execute
52Machine Cycle Fetch-Decode
- Fetch step
- CU asks main memory for the instruction at the address held in the Program Counter, and increments the Program Counter
- CU puts the current instruction in the Instruction Register
- Decode step
- CU separates the instruction in the Instruction Register into op-code and operands
- CU internally signals ALU to start executing
53Machine Cycle Execute
- Execute step
- CU activates the appropriate logic circuitry of the ALU to execute the instruction
- Data passes through the circuitry and produces an output (result) for that instruction
- The result is placed in a register, or CU writes it to main memory
- CU again starts the Fetch step!
- Remember that CU already incremented the Program Counter
54Machine cycle Flow of control
55Brookshear Machine
- 16 general-purpose registers
- numbered 0 through 15
- Hex notation: numbered 0 through F
- 256 byte-size main memory cells (i.e., 8 bits each)
- numbered 0 through 255
- Hex notation: numbered 00 through FF
- 12 simple instructions
- encoded using 16 bits (2 bytes) per instruction
- Hex notation: 4 hex digits per instruction
- One hex digit for op-code
- Other three hex digits for operands
56Test Your Knowledge
- The Brookshear machine instruction provides 4 bits to identify a general-purpose register
- How many registers can the machine have?
- How many bits in a Brookshear byte?
- The Brookshear machine instruction provides for a byte-size memory address
- How many memory locations are addressable directly?
57Test Your Knowledge Answers
- 4-bit register addresses
- Maximum number of combinations is
- 2 x 2 x 2 x 2 = 2^4 = 16 possible registers
- 8 bits in a Brookshear byte
- 1-byte memory address
- Brookshear byte is 8 bits
- Maximum number of combinations is
- 2^8 = 256 possible memory locations
58Instruction format for Brookshear machine
59Brookshear Instruction Example
Actual 16-bit pattern per instruction: 0011 0101 1010 0111
Hex form (4 bits per hex digit): 3 5 A 7
Op-code: 3
Operand: 5 A 7
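Splitting a 16-bit instruction into its four hex digits can be sketched with shifts and masks. The function name `decode` is invented for this illustration. It assumes the layout described earlier, with the first hex digit as the op-code and the remaining three as operands.

```python
def decode(instruction):
    """Split a 16-bit instruction into its op-code and three operand hex digits."""
    if not 0 <= instruction <= 0xFFFF:
        raise ValueError("instruction must fit in 16 bits")
    opcode = (instruction >> 12) & 0xF          # leftmost 4 bits
    operands = (
        (instruction >> 8) & 0xF,               # second hex digit
        (instruction >> 4) & 0xF,               # third hex digit
        instruction & 0xF,                      # rightmost hex digit
    )
    return opcode, operands


pattern = 0b0011_0101_1010_0111                 # the bit pattern from the example
opcode, operands = decode(pattern)
print(f"{pattern:04X}: op-code {opcode:X}, operands "
      f"{operands[0]:X} {operands[1]:X} {operands[2]:X}")
```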
60Brookshear Machine Architecture
Figure 2.4, Page 79: The architecture of the
machine described in Appendix C
61Brookshear machine Appendix C
- Study Appendix C and do the following exercise (5 minutes)
- How many Data Copying/Transfer instructions are encoded for this machine?
- How many Data Transformation (i.e., Arithmetic/Logic) instructions are encoded for this machine?
- How many Control instructions are encoded for this machine?
- Complete the table by listing the Brookshear machine instructions in the appropriate columns
62Brookshear machine Appendix C
- Answers
- FOUR Data Copying/Transfer instructions are encoded for this machine
- SIX Data Transformation (i.e., Arithmetic/Logic) instructions are encoded for this machine
- TWO Control instructions are encoded for this machine
- Complete the table by listing the Brookshear machine instructions in the appropriate columns
Study the difference between LOAD1 and LOAD2, and between ADD1 and ADD2
63Exercise trace by hand
- Using Appendix C, trace the following machine instructions by hand and write the results, given the contents of main memory as in the table
- 156C
- 166D
- 5056
- 306E
- C000
Main memory
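To check a hand trace, a small simulator of the fetch-decode-execute cycle can help. The sketch below models only a subset of the Appendix C op-codes (1 and 2 LOAD, 3 STORE, 4 MOVE, 5 ADD, C HALT). The main memory table is not reproduced in this transcript, so the values 02 and 03 in cells 6C and 6D are hypothetical. Loading the program at address 00 is likewise an assumption. Check the op-code meanings against your copy of Appendix C.

```python
def run(program, data, start=0x00):
    """Run a subset of the Appendix C instruction set (op-codes 1, 2, 3, 4, 5, C)."""
    memory = bytearray(256)                      # 256 byte-size cells
    registers = [0] * 16                         # 16 general-purpose registers
    for address, value in data.items():
        memory[address] = value
    for i, word in enumerate(program):           # each instruction fills two cells
        memory[start + 2 * i] = word >> 8
        memory[start + 2 * i + 1] = word & 0xFF

    pc = start                                   # program counter
    while True:
        # Fetch: read two cells into the instruction register, then advance the PC.
        ir = (memory[pc] << 8) | memory[(pc + 1) % 256]
        pc = (pc + 2) % 256
        # Decode: one hex digit of op-code, three hex digits of operands.
        opcode, r, x, y = ir >> 12, (ir >> 8) & 0xF, (ir >> 4) & 0xF, ir & 0xF
        xy = (x << 4) | y
        # Execute.
        if opcode == 0x1:                        # LOAD register r from memory cell xy
            registers[r] = memory[xy]
        elif opcode == 0x2:                      # LOAD register r with the value xy
            registers[r] = xy
        elif opcode == 0x3:                      # STORE register r into memory cell xy
            memory[xy] = registers[r]
        elif opcode == 0x4:                      # MOVE register x to register y
            registers[y] = registers[x]
        elif opcode == 0x5:                      # ADD registers x and y into r (8-bit wrap)
            registers[r] = (registers[x] + registers[y]) & 0xFF
        elif opcode == 0xC:                      # HALT
            return registers, memory
        else:
            raise NotImplementedError(f"op-code {opcode:X} is not modelled in this sketch")


# Hypothetical memory contents: cell 6C holds 02, cell 6D holds 03.
registers, memory = run([0x156C, 0x166D, 0x5056, 0x306E, 0xC000],
                        {0x6C: 0x02, 0x6D: 0x03})
print(f"R5={registers[5]:02X} R6={registers[6]:02X} "
      f"R0={registers[0]:02X} cell 6E={memory[0x6E]:02X}")
```

With these assumed memory contents, the program loads the two cells into registers 5 and 6, adds them into register 0, and stores the sum in cell 6E. Substituting the values from the slide's own table gives the result to compare against the hand trace.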
64Program Execution Examples
- Finally you get to use a computer!!
- Boot up your machine and open a browser to the following address
- http://cs01.pcc.edu/eodekirk/cs160/bsmachine/machinegui.html
65Brookshear Machine Simulator
66Sample Program
67Program 1
68Communicating with other devices
Figure 2.13: Controllers attached to a machine's
bus