Title: WICS TP Chapter 1
1The Whirlwind Tour
Chapter 1a
2Transactions: Where It All Started
Cuneiform documents now number about half a
million, three-quarters of them more or less
directly related to the history of law - dealing,
as they do, with contracts, acknowledgment of
debts, receipts, inventories, and accounts, as
well as containing records and minutes of
judgments rendered in courts, business letters,
administrative and diplomatic correspondence,
laws, international treaties, and other official
transactions. The total evidence enables the
historian to reach back as far as the beginnings
of writing, to the dawn of history. ...
Moreover, because of the inconvenience of
writing in stone or clay, Mesopotamians wrote
only when economic or political necessity
demanded it. (Encyclopaedia Britannica, 1974
edition)
3From Transactions to Transaction Processing
Systems - I
The Sumerian way of doing business involved two
components:
- Database. An abstract system state, represented
as marks on clay tablets, was maintained. Today,
we would call this the database.
- Transactions. Scribes recorded state changes with
new records (clay tablets) in the database.
Today, we would call these state changes
transactions.
4From Transactions to Transaction Processing
Systems - II
The real state is represented by an abstraction,
called the database, and the transformation of
the real state is mirrored by the execution of a
program, called a transaction, that transforms
the database.
5Transactions Are In ...
Communications
- Each time you make a phone call, there is a
call setup transaction that allocates some
resources to your conversation; the call teardown
is a second transaction, freeing those resources.
The call setup increasingly involves complex
algorithms to find the callee (800 numbers could
be anywhere in the world) and to decide who is to
be billed (800 and 900 numbers have complex
billing). The system must deal with features like
call forwarding, call waiting, and voice mail.
After the call teardown, billing may involve many
phone companies.
6Transactions Are In ...
Finance
Each time you purchase gas using a credit card,
the point-of-sale terminal connects to the credit
card company's computer. In case that fails, it
may alternatively try to debit the amount to your
account by connecting to your bank. This
generalizes to all kinds of point-of-sale
terminals such as cash registers, ATMs,
etc. When banks balance their accounts with each
other (electronic fund transfer), they use
transactions for reliability and recoverability.
7Transactions Are In ...
Travel
Making reservations for a trip requires many
related bookings and ticket purchases from
airlines, hotels, rental car companies, and so
on. From the perspective of the customer, the
whole trip package is one purchase. From the
perspective of the multiple systems involved,
many transactions are executed: one per airline
reservation (at least), one for each hotel
reservation, one for each car rental, one for
each ticket to be printed, one for setting up the
bill, etc. Along the way, each inquiry that may
not have resulted in a reservation is a
transaction, too.
8Transactions Are In ...
Manufacturing
Order entry, job and inventory planning and
scheduling, accounting, and so on are classical
application areas of transaction processing.
Computer integrated manufacturing (CIM) is a key
technique for improving industrial productivity
and efficiency. Just-in-time inventory control,
automated warehouses, and robotic assembly lines
each require a reliable data storage system to
represent the factory state.
9Transactions Are In ...
Real-Time Systems
This application area includes all kinds of
physical machinery that needs to interact with
the real world, either as a sensor, or as an
actor. Traditionally, such systems were custom
made for each individual plant, starting from the
hardware. The usual reason for that was that 20
years ago off-the-shelf systems could not
guarantee real-time behavior that is critical in
these applications. This has changed, and so has
the feasibility of building entire systems from
scratch. Standard software is now used to ensure
that the application will be portable.
10A Transaction Processing System
A transaction processing system (TP-system)
provides tools to ease or automate application
programming, execution, and administration of
complex, distributed applications. Transaction
processing applications typically support a
network of devices that submit queries and
updates to the application. Based on these
inputs, the application maintains a database
representing some real-world state. Application
responses and outputs typically drive real-world
actuators and transducers that alter or control
the state. The applications, database, and
network tend to evolve over several decades.
Increasingly, the systems are geographically
distributed, heterogeneous (they involve
equipment and software from many different
vendors), continuously available (there is no
scheduled downtime), and have stringent response
time requirements.
11ACID Properties: First Definition
- Atomicity: A transaction's changes to the state
are atomic: either all happen or none happen.
These changes include database changes, messages,
and actions on transducers.
- Consistency: A transaction is a correct
transformation of the state. The actions taken as
a group do not violate any of the integrity
constraints associated with the state. This
requires that the transaction be a correct
program.
- Isolation: Even though transactions execute
concurrently, it appears to each transaction T
that others executed either before T or after T,
but not both.
- Durability: Once a transaction completes
successfully (commits), its changes to the state
survive failures.
12Structure of a Transaction Program
- The application program declares the start of a
new transaction by invoking BEGIN_WORK().
- All subsequent operations will be covered by the
transaction. Eventually, the application program
will call COMMIT_WORK(), if a new consistent
state has been reached. This makes sure the new
state becomes durable.
- If the application program cannot complete
properly (violation of consistency constraints),
it will invoke ROLLBACK_WORK(), which appeals to
the atomicity of the transaction, thus removing
all effects the program might have had so far.
- If for some reason the application fails to call
either commit or rollback (there could be an
endless loop, a crash, a forced process
termination), the transaction system will
automatically invoke ROLLBACK_WORK() for that
transaction.
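The bracketing described on this slide can be sketched in Python with the standard sqlite3 module. This is only an illustration under stated assumptions: the account table, the transfer function, and the "no negative balance" rule are hypothetical, and the SQL statements BEGIN, COMMIT, and ROLLBACK stand in for BEGIN_WORK(), COMMIT_WORK(), and ROLLBACK_WORK().

```python
import sqlite3


def make_demo_db():
    """Create a hypothetical in-memory database with two accounts."""
    # isolation_level=None puts sqlite3 in autocommit mode, so the
    # transaction brackets below are issued explicitly by the program.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.execute("INSERT INTO account VALUES (1, 100), (2, 50)")
    return conn


def balances(conn):
    """Return all balances ordered by account id."""
    return [row[0] for row in conn.execute("SELECT balance FROM account ORDER BY id")]


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction; True if committed."""
    conn.execute("BEGIN")  # stands in for BEGIN_WORK()
    try:
        conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
        conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
        (balance,) = conn.execute(
            "SELECT balance FROM account WHERE id = ?", (src,)
        ).fetchone()
        if balance < 0:
            # Assumed consistency constraint: no account may go negative.
            raise ValueError("consistency constraint violated")
        conn.execute("COMMIT")  # stands in for COMMIT_WORK()
        return True
    except Exception:
        conn.execute("ROLLBACK")  # stands in for ROLLBACK_WORK()
        return False
```

A transfer that would overdraw the source account returns False and leaves both balances exactly as they were, which is the atomicity property at work.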
13The End User's View of a Transaction Processing
System
14The Administrator's/Operator's View of a TP System
15Performance Measures of Interactive Transactions
Performance/Transaction   Small/Simple   Medium        Complex
-----------------------   ------------   -----------   ----------
Instr./transaction        100k           1M            100M
Disk I/O / TA             1              10            1000
Local msgs. (B)           10 (5KB)       100 (50KB)    1000 (1MB)
Remote msgs. (B)          2 (300B)       2 (4KB)       100 (1MB)
Cost/TA/second            10k$/tps       100k$/tps     1M$/tps
Peak tps/site             1000           100           1
16Client-Server Computing: The Classical Idea
17Client-Server Computing: The CORBA Idea
[Figure labels: Client on WS (presentation services etc.); Object Implementation (Jim's Mailbox); IDL Stub; IDL Skeleton; Request: Delete; Object Request Broker]
18Client-Server Computing: The WWW Idea
[Figure labels: WWW browser; Java applet; Java Database Connectivity (JDBC) driver code; HTTP; server; proprietary protocol; JDBC-ODBC bridge; ODBC driver; JDBC network driver; public protocol (e.g. TCP/IP); JDBC driver; database server]
19Using Transactional Remote Procedure Calls (TRPCs)
20Terms We Have Introduced So Far
- Resource manager: The system comes with an array
of transactional resource managers that provide
ACID operations on the objects they implement.
Database systems, persistent programming
languages, and queue managers are typical
examples.
- Durable state: Application state represented as
durable data stored by the resource managers.
- TRPC: Transactional remote procedure calls allow
the application to invoke local and remote
resource managers as though they were local. They
also allow the application designer to decompose
the application into client and server processes
on different computers.
- Transaction program: Inquiries and state
transformations are written as programs in
conventional or specialized programming
languages. The programmer brackets the successful
execution of the program with a Begin-Commit pair
and brackets a failed execution with a
Begin-Rollback pair.
21Terms We Have Introduced So Far
- Atomicity: At any point before the commit, the
application or the system may abort the
transaction, invoking rollback. If the
transaction is aborted, all of its changes to
durable objects will be undone (reversed), and it
will be as though the transaction never ran.
- Consistency: The work within a Begin-Commit pair
must be a correct transformation.
- Isolation: While the transaction is executing,
the resource managers ensure that all objects the
transaction reads are isolated from the updates
of concurrent transactions.
- Durability: Once the commit has been successfully
executed, all the state transformations of that
transaction are made durable and public.
22The World According to the Resource Manager
23Where To Split Client/Server?
[Figure: possible client/server split points, from thin to fat on each side, across the layers Presentation; Flow Control; Application Logic (business objects); Data Access]
24Client/Server Infrastructure
[Figure labels. Client: GUI, OOUI, System Mgmt., OS. Middleware: SQL, ORB, TRPC, Mail, Security, WWW, Transport, etc. Server: Objects, Groupware, TP-Mon., DBMS, OS, Files.]
25Transactional Core Services
26The X/Open TP-Model
27The X/Open Distributed Transaction Processing
Model
28The OTS Model
[Figure labels: transaction originator; recoverable server; Transaction service; creation, termination; invocation; commit coordination; transmitted with request]
29Transaction Processing System Feature List
- Application development features
- Application generators; graphical programming
interfaces; screen painters; compilers; CASE
tools; test data generators; starter system with
a complete set of administrative and operations
functions, security, and accounting.
- Repository features
- Description of all components of the system,
both hardware and software. Description of the
dependencies among components (bill-of-material).
Description of all changes to all components to
keep track of different versions. The repository
is a database. Its role in the system must be
complete, extensible, active and allow for local
autonomy.
- TP-Monitor features
- Process management; server classes;
transactional remote procedure calls;
request-based authentication and authorization;
support for applications and resource managers in
implementing ACID operations on durable objects.
30Transaction Processing System Feature List
- Data communications features
- Uniform I/O interfaces; device independence;
virtual terminal; screen painter support; support
for RPC and TRPC; support for context-oriented
communication (peer-to-peer).
- Database features
- Data independence; data definition; data
manipulation; data control; data display;
database operations.
- Operations features
- Archiving; reorganization; diagnosis; recovery;
disaster recovery; change control; security;
system extension.
- Education and testing features
- Embedded education; online documentation;
training systems; national language features;
test database generators; test drivers.
31Data Communications Protocols
32Presentation Management
33SQL Data Definition
34SQL Data Manipulation
35Summary of Chapter 1
- A transaction processing system is a large web of
application generators, system design and
operation tools, and the more mundane language,
database, network, and operations software.
- The repository and the applications that maintain
it are the mechanisms needed to manage the TP
system. The repository is a transaction
processing application.
- It represents the system configuration as a
database and supplies change control by
transactions that manipulate the configuration
and the repository.
- The transaction concept, like contract law, is
intended to resolve the situation when exceptions
arise. The first order of business in designing a
system is, therefore, to have a clear model of
system failure modes. What breaks? How often do
things break?
36Basic Terminology
Chapter 1b
37A Word About Words (Chapter 2)
Humpty Dumpty: "When I use a word, it means
exactly what I choose it to mean, nothing more
nor less."
Alice: "The question is, whether you
can make words mean so many different
things."
Humpty Dumpty: "The question is, which
is to be master, that's all."
Lewis Carroll
38Basic Computer Terms
To get any confusion that might be caused by the
many synonyms in our field out of the way, let us
adopt the following conventions for the rest of
this class:
domain = data type = ...
field = column = attribute = ...
record = tuple = object = entity = ...
block = page = frame = slot = ...
file = data set = table = ...
process = task = thread = actor = ...
function = request = method = ...
All the other terms and definitions we need
will be briefly introduced and explained during
the session.
39Basic Hardware Architecture I
In Bell and Newell's classic taxonomy, hardware
consists of three types of modules:
processors, memory, and communications (switches
or wires). Processors execute instructions from
a program, read and write memory, and send data
via communication lines. Computers are generally
classified as supercomputers, mainframes,
minicomputers, workstations, and personal
computers. However, these distinctions are
becoming fuzzy with current shifts in
technology.
40Basic Hardware Architecture II
Today's workstation has the power of yesterday's
mainframe. Similarly, today's WAN (wide area
network) has the communications bandwidth of
yesterday's LAN (local area network). In
addition, electronic memories are growing in size
to include much of the data formerly stored on
magnetic disk. These technology trends have
deep implications for transaction processing.
41Basic Hardware Architecture III
- Distributed processing: Processing is moving
closer to the producers and consumers of the data
(workstations, intelligent sensors, robots, and
so on).
- Client-server: These computers interact with each
other via request-reply protocols. One machine,
called the client, makes requests to another,
called the server. Of course, the server may in
turn be a client to other machines.
- Clusters: Powerful servers consist of clusters of
many processors and memories, cooperating in
parallel to perform common tasks.
42Basic Hardware Architecture IV
43Memories - The Economic Perspective I
- The processor executes instructions from virtual
memory, and it reads and alters bytes from the
virtual memory. The mapping between virtual
memory and real memory includes electronic
memory, which is close to the processor,
volatile, fast, and expensive, and magnetic
memory, which is "far away" from the processor,
non-volatile, slow, and cheap. The mapping
process is handled by the operating system with
some hardware assistance.
- Memory performance is measured by its access
time:
- Given an address, the memory presents the data
at some later time. The delay is called the
memory access time. Access time is a combination
of latency (the time to deliver the first byte),
and transfer time (the time to move the data).
Transfer time, in turn, is determined by the
transfer size and the transfer rate. This
produces the following overall equation:
- memory access time = latency + (transfer size /
transfer rate)
44Memories - The Economic Perspective II
- Memory price-performance is measured in one of
two ways:
- Cost/byte. The cost of storing a byte of data in
that medium.
- Cost/access. The cost of reading a block of data
from that medium.
- This is computed by dividing the device cost by
the number of accesses per second that the
device can perform.
- The actual units are cost/access/second, but the
time unit is implicit in the metric's name.
- These two cost measures reflect the two different
views of a memory's purpose:
- it stores data, and
- it receives and retrieves data.
45Memories - The Economic Perspective III
Typical large system capacity
46Memories - The Economic Perspective IV
[Figure axis label: $/MB]
47Magnetic Memory
- There are two types of magnetic storage media:
disk and tape. Disks rotate, passing the data in
the cylinder by the electronic read-write heads
every few milliseconds. This gives low access
latency. The disk arm can move among cylinders in
tens of milliseconds. Tapes have approximately
the same storage density and transfer rate, but
they must move long distances if random access is
desired. Consequently, tapes have large random
access latencies, on the order of seconds.
- Disk Access Time = Seek_Time
+ Rotational_Latency
+ (Transfer_Size / Transfer_Rate)
48Magnetic Memory
- Compare the times required for two access
patterns to 1 MB stored in 1000 blocks on disk:
- Sequential access: Read or write sectors x, x +
1, ..., x + 999 in ascending order. This
requires one seek (10 ms) and half a rotation (5
ms) before the data in the cylinder begins
transferring the megabyte at 10 MBps (the
transfer takes 100 ms, ignoring one-cylinder
seeks).
- The total access time is 115 ms.
- Random access: Read the 1000 sectors x, ..., x +
999 in random order. In this case, each read
requires a seek (10 ms), half a rotation (5 ms),
and then the 1 KB transfer (.1 ms). Since there
are 1000 of these events, the total access time
is 15.1 seconds.
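The two access patterns can be recomputed with a few lines of Python. This is a minimal sketch of the disk access time formula (seek time plus rotational latency plus transfer size over transfer rate), using the slide's figures; the function and variable names are illustrative only, and 1 MB is taken as 10^6 bytes.

```python
MB = 10**6  # assumption: decimal megabytes, so 1 MB = 1000 blocks of 1000 bytes


def disk_access_time_ms(seek_ms, rotational_latency_ms, transfer_bytes, rate_bytes_per_s):
    """Seek time + rotational latency + transfer time, all in milliseconds."""
    transfer_ms = 1000.0 * transfer_bytes / rate_bytes_per_s
    return seek_ms + rotational_latency_ms + transfer_ms


# Sequential: one seek, half a rotation, then the whole megabyte at 10 MBps.
sequential_ms = disk_access_time_ms(10, 5, 1 * MB, 10 * MB)

# Random: 1000 independent reads, each paying the seek and the rotation.
random_ms = 1000 * disk_access_time_ms(10, 5, 1000, 10 * MB)

print(f"sequential: {sequential_ms:.0f} ms, random: {random_ms / 1000:.1f} s")
```

The ratio between the two results, more than a factor of 100, is the reason sequential access patterns matter so much for disk-based systems.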
49Memory Hierarchies
50Memory Hierarchies
- The hierarchy uses small, fast, expensive cache
memories to cache some data present in larger,
slower, cheaper memories.
- If hit ratios are good, the overall memory speed
approximates the speed of the cache.
- At any level of the memory hierarchy, the hit
ratio is defined as:
- hit ratio = references satisfied by cache / all
references to cache
- Suppose a cache memory with access time C has hit
ratio H, and suppose that on a miss the secondary
memory access time is S. Further, suppose that C
≈ .01 × S. The effective access time of the cache
will be as follows:
- Effective memory access time = H × C + (1 - H) × S
≈ H × (.01 × S) + (1 - H) × S
= (1 - .99 × H) × S
≈ (1 - H) × S
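As a small numeric check of the effective access time derivation, the following Python sketch evaluates H × C + (1 - H) × S directly and compares it with the simplified form (1 - .99 × H) × S under the assumption C = .01 × S. The function name and the sample values are illustrative only.

```python
def effective_access_time(hit_ratio, cache_time, secondary_time):
    """Effective access time: hits cost C, misses cost S."""
    return hit_ratio * cache_time + (1.0 - hit_ratio) * secondary_time


S = 1.0        # secondary memory access time, arbitrary unit
C = 0.01 * S   # assumption from the slide: cache is 100 times faster
H = 0.9        # example hit ratio

exact = effective_access_time(H, C, S)
simplified = (1.0 - 0.99 * H) * S   # algebraically identical when C = .01 * S
rough = (1.0 - H) * S               # the final approximation on the slide
```

With a 90% hit ratio the exact value is about 0.109 S and the rough approximation is 0.1 S, so the miss term dominates as long as the cache is much faster than the secondary memory.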
51The Five Minute Rule
- Assume there are no special response time
(real-time) requirements; the decision to keep
something in cache is, therefore, purely
economic.
- To make things simple, suppose that data blocks
are 10 KB.
- At 1995 prices, 10 KB of main memory cost about
$1. Thus, we could keep the data in main memory
forever if we were willing to spend a dollar.
- With 10 KB of disk costing only $.10, we could
save $.90 if we kept the 10 KB on disk.
- In reality, the savings are not so great: if the
disk data is accessed, it must be moved to main
memory, and that costs something. How much, then,
does a disk access cost?
- A disk, along with all its supporting hardware,
costs about $3,000 (in 1995) and delivers about
30 acc./sec.; the cost, therefore, is about $100
per access per second.
At this rate, if the data is accessed once a
second, it costs $100.10 to store it on disk
(disk storage and disk access costs). That is
considerably more than the $1 to store it in main
memory.
- The break-even point is about one access per 100
seconds. At that rate, the main memory cost is
about the same as the disk storage cost plus the
disk access costs. At a more frequent access
rate, disk storage is more expensive. At a less
frequent rate, disk storage is cheaper.
Anticipating the cheaper main memory that will
result from technology changes, this observation
is called the five-minute rule rather than the
two-minute rule.
52The Five Minute Rule
Keep a data item in electronic memory if it is
accessed at least once every five minutes;
otherwise keep it in magnetic memory. Similar
arguments apply to objects stored on tape and
cached on disk. Given the object size, the cost
of cache, the cost of secondary memory, and the
cost of accessing the object in secondary memory
once per second, the frequency at the break-even
point in units of accesses per second (a/s) is
given by the following formula:
Frequency = ((Cache_Cost/Byte - Secondary_Cost/Byte) ×
Object_Bytes) / (Object_Access_Per_Second_Cost)
a/s
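The break-even formula can be evaluated with the 1995 figures quoted for the five-minute rule (10 KB blocks, about $1 for 10 KB of main memory, about $.10 for 10 KB of disk, about $100 per disk access per second). The Python sketch below is illustrative; the function name is hypothetical and 10 KB is taken as 10,000 bytes.

```python
def break_even_frequency(cache_cost_per_byte, secondary_cost_per_byte,
                         object_bytes, access_per_second_cost):
    """Accesses per second at which caching and not caching cost the same."""
    saving = (cache_cost_per_byte - secondary_cost_per_byte) * object_bytes
    return saving / access_per_second_cost


OBJECT_BYTES = 10_000                  # one 10 KB block (assumed decimal KB)
CACHE_COST = 1.00 / OBJECT_BYTES       # dollars per byte of main memory
DISK_COST = 0.10 / OBJECT_BYTES        # dollars per byte of disk
ACCESS_COST = 3000 / 30                # $3,000 disk delivering 30 accesses/second

frequency = break_even_frequency(CACHE_COST, DISK_COST, OBJECT_BYTES, ACCESS_COST)
interval_seconds = 1.0 / frequency     # time between accesses at break-even
```

With these inputs the break-even frequency is 0.009 a/s, one access roughly every 111 seconds, which agrees with the "about one access per 100 seconds" stated for the rule.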
53The Rules of Exponential Growth
Electronic memory (Moore's Law):
MemoryChipCapacity(year) = 4^((year-1970)/3) Kb/chip for year in 1970...2000
Magnetic memory (Hoagland's Law):
MagneticAreaDensity(year) = 10^((year-1970)/10) Mb/inch^2 for year in 1970...2000
Processors (Joy's Law):
SunMips(year) = 2^(year-1984) MIPS for year in 1984...2000
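The three growth rules can be written as small Python functions. This sketch assumes that the three exponents listed on the slide, ((year-1970)/3), ((year-1970)/10), and (year-1984), belong to the bases 4, 10, and 2 in that order; the function names are illustrative.

```python
def memory_chip_capacity_kb(year):
    """Moore's Law as stated here: capacity quadruples every three years."""
    return 4 ** ((year - 1970) / 3)


def magnetic_area_density_mb_per_inch2(year):
    """Hoagland's Law as stated here: density grows tenfold every decade."""
    return 10 ** ((year - 1970) / 10)


def sun_mips(year):
    """Joy's Law as stated here: processor speed doubles every year."""
    return 2 ** (year - 1984)


for year in (1984, 1990, 2000):
    print(year,
          round(memory_chip_capacity_kb(year)),
          round(magnetic_area_density_mb_per_inch2(year), 1),
          sun_mips(year))
```

Under these formulas the chip capacity for the year 2000 comes out as 4^10 Kb, about one gigabit per chip.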
54Communication Hardware
The early 90s
The definition of the four kinds of networks by
their diameters. These diameters imply certain
latencies (based on the speed of light). In 1990,
Ethernet (at 10 Mbps) was the dominant LAN.
Metropolitan networks typically are based on 1
Mbps public lines. Such lines are too expensive
for transcontinental links at present; most
long-distance lines are therefore 50 Kbps or
less. As you will gather from the news, these things
are changing fast.
55Communication Hardware
Scenario 2000
Point-to-point bandwidth likely to be common
among computers by the year 2000.
56Processor Architectures
57Processor Architectures
- Shared nothing: In a shared-nothing design, each
memory is dedicated to a single processor. All
accesses to that data must pass through that
processor. Processors communicate by sending
messages to each other via the communications
network.
- Shared global: In a shared-global design, each
processor has some private memory not accessible
to other processors. There is, however, a pool of
global memory shared by the collection of
processors. This global memory is usually
addressed in blocks (units of a few kilobytes or
more) and is RAM disk or disk.
- Shared memory: In a shared-memory design, each
processor has transparent access to all memory.
If multiple processors access the data
concurrently, the underlying hardware regulates
the access to the shared data and provides each
processor a current view of the data.
58Address Spaces
59Address Spaces
- Memory segmentation and sharing: A process
executes in an address space, a paged, segmented
array of bytes. Some segments may be shared with
other address spaces. The sharing may be
execute-only, read-only, or read-write. Most of
the segment slots are empty (lightly shaded
boxes), and most of the occupied segments are
only partially full of programs or data.
- To simplify memory addressing, the virtual
address space is divided into fixed-size segment
slots, and each segment partially fills a slot.
- Typical slot sizes range from 2^24 to 2^32
bytes. This gives a two-dimensional address
space, where addresses are (segment_number,
byte). Again, segments are often partitioned into
virtual memory pages, which are the unit of
transfer between main and secondary memory. If an
object is bigger than a segment, it can be mapped
into consecutive segments of the address space.
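The two-dimensional (segment_number, byte) addressing can be illustrated with simple bit arithmetic. The Python sketch below assumes a slot size of 2^24 bytes, the lower end of the range mentioned above; the function names are hypothetical and do not describe any particular machine.

```python
SLOT_BITS = 24  # assumption: each segment slot spans 2**24 bytes


def split_address(address, slot_bits=SLOT_BITS):
    """Split a flat virtual address into (segment_number, byte)."""
    segment_number = address >> slot_bits
    byte = address & ((1 << slot_bits) - 1)
    return segment_number, byte


def join_address(segment_number, byte, slot_bits=SLOT_BITS):
    """Rebuild the flat virtual address from its two components."""
    return (segment_number << slot_bits) | byte
```

For example, split_address((3 << 24) + 5) yields segment 3, byte 5, and an object larger than one slot simply continues at byte 0 of the next segment number.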
60Processes
- A process is a virtual processor. It has an
address space that contains the program the
process is executing and the memory the process
reads and writes. One can imagine a process
executing Java programs statement by statement,
with each statement reading and writing bytes in
the address space or sending messages to other
processes.
- Processes provide an ability to execute programs
in parallel; they provide a protection entity;
and they provide a way of structuring
computations into independent execution streams.
So they provide a form of fault containment in
case a program fails.
- Processes are building blocks for transactions,
but the two concepts are orthogonal. A process
can execute many different transactions over
time, and parts of a single transaction may be
executed by many processes.
- Each process executes on behalf of some user, or
authority, and with some priority. The authority
determines what the process can do: which other
processes, devices, and files the process can
address and communicate with. The process
priority determines how quickly the process's
demand for resources will be serviced if other
processes make competing demands. Short tasks
typically run with high priority, while large
tasks are given lower priority.
61Protection Domains
- There are two ways to provide protection:
- Process protection domain: Each subsystem
executes as a separate process with its own
private address space. Applications execute
subsystem requests by switching processes, that
is, by sending a message to a process.
- Address space protection domain: A process has
many address spaces: one for each protected
subsystem and one for the application.
Applications execute subsystem requests by
switching address spaces. The address space
protection domain of a subsystem is just an
address space that contains some of the caller's
segments; in addition, it contains program and
data segments belonging to the called subsystem.
A process connects to the domain by asking the
subsystem or OS kernel to add the segment to the
address space. Once connected, the domain is
callable from other domains in the process by
using a special instruction or kernel call.
62Protection Domains
A process may have many protection domains.
63Threads
- There is a need for multiple processes per
address space:
- For example, to scan through a data stream, one
process is appointed the producer, which reads
the data from an external source, while the
second process processes the data. Further
examples of cooperating processes are file
read-ahead, asynchronous buffer flushing, and
other housekeeping chores in the system.
- Processes can share the same address space simply
by having all their address spaces point to the
same segments. Most operating systems do not make
a clean distinction between address spaces and
processes. Thus a new concept, called a thread or
a task, is introduced.
- But note: Several operating systems do not use
the term process at all. For example, in the Mach
operating system, thread means process, and task
means address space; in MVS, task means process,
and so on.
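The producer example above, one activity reading a data stream while a second one processes it within the same address space, can be sketched with Python threads and a bounded queue. This is an illustration only: the function name, the buffer size, and the doubling step that stands in for "processing" are all assumptions.

```python
import queue
import threading


def run_pipeline(source):
    """Run a producer and a consumer thread over `source`; return the results."""
    buffer = queue.Queue(maxsize=4)  # shared buffer in the common address space
    results = []
    done = object()                  # sentinel marking the end of the stream

    def producer():
        # Reads the data from the external source and hands it over.
        for item in source:
            buffer.put(item)
        buffer.put(done)

    def consumer():
        # Processes each item; doubling is a placeholder for real work.
        while True:
            item = buffer.get()
            if item is done:
                break
            results.append(item * 2)

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Because both threads share one address space, the queue and the result list need no copying or message passing between protection domains.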
64Threads
- The term thread often implies a second property:
inexpensive to create and dispatch. Threads are
commonly provided by some software that found the
operating system processes to be too expensive to
create or dispatch. The thread software
multiplexes one big operating system process
among many threads, which can be created and
dispatched hundreds of times faster than a
process.
- The term thread is used in the following to
connote these light-weight processes. Unless this
light-weight property is intended, process is
used. Several threads usually share a common
address space. Typically, all the threads have
the same authorization identifier, since they are
part of the same address space domain, but they
may have different scheduling priorities.
65Messages and Sessions
- There are two styles of communication among
processes:
- Datagrams: The sender of a message determines the
recipient's address (e.g. the process name) and
constructs an envelope consisting of the sender's
name and address, the recipient's name and
address, and the message text. This envelope is
delivered to the capable hands of the
communication system. It is analogous to sending
letters by mail.
- Sessions: Before any messages are sent, a fixed
connection is established between sender and
receiver, a so-called session. Once it has been
established, both parties can send and receive
messages via this session. This symmetry is often
referred to as "peer-to-peer". Establishing a
session requires a datagram. A session must at
some point be closed down explicitly. It is
analogous to a phone conversation.
66Advantages of Sessions
- Shared state: A session represents shared state
between the client and the server. A datagram
might go to any process with the designated name,
but a session goes to a particular instance of
that name.
- Authorization: Processes do not always trust each
other. The server often checks the client's
credentials to see that the client is authorized
to perform the requested function. The
authentication protocols require multi-message
exchanges. Once the session key is established,
it is shared state.
- Error correction: Messages flowing in each
session direction are numbered sequentially.
These sequence numbers can detect lost messages
and duplicate messages.
- Performance: The operations described are fairly
costly. Each of the steps often involves several
messages. By establishing a session, this
information is cached.
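The error-correction point, sequence numbers detecting lost and duplicate messages, can be sketched as a receiver that keeps the next expected number as session state. The class name and the three return labels in this Python sketch are hypothetical, and a real protocol would also request retransmission.

```python
class SessionReceiver:
    """Tracks the next expected sequence number for one session direction."""

    def __init__(self):
        self.expected = 0  # shared session state: next sequence number

    def receive(self, sequence_number):
        if sequence_number == self.expected:
            self.expected += 1
            return "ok"
        if sequence_number < self.expected:
            return "duplicate"   # already delivered once, can be dropped
        return "gap"             # a message in between was lost or delayed
```

A datagram receiver has no such per-sender state, which is why it cannot tell a retransmitted message from a new one.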
67Clients and Servers
- The question of how computations consisting of
many interacting processes should be structured
has no simple answer. Currently, two styles are
particularly popular: peer-to-peer and
client-server.
- The debate about which style is "better" often
creates the impression that they are radically
different. But in reality, peer-to-peer is more
general and more complex, and it subsumes
client-server. Here is a brief characterization:
- Peer-to-peer: The two processes are independent
peers, each executing its computation and
occasionally exchanging data with the other.
- Client-server: The two processes interact via
request-reply exchanges in which one process, the
client, makes a request to a second process, the
server, which performs this request and replies
to the client.
68Clients and Servers
- The limitation of the client-server model lies in
the fact that it implies a synchronous pattern of
one request/one response.
- There are, however, cases in which one request
generates thousands of replies, or where
thousands of requests generate one reply.
Operations that have this property include
transferring a file between the client and server
or bulk reading and writing of databases. In
other situations, a client request generates a
request to a second server, which, in turn,
replies to the client. Parallelism is a third
area where simple RPC is inappropriate. Because
the client-server model postulates synchronous
remote procedure calls, the computation uses one
processor at a time. However, there is growing
interest in schemes that allow many processes to
work on problems in parallel. The RPC model in
its simplest form does not allow any parallelism.
69Remote Procedure Calls (RPCs)
70Naming
- Naming has to do with the problem of how a client
denotes a server it wants to invoke. Typical
naming schemes distinguish between an object's
name, its address, and its location. The name is
an abstract identifier for the object, the
address is the path to the object, and the
location is where the object is.
- An object can have several names. Some of these
names may be synonyms, called aliases. Let us say
that Bruce and Lindsay are two aliases for Bruce
Lindsay. For this to be explicit, all names,
addresses, and locations must be interpreted in
some context, called a directory. For example, in
our RPC context, Bruce means Bruce Nelson, and in
our publishing context, Bruce means Bruce Spatz.
Within the 408 telephone area, Bruce Lindsay's
address is 927-1747, and outside the United
States it is 1-408-927-1747.
71Name Servers
- Names are grouped into a hierarchy called the
name space. An international commission has
defined a universal name space standard, X.500,
for computer systems. The commission administers
the root of that name space. Each interior node
of the hierarchy is a directory. A sequence of
names delimited by a period (.) gives a path name
from the directory to the object.
- No one stores the entire name space: it is too
big, and it is changing too rapidly. Certain
processes, called name servers, store parts of
the name space local to their neighborhood; in
addition, they store a directory of more global
name servers.
72Authentication Techniques
- Passwords are the simplest technique. The client
has a secret password, a string of bytes known
only to it and the server. The client sends its
password to the server to prove the client's
identity. A second password is then needed to
authenticate the server to the client. Thus, two
passwords are required, and they must be sent
across the wire.
- Challenge-response uses only one password or key.
In this scheme, the client and the server share a
secret encryption key. The server picks a random
number, N, and encrypts it with the key as EN.
The server sends EN to the client and challenges
the client to decrypt it using the secret key. If
the client responds with N, the server believes
the client knows the secret encryption key. The
client can also authenticate the server by
challenging it to decrypt a second random number.
The shared secret is stored at both ends, but
random numbers are sent across the wire.
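The exchange can be sketched with Python's standard library. Note the substitution: instead of encrypting N and asking the client to decrypt it, this sketch uses a common variant in which the server sends the random number in the clear and the client returns a keyed hash (HMAC) of it. The property illustrated is the same (only random values cross the wire, and a correct answer proves knowledge of the shared key), but it is not the exact scheme described above. The function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def make_challenge() -> bytes:
    """Server side: pick a fresh random number N."""
    return secrets.token_bytes(16)


def respond(shared_key: bytes, challenge: bytes) -> bytes:
    """Client side: prove knowledge of the key without revealing it."""
    return hmac.new(shared_key, challenge, hashlib.sha256).digest()


def verify(shared_key: bytes, challenge: bytes, response: bytes) -> bool:
    """Server side: recompute the expected answer and compare in constant time."""
    expected = hmac.new(shared_key, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)
```

For mutual authentication the client would issue its own challenge and verify the server's response in the same way.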
73Authentication Techniques
- Public key system: Each authid has a pair of
keys, a public encryption key, EK, and a private
decryption key, DK. The keys are chosen so that
DK(EK(X)) = X, but knowing only EK and EK(X) it
is hard to compute X. Thus, a process's ability
to compute X from EK(X) is proof that the process
knows the secret DK. Each authid publishes its
public key to the world. Anyone wanting to
authenticate the process as that authid goes
through the challenge protocol: The challenger
picks a random number X, encrypts it with the
authid's public key EK, and challenges the
process to compute X from EK(X). Secrets are
stored in one place only, and they do not go
across the wire.
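The challenge protocol can be illustrated with textbook RSA on deliberately tiny numbers. This is a toy sketch only: the primes 61 and 53 and the exponent 17 are a standard classroom example, far too small to be secure, and the slide does not say which public key system is meant. The names `EK`, `DK` follow the slide; `authenticate` is a hypothetical helper. It needs Python 3.8 or later for the modular inverse via `pow`.

```python
import secrets

# Toy key pair (insecure sizes, for illustration only).
P, Q = 61, 53
N = P * Q                              # public modulus
E = 17                                 # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (modular inverse)


def EK(x: int) -> int:
    """Public encryption: anyone can compute this."""
    return pow(x, E, N)


def DK(y: int) -> int:
    """Private decryption: only the holder of D can compute this."""
    return pow(y, D, N)


def authenticate(prover) -> bool:
    """Challenger: pick a random X, send EK(X), check that X comes back."""
    x = secrets.randbelow(N - 2) + 2   # random X in [2, N-1]
    return prover(EK(x)) == x
```

Calling `authenticate(DK)` succeeds because DK(EK(X)) = X, while a prover without the private key has no better strategy than guessing.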
74Scheduling
- The purpose of scheduling is to make sure all
requests get processed, i.e., are assigned to a
specific server process. There are basically two
additional constraints:
- Short response times: The requests should not
wait longer than necessary before they get
serviced.
- Economic usage of resources: The required
throughput should be achieved with the minimum
number of resources (processors, nodes, links,
etc.).
- Throughput and response time at resource
utilization r are related by the following
formula:
Average_Response_Time(r) = (1 / (1 - r)) * Service_Time
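A short worked example makes the shape of this relationship concrete. The function below evaluates the formula directly; the function name, parameter names, and the 100 ms service time used in the note are illustrative assumptions.

```python
def average_response_time(service_time: float, utilization: float) -> float:
    """Average response time at utilization r: Service_Time / (1 - r).

    Only defined for 0 <= r < 1; at r = 1 the queue grows without bound.
    """
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must satisfy 0 <= r < 1")
    return service_time / (1.0 - utilization)
```

With a service time of 100 ms, the average response time is about 200 ms at r = 0.5 but about 1 s at r = 0.9, which is why short response times and economic resource usage pull in opposite directions.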
75The Scheduling Problem
76File Organizations
77SQL in a Distributed Environment
78Software Performance
79Protocol Standards
80Relevant FAP-Standards
- CSMA/CD, Token Ring, etc.: Low-level protocols
that specify how bits are physically transmitted
across a shared medium.
- IP/TCP, NetBIOS, HTTP: Transport level protocols.
- LU6.2: SNA's peer-to-peer protocol that allows
both session oriented and client-server-style
communication under transaction protection.
- OSI-TP: ISO's rendering of a protocol that
provides a functionality very similar to LU6.2.
- ASN.1: Protocol for exchanging data formatting
and structuring information. Required for RPCs in
a heterogeneous environment.
- DRDA: Interoperability standard for IBM
SQL-systems.
- ODBC, JDBC: Interoperability standards for
general SQL-systems.
81Relevant API-Standards
- SQL: Portability standard for accessing
relational databases (lots of proprietary
extensions).
- APPC, CPI-C: Two of IBM's APIs for the LU6.2
protocol.
- X/Open-XA, etc.: APIs by the X/Open
consortium on ISO's OSI-TP protocols.
- IDL: OMG's interface definition language to let
objects be integrated through an object request
broker.
- STDL: Language for programming TP-applications
based on the ACMS TP-monitor.
- Java: The web's favorite programming language
comes with its own FAP-component.
82OSI Standards and X/Open APIs
83A Last Glance at TP-Standards
Each resource manager (RM) registers with its
local transaction manager (TM). Applications
start and commit transactions by calling their
local TM. At commit, the TM invokes every
participating RM. If the transaction is
distributed, the communications manager informs
the local and remote TM about the incoming or
outgoing transaction, so that the two TMs can use
the OSI-TP protocol to commit the transaction.
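The local part of this interaction can be sketched as follows. Everything here is a simplified assumption: the class and method names are hypothetical and do not reproduce the X/Open interfaces, the prepare-then-commit voting is the usual two-phase pattern rather than something spelled out above, and the distributed case (communications manager, remote TM) is omitted.

```python
class ResourceManager:
    """Hypothetical RM that records what the TM asks it to do."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.log = []

    def prepare(self, txid):
        self.log.append(("prepare", txid))
        return self.can_commit          # vote yes or no

    def commit(self, txid):
        self.log.append(("commit", txid))

    def rollback(self, txid):
        self.log.append(("rollback", txid))


class TransactionManager:
    """Hypothetical local TM: RMs register once, then join transactions."""

    def __init__(self):
        self.registered = []
        self.participants = {}
        self.next_txid = 1

    def register(self, rm):
        self.registered.append(rm)

    def begin(self):
        txid = self.next_txid
        self.next_txid += 1
        self.participants[txid] = []
        return txid

    def join(self, txid, rm):
        if rm not in self.registered:
            raise ValueError("RM is not registered with this TM")
        self.participants[txid].append(rm)

    def commit(self, txid):
        """Invoke every participating RM; commit only if all vote yes."""
        rms = self.participants.pop(txid)
        votes = [rm.prepare(txid) for rm in rms]
        for rm in rms:
            rm.commit(txid) if all(votes) else rm.rollback(txid)
        return all(votes)
```

An application would call `begin`, do its work through the joined RMs, and then call `commit`; a single "no" vote causes every participant to roll back.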
84Summary
- Transaction processing systems comprise all parts
of a system, software and hardware.
- Building such a system requires considering
end-to-end arguments at all levels of
abstraction.
- The performance of distributed TP systems is
influenced by the hardware architecture (what is
shared), by software issues (which protocols are
used), and by configuration aspects (what limits
scalability).
- The multitude of those influences gives rise to a
constant dilemma: Should one restrict the variety
to few (proprietary) components for better tuning
and performance, or should one embrace all the
standards for openness - at the risk of poor
scalability and performance?