Title: Linux-HA Release 2: An Overview
1. Linux-HA Release 2: An Overview
- Alan Robertson
- Project Leader, Linux-HA project
- alanr@unix.sh
- (a.k.a. alanr@us.ibm.com)
- IBM Linux Technology Center
2. Agenda
- High-Availability (HA) Clustering?
- What is the Linux-HA project?
- Linux-HA applications and customers
- Linux-HA Release 1 / Release 2 feature comparison
- Release 2 details
- Request for feedback
- DRBD: an important component
- Thoughts about cluster security
3. What Is HA Clustering?
- Putting together a group of computers which trust each other to provide a service even when system components fail
- When one machine goes down, others take over its work
- This involves IP address takeover, service takeover, etc.
- New work comes to the remaining machines
- Not primarily designed for high performance
4. High Availability Through Redundancy and Monitoring
- Redundancy eliminates Single Points Of Failure (SPOFs)
- Monitoring determines when things need to change
- Reduces the cost of planned and unplanned outages by reducing MTTR (Mean Time To Repair)
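The MTTR claim above can be sketched numerically with the standard steady-state availability formula. The MTBF and MTTR figures below are illustrative assumptions, not measurements:

```python
# Steady-state availability as a function of MTBF and MTTR:
#   availability = MTBF / (MTBF + MTTR)
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    return mtbf_hours / (mtbf_hours + mttr_hours)

# Same failure rate; only the repair time changes.
manual_recovery = availability(2000, 4.0)   # wait for an admin: ~4 hours
auto_failover = availability(2000, 0.02)    # automated failover: ~72 seconds

print(f"manual recovery: {manual_recovery:.4%}")  # 99.8004%
print(f"auto failover:   {auto_failover:.4%}")    # 99.9990%
```

The point of the sketch: shrinking MTTR from hours to seconds buys roughly an extra nine without touching the failure rate at all.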
5. Failover and Restart
- Monitoring detects failures (hardware, network, applications)
- Automatic recovery from failures (no human intervention)
- Managed restart or failover to standby systems and components
6. What Can HA Clustering Do For You?
- It cannot achieve 100% availability; nothing can.
- HA Clustering is designed to recover from single faults
- It can make your outages very short
- From about a second to a few minutes
- It is like a Magician's (Illusionist's) trick
- When it goes well, the hand is faster than the eye
- When it goes not-so-well, it can be reasonably visible
- A good HA clustering system adds a 9 to your base availability
- 99% becomes 99.9%, 99.9% becomes 99.99%, 99.99% becomes 99.999%, etc.
7. Lies, Damn Lies, and Statistics
- Counting nines: downtime allowed per year
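The counting-nines figures behind this slide can be recomputed from first principles; a minimal sketch:

```python
# Allowed downtime per year for each count of nines of availability.
MINUTES_PER_YEAR = 365.25 * 24 * 60  # 525,960 minutes

for nines in range(2, 6):
    avail = 1 - 10.0 ** -nines                 # 99%, 99.9%, 99.99%, 99.999%
    downtime = MINUTES_PER_YEAR * 10.0 ** -nines
    print(f"{avail:.3%} -> {downtime:8.2f} minutes/year")
```

This reproduces the usual table: 99% allows about 3.65 days of downtime per year, 99.9% about 8.8 hours, 99.99% about 53 minutes, and 99.999% about 5.3 minutes.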
8. The Desire for HA Systems
- Who wants low-availability systems?
- Why are so few systems high-availability?
9. Why Isn't Everything HA?
10. Complexity
- Complexity is the Enemy of Reliability
12. Commodity HA?
- Installations with more than 200 Linux-HA pairs:
- Autostrada (Italy)
- Italian Bingo Authority
- Oxfordshire School System
- Many retailers (through IRES and others)
- Karstadt
- Circuit City
- etc.
- Also a component in commercial routers, firewalls, and security hardware
13. The HA Continuum
- Single-node HA system (monitoring without redundancy)
- Provides for application monitoring and restart
- Easy, near-zero-cost entry point: the HA system starts init scripts instead of /etc/init.d/rc (or equivalent)
- Addresses a Solaris / Linux functional gap
- Multiple virtual machines on a single physical machine
- Adds OS crash protection and rolling upgrades of OS and application; good for security fixes, etc.
- Many possibilities for interactions with virtual machines exist
- Multiple physical machines (normal cluster)
- Adds protection against hardware failures
- Split-site (stretch) clusters
- Adds protection against site-wide failures (power, air-conditioning, flood, fire)
14. How Does HA Work?
- Manage redundancy to improve service availability
- Like a cluster-wide super-init with monitoring
- Even complex services are respawned:
- on node (computer) death
- on impairment of nodes
- on loss of connectivity
- for services that aren't working (not necessarily stopped)
- managing potentially complex dependency relationships
15. Single Points of Failure (SPOFs)
- A single point of failure is a component whose failure will cause near-immediate failure of an entire system or service
- Good HA design adds redundancy to eliminate single points of failure
- Non-obvious SPOFs can require deep expertise to spot
16. The Three R's of High-Availability
- Redundancy
- Redundancy
- Redundancy
- If this sounds redundant, that's probably appropriate...
- Most SPOFs are eliminated by redundancy
- HA Clustering is a good way of providing and managing redundancy
17. Redundant Communications
- Intra-cluster communication is critical to HA system operation
- Most HA clustering systems provide mechanisms for redundant internal communication for heartbeats, etc.
- External communication is usually essential to provision of service
- External communication redundancy is usually accomplished through routing tricks
- Having an expert in BGP or OSPF routing is a help
18. Fencing
- Guarantees resource integrity in certain difficult cases (split-brain)
- Four common methods:
- Fibre Channel switch lockouts
- SCSI Reserve/Release (painful to make reliable)
- Self-fencing (like IBM ServeRAID)
- STONITH: Shoot The Other Node In The Head
- Linux-HA has native support for the last two
19. Redundant Data Access
- Replicated
- Copies of data are kept updated on more than one computer in the cluster
- Shared
- Typically Fibre Channel disk (SAN)
- Sometimes shared SCSI
- Back-end storage (Somebody Else's Problem)
- NFS, SMB
- Back-end database
- All are supported by Linux-HA
20. Data Sharing: Replication
- Some applications provide their own replication
- DNS, DHCP, LDAP, DB2, etc.
- Linux has excellent disk replication methods available
- DRBD is my favorite
- DRBD-based HA clusters are shockingly cheap
- Some environments can live with less precise replication methods: rsync, etc.
- Generally does not support parallel access
- Fencing usually required
- EXTREMELY cost-effective
21. Data Sharing: ServeRAID et al.
- The IBM ServeRAID SCSI controller is self-fencing
- This helps integrity in failover environments
- This makes cluster filesystems, etc. impossible
- No Oracle RAC, no GPFS, etc.
- ServeRAID failover requires a script to perform volume handover
- Linux-HA provides such a script in open source
- Linux-HA is ServerProven with ServeRAID
22. Data Sharing: Shared Disk
- The most classic data sharing mechanism; commonly Fibre Channel
- Allows for failover mode
- Allows for true parallel access
- Oracle RAC, cluster filesystems, etc.
- Fencing always required with shared disk
23. Data Sharing: Back-End
- Network Attached Storage can act as a data sharing method
- Existing back-end databases can also act as a data sharing mechanism
- Both make reliable and redundant data sharing Somebody Else's Problem (SEP).
- If they did a good job, you can benefit from them.
- Beware SPOFs in your local network
24. The Linux-HA Project
- Linux-HA is the oldest high-availability project for Linux, with the largest associated community
- Linux-HA is the OSS portion of IBM's HA strategy for Linux
- Linux-HA is the best-tested Open Source HA product
- The Linux-HA package is called Heartbeat (though it does much more than heartbeat)
- Linux-HA has been in production since 1999, and is currently in use on more than ten thousand sites
- Linux-HA also runs on FreeBSD and Solaris, and is being ported to OpenBSD and others
- Linux-HA ships with every major Linux distribution except one.
- Release 2 shipped at the end of July; more than 6000 downloads since then
25. Linux-HA Release 1 Applications
- Database Servers (DB2, Oracle, MySQL, others)
- Load Balancers
- Web Servers
- Custom Applications
- Firewalls
- Retail Point of Sale Solutions
- Authentication
- File Servers
- Proxy Servers
- Medical Imaging
- Almost any type of server application you can think of, except SAP
26. Linux-HA Customers
- FedEx: truck location tracking
- BBC: Internet infrastructure
- Oxfordshire Schools: universal servers; an HA pair in every school
- The Weather Channel (weather.com)
- Sony (manufacturing)
- ISO New England: manages the power grid using 25 Linux-HA clusters
- MAN Nutzfahrzeuge AG: truck manufacturing division of MAN AG
- Karstadt, Circuit City: use Linux-HA and databases, each in several hundred stores
- Citysavings Bank in Munich (infrastructure)
- Bavarian Radio Station (Munich): coverage of the 2002 Olympics in Salt Lake City
- Emageon: medical imaging services
- Incredimail: bases their mail service on Linux-HA on IBM hardware
- University of Toledo (US): 20k-student Computer Aided Instruction system
27. Linux-HA Release 1 Capabilities
- Supports 2-node clusters
- Can use serial, UDP bcast, mcast, or ucast communication
- Fails over on node failure
- Fails over on loss of IP connectivity
- Capability for failing over on loss of SAN connectivity
- Limited command-line administrative tools to fail over, query current status, etc.
- Active/Active or Active/Passive
- Simple resource group dependency model
- Requires an external tool for resource (service) monitoring
- SNMP monitoring
28. Linux-HA Release 2 Capabilities
- Built-in resource monitoring
- Support for the OCF resource standard
- Much larger clusters supported (8+ nodes)
- Sophisticated dependency model
- Rich constraint support (resources, groups, incarnations, master/slave)
- XML-based resource configuration
- Coming in 2.0.x (later in 2005):
- Configuration and monitoring GUI
- Support for the GFS cluster filesystem
- Multi-state (master/slave) resource support
- Monitoring of arbitrary external entities (temp, SAN, network)
29. Release 2 Credits
- Andrew Beekhof (SUSE): CRM, CIB
- Guochun Shi (NCSA): significant infrastructure improvements
- Sun Jiang Dong and Huang Zhen: LRM, stonithd, and testing
- Lars Marowsky-Bree (SUSE): architecture, leadership
- Alan Robertson: architecture, project leadership, original heartbeat code, testing, evangelism
30. Linux-HA Release 1 Architecture
31. Linux-HA Release 2 Architecture (add TE and PE)
32. Linux-HA Release 2 Architecture (more detail)
33. Resource Objects in Release 2
- Release 2 supports "resource objects", which can be any of the following:
- Primitive resources
- Resource groups
- Resource clones (n copies of a resource object)
- Multi-state (master/slave) resources
34. Classes of Resource Agents in R2 (resource primitives)
- OCF (Open Cluster Framework - http://opencf.org/)
- Take parameters as name/value pairs through the environment
- Can be monitored well by R2
- Heartbeat: R1-style heartbeat resources
- Take parameters as command-line arguments
- Can be monitored by a status action
- LSB: standard LSB init scripts
- Take no parameters
- Can be monitored by a status action
- Stonith: node reset capability
- Very similar to OCF resources
35. An OCF Primitive Object
  <primitive type="IPaddr" provider="heartbeat">
    <nvpair name="ip" value="192.168.224.5"/>
  </primitive>
- Attribute nvpairs are translated into environment variables
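The nvpair-to-environment translation can be illustrated with a short sketch. The XML fragment is a simplified, hypothetical version of the object on this slide (the real CIB syntax has more wrapping), and the OCF_RESKEY_ prefix is the convention OCF resource agents use for named parameters:

```python
import xml.etree.ElementTree as ET

# Simplified fragment modeled on the slide above.
FRAGMENT = """
<primitive class="ocf" type="IPaddr" provider="heartbeat">
  <nvpair name="ip" value="192.168.224.5"/>
</primitive>
"""

# Each nvpair becomes an OCF_RESKEY_<name> environment variable
# in the resource agent's environment.
env = {
    "OCF_RESKEY_" + nv.get("name"): nv.get("value")
    for nv in ET.fromstring(FRAGMENT).iter("nvpair")
}
print(env)  # {'OCF_RESKEY_ip': '192.168.224.5'}
```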
36. An LSB Primitive Resource Object (i.e., an init script)
37. A STONITH Primitive Resource
  <primitive type="ibmhmc" provider="heartbeat">
    <nvpair name="..." value="192.168.224.99"/>
  </primitive>
38. Resource Groups
- Resource Groups provide a shorthand for creating ordering and co-location dependencies
- Each resource object in the group is declared to have linear start-after ordering relationships
- Each resource object in the group is declared to have co-location dependencies on the others
- This is an easy way of converting release 1 resource groups to release 2
39. Resource Clones
- Resource Clones allow one to have a resource object which runs multiple (n) times on the cluster
- This is useful for managing:
- load-balancing clusters where you want n of them to be slave servers
- cluster filesystem mount points
- cluster alias IP addresses
- A cloned resource object can be a primitive or a group
40. Sample Clone XML
41. Multi-State (Master/Slave) Resources (coming in 2.0.3)
- Normal resources can be in one of two stable states:
- running
- stopped
- Multi-state resources can have more than two stable states. For example:
- running-as-master
- running-as-slave
- stopped
- This is ideal for modeling replication resources like DRBD
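The extra states can be pictured as a small state machine. The action names below are assumptions modeled on the master/slave states listed above, not the product's exact vocabulary:

```python
# Sketch of multi-state resource transitions: the usual start/stop pair
# is extended with promote/demote between the slave and master states.
TRANSITIONS = {
    ("stopped", "start"): "running-as-slave",
    ("running-as-slave", "promote"): "running-as-master",
    ("running-as-master", "demote"): "running-as-slave",
    ("running-as-slave", "stop"): "stopped",
}

state = "stopped"
for action in ["start", "promote", "demote", "stop"]:
    state = TRANSITIONS[(state, action)]
    print(action, "->", state)
```

For a DRBD-style resource, promote/demote corresponds to switching a node between replication master and slave without a full stop/start cycle.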
42. Basic Dependencies in Release 2
- Ordering dependencies
- start before (normally implies stop after)
- start after (normally implies stop before)
- Mandatory co-location dependencies
- must be co-located with
- cannot be co-located with
43. Resource Location Constraints
- Mandatory constraints
- Resource objects can be constrained to run on any selected subset of nodes. The default depends on the setting of symmetric_cluster.
- Preferential constraints
- Resource objects can also be preferentially constrained to run on specified nodes by providing weightings for arbitrary logical conditions
- The resource object is run on the node which has the highest weight (score)
44. Advanced Constraints
- Nodes can have arbitrary attributes associated with them in name=value form
- Attributes have types: int, string, version
- Constraint expressions can use these attributes as well as node names, etc. in largely arbitrary ways
- Operators:
- =, !=, <, >, <=, >=
- defined(attrname), undefined(attrname)
- colocated(resource id), not colocated(resource id)
45. Advanced Constraints (cont'd)
- Each constraint is associated with a particular resource, and is evaluated in the context of a particular node.
- A given constraint has a boolean predicate associated with it according to the expressions before, and is associated with a weight and condition. Weights can be constants or attribute values.
- If the predicate is true, then the condition is used to compute the weight associated with locating the given resource on the given node.
- Conditions are given weights, positive or negative. Additionally, there are special values for modeling must-have conditions:
- INFINITY
- -INFINITY
- The total score is the sum of all the applicable constraint weights
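The scoring rule above (sum the weights of all constraints whose predicate holds; use INFINITY for must-have conditions; place the resource on the highest-scoring node) can be sketched as follows. The constraint and node data are made-up examples:

```python
INFINITY = float("inf")

# Each constraint is a (predicate, weight) pair evaluated per node;
# the resource runs on the node with the highest total score.
def node_score(constraints, node):
    return sum(weight for predicate, weight in constraints if predicate(node))

constraints = [
    (lambda n: n["name"] == "node01", 100),   # prefer node01
    (lambda n: not n["fc_ok"], -INFINITY),    # must have a working FC link
]
nodes = [
    {"name": "node01", "fc_ok": False},  # preferred, but its FC link is down
    {"name": "node02", "fc_ok": True},
]
best = max(nodes, key=lambda n: node_score(constraints, n))
print(best["name"])  # node02
```

This also previews the fc_status example on the next slide: a -INFINITY (or large negative) weight overwhelms any preference score, so the resource moves away from the broken node.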
46. Sample Dynamic Attribute Use
- Attributes are arbitrary; they are only given meaning by rules
- You can assign them values from external programs
- For example:
- Create a rule which uses the attribute fc_status as its weight for some resource needing a Fibre Channel connection
- Write a script to set fc_status for a node to 0 if the FC connection is working, and -10000 if it is not
- Now those resources automatically move to a place where the FC connection is working if there is such a place; if not, they stay where they are.
47. rsc_location Information
- We prefer the webserver group to run on host node01:
  <rsc_location group="webserver" score="100" operation="eq" value="node01"/>
48. Request for Feedback
- Linux-HA Release 2 is a good, solid HA product
- At this point, human and experience factors will likely be more helpful than most technical doo-dads and refinements
- This audience knows more about that than probably any other similar audience in the world
- So, check out Linux-HA Release 2 and tell us...
- What we got right
- What needs improvement
- What we got wrong
- We are very responsive to comments
- We look forward to your critiques, brickbats, and other comments
49. DRBD: RAID1 over the LAN
- DRBD is a block-level replication technology
- Every time a block is written on the master side, it is copied over the LAN and written on the slave side
- Typically, a dedicated replication link is used
- It is extremely cost-effective; common with xSeries
- Worst case: around 10% throughput loss
- Recent versions have very fast full resync
51. Security Considerations
- Cluster: a computer whose backplane is the Internet
- If this isn't scary, you don't understand...
- You may think you have a secure cluster network
- You're probably mistaken now
- You will be in the future
52. Secure Networks Are Difficult Because...
- Security is not often well-understood by admins
- Security is well-understood by black hats
- Network security is easy to breach accidentally
- Users bypass it
- Hardware installers don't fully understand it
- Most security breaches come from trusted staff
- Staff turnover is often a big issue
- Virus/worm/P2P technologies will create new holes, especially for Windows machines
53. Security Advice
- Good HA software should be designed to assume insecure networks
- Not all HA software assumes insecure networks
- Good HA installation architects use dedicated (secure?) networks for intra-cluster HA communication
- Crossover cables are reasonably secure; all else is suspect ;-)
54. References
- http://linux-ha.org/
- http://linux-ha.org/Talks (these slides)
- http://linux-ha.org/download/
- http://linux-ha.org/SuccessStories
- http://linux-ha.org/Certifications
- http://linux-ha.org/BasicArchitecture
- http://linux-ha.org/NewHeartbeatDesign
- http://www.linux-mag.com/2003-11/availability_01.html
55. Legal Statements
- IBM is a trademark of International Business Machines Corporation.
- Linux is a registered trademark of Linus Torvalds.
- Other company, product, and service names may be trademarks or service marks of others.
- This work represents the views of the author and does not necessarily reflect the views of the IBM Corporation.