Linux-HA Release 2: An Overview
High-Availability Best Practices IV, October 2005

Transcript and Presenter's Notes
1
Linux-HA Release 2 An Overview
  • Alan Robertson
  • Project Leader, Linux-HA project
  • alanr@unix.sh
  • (a.k.a. alanr@us.ibm.com)
  • IBM Linux Technology Center

2
Agenda
  • High-Availability (HA) Clustering?
  • What is the Linux-HA project?
  • Linux-HA applications and customers
  • Linux-HA Release 1 / Release 2 feature
    comparison
  • Release 2 Details
  • Request for Feedback
  • DRBD: an important component
  • Thoughts about cluster security

3
What Is HA Clustering?
  • Putting together a group of computers which trust
    each other to provide a service even when system
    components fail
  • When one machine goes down, others take over its
    work
  • This involves IP address takeover, service
    takeover, etc.
  • New work comes to the remaining machines
  • Not primarily designed for high performance

4
High Availability Through Redundancy and
Monitoring
  • Redundancy eliminates Single Points Of Failure
    (SPOF)
  • Monitoring determines when things need to change
  • Reduces cost of planned and unplanned outages by
    reducing MTTR (Mean Time To Repair)

5
Failover and Restart
  • Monitoring detects failures (hardware, network,
    applications)
  • Automatic Recovery from failures (no human
    intervention)
  • Managed restart or failover to standby systems,
    components

6
What Can HA Clustering Do For You?
  • It cannot achieve 100% availability; nothing
    can
  • HA clustering is designed to recover from single
    faults
  • It can make your outages very short
  • From about a second to a few minutes
  • It is like a Magician's (Illusionist's) trick
  • When it goes well, the hand is faster than the
    eye
  • When it goes not-so-well, it can be reasonably
    visible
  • A good HA clustering system adds a 9 to your
    base availability
  • 99% becomes 99.9%, 99.9% becomes 99.99%,
    99.99% becomes 99.999%, etc.

7
Lies, Damn Lies, and Statistics
  • Counting nines: downtime allowed per year
  • 99%     about 3.7 days/year
  • 99.9%   about 8.8 hours/year
  • 99.99%  about 53 minutes/year
  • 99.999% about 5.3 minutes/year

8
The Desire for HA systems
  • Who wants low-availability systems?
  • Why are so few systems High-Availability?

9
Why isn't everything HA?
  • Cost
  • Complexity

10
Complexity
  • Complexity is the Enemy of Reliability

11
(No Transcript)
12
Commodity HA?
  • Installations with more than 200 Linux-HA pairs
  • Autostrada Italy
  • Italian Bingo Authority
  • Oxfordshire School System
  • Many retailers (through IRES and others)
  • Karstadt's
  • Circuit City
  • etc.
  • Also a component in commercial routers,
    firewalls, security hardware

13
The HA Continuum
  • Single node HA system (monitoring w/o redundancy)
  • Provides for application monitoring and restart
  • Easy, near-zero-cost entry point: the HA system
    starts init scripts instead of /etc/init.d/rc (or
    equivalent)
  • Addresses a Solaris / Linux functional gap
  • Multiple Virtual Machines on a Single Physical
    machine
  • Adds OS crash protection, rolling upgrades of OS
    and application; good for security fixes, etc.
  • Many possibilities for interactions with virtual
    machines exist
  • Multiple Physical Machines (normal cluster)
  • Adds protection against hardware failures
  • Split-Site (stretch) Clusters
  • Adds protection against site-wide failures
    (power, air-conditioning, flood, fire)

14
How Does HA work?
  • Manage redundancy to improve service availability
  • Like a cluster-wide-super-init with monitoring
  • Even complex services are now respawned:
  • on node (computer) death
  • on impairment of nodes
  • on loss of connectivity
  • for services that aren't working (not necessarily
    stopped)
  • managing potentially complex dependency
    relationships

15
Single Points of Failure (SPOFs)
  • A single point of failure is a component whose
    failure will cause near-immediate failure of an
    entire system or service
  • Good HA design adds redundancy to eliminate
    single points of failure
  • Non-Obvious SPOFs can require deep expertise to
    spot

16
The Three R's of High-Availability
  • Redundancy
  • Redundancy
  • Redundancy
  • If this sounds redundant, that's probably
    appropriate...
  • Most SPOFs are eliminated by redundancy
  • HA Clustering is a good way of providing and
    managing redundancy

17
Redundant Communications
  • Intra-cluster communication is critical to HA
    system operation
  • Most HA clustering systems provide mechanisms for
    redundant internal communication for heartbeats,
    etc.
  • External communication is usually essential to
    the provision of service
  • External communication redundancy is usually
    accomplished through routing tricks
  • Having an expert in BGP or OSPF routing is a help

18
Fencing
  • Guarantees resource integrity in the case of
    certain difficult cases (split-brain)
  • Four Common Methods
  • Fibre Channel switch lockouts
  • SCSI Reserve/Release (painful to make reliable)
  • Self-Fencing (like IBM ServeRAID)
  • STONITH (Shoot The Other Node In The Head)
  • Linux-HA has native support for the last two

19
Redundant Data Access
  • Replicated
  • Copies of data are kept updated on more than one
    computer in the cluster
  • Shared
  • Typically Fiber Channel Disk (SAN)
  • Sometimes shared SCSI
  • Back-end Storage (Somebody Else's Problem)
  • NFS, SMB
  • Back-end database
  • All are supported by Linux-HA

20
Data Sharing: Replication
  • Some applications provide their own replication
  • DNS, DHCP, LDAP, DB2, etc.
  • Linux has excellent disk replication methods
    available
  • DRBD is my favorite
  • DRBD-based HA clusters are shockingly cheap
  • Some environments can live with less precise
    replication methods: rsync, etc.
  • Generally does not support parallel access
  • Fencing usually required
  • EXTREMELY cost effective

21
Data Sharing: ServeRAID et al.
  • IBM ServeRAID SCSI controller is self-fencing
  • This helps integrity in failover environments
  • This makes cluster filesystems, etc. impossible
  • No Oracle RAC, no GPFS, etc.
  • ServeRAID failover requires a script to perform
    volume handover
  • Linux-HA provides such a script in open source
  • Linux-HA is ServerProven with ServeRAID

22
Data Sharing: Shared Disk
  • The most classic data sharing mechanism,
    commonly Fibre Channel
  • Allows for failover mode
  • Allows for true parallel access
  • Oracle RAC, Cluster filesystems, etc.
  • Fencing always required with Shared Disk

23
Data Sharing: Back-End
  • Network Attached Storage can act as a data
    sharing method
  • Existing Back End databases can also act as a
    data sharing mechanism
  • Both make reliable and redundant data sharing
    Somebody Else's Problem (SEP).
  • If they did a good job, you can benefit from
    them.
  • Beware SPOFs in your local network

24
The Linux-HA Project
  • Linux-HA is the oldest high-availability project
    for Linux, with the largest associated community
  • Linux-HA is the OSS portion of IBM's HA strategy
    for Linux
  • Linux-HA is the best-tested Open Source HA
    product
  • The Linux-HA package is called Heartbeat (though
    it does much more than heartbeat)
  • Linux-HA has been in production since 1999, and
    is currently in use on more than ten thousand
    sites
  • Linux-HA also runs on FreeBSD and Solaris, and is
    being ported to OpenBSD and others
  • Linux-HA shipped with every major Linux
    distribution except one.
  • Release 2 shipped at the end of July; more than
    6000 downloads since then

25
Linux-HA Release 1 Applications
  • Database Servers (DB2, Oracle, MySQL, others)
  • Load Balancers
  • Web Servers
  • Custom Applications
  • Firewalls
  • Retail Point of Sale Solutions
  • Authentication
  • File Servers
  • Proxy Servers
  • Medical Imaging
  • Almost any type of server application you can
    think of, except SAP

26
Linux-HA customers
  • FedEx: truck location tracking
  • BBC: Internet infrastructure
  • Oxfordshire Schools: universal servers, an HA
    pair in every school
  • The Weather Channel (weather.com)
  • Sony (manufacturing)
  • ISO New England: manages the power grid using 25
    Linux-HA clusters
  • MAN Nutzfahrzeuge AG: the truck manufacturing
    division of MAN AG
  • Karstadt and Circuit City: use Linux-HA and
    databases in several hundred stores each
  • City Savings Bank in Munich (infrastructure)
  • Bavarian Radio Station (Munich): coverage of the
    2002 Olympics in Salt Lake City
  • Emageon: medical imaging services
  • Incredimail: bases their mail service on Linux-HA
    on IBM hardware
  • University of Toledo (US): 20,000-student
    Computer-Aided Instruction system

27
Linux-HA Release 1 capabilities
  • Supports 2-node clusters
  • Can use serial, UDP bcast, mcast, ucast
    communication
  • Fails over on node failure
  • Fails over on loss of IP connectivity
  • Capability for failing over on loss of SAN
    connectivity
  • Limited command line administrative tools to fail
    over, query current status, etc.
  • Active/Active or Active/Passive
  • Simple resource group dependency model
  • Requires external tool for resource (service)
    monitoring
  • SNMP monitoring

28
Linux-HA Release 2 capabilities
  • Built-in resource monitoring
  • Support for the OCF resource standard
  • Much larger clusters supported (> 8 nodes)
  • Sophisticated dependency model
  • Rich constraint support (resources, groups,
    incarnations, master/slave)
  • XML-based resource configuration
  • Coming in 2.0.x (later in 2005)
  • Configuration and monitoring GUI
  • Support for GFS cluster filesystem
  • Multi-state (master/slave) resource support
  • Monitoring of arbitrary external entities (temp,
    SAN, network)

29
Release 2 Credits
  • Andrew Beekhof (SUSE): CRM, CIB
  • Gouchun Shi (NCSA): significant infrastructure
    improvements
  • Sun Jiang Dong and Huang Zhen: LRM, stonithd,
    and testing
  • Lars Marowsky-Bree (SUSE): architecture,
    leadership
  • Alan Robertson: architecture, project
    leadership, original heartbeat code, testing,
    evangelism

30
Linux-HA Release 1 Architecture
31
Linux-HA Release 2 Architecture (add TE and PE)

32
Linux-HA Release 2 Architecture (more detail)
33
Resource Objects in Release 2
  • Release 2 supports resource objects, which can
    be any of the following:
  • Primitive Resources
  • Resource Groups
  • Resource Clones: n copies of a resource object
  • Multi-state (master/slave) resources

34
Classes of Resource Agents in R2 (resource
primitives)
  • OCF: Open Cluster Framework - http://opencf.org/
  • Take parameters as name/value pairs through the
    environment
  • Can be monitored well by R2
  • Heartbeat: R1-style heartbeat resources
  • Take parameters as command line arguments
  • Can be monitored by a status action
  • LSB: standard LSB init scripts
  • Take no parameters
  • Can be monitored by a status action
  • Stonith: node reset capability
  • Very similar to OCF resources

35
An OCF primitive object
<primitive class="ocf" type="IPaddr" provider="heartbeat">
  <instance_attributes><attributes>
    <nvpair name="ip" value="192.168.224.5"/>
  </attributes></instance_attributes>
</primitive>

Attribute nvpairs are translated into
environment variables (e.g., ip becomes OCF_RESKEY_ip)

36
An LSB primitive resource object (i.e., an init
script)

<primitive class="lsb" type="smb"/>


37
A STONITH primitive resource
<primitive class="stonith" type="ibmhmc" provider="heartbeat">
  <instance_attributes><attributes>
    <nvpair name="..." value="192.168.224.99"/>
  </attributes></instance_attributes>
</primitive>


38
Resource Groups
  • Resource Groups provide a shorthand for creating
    ordering and co-location dependencies
  • Each resource object in the group is declared to
    have a linear start-after ordering relationship
    with the one before it
  • The resource objects in the group are declared to
    have co-location dependencies on each other
  • This is an easy way of converting release 1
    resource groups to release 2
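
As an illustration, a minimal sketch of what a Release 2 group can
look like, reusing the IPaddr primitive from the OCF example above
together with an LSB init script; the id values and the choice of an
apache init script are illustrative assumptions, not from the
original slides:

<group id="webserver">
  <!-- members are started in order and placed on the same node -->
  <primitive id="webserver_ip" class="ocf" type="IPaddr" provider="heartbeat">
    <instance_attributes><attributes>
      <nvpair name="ip" value="192.168.224.5"/>
    </attributes></instance_attributes>
  </primitive>
  <!-- the init script is started after the address is up -->
  <primitive id="webserver_httpd" class="lsb" type="apache"/>
</group>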


39
Resource Clones
  • Resource Clones allow one to have a resource
    object which runs multiple (n) times on the
    cluster
  • This is useful for managing
  • load balancing clusters where you want n of
    them to be slave servers
  • Cluster filesystem mount points
  • Cluster Alias IP addresses
  • Cloned resource object can be a primitive or a
    group

40
Sample clone XML
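
A rough sketch of what a Release 2 clone definition can look like;
the clone_max / clone_node_max instance attributes, the id values,
and the cloned IP-address primitive are assumptions for illustration:

<clone id="cluster_alias">
  <instance_attributes><attributes>
    <!-- run two copies, at most one per node -->
    <nvpair name="clone_max" value="2"/>
    <nvpair name="clone_node_max" value="1"/>
  </attributes></instance_attributes>
  <!-- the cloned resource object: here a primitive, but it could be a group -->
  <primitive id="cluster_alias_child" class="ocf" type="IPaddr" provider="heartbeat">
    <instance_attributes><attributes>
      <nvpair name="ip" value="192.168.224.10"/>
    </attributes></instance_attributes>
  </primitive>
</clone>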


41
Multi-State (master/slave) Resources (coming in
2.0.3)
  • Normal resources can be in one of two stable
    states
  • running
  • stopped
  • Multi-state resources can have more than two
    stable states. For example
  • running-as-master
  • running-as-slave
  • stopped
  • This is ideal for modeling replication resources
    like DRBD

42
Basic Dependencies in Release 2
  • Ordering Dependencies
  • start before (normally implies stop after)
  • start after (normally implies stop before)
  • Mandatory Co-location Dependencies
  • must be co-located with
  • cannot be co-located with
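
As an illustration, a sketch of how such dependencies can be written
as Release 2 constraints; the rsc_order / rsc_colocation element and
attribute names and the resource ids are assumptions for
illustration, with an INFINITY score expressing a mandatory
co-location:

<!-- start the web server only after its IP address is up -->
<rsc_order id="ip_before_httpd" from="webserver_httpd" type="after" to="webserver_ip"/>
<!-- keep the web server on the same node as its IP address -->
<rsc_colocation id="httpd_with_ip" from="webserver_httpd" to="webserver_ip" score="INFINITY"/>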

43
Resource Location Constraints
  • Mandatory Constraints
  • Resource Objects can be constrained to run on any
    selected subset of nodes. Default depends on
    setting of symmetric_cluster.
  • Preferential Constraints
  • Resource Objects can also be preferentially
    constrained to run on specified nodes by
    providing weightings for arbitrary logical
    conditions
  • The resource object is run on the node which has
    the highest weight (score)

44
Advanced Constraints
  • Nodes can have arbitrary attributes associated
    with them in name=value form
  • Attributes have types: int, string, version
  • Constraint expressions can use these attributes
    as well as node names, etc. in largely arbitrary
    ways
  • Operators
  • =, !=, <, >
  • defined(attrname), undefined(attrname)
  • colocated(resource id), not colocated(resource id)
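
A sketch of how one such expression might look in the CIB, following
the rsc_location example later in this deck; the cpu_speed attribute,
the ids, and the spelling of the operation keyword (gte) are
assumptions for illustration:

<rsc_location id="prefer_fast_nodes" group="webserver">
  <!-- add 100 to the score of any node whose cpu_speed attribute is at least 3000 -->
  <rule id="prefer_fast_nodes_rule" score="100">
    <expression attribute="cpu_speed" operation="gte" value="3000"/>
  </rule>
</rsc_location>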

45
Advanced Constraints (cont'd)
  • Each constraint is associated with particular
    resource, and is evaluated in the context of a
    particular node.
  • A given constraint has a boolean predicate
    associated with it, built from the expressions
    described before, and is associated with a
    weight and a condition. Weights can be constants
    or attribute values.
  • If the predicate is true, then the condition is
    used to compute the weight associated with
    locating the given resource on the given node.
  • Conditions are given weights, positive or
    negative. Additionally, there are special values
    for modeling must-have conditions:
  • INFINITY
  • -INFINITY
  • The total score is the sum of all the applicable
    constraint weights

46
Sample Dynamic Attribute Use
  • Attributes are arbitrary; they are only given
    meaning by rules
  • You can assign them values from external programs
  • For example:
  • Create a rule which uses the attribute fc_status
    as its weight for some resource needing a Fibre
    Channel connection
  • Write a script that sets fc_status for a node to
    0 if the FC connection is working, and to -10000
    if it is not
  • Now those resources automatically move to a
    place where the FC connection is working if
    there is such a place; if not, they stay where
    they are (sketched below)
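
A rough sketch of the rule half of this example; the rsc_location
element, the oracle_db resource id, and the use of score_attribute to
take the rule's weight from the node's fc_status attribute are
assumptions for illustration:

<rsc_location id="fc_status_weight" rsc="oracle_db">
  <!-- the rule's score comes from the node's fc_status attribute:
       0 where the FC connection works, -10000 where it does not -->
  <rule id="fc_status_weight_rule" score_attribute="fc_status">
    <expression attribute="fc_status" operation="defined"/>
  </rule>
</rsc_location>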

47
rsc_location information
  • We prefer the webserver group to run on host
    node01
<rsc_location group="webserver" score="100" operation="eq" value="node01"/>


48
Request for Feedback
  • Linux-HA Release 2 is a good solid HA product
  • At this point, human and experience factors will
    likely be more helpful than most technical
    doo-dads and refinements
  • This audience knows more about that than probably
    any other similar audience in the world
  • So, check out Linux-HA release 2 and tell us...
  • What we got right
  • What needs improvement
  • What we got wrong
  • We are very responsive to comments
  • We look forward to your critiques, brickbats, and
    other comments

49
DRBD: RAID1 over the LAN
  • DRBD is a block-level replication technology
  • Every time a block is written on the master side,
    it is copied over the LAN and written on the
    slave side
  • Typically, a dedicated replication link is used
  • It is extremely cost-effective; common with
    xSeries
  • Worst case: around 10% throughput loss
  • Recent versions have very fast full resync

50
(No Transcript)
51
Security Considerations
  • Cluster: a computer whose backplane is the
    Internet
  • If this isn't scary, you don't understand...
  • You may think you have a secure cluster network
  • You're probably mistaken now
  • You will be in the future

52
Secure Networks are Difficult Because...
  • Security is not often well-understood by admins
  • Security is well-understood by black hats
  • Network security is easy to breach accidentally
  • Users bypass it
  • Hardware installers don't fully understand it
  • Most security breaches come from trusted staff
  • Staff turnover is often a big issue
  • Virus/worm/P2P technologies will create new
    holes, especially for Windows machines

53
Security Advice
  • Good HA software should be designed to assume
    insecure networks
  • Not all HA software assumes insecure networks
  • Good HA installation architects use dedicated
    (secure?) networks for intra-cluster HA
    communication
  • Crossover cables are reasonably secure; all else
    is suspect ;-)

54
References
  • http://linux-ha.org/
  • http://linux-ha.org/Talks (these slides)
  • http://linux-ha.org/download/
  • http://linux-ha.org/SuccessStories
  • http://linux-ha.org/Certifications
  • http://linux-ha.org/BasicArchitecture
  • http://linux-ha.org/NewHeartbeatDesign
  • www.linux-mag.com/2003-11/availability_01.html

55
Legal Statements
  • IBM is a trademark of International Business
    Machines Corporation.
  • Linux is a registered trademark of Linus
    Torvalds.
  • Other company, product, and service names may be
    trademarks or service marks of others.
  • This work represents the views of the author and
    does not necessarily reflect the views of the IBM
    Corporation.