Title: An Introduction to Computer Systems at EBI
1An Introduction to Computer Systems at EBI
John Livingstone
2Topics
- What does the Systems Group do?
- The growth in demand
- What servers are available
- Desktop Computers
- Data storage
- Backups
- The EBI network
- Questions
3What does the Systems Group provide?
- What is Bioinformatics?
- Application of information technology to the
storage, management and analysis of biological
information.
- The Systems Group provides
- Servers
- Desktop Computers
- Data storage
- Backups
- Most importantly, we manage and continuously
develop our systems to meet our users' needs.
4Growth - CPU Cores
- December 2003: 110 CPU cores on Unix servers
(Tru64 and Solaris) and 340 CPUs on Linux
- December 2005: 160 CPU cores on Unix servers
(Tru64 and Solaris) and 700 CPUs on Linux
- December 2006: 600 CPU cores on Unix servers
(Tru64 and Solaris) and 1800 CPUs on Linux
- NOW: 615 CPU cores on Unix servers (Tru64 and
Solaris) and 2062 CPUs on Linux
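The totals and growth factors implied by these figures can be worked out with a few lines of Python. This is only a minimal sketch: the numbers are the ones listed above (Linux CPUs are treated as cores for the purpose of adding them up), and the names CORE_COUNTS, total_cores and growth_factor are illustrative, not part of any EBI tool.

```python
# Counts taken from the figures above: (Unix cores, Linux CPUs).
# Assumption: Linux "cpus" are counted as cores so the two can be summed.
CORE_COUNTS = {
    "December 2003": (110, 340),
    "December 2005": (160, 700),
    "December 2006": (600, 1800),
    "NOW": (615, 2062),
}


def total_cores(label):
    """Return Unix plus Linux cores for one snapshot."""
    unix, linux = CORE_COUNTS[label]
    return unix + linux


def growth_factor(start, end):
    """Return how many times larger the total is at `end` than at `start`."""
    return total_cores(end) / total_cores(start)


if __name__ == "__main__":
    for label in CORE_COUNTS:
        print(f"{label}: {total_cores(label)} cores in total")
    factor = growth_factor("December 2003", "December 2006")
    print(f"Growth from December 2003 to December 2006: {factor:.1f}x")
```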
5Growth - storage
- December 2003: 30 TB of storage
- December 2005: 100 TB of storage
- December 2006: 207 TB of storage
- NOW: 270 TB of storage
Storage summary December 2006
6Growth - People
- December 2003: 250 users
- December 2005: 300 users
- December 2006: 350 users and rising
7Growth - Challenges
- Managing large numbers of servers requires
different management methods: you can load 10
computers using CDs, but not 2000
- More storage means more disk failures, and
backups also become a real headache
- Combine more users with more computers and more
storage and you need a faster network to cope
with much more traffic
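The point about disk failures is simple arithmetic: for a fixed per-disk failure rate, the expected number of failures grows in proportion to the number of disks. The sketch below assumes a hypothetical annual failure rate and hypothetical disk counts; neither figure comes from EBI, and the function name is illustrative.

```python
def expected_failures_per_year(disk_count, annual_failure_rate):
    """Expected number of disk failures in one year.

    Assumes every disk fails independently with the same probability.
    """
    if not 0.0 <= annual_failure_rate <= 1.0:
        raise ValueError("annual_failure_rate must be between 0 and 1")
    return disk_count * annual_failure_rate


if __name__ == "__main__":
    ASSUMED_RATE = 0.03  # hypothetical 3% annual failure rate
    for disks in (100, 1000):  # hypothetical disk counts
        failures = expected_failures_per_year(disks, ASSUMED_RATE)
        print(f"{disks} disks: about {failures:.0f} failures expected per year")
```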
8What servers are available?
- There are about 615 CPU cores (Tru64 and
Solaris), plus 2026 CPU cores (Linux) at EBI
- EBI servers are split into login servers and
compute servers and consist mainly of Alpha
clusters, Sun servers, and Linux farms
- Solaris, Windows, Linux and Alpha servers are
available for interactive login
A row of servers in one of our machine rooms; we
manage 12 similar rows in total (80 19-inch racks)
9EBI Linux Farms
Why move towards PC farms?
- PC hardware is produced in large volumes, pushing
the price down
- Good availability of software on Linux
- Good fit for tasks that can be divided into
smaller subtasks; bioinformatics applications
often fall into this category
- Good scalability: buy more machines to get more
compute power
- BUT lots of CPUs means high power, space and
cooling requirements!
- Which is why we have moved towards 64-bit servers
with better management, much lower power
consumption, and much more computing power and
memory
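The "divide into smaller subtasks" pattern can be sketched in Python as follows. This is an illustration under stated assumptions, not EBI code: the GC-counting subtask, the function names and the chunk size are all hypothetical, and a local thread pool stands in for farm nodes. On a real farm each chunk would typically be submitted as a separate batch job.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, chunk_size):
    """Split a list into consecutive chunks of at most chunk_size items."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def count_gc(sequences):
    """Subtask: count G and C bases in one chunk of sequences."""
    return sum(
        seq.upper().count("G") + seq.upper().count("C") for seq in sequences
    )


def total_gc(sequences, chunk_size=2, workers=4):
    """Run one subtask per chunk and combine the partial results."""
    chunks = split_into_chunks(sequences, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(count_gc, chunks))


if __name__ == "__main__":
    print(total_gc(["GATTACA", "CCGG", "atgc", "TTTT", "GC"]))
```

Because each chunk is processed independently, adding more workers (or more machines) shortens the run without changing the result.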
10Example of the EBI Linux Farms
11How do we manage servers?
- Failover Systems
- Tru64 clusters on Compaq Alphas
- Linux farm nodes arranged in failover pairs for
critical web applications
- Failover Systems Administrators
- Load Sharing Facility (LSF)
- Front end to most servers and farms
- Common images
- ABACuS in-house kickstart management for Linux
servers
- Remote consoles
- Remote links to all our server rooms so we can
manage from office or home
- Automate tasks
- Scripts and web interfaces for everything from
adding new users to sending new software to
machines
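The idea behind a failover pair can be sketched as a small selection function: serve from the primary while it is healthy, otherwise fall back to the standby. This is a minimal sketch and not a description of how EBI's failover is implemented; the function name, the health-check callable and the host names are hypothetical.

```python
def choose_active(pair, is_healthy):
    """Return the node that should serve traffic for a failover pair.

    pair       -- (primary, standby) host names
    is_healthy -- callable taking a host name and returning True or False
    """
    primary, standby = pair
    if is_healthy(primary):
        return primary
    if is_healthy(standby):
        return standby
    raise RuntimeError(f"both nodes in pair {pair} are down")


if __name__ == "__main__":
    # Hypothetical host names and a hard-coded health table for illustration.
    health = {"web-node-a": False, "web-node-b": True}
    print(choose_active(("web-node-a", "web-node-b"), lambda host: health[host]))
```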
12Desktop Computers
Management
- Systems managed Linux PCs
- New image via ABACuS
- Updates fully controlled by Systems
- Ensures stability and commonality
- Systems managed Windows PCs
- Image applied when new
- Updates pushed from Systems via Active Directory
- New software loaded from Active Directory
- SunRay
- Thin Client providing OpenOffice, web browsing
etc.
- Smart Card login: open your desktop on any
SunRay in the building
13Data Storage
- Mix of local disk storage, Storage Area Network
(SAN) and Network Attached Storage (NAS)
- Each one is used where it is most applicable
- SAN where fast writing and access is required to
storage used by a small number of similar
computers
- NAS where slightly slower speeds are acceptable,
but access is needed across a number of different
types of computers
14Storage Area Networks (SANs)
Diagram labels: Disk array, Switches, Servers
15EBI NAS configuration
Diagram labels: Linux Farms, Sun Servers, Alpha Servers, Linux Desktops, Windows Desktops, NetCaches, Gigabit Switch, NAS Storage
16Data Storage Virtualisation
- Acopia header between storage and servers
- Makes storage from different sources appear as
one storage volume
- We can increase size of storage by combining
storage from different sources
- We can move storage from one source to another
on the fly
- We can swap data from fast access to slower
access storage as it gets older
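The last point, moving data to slower storage as it ages, amounts to a simple age-based rule. The sketch below is an assumption-laden illustration: the 90-day threshold, the tier names and the function name are hypothetical and do not describe how the Acopia device is configured.

```python
from datetime import date, timedelta


def choose_tier(last_accessed, today, max_fast_age=timedelta(days=90)):
    """Return "fast" for recently used data and "slow" for older data.

    max_fast_age is a hypothetical threshold chosen for illustration.
    """
    age = today - last_accessed
    return "fast" if age <= max_fast_age else "slow"


if __name__ == "__main__":
    today = date(2007, 6, 1)
    print(choose_tier(date(2007, 5, 20), today))
    print(choose_tier(date(2006, 1, 15), today))
```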
17Backups
- There is a full backup of each server file system
every four weeks to tape
- Incremental backups are taken daily to ADIC disk
- Oracle database backups are done in collaboration
with the Database Administrators
- Tapes are stored in a backup robot, in fireproof
stores, and off-site
- Every day a script runs random recoveries to
verify the system is working
- Disaster Recovery Plan
- PCs are not backed up: important data should be
saved on Home or Project directories
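The daily "random recoveries" check could look roughly like the sketch below: pick a random sample of files, restore each one, and compare checksums against the original. This is not the EBI script. The restore callable, the in-memory dictionary of originals and all names are hypothetical stand-ins, and a real check would also have to allow for files that changed after the backup was taken.

```python
import hashlib
import random


def sha256(data):
    """Return the hex SHA-256 digest of a bytes object."""
    return hashlib.sha256(data).hexdigest()


def verify_random_recoveries(originals, restore, sample_size=3, rng=None):
    """Restore a random sample of files and compare checksums.

    originals -- dict mapping path to the expected file contents (bytes)
    restore   -- callable taking a path and returning the restored bytes
    Returns the list of sampled paths whose restored copy does not match.
    """
    rng = rng or random.Random()
    paths = sorted(originals)
    sample = rng.sample(paths, min(sample_size, len(paths)))
    return [p for p in sample if sha256(restore(p)) != sha256(originals[p])]


if __name__ == "__main__":
    files = {"/home/a.txt": b"alpha", "/home/b.txt": b"beta"}
    failed = verify_random_recoveries(files, restore=lambda path: files[path])
    print("mismatches:", failed)
```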
18Backup architecture
19EBI Network
20EBI Network
Detailed view of connections and connection speeds
- Redundant internet connections via
Cambridge/London
- Separate subnets for each functional group
Diagram labels: JANET (UK Education and Research Network), Internet, London, Cambridge, EBI router, EBI firewall (also a router!), Site router, Desktop Network, Server Networks, WWW redirector, www servers, www Network
Link speeds shown on the diagram: 34, 100, 100/1000 and 1000
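Giving each functional group its own subnet can be sketched with Python's standard ipaddress module. The address range and prefix length below are hypothetical (a private range chosen purely for illustration), the group names are loosely based on the diagram labels, and the function name is illustrative.

```python
import ipaddress


def allocate_subnets(network, groups, new_prefix=24):
    """Give each functional group its own subnet carved out of `network`."""
    subnets = ipaddress.ip_network(network).subnets(new_prefix=new_prefix)
    allocation = {}
    for group in groups:
        try:
            allocation[group] = next(subnets)
        except StopIteration:
            raise ValueError("not enough subnets for all groups") from None
    return allocation


if __name__ == "__main__":
    # Hypothetical address range, not EBI's real addressing plan.
    plan = allocate_subnets("10.0.0.0/16", ["desktops", "servers", "www"])
    for group, subnet in plan.items():
        print(f"{group}: {subnet}")
```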
21Summary: Systems Management at EBI
- Automation, Automation, Automation
- Restrict hardware/OS choice where possible
- Remote control
- Redundancy
- Test to destruction
- Continually investigate new products and question
what we do
- Evenings and Weekends free
22Any Questions?
EBI Systems Group 2007