Title: E-Research Infrastructure?
1 E-Research Infrastructure?
- Markus.Buchhorn_at_anu.edu.au
- Head, ANU Internet Futures
- Grid Services Coordinator, GrangeNet
- Leader, APAC Information Infrastructure Program
- (PhD Mt Stromlo 1988-1992)
2 A gentle (and fast!) overview
- Themes
- What does e-Research mean?
- What kind of infrastructure is involved?
- How is it being developed?
- What are the problems?
3 e-Research infrastructure
- The use of IT to enhance research
- and education!
- Access resources transparently
- Make data readily available
- Make collaboration easier
- Is it The Grid?
- No, and yes: the Grid is a tool in the kit
- Who funds it? The Govt, when building for a large community: NCRIS (SIIMNRF), ARC, eResearch-CoordCtee
4 ANU Internet Futures
- A cross-discipline, cross-campus applied research group
- e-Research infrastructure development
- Objectives
- To investigate and deploy advanced Internet-based technologies that support university research and education missions.
- Bring research-edge technologies into production use
- Engage with APAC, GrangeNet, ARIIC/SII, Internet2, APAN, TERENA, ...
- A strong focus on User Communities
- Identify common requirements
5 What does Grid mean?
- Analogy with the power grid
- A standard service (AC, 240V, 50Hz)
- A standard connection
- A standard user interface
- Users do not care about
- Various generation schemes
- Deregulated market
- Power auctions
- Synchronised generators
- Transmission switching, fail-over systems
- Accounting and Billing
6 What does Grid mean in IT?
- Transparent use of resources
- Distributed, and networked
- Multiple administrative domains
- Other people's resources become available to you
- Various IT resources
- Computing, Data, Visualisation, Collaboration, etc.
- Hide complexity
- It should be a black box: one just plugs in.
7 What are the bits in eRI?
Applications and Users
Grid, Middleware Services Layer
(Advanced) Communications Services Layer
Network Layer (Physical and Transmission)
8 What's in that middle bit?
Applications and Users
Computing
Data
Middle-ware
Collaboration
Instruments
Visualisation
(Advanced) Communications Services Layer
9 Networks
- Physical networks are fundamental to link researchers, observational facilities, IT facilities
- Demand for high (and flexible) bandwidth to every astronomical site
- Universities, observatories, other research sites/groups
- GrangeNet, AARNet3, AREN; big-city focus
- Today remote sites have wet bits of string, and station wagons
- At least 1-10 Gigabit links soon-ish (SSO, ATCA, Parkes, MSO)
- Getting 10-20 Gigabits internationally right now
- including to the top of Mauna Kea in the next year or so
- Canada, US, NL are building/running some 40Gb/s today
- e-VLBI, larger detectors, remote control, multi-site collaboration, real-time data analysis/comparisons, ...
- Burst needs, as well as sustained
- Wavelength Division Multiplexing (WDM) allows for a lot more bandwidth (80? at 80Gb/s)
10 Common Needs - Middleware
- Functionality needed by all the eRI areas
- Minimise replication of services
- Provide a standard set of interfaces
- To applications/users
- To network layer
- To grid services
- Can be built independently of other areas
- A lot of politics, policy issues enter here
11 Common Needs - Middleware - 2
- Authentication
- Something you have, something you know
- Somebody vouches for you
- Certificate Authorities, Shibboleth, ...
- Authorisation
- Granularity of permission (resolution, slices, ...)
- Limits of permission (time, cycles, storage, ...)
- Accounting
- Billing, feedback to authorisation
Collectively called AAA
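The three A's chain together in a fixed order: authenticate the credential, authorise the request against its limits, then account for the usage. A minimal Python sketch of that chain, with hypothetical registry contents, identity, and quota (none of these names come from the talk):

```python
import time

# Hypothetical stand-in for a CA/Shibboleth lookup: credential -> identity.
CREDENTIALS = {"cert-1234": "mbuchhorn"}

# Hypothetical authorisation table: limits of permission per identity.
PERMISSIONS = {"mbuchhorn": {"resource": "tape-archive", "max_gb": 100}}

USAGE_LOG = []  # accounting records: billing, and feedback to authorisation

def authenticate(credential):
    """Something you have, vouched for by somebody: map credential to identity."""
    return CREDENTIALS.get(credential)

def authorise(identity, resource, gb_requested):
    """Check both the granularity and the limits of permission."""
    grant = PERMISSIONS.get(identity)
    return (grant is not None
            and grant["resource"] == resource
            and gb_requested <= grant["max_gb"])

def account(identity, resource, gb_used):
    """Record usage so billing and future authorisation can see it."""
    USAGE_LOG.append({"who": identity, "what": resource,
                      "gb": gb_used, "when": time.time()})

# Usage: one request flows through all three stages.
who = authenticate("cert-1234")
if who and authorise(who, "tape-archive", 50):
    account(who, "tape-archive", 50)
```

The accounting log is the piece that closes the loop: it is what "billing, feedback to authorisation" consumes.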
12 Common Needs - Middleware - 3
- Security
- Encryption, PKI, ...
- AAA, Non-repudiation
- Firewalls and protocol hurdles (NATs, proxies, ...)
- Resource discovery
- Finding stuff on the Net
- Search engines, portals, registries, p2p mesh, ...
- Capability negotiation
- Can you do what I want, when I want?
- Network and application signalling
- Tell the network what services we need (QoS, RSVP, MPLS, ...)
- Tell the application what the situation is
- And listen for feedback and deal with it.
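The two-way signalling idea (ask the network for a service level, then adapt to whatever it actually grants) can be sketched in a few lines. The function names and bandwidth figures below are illustrative only, not a real signalling API:

```python
def network_offer(requested_mbps, available_mbps=622):
    """The network answers a QoS-style request with what it can really do."""
    return min(requested_mbps, available_mbps)

def choose_rendering(granted_mbps):
    """The application listens to the feedback and deals with it."""
    if granted_mbps >= 1000:
        return "uncompressed video"
    elif granted_mbps >= 100:
        return "compressed video"
    return "still frames"

# Usage: ask for a gigabit, degrade gracefully to what is granted.
granted = network_offer(1000)
mode = choose_rendering(granted)
```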
13 The Computational Grid
- Presume Middleware issues are solved
- Probably the main Grid activity
- Architectural Issues
- CPUs, endian-ness, executable format, libraries, non-uniform networking, Clusters vs SMP, NUMA
- Code design
- Master/Slave, P2P; Granularity (fine-grained parallelism vs (coarse) parameter sweep)
- Scheduling
- Multiple owners; Queuing systems; Economics (how to select computational resources, and prioritise)
- During execution
- Job Monitoring and Steering; Access to resources (code, data, storage, ...)
- But if we solve all these:
- Seamless access to computing resources across the planet.
- Harness the power of supercomputers, large-to-small clusters, and corporate/campus desktops (Campus-Grid)
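Of the code designs above, the coarse-grained parameter sweep is the easiest to spread across grid resources, because each job is independent and workers never talk to each other. A toy sketch, using a local thread pool as a stand-in for distributed workers (the simulate function is invented for illustration):

```python
from concurrent.futures import ThreadPoolExecutor

def simulate(param):
    """Stand-in for one independent simulation run at one parameter value."""
    return param, param ** 2

# The master farms out one job per parameter value; because the jobs are
# independent, heterogeneous resources with multiple owners are fine.
sweep = range(8)
with ThreadPoolExecutor(max_workers=4) as pool:
    results = dict(pool.map(simulate, sweep))
```

On a real grid the pool would be replaced by a scheduler submitting each job to whichever queue is available; the program structure stays the same.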
14 Computing facilities
- University computing facilities, within departments or centrally.
- Standout facilities:
- The APAC partnership (www.apac.edu.au)
- Qld: QPSF partnership, several facilities around UQ, GU, QUT
- NSW: ac3 (at ATP Everleigh)
- ACT: ANU - APAC peak facility, upgraded in 2005 (top 30 in the world)
- Vic: VPAC (RMIT)
- SA: SAPAC (U.Adelaide?)
- WA: IVEC (UWA)
- Tas: TPAC (U.Tas)
- Other very noteworthy facilities, such as Swinburne's impressive clusters. There are bound to be others, and more are planned.
15 Data Grids
- Large-scale, distributed, federated data repositories
- Making complex data available
- Scholarly output and scholarly input
- Observations, simulations, algorithms, ...
- to applications and other grid services
- in the most efficient way
- Performance, cost, ...
- in the most appropriate way
- within the same middleware AAA framework
- in a sustainable and trustworthy way
16 Data Grid 101
- (Layer diagram) Directories; AAA, Capabilities; Workflows, DRM
- Content; Archive Interface
- Presentation
17 Data Grid Issues
- Every arrow is a protocol; every interface is a standard
- Storage: hardware, software, file format standards, algorithms
- Describing data: metadata, external orthographies, dictionaries
- Caching/replication: instances (non-identical), identifiers, derivatives
- Resource discovery: harvesting, registries, portals
- Access: security, rights-management (DRM), anonymity, authsn. granularity
- Performance: delivery in appropriate form and size, user-meaningful user interface (rendering/presentation by location and culture)
- Standards, and the excess thereof
- Social engineering: putting data online is
- An effort - needs to be easier, obvious
- A requirement! - but not enforced, lacks processes
- Not recognised nor rewarded
- PAPER publishing is!
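The resource-discovery pattern above (archives publish metadata records, a registry harvests them, a portal searches them) can be sketched briefly. The record fields and dataset identifiers below are made up for illustration:

```python
# Hypothetical metadata records published by one archive.
RECORDS = [
    {"id": "macho-001", "site": "MSO", "keywords": ["photometry", "survey"]},
    {"id": "atca-042", "site": "ATCA", "keywords": ["radio", "interferometry"]},
]

def harvest(archives):
    """A registry pulls (harvests) metadata records from each archive."""
    registry = []
    for records in archives:
        registry.extend(records)
    return registry

def discover(registry, keyword):
    """A portal query: find dataset identifiers by keyword, wherever they live."""
    return [r["id"] for r in registry if keyword in r["keywords"]]

# Usage: one harvested registry, searched by keyword.
registry = harvest([RECORDS])
hits = discover(registry, "radio")
```

The point of the split is that the portal never needs to know where the data physically lives; only the metadata travels at discovery time.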
18 Data facilities
- In most cases these are inside departments, or maybe central services at a university.
- ANU/APAC host a major storage facility (tape robot) in Canberra that is available for the R&E community to make use of
- Currently 1.2 Petabytes peak, and connected to GrangeNet and AARNet3.
- It hosts the MSO MACHO-et-al data set at the moment, and more is to come.
- To be upgraded every 2 years or so; factor of 2-5 in capacity each time
- If funding is found, each time. Needs community input.
- Doesn't suit everyone (yet)
- Mirror/collaborating facilities in other cities in AU and overseas being discussed
- Integration with local facilities
- VO initiatives: all data from all observatories and computers
- Govt initiatives under ARIIC: APSR, ARROW, MAMS, ADT
19 Collaboration and Visualisation
- A lot of intersection between the two
- Beyond videoconferencing - telepresence
- Sharing not just your presence, but also your research
- Examples: multiple sites of
- Large-scale data visualisation, computational steering, engineering and manufacturing design, bio-molecular modelling and visualisation, education and training
- What's the user interface?
- Guided tour vs independent observation
- Capability negotiation, local or remote rendering
- (Arbitrary) application sharing
- Tele-collaboration (Co-laboratories)
- Revolve around the Access Grid
- www.accessgrid.org
20 Access Grid Nodes
- A collection of interactive, multimedia centres that support collaborative work
- distributed large-scale meetings, sessions, seminars, lectures, tutorials and training.
- High-end, large-scale tele-collaboration facilities
- Or can run on a single laptop/PDA
- Videoconferencing dramatically improved
- But not the price
- Much better support for
- multi-site, multi-camera, multi-application interaction
- Flexible, open design
- Over 400 in operation around the world
- 30 in operation, design or construction in Australia
- 4 at ANU
22 AccessGrid facilities
- University-hosted nodes are generally available for researchers from any area to use,
- you just need to make friends with their hosts.
- Qld: JCU-Townsville, CQU-several cities, UQ, QUT, SQU, GU (Nathan, Gold Coast)
- NSW: USyd, UNSW (desktop), UTS
- ACT: ANU (4, one at Mt Stromlo. SSO has been suggested)
- Vic: UMelb (soon), Monash-Caulfield, VPAC (by RMIT), Swinburne (desktop), U.Ballarat (desktop)
- SA: U.Adelaide (1 desktop and 1 room), Flinders (soon), UniSA (planning)
- WA: UWA (IVEC)
- Tas: UTas (soon)
- NT: I wish!
- Another 400 around the world. Development by many groups; Australia has some leadership
- Accessgrid-l_at_grangenet.net
23 Visualisation Facilities
- Active visualisation research community in Australia
- OzViz'04 at QUT, 6-7 Dec 2004.
- Major nodes with hard facilities include:
- ANU-VizLab,
- Sydney-VisLab,
- UQ/QPSF-VisLab,
- IVEC-WA,
- I-cubed (RMIT),
- Swinburne,
- etc.
24 Online Instruments
- Remote, collaborative access to unique / scarce instruments
- Telescopes, Microscopes, Particle accelerators, Robots, Sensor arrays
- Need to interface with other eRI services
- Computation: analysis of data
- Data: for storage, comparison
- Visualisation: for human analysis
- Collaboration: to share the facility
25 So, in summary
- Transparent use of various IT resources
- Research and education processes
- Make existing ones easier and better
- Allow new processes to be developed
- Are we there yet?
- Not even close!!
- But development in many areas is promising
- In some situations, the problems are not technical but political/social
- Some of the results already are very useful
- Astronomy needs to help the processes, to help Astronomy!