Title: AMS Mass Data Processing Grid
1 AMS Mass Data Processing Grid
- Luo Junzhou
- School of Computer Science and Engineering
- jluo_at_seu.edu.cn
2 Outline
- Background of AMS experiment
- Mass data processing grid
- AMS data processing grid platform
- Related research work on SEUGrid
- Future work in AMS data processing
3 What's the AMS Experiment (1)
- The AMS (Alpha Magnetic Spectrometer) experiment, led by Nobel Prize winner Professor Samuel C. C. Ting, is a large-scale international collaborative project. More than 300 scientists from 15 countries and regions, including the USA, Russia, Germany, France and China, participate in the AMS experiment.
- Among them are many world-famous scholars and universities, such as the Massachusetts Institute of Technology, the University of Geneva and the University of Perugia.
4 What's the AMS Experiment (2)
- The AMS experiment is the only large physics experiment on the International Space Station. It is the first time that humans have accurately measured high-energy charged nuclei and particles in space.
- The purpose of the AMS experiment is to search for the origin of dark matter, the origin of cosmic rays, and a universe made of antimatter.
5 AMS-02 Detector
6 AMS-02 Data Volume
STS91
ISS
7 AMS-02 Data Classification
- Health Status Data
- Status of detector (magnet, power, temperature, DAQ state). Rate < 1 Kbit/sec. Needed in Real-Time (RT) by the AMS Payload Operation and Control Center (POCC), the ISS crew and NASA ground
- Monitoring Data
- All slow control data from all slow control sensors. Data rate 10 Kbit/sec. Needed in Near-Real-Time (NRT) by the AMS POCC; complete copy later (close to NRT) for science analysis
- Science Data
- Events and subdetector calibrations. Samples of approx. 10% go to the POCC to monitor detector performance in RT; complete copy later (close to NRT) goes to the SOC for event reconstruction and physics analysis. 2 Mbit/sec orbit average
- Flight Ancillary Data
- ISS latitude, altitude, speed, etc.
- Rate 2 Kbit/sec
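To get a feel for the volumes implied by these rates, the following sketch converts each stream's rate into a daily volume. It assumes 1 Kbit = 1,000 bits and 1 GB = 10^9 bytes, and treats the quoted figures as constant averages (the health status rate is only an upper bound). The dictionary keys and function name are illustrative and do not come from AMS software.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Rates quoted on the slide, in bits per second (1 Kbit = 1,000 bits assumed).
STREAM_RATES_BPS = {
    "health_status": 1_000,      # upper bound (< 1 Kbit/sec)
    "monitoring": 10_000,
    "science": 2_000_000,        # orbit average
    "flight_ancillary": 2_000,
}


def daily_gigabytes(rate_bps: float) -> float:
    """Volume produced in one day at a constant bit rate, in decimal GB."""
    return rate_bps * SECONDS_PER_DAY / 8 / 1e9


if __name__ == "__main__":
    for name, rate in STREAM_RATES_BPS.items():
        print(f"{name:>16}: {daily_gigabytes(rate):8.3f} GB/day")
```

Under these assumptions the science stream alone comes to about 21.6 GB per day, which dwarfs the other three streams combined.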
8 Collaboration Between SEU & AMS
- Southeast University (SEU) is the first mainland university to participate in AMS, with the approval of the Chinese government in Oct. 2003
- According to the Collaboration Agreement between SEU and the AMS-02 Experiment, SEU will:
- Set up the SEU AMS experiment system on the ground (AMS-C)
- Set up the SEU AMS-02 Antimatter Investigation System (AMS-AIS)
- Set up the SEU AMS-02 Science Operations Center (AMS-SOC).
9 SEU AMS-SOC
- The mission of SEU AMS-SOC
- Mass data storage system
- Parallel computing environment
- Data analysis and computing system
- Set up a mass data storage system with a capacity of more than 420 TB, and a high-performance data processing system based on the latest cluster and distributed computing technologies, which can meet the needs of real-time or near-real-time large-scale observational data processing and off-line physics analysis.
10 Outline
- Background of AMS experiment
- Mass data processing grid
- AMS data processing grid platform
- Related research work on SEUGrid
- Future work in AMS data processing
11 Data Grid Overview (1)
- European DataGrid
- DataGrid is led by CERN, together with five other main partners and fifteen associated partners.
- It aims to enable access to geographically distributed computing power and storage facilities belonging to different institutions.
12 Data Grid Overview (2)
- GriPhyN (Grid Physics Network)
- GriPhyN is developed by a team of experimental physicists and IT researchers who plan to implement the first petabyte-scale computational environments for data-intensive science.
- PPDG (Particle Physics Data Grid)
- PPDG is a collaboration of computer scientists with a strong record in distributed computing and Grid technology, and physicists with leading roles in the software and network infrastructures for major high-energy and nuclear experiments.
13 Data Grid Overview (3)
- iVDGL (International Virtual Data Grid Laboratory)
- A global Data Grid that will serve forefront experiments in physics and astronomy. Its computing, storage and networking resources in the US, Europe, Asia and South America provide a unique laboratory that will test and validate Grid technologies at international scales.
- DataTAG
- Provides high-performance networking between Geneva in Switzerland and Chicago in the U.S. (2.5 Gbps leased line), and focuses on interoperability between these intercontinental Grid domains.
14 DataGrid Application Background
- Specific application backgrounds
- High Energy Physics (HEP), led by CERN (Switzerland)
- Biology and Medical Image processing, led by CNRS (France)
- Earth Observations (EO), led by ESA/ESRIN (Italy)
- DataGrid mainly targets CERN's high energy physics, addressing mass data storage, partitioning and processing, and extends to EO and bioinformatics processing.
- Applications, especially the future LHC application, are the basis for developing DataGrid. Solving the LHC application would be a strategic victory for DataGrid research.
15 DataGrid Hierarchy
16 Mass Data Processing in DataGrid
- The particle detector produces raw data on the order of PB/s. After filtering by the on-line system and processing by off-line processors with a capacity of 20 TIPS, the data are finally written to tape at 100 MB/s. It is the data on tape that DataGrid actually processes.
- The CERN Computing Center is responsible for dispatching the data to Area Centers in Europe, North America and Japan over high-speed networks. The Area Centers partition these huge amounts of data further, so that the data stream decreases to about 1 MB/s by the time it reaches a physicist's desktop, where it can be processed easily.
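The size of each reduction step can be checked with a few lines of arithmetic. This is only an order-of-magnitude sketch: "PB/s" is taken as exactly 1 PB/s, decimal units are assumed, and the constant names are illustrative.

```python
# Order-of-magnitude rates from the slide, in bytes per second.
# "PB/s" is taken as exactly 1 PB/s for illustration (an assumption).
DETECTOR_RATE = 1e15   # raw detector output
TAPE_RATE = 100e6      # written to tape after on-line/off-line filtering
DESKTOP_RATE = 1e6     # stream reaching a physicist's desktop


def reduction_factor(rate_in: float, rate_out: float) -> float:
    """How many times smaller the outgoing stream is than the incoming one."""
    return rate_in / rate_out


filter_reduction = reduction_factor(DETECTOR_RATE, TAPE_RATE)
grid_reduction = reduction_factor(TAPE_RATE, DESKTOP_RATE)

if __name__ == "__main__":
    print(f"detector -> tape:  {filter_reduction:.0e}x")
    print(f"tape -> desktop:   {grid_reduction:.0f}x")
```

Under these assumptions the filtering stage reduces the stream by about seven orders of magnitude, while the grid's partitioning accounts for a further factor of about 100.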
17 DataGrid Architecture
18 Outline
- Background of AMS experiment
- Mass Data Processing Grid
- AMS data processing grid platform
- Related research work on SEUGrid
- Future work in AMS data processing
19 Present hardware at SEU
- School of Computer Science & Engineering
- Jiangsu Provincial Key Laboratory of Network and Information Security
- State key laboratory of microwave
- Campus network center
- Library information center
- CERNET Eastern China (North) Regional Network Center
- Connection to ChinaGrid
20 Hardware List
21 Connection to ChinaGrid
- Connection to ChinaGrid through 1-Gigabit routers and switches
- ChinaGrid grid middleware CGSP deployed
22 ChinaGrid
- China Education and Research Grid
- Funded by Ministry of Education
- Based on CERNET
- First Phase
- From 2003 to 2006
- 12 key universities initially
- More than 6 Tflops with 60 TB
- 20 key universities by the end of 2004
23 ChinaGrid Members
24 ChinaGrid Main Tasks
- Campus grid platform
- Common platform for ChinaGrid
- Grid application platform and representative grid applications
- Image processing grid
- Bioinformatics grid
- Course on-line grid
- Computational fluid dynamics grid
- Large scale information processing grid
25 ChinaGrid Specific Application Grid
26 ChinaGrid Supporting Platform CGSP
Grid Security
27 CGSP in Details
[Figure: CGSP component diagram. Labels: Grid Portal, Portal toolkits, GridPPI, Resource pack tool, Job define toolkit, Installation packs, Manage GUI, Domain mngmnt, User mngmnt, monitor, Info service, virtual interface, QoS mngmnt, info collect, Fault-tolerant, Service metadata, Service match, Policy negotiate, Inter-domain id map, Service metadata management, Info mngmnt, Info metadata, search tech, Info class, Job submit, Job sched, Service Remote deploy, Job SLA mng, Workflow mngmnt, Job status monitor, Data mngmnt, Unified file interface, SRB, File Metadata mngmnt, Replica policy, Service container, lifecycle, Service deploy, Service registry, Replica directory, Res SLA mng, SOAP, Service monitor, Status report, Data storage proxy]
28 SEUGrid
- Nine people were sent to Switzerland, the USA and Italy to work with foreign experts, and spent 2 years at CERN designing the AMS-SOC data processing system.
- We attended the AMS TIM in Switzerland and the USA 19 times.
- Professor Samuel C. C. Ting and 9 foreign experts on the AMS experiment visited SEU and discussed the requirements and system design of AMS-SOC at SEU.
Based on the above work, we developed a grid platform called SEUGrid for AMS mass data processing and analysis.
29 SEUGrid
- SEU Gridport
- Portal of the grid platform for mass data processing and analysis. It is in charge of task submission, querying the state of computing nodes, and collecting computing results.
- SEUGrid computing nodes
- Receive tasks from the portal and generate computing results.
30 SEUGrid Architecture
31 SEUGrid Functions
- Authentication and login of grid users
- Query of computing resource status
- Submission, scheduling and execution of computing tasks, tracing of execution status, and real-time logs
- Transmission of computing results, and distributed storage and management of mass data
- Remote invocation of commands
32 SEUGrid Function (1)
- Authentication and login of grid users
- SEUGrid provides single sign-on and authentication to remote hosts based on GSI. The user supplies the MyProxy host, user name and password to the portal.
- The portal can use this information to access the trusted certificate consigned to MyProxy. Thus, when a user logs in, the portal can create a valid proxy for the user to execute tasks. At the same time, the portal creates a session for the user, which keeps the user's status until logout.
33 SEUGrid Function (2)
- Query of computing resource status
- A user can query static information about computing resources, such as CPU, memory, running processes and supported software, through MDS (Monitoring and Discovery Service), which is based on LDAP and information provider services.
- In addition, the user can dynamically query the status of the services available at each computing node by using the grid Ping command.
34 SEUGrid Function (3)
- Submission, scheduling and execution of computing tasks, tracing of execution status, and real-time logs
- A user can submit MC simulation arguments and start the computing tasks through the MC simulation portlet.
- The scheduling module at the back end allocates each task to a suitable computing node. The task then starts executing in the GT environment of that remote node, and the real-time log is returned as a stream.
- The portal host keeps connections to the computing nodes, so the user can trace execution status, query current results and manage the related task logs.
35 SEUGrid Function (4 & 5)
- Transmission of computing results, and distributed storage and management of mass data
- SEUGrid can collect the task computing results automatically, or users can manage the result files themselves. Both rely on the GridFTP service at each computing node.
- Remote invocation of commands
- SEUGrid supports remote invocation of commands.
36 MC Production
37 Features of MC Simulation Computation
- Huge computation
- A 1,000,000-event simulation takes 1,000 hours (single Intel Pentium 4 processor)
- Good parallelism
- Coarse granularity, easy to divide
- Large-scale data
- 206 TB in total
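Because the events are independent and the granularity is coarse, the workload can be divided simply by event ranges. The sketch below shows one way to do that and to estimate the ideal wall-clock time. It assumes identical nodes and no scheduling or transfer overhead, and the function names are illustrative, not part of the AMS MC software.

```python
from __future__ import annotations

TOTAL_EVENTS = 1_000_000
SINGLE_CPU_HOURS = 1_000  # figure from the slide, one Pentium 4 processor


def split_events(total: int, n_jobs: int) -> list[tuple[int, int]]:
    """Split `total` events into n_jobs contiguous (first_event, count)
    chunks whose sizes differ by at most one event."""
    if n_jobs <= 0:
        raise ValueError("n_jobs must be positive")
    base, extra = divmod(total, n_jobs)
    chunks = []
    start = 0
    for job in range(n_jobs):
        # The first `extra` jobs take one additional event each.
        count = base + (1 if job < extra else 0)
        chunks.append((start, count))
        start += count
    return chunks


def ideal_wall_hours(n_nodes: int) -> float:
    """Wall-clock estimate assuming identical nodes and zero overhead."""
    return SINGLE_CPU_HOURS / n_nodes


if __name__ == "__main__":
    for first, count in split_events(TOTAL_EVENTS, 4):
        print(f"job starting at event {first}: {count} events")
    print(f"ideal time on 100 nodes: {ideal_wall_hours(100):.1f} h")
```

In practice the speed-up would be lower than this ideal figure, since nodes differ in speed and results still have to be collected.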
38 AMS MC Software and Configuration
- ams02mcdb
- ams02mcdb.addon
- bbftp files
- CRC Linux execs
- AMS mysql
- book-keeping execs docs
39 AMS MC Simulation Flow
40 SEUGrid Portal
41 Task Submission
42 Task Monitoring
43 Result Recollection
44 Result Retrieving
45 Final Results
46 Current MC Simulation Results
- By filling in the MC JOB REQUEST FORM from CERN, we completed the MC simulations of 3 datasets: He, E and protons.
- We submitted MC simulation tasks to computing nodes through SEUGrid, and about 50,000 simulation events (up to 1 TB) were produced. We then transferred the simulated events to hosts at CERN with bbftp. The simulation results were checked by the MC experts, and all were found acceptable for the AMS experiment.
47 Outline
- Background of AMS experiment
- Mass data processing grid
- AMS data processing grid platform
- Related research work on SEUGrid
- Future work in AMS data processing
48 Related Research Work on Grid
- Trust-based and QoS-measured scheduling algorithm
- QoS-based grid resource management
- Dividing grid service discovery into 2-stage matchmaking
- Prediction-based and cost-based replica replacement algorithm
- Semantic access control in grid computing
- Grid security policy implementation model and dynamic authorization with feedback
49 Trust-Based and QoS-Measured Scheduling
- Current scheduling algorithms, such as the 2-Phase Scheduling Strategy and the Co-RSPB, Co-RSBF and Co-RSBFR algorithms based on priority and the Best Fit mechanism, do not jointly consider QoS requirements, scheduling efficiency and the dynamic characteristics of the VO or networks.
- Through the trust degree, the trust model accounts for many factors, such as task quantity, task arrival rate, waiting queue length, diversity of network structure and robustness of computing nodes.
- Derived from the trust model, this scheduling strategy always selects nodes with a high trust degree, so as to obtain the most stable and efficient computing nodes. This scheduling decreases the actual response time of tasks and improves the dynamic efficiency of task computing.
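The selection step can be illustrated with a minimal sketch. The slides do not give the trust formula, so the code assumes that trust is a weighted sum of direct trust and reputation (weight `alpha`), and it reduces the QoS request to a number of free CPUs. All names and weights are hypothetical, and this is not the TBQMS algorithm itself.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Node:
    name: str
    direct_trust: float  # from the scheduler's own past interactions, in [0, 1]
    reputation: float    # aggregated from other nodes' reports, in [0, 1]
    free_cpus: int


def trust(node: Node, alpha: float = 0.6) -> float:
    """Assumed combination: alpha * direct trust + (1 - alpha) * reputation."""
    return alpha * node.direct_trust + (1 - alpha) * node.reputation


def select_node(nodes: list[Node], cpus_needed: int) -> Node | None:
    """Pick the highest-trust node that satisfies the simplified QoS request."""
    eligible = [n for n in nodes if n.free_cpus >= cpus_needed]
    return max(eligible, key=trust, default=None)


if __name__ == "__main__":
    pool = [
        Node("node-a", direct_trust=0.9, reputation=0.5, free_cpus=4),
        Node("node-b", direct_trust=0.6, reputation=0.9, free_cpus=8),
    ]
    chosen = select_node(pool, cpus_needed=2)
    print(chosen.name if chosen else "no eligible node")
```

Filtering on the QoS request first and ranking by trust second keeps untrusted but idle nodes from being chosen merely because they are available.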
50 Definition of Trust
Definitions of Direct Trust, Reputation and Trust from source node i to destination node j
51 Comparison: TBQMS and NSA
Comparison between TBQMS and NSA
52 QoS-based Grid Resource Management
- Application/Grid Service layer
- Provides descriptive QoS parameters, such as Security QoS, Reliability QoS and Accounting QoS, and meets end users' simple QoS requests.
- Virtual Organization (VO) layer
- Maps users' QoS requirements to Grid QoS, and groups similar QoS parameters of the Physical layer according to their properties.
- Physical layer
- Maps the QoS of the VO layer to the Physical layer.
53 Hierarchical Structure of Grid QoS
54 Structure of GRAM based on QoS
55 Features of GRAM-QoS Model
- Guarantees users' QoS requirements
- By mapping, converting and negotiating the QoS parameters, GRAM-QoS can take the user's QoS requirements into account during resource allocation management.
- Through QoS admission control, GRAM-QoS can avoid invalid resource allocation and balance the resource workload.
- So GRAM-QoS not only fulfils the user's QoS requirements but also enhances the efficiency of resource allocation and system performance.
- Excellent scalability and compatibility
- Scalability: all modules of GRAM-QoS can be customized by grid developers.
- Compatibility: this model is compatible with other models.
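The idea of admission control can be sketched as follows. This is a simplified illustration rather than the GRAM-QoS implementation: the QoS requirement is reduced to a number of CPU slots, and the `Resource` class and `admit` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    capacity: int       # total CPU slots
    allocated: int = 0  # slots currently in use

    @property
    def load(self) -> float:
        return self.allocated / self.capacity


def admit(resources, slots_needed):
    """Admission control: reject a request that no resource can satisfy;
    otherwise allocate it on the least-loaded resource that fits."""
    fitting = [r for r in resources if r.capacity - r.allocated >= slots_needed]
    if not fitting:
        return None  # rejected before any allocation is attempted
    chosen = min(fitting, key=lambda r: r.load)
    chosen.allocated += slots_needed
    return chosen


if __name__ == "__main__":
    pool = [Resource("x", capacity=10, allocated=8), Resource("y", capacity=10, allocated=2)]
    granted = admit(pool, slots_needed=3)
    print(granted.name if granted else "rejected")
```

Rejecting unsatisfiable requests up front avoids invalid allocations, and preferring the least-loaded resource is one simple way to balance the workload.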
56 Model of Service Discovery
- Grid service discovery is divided into 2-stage matchmaking: the service matching process is split into service type matching and instance matching, and a Grid Service Discovery Model Based on 2-Stage Matching is proposed.
- In the model, the VO is regarded as the managerial unit for grid services, and a two-level publication architecture is adopted.
- The simulation results show that the model can effectively aggregate service information and avoid the workload caused by frequent dynamic updating.
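A minimal sketch of the two stages, assuming a dictionary keyed by service type as the upper publication level and lists of instances as the lower level. The `ServiceInstance` fields and the free-CPU criterion are illustrative assumptions, not details of the proposed model.

```python
from dataclasses import dataclass


@dataclass
class ServiceInstance:
    service_type: str
    host: str
    free_cpus: int  # a dynamic attribute that changes often


def build_registry(instances):
    """Two-level publication: service types (rarely change) map to
    their live instances (change often)."""
    registry = {}
    for inst in instances:
        registry.setdefault(inst.service_type, []).append(inst)
    return registry


def discover(registry, wanted_type, min_free_cpus):
    # Stage 1: service type matching against the aggregated type index.
    candidates = registry.get(wanted_type)
    if candidates is None:
        return []
    # Stage 2: instance matching on dynamic attributes.
    return [inst for inst in candidates if inst.free_cpus >= min_free_cpus]


if __name__ == "__main__":
    registry = build_registry([
        ServiceInstance("mc-simulation", "node1", 2),
        ServiceInstance("mc-simulation", "node2", 8),
        ServiceInstance("reconstruction", "node3", 4),
    ])
    print([inst.host for inst in discover(registry, "mc-simulation", 4)])
```

Because stage 1 only consults the type index, frequent changes to instance attributes do not force updates at the upper level.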
57 Process of Service Discovery
58 Data Replica Management
- The algorithm predicts the hot-spot replicas within a time window and keeps only part of the copies, which both speeds up access and saves storage space.
- We focus on the cost of replica replacement, such as network delay, bandwidth, copy size and system reliability.
- In SEUGrid, the data management service mainly adopts the prediction-based and cost-based replica replacement algorithm (the PC-based algorithm).
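The combination of prediction and cost can be sketched as below. The slides do not specify the formulas, so this sketch assumes that predicted popularity is the number of accesses inside the most recent time window and that replacement cost is the estimated time to fetch the file again (delay plus size divided by bandwidth). System reliability is left out, and all names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    name: str
    size_mb: float
    bandwidth_mb_per_s: float  # to the nearest other copy (assumed unit)
    delay_s: float
    access_times: list = field(default_factory=list)

    def predicted_accesses(self, now: float, window_s: float) -> int:
        """Prediction step: count accesses inside the recent time window."""
        return sum(1 for t in self.access_times if now - window_s <= t <= now)

    def refetch_cost_s(self) -> float:
        """Cost step: estimated time to fetch the file again if evicted."""
        return self.delay_s + self.size_mb / self.bandwidth_mb_per_s


def choose_victim(replicas, now, window_s):
    """Evict the replica whose loss is expected to cost the least:
    predicted accesses multiplied by the cost of each re-fetch."""
    return min(
        replicas,
        key=lambda r: r.predicted_accesses(now, window_s) * r.refetch_cost_s(),
    )


if __name__ == "__main__":
    store = [
        Replica("hot", 100, 10, 1.0, [90, 95, 99]),
        Replica("cold", 1000, 10, 1.0, [10]),
    ]
    print(choose_victim(store, now=100, window_s=20).name)
```

Unlike LRU or LFU, this ranking can keep a rarely used file if it would be very expensive to fetch again, and drop a popular one that is cheap to restore.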
59 Simulation Results (1)
The simulation results of LRU, LFU and PC-based algorithms in mean job time
60 Simulation Results (2)
The simulation results of LRU, LFU and PC-based algorithms in bandwidth consumption
61 Semantic Access Control
62 Features of Semantic Access Control
- An ontology can specify heterogeneous, distributed and semi-structured information well, which suits the highly distributed and dynamic nature of the Grid. It can also be expressed semantically, which benefits the exchange of security information among heterogeneous systems.
- The security policies and security attributes of resources and entities can be clearly expressed by the ontology-based lexical description, which also benefits the exchange of security information among heterogeneous systems.
- Based on the semantic description, the logic layer supplies the rules used in semantic reasoning. Access control decisions are made according to the results of semantic reasoning. This makes the grid access control mechanism more intelligent and dynamic, and enables fine-grained access control.
63 Grid Security Policy Implementation Model
- Resource sharing in the Grid must be controlled, not carried out at will. A security policy implementation model can be established through negotiation between the resource provider and the resource user, and the security policy is implemented within that model.
- Because the grid is dynamic, the security mechanism used in the grid cannot be static. It should be convenient to modify and configure.
- We put forward a security policy implementation model, SPIM. The functional entities in the model and the communication processes are defined, and the warrants used in the communication are also specified.
- The global security policy of the VO and the local security policy of each resource administrative domain can be made, modified and implemented separately in the model, and consistent implementation of policies at all levels can be guaranteed.
64 Dynamic Authorization with Feedback
- A dynamic authorization mechanism for the grid is proposed in SPIM. Through negotiation between users and resources, the Grid security management entities bind their requirements together according to the security policies and make the authorization decision.
65 Outline
- Background of AMS experiment
- Mass data processing grid
- AMS data processing grid platform
- Related research work on SEUGrid
- Future work in AMS data processing
66 Future SEUGrid Structure
67 Data Management of SEUGrid
68 Deep Processing of AMS Data (1)
- Scheduling strategy based on workload balance
- The raw data transmitted from the GSC at a rate of about 3 to 4 Mbit/s should be stored in the databases. In addition, the reconstructed and tagged data derived from the raw data (about 44 TB and 0.6 TB per year, respectively) and the synchronous MC data (about 44 TB per year) should also be stored. A storage node would be chosen by a reasonable scheduling algorithm based on factors such as storage node capacity, network bandwidth and latency.
- Mass data index and compression
- 80% of the raw data and 20% of the ESD data should be kept on disk for direct access. Research on mass data indexing and compression is needed to support real-time access.
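One possible shape for the storage node selection is a weighted score over the three factors named in the scheduling item. The linear form, the scaling constants and the weights below are illustrative assumptions, since the scheduling algorithm is still listed as future work.

```python
from dataclasses import dataclass


@dataclass
class StorageNode:
    name: str
    free_tb: float
    bandwidth_mbit_s: float
    latency_ms: float


def score(node, w_capacity=1.0, w_bandwidth=1.0, w_latency=1.0):
    """Higher is better: more free space and bandwidth, less latency.
    The weights and the linear form are illustrative assumptions."""
    return (
        w_capacity * node.free_tb
        + w_bandwidth * node.bandwidth_mbit_s / 100.0
        - w_latency * node.latency_ms / 10.0
    )


def choose_storage_node(nodes, file_tb):
    """Pick the best-scoring node that still has room for the file."""
    fitting = [n for n in nodes if n.free_tb >= file_tb]
    if not fitting:
        raise RuntimeError("no storage node has enough free space")
    return max(fitting, key=score)


if __name__ == "__main__":
    cluster = [
        StorageNode("a", free_tb=5, bandwidth_mbit_s=1000, latency_ms=10),
        StorageNode("b", free_tb=50, bandwidth_mbit_s=100, latency_ms=50),
    ]
    print(choose_storage_node(cluster, file_tb=1.0).name)
```

Since free capacity contributes to the score, nodes that fill up become less attractive over time, which spreads the stored data across nodes.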
69 Deep Processing of AMS Data (2)
- Quick data classification and categorization
- All the data should be categorized to support non-direct access. Research on mass data classification and categorization is needed.
- Task decomposition, scheduling and collaboration
- MC computing is large in scale and coarse-grained. Decomposing a task and executing the sub-tasks in the grid environment would save much time.
- 10% of the raw data should be used to reconstruct the events within half an hour. Research on task scheduling and collaboration under real-time conditions is needed.
70 Deep Processing of AMS Data (3)
- Fault tolerance
- Eliminate application computing failures caused by lost network messages, software execution faults and tampered messages.
- Data visualization and virtualization, and data mining
- Research on techniques such as distributed visualization and collaborative visualization of scientific computation is needed to support visual analysis of AMS-02 data.
- Research on mass data mining technology is needed to discover unknown phenomena and laws in AMS-02 mass data.
71 Deep Processing of AMS Data (4)
- Monitoring the transmission results of the AMS experiment
- Data storage, physics analysis and software development for the AMS experiment can be done on different types of computers. To manage and use these computers more efficiently, a network performance monitoring system needs to be developed.
- Security and access control
- The AMS experiment involves over 300 scientists from 15 countries and regions. Because AMS data are highly distributed and large in scale, a series of advanced security monitoring, defense and control technologies is needed.
- Combined with new semantic web technology, research on semantic access control is needed to provide grid security technologies suited to the dynamic, heterogeneous and distributed SEUGrid environment.