Title: Database File Systems in Support of eScience
1- Database File Systems in Support of eScience
Philip A. Adams, LLNL/National Ignition Facility
John C. Hax, Oracle Corporation
2- Science: A product of data analysis
- "Science does not result from the launch of a mission or the collection of data. Rather, science only occurs through the analysis and understanding of that data."
- Philosophy of the NASA Science Mission Directorate (SMD)
3- Questions to Ask
- Are we building IT Systems that support Research
and Analysis or Infrastructure that supports the
collection of data?
4- Scientific Computing History
Scientific Systems vs. Commercial Relational Databases
- Scientific (minimal data shared)
- Raw Data
- Decentralized/Desktop Management
- Open source software
- Low quality of support/service
- Best Effort
- Mission critical operations
- Primarily file based: HDF5, Lustre
- Millions of Files
- Write once, read many
- Background processing
- Pipelines
- Computationally intensive applications
- Long running transactions
- Output of Large Data Sets
- Single application profile
vs.
- Enterprise (all data shared)
- Metadata
- Centralized management
- Industrial strength software
- High quality of support/service
- SLA guarantees
- Mission critical operations
- Database files
- Read and Update
- Enforced data integrity
- Interactive processing
- Interactive workflows
- Transaction-intensive applications
- Short running transactions (< 8 hours)
- Output of Individual Rows
- Mixed application profile
5- Filesystems and Legacy Databases: The Gap
Database Benefits
- Superior query/search capability over filesystems
- SQL standard
- Easy manipulation of data
- Functions
- PL/SQL, Java, C, PHP, Perl
- Low latency, interactive data access suited for application access
- Provides a structured way of storing data and ensuring data integrity
- Tables/Constraints
- Superior backup and recovery capabilities
- RMAN, Redo/Archive logging
- Block and Point-in-Time Recovery
- Block Level Corruption Detection
- Institutional Resources
vs.
Filesystem Benefits
- Provides maximum scalability to meet data volume and ingestion requirements
- HDF5
- GFS (Google Filesystem)
- Lustre
- Ubiquity of accessing filesystems
- Number of protocols
- NFS, SMB, CIFS and FTP
- Able to access the data right from the OS
- Windows, Mac, Linux, Solaris, HP/UX
- Application programming interfaces
- support native access
- file open (fopen), file close (fclose)
- importing the java.io package
- ifstream/ofstream C++ file I/O classes
6- Data Challenges
- Physical Limitations
- I/O Intensive - limitations on max IOPS
- Network speeds - time to ship data to compute nodes
- Multiple Data Silos
- Governance issues
- Pedigree of the data
- Multiple access policies to get to the data
- Duplicate data stored in each silo
- Need to scale disparate systems as data grows
- Increased effort required for Scientists, Developers, Administrators
- Correlating the data across data silos
- Coordinated backup and recovery plan
- Multiple Data Aggregation Efforts
7- The Result: The Split Architecture, a step in the wrong direction
- These drawbacks include but are not limited to:
- Data curation
- Security
- Availability
- Recoverability
- Manageability
- Because no common database and filesystem access
protocol was available, the burden shifted to the
application developers and scientific researchers
to make sense of the two silos of information
8- How much of an issue is this?
- Level 0 (Raw) data is typically enriched with data from other sources.
- What happens when/if a diagnostic is found to have incorrect calibration data?
- Without strict relationships, this could be a nightmare. It may be easier to rerun analysis to reproduce the Level 1, 2 and 3 data. However, an unknown quantity of Level 4 content has been generated from this data and is stored on many researchers' workstations and file shares.
Lack of pedigree in data analysis can result in instrument/machine damage, increased financial costs, or embarrassment to scientific researchers who rely on the data.
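To make the "strict relationships" point concrete, here is a minimal sketch of how recorded pedigree lets you list every product derived from a bad calibration input. The product names and the `derived_from` mapping are hypothetical illustrations, not NIF data; in a database filesystem the same links would more likely live in tables with enforced keys.

```python
from collections import deque

# Hypothetical pedigree records: each product lists the inputs it was derived from.
derived_from = {
    "L1/shot42_calibrated": ["L0/shot42_raw", "cal/diag7_v1"],
    "L2/shot42_profile": ["L1/shot42_calibrated"],
    "L3/shot42_summary": ["L2/shot42_profile"],
    "L4/paper_fig3": ["L3/shot42_summary"],
    "L1/shot43_calibrated": ["L0/shot43_raw", "cal/diag7_v2"],
}


def affected_products(bad_input, derived_from):
    """Return every product that depends, directly or indirectly, on bad_input."""
    # Invert the mapping so each input points at the products built from it.
    children = {}
    for product, parents in derived_from.items():
        for parent in parents:
            children.setdefault(parent, []).append(product)

    # Breadth-first walk down the derivation chain.
    affected = set()
    queue = deque([bad_input])
    while queue:
        current = queue.popleft()
        for child in children.get(current, []):
            if child not in affected:
                affected.add(child)
                queue.append(child)
    return sorted(affected)


if __name__ == "__main__":
    for name in affected_products("cal/diag7_v1", derived_from):
        print(name)
```

Without the recorded links, the Level 4 product in this sketch would be invisible to the search, which is the situation the slide describes.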
9- Future of Scientific Computing and Analysis
Data Intensive Collaborative Science
10- Data Intensive Collaborative Science
Drivers of Collaboration
- Cost
- Complexity
- Knowledge Base
- Interdependence
Enablers of Collaboration
- The Web
- Network Capacity
- Clustering/Grid Technologies
- Moore's Law
- Standards
11- What's driving the data volumes?
- Better and more diverse instrumentation
- Flexible optics
- Coordinated multi-instrument observatories
- Increased Precision
- Genomics
- Diverse types of data generated: SQL/Scalar, XML,
Image, Monte Carlo simulations, Audio/Video,
telemetry, and spectrometers
12- Database Filesystems
- Bridge the Gap between Filesystems and Relational Database Systems
- Maintain Filesystem Performance
- Leverage multiple access methods
- Single Security Mechanism
- Unified Administrative Tools
- Data Pedigree
- Unified Architecture and Skill sets
- Leverage Institutional Resources for IT
- Enabling Collaboration around Data
- Optimized for Data Access
13- Pedigree with a database filesystem
14- Modern databases have much to offer in the realm of data analysis
- RDF/OWL can allow semantic searching of data
- Predictive Analytics
- Spatial Data Analysis
- Text Mining of Unstructured Content
15- Some of the native data mining techniques and algorithms available
Technique: Algorithms
- Classification: Logistic Regression, Naive Bayes, Support Vector Machine, Decision Tree
- Regression: Multiple Regression
- Attribute Importance: Minimum Description Length
- Anomaly Detection: One-Class Support Vector Machine
- Clustering: Enhanced K-Means, Orthogonal Partitioning Clustering
- Association: Apriori
- Feature Extraction: Non-negative Matrix Factorization
16- Key Components of SecureFiles Architecture
- Delta Update
- Write Gather Cache
- Transformation Management
- Inode Management
- Space management
- I/O Management
Finally the database can accept both structured
and non-structured data in an efficient manner
17- National Ignition Facility
18- UCRL-PRES-236394
National Ignition Facility and 11g SecureFiles
NLIT 2009
Philip A. Adams, Sr. Systems Architect
National Ignition Facility
Lawrence Livermore National Laboratory
June 1-3, 2009
This work performed under the auspices of the
U.S. Department of Energy by Lawrence Livermore
National Laboratory under Contract
DE-AC52-07NA27344
19- Overview of the National Ignition Facility
- The National Ignition Facility (NIF) is known as the world's largest and most energetic laser
- When fully operational, its 192 beams will converge 1.8 MJ of laser energy onto a single target to achieve thermonuclear ignition
- NIF will enable experiments that produce temperatures and densities like those in the Sun or in an exploding nuclear weapon
NIF-1107-14129.ppt
Oracle, 11/12/07
20- Overview of the National Ignition Facility
- The 192 laser beams of NIF will generate
- A peak power of 500 trillion watts, 1000 times the electric generating power of the United States
- A pulse energy of 1.8 million joules of ultraviolet light
- A pulse length of three to twenty billionths of a second
21- The Optics make NIF work
- Optical components
- 7500 large optics including 3072 laser glass slabs as well as large lenses, mirrors, and crystals
- More than 15,000 small optical components
- Precision optics
- Total area of 33,000 square feet (3/4 of an acre)
- More than 40 times the total precision optical surface in the world's largest telescope (Keck Observatory, Hawaii)
22- Example of Optic Damage
[Figure: image of optic damage; labels: 3 ns, 2 µm]
23- On high quality optical surfaces, initiated damage sites are very small
24- Performance Gains found in NIF with 11g SecureFiles
- Test Environment
- Database Server
- HP Blade Server w/ 4-way AMD Opteron CPUs
- RHEL 4 32-bit kernel
- 11g Oracle Database 32-bit version
- Single Instance
- ASM
- Dual Port Fibre Channel Mezzanine Card (2 Gbit)
- Application Server
- Dell PowerEdge 2650 w/ 2-way Intel Xeon CPUs
- RHEL 4 32-bit kernel
- 10g Oracle Application Server
- 10g Oracle CMSDK (Content Management Software
Development Toolkit)
25- Performance Gains found in NIF with 11g SecureFiles
- Test Environment
- SAN Storage
- 3PAR S400
- Production Environment
- 11g RAC Environment
- 10g CMSDK Clustered Application Server
Environment
26- Measure the throughput of the environment
- Perform dd tests to the disks to establish the theoretical max
- WRITE
- > dd if=/dev/zero of=/dev/raw/raw6 count=10000 bs=1M
- READ
- > dd if=/dev/raw/raw6 of=/dev/null count=10000 bs=1M
- MONITOR
- > iostat -xdk 3 100
- We saw 180 MB/sec Read/Write throughput to the disks
Warning: Be sure not to perform dd write tests on your ASM configured storage or else you'll damage it
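For readers who want to rehearse the measurement without touching a raw device, here is a minimal Python sketch of the same idea: time a sequential write and read of a scratch file and report MB/sec. It is not the dd procedure from the slide. The function name, the 64 MB default, and the use of a temporary file are assumptions for illustration, and the read figure will usually be inflated by the operating system's page cache.

```python
import os
import tempfile
import time


def measure_file_throughput(total_mb=64, block_mb=1, directory=None):
    """Write then read a scratch file sequentially; return (write_mb_s, read_mb_s)."""
    block = b"\0" * (block_mb * 1024 * 1024)
    blocks = total_mb // block_mb

    # mkstemp creates a new, uniquely named file, so no existing data is overwritten.
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(blocks):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())  # include the time needed to reach the device
        write_seconds = time.perf_counter() - start

        start = time.perf_counter()
        with open(path, "rb") as f:
            while f.read(len(block)):
                pass
        read_seconds = time.perf_counter() - start
    finally:
        os.remove(path)

    written_mb = blocks * block_mb
    return written_mb / write_seconds, written_mb / read_seconds


if __name__ == "__main__":
    write_rate, read_rate = measure_file_throughput()
    print(f"write: {write_rate:.0f} MB/sec, read: {read_rate:.0f} MB/sec")
```

Pass `directory=` to point the scratch file at the filesystem you want to exercise; because it goes through the filesystem, the result is a rough sanity check rather than a substitute for raw-device numbers.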
27- Create a few test tables
- Create a test table for BasicFiles and a test table for SecureFiles
BasicFile Example
    CREATE TABLE "FOO_BASICFILE_TABLE" (
      "PKEY" NUMBER(4) NOT NULL,
      "DOCUMENT" BLOB
    )
    TABLESPACE "LOB_DEMO"
    LOB ("DOCUMENT") STORE AS BASICFILE (TABLESPACE "LOB_DEMO");
SecureFiles Example
    CREATE TABLE "FOO_SECUREFILE_TABLE" (
      "PKEY" NUMBER(4) NOT NULL,
      "DOCUMENT" BLOB
    )
    TABLESPACE "LOB_DEMO"
    LOB ("DOCUMENT") STORE AS SECUREFILE (TABLESPACE "LOB_DEMO");
28- Throughput Results of Table Tests
- Speed tests from database server (Oracle 11.1.0 DB, using Oracle jdk 1.5.0_11 in OH/jdk, using ojdbc5.jar)
- Inserting twenty 32MB image files per test
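As a quick worked example of what this workload implies, the sketch below computes the payload per test and the shortest possible load time at the 180 MB/sec disk rate reported from the dd tests. The helper names are made up for illustration, and the 8-second elapsed time in the last call is a hypothetical input, not a measured result.

```python
FILES_PER_TEST = 20       # from the slide: twenty files per test
FILE_SIZE_MB = 32         # from the slide: 32MB image files
DISK_CEILING_MB_S = 180   # from the dd tests: 180 MB/sec read/write


def payload_mb(files=FILES_PER_TEST, size_mb=FILE_SIZE_MB):
    """Total data moved in one test run, in MB."""
    return files * size_mb


def best_case_seconds(total_mb, ceiling_mb_s):
    """Shortest possible elapsed time if the disks ran at their measured ceiling."""
    return total_mb / ceiling_mb_s


def throughput_mb_s(total_mb, elapsed_seconds):
    """Convert a measured elapsed time into MB/sec."""
    return total_mb / elapsed_seconds


if __name__ == "__main__":
    total = payload_mb()
    print(f"payload per test: {total} MB")
    print(f"best case at disk ceiling: {best_case_seconds(total, DISK_CEILING_MB_S):.2f} s")
    # 8.0 s is a hypothetical elapsed time used only to show the conversion.
    print(f"example rate: {throughput_mb_s(total, 8.0):.1f} MB/sec")
```

Comparing each measured insert rate against the disk ceiling shows how much of the available bandwidth a given LOB storage option actually uses.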
29- SecureFile vs. BasicFile Server Results
30- Measure the throughput of the network
- Used a tool called iperf available at
- http://sourceforge.net/projects/iperf/
- On Server run
- ./iperf -s -f M
- On Client run
- ./iperf -f M -c blackstone
------------------------------------------------------------
Client connecting to blackstone.llnl.gov, TCP port 5001
TCP window size: 0.06 MByte (default)
------------------------------------------------------------
[  5] local XXX.XXX.XXX.XXX port 58590 connected with XXX.XXX.XXX.XXX port 5001
[ ID] Interval       Transfer     Bandwidth
[  5]  0.0-10.0 sec  1120 MBytes  112 MBytes/sec
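The bandwidth figure in the summary line can be cross-checked by dividing the transfer size by the interval. The sketch below does that for the line shown above; the function name and regular expression are illustrative assumptions and only cover output reported in MBytes, as produced with `-f M`.

```python
import re


def parse_iperf_summary(line):
    """Return (seconds, transferred_mbytes, reported_mbytes_per_sec) from a summary line."""
    pattern = (
        r"(\d+(?:\.\d+)?)\s*-\s*(\d+(?:\.\d+)?)\s+sec\s+"
        r"(\d+(?:\.\d+)?)\s+MBytes\s+"
        r"(\d+(?:\.\d+)?)\s+MBytes/sec"
    )
    match = re.search(pattern, line)
    if match is None:
        raise ValueError("not an iperf summary line reported in MBytes")
    start, end, transferred, reported = (float(g) for g in match.groups())
    return end - start, transferred, reported


if __name__ == "__main__":
    summary = "[  5]  0.0-10.0 sec  1120 MBytes  112 MBytes/sec"
    seconds, transferred, reported = parse_iperf_summary(summary)
    computed = transferred / seconds
    print(f"computed: {computed:.0f} MBytes/sec, reported: {reported:.0f} MBytes/sec")
```

Since 112 MBytes/sec is below the 180 MB/sec measured to the disks, the network link, not the storage, sets the upper bound for tests driven from a remote client.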
31- Throughput Results of Client-Server Tests
- Speed tests from database server (Oracle 10.1.2 Client, using jdk 1.5.0_11 and ojdbc14.jar)
- Inserting twenty 32MB image files per test
32- SecureFile vs. BasicFile Client Results
33- SecureFile Performance Benefits
- During our testing, we've seen a 2-20 times increase in performance using SecureFiles over traditional BasicFiles
- We've seen equivalent or better performance using SecureFiles compared with writing the same file to our NFS-mounted NetApp
34- Database Tuning to optimize for SecureFiles
- Create a separate tablespace for your LOB data
- Use Uniform Extents: 1M seems best overall
- Tried 32M/64M extents with no performance increase; your mileage may vary
- Enable Automatic Segment Space Management on the tablespace
- Create large enough redo log files
- We used 200M to 1024M to reduce log file switches during heavy loads
35- Database Tuning to optimize for SecureFiles
- Utilize the AWR Snapshots before and after a SecureFile load and note the wait conditions
- SQL> EXECUTE dbms_workload_repository.create_snapshot()
- PL/SQL procedure successfully completed.
- Run the AWR report
- $ORACLE_HOME/rdbms/admin/awrrpt.sql
36- Conclusion
- The ultimate goal of science is to create new knowledge and new discoveries.
- Database Filesystems have a number of features which can benefit the scientific community and ease the burden of pedigree, data management, and analysis.
- Using a database filesystem will enable data intensive collaborative science.
- As new discoveries are made and data volumes increase, it is imperative to have a robust database system that is capable not only of managing the pedigree of that data, but also of serving as a knowledge repository for the future.
37- For More Information
Search for "SecureFiles" at http://search.oracle.com
or visit http://www.oracle.com/