Title: Design and implementation of XML-based Linux file system runner
1. Design and implementation of XML-based Linux file system runner
- Presenter: Qian Zhang
- Date: October 2nd, 2006
- Major professor: Shashi K. Gadia
2. Outline
- Introduction
- Motivation
- Obstacles and Objectives
- XML-LFS architecture
- Design and implementation of XML-LFS
- Performance experiments
- Conclusion and future work
- System demo
3. Introduction: organization of a typical file system (1)
- Files form a rooted tree called file tree
- A file can have multiple names (links)
- Files are nodes in the tree
Figure 1. File tree
4. Introduction: organization of a typical file system (2)
- Every file has
- An inode (index node) containing the metadata of the file
- A sequence of data blocks containing the file data
- Data blocks are pointed to by its inode
- A disk inode and an in-core inode
- Files vs. directories
- A directory is also considered a file
- Directories are internal nodes in the file tree
- Non-directory files are terminal nodes
- A directory is a container for files
- Data blocks for directories contain (i-number, filename) pairs of their child files
5. Introduction: content of a directory
- . and .. represent the directory itself and its parent directory
- An entry with inode number 0 means that the file was deleted; the entry once represented the file file55
- The data entry can be 16 bytes, 257 bytes or other sizes, which limits the maximum file name length
Figure 2. Contents of a data block for directory /myDir
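A minimal Java sketch of how the fixed-size directory entries described above could be decoded. It assumes the classic 16-byte layout (a 2-byte i-number followed by a 14-byte, NUL-padded name). The class name DirBlockReader, the big-endian byte order and the sample i-numbers are assumptions made for illustration; they are not taken from the figure or from the XML-LFS source.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

class DirBlockReader {
    static final int ENTRY_SIZE = 16;   // assumed: 2-byte i-number + 14-byte name
    static final int NAME_SIZE = 14;

    /** Returns live (name -> i-number) pairs; entries with i-number 0 are skipped as deleted. */
    static Map<String, Integer> readEntries(byte[] block) {
        Map<String, Integer> entries = new LinkedHashMap<>();
        ByteBuffer buf = ByteBuffer.wrap(block);          // big-endian by default (assumption)
        while (buf.remaining() >= ENTRY_SIZE) {
            int inumber = buf.getShort() & 0xFFFF;        // read as unsigned 16-bit value
            byte[] raw = new byte[NAME_SIZE];
            buf.get(raw);
            if (inumber == 0) {
                continue;                                 // slot of a deleted file
            }
            int len = 0;
            while (len < NAME_SIZE && raw[len] != 0) {
                len++;                                    // name is NUL-padded
            }
            entries.put(new String(raw, 0, len, StandardCharsets.US_ASCII), inumber);
        }
        return entries;
    }

    /** Helper used only by this sketch to write one 16-byte entry. */
    static void putEntry(ByteBuffer buf, int inumber, String name) {
        buf.putShort((short) inumber);
        byte[] raw = new byte[NAME_SIZE];
        byte[] src = name.getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(src, 0, raw, 0, Math.min(src.length, NAME_SIZE));
        buf.put(raw);
    }

    /** Builds a made-up directory block with one deleted entry. */
    static byte[] sampleBlock() {
        ByteBuffer buf = ByteBuffer.allocate(4 * ENTRY_SIZE);
        putEntry(buf, 83, ".");
        putEntry(buf, 2, "..");
        putEntry(buf, 0, "file55");   // deleted: i-number is 0, the name is left behind
        putEntry(buf, 1798, "init");
        return buf.array();
    }

    public static void main(String[] args) {
        System.out.println(readEntries(sampleBlock()));
    }
}
```

With a 14-byte name field, no file name can exceed 14 characters, which is the limit the slide refers to.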
6. Introduction: inode
Figure 3. Unix inode structure
7. Introduction: Linux file system architecture
Figure 4. Linux file system architecture
8. Introduction: disk layout
- Disk layout of a Linux file system instance
Figure 5. Disk layout of UNIX-like file system instances
- Boot block: contains bootstrap code to boot the machine
- Super block: contains metadata of the file system
- Inode list: contains all disk inodes
- Data block list: contains all data blocks, which hold the file data
9. Introduction: how the OS uses file systems
Figure 6. Interactions between the OS kernel and the file system
10. Introduction: XML
Figure 7. Simple XML document example
Figure 8. Element hierarchy of the XML document
12. Motivation: limitations
- Limitations of Linux file system instances
- File size is limited (4 GB)
- Hard to find files with given properties
- Difficult to reorganize files
13. Motivation: potential
- XML can help
- The file system can be XML-ized by using XML tags
- Can be queried, traversed and manipulated by using XQuery, DOM or other tools
- This can be virtual or materialized
- In principle they provide the same capability
- Performance may differ and may be studied
14. Motivation: what we can get
- If we represent the whole file system within one XML document:
- We can generate various inode structures
- We have the ability to find files of given properties very easily
- Only involves searches on one flat document
- We have the ability to reorganize files in the file system
- Simple XQuery statements
15. Motivation: XML-based file system
- The file system hierarchy is kept by nested XML elements
- The metadata is described by the attribute list of the element
- File content can be wrapped or linked into the element
Figure 9. XML-based file system example
16. Motivation: file search diagram (1)
Figure 10. Directory tree example
17. Motivation: file search diagram (2)
- Searching for ./dir1/dir4/file8 in a Linux file system
Figure 11. File search process in Unix-like file systems
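The search for ./dir1/dir4/file8 can be sketched as a component-by-component walk, where each step scans one directory for the next name. The Java sketch below models directories as in-memory maps instead of inodes and data blocks; the class name PathLookup, the root i-number 2 and the i-numbers 10 to 12 are hypothetical values chosen for the example.

```java
import java.util.HashMap;
import java.util.Map;

class PathLookup {
    static final int ROOT_INUMBER = 2;   // assumed root i-number for this sketch

    // directory i-number -> (child name -> child i-number)
    final Map<Integer, Map<String, Integer>> directories = new HashMap<>();

    void addEntry(int dirInumber, String name, int childInumber) {
        directories.computeIfAbsent(dirInumber, k -> new HashMap<>()).put(name, childInumber);
    }

    /** Resolves a path relative to startDir; returns -1 if any component is missing. */
    int lookup(int startDir, String path) {
        int current = startDir;
        for (String component : path.split("/")) {
            if (component.isEmpty() || component.equals(".")) {
                continue;                             // "." stays in the same directory
            }
            Map<String, Integer> entries = directories.get(current);
            if (entries == null) {
                return -1;                            // current file is not a directory
            }
            Integer next = entries.get(component);    // one directory scan per component
            if (next == null) {
                return -1;                            // name not found in this directory
            }
            current = next;
        }
        return current;
    }

    /** Builds the hypothetical tree dir1/dir4/file8 under the root. */
    static PathLookup sample() {
        PathLookup fs = new PathLookup();
        fs.addEntry(ROOT_INUMBER, "dir1", 10);
        fs.addEntry(10, "dir4", 11);
        fs.addEntry(11, "file8", 12);
        return fs;
    }

    public static void main(String[] args) {
        System.out.println(sample().lookup(ROOT_INUMBER, "./dir1/dir4/file8"));
    }
}
```

The cost grows with the depth of the path, because every component requires reading another directory before the target inode is reached.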
18. Motivation: XQuery example (1)
- In a Linux file system, we can't do this in a single search; we need to recursively search each directory within the entire file system
19. Motivation: XQuery example (2)
- We can list all files in a certain directory in one search, but we can't get all files in a system-wide scope in a single search
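The XQuery statements shown on these two slides are not preserved in this text, so the following is only an illustrative stand-in. It uses the XPath support that ships with the JDK (not XQuery, and not JDOM) to run one system-wide search over a small XML file system document. The element names dir and file, the attributes name and size, and the class name XmlFsQuery are all assumptions.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class XmlFsQuery {
    // Hypothetical XML file system layer; element and attribute names are assumptions.
    static final String SAMPLE =
        "<dir name='/'>"
      + "<dir name='dir1'><file name='a.txt' size='120'/>"
      + "<dir name='dir4'><file name='file8' size='4096'/></dir></dir>"
      + "<dir name='dir2'><file name='b.log' size='9000'/></dir>"
      + "</dir>";

    /** Returns the names of all files, at any depth, that satisfy the XPath predicate. */
    static List<String> findFiles(String xml, String predicate) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            // "//file" matches file elements in every directory with a single expression.
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate("//file[" + predicate + "]", doc, XPathConstants.NODESET);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                names.add(((Element) hits.item(i)).getAttribute("name"));
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("query failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(findFiles(SAMPLE, "@size > 1000"));
    }
}
```

A single expression covers the whole tree here, whereas the directory-based search has to visit each directory in turn.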
21. Obstacles and objectives: obstacles
- We have not found any previous work on XML-ization of file systems
- Linux virtual file system is rigid
- It supports a fixed set of functions
- New functions cannot be added
- For XML-ization we used Java and JDOM
- Comparison with existing benchmarks is difficult
- The performance of XML navigation tools such as DOM and parsers is not high
22. Obstacles and objectives: objectives
- Find a way to remove or mitigate the limitations of existing file systems
- Generate various internal file representations
- Remove the limitations of maximum file size
- Obtain good performance for querying files
- Show the feasibility of the application of XML in operating systems
24. XML-LFS architecture: overview of Linux file system architecture
Figure 12. Exploration of Linux file system
25. XML-LFS architecture: XML-LFS architecture (1)
Figure 13. XML-LFS architecture
27. Design and implementation of XML-LFS: design file operations (1)
- Create file system
- Load XML configuration file and retrieve file system parameters
- Create the RandomAccessFile instance on disk
- Format this random access file according to file system parameters
- Initialize system data structures
- Create XML file system layer (XML-LFS.xml)
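The creation steps above can be sketched with java.io.RandomAccessFile, the class the slide names. The sketch reserves a 32M image with 1k blocks and writes a few superblock fields at offset 0. The class name DiskImageFormatter, the magic number and the order of the superblock fields are hypothetical; the actual XML-LFS superblock layout is in Figure 15 and is not reproduced here, and the sizes are hard-coded rather than loaded from an XML configuration file.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;

class DiskImageFormatter {
    static final int MAGIC = 0x584C4653;                 // hypothetical magic number
    static final int BLOCK_SIZE = 1024;                  // 1k data block
    static final long IMAGE_SIZE = 32L * 1024 * 1024;    // 32M random access file

    /** Sizes the image file and writes a minimal superblock at offset 0. */
    static void format(File image, int totalInodes) {
        try (RandomAccessFile raf = new RandomAccessFile(image, "rw")) {
            raf.setLength(IMAGE_SIZE);    // reserve the whole simulated disk
            raf.seek(0);
            raf.writeInt(MAGIC);          // field order is an assumption of this sketch
            raf.writeInt(BLOCK_SIZE);
            raf.writeInt(totalInodes);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the three superblock fields back, as a mount step might do. */
    static int[] readSuperblock(File image) {
        try (RandomAccessFile raf = new RandomAccessFile(image, "r")) {
            return new int[] { raf.readInt(), raf.readInt(), raf.readInt() };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Creates and formats a temporary image that is deleted when the JVM exits. */
    static File createFormattedTempImage(int totalInodes) {
        try {
            File image = File.createTempFile("xml-lfs", ".img");
            image.deleteOnExit();
            format(image, totalInodes);
            return image;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File image = createFormattedTempImage(32768);
        int[] sb = readSuperblock(image);
        System.out.println(image.length() + " bytes, block size " + sb[1] + ", inodes " + sb[2]);
    }
}
```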
28. Design and implementation of XML-LFS: design file operations (2)
- Mount/(unmount) file system
- Load necessary information about the system into the memory
- Superblock page
- First inode bitmap page
- First data block bitmap page
- Set the current working directory to be the root directory
- Parse XML-LFS.xml and get the root element of this XML document
29. Design and implementation of XML-LFS: design file operations (3)
- Create/(delete) a file
- Generate the full pathname
- Filename + current working directory
- Check whether this file exists or not
- Hash(pathname, i-number)
- Generate the corresponding XML element according to the file type
- Allocate the disk inode and in-core inode
- Initialize the in-core inode
- Write in-core inode to disk inode
- Update its parent's inode
- Append the XML element of this file to the right position in the XML file system layer
- Update the parent XML element
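A minimal sketch of the first half of the create operation: building the full pathname, checking existence through a pathname to i-number hash table, and allocating a free inode from a bitmap. It deliberately omits the disk inode write, the parent inode update and the XML element handling. The class name FileCreator, the use of HashMap and BitSet, and the decision to reserve i-number 0 are assumptions, not the XML-LFS implementation.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

class FileCreator {
    static final int TOTAL_INODES = 32768;   // total number of disk inodes

    final Map<String, Integer> pathToInumber = new HashMap<>();   // Hash(pathname, i-number)
    final BitSet inodeBitmap = new BitSet(TOTAL_INODES);          // set bit = inode in use
    String cwd = "/";                                             // current working directory

    FileCreator() {
        inodeBitmap.set(0);                   // assumption: i-number 0 is never handed out
        pathToInumber.put("/", allocateInode());
    }

    /** Marks the lowest free inode as used and returns its number. */
    int allocateInode() {
        int free = inodeBitmap.nextClearBit(0);
        if (free >= TOTAL_INODES) {
            throw new IllegalStateException("no free inodes");
        }
        inodeBitmap.set(free);
        return free;
    }

    /** Returns the new file's i-number, or -1 if the pathname already exists. */
    int create(String filename) {
        String fullPath = cwd.endsWith("/") ? cwd + filename : cwd + "/" + filename;
        if (pathToInumber.containsKey(fullPath)) {
            return -1;                        // existence check through the hash table
        }
        int inumber = allocateInode();
        pathToInumber.put(fullPath, inumber);
        return inumber;
    }

    public static void main(String[] args) {
        FileCreator fs = new FileCreator();
        System.out.println(fs.create("notes.txt"));
        System.out.println(fs.create("notes.txt"));
    }
}
```

Keeping the pathname hash in memory lets the existence check avoid a directory walk, at the cost of having to keep the table consistent with the on-disk inodes.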
30. Design and implementation of XML-LFS: design the disk space format (1)
Figure 14. Disk layout of XML-LFS
- Total number of disk inodes: 32768
- Number of inodes in one bitmap page: 7968
- Number of bitmap pages: 5
- Size of the random access file: 32 MB
- Size of the data block: 1 KB
- Total number of data blocks: 32768
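The figures above are consistent with each other, which a short calculation shows: 32768 inodes at 7968 per bitmap page need 5 pages (4 pages cover only 31872), and a 32 MB file divided into 1 KB blocks gives 32768 blocks. The class and method names below are hypothetical.

```java
class LayoutMath {
    /** Ceiling division: the number of pages needed to hold all items. */
    static int pagesNeeded(int items, int itemsPerPage) {
        return (items + itemsPerPage - 1) / itemsPerPage;
    }

    /** Number of whole blocks that fit in the image. */
    static long blockCount(long imageBytes, int blockBytes) {
        return imageBytes / blockBytes;
    }

    public static void main(String[] args) {
        System.out.println("inode bitmap pages: " + pagesNeeded(32768, 7968));
        System.out.println("data blocks: " + blockCount(32L * 1024 * 1024, 1024));
    }
}
```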
31. Design and implementation of XML-LFS: design the page format (2)
Figure 15. Superblock page
Figure 16. Inode bitmap page (the data block bitmap is similar)
Figure 17. Disk inode page
33. Performance experiments: Bonnie and Andrew benchmarks
- Bonnie benchmark
- Measures real I/O speed to see whether it becomes the bottleneck of the system
- Andrew benchmark
- Evaluates the internal interactions within the Andrew File System
34. Performance experiments: Bonnie-like experiment
- Measure the read/write speed of files with different file sizes as well as different directory levels
Table 1. Bonnie-like experiment result
- For the same file, it takes more time to read/write if its directory level is deeper
- It takes more time to write the file than to read the file
- For files of different sizes at the same directory level, it takes more time to read/write a larger file than a smaller one
35. Performance experiments: Andrew-like experiment (1)
- Measure the interactions among the system's internal parts
- Step 1: MakeDir
- Create a directory hierarchy.
- Step 2: Write
- Write content to each file.
- Step 3: ScanDir
- Recursively examine the status of each file.
- Step 4: ReadAll
- Read each byte of each file.
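The four steps can be sketched as a small timing harness. The version below runs against the host file system through java.nio.file, not against XML-LFS, so it only illustrates the shape of the experiment; the class name AndrewLikeBench, the directory and file naming scheme and the parameters are assumptions, and the timings it produces are unrelated to Tables 2 and 3.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class AndrewLikeBench {
    long bytesRead;   // filled in by the ReadAll phase as a sanity check

    /** Creates a scratch directory for the benchmark files. */
    static Path tempRoot() {
        try {
            return Files.createTempDirectory("andrew-like");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Runs the four phases under root and returns elapsed nanoseconds per phase. */
    Map<String, Long> run(Path root, int dirs, int filesPerDir, int bytesPerFile) {
        Map<String, Long> nanos = new LinkedHashMap<>();
        try {
            List<Path> files = new ArrayList<>();
            long t = System.nanoTime();
            for (int d = 0; d < dirs; d++) {                      // Step 1: MakeDir
                Path dir = Files.createDirectories(root.resolve("d" + d));
                for (int f = 0; f < filesPerDir; f++) {
                    files.add(dir.resolve("f" + f));              // path only, no I/O yet
                }
            }
            nanos.put("MakeDir", System.nanoTime() - t);

            byte[] payload = new byte[bytesPerFile];
            t = System.nanoTime();
            for (Path p : files) {                                // Step 2: Write
                Files.write(p, payload);
            }
            nanos.put("Write", System.nanoTime() - t);

            t = System.nanoTime();
            List<Path> seen;
            try (Stream<Path> walk = Files.walk(root)) {          // Step 3: ScanDir
                seen = walk.filter(Files::isRegularFile).collect(Collectors.toList());
            }
            nanos.put("ScanDir", System.nanoTime() - t);

            t = System.nanoTime();
            bytesRead = 0;
            for (Path p : seen) {                                 // Step 4: ReadAll
                bytesRead += Files.readAllBytes(p).length;
            }
            nanos.put("ReadAll", System.nanoTime() - t);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return nanos;
    }

    public static void main(String[] args) {
        AndrewLikeBench bench = new AndrewLikeBench();
        System.out.println(bench.run(tempRoot(), 3, 4, 100) + ", bytes read: " + bench.bytesRead);
    }
}
```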
36. Performance experiments: Andrew-like experiment (2)
Table 2. Andrew-like experiment result
Table 3. Andrew-like experiment result (without XML layer)
38. Conclusion and future work: conclusion
- We examined some limitations of existing Unix-like file systems
- We explored the Linux file system architecture and instances
- We showed the ability to apply XML at the system level
- We designed and implemented an XML-based Linux-like file system runner
- Generate various inode structures
- Mitigate some existing limitations
- The system performance needs to be improved
39. Conclusion and future work: future work (1)
- XML file system layer
- Materialization
- Current XML-LFS implementation
- Keeps a record of file system metadata in a single XML document
- Slow I/O speed
- Virtualization
- View the binary format of the file system as an XML document, with data entries as XML elements
- The XML document can be incrementally computed on the fly when needed
- Saves time by not processing the XML document all the time
- Security
- Meta-data and file data protection
- Multi-level security of each document portion
40. Conclusion and future work: future work (2)
- Change the 3-level data block address representation inside the Linux inode structure to a B tree structure
Figure 18. B tree inode structure
- The logical order of data blocks within this file serves as the key
- The key is generated on the fly for the newly inserted data block
- The root of the B tree is cached
- A system daemon will clean the keys frequently
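The proposed indexing scheme can be sketched as an ordered map from a block's logical position in the file to its physical block number. The sketch uses java.util.TreeMap, which is a red-black tree and only stands in for the B tree of Figure 18; the class name BlockIndex, the 1024-byte block size and the append-only key generation are assumptions, and key cleaning by a daemon is not modeled.

```java
import java.util.TreeMap;

class BlockIndex {
    static final int BLOCK_SIZE = 1024;   // assumed 1k data blocks

    // logical block order within the file -> physical data block number
    private final TreeMap<Long, Long> index = new TreeMap<>();

    /** Appends a block; its key is generated on the fly as the next logical position. */
    long append(long physicalBlock) {
        long key = index.isEmpty() ? 0 : index.lastKey() + 1;
        index.put(key, physicalBlock);
        return key;
    }

    /** Maps a byte offset in the file to its physical block, or -1 if unmapped. */
    long blockForOffset(long offset) {
        Long physical = index.get(offset / BLOCK_SIZE);
        return physical == null ? -1 : physical;
    }

    public static void main(String[] args) {
        BlockIndex file = new BlockIndex();
        file.append(500);
        file.append(77);
        System.out.println(file.blockForOffset(1500));
    }
}
```

Unlike the fixed 3-level address scheme, an ordered index of this kind does not impose a structural cap on the number of blocks per file, which is the motivation given for the change.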
41. References
- [1] R. Card, T. Ts'o, and S. Tweedie, "Design and implementation of the second extended file system," in 1st Dutch International Symposium on Linux, 1994.
- [2] "Modern Skinning Tutorial - 2.2 XML introduction," 2004, http://www.winamp.com/nsdn/winamp/skinning/modern/tutorials/2.2-xml.php.
- [3] A. Russell Jones, "XML: we ain't seen nothin' yet," December 17, 2003, http://www.devx.com/xml/Article/18112.
- [4] Maurice J. Bach, The Design of the UNIX Operating System. Prentice-Hall, 1986.
- [5] Simon St. Laurent, "Bring the file system into the file: making information more accessible through object stores," 1998, http://www.simonstl.com/articles/filesyst.htm.
- [6] Ronald Schmelzer, "Breaking XML to optimize performance," 24 Oct, 2002, http://searchwebservices.techtarget.com/originalContent/0,289142,sid26_gci858888,00.html.
- [7] Q. Zhang and G. Lin, "Implementation of a relational database system on XML platform," class project for COMS 562, Department of Computer Science, Iowa State University, Fall 2005.
- [8] Nikolai Joukov, Avishay Traeger, Charles P. Wright, and Erez Zadok, "Benchmarking file system benchmarks," Technical Report FSL-05-04b.
- [9] Tim Bray and Lauren Wood, "Introduction of Bonnie," 1996, http://www.textuality.com/bonnie/.
- [10] A. Tanenbaum, Operating Systems: Design and Implementation. Prentice Hall, 1987.
- Etc. (please see thesis for a full reference list)