Title: File Systems
1File Systems
2NTFS
- New Technology File System (NTFS) was built to
provide features like
- Reliability: introduced ideas like transactions
(grouping certain updates together to maintain
integrity)
- Security and Access Control: built-in features to
manage who can access files and what type of
access they have
3NTFS Features (Cont.)
- Large-capacity partitions: allows large
partitions and even RAID (Redundant Array of
Inexpensive Disks, treating multiple disks as one
large disk)
- Slack reduction: allocates space differently from
FAT
- Long file names: not limited to
8-character names with 3-character extensions
- Networking: built with networking in mind
4A more structured file system
- In NTFS files are more than just pools of data;
they have structure.
- The difference between FAT and NTFS is somewhat
analogous to the difference between a flat file
and a database.
- This file-system-as-database idea is taken
further in WinFS.
- Just as databases have data and
metadata (the data about the data), NTFS has
metadata files (files that contain data about
other files).
5Partition/Volume Boot Sector/Record
- One of the first things made when an NTFS
partition is created is the volume boot sector,
which contains
- BIOS parameter block: identifies the partition,
how big it is, etc.
- Volume boot code: code that starts to load the
operating system
6All else is files
- After the volume boot sector, just about
everything else is a file. There are
- Metadata files: files about files
- Created automatically when the partition is
formatted
- Placed at the beginning
- (Actual or real) data files
7MFT
- Think of the Master File Table (MFT) as a
database containing records about all of the
files (both data and metadata, including itself).
- Each file's record holds the values of its
attributes.
- The actual data in a data file is simply one of
its attributes.
8The first several records
- The first several records in the MFT are about
other important metadata files, including
- MFT itself
- MFT Mirror (1st 16 records)
- Log file (keeps account of transactions)
- Attribute Definition Table (names file
properties and says what they are)
- Root Directory Folder
- Bad cluster file
- Etc.
9MFT Zone
- There will be a record in the MFT for every file
on the partition.
- Thus the MFT needs room to grow.
- Some space in the partition, called the MFT Zone,
is reserved for this purpose.
- If part of the MFT zone is needed for storage, it
will eventually be used.
- On the other hand, the MFT can grow to be larger
than the MFT zone. It is then fragmented, which
could affect performance.
10Resident vs. Non-Resident Attributes
- The MFT's record size is fixed (between 1 KB and
4 KB), but the attributes may be of any size
(especially since a data file's data is an
attribute).
- Attributes that are contained in the MFT are
called resident.
- A small file may be entirely resident.
- Attributes that are linked to but not actually
contained in the MFT are called non-resident.
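The resident versus non-resident split can be sketched as a fitting problem. This is only an illustration: the 1 KB record size, the attribute sizes, and the helper `place_attributes` are assumptions made for the example, and real MFT records also carry headers that are ignored here.

```python
RECORD_SIZE = 1024  # assumed fixed MFT record size in bytes (1 KB)

def place_attributes(attributes):
    """Split attributes into resident (fit in the record) and non-resident.

    `attributes` maps an attribute name to its size in bytes. Smaller
    attributes are placed first; this greedy order is a simplification.
    """
    resident, non_resident = [], []
    free = RECORD_SIZE
    for name, size in sorted(attributes.items(), key=lambda kv: kv[1]):
        if size <= free:
            resident.append(name)
            free -= size
        else:
            non_resident.append(name)
    return resident, non_resident

# Hypothetical attribute sizes for a small and a large file.
small_file = {"file_name": 60, "standard_information": 48, "data": 200}
large_file = {"file_name": 60, "standard_information": 48, "data": 5_000_000}

print(place_attributes(small_file))  # everything is resident
print(place_attributes(large_file))  # the data attribute is non-resident
```

The small file is entirely resident, while the large file keeps only its small attributes in the record and links to its data.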
11Some File Attributes
- File name: can be up to 255 characters; a
file may have aliases
- Standard Information: read-only, hidden,
archived, time stamps, etc.
- Security Descriptor: Access Control Lists (ACLs),
who owns the file, who has what privilege, etc.
- Data: the actual data
12Security
- NTFS was designed with the idea of multiple users
and security in mind.
- The features necessary to implement a security
policy are built directly into the file system.
- In FAT32 a file may be hidden or read-only, but
in NTFS a file can be hidden from user1,
read-only to user2 and fully accessible to user3.
13Security Concepts
- Ownership: some user owns a file/folder and he or
she grants permissions to other users.
- Permissions: what a user can do with a
file/folder (read, read-write, delete, etc.)
- Users are placed in groups (possibly more than
one) and permissions are assigned to groups.
- Permissions can be inherited, e.g. a new file gets
the permissions of the folder it was created in.
- Auditing: tracking information about users'
access to and modification of files
16ACLs
- An important security attribute of a file is its
Access Control List (ACL).
- The ACL specifies which users can access the file
and in what way they can access it.
- There are two types of ACL
- System ACL: used for auditing purposes
- Discretionary ACL: explicit assigning of
permissions to users or groups
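A minimal sketch of how the two ACL types might be consulted on each access. The class `FileEntry`, the function `is_allowed`, and the permission names are hypothetical; real ACLs also support deny entries, which are left out here.

```python
from dataclasses import dataclass, field

@dataclass
class FileEntry:
    owner: str
    # Discretionary ACL: maps a user or group name to a set of permissions.
    dacl: dict = field(default_factory=dict)
    # System ACL: users or groups whose access attempts are audited.
    sacl: set = field(default_factory=set)

def is_allowed(entry, user, groups, permission, audit_log):
    """Return True if `user`, or one of the user's groups, holds `permission`."""
    principals = [user, *groups]
    allowed = any(permission in entry.dacl.get(p, set()) for p in principals)
    if any(p in entry.sacl for p in principals):
        audit_log.append((user, permission, allowed))
    return allowed

report = FileEntry(
    owner="user3",
    dacl={"user2": {"read"}, "user3": {"read", "write", "delete"}},
    sacl={"user1"},
)
log = []
print(is_allowed(report, "user1", [], "read", log))   # no entry for user1
print(is_allowed(report, "user2", [], "write", log))  # read-only for user2
print(is_allowed(report, "user3", [], "write", log))  # full access for user3
print(log)                                            # only user1 is audited
```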
17Permissions
18Reparse points
- Reparse points: one can associate an action or
actions with a file, so that if the file is
accessed, the action is performed.
- Analogous to a trigger in a database
- Reparse points are very flexible. One example is
redirection: sending one to another file or
directory, which may be on another drive or may even
have been archived.
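The redirection example can be sketched as an action that fires when a path is opened. The class `ReparseFS` and the paths are made up for illustration and do not reflect how NTFS stores reparse data.

```python
class ReparseFS:
    """Toy file store in which a path may carry a reparse action."""

    def __init__(self):
        self.files = {}    # path -> contents
        self.reparse = {}  # path -> callable(path) returning a new path

    def open(self, path, max_hops=8):
        # Follow reparse actions until a plain file is reached.
        for _ in range(max_hops):
            action = self.reparse.get(path)
            if action is None:
                return self.files[path]
            path = action(path)
        raise RuntimeError("too many reparse redirections")

fs = ReparseFS()
fs.files["D:/archive/report.txt"] = "quarterly numbers"
# Accessing the C: path triggers a redirection to another drive.
fs.reparse["C:/docs/report.txt"] = lambda p: "D:/archive/report.txt"
print(fs.open("C:/docs/report.txt"))
```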
19Other features
- Improved Security and Permissions: one change is
from static to dynamic permission inheritance.
- Static: a child inherits the parent's permissions
when it is created but is unaffected by
subsequent changes in the parent's permissions.
- Dynamic: a change to the parent's permissions will
affect the child's permissions.
- Change Journals: improved auditing (journaling)
of file/folder access activity.
- Encryption: automatic encryption/decryption of
files (when accessed by users with the
appropriate permissions).
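The static versus dynamic inheritance distinction above comes down to copying versus looking up. A minimal sketch, with hypothetical class names:

```python
class Folder:
    def __init__(self, permissions):
        self.permissions = set(permissions)

class StaticChild:
    """Copies the parent's permissions once, at creation time."""
    def __init__(self, parent):
        self._permissions = set(parent.permissions)

    def permissions(self):
        return self._permissions

class DynamicChild:
    """Looks up the parent's permissions every time they are needed."""
    def __init__(self, parent):
        self._parent = parent

    def permissions(self):
        return self._parent.permissions

folder = Folder({"read"})
static_file = StaticChild(folder)
dynamic_file = DynamicChild(folder)

folder.permissions.add("write")    # later change to the parent
print(static_file.permissions())   # unaffected by the change
print(dynamic_file.permissions())  # reflects the change
```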
20Improvements
- Disk Quotas: users or groups of users can be
limited in the amount of disk space they can use.
- Sparse File Support: a sparse file is one that
may be big but hold very little data (relative to
its size). NTFS has utilities to help store
sparse files more efficiently.
- Disk Defragmenter: strictly speaking part of the
operating system, it affects the file system.
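One way to picture sparse file storage is to keep only the blocks that hold data and let the holes read back as zeros. The 4 KB block size and the class `SparseFile` are assumptions for this sketch, not the NTFS on-disk layout.

```python
BLOCK = 4096  # assumed allocation unit in bytes

class SparseFile:
    """Stores only blocks that contain data; holes read back as zeros."""

    def __init__(self, logical_size):
        self.logical_size = logical_size
        self.blocks = {}  # block index -> bytes of length BLOCK

    def write_block(self, index, data):
        if len(data) != BLOCK:
            raise ValueError("this sketch writes whole blocks only")
        if any(data):
            self.blocks[index] = bytes(data)
        else:
            self.blocks.pop(index, None)  # an all-zero block needs no storage

    def read_block(self, index):
        return self.blocks.get(index, bytes(BLOCK))

    def allocated_bytes(self):
        return len(self.blocks) * BLOCK

f = SparseFile(logical_size=1_000 * BLOCK)
f.write_block(999, b"\x01" * BLOCK)
print(f.logical_size, f.allocated_bytes())
```

The file appears to be about 4 MB long, yet only one block is actually stored.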
21Transactions
- Don't forget NTFS is database-like.
- Almost any activity involving the drive in any way
is going to affect a number of files.
- NTFS introduces the notion of a transaction: the
grouping together of various operations to form
an atomic unit.
- In other words, these operations should be viewed
as all or nothing in order to maintain the file
system's integrity.
- Recall the ACID test from databases?
22Logging and Committing
- There is a special metafile for logging all
activity.
- When all of the components of a transaction are
complete, this completion is indicated in the log
file and the transaction is said to be committed.
- If something goes wrong (e.g. power failure)
before a transaction is completed, the file
system can undo the partially enacted transaction
to return the file system to a consistent state.
Doing so is said to be rolling back the
transaction.
- It is also called transaction recovery.
23Effect on Performance
- Logging each activity is great for the security
and integrity of the file system but does have
some negative effects on performance.
- Each file access now requires another file access
(writing to the log file).
- One way to recover some performance, at some risk
to integrity, is to cache the activity log changes
rather than write them to disk every time.
- The cached log records are written to disk
periodically but not continuously.
24Recovery
- Recovery then involves three passes over the log
file
- Analysis pass: determine the part of the disk
affected
- Redo pass: perform any transaction that was
completed since the last checkpoint
- Undo pass: roll back any incomplete transactions
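The three passes can be sketched over a toy log. The record layout, the function `recover`, and the dictionary standing in for the disk are all assumptions; the log is taken to start at the last checkpoint, and the sketch assumes committed and incomplete transactions do not touch the same item.

```python
def recover(disk, log):
    """Bring `disk` (a dict) back to a consistent state using `log`.

    Log records are tuples:
      ("update", txid, key, old_value, new_value)
      ("commit", txid)
    """
    # Analysis pass: which items were touched, and which transactions committed.
    updates = [rec for rec in log if rec[0] == "update"]
    affected = {rec[2] for rec in updates}
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    incomplete = {rec[1] for rec in updates} - committed

    # Redo pass: reapply updates of committed transactions, oldest first.
    for _, txid, key, _, new in updates:
        if txid in committed:
            disk[key] = new

    # Undo pass: roll back incomplete transactions, newest first.
    for _, txid, key, old, _ in reversed(updates):
        if txid in incomplete:
            disk[key] = old
    return disk, affected

# State after a crash: T1 committed but its write to "b" never reached the
# disk; T2 wrote "c" but never committed.
log = [
    ("update", "T1", "a", 0, 1),
    ("update", "T1", "b", 0, 2),
    ("commit", "T1"),
    ("update", "T2", "c", 5, 6),
]
print(recover({"a": 1, "b": 0, "c": 6}, log))
```

After recovery the committed change to "b" is in place and the uncommitted change to "c" has been rolled back.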
25Change Journal
- NTFS can record changes to files; these are kept
in the Change Journal.
- Each change is assigned an ID, an Update Sequence
Number (USN).
- It will record that a file was written to but not
what was written. Otherwise it would be
gargantuan.
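A sketch of the idea that the journal records the fact of a change, not its contents. The class `ChangeJournal` and its record fields are hypothetical, and numbering changes with consecutive integers is an assumption of the sketch.

```python
import itertools

class ChangeJournal:
    """Records that a file changed and why, never the data itself."""

    def __init__(self):
        self._next_usn = itertools.count(1)
        self.records = []

    def record(self, path, reason):
        usn = next(self._next_usn)
        self.records.append({"usn": usn, "path": path, "reason": reason})
        return usn

    def changes_since(self, usn):
        """Everything that happened after the given USN."""
        return [r for r in self.records if r["usn"] > usn]

journal = ChangeJournal()
journal.record("a.txt", "created")
mark = journal.record("a.txt", "data written")
journal.record("b.txt", "renamed")
print(journal.changes_since(mark))
```

An application such as a backup tool could remember the last USN it processed and later ask only for the changes since then.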
26Fault Tolerance
- NTFS has a fault-tolerant disk driver known as
FTDISK.
- That's where one can find the transaction
recovery features.
- It is also where one finds support for RAID (redundant
array of inexpensive, or independent,
disks).
- And it is where you'll find dynamic bad cluster
remapping.
- Basically the drive reads immediately after
writing to ensure that the cluster written to was
OK. If it was not, it writes the data somewhere else
and marks the cluster as bad.
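The write, read back, remap cycle might look like the following. The class `Disk` simulates faulty clusters and is purely illustrative; choosing the first free cluster as the spare is an assumption.

```python
class Disk:
    def __init__(self, n_clusters, faulty=()):
        self.clusters = [None] * n_clusters
        self.faulty = set(faulty)  # clusters that silently corrupt writes
        self.bad = set()           # clusters the file system has marked bad

    def _raw_write(self, index, data):
        self.clusters[index] = b"??" if index in self.faulty else data

    def _find_spare(self):
        for i, contents in enumerate(self.clusters):
            if contents is None and i not in self.bad:
                return i
        raise OSError("no spare clusters left")

    def write_verified(self, index, data):
        """Write, read back, and remap to a spare cluster on mismatch."""
        while True:
            self._raw_write(index, data)
            if self.clusters[index] == data:  # read-after-write check
                return index
            self.bad.add(index)               # mark the cluster as bad
            index = self._find_spare()        # and try somewhere else

disk = Disk(8, faulty={2})
where = disk.write_verified(2, b"hello")
print(where, disk.bad)
```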
27Compression
- NTFS has built-in utilities for file compression.
- File compression takes advantage of patterns in
data to reduce the amount of space required to
store it.
- E.g. instead of ASCII code for text (each
character 8 bits) one might use a variable-length
code with short codes for common letters like e
and longer codes for uncommon letters like q or
j. On average the files are much smaller.
- In NTFS one can compress any part of the
partition.
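The variable-length code idea can be tried out with Huffman coding, a standard way of building such a code. This illustrates the slide's example only; it is not a claim about the algorithm NTFS itself uses, and the sample text is arbitrary.

```python
import heapq
from collections import Counter

def huffman_code(text):
    """Return a dict mapping each character to a bit string."""
    counts = Counter(text)
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    # Heap entries: (frequency, tie_breaker, {char: code_so_far}).
    heap = [(n, i, {ch: ""}) for i, (ch, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent groups, prefixing their codes.
        n1, _, left = heapq.heappop(heap)
        n2, _, right = heapq.heappop(heap)
        merged = {ch: "0" + code for ch, code in left.items()}
        merged.update({ch: "1" + code for ch, code in right.items()})
        heapq.heappush(heap, (n1 + n2, tie, merged))
        tie += 1
    return heap[0][2]

text = "the quick brown fox jumps over the lazy dog near the eerie tree"
code = huffman_code(text)
encoded_bits = sum(len(code[ch]) for ch in text)
print(len(text) * 8, encoded_bits)       # fixed 8-bit size vs. encoded size
print(len(code["e"]), len(code["q"]))    # common letter vs. rare letter
```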
28POSIX support
- NTFS offers POSIX support.
- POSIX stands for Portable Operating System
Interface for UNIX.
- It allows software developers to make sure that
their code can be ported to a POSIX-compliant
operating system, which includes most versions of
UNIX.
29Supports Encryption
- NTFS supports Encrypting File System (EFS).
- EFS is really part of the operating system
(Windows 2000). But the operating system works
with the file system to make this feature easy to
use.
30Disk Quota support
- As a genuinely multi-user file system, NTFS
supports disk quotas.
- A quota can be set for a particular user, on a
particular partition, or for the combination.
- Allows for limits and warnings. The user is
warned when he or she exceeds the warning amount.
The user is blocked (from writing?) when he or
she exceeds the limit amount.
- Events that cause a user to go over the "limit"
or "warning" levels can be monitored and logged.
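A small sketch of the warning and limit behaviour. The class `Quota` is hypothetical, and refusing the write once the limit would be exceeded is an assumption, since the slide itself is unsure what "blocked" covers.

```python
class Quota:
    def __init__(self, warning, limit):
        self.warning, self.limit = warning, limit
        self.used = 0
        self.events = []  # logged "warning" and "limit" events

    def write(self, nbytes):
        """Charge `nbytes` to the user; refuse writes beyond the limit."""
        if self.used + nbytes > self.limit:
            self.events.append("limit")
            return False
        before = self.used
        self.used += nbytes
        if before <= self.warning < self.used:
            self.events.append("warning")  # crossed the warning level
        return True

q = Quota(warning=80, limit=100)
print(q.write(70), q.write(20), q.write(20))
print(q.used, q.events)
```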
31WinFS
- Microsoft's next version of the filesystem is
WinFS. (supposedly?)
- While NTFS brings many of the concepts of a
database to the filesystem, WinFS is a database.
- With the database ideas built in rather than
overlaid, searching and querying should be
enhanced. There will be metadata about the files.
The dominating idea will become the properties
and logical relationships of the file rather than
its position in some hierarchical system of
folders and files.
- http://en.wikipedia.org/wiki/WinFS
- http://msdn.microsoft.com/data/WinFS/default.aspx
32DMA
33Transfer Mode
- The transfer mode describes the way in which the
data moves from the hard disk through the
interface (IDE/ATA) and to the memory.
- For example, it tells how fast data is
transferred or what device is in charge of the
transfer.
- There are two basic categories
- PIO (Programmed I/O) Mode
- The processor micro-manages data transfer
- DMA (Direct Memory Access) Mode
- The processor delegates data transfer
34Programmed I/O (PIO) Modes
- In the PIO category, the processor controls the
data transfer.
- There are various PIO modes, which differ mainly
by speed.
- Through the early to mid-90s PIO was the standard
way to transfer data to and from the hard disk.
- The original ATA standards document defined the
first three modes.
- With ATA-2, two faster modes were introduced.
35Standard PIO Modes
3.3 MB/s = 3.3 × 10^6 bytes / second
≈ 2 bytes / (600 × 10^-9 s). Two bytes are
transferred every 600 nanoseconds.
36External rates
- The PIO rates on the previous slide are external
rates, meaning that they reflect the rate at which
data in the hard disk's buffer/cache can be
transferred.
- Recall that access times to locate and read from
a random sector are of the order of milliseconds.
- Reading a sector (512 bytes) in 20 ms would
correspond to a rate of about 25 KB/s.
- If one were not buffering and transferring
consecutive data, the PIO mode rates would be
sufficient.
- But we do transfer buffered data, and the PIO
transfer rates are considered prohibitively slow
by today's standards.
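The two rates being compared are simple divisions, worked here with the numbers from the slides (two bytes per 600 ns cycle, and one 512-byte sector per 20 ms). The helper name is made up for the example.

```python
def rate_bytes_per_second(nbytes, seconds):
    return nbytes / seconds

# External (interface) rate: two bytes per 600 ns cycle.
pio_rate = rate_bytes_per_second(2, 600e-9)
# Unbuffered random access: one 512-byte sector per 20 ms.
random_rate = rate_bytes_per_second(512, 20e-3)

print(f"PIO: {pio_rate / 1e6:.1f} MB/s")
print(f"random access: {random_rate / 1e3:.1f} KB/s")
print(f"ratio: {pio_rate / random_rate:.0f}x")
```

The interface is over a hundred times faster than unbuffered random reads, which is why the slow PIO rates only become a bottleneck once buffered, consecutive data is being transferred.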
37PIO is too, too slow
- PIO is slow in two ways
- One does not achieve the same data transfer rates
as with Ultra DMA, which is the standard transfer
mode used for IDE/ATA today.
- Because the processor controls the details of the
transfer in PIO, the processor is distracted from
performing other tasks.
- Despite its slowness, PIO is still around
because
- PIO is simple (built into the BIOS so it does not
require drivers).
- Backward compatibility
- It can be used as a backup when something goes wrong
with DMA.
38DMA
- The alternative to PIO is DMA, Direct Memory
Access.
- In DMA, a device transfers information to or from
the memory directly rather than in a
processor-controlled fashion.
- DMA has been around a while, but it was not always
well supported early on. Speed requirements
have since made it preferred over PIO.
39Various Modes
- As with PIO, DMA has various modes differing
mainly by speed.
- DMA modes split into two categories
- Single-word modes, which send one word (two bytes)
at a time
- Multiple-word modes, which send several words in
rapid succession (rather like the idea of
bursting that accounts for improved memory
speeds).
41Assume it's multiword
- The single-word DMA modes are too slow. Today it
is understood that DMA means multiword DMA; the
word "multiword" is rarely mentioned and usually implied.
- In fact, the single-word DMA modes were dropped
from the standards with ATA-3.
- Ultra DMA is multiword.
42First Party
- The party of the first part shall be known in
this contract as the party of the first part
43First Party vs. Third Party
- In Third-Party DMA there is a third device, the
DMA controller, mediating the transfer between the
hard disk and memory.
- Third-party DMA is slow and old-fashioned.
- In First-Party DMA, a.k.a. bus mastering, the
middle man is eliminated. The hard drive
controls the transfer of data between itself and
memory.
- The device (hard drive in this case) takes
control of (masters) the bus along which the
information is sent.
44Ultra DMA
- DMA only became the norm with the introduction of
Ultra DMA.
- It just wasn't well supported before.
- DMA gained the advantage over PIO when Ultra
DMA/33 doubled the interface transfer rate.
- Support for it also improved.
- Ultra DMA became an industry standard.
45Ultra DMA uses DDR and CRC
- One feature that made Ultra DMA "ultra" was that
it transferred data on both the positive and
negative edges of the clock.
- Same idea as in DDR (Double Data Rate) memory
- As Ultra DMA pushed the limit on transfer rate,
it made the occurrence of errors somewhat more
likely. Thus it introduced a CRC (Cyclic
Redundancy Check) as part of the standard.
- Recall CRC is error detection. If an error
occurs in transmission the data is retransmitted.
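Detect-and-retransmit can be sketched with a 16-bit CRC from the Python standard library. The functions `send`, `receive`, and `transfer` are hypothetical, and the CRC-CCITT polynomial and 0xFFFF starting value used by `binascii.crc_hqx` here are stand-ins, not parameters taken from the ATA standard.

```python
import binascii

def send(words, corrupt=False):
    """Return (payload, crc) as the receiver would see them."""
    crc = binascii.crc_hqx(words, 0xFFFF)
    payload = bytearray(words)
    if corrupt:
        payload[0] ^= 0x01  # flip one bit in transit
    return bytes(payload), crc

def receive(payload, crc):
    """True if the payload matches the CRC computed by the sender."""
    return binascii.crc_hqx(payload, 0xFFFF) == crc

def transfer(words, corrupt_first_attempt):
    """Resend until the CRC check passes; return (payload, attempts)."""
    attempts = 0
    while True:
        attempts += 1
        damaged = corrupt_first_attempt and attempts == 1
        payload, crc = send(words, corrupt=damaged)
        if receive(payload, crc):
            return payload, attempts

data = b"sector data"
print(transfer(data, corrupt_first_attempt=False)[1])  # clean transfer
print(transfer(data, corrupt_first_attempt=True)[1])   # one retransmission
```

The CRC only detects the error; recovery comes from sending the data again.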
46Ultra DMA transfer rates
DMA finally beats PIO Mode 4's 16.7 MB/s with the
added bonus of freeing up the processor. The
modes are usually named after their maximum
transfer rate and the interface they use: Ultra
ATA/100 instead of Ultra DMA/Mode 5.
47New cable needed
- The faster speeds did require a change in the
cable used to connect the drive: the 80-conductor
ATA/IDE cable.
- Color Code
- Blue connects to the host (motherboard or
controller).
- Gray connects to the slave drive (if there is
one).
- Black connects to the master drive.
48 80 wires and 40 pins?? (pre-SATA)
- Signals are varying currents.
- Currents produce magnetic fields; varying
currents produce varying magnetic fields.
- Varying magnetic fields produce currents.
- Oops, the current in wire 1 begins to change the
current in wire 2.
- There is interference (a.k.a. crosstalk).
- The extra wires shield the signal-carrying wires
from each other.
49Block Mode
- Certain BIOSes allow for a Block Mode setting.
Block Mode allows 16 or 32 sectors (512 bytes
each) to be handled using a single interrupt
(even if the processor is not running the show, it
needs to know something has happened).
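The saving from Block Mode is easy to count. This sketch assumes one interrupt per sector without Block Mode and a hypothetical 64 KB transfer; the function name is made up.

```python
import math

def interrupts_needed(sectors, sectors_per_interrupt=1):
    """Number of interrupts raised to transfer `sectors` sectors."""
    return math.ceil(sectors / sectors_per_interrupt)

transfer_sectors = 128  # 64 KB at 512 bytes per sector
for block in (1, 16, 32):
    print(block, interrupts_needed(transfer_sectors, block))
```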
50References
- PC Hardware in a Nutshell (Thompson and Thompson)
- http://www.pcguide.com
- All-in-One A+ Certification (Meyers and Jernigan)
- http://www.webopedia.com
- http://www.serialata.org