CENG 340 Data Management and File Structures - PowerPoint PPT Presentation

Provided by: atili

1
CENG 340 Data Management and File Structures
  • (Fall 2007)

2
Introduction to File management
3
Motivation
  • Most computers are used for data processing (over
    $80 billion/year), a big growth area in the
    information age.
  • This course covers data processing from a
    computer science perspective:
  • Storage of data
  • Organization of data
  • Access to data
  • Processing of data

4
Data Structures vs File Structures
  • Both involve:
  • Representation of data
  • Operations for accessing data
  • Difference:
  • Data structures deal with data in main memory
  • File structures deal with data in secondary
    storage

5
Where do File Structures fit in Computer Science?
Application
DBMS
File system
Operating System
Hardware
6
Computer Architecture
  • Main Memory (RAM): semiconductors; fast,
    expensive, volatile, small. Data is manipulated
    here.
  • Data is transferred between main memory and
    secondary storage.
  • Secondary Storage: disks, tape; slow, cheap,
    stable, large. Data is stored here.
7
  • Advantages:
  • Main memory is fast
  • Secondary storage is big (because it is cheap)
  • Secondary storage is stable (non-volatile), i.e.
    data is not lost during power failures
  • Disadvantages:
  • Main memory (MM) is small. Many databases are too
    large to fit in MM.
  • Main memory is volatile, i.e. data is lost during
    power failures.
  • Secondary storage is slow (tens to hundreds of
    thousands of times slower than MM)

8
How fast is main memory?
  • Typical time for getting info from:
  • Main memory: 120 nanosec = 120 x 10^-9 sec
  • Magnetic disks: 30 millisec = 30 x 10^-3 sec
  • An analogy keeping the same time proportion as
    above:
  • Looking at the index of a book: 20 sec
  • versus
  • Going to the library: 58 days
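
The proportion in the analogy can be checked with a little arithmetic. The sketch below assumes the 120 x 10^-9 sec and 30 x 10^-3 sec figures from this slide; the function names are illustrative and not part of the course material.

```c
/* Ratio of disk access time to main-memory access time. */
double access_ratio(double disk_sec, double memory_sec)
{
    return disk_sec / memory_sec;
}

/* Scale a quick everyday task by that ratio and express it in days. */
double scaled_days(double fast_task_sec, double ratio)
{
    return fast_task_sec * ratio / (24.0 * 60.0 * 60.0);
}
```

With these figures, access_ratio(30e-3, 120e-9) is about 250,000, and scaled_days(20.0, 250000.0) is about 57.9, which is where the roughly 58 days comes from.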

9
Normal Arrangement
  • Secondary storage (SS) provides reliable,
    long-term storage for large volumes of data
  • At any given time, we are usually interested in
    only a small portion of the data
  • This data is loaded temporarily into main memory,
    where it can be rapidly manipulated and
    processed.
  • As our interests shift, data is transferred
    automatically between MM and SS, so the data we
    are focused on is always in MM.

10
Goal of the file structures
  • Minimize the number of trips to the disk in order
    to get desired information
  • Grouping related information so that we are
    likely to get everything we need with only one
    trip to the disk.

11
Physical Files and Logical Files
  • Physical file: a collection of bytes stored on a
    disk or tape
  • Logical file: a "channel" (like a telephone line)
    that connects the program to a physical file
  • The program (application) sends (or receives)
    bytes to (from) a file through the logical file.
    The program knows nothing about where the bytes
    go (or come from).
  • The operating system is responsible for
    associating a logical file in a program with a
    physical file on disk or tape. Writing to or
    reading from a file in a program is done through
    the operating system.

12
Files
  • The physical file has a name, for instance
    myfile.txt
  • The logical file has a logical name (a variable)
    inside the program.
  • In C:
  • FILE *outfile;
  • In C++:
  • fstream outfile;

13
Basic File Processing Operations
  • Opening
  • Closing
  • Reading
  • Writing
  • Seeking

14
Opening Files
  • Opening a file links a logical file to a
    physical file.
  • In C:
  • FILE *outfile;
  • outfile = fopen("myfile.txt", "w");
  • In C++:
  • fstream outfile;
  • outfile.open("myfile.txt", ios::out);
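
Opening can fail (for example, when the directory does not exist), so in practice the result of fopen is usually checked. A minimal sketch in C; the helper name open_for_output is hypothetical and not from the slides.

```c
#include <stdio.h>

/* Link the logical file to the physical file `path` for writing.
   Returns NULL, after reporting the reason, if the link cannot be made. */
FILE *open_for_output(const char *path)
{
    FILE *outfile = fopen(path, "w");   /* "w" creates or truncates the file */
    if (outfile == NULL)
        perror(path);                   /* prints the system's error message */
    return outfile;
}
```

A caller would test the returned pointer against NULL before writing, and call fclose on it when done.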

15
Closing Files
  • Cuts the link between the physical and logical
    files.
  • After closing a file, the logical name is free to
    be associated with another physical file.
  • Closing a file used for output guarantees
    everything has been written to the physical file.
    (When the file is closed the leftover from the
    buffer is flushed to the file.)
  • In C:
  • fclose(outfile);
  • In C++:
  • outfile.close();
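
The reuse of a logical name after closing can be sketched as follows. This is only an illustration: the function name and the text written are assumptions, not part of the slides.

```c
#include <stdio.h>

/* Write one line to each of two physical files through the same
   logical file. Returns 0 on success, -1 on any failure. */
int write_two_files(const char *first, const char *second)
{
    FILE *outfile = fopen(first, "w");
    if (outfile == NULL)
        return -1;
    fputs("first file\n", outfile);
    if (fclose(outfile) != 0)       /* flushes the buffer, then cuts the link */
        return -1;

    outfile = fopen(second, "w");   /* same logical name, new physical file */
    if (outfile == NULL)
        return -1;
    fputs("second file\n", outfile);
    return fclose(outfile) == 0 ? 0 : -1;
}
```

Checking the return value of fclose matters for output files, because a failed flush is reported there.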

16
Reading
  • Read data from a file and place it in a variable
    inside the program.
  • In C:
  • char c;
  • FILE *infile;
  • infile = fopen("myfile.txt", "r");
  • fread(&c, 1, 1, infile);
  • In C++:
  • char c;
  • fstream infile;
  • infile.open("myfile.txt", ios::in);
  • infile >> c;
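
A fuller version of the C fragment might read the whole file one byte at a time, using the return value of fread to detect the end. This is a sketch; the function name count_bytes is hypothetical, and the file is opened in binary mode ("rb") as an assumption, so that no newline translation changes the count.

```c
#include <stdio.h>

/* Count the bytes in `path` by reading one byte at a time.
   Returns -1 if the file cannot be opened. */
long count_bytes(const char *path)
{
    char c;
    long n = 0;
    FILE *infile = fopen(path, "rb");
    if (infile == NULL)
        return -1;
    while (fread(&c, 1, 1, infile) == 1)   /* fread returns the number of items read */
        n++;
    fclose(infile);
    return n;
}
```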

17
Writing
  • Write data from a variable inside the program
    into the file.
  • In C:
  • char c;
  • FILE *outfile;
  • outfile = fopen("mynew.txt", "w");
  • fwrite(&c, 1, 1, outfile);
  • In C++:
  • char c;
  • fstream outfile;
  • outfile.open("mynew.txt", ios::out);
  • outfile << c;
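
The same idea extends from one character to a whole buffer. The sketch below writes a string with a single fwrite call and checks how many bytes were actually written; write_text is an illustrative name, not something defined in the slides.

```c
#include <stdio.h>
#include <string.h>

/* Write the bytes of `text` to `path`. Returns 0 on success, -1 on failure. */
int write_text(const char *path, const char *text)
{
    size_t len = strlen(text);
    FILE *outfile = fopen(path, "w");
    if (outfile == NULL)
        return -1;
    size_t written = fwrite(text, 1, len, outfile);   /* items actually written */
    /* fclose is evaluated first, so the file is always closed */
    if (fclose(outfile) != 0 || written != len)
        return -1;
    return 0;
}
```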

18
Seeking
  • Used for direct access: an item can be accessed
    by specifying its position in the file.
  • In C:
  • fseek(infile, 0, SEEK_SET);   // moves to the beginning
  • fseek(infile, 0, SEEK_END);   // moves to the end
  • fseek(infile, -10, SEEK_CUR); // moves back 10 bytes from
  •                               // the current position
  • In C++:
  • infile.seekg(0, ios::beg);
  • infile.seekg(0, ios::end);
  • infile.seekg(-10, ios::cur);
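
Two common uses of seeking can be sketched in C: fetching a single byte at a known position, and finding the size of a file by seeking to its end. The helper names are assumptions made for illustration, and the file is opened in binary mode so that byte offsets are exact.

```c
#include <stdio.h>

/* Return the byte at offset `pos` from the start of the file,
   or EOF if the file cannot be opened or `pos` is past the end. */
int byte_at(const char *path, long pos)
{
    int c = EOF;
    FILE *infile = fopen(path, "rb");
    if (infile == NULL)
        return EOF;
    if (fseek(infile, pos, SEEK_SET) == 0)   /* jump directly to the position */
        c = fgetc(infile);
    fclose(infile);
    return c;
}

/* Return the size of the file in bytes, or -1 on failure. */
long file_size(const char *path)
{
    long size = -1;
    FILE *infile = fopen(path, "rb");
    if (infile == NULL)
        return -1;
    if (fseek(infile, 0, SEEK_END) == 0)     /* move to the end ...            */
        size = ftell(infile);                /* ... and ask where we are now   */
    fclose(infile);
    return size;
}
```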

19
File Systems
  • Data is not scattered hither and thither on disk.
  • Instead, it is organized into files.
  • Files are organized into records.
  • Records are organized into fields.

20
Example
  • A student file may be a collection of student
    records, one record for each student
  • Each student record may have several fields, such
    as
  • Name
  • Address
  • Student number
  • Gender
  • Age
  • GPA
  • Typically, each record in a file has the same
    fields.
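
One possible way to represent such a file in C is a fixed-length record, so that record number i starts at byte i * sizeof(record) and can be reached with a single seek. This is a sketch under stated assumptions: the field types and sizes and the function names are illustrative, and the slides do not prescribe this layout.

```c
#include <stdio.h>

/* One fixed-length student record; field sizes are illustrative. */
struct student {
    char   name[32];
    char   address[64];
    long   number;
    char   gender;
    int    age;
    double gpa;
};

/* Append one record to the end of the file. Returns 0 on success. */
int append_student(const char *path, const struct student *s)
{
    FILE *f = fopen(path, "ab");
    if (f == NULL)
        return -1;
    size_t n = fwrite(s, sizeof *s, 1, f);
    if (fclose(f) != 0 || n != 1)
        return -1;
    return 0;
}

/* Read record number `index` (0-based) by seeking directly to it. */
int read_student(const char *path, long index, struct student *out)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    int ok = fseek(f, index * (long)sizeof *out, SEEK_SET) == 0
          && fread(out, sizeof *out, 1, f) == 1;
    fclose(f);
    return ok ? 0 : -1;
}
```

Writing the struct as raw bytes keeps the example short, but the resulting file is only portable between programs built with the same compiler and platform, since padding and byte order may differ.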

21
Properties of Files
  • Persistence: Data written into a file persists
    after the program stops, so the data can be used
    later.
  • Shareability: Data stored in files can be shared
    by many programs and users simultaneously.
  • Size: Data files can be very large. Typically,
    they cannot fit into main memory.