Intelligent File System - PowerPoint PPT Presentation

About This Presentation
Title:

Intelligent File System

Description:

Title: Intelligent File System; Author: Customer; Last modified by: Customer; Created Date: 4/2/2002 4:29:42 PM

Slides: 41
Provided by: umk3

Transcript and Presenter's Notes

1
Blessed are the poor in spirit, for theirs is
the kingdom of heaven. <Matthew 5:3>
2
Intelligent File Sharing Framework
  • A THESIS IN
  • Computer Science
  • Changgyu Oh
  • 5/2/2002

3
Contents
  • Motivations
  • Network Topologies
  • Problem Domains
  • Research Goal
  • Related Works
  • Intelligent File Sharing Framework
  • Framework Figure
  • Query Service Using Reasoning
  • IS-A/Contained-In Hierarchies
  • File Association Rules
  • The Benefits of IFS Search
  • Grouping Service
  • IFS P2P vs. P2P Network
  • Dynamic Group Partition
  • IP Clue Mechanism
  • File Transaction in IFS
  • QUERY SERVICE TYPES
  • IFS System Architecture
  • Client View
  • Server View
  • IFS Prototype Implementation
  • IFS Query Interface
  • Experimental Results
  • Comparative Analysis
  • Contributions
  • Conclusion
  • Future Work
  • References

4
Motivations
  • Why P2P?
  • Limitations of Client/Server
  • Increasing interest in sharing and collaborative
    computing
  • Improving P2P technologies
  • Why P2P File Sharing?
  • File reusability
  • Share available resources
  • Significance of this research
  • Increase Network scalability
  • Anonymity
  • Flexible and powerful query

5
Network Topologies
6
Problem Domains (1)
  • Limitations of P2P Network
  • Scalability
  • Utilization of Network resources
  • P2P Network Topology
  • Broadcast
  • Logical Mesh network

7
Problem Domains (2)
  • Limitation of resource-source anonymity
  • The resource source's IP address appears in the
    queryHit message
  • Privacy and security
  • How can a source node send a resource to its
    destination without revealing its IP address in public?

8
Problem Domains (3)
  • Limitations of keyword-based query
  • Primitive and limited
  • Searches for only one file at a time
  • Not flexible
  • Does not satisfy users' requests

9
Research Goal
  • To increase P2P network scalability: message flow
    control (dynamic group partition and caching)
  • To protect publisher anonymity: IP-Clue mechanism
    (encoding/decoding)
  • To increase the capacity of file querying: file
    querying using intelligent reasoning, caching, and
    dynamic peer groups
10
Related Works-I: Anonymous Publication Service
  • The Publius system [Waldman M., 2000]
  • Provides document anonymity: the key is split
    among the n servers, and without sufficient
    shares of the key a server is unable to decrypt
    the document that it stores.
  • Anonymity is based on a static, system-wide list of
    available servers.
  • Does not support adding new servers
  • The Eternity system [Anderson R. J., 1996]
  • Provides publisher anonymity by using one-way
    anonymous re-mailers
  • Server anonymity is not provided
  • Reader anonymity is not provided by open public
    proxies
  • Query and Advertising System [Heimbigner D.,
    2000]
  • An arbitrary name is placed at the first-level
    server for each client.
  • The first-level server has the actual IP addresses
    of clients
  • Freenet [Clarke I., 2000]
  • Provides document anonymity
  • Server anonymity is not provided.

11
Related Works-II: Meta Search Methods
  • Efficient and Effective Metasearch [Yu C., 1999]
  • Representatives for each database, optimizing the
    relationship hierarchy
  • Efficient Transitive Closure Reasoning [Lee
    Y., 2001]
  • Inheritance and classification through transitive
    closure reasoning
  • Class/Part/Containment hierarchy
  • Browsing Large Digital Library Collections
    [Geffner S., 1999]
  • Classification hierarchies that increase the
    data-browsing capabilities of digital
    libraries.

12
Related Works-III: File Sharing Systems Using
Caching
  • The Distributed File System [Burns R. C., 2000]
  • Detecting network failures ensures that caches
    are consistent.
  • Network File System [Palmer J., 1996]
  • Clients poll the server to find out when the file
    was last modified
  • Determines whether the cached version is valid.
  • Hint-Based Cooperative Caching file system
    [Sarkar P., 2000]
  • Helps clients make decisions based on the
    computer's local state
  • Reduces overhead and access latency

13
Intelligent File Sharing Framework
  • Major Building Blocks
  • Query Service using Reasoning
  • IP-Clue Mechanism
  • Encoding/Decoding
  • Dynamic Grouping and Caching Service

14
Framework Figure
15
Query Service Using Reasoning
  • Goal
  • Fast search using the file relation hierarchy Set
  • More flexible query and directory services
  • Approach
  • Relationships
  • IS-A
  • Contained-In
  • Run-With
  • File Relation Hierarchy Set <?, R, O, ?>
  • Set of number pairs (?),
  • Relation type (R),
  • Constraint rule (O),
  • Hierarchy identifier (?).
  • File Association Rules
  • Generalized Association Rule
  • Aggregated Association Rule
  • Constraint-based Association Rule
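The slide does not say how the number pairs in the hierarchy set are assigned. One common way to make IS-A or Contained-In checks fast is to label each node with a (start, end) pair from a depth-first traversal, so that a descendant test becomes a simple interval comparison. The sketch below assumes that labelling scheme and a tree-shaped hierarchy; the function names and the sample hierarchy are hypothetical and are not taken from the thesis.

```python
def number_pairs(children, root):
    """Assign each node a (start, end) pair by depth-first traversal.

    `children` maps a node to the list of its direct children.
    Assumed scheme: a descendant's pair lies strictly inside its ancestor's.
    """
    pairs = {}
    counter = 0

    def visit(node):
        nonlocal counter
        counter += 1
        start = counter
        for child in children.get(node, []):
            visit(child)
        counter += 1
        pairs[node] = (start, counter)

    visit(root)
    return pairs


def is_descendant(pairs, node, ancestor):
    """True if `node` lies below `ancestor` in the hierarchy."""
    start, end = pairs[node]
    anc_start, anc_end = pairs[ancestor]
    return anc_start < start and end < anc_end


# Hypothetical IS-A hierarchy of application types.
IS_A_TREE = {
    "Application": ["MultimediaApplication", "WordProcessor"],
    "MultimediaApplication": ["WindowsMediaPlayer"],
}
PAIRS = number_pairs(IS_A_TREE, "Application")
```

With the pairs computed once, each relationship check needs two comparisons instead of a walk up the hierarchy.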

16
IS-A/Contained-In Hierarchies
17
File Association Rules
  • Generalized Association Rule
  • Subtype relationship between files
  • E.g., if Windows multimedia application X is a
    multimedia application Y, and a multimedia file
    Z runs with multimedia application Y,
    then X runs Z.
  • Aggregated Association Rule
  • A directory contains multiple sub-directories or
    files
  • E.g., find the files on CS101 homework
  • Constraint-based Association Rule
  • File association based on constraints such as
    file size, network capacity, etc.
  • E.g., find a file smaller than 1 MB whose
    type can be opened with MS Word.
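The three rule types above can be sketched as small queries over IS-A, Run-With, and Contained-In relations. This is only an illustration under assumptions: the relation tables, the SharedFile record, and the function names are hypothetical, and the IFS prototype's real data structures are not shown in the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SharedFile:
    name: str
    extension: str
    size_bytes: int
    directory: str


# Hypothetical relation tables (child -> parent, or file type -> application).
IS_A = {"WindowsMediaPlayer": "MultimediaApplication"}
RUN_WITH = {"mp3": "MultimediaApplication", "doc": "MSWord"}
CONTAINED_IN = {
    "CS101/homework/hw1": "CS101/homework",
    "CS101/homework": "CS101",
}


def closure(relation, node):
    """Return `node` followed by everything reachable through `relation`."""
    seen = [node]
    while node in relation:
        node = relation[node]
        if node in seen:  # guard against accidental cycles
            break
        seen.append(node)
    return seen


def runs(application, extension):
    """Generalized rule: X runs Z if Z runs with X or with a supertype of X."""
    return RUN_WITH.get(extension) in closure(IS_A, application)


def files_under(files, directory):
    """Aggregated rule: files in `directory` or any of its sub-directories."""
    return [f for f in files if directory in closure(CONTAINED_IN, f.directory)]


def constrained(files, application, max_bytes):
    """Constraint-based rule: size limit plus 'can be opened with' check."""
    return [
        f for f in files
        if f.size_bytes < max_bytes and runs(application, f.extension)
    ]


FILES = [
    SharedFile("hw1", "doc", 200_000, "CS101/homework/hw1"),
    SharedFile("lecture", "mp3", 5_000_000, "CS101"),
    SharedFile("thesis", "doc", 3_000_000, "Research"),
]
```

For example, `constrained(FILES, "MSWord", 1_000_000)` corresponds to the "smaller than 1 MB and opened with MS Word" query, and `files_under(FILES, "CS101/homework")` to the CS101 homework query.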

18
The Benefits of IFS Search
Method | IFS Search | Keyword-Based Search
Keyword search | Yes | Yes
File extension search | Yes | No
Application search | Yes | No
Directory search | Yes | No
Keyword search in a certain directory | Yes | No
File extension search in a certain directory | Yes | No
File search with constraints | Yes | Yes
Combination | Yes | No
19
Grouping Service
  • Goal: Increase scalability
  • Control the maximum hop count
  • Control the number of message replicas generated
    by peer hosts
  • Control the number of peer hosts used for message
    forwarding in the routing table of each peer host.
  • Approach
  • Group partition
  • Brother relationship
  • Caching
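The effect of the controls listed under the goal can be illustrated with a small simulation. This is a sketch under assumptions: the mesh, the `flood` function, and the `max_fanout` parameter are hypothetical, and the slides do not show IFS's actual forwarding logic. It counts how many query replicas a broadcast generates when the hop count and the number of forwarding targets per peer are bounded.

```python
from collections import deque


def flood(neighbors, source, max_hops, max_fanout=None):
    """Simulate a bounded broadcast; return (peers reached, messages sent).

    `neighbors` maps each peer to its routing-table entries.
    `max_fanout` limits how many entries a peer forwards to (None = all).
    """
    reached = {source}
    messages = 0
    queue = deque([(source, 0)])
    while queue:
        peer, hops = queue.popleft()
        if hops == max_hops:
            continue  # hop limit reached: do not forward further
        targets = neighbors.get(peer, [])
        if max_fanout is not None:
            targets = targets[:max_fanout]
        for nxt in targets:
            messages += 1  # every forwarded copy is one replica
            if nxt not in reached:
                reached.add(nxt)
                queue.append((nxt, hops + 1))
    return reached, messages


# Hypothetical six-peer logical mesh.
MESH = {
    "A": ["B", "C", "D"],
    "B": ["A", "C", "E"],
    "C": ["A", "B", "F"],
    "D": ["A"],
    "E": ["B"],
    "F": ["C"],
}

unbounded = flood(MESH, "A", max_hops=2)
limited = flood(MESH, "A", max_hops=2, max_fanout=2)
```

Comparing `unbounded` with `limited` shows the trade-off the slide points at: tighter limits cut the number of replicas but also reduce how many peers a query reaches.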

20
IFS P2P vs. P2P Network
21
Benefits of Dynamic Group Partition
  • Broadcast within the same group
  • Robust search against node failure
  • Ensures a shortest path
  • Increases network scalability by grouping peers
  • Server-less and decentralized
  • Dynamic partition
  • Reduces network traffic
  • Requires only one hop per group

22
Dynamic Group Partition
23
IP Clue Mechanism
  • Goal: Protect the identity of the resource publisher in
    P2P file sharing
  • Approach
  • IP encoding/decoding
  • Encode the IP address at the source peer
  • Decode the encoded IP address at the destination peer
  • Formula
  • Assume that the IP address of A is represented as
    W.X.Y.Z
  • (e.g., 255.122.25.5)
  • (1) W: the size of the query
  • (2) X: the first character of the query
  • (3) Y: the file extension size
  • (4) Z: the last character of the query message
  • Only the destination peers can recognize the IP
    clue.
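The slide pairs each octet of W.X.Y.Z with a value taken from the query but does not say how the two are combined. The following is a minimal sketch, assuming each octet is shifted by its clue value modulo 256. The function names (`clue_key`, `encode_ip`, `decode_ip`) and the additive scheme are illustrative assumptions, not the formula used in the thesis.

```python
def clue_key(query: str, extension: str) -> list[int]:
    """Derive one key value per octet from the query (assumed derivation)."""
    if not query:
        raise ValueError("query must not be empty")
    return [
        len(query) % 256,      # (1) W: size of the query
        ord(query[0]) % 256,   # (2) X: first character of the query
        len(extension) % 256,  # (3) Y: size of the file extension
        ord(query[-1]) % 256,  # (4) Z: last character of the query
    ]


def _octets(ip: str) -> list[int]:
    """Parse a dotted-quad IPv4 string into four integers in 0..255."""
    parts = [int(p) for p in ip.split(".")]
    if len(parts) != 4 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError(f"not a dotted-quad IPv4 address: {ip!r}")
    return parts


def encode_ip(ip: str, query: str, extension: str) -> str:
    """Source peer: shift each octet by its clue value (assumed operation)."""
    key = clue_key(query, extension)
    return ".".join(str((o + k) % 256) for o, k in zip(_octets(ip), key))


def decode_ip(encoded: str, query: str, extension: str) -> str:
    """Destination peer: undo the shift using the query it issued."""
    key = clue_key(query, extension)
    return ".".join(str((o - k) % 256) for o, k in zip(_octets(encoded), key))


# Hypothetical query "report" for a file with extension "doc".
encoded = encode_ip("255.122.25.5", "report", "doc")
decoded = decode_ip(encoded, "report", "doc")
```

A peer that knows the original query and file extension can rebuild the key and recover the address. Under this assumed scheme the address is obfuscated by the query rather than protected cryptographically.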

24
IP-Clue Mechanism
25
File Transaction in IFS
26
QUERY SERVICE TYPES
27
IFS System Architecture
  • Component-based Architecture
  • Servant Component
  • Highest level of component
  • Server and client components
  • Manager Components
  • Control work flow
  • Assign tasks to worker components
  • Worker Components
  • Perform actual tasks
  • Service (Entity) Components
  • Task description
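The layering above (servant, manager, worker, service) can be sketched as follows. The class and method names are hypothetical, chosen only to mirror the roles named on the slide. The IFS prototype itself is built on the Java-based Phex client, and its real classes are not shown in the transcript.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class TaskDescription:
    """Service (entity) component: describes a task."""
    name: str
    payload: str


class Worker:
    """Worker component: performs the actual task."""

    def __init__(self, action: Callable[[str], str]) -> None:
        self._action = action

    def perform(self, task: TaskDescription) -> str:
        return self._action(task.payload)


class Manager:
    """Manager component: controls work flow and assigns tasks to workers."""

    def __init__(self) -> None:
        self._workers: Dict[str, Worker] = {}

    def register(self, task_name: str, worker: Worker) -> None:
        self._workers[task_name] = worker

    def dispatch(self, task: TaskDescription) -> str:
        worker = self._workers.get(task.name)
        if worker is None:
            raise LookupError(f"no worker registered for task {task.name!r}")
        return worker.perform(task)


class Servant:
    """Servant component: the highest-level entry point."""

    def __init__(self, manager: Manager) -> None:
        self._manager = manager

    def handle(self, name: str, payload: str) -> str:
        return self._manager.dispatch(TaskDescription(name, payload))


manager = Manager()
manager.register("query", Worker(lambda text: f"searching for {text}"))
servant = Servant(manager)
result = servant.handle("query", "CS101 homework")
```

Keeping the task description separate from the worker lets a manager route work by task name without knowing how any worker carries it out.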

28
Client View
29
Server View
30
IFS Prototype Implementation
  • The IFS prototype is built on top of the Gnutella Phex
    system
  • Development system environment
  • At least 25 MB of free memory space
  • Java Virtual Machine
  • Pentium III 500 MHz CPU
  • Event-driven methods
  • Each task is performed in response to events
  • Component-based programming
  • Manager Components
  • Worker Components
  • Service Components

31
IFS Query Interface
32
IFS Query Interface
33
Experimental Results: Dynamic Group Partition and
Cache
34
Comparative Analysis
Measure | Napster | Gnutella | IFS
Topology | Client/server | Logical mesh | Logical mesh
Design purpose | MP3 file sharing | File sharing in a decentralized manner | Enhanced Gnutella
Size of routing table | Needs a server's IP address | O(N) | O(K), where K << N
Node join operation | O(1) | O(1) | O(1)
Node failure | Severe | Tolerable | Tolerable
Search mechanism | File indexing based on keyword search | File indexing based on keyword search | Fast reasoning based on file association rules
Description | Client/server-based P2P network; heavy traffic on servers; node failure is severe | Decentralized; heavy traffic due to the exponentially increasing replicas of query messages | Decentralized; controls network traffic; flexible query mechanism
35
Contributions
  • Proposed a conceptual framework for decentralized
    P2P file sharing.
  • Dynamic group partition and caching
  • Query using fast reasoning
  • IP-clue mechanism (encoding/decoding)
  • Designed a component-based architecture
  • Implemented as an extension of an existing file sharing
    system (Gnutella Phex)

36
Conclusion
  • The IFS system
  • Supports decentralized P2P File Sharing.
  • Increases network scalability.
  • Provides flexible file searching and querying.
  • Protects the anonymity of resource sources.

37
Future Work
  • Further research on the latency caused by
    grouping
  • File registration strategy in heterogeneous
    environments
  • Discover advanced mechanisms for reasoning about file
    relationships and file association rules
  • Research on grouping policies
  • Grouping by peer hosts' network capacity
  • Grouping by interests
  • Grouping by context
  • Grouping by location

38
References
  • C. T. Yu, W. Meng, K.-L. Liu, W. Wu, and N.
    Rishe. Efficient and effective metasearch for a
    large number of text databases. In CIKM, pages
    217-224, 1999.
  • Y. Lee and J. Geller. Efficient transitive
    closure reasoning in a combined
    class/part/containment hierarchy. Journal of
    Knowledge and Information System, 2002.
  • S. Geffner, D. Agrawal, A. Abbadi, and T. Smith.
    Browsing large digital library collections using
    classification hierarchies. In CIKM, pages 195-201,
    1999.

39
References (Continued)
  • M. Waldman, A. Rubin, and L. F. Cranor. Publius:
    A robust, tamper-evident, censorship-resistant
    web publishing system. In Proc. 9th USENIX
    Security Symposium, pages 59-72, August 2000.
  • R. J. Anderson. The Eternity service. In
    Proceedings of the 1st International Conference
    on the Theory and Applications of Cryptology
    (PRAGOCRYPT '96), Prague, Czech Republic, 1996.
  • J. Palmer, R. Strong, and E. Upfal. Nonblocking
    membership protocols with asymmetric safety.
    Technical Report RJ10096 (91912), IBM Research
    Division, December 1997.

40
References (Continued)
  • I. Clarke, O. Sandberg, B. Wiley, and T. Hong.
    Freenet: A distributed anonymous information
    storage and retrieval system. In Proceedings of
    the Workshop on Design Issues in Anonymity and
    Unobservability, pages 46-66, July 2000.
  • D. Heimbigner. Adapting publish/subscribe
    middleware to achieve Gnutella-like
    functionality. Technical Report CU-CS-909-00,
    Department of Computer Science, University of
    Colorado, September 2000.
  • P. Sarkar and J. H. Hartman. Hint-based
    cooperative caching. ACM Transactions on
    Computer Systems (TOCS), 18(4), November 2000.