nfsv4 and linux - PowerPoint PPT Presentation
1
nfsv4 and linux
  • peter honeyman
  • linux scalability project
  • center for information technology integration
  • university of michigan
  • ann arbor

2
open source reference implementation
  • sponsored by sun microsystems
  • part of citi's linux scalability project
  • ietf reference implementation
  • 257 page spec
  • linux and openbsd
  • interoperates with solaris, java, network
    appliance, hummingbird, emc, ...
  • september 1 code drop for linux 2.2.14

3
what's new?
  • lots of state
  • compound rpc
  • extensible security added to rpc layer
  • delegation for files - client cache consistency
  • lease-based, non-blocking, byte-range locks
  • win32 share locks
  • mountd gone
  • lockd, statd gone

4
nfsv4 state
  • state is new to nfs protocol
  • nfsv3 lockd manages state
  • compound rpc - server state
  • dos share locks - server and client state
  • delegation - server and client state
  • server maintains per-thread global state
  • client and server maintain file, lock, and lock
    owner state

5
server state per global thread
  • compound operations often use result of previous
    operation as arguments
  • nfs file handle is the coin of the realm
  • current file handle ≈ current working directory
  • some operations (rename) need two file handles -
    save file handle

6
compound rpc
  • hope is to reduce traffic
  • complex calling interface
  • partial results used
  • rpc/xdr layering
  • variable length kmalloc buffer for args and recv
  • want to xdr args directly into rpc buffer
  • want to allow variable length receive buffer

7
rpc/xdr layering
  • rpc layer does not interpret compound ops
  • replay cache locking vs. regular
  • have to decode to decide which replay cache to
    use

8
example mount compound rpc
putrootfh lookup getattr getfh
9
nfsv4 mount
  • server pseudofs joins exported subtrees with a
    read-only virtual file system
  • any client can mount into the pseudofs
  • users browse the pseudofs (via lookup)

10
nfsv4 pseudofs
  • access into exported subtrees based on the user's
    credentials and permissions
  • client /etc/fstab doesn't change with the server's
    export list
  • server /etc/exports doesn't need to maintain an
    ip-based access list

11
mounting a pseudo file system
[diagram: nfsv4 client mounting the server's pseudo fs; local fs and
pseudo fs trees shown side by side, with user creds supplied by the
client. legend: local fs directory, pseudo fs directory, exported
directory]
the server boots, parses /etc/exports, creates
the pseudo fs, mirroring the local fs up to the
exported directories. the local fs exported
directories are mounted on their pseudo fs
counterparts.
the user has read-only access to the pseudo fs, and
traverses the pseudo fs until encountering an
exported directory.
the user's permissions in the negotiated security
realm determine access to the exported directory.
the client boots and mounts a directory of the
pseudo fs with the AUTH_SYS security flavor.
the first nfsv4 procedure that acts on the
exported directory causes nfsd to return
NFS4ERR_WRONGSEC, causing the client to call
SECINFO and obtain the list of security flavors
on the exported directory.
before the first open, the client calls
SETCLIENTID to negotiate a per-server unique
client identifier.
12
rpcsec_gss
  • mit krb5 gssrpc and sesame are open source, but
    neither is really rpcsec_gss
  • sun released their rpcsec_gss, a complete rewrite
    of onc
  • gss and sun onc are a tough match
  • both are transport independent
  • gss channel bindings / onc xprt
  • overloading of the program's null_proc

13
kernel rpcsec_gss
  • rpc layering had to be violated
  • gss implementations are not kernel safe
  • security service code not kernel safe (kerberos
    5)
  • kernel security services implemented as rpc
    upcalls to a user-level daemon, gssd
  • but only some services - e.g. encryption - need to
    be in the kernel

14
rpcsec_gss where are we now?
  • (mostly) complete user-level kerberos 5
    implementation
  • linux kernel implementation with kerberos 5
  • mutual authentication
  • session key setup
  • no encryption
  • gssd

15
kerberos 5 security initialization
[diagram: numbered message exchange among the nfs client, nfsd, the
gssd daemons on client and server, and the kerberos 5 kdc]
2,3 kerberos 5 tcp/ip
1,4,6,7 gssd rpc interface
5,8 nfsv4 overloaded null procedure
9,10 nfsv4 compound procedure
16
locking
  • lease based locks
  • no byte range callback mechanism
  • server defines a lease for per client lock state
  • server can reclaim client state if lease not
    renewed
  • open sets lock state, including lock owner
    (clientid, pid)
  • server returns lock stateid

17
locking
  • stateid mutating operations are ordered (open,
    close, lock, locku, open_downgrade)
  • lock owner can request a byte range lock and
    then
  • upgrade the initial lock
  • unlock a sub-range of the initial lock
  • server is not required to support sub-range lock
    semantics

18
server lock state
  • need to associate file, lock, lock owner, lease
  • per lock owner lock sequence number
  • per file state in hash table
  • may move file state into struct file private area

19
server lock state
  • lock owners in hash table
  • server doesn't own the inode
  • lock state in linked list off file state
  • stateid handle to server lock state
  • per client state in hash table - lock lease

20
client lock state
  • lock owners in hash table
  • per lock owner lock sequence number
  • use struct file private data area
  • client owns the inode, use private inode data area

21
client lock state
  • use inode file_lock struct private data area for
    byte range lock state
  • (eventually) store same locking state as the
    server for delegated files
  • use the super block private data area to store
    per server state (returned clientid)

22
delegation
  • intent is to reduce traffic
  • server decides to hand out delegation at open
  • if client accepts, client provides callback
  • many read delegations, or one write delegation

23
delegation
  • when a client holds a delegation for a cached
    file, it handles
  • all locking, share and byte range
  • future opens
  • client can't reclaim a delegation without a new
    open
  • no delegation for directories

24
server delegation state
  • associates delegation with a file
  • delegation state in linked list off file state
  • stateid separate from the lock stateid
  • client call back path

25
linux vfs changes
  • shared problem: open with o_excl, described by
    peter braam
  • nfsv4 implements win32 share locks, which require
    atomic open with create
  • linux 2.2.x and linux 2.4 vfs is problematic

26
linux vfs changes
  • to create and open a file, three inode operations
    are called in sequence
  • lookup resolves the last name component
  • create is called to create an inode
  • open is called to open the file

27
xopen
  • inherent race condition means no atomicity
  • we partially solved this problem
  • we added a new inode operation which performs the
    open system call in one step
  • int xopen(struct file *filep, struct inode
    *dir_i, struct dentry *dentry, int mode)
  • if the xopen() inode operation is null, the
    current two step code is used
  • nfsv4 open subsumes lookup, create, open, access

28
user name space
  • local file system uses uid/gid
  • protocol specifies &lt;user name&gt;@&lt;realm&gt;
  • different security can produce different name
    spaces

29
user name space
  • unix user name
  • kerberos 5 realm
  • pki realm - x500 or dn naming
  • gssd resolves &lt;user name&gt;@&lt;realm&gt; to local file
    system representation

30
open issues
  • local file system choices
  • currently ext2
  • acl implementation will determine fs for linux
    2.4
  • kernel additions and changes
  • rpc rewrite
  • crypto in the kernel
  • atomic open

31
next steps
  • march 31 - full linux 2.4 implementation, without
    acls
  • june 30 - acls added
  • network appliance sponsored nfsv3/v4 linux
    performance project

32
any questions?
http://www.citi.umich.edu/projects/nfsv4
http://www.nfsv4.org