The Sun Network Filesystem: Design, Implementation and Experiences
--- Sandberg, 1985.

Overview
--------

* NFS uses RPC and XDR to provide a system-independent protocol for
* accessing a remote filesystem. It uses a stateless, idempotent
* protocol to obviate the need for crash recovery. NFS is
* implemented in the kernel, and is transparent to existing
* applications; programs need do nothing different to access remote
* files.

- protocol stack: NFS / XDR / RPC / UDP / IP;

- XDR (eXternal Data Representation) is used to describe protocols in
  a machine- and system-independent way;

- NFS is implemented below the VFS/vnode interface;

- VFS defines the operations available on filesystems, while the
  vnode defines the operations on files, independently of the
  underlying filesystem;

- main design goal of NFS: provide a way of making remote files
  available without having to modify or relink existing programs;

- overall design goals of NFS:
  - machine and OS independence;
  - crash recovery;
  - transparent access (no special pathname parsing, libraries, or
    recompiling);
  - UNIX semantics maintained on UNIX clients;
  - reasonable performance;

- three major pieces: the protocol, the server side, and the client
  side.

NFS Protocol:
------------

- uses the Sun RPC mechanism;

- RPC helps simplify the definition, organization, and implementation
  of remote services;

- RPC calls are synchronous: the client blocks while the server
  processes the request;

- NFS uses a stateless protocol, to facilitate crash recovery;

- the parameters to each call contain all of the information
  necessary to complete the call;

- Sun's RPC is designed to be transport independent;

- since NFS is stateless, UDP losses are not a big problem (a lost
  request can simply be retried);

- an NFS file handle is provided by the server and used by the client
  to reference a file. The handle is opaque, i.e., the client never
  looks at its contents (a sketch of a possible handle layout appears
  after the client-side notes below);

- the paper shows a subset of the NFS procedures, including: lookup,
  create, remove, read, write, and getattr. It is interesting to note
  that read and write calls pass the file offset as a parameter; this
  is needed to make the operations idempotent, i.e., they may be
  repeated without unexpected results;

- the first remote file handle (for the root of a filesystem) is
  obtained by the client using a separate mount protocol;

- the reason for making mount a separate protocol is that it makes it
  easier to plug in different access control methods (such as using
  Kerberos for authentication);

- the mount protocol is the only place where pathnames are passed to
  the server.

Server side:
-----------

* as the protocol is stateless, the server must commit any modified
* data to stable storage before returning results; this may cause
* data blocks, indirect blocks, and i-node blocks to be modified and
* written back to disk;

- the file handle is composed of a filesystem ID, an i-node number,
  and a generation number;

* the i-node generation number is necessary because the system may
* decide to reuse a given i-node after a file is deleted. So, to
* differentiate between different files that happen to have the same
* i-node number, the generation number is incremented every time a
* given i-node number is freed;

Client side:
-----------

- provides an interface to NFS which is transparent to applications;

- instead of using a fixed hierarchy to identify remote filesystems,
  NFS does the binding at mount time, via the mount command; the
  disadvantage is that filesystems are not accessible before they
  are mounted.
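As a concrete illustration of the handle pieces described above,
here is a rough C sketch of what a server might pack into a file
handle. The struct and field names are invented (the paper gives no
layout); only the three logical fields come from the notes above,
and the 32-byte size is the fixed handle size of the v2 protocol.

    #include <stdint.h>

    #define NFS_FHSIZE 32              /* v2 handles: 32 opaque bytes */

    /* Hypothetical server-side view of a file handle.  The client
     * never interprets these fields; it just echoes the bytes back
     * to the server on every call. */
    struct svc_fh {
        uint32_t fsid;                 /* which exported filesystem */
        uint32_t inum;                 /* i-node number within it */
        uint32_t gen;                  /* generation number: bumped when
                                        * the i-node number is reused, so
                                        * a handle to a deleted file is
                                        * detected as stale */
        uint8_t  pad[NFS_FHSIZE - 12]; /* unused; opaque to the client */
    };

Because a handle plus an explicit offset fully identifies a read or
write, a request retried across a server crash and reboot means
exactly the same thing as the original.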
Filesystem interface:
--------------------

- NFS keeps the VFS interface; there is one VFS structure per mounted
  filesystem in the kernel, and one vnode structure for each open
  file;

- a root operation is provided in the VFS to return the root vnode of
  a mounted filesystem;

- pathname traversal is done in the kernel by breaking the path into
  components and doing a lookup call through the vnode for each
  component;

* the main reason for this is that any component of the path
* could be a mount point for another filesystem, and the
* mount information is kept above the vnode implementation
* level;

- a cache of directory vnodes may alleviate the cost of doing one
  remote lookup per pathname component;

* once RPC and the vnode kernel were in place, the implementation of
* NFS was simply a matter of writing the XDR routines for the NFS
* protocol, implementing the RPC server for the NFS procedures, and
* implementing a filesystem interface which translates vnode
* operations into NFS remote procedure calls;

- a hard-mounted filesystem will retry NFS requests forever if the
  server goes down; a soft-mounted one gives up after a while and
  returns an error.

Security:
--------

- in order to use UNIX authentication (uid, gid, etc.), the mapping
  from uid and gid to user must be the same on the server and the
  client. To achieve this, they used the Yellow Pages (YP) service to
  keep the password files consistent;

! also, it is not clear that a root user on one machine should be
! treated as root on another;

* NFS does not support remote file locking. Instead, they provide a
* separate, RPC-based file locking facility;

- since the server keeps no locks between requests, two clients
  writing to the same remote file may get intermixed data;

- unlike UNIX (which checks access permissions only when the file is
  opened), NFS checks them on every call;

- timestamps saved in different parts of the system may cause
  problems if the clock skew between client and server is large;

- read-ahead and write-behind caches are implemented in both clients
  and servers, to improve performance;

- a performance problem is that writes (about 5% of operations) are
  synchronous;

* at this point, the paper becomes a fight against AT&T's Remote
* Filesystem (RFS): they basically talk about all the advantages of
* NFS over RFS.

Security focus:
--------------

- returning to the root question raised above: the only implemented
  security is the mapping of root to nobody. All this does is prevent
  root on a client machine from accessing files that only root on the
  server can access; if any other uid can access a file on the
  server, root on the client can simply switch to that user and
  access the file.

  More recently, small improvements have been made. For example, most
  versions of mountd have an option to accept mount requests only
  from port numbers reserved for root (though it is often left
  unused, since it rejects valid mounts from some older systems, such
  as Ultrix). This prevents ordinary users on a machine that is
  allowed to mount a filesystem from discovering the root file handle
  of that filesystem.

  Also, very recently (I've only seen the Linux nfsd do this), nfsd
  will reject packets from clients that are not listed in the exports
  file; before this, it was assumed that if a client presented a
  valid file handle, the handle must have been handed out by mountd,
  so serving the request was considered safe.
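As a concrete (and purely illustrative) follow-up to the exports
discussion above, a Linux-style /etc/exports entry combining these
controls might look like the following; the path and host name are
made up:

    # /etc/exports -- illustrative entry only
    #   rw          allow reads and writes
    #   secure      require requests from a reserved port (< 1024),
    #               i.e., from a root-owned process on the client
    #   root_squash map uid 0 on the client to the anonymous user
    /export/home   trusted.example.com(rw,secure,root_squash)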