Chapter 14: Replication
---

- motivations for replication: performance, availability, and
  fault-tolerance;

- replication transparency is desirable;

- each logical object is implemented by multiple distributed replicas;
  each replica is handled by a replica manager;

- the front end (FE) hides the location of the replicas from the
  clients;

- FIFO ordering, causal ordering, total ordering;

- group membership service has 4 main tasks:

	- providing an interface for group membership changes;
	- implementing a failure detector;
	- notifying members of group membership changes;
	- performing group address expansion;

- group management services can be primary-partition or partitionable:

	- primary-partition: only one subgroup (majority) survives a
	  partition;

	- partitionable: all subgroups may continue working
	  independently;
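
- sketch of the primary-partition rule (my own illustration, not from
  the book; the name `is_primary_partition` and the strict-majority
  test are assumptions about how "majority" would be checked):

```python
def is_primary_partition(subgroup, full_group):
    """Return True if `subgroup` holds a strict majority of `full_group`.

    Hypothetical helper: a real membership service would combine this
    with its failure detector and view-change protocol.
    """
    return 2 * len(set(subgroup) & set(full_group)) > len(full_group)


group = {"rm1", "rm2", "rm3", "rm4", "rm5"}
side_a = {"rm1", "rm2", "rm3"}   # 3 of 5 members: may continue
side_b = {"rm4", "rm5"}          # 2 of 5 members: must stop

print(is_primary_partition(side_a, group))
print(is_primary_partition(side_b, group))
```

  with an even split neither side has a strict majority, so under this
  assumed rule no subgroup continues; a partitionable service would
  instead let both sides keep working.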

- passive (primary-backup) replication: one primary server and other
  slaves or backups; primary always handles FE requests (used for
  example in the Harp distributed filesystem);
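
- minimal sketch of passive replication (an illustration under my own
  assumptions, not the book's code; `Primary`, `Backup`, the key/value
  state and the request ids are all hypothetical).  The primary
  executes the request, pushes the result to every backup, and only
  then answers the FE; a repeated request id just gets the stored
  response back:

```python
class Backup:
    def __init__(self):
        self.state = {}
        self.responses = {}  # request id -> response already produced

    def apply(self, request_id, key, value, response):
        self.state[key] = value
        self.responses[request_id] = response


class Primary(Backup):
    def __init__(self, backups):
        super().__init__()
        self.backups = backups

    def handle(self, request_id, key, value):
        # Duplicate request (e.g. an FE retry): resend the stored response.
        if request_id in self.responses:
            return self.responses[request_id]
        response = ("ok", key, value)
        self.apply(request_id, key, value, response)
        # Propagate to every backup before replying to the FE.
        for backup in self.backups:
            backup.apply(request_id, key, value, response)
        return response


backups = [Backup(), Backup()]
primary = Primary(backups)
primary.handle("req-1", "x", 1)
primary.handle("req-1", "x", 1)  # retry: not executed a second time
```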

- active replication: FE multicasts requests to the group of replica
  managers.  FE uses totally ordered, reliable multicast; this system
  achieves sequential consistency (SC);
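
- sketch of why total ordering matters for active replication (my own
  toy example; the sequence numbers stand in for whatever the totally
  ordered multicast provides, and `ReplicaManager`, `deliver` and the
  bank-style operations are hypothetical).  Each replica buffers
  requests that arrive early and applies them strictly by sequence
  number, so all replicas end in the same state:

```python
import random


class ReplicaManager:
    def __init__(self):
        self.balance = 0
        self.next_seq = 0
        self.held = {}  # seq -> request that arrived too early

    def deliver(self, seq, request):
        self.held[seq] = request
        # Apply requests strictly in sequence-number order.
        while self.next_seq in self.held:
            op, amount = self.held.pop(self.next_seq)
            if op == "deposit":
                self.balance += amount
            elif op == "double":
                self.balance *= 2
            self.next_seq += 1


# Order-sensitive requests, numbered 0, 1, 2 by an assumed sequencer.
requests = list(enumerate([("deposit", 10), ("double", 0), ("deposit", 5)]))

replicas = [ReplicaManager() for _ in range(3)]
rng = random.Random(0)
for rm in replicas:
    arrivals = requests[:]
    rng.shuffle(arrivals)  # the network may reorder messages per replica
    for seq, request in arrivals:
        rm.deliver(seq, request)

print([rm.balance for rm in replicas])
```

  every replica ends with balance 25, i.e. (0 + 10) * 2 + 5, whatever
  the arrival order was.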

- The gossip architecture
  -----------------------

- two guarantees:

	- each client obtains a consistent service over time. Replica
	  managers only ever provide a client with data that reflects
	  at least the updates that the client has observed so far,
	  even though clients may communicate with different replica
	  managers over time;

	- relaxed consistency between replicas: all replica managers
	  eventually receive all updates, and a total order exists.
	  It can be made to satisfy SC, but its use focuses on weaker
	  consistency models;

- the replica manager that receives a request does not process it
  until it can apply the request according to the required ordering
  constraints (this is a kind of lazy, on-demand update);

- it seems from the description that a client always uses the same FE;
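
- sketch of the first guarantee using vector timestamps (a simplified
  illustration, not the book's full protocol; the names `prev`,
  `value_ts`, `gossip_from` and the "return None instead of queueing"
  behaviour are my assumptions, and ordering constraints on updates
  are ignored).  The FE remembers the timestamp of what the client has
  seen; a replica manager that is behind that timestamp cannot answer
  yet:

```python
def leq(a, b):
    """True if vector timestamp a <= b in every component."""
    return all(x <= y for x, y in zip(a, b))


def merge(a, b):
    """Component-wise maximum of two vector timestamps."""
    return tuple(max(x, y) for x, y in zip(a, b))


class ReplicaManager:
    def __init__(self, index, n):
        self.index = index
        self.value = []
        self.value_ts = (0,) * n

    def update(self, item):
        self.value.append(item)
        ts = list(self.value_ts)
        ts[self.index] += 1
        self.value_ts = tuple(ts)
        return self.value_ts

    def query(self, prev):
        # Not caught up with what the client has seen: hold the query
        # (modelled here by returning None).
        if not leq(prev, self.value_ts):
            return None
        return list(self.value), self.value_ts

    def gossip_from(self, other):
        # Simplification: copy missing items wholesale.
        for item in other.value:
            if item not in self.value:
                self.value.append(item)
        self.value_ts = merge(self.value_ts, other.value_ts)


rm_a, rm_b = ReplicaManager(0, 2), ReplicaManager(1, 2)
prev = rm_a.update("msg1")   # FE records timestamp (1, 0)
held = rm_b.query(prev)      # rm_b has not seen msg1 yet
rm_b.gossip_from(rm_a)
answer = rm_b.query(prev)    # after gossip, rm_b can answer
```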


- Bayou system
  ------------

- data replication for high availability with weaker guarantees than
  sequential consistency;

- Bayou replica managers cope with variable connectivity by exchanging
  updates in pairs;

- the state that Bayou replicates is held in the form of a database,
  supporting queries and updates;  Bayou may undo and redo updates to
  the database as execution proceeds;

- the Bayou guarantee is that, eventually, every replica manager
  receives the same set of updates and applies them in such a way
  that all replicas reach the same result;

- updates are marked as tentative when they are first applied to a
  database.  The system arranges that tentative updates are
  eventually placed in a canonical order and marked as committed;

- while updates are tentative, the system may undo and reapply them as
  it produces a consistent state;

- total ordering can be achieved by selecting a primary server, which
  serializes the updates;
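
- sketch of tentative vs. committed updates (my own simplification;
  `BayouReplica`, `submit`, `commit` and `view` are hypothetical names,
  and "undo and reapply" is modelled by simply recomputing the order).
  Committed updates come first in the primary's order, tentative ones
  follow in local arrival order:

```python
class BayouReplica:
    def __init__(self):
        self.committed = []  # canonical order, fixed by the primary
        self.tentative = []  # applied locally, order may still change

    def submit(self, update):
        self.tentative.append(update)

    def commit(self, update):
        # The primary has fixed this update's position: move it from
        # the tentative list to the end of the committed list.
        if update in self.tentative:
            self.tentative.remove(update)
        self.committed.append(update)

    def view(self):
        # Effective application order at this replica.
        return self.committed + self.tentative


r1, r2 = BayouReplica(), BayouReplica()
r1.submit("u1"); r1.submit("u2")   # r1 saw u1 first
r2.submit("u2"); r2.submit("u1")   # r2 saw u2 first
before = (r1.view(), r2.view())    # the two replicas disagree

for replica in (r1, r2):           # primary's order: u2, then u1
    replica.commit("u2")
    replica.commit("u1")
after = (r1.view(), r2.view())     # both now agree
```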

- every Bayou update contains a dependency check and a merge
  procedure, which are domain-specific;  a replica manager calls the
  dependency check procedure before applying the update;  if the check
  indicates a conflict, Bayou calls the merge procedure;
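
- sketch of a dependency check and merge procedure for a room-booking
  update (illustrative only; `make_booking`, `bayou_apply` and the
  dictionary "database" are hypothetical, and real Bayou expresses
  these procedures against its own database):

```python
def make_booking(room, hour, who, alternatives):
    """Build a hypothetical Bayou-style update as three procedures."""

    def conflicts(db):
        # Dependency check: would applying the update clash with db?
        return (room, hour) in db

    def apply(db):
        db[(room, hour)] = who

    def merge(db):
        # Merge procedure: try the alternative hours in order.
        for alt in alternatives:
            if (room, alt) not in db:
                db[(room, alt)] = who
                return True
        return False  # unresolved; a real system would report this

    return conflicts, apply, merge


def bayou_apply(db, update):
    conflicts, apply, merge = update
    if conflicts(db):
        return merge(db)
    apply(db)
    return True


db = {}
bayou_apply(db, make_booking("room1", 9, "alice", [10]))
bayou_apply(db, make_booking("room1", 9, "bob", [10, 11]))
print(db)
```

  bob's request for 9:00 conflicts with alice's, so his merge
  procedure books 10:00 instead; this is the kind of outcome that may
  differ from what the user expected before the conflict.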

- in Bayou, replication is not transparent to the application; the
  system achieves what the book calls eventual sequential consistency;

- disadvantages:

	- programmer needs to supply dependency checks and merge
	  procedures;

	- results after a conflict may not agree with the user view
	  before the conflict (appointment schedule example in the
	  book);


- Coda File System
  ---------------- 

- descendant of AFS (Andrew File System);

- AFS limited replication to read-only filesystems;

- appearance of mobile users creates the need for constant data
  availability;

- Coda relies on the replication of file volumes to achieve a higher
  throughput of file access operations and a greater degree of fault 
  tolerance;

- Coda relies on an extension of the mechanism used in AFS for caching
  copies of files at client computers to enable disconnected operation;

- Coda is like Bayou, in that it uses an optimistic strategy;

- unlike Bayou, the dependency check is not application-specific;

- on a close call, copies of modified files are broadcast in parallel
  to all the servers in the available volume storage group (AVSG);

- disconnected operation is said to occur when the AVSG is empty;

- to detect conflicts and keep the state of files, each file version
  contains a Coda version vector (CVV), which is a vector timestamp
  with an entry for each server in the relevant volume storage group
  (VSG);

- the purpose of the CVV is to provide sufficient information about
  the update history of each file replica to enable potential conflicts
  to be detected and submitted for manual intervention and for stale
  replicas to be updated automatically;
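
- sketch of how two CVVs can be compared (my own helper,
  `compare_cvv` is a hypothetical name; it only shows the vector
  comparison, not Coda's actual update protocol).  If one vector is at
  least as large in every entry, the other replica is merely stale;
  if neither dominates, the replicas were updated independently and
  there is a conflict:

```python
def compare_cvv(a, b):
    """Compare two Coda version vectors of equal length."""
    a_ge = all(x >= y for x, y in zip(a, b))
    b_ge = all(y >= x for x, y in zip(a, b))
    if a_ge and b_ge:
        return "equal"
    if a_ge:
        return "a dominates"   # replica b is stale, update it from a
    if b_ge:
        return "b dominates"   # replica a is stale, update it from b
    return "conflict"          # needs manual intervention


print(compare_cvv((2, 2, 1), (1, 1, 1)))
print(compare_cvv((2, 2, 1), (1, 1, 2)))
```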

- Coda does not, in general, resolve conflicts automatically;

- statistics (Mary's papers, I think) show that for most files
  accessed by users, no conflict is possible (the files are not shared);

- when a file fetch is completed as a result of an open on an uncached
  file, a callback promise is established at the preferred server (the 
  one with the most up-to-date version of the file in the AVSG);

- the preferred server contacts the client if a modification to the
  file (visible to the server) occurs while the client is working on it;

- a frequently sent probe is used to define the AVSG;


- Transactions with replicated data
  ---------------------------------

- lazy vs. eager approaches to update propagation.  Lazy propagation
  is usually used in primary copy replication systems, while the eager
  approach is used to guarantee serialized access when different
  replica managers are used by different clients;