Virtual Memory, Processes, and Sharing in MULTICS

Robert C. Daley and Jack B. Dennis

One-line summary: Segments, segments, everywhere.

Overview/Main Points

   * Multics VM from the User's PoV:
        o Large address space per process, machine-independent VM mechanism.
          Overlays, buffering techniques, secondary storage management are
          all obviated. Users create and manipulate segments.
        o Dynamic linking on a per-procedure basis and data-sharing across
          segments, but with access control (on a per-segment granularity).

   * VM structure:
        o One-to-one mapping between process and address space. Kernel
          ("supervisor") module called traffic controller is process
          scheduler. There are 2^14 segments, each with 2^18 36-bit words.
        o Each segment has length and access privilege attributes, and can
          grow/shrink dynamically. Procedure segments are pure
          (non-self-modifying). A directory structure associates symbolic
          name (path-name) with each segment.
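
The sizes above can be checked with a few lines of arithmetic. This is a
minimal sketch: the field widths come from the notes, but packing the pair
into one integer (and the names pack, unpack, TOTAL_WORDS) is an assumption
made for illustration, not the hardware's actual register layout.

```python
# Field widths taken from the notes: 2^14 segments, 2^18 words per segment.
SEG_BITS = 14
WORD_BITS = 18


def pack(segno, wordno):
    """Pack a (segment number, word number) pair into one integer.

    The bit layout is hypothetical; it only shows that the pair fits in
    SEG_BITS + WORD_BITS = 32 bits.
    """
    if not 0 <= segno < (1 << SEG_BITS):
        raise ValueError("segment number out of range")
    if not 0 <= wordno < (1 << WORD_BITS):
        raise ValueError("word number out of range")
    return (segno << WORD_BITS) | wordno


def unpack(addr):
    """Split a packed address back into (segment number, word number)."""
    return addr >> WORD_BITS, addr & ((1 << WORD_BITS) - 1)


# Words addressable by one process: 2^14 segments * 2^18 words = 2^32 words.
TOTAL_WORDS = (1 << SEG_BITS) * (1 << WORD_BITS)
```

Under these assumptions a process can name 2^32 words, which is why overlays
and explicit secondary-storage management stop being the user's problem.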

   * Generalized address:
        o Addresses that processes see are "generalized addresses"; a
          generalized address is like a virtual address as we know it, but
          even more abstract. Generalized address: (segment number, word
          number). Example - on instruction fetch, generalized address is
          (procedure base register, program counter). Example - data
          reference, generalized address is (base register seg #, base
          register word # + address field of instruction + index register
          contents).
        o Indirect addressing: A generalized address points at two 36-bit
          words. If the address-mode field in 1st word contains "its"
          (indirect-through-segment), then segment number from first word
          and word number from second word are combined to form a new
          generalized address. (You can iterate on this for multiple levels
          of indirection.)
        o Given a generalized address, the segment number is used as an
          index into the descriptor segment (table of segment attributes,
          including physical location, access privileges, length, etc.).
          Each process has a descriptor segment, which is located through
          the descriptor base register (DBR). On context switch, only have
          to change processor registers and DBR. Because each process may
          need different segments at different times, (segment number,
          segment) bindings will be different for each process.

          And by the way, segments and descriptor segments are paged - via
          page tables - but paging is orthogonal to this discussion.
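
The addressing steps above can be sketched in a few functions. This is a toy
model, not the GE-645 mechanism: a descriptor segment is modeled as a dict
keyed by segment number, a word is either a plain value or a tuple
("its", segment number), paging is ignored, and word numbers are assumed to
wrap modulo 2^18. All names (Descriptor, fetch, data_address,
resolve_indirect) are hypothetical.

```python
from dataclasses import dataclass

WORD_BITS = 18  # word-number width, from the 2^18 words-per-segment figure


class AddressFault(Exception):
    """Raised when a reference is unknown, out of bounds, or not permitted."""


@dataclass
class Descriptor:
    words: list            # segment contents; stands in for the physical location
    readable: bool = True  # a single access bit, for illustration only


def data_address(base, instr_addr, index=0):
    """Form the generalized address of a data reference.

    base is the (segment number, word number) pair held in a base register;
    instr_addr is the address in the instruction; index is a register-supplied
    offset. Wrapping modulo 2^18 is an assumption of this sketch.
    """
    segno, base_word = base
    return segno, (base_word + instr_addr + index) % (1 << WORD_BITS)


def fetch(dseg, segno, wordno):
    """Read one word through a descriptor segment (dict: segno -> Descriptor)."""
    desc = dseg.get(segno)
    if desc is None:
        raise AddressFault(f"segment {segno} is not known to this process")
    if not desc.readable:
        raise AddressFault(f"no read access to segment {segno}")
    if not 0 <= wordno < len(desc.words):
        raise AddressFault(f"word {wordno} is outside segment {segno}")
    return desc.words[wordno]


def resolve_indirect(dseg, segno, wordno, max_depth=8):
    """Follow "its" pairs until the addressed word is not flagged "its"."""
    for _ in range(max_depth):
        first = fetch(dseg, segno, wordno)
        if not (isinstance(first, tuple) and first[0] == "its"):
            return segno, wordno
        second = fetch(dseg, segno, wordno + 1)
        # Segment number from the first word, word number from the second.
        segno, wordno = first[1], second
    raise AddressFault("too many levels of indirection")
```

In this model a context switch is just a change of which dict is passed as
dseg, which plays the role of reloading the DBR. Two processes can bind the
same Descriptor object under different segment numbers, which is the point of
the last sentence of the bullet above.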

   * Dynamic linking:
        o Requirements: pure procedure segments (why not make impure, and do
          copy-on-write sharing?), procedures are known by symbolic names so
          processes can call on them without prior arrangement, recompilation
          of one segment must not require changes to other segments.
        o "Making a segment known": on first reference to a procedure in an
          unknown segment, that segment must be introduced into the process'
          descriptor segment (i.e., segment number must be assigned for future
          references). Order of segment references is unknown, so in general
          not possible to assign unique segment number to each shared
          routine or data object.
        o Before segment is known, it is referred to by the path name in
          directory structure; procedures refer to segments by symbolic
          "segment reference name". Segment reference names must be
          translated into path names via directory search.
        o Consider reference to segment D, symbolic word address x - denoted
          as <D> | [x]. After segment is known, want to use generalized
          address instead of this symbolic reference. Add level of
          indirection so don't have to modify referring segment - a linkage
          section.
        o Collection of link data for all external references from segment P
          is called linkage section. Link data is private to processes,
          hence process makes copy of linkage section of a procedure segment
          when it encounters an external reference; this copy becomes a new
          segment in the process's address space. Link data are simply
          indirect addresses. Link data is established at trap time on first
          external reference; link sections are initialized with special
          trap flag in the link data (code ft) to cause this trap to happen
          on the first use of the link data.
        o To keep segments self-contained, uninitialized link data for
          segment P points back into the symbolic address within the
          segment; on trap, the supervisory routine retrieves this symbolic
          address by following the flagged link data back into the segment
          P, then translating the symbolic address into a generalized
          address via directory search.
        o The set of associations between symbolic word addresses x and word
          numbers within a segment is the symbol table of that segment.
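
The linking steps above can be sketched as a lazy, per-process lookup. This
is a simplified model under stated assumptions: the directory, the search
rule, and the path names are hypothetical, and an unresolved link here stores
the symbolic name directly instead of pointing back into segment P to find
it. The "fault" field plays the role of the trap flag.

```python
# Hypothetical directory: path name -> segment (symbol table plus contents).
DIRECTORY = {
    ">lib>D": {"symbols": {"x": 7}, "words": [0] * 16},
    ">lib>E": {"symbols": {"y": 2}, "words": [0] * 16},
}

# Stands in for the directory search: segment reference name -> path name.
SEARCH = {"D": ">lib>D", "E": ">lib>E"}

# Linkage section template for procedure P: one unresolved link to <D>|[x].
P_LINKAGE_TEMPLATE = [{"fault": True, "seg": "D", "sym": "x"}]


class Process:
    def __init__(self):
        self.known = {}        # path name -> segment number
        self.descriptors = {}  # segment number -> segment
        self.next_segno = 0
        self.resolutions = 0   # how many times a link trap was handled

    def make_known(self, path):
        """Assign a segment number to a segment on first use."""
        if path not in self.known:
            segno = self.next_segno
            self.next_segno += 1
            self.known[path] = segno
            self.descriptors[segno] = DIRECTORY[path]
        return self.known[path]

    def copy_linkage(self, template):
        """Give this process its own private copy of the link data."""
        return [dict(link) for link in template]

    def follow_link(self, linkage, i):
        """Return the generalized address behind link i, resolving it once."""
        link = linkage[i]
        if link["fault"]:  # first use: the trap fires
            segno = self.make_known(SEARCH[link["seg"]])
            wordno = self.descriptors[segno]["symbols"][link["sym"]]
            link.update(fault=False, addr=(segno, wordno))
            self.resolutions += 1
        return link["addr"]
```

Because each process copies the template and assigns segment numbers in its
own order of first reference, the same link can resolve to different
generalized addresses in different processes while segment P itself is never
modified.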

   * Procedure calls:
        o The generalized address of the link table for a segment P is
          located in the link pointer (a base register in the processor).
          Thus the link pointer must be updated whenever a new segment is
          executed via procedure call. Convention is to have two
          instructions in the linkage segment for the newly called segment
          that prep the link pointer for that segment and jump to the real
          entry point within the segment.
        o Each process has a stack segment that is used to store stack
          frames of procedures, much as in C. Only fixed (constant) args
          can appear in procedure segment to maintain its purity, so all
          pointers and variable arguments must go into the stack segment,
          linkage segment (!!), or elsewhere. An argument pointer base
          register is set before procedure entry that contains the
          generalized address of the argument list.
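
The call convention above can be sketched as follows. This is an
illustrative model only: the Processor class, the frame layout, and the
save/restore of registers in the stack frame are assumptions, and the two
instructions of the entry sequence are represented by two Python statements.

```python
class Processor:
    def __init__(self):
        self.lp = None   # link pointer: linkage section of the running procedure
        self.ap = None   # argument pointer: here, the index of the frame holding args
        self.stack = []  # stack segment: one frame per active call


def call(cpu, callee, args):
    """Call a procedure, switching the link pointer for its duration."""
    # Variable data lives in the stack frame, never in the pure procedure.
    cpu.stack.append({"saved_lp": cpu.lp, "saved_ap": cpu.ap, "args": list(args)})
    cpu.ap = len(cpu.stack) - 1   # set before procedure entry
    cpu.lp = callee["linkage"]    # entry sequence, step 1: prep the link pointer
    try:
        return callee["body"](cpu)  # entry sequence, step 2: jump to the entry point
    finally:
        frame = cpu.stack.pop()
        cpu.lp, cpu.ap = frame["saved_lp"], frame["saved_ap"]


def inner_body(cpu):
    args = cpu.stack[cpu.ap]["args"]
    return cpu.lp, sum(args)


def outer_body(cpu):
    lp_seen_by_inner, total = call(cpu, INNER, [1, 2, 3])
    return cpu.lp, lp_seen_by_inner, total


# Hypothetical procedures; the linkage values are just labels.
INNER = {"linkage": "inner.link", "body": inner_body}
OUTER = {"linkage": "outer.link", "body": outer_body}
```

Calling call(Processor(), OUTER, []) runs outer_body with the link pointer
set to OUTER's linkage, switches it for the nested call, and restores it on
return, so each procedure always resolves external references through its own
link data.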

Relevance

Dynamically loadable libraries and segmented, paged memory in 1968! The
details of the mechanism are interesting, and more so if Smith is on the
prelims committee.

Questions/Flaws

   * Segments are files are segments. When do segments get paged out to
     secondary storage? (I.e., how do I do fsync()?)
   * At worst, seems like 5 separate addresses plus a directory search to
     call an external procedure. This only on the first call, of course.
   * I assume segments and linkage sections become "unknown" to a process -
     how often, when, why?
