Virtual Memory, Processes, and Sharing in MULTICS

Robert C. Daley and Jack B. Dennis

One-line summary: Segments, segments, everywhere.

Overview/Main Points

* Multics VM from the user's point of view:
   o Large address space per process, machine-independent VM mechanism. Overlays, buffering techniques, and secondary storage management are all obviated. Users create and manipulate segments.
   o Dynamic linking on a per-procedure basis and data sharing across segments, but with access control (at per-segment granularity).
* VM structure:
   o One-to-one mapping between process and address space. A kernel ("supervisor") module called the traffic controller is the process scheduler. There are 2^14 segments, each with 2^18 36-bit words.
   o Each segment has length and access privilege attributes, and can grow/shrink dynamically. Procedure segments are pure (non-self-modifying). A directory structure associates a symbolic name (path name) with each segment.
* Generalized address:
   o Addresses that processes see are "generalized addresses"; a generalized address is like a virtual address as we know it, but even more abstract. Generalized address: (segment number, word number). Example - on instruction fetch, the generalized address is (procedure base register, program counter). Example - on data reference, the generalized address is (base register seg #, base register word # + direct address in instruction + direct address in register).
   o Indirect addressing: A generalized address points at a pair of 36-bit words. If the address-mode field in the first word contains "its" (indirect-through-segment), then the segment number from the first word and the word number from the second word are combined to form a new generalized address. (You can iterate on this for multiple levels of indirection.)
   o Given a generalized address, the segment number is used as an index into the descriptor segment (a table of segment attributes, including physical location, access privileges, length, etc.).
     Each process has a descriptor segment, which is located through the descriptor base register (DBR). On a context switch, only the processor registers and the DBR have to change. Because each process may need different segments at different times, the (segment number, segment) bindings will differ from process to process. And by the way, segments and descriptor segments are paged - via page tables - but paging is orthogonal to this discussion.
* Dynamic linking:
   o Requirements: pure procedure segments (why not make them impure, and do copy-on-write sharing?); procedures are known by symbolic names, so a process can call them with no prior arrangement; recompiling one segment should not require changes to other segments.
   o "Making a segment known": on first reference to a procedure in an unknown segment, that segment must be introduced into the process's descriptor segment (i.e. a segment number must be assigned for future references). The order of segment references is unknown, so in general it is not possible to assign a unique segment number to each shared routine or data object.
   o Before a segment is known, it is referred to by its path name in the directory structure; procedures refer to segments by a symbolic "segment reference name". Segment reference names must be translated into path names via directory search.
   o Consider a reference to segment D, symbolic word address x - denoted as <D>|[x]. After the segment is known, we want to use a generalized address instead of this symbolic reference. Add a level of indirection so the referring segment does not have to be modified - a linkage section.
   o The collection of link data for all external references from segment P is called its linkage section. Link data is private to each process, hence a process makes a copy of the linkage section of a procedure segment when it encounters an external reference; this copy becomes a new segment in the process's address space. Link data are simply indirect addresses.
     Link data is established at trap time on the first external reference; linkage sections are initialized with a special trap flag in the link data (code ft) to cause this trap on the first use of the link data.
   o To keep segments self-contained, uninitialized link data for segment P points back to the symbolic address within the segment; on the trap, the supervisory routine retrieves this symbolic address by following the flagged link data back into segment P, then translates the symbolic address into a generalized address via directory search.
   o The set of associations between symbolic word addresses x and word numbers within a segment is the symbol table of that segment.
* Procedure calls:
   o The generalized address of the linkage section for a segment P is held in the link pointer (a base register in the processor). Thus the link pointer must be updated whenever a new segment is entered via procedure call. The convention is to have two instructions in the linkage segment of the newly called segment that set up the link pointer for that segment and jump to the real entry point within the segment.
   o Each process has a stack segment that is used to store the stack frames of procedures, much as in C. Only fixed (constant) args can appear in a procedure segment, to maintain its purity, so all pointers and variable arguments must go into the stack segment, linkage segment (!!), or elsewhere. An argument pointer base register, set before procedure entry, contains the generalized address of the argument list.

Relevance

Dynamically loadable libraries and segmented, paged memory in 1968! The details of the mechanism are interesting, and more so if Smith is on the prelims committee.

Questions/Flaws

* Segments are files are segments. When do segments get paged out to secondary storage? (I.e. how do I do fsync()?)
* At worst, it seems like 5 separate addresses plus a directory search to call an external procedure. This happens only on the first call, of course.
* I assume segments and linkage sections become "unknown" to a process - how often, when, why?
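
Sketch: generalized address translation. To make the descriptor-segment lookup from the notes above concrete, here is a minimal Python model. It is an illustration under stated assumptions, not the hardware's actual layout: the names Descriptor, SegmentFault and translate are hypothetical, paging is ignored, and each descriptor is assumed to hold only a base, a length and an access set.

```python
from dataclasses import dataclass

SEG_BITS = 14    # 2^14 segments per address space, per the notes
WORD_BITS = 18   # 2^18 words per segment, per the notes


class SegmentFault(Exception):
    """Hypothetical stand-in for the fault the hardware would raise."""


@dataclass(frozen=True)
class Descriptor:
    base: int          # assumed: physical base of the segment (paging ignored)
    length: int        # current segment length in words
    access: frozenset  # assumed encoding of access privileges


def translate(descriptor_segment, seg, word, mode):
    """Map a generalized address (seg, word) to a physical word address.

    descriptor_segment models the table found through the DBR: a dict
    from segment number to Descriptor, private to one process.
    """
    if not (0 <= seg < 2 ** SEG_BITS and 0 <= word < 2 ** WORD_BITS):
        raise SegmentFault("malformed generalized address")
    desc = descriptor_segment.get(seg)
    if desc is None:
        raise SegmentFault("segment is not known to this process")
    if word >= desc.length:
        raise SegmentFault("word number beyond segment length")
    if mode not in desc.access:
        raise SegmentFault("access mode not permitted")
    return desc.base + word


# One process's descriptor segment: segment 3 is a 16-word read-only segment.
ds = {3: Descriptor(base=1000, length=16, access=frozenset({"read"}))}
print(translate(ds, 3, 5, "read"))  # prints 1005
```

Swapping in a different dict models a context switch: changing the DBR changes every (segment number, segment) binding at once.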
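
Sketch: link data with a trap on first use. The linkage mechanism in the notes above (per-process link data, trap flag set until the first reference, directory search at trap time) can be modelled in a few lines of Python. Everything here is a hypothetical simplification: Link, Process, make_known and reference are made-up names, the "directory" is a dict from segment reference name to symbol table, and segment numbers are simply handed out in order of first reference.

```python
class Link:
    """One entry in a process-private copy of a linkage section."""

    def __init__(self, seg_name, symbol):
        self.trap = True                     # models the "ft" flag: not yet linked
        self.symbolic = (seg_name, symbol)   # symbolic reference, e.g. ("D", "x")
        self.address = None                  # (segment number, word number) once linked


class Process:
    def __init__(self, directory):
        self.directory = directory  # assumed: reference name -> {symbol: word number}
        self.known = {}             # reference name -> segment number in THIS process
        self.searches = 0           # count of directory searches performed

    def make_known(self, seg_name):
        """Assign a segment number on the first reference to a segment."""
        if seg_name not in self.known:
            if seg_name not in self.directory:
                raise KeyError(seg_name)
            self.searches += 1
            self.known[seg_name] = len(self.known)  # next free segment number
        return self.known[seg_name]

    def reference(self, link):
        """Follow a link; the first use traps and fills in the address."""
        if link.trap:
            seg_name, symbol = link.symbolic
            seg_no = self.make_known(seg_name)
            word_no = self.directory[seg_name][symbol]  # symbol table lookup
            link.address = (seg_no, word_no)
            link.trap = False                           # later uses skip the trap
        return link.address


directory = {"D": {"x": 42}, "E": {"y": 7}}

p1 = Process(directory)
p1.make_known("E")                  # p1 happened to touch E first
print(p1.reference(Link("D", "x")))  # prints (1, 42)

p2 = Process(directory)
print(p2.reference(Link("D", "x")))  # prints (0, 42)
```

The two processes end up with different segment numbers for the same segment D, which is why the link data has to live in a per-process copy and not in the pure procedure segment itself.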