Exokernel: An Operating System Architecture for Application-Level Resource Management
Engler, Kaashoek, O'Toole

Overview
* move management of physical resources away from the kernel and into untrusted user-level libraries
* the kernel simply multiplexes resources
* provides much more flexibility and opportunity for domain-specific optimizations
* separate protection from management; e.g. protect framebuffers without understanding windowing systems, protect disks without understanding filesystems

Design
An exokernel must
* track ownership of resources
* ensure protection by guarding all resource usage or binding points
* revoke access to resources when necessary

Secure bindings
* perform authorization only at bind time
* can do this with hardware (e.g. ownership tags on pixels in framebuffer), software caching (e.g. software TLB), or downloading code into the kernel (e.g. packet filters)

Visible resource revocation
* inform an application if some resource (e.g. CPU, memory) is about to be taken away
* the application knows better than the kernel which page it is least likely to use
* the application can also track its resource usage so that it can, for example, manage threads (switch threads when one blocks on a page fault)

Abort protocol
* if the application is uncooperative in releasing a resource, the kernel breaks the secure bindings to the resource and sends an exception to the application
* an application can keep a handful of "vital" resources (e.g. memory pages) that the kernel will not take by force, unless absolutely necessary

The paper gives the example of Aegis, a sample exokernel. It shows that exokernels can be implemented very efficiently.

Library Operating Systems
ExOS is a library operating system that sits on top of Aegis. Because it can do some tasks, like IPC, without trapping to the kernel, it achieves performance considerably better than Ultrix and other research operating systems.
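
As a rough illustration of the secure-binding, visible-revocation, and abort-protocol notes above, here is a toy Python model. It is only a sketch under stated assumptions: `ToyExokernel`, `App`, and every method name are hypothetical and do not come from Aegis or the paper. Pages are plain integers, and the "unless absolutely necessary" case for vital pages is not modeled.

```python
class App:
    """A toy library OS. All names here are hypothetical, for illustration only."""

    def __init__(self, name, vital=(), cooperative=True):
        self.name = name
        self.vital = set(vital)      # pages the kernel should not take by force
        self.cooperative = cooperative
        self.pages = []              # the app's own view, least useful page first
        self.lost = []               # pages the kernel took by force

    def acquire(self, kernel, page):
        kernel.allocate(self, page)
        kernel.bind(self, page)
        self.pages.append(page)

    def on_revoke(self, count):
        # Visible revocation: the application chooses which pages to give up.
        if not self.cooperative:
            return []
        victims = [p for p in self.pages if p not in self.vital][:count]
        for p in victims:
            self.pages.remove(p)
        return victims

    def on_forced_loss(self, pages):
        # The "exception" the kernel sends after breaking bindings by force.
        for p in pages:
            self.pages.remove(p)
            self.lost.append(p)


class ToyExokernel:
    """Tracks ownership, guards binding points, and revokes access."""

    def __init__(self, num_pages):
        self.owner = {page: None for page in range(num_pages)}
        self.bindings = set()        # (app name, page) pairs that passed the bind check

    def allocate(self, app, page):
        if self.owner[page] is not None:
            raise PermissionError(f"page {page} is already owned")
        self.owner[page] = app.name

    def bind(self, app, page):
        # Authorization is performed once, at bind time.
        if self.owner[page] != app.name:
            raise PermissionError(f"{app.name} does not own page {page}")
        self.bindings.add((app.name, page))

    def access(self, app, page):
        # Later accesses only check that the binding exists.
        return (app.name, page) in self.bindings

    def _reclaim(self, app, page):
        self.bindings.discard((app.name, page))
        self.owner[page] = None

    def revoke(self, app, count):
        # Step 1: ask the application, and accept only pages it really owns.
        offered = app.on_revoke(count)
        released = [p for p in offered if self.owner.get(p) == app.name][:count]
        for p in released:
            self._reclaim(app, p)
        # Step 2 (abort protocol): take the shortfall by force, sparing vital pages.
        shortfall = count - len(released)
        forced = []
        if shortfall > 0:
            candidates = [p for p, o in self.owner.items()
                          if o == app.name and p not in app.vital]
            forced = candidates[:shortfall]
            for p in forced:
                self._reclaim(app, p)
            if forced:
                app.on_forced_loss(forced)
        return released, forced
```

In this model a cooperative `App` picks its own victim pages when `revoke` is called, while an uncooperative one loses non-vital pages by force and is told about it afterwards through `on_forced_loss`.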

ASHs
Library operating systems can install application-specific safe handlers (ASHs) into the kernel. These are pieces of code that are examined and sandboxed to make sure they are safe, and that execute on certain asynchronous events (such as packet arrival). An ASH executes in the kernel's context, so it does not need to wait until some particular process gets a time slice. It can copy data directly into user memory (to save on multiple copies), and initiate messages (for low-latency replies), as well as perform general computation.
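
The ASH idea can be sketched with a toy model in Python. This is an assumption-laden illustration, not how Aegis implements ASHs: the handler is modeled as a short list of whitelisted operations, "examining" the code means rejecting any operation outside that whitelist at install time, and "sandboxing" means clamping copies to the application's buffer at run time. `ToyKernel`, `install_ash`, `packet_arrived`, and the operation names are all hypothetical.

```python
ALLOWED_OPS = {"copy_to_user", "reply"}  # hypothetical whitelist of safe operations


class ToyKernel:
    def __init__(self):
        self.handlers = {}  # port -> (program, user buffer)
        self.sent = []      # replies initiated from kernel context

    def install_ash(self, port, program, user_buffer):
        # Examine the handler once, at install time, and reject unsafe code.
        for op, arg in program:
            if op not in ALLOWED_OPS:
                raise ValueError(f"unsafe operation rejected: {op}")
            if op == "copy_to_user" and not 0 <= arg < len(user_buffer):
                raise ValueError("copy offset outside the application's buffer")
        self.handlers[port] = (list(program), user_buffer)

    def packet_arrived(self, port, payload):
        # Runs immediately on the event; no process needs to be scheduled first.
        entry = self.handlers.get(port)
        if entry is None:
            return False
        program, buf = entry
        for op, arg in program:
            if op == "copy_to_user":
                # Run-time bound: never write past the end of the user buffer.
                n = min(len(payload), len(buf) - arg)
                buf[arg:arg + n] = payload[:n]   # single copy into user memory
            elif op == "reply":
                self.sent.append((port, arg))    # low-latency reply
        return True
```

For example, installing `[("copy_to_user", 2), ("reply", b"ACK")]` on a port with a `bytearray` buffer makes each arriving packet land in that buffer starting at offset 2 and queues an `ACK` reply, all inside `packet_arrived`.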