Experienced architect/engineer seeking interesting problems in distributed systems, including “big data” applications (data engineering and machine learning). The ideal position involves learning new skills/problem domains while working as part of a team.
Dedicated to clean, maintainable abstractions and interfaces in code that is readable, continuously integrated, and automatically tested. Advocate of unit testing and test-first development, comfortable with distributed version control and agile practices.
Familiar with:
[* "Consulting Engineer" is a title for a regular full-time position.]
In the DSSD flash storage unit, adapted the Hadoop Distributed File System (HDFS) to use DSSD's native API.
Worked in Scala on a log-data search engine (large-scale AWS cloud deployment).
Analyzed failure causes and significantly improved the reliability of scheduled searches. Added detection and remediation for search tasks that wasted resources. Enhanced tools for managing scheduled searches. Diagnosed customer issues and worked with customer support to improve search performance.
Rotations: rapid response team for customer support/operational issues, primary on-call responder, and continuous integration (Hudson/Jenkins) monitor.
Investigated technology options for highly available, scalable message storage. This included gaining domain understanding (the IMAP mail storage model) and learning about distributed non-relational storage schemes such as Google BigTable and Amazon Dynamo, and available implementations such as Voldemort, Cassandra, and HBase.
Led experimental efforts using a benchmarking framework (YCSB), evaluated technical and pragmatic concerns for the various options, and selected the implementation technology. Gained familiarity with the Apache HBase programming model and Thrift interfaces, and designed the mail storage implementation. Worked with a small team on the initial implementation and integration with legacy code.
Responsible for all the programmatic control interfaces of the video server. This included working with partners to specify the interfaces and protocols used for integration with partner systems, such as asset management and network resource management, as well as session management and stream control. It also included interpreting specifications from standards groups, partners, and others, and deciding how to apply them in the context of the video server implementation.
Designed an XML/HTTP control interface based on REST principles.
Guided team members on design and programming issues.
Was part of a small group responsible for making architectural changes that included significant redesign of major components of the product.
Used dynamic tracing tools (custom and DTrace) to understand system behavior and diagnose problems at both user and kernel level. Developed guidelines for statically defined trace data and implemented support for statically defined probes in the build system. This also provided a means of passing critical, otherwise unavailable information about disk I/O to the user-level program without requiring kernel modifications.
Maintained and enhanced a C++ library/framework that was the foundation of all components. Prototyped and guided transition from 32-bit to 64-bit environment.
Worked on a wide variety of efforts related to the reliability of embedded software. Evaluated tools for code comprehension, error detection, and embedded documentation. Advocated various improvements to software development methodology. Implemented several tools to measure and record data, and a tool for visualizing object-code coverage data in source-code form. Also developed tools that perturbed normal execution to force the exercise of uncommon code paths.
Served on front-line response team diagnosing system-level failures found by in-house test lab. Developed scripts to aid less-technical test personnel in gathering information for diagnosis.
Took over maintenance of a large C++ program that statically analyzed embedded object code to verify that various hardware and software design rules were obeyed. Adapted the program to incremental changes in the instruction-set architecture and implemented new rule-checking modules as needed. Guided the development of a new version of the program for a new instruction-set architecture and implemented several of its rule-checking modules.
Converted a self-contained test program into a general, reusable environment for writing directed and random tests of a new CPU. Diagnosed failures found by RTL simulations, which sometimes required making sense of Verilog.
Designed and led the implementation of Simile, an object-oriented framework supporting collaboration and shared information.
Evaluated (with a small team) the costs and benefits of using a commercial version of the Mach microkernel as the foundation for a future version of the Mac OS, based on both qualitative and quantitative studies of the product.
Consulted to the MovieTalk/QuickTime TV/QuickTime Streaming project on Internet multicast backbone (MBONE) protocols. Contributed to the evaluation process for choosing a core operating system for the Rhapsody OS.
Evaluated various Java class libraries, including JGL, IFC, and JFC. Developed a Java application to monitor Web pages and alert the user when interesting pages change.
Provided consulting to another project on TCP/IP networking issues. Provided consulting and debugging support to a third project, covering the Java language, the IFC class library, and working within the security constraints of Java applets.
Collaborated with HP on the Distributed Object Management Framework. Co-designed the Interface Definition Language originally submitted by HP and Sun to the Object Management Group (which became CORBA IDL).
Contributed substantially to the architecture and design of a federated naming system (which became X/Open's XFN standard).
Helped define requirements and explore technological solutions for security in distributed systems. Served on department-level technology review and strategy body.
Ph.D. Computer Science, Stanford University (1989)
B.S. Computer Science, University of Maryland at College Park (1983)
Continuing education: Coursera MOOCs on big data and machine learning techniques