- Rob Knauerhase, Intel Corporation, firstname.lastname@example.org
- Vivek Sarkar, Rice University, email@example.com
Future many-core platforms will impose a fresh set of challenges on runtime systems, including targeting nodes with hundreds of homogeneous and heterogeneous cores, as well as energy, data movement, and resiliency constraints within and across nodes. The Open Community Runtime (OCR) provides a runtime system framework within which to explore how fine-grained event-driven tasks, movable data blocks, and dynamic resource adaptations can address these challenges.
OCR is an open-source project that includes components for task scheduling and resource mapping in homogeneous, heterogeneous, and distributed environments. In addition to native support on current platforms, OCR includes the ability to emulate different processor and memory features (e.g. scratchpad memories, cache policies, and deep hierarchical arrangements of memory), as well as processor/memory interconnects and communication pathways. The runtime includes facilities for introspection of system behavior, and a language with which a programmer (or a tuning expert familiar with the machine) can express hints about at-runtime optimizations.
The tutorial will introduce OCR’s concepts and provide a demonstration of the latest open-source release (https://01.org/projects/open-community-runtime). We will show how to use OCR for existing systems and upcoming processor simulators, and point to future areas in which members of the community can contribute (or are already contributing) components for academic and industrial research. Our goal for OCR is to help enable community-wide innovation in programming systems above the OCR level, in hardware designs below the OCR level, and in runtime systems at the OCR level.
Partial support for the OCR project was provided through the XStack program of the U.S. Department of Energy, Office of Science, Advanced Scientific Computing Research (ASCR). Additional sources of support include the UHPC program of the U.S. Department of Defense’s Advanced Research Projects Agency (DARPA), and Intel Corporation/Intel Labs.
The objectives of this tutorial are as follows:
- To educate the ASPLOS community on the OCR framework and how it can be used for systems research
- To start a new strand of discussion around software-hardware co-design related to OCR primitives (events, tasks, data blocks)
- To open up new research directions based on the OCR paradigm
- To develop experience and ideas about the directions this paradigm should take in the future
The specific topics to be covered in the tutorial include:
- Introduction and motivation for OCR
- Summary of key concepts in OCR
- Event-Driven Tasks (EDTs)
- Movable Data Blocks
- Globally unique ids (guids) for the above
- Machine description (real and simulated hardware)
- Distribution and tuning annotations
- OCR’s software architecture and APIs
- Demo and walkthrough of sample applications mapped onto OCR
- Use of OCR to emulate different processor-architectural configurations
- Native execution with PIN instrumentation
- Integration with custom simulators (demonstrated using the Intel Sunshine simulator)
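To make the key concepts listed above a little more concrete, the following is a minimal toy model of event-driven tasks, data blocks, and GUIDs. It is a sketch for illustration only and is not the OCR API (OCR itself is programmed through C APIs covered in the tutorial); the class `ToyRuntime` and every method name here are hypothetical. The assumption being illustrated is that a task becomes runnable only once all of its input dependences have been satisfied, and that tasks refer to data through runtime-issued identifiers rather than raw pointers, which is what leaves a runtime free to relocate the data.

```python
import itertools
from collections import deque


class ToyRuntime:
    """Hypothetical single-threaded model of tasks, data blocks, and GUIDs."""

    def __init__(self):
        self._next_guid = itertools.count(1)
        self._objects = {}      # guid -> data block or task record
        self._ready = deque()   # guids of tasks whose dependences are all met

    def _register(self, obj):
        guid = next(self._next_guid)
        self._objects[guid] = obj
        return guid

    def lookup(self, guid):
        """Resolve a GUID to the object the runtime currently holds for it."""
        return self._objects[guid]

    def create_data_block(self, size):
        """Allocate a runtime-owned block of bytes and return its GUID."""
        return self._register(bytearray(size))

    def create_task(self, func, num_deps):
        """Create a task that waits for num_deps data-block dependences."""
        task = {"func": func, "slots": [None] * num_deps, "pending": num_deps}
        guid = self._register(task)
        if num_deps == 0:
            self._ready.append(guid)
        return guid

    def satisfy(self, task_guid, slot, db_guid):
        """Supply one dependence; the task becomes ready when none remain."""
        task = self._objects[task_guid]
        if task["slots"][slot] is not None:
            raise ValueError("dependence slot already satisfied")
        task["slots"][slot] = db_guid
        task["pending"] -= 1
        if task["pending"] == 0:
            self._ready.append(task_guid)

    def run(self):
        """Run ready tasks until none are left; tasks may enable other tasks."""
        while self._ready:
            task = self._objects[self._ready.popleft()]
            blocks = [self._objects[g] for g in task["slots"]]
            task["func"](self, blocks)


def demo():
    rt = ToyRuntime()
    db = rt.create_data_block(4)
    results = []

    def consumer(rt, blocks):
        # Runs only after the producer has satisfied its single dependence.
        results.append(sum(blocks[0]))

    consumer_guid = rt.create_task(consumer, num_deps=1)

    def producer(rt, blocks):
        rt.lookup(db)[:] = bytes([1, 2, 3, 4])   # fill the data block
        rt.satisfy(consumer_guid, 0, db)         # hand it to the consumer

    rt.create_task(producer, num_deps=0)
    rt.run()
    return results
```

Calling `demo()` runs the producer first, because it has no dependences, and the consumer only afterwards. A real runtime would schedule ready tasks across many cores and could move data blocks between memories; this model does neither.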
The content for these topics comes in part from our experiences with OCR in the UHPC Runnemede project and the XStack Traleika Glacier project.
This tutorial should be relevant to practitioners and researchers interested in future “extreme scale” systems where requirements arising from concurrency, resilience, and energy constraints are forcing system designers to take a fresh look at the foundational primitives for computation and data. Attendees will leave the tutorial with sufficient information on OCR to write small-scale programs to the OCR APIs. In addition, the semantic foundations and implementation techniques described in the tutorial will be useful for programming system designers and implementers (who may choose to build tools that can map larger-scale programs onto these APIs), as well as OS and hardware designers and implementers (who may choose to explore design choices that improve the efficiency of OCR primitives). There is a set of open research problems that arise at the software and hardware boundaries that should be of interest to many researchers in the field.
The prerequisite knowledge assumed is familiarity with the foundations of multithreaded programming in C (e.g., with Pthreads) as well as the basic components of modern processors (processor cores, memory hierarchy).
Rob Knauerhase is a Research Scientist with Intel Labs. His current focus is observation-driven dynamic adaptation (resource mapping and scheduling) in system software, especially as applied to very-high-core-count supercomputers and heterogeneous systems. Most recently, he was the software architect and co-PI for system software in the Runnemede project, joint work between Intel (with partners) and DARPA’s UHPC program. Rob’s other professional interests include distributed systems, machine virtualization, and information privacy in the digital world. His professional service includes being active in university collaboration (joint projects, shepherding Intel grants, etc.) and serving on various workshop and conference program committees. Rob received the Masters in Computer Science from the University of Illinois at Urbana-Champaign, and the Bachelor of Science in Engineering from Case Western Reserve University. He holds 33 patents (with approximately 30 patents pending) and is a Senior Member of the IEEE. Outside of work, Rob enjoys volunteering with the First Baptist Church of Portland, global travel, libertarian thought and debate, clever puns, and spicy foods.
Vivek Sarkar is the E.D. Butcher Professor of Computer Science at Rice University. He conducts research in multiple aspects of parallel software including programming languages, program analysis, compiler optimizations and runtimes for parallel and high performance computer systems. He currently leads the Habanero Multicore Software Research project at Rice University, and serves as Associate Director of the NSF Expeditions Center for Domain-Specific Computing. Prior to joining Rice in July 2007, Vivek was Senior Manager of Programming Technologies at IBM Research. His past projects include the X10 programming language, the Jikes Research Virtual Machine for the Java language, the ASTI optimizer used in IBM’s XL Fortran product compilers, the PTRAN automatic parallelization system, and profile-directed partitioning and scheduling of Sisal programs. In 1997, he was on sabbatical as a visiting associate professor at MIT, where he was a founding member of the MIT RAW project. Vivek became a member of the IBM Academy of Technology in 1995, the E.D. Butcher Chair in Engineering at Rice University in 2007, and was inducted as an ACM Fellow in 2008. He holds a B.Tech. degree from the Indian Institute of Technology, Kanpur, an M.S. degree from University of Wisconsin-Madison, and a Ph.D. from Stanford University. Vivek has been serving as a member of the US Department of Energy’s Advanced Scientific Computing Advisory Committee (ASCAC) since 2009. He has given tutorials at several past conferences including PLDI 1993, POPL 1996, ASPLOS 1996, PLDI 2000, OOPSLA 2003, ECOOP 2004, OOPSLA 2006, PPoPP 2007, PLDI 2007, PLDI 2008, PLDI 2009, PLDI 2011 and has taught many short courses and full-length courses.