Introduction

Introducing X10

For many years, the increase in the number of transistors on a chip, as predicted by Moore’s Law, resulted in higher clock speeds and thus greater performance, without any changes to software. This enabled the creation of advanced, performance-hungry applications, and it raised developer productivity by making high-level programming languages practical.

Due to power constraints, mainstream computer vendors have announced two significant changes in their future architectures. First, the dramatic increases in clock speed they provided in the past will not continue, and the relative amount of cache memory per processor will decrease. Second, there will be an exponentially increasing number of processor cores on a chip.

These changes present two challenges to the software stack: how does software deal with the stagnation of single-threaded performance and cache memory, and how can it utilize the additional capabilities provided by multiple cores on a chip?

For some classes of applications, such as transaction-based systems, these trends are not problematic. These applications have natural parallelism and can therefore adapt easily to the multicore trend by having appropriate middleware map their parallelism to the multicore chips. For other classes of applications, however, needing parallelism to obtain performance is a significant and unwanted challenge.

The cloud has emerged as an attractive and viable development and deployment framework for commercial applications that must process vast amounts of data using hundreds of (possibly heterogeneous) cores. For these applications, parallelism, once an option, is now a requirement. It must be exploited to sustain the historical increases in application performance, which have in turn driven the developer productivity improvements that are key to building more robust, sophisticated software.

These challenges are among the hardest problems facing the computer industry today.

 

The X10 Project

IBM Research is developing the open-source X10 programming language to provide a programming model that addresses the architectural challenge of multiple cores, hardware accelerators, clusters, and supercomputers, delivering scalable performance while keeping programmers productive. The project leverages over six years of language research funded, in part, by the DARPA/HPCS program.

The X10 programming language is organized around four basic principles (asynchrony, locality, atomicity, and order) that are developed on a type-safe, class-based, object-oriented foundation. This foundation is robust enough to support fine-grained concurrency, Cilk-style fork-join programming, GPU programming, SPMD computations, phased computations, active messaging, MPI-style communicators, and cluster programming. X10 implementations are available on Power and x86 clusters, on Linux, AIX, MacOS, Cygwin and Windows.
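To make the fork-join style mentioned above concrete, here is a minimal sketch written in Java rather than X10, using the standard java.util.concurrent fork/join classes. It is only an analogy for the spawn-and-wait pattern, not X10 syntax and not code from the X10 project; the class name ForkJoinSum, the SumTask helper, and the THRESHOLD cutoff are assumptions made for illustration.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class ForkJoinSum {
    // Hypothetical task: sums the integers in [lo, hi) by recursively splitting the range.
    static final class SumTask extends RecursiveTask<Long> {
        private static final int THRESHOLD = 1_000; // assumed cutoff below which work is sequential
        private final int lo;
        private final int hi;

        SumTask(int lo, int hi) {
            this.lo = lo;
            this.hi = hi;
        }

        @Override
        protected Long compute() {
            if (hi - lo <= THRESHOLD) {
                long s = 0;
                for (int i = lo; i < hi; i++) {
                    s += i;
                }
                return s;
            }
            int mid = lo + (hi - lo) / 2;
            SumTask left = new SumTask(lo, mid);
            SumTask right = new SumTask(mid, hi);
            left.fork();                       // spawn the left half asynchronously
            long rightSum = right.compute();   // compute the right half in the current thread
            return left.join() + rightSum;     // wait for the spawned half, then combine
        }
    }

    // Sums 0 + 1 + ... + (n - 1) on the common fork/join pool.
    static long sum(int n) {
        return ForkJoinPool.commonPool().invoke(new SumTask(0, n));
    }

    public static void main(String[] args) {
        System.out.println(sum(100_000));
    }
}
```

The fork call plays the role of starting an asynchronous activity, and join plays the role of waiting for it to finish before the results are combined. This sketch runs on a single shared-memory machine, so it does not illustrate locality across the nodes of a cluster.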

IBM Research has significantly increased its investment in language design and implementation, tooling, and application development. It also works closely with over 25 universities around the world to improve the language and its implementation and to develop course materials and applications. In the past three years, IBM Research awarded over $750K to grow the community around the language. An X10 Birds-of-a-Feather session at the October 2010 ACM SPLASH conference drew over 100 researchers (video). Courses and tutorials based on X10 have been taught at universities and major conferences in the US and abroad. The first X10 workshop was held at PLDI'11 in San Jose, CA on June 4, 2011. Videos, slides, and papers are available. The 2nd X10 workshop will be co-located with PLDI'12 and held in Beijing, China in June 2012.