The Human Brain Project (HBP), a 10-year neuroscience research project, has been making the rounds in technology, science and mainstream headlines since it launched in 2013. A key component of the project going forward will be giving researchers access to interactive, digital brain model simulations.
Supercomputer manufacturer Cray was recently contracted to provide a pilot system as part of the third and final phase of a research and development project. This system, currently being developed at Cray's newly formed EMEA Research Lab in Bristol, will provide the platform on which interactive simulation and analysis techniques will be designed and tested.
The pilot system will be managed much like other large experimental facilities: rather than submitting work to a queue, users reserve time slots on the system through an advance reservation system. Cray has also been developing an alternative, which is to suspend running applications to memory or to fast swap space via a DataWarp filesystem. Suspension of this kind enables repetitive production cycles to run at the same time each day. Although several Cray sites already employ these techniques, exploring them further could make interactive use of Cray supercomputer systems a possibility.
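The idea of reserving time slots instead of queueing can be illustrated with a toy booking ledger. This is only a sketch under stated assumptions: the `ReservationBook` class, its `reserve` method and the integer hour slots are hypothetical names invented for illustration, and nothing here reflects how Cray's reservation system is actually implemented.

```python
from dataclasses import dataclass, field


@dataclass
class ReservationBook:
    """Hypothetical advance-reservation ledger for a single system."""
    slots: list = field(default_factory=list)  # (start, end, user) tuples

    def reserve(self, start: int, end: int, user: str) -> bool:
        """Book the half-open interval [start, end) if it is still free."""
        if start >= end:
            raise ValueError("start must be before end")
        for booked_start, booked_end, _ in self.slots:
            # Two half-open intervals overlap when each starts before the other ends.
            if start < booked_end and booked_start < end:
                return False
        self.slots.append((start, end, user))
        return True


book = ReservationBook()
first = book.reserve(9, 11, "alice")   # free slot, accepted
clash = book.reserve(10, 12, "bob")    # overlaps alice's slot, rejected
later = book.reserve(11, 13, "bob")    # starts when alice's slot ends, accepted
```

The point of the sketch is the contrast with a queue: a request is either granted for a known time window or refused straight away, so a researcher knows in advance when the system will be available.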
Of course, limited memory capacity presents a barrier to brain simulations. The data sets researchers envision often require tens of petabytes of main memory, far more than even the largest current supercomputers provide, and a full brain-scale simulation is expected to require hundreds of petabytes. Storing a full data set in memory at one time will therefore be impossible. The solution will be for users to select interactively the most relevant data to visualise and analyse, along with a small subset of results to store. The simulation codes use "recording devices" to store the selected results for further analysis. Initial simulations use default recording settings, with the option of more detailed settings for a subset of data objects should the simulation reveal pertinent and interesting results.
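A toy version of selective recording shows why this approach saves so much storage. The sketch below is illustrative only: the `Recorder` class, the `run` function, the neuron counts and the toy dynamics are all assumptions made for the example, not the interface of any real simulation code.

```python
import random


class Recorder:
    """Hypothetical recording device: keeps values only for selected neuron ids."""

    def __init__(self, targets, every=1):
        self.targets = sorted(set(targets))
        self.every = every    # record once every `every` steps
        self.samples = []     # (step, neuron_id, value) tuples

    def observe(self, step, state):
        if step % self.every:
            return            # not a sampling step
        for neuron_id in self.targets:
            self.samples.append((step, neuron_id, state[neuron_id]))


def run(n_neurons, n_steps, recorders, seed=0):
    """Toy simulation loop; the update rule is a stand-in, not a brain model."""
    rng = random.Random(seed)
    state = [0.0] * n_neurons
    for step in range(n_steps):
        state = [value * 0.9 + rng.random() for value in state]
        for recorder in recorders:
            recorder.observe(step, state)


# Default settings: a sparse sample of neurons at a coarse interval.
coarse = Recorder(targets=range(0, 1000, 100), every=10)
# More detailed settings for a small subset that looks interesting.
detailed = Recorder(targets=[42, 43], every=1)
run(1000, 100, [coarse, detailed])

full_size = 1000 * 100                                  # values if everything were kept
kept = len(coarse.samples) + len(detailed.samples)      # values actually stored
```

In this toy run the two recorders together keep 300 values out of a possible 100,000, which is the same trade-off the article describes at petabyte scale.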
It is critical to provide researchers with the ability to couple their applications - simulation, analysis and visualisation - into a single workflow. In a traditional HPC workflow, a simulation job writes results, either intermittently or after the job has completed, and a post-processing job analyses them afterwards. A coupled workflow would instead allow the simulation and analytics applications to run concurrently. The analysis job could run either on its own dedicated resources or on the same nodes as the simulation, delivering the data it gathers to the visualisation system running on a pool of dedicated GPU nodes. For either method, fast and efficient techniques for transferring data between the applications and synchronising them are required.
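The difference between post-processing and a coupled workflow can be sketched on a single machine with two threads and a bounded queue. This is a minimal illustration, assuming an in-process queue as the transfer mechanism; the function names and the toy "results" are hypothetical, and a real coupling across nodes would use a very different transport.

```python
import queue
import threading


def simulate(out, n_steps):
    """Producer: emits one toy result per step while the run is in progress."""
    for step in range(n_steps):
        out.put((step, step * step))
    out.put(None)  # sentinel telling the analysis side that the run has ended


def analyse(inp, summary):
    """Consumer: processes results as they arrive instead of after the run."""
    while True:
        item = inp.get()
        if item is None:
            break
        _, value = item
        summary["count"] += 1
        summary["total"] += value


# Bounded queue: the simulation blocks if the analysis falls too far behind,
# which acts as a simple form of synchronisation between the two sides.
channel = queue.Queue(maxsize=4)
summary = {"count": 0, "total": 0}

consumer = threading.Thread(target=analyse, args=(channel, summary))
consumer.start()
simulate(channel, 10)
consumer.join()
```

Because the queue is bounded, neither side needs to hold the full set of results at once, which matters when the complete output would not fit in memory.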
Applications in an HPC workflow, particularly on projects that involve massive amounts of data being transferred between distinct applications, typically communicate via the filesystem. By providing a tier of shared bandwidth-optimised storage, these exchanges can be vastly accelerated. While intermediate data would be stored in this tier, the final results of the experiment would be sent to enterprise storage.
DataWarp - Cray's revolutionary flash-based storage solution - supports both private and shared uses. In the case of shared use, a high-bandwidth filesystem - built on an array of flash-based storage servers - provides a platform for large simulations and visualisation jobs running on different nodes to communicate with one another. In the case of private use, local high-bandwidth communication between analytics and simulation applications is provided via memory or storage on every node. Both of these strategies are currently being evaluated for the HBP pilot.
An additional element of the HBP is the capacity to guide simulations while they are running. This is a way of looking at how we perform simulations rather than a specific technology. The brain network wiring is constructed in memory, typically a very time-consuming task that can result in over a petabyte of data. While the network is retained in memory, the researcher can run a succession of quick virtual experiments, with each experiment guided by the results of the previous one. The steering data can be sent back to the simulation via socket connections or the filesystem.
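The steering pattern described here, build once and then run many short experiments, can be sketched as a simple feedback loop. Everything in the example is an assumption made for illustration: `build_network`, `run_experiment` and `steer` are hypothetical names, the "activity" formula is a toy, and the bisection rule merely stands in for a researcher's judgement. The steering values are passed in-process here rather than over a socket or the filesystem.

```python
def build_network(n):
    """Stand-in for the expensive wiring step; performed only once."""
    return [(i, (i * 7 + 1) % n) for i in range(n)]


def run_experiment(network, drive):
    """Toy experiment: 'activity' grows linearly with the drive parameter."""
    return drive * len(network) / 100.0


def steer(network, target, lo=0.0, hi=10.0, rounds=20):
    """Run a succession of experiments, each guided by the previous result."""
    history = []
    for _ in range(rounds):
        drive = (lo + hi) / 2
        activity = run_experiment(network, drive)
        history.append((drive, activity))
        # The previous outcome decides which way to adjust the next experiment.
        if activity < target:
            lo = drive
        else:
            hi = drive
    return history


network = build_network(1000)            # constructed once and kept in memory
history = steer(network, target=25.0)    # twenty quick experiments on the same network
final_drive, final_activity = history[-1]
```

The network is never rebuilt between experiments, so the costly construction step is paid once while each steering round stays cheap.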
The HBP pilot system will preview a range of interesting and progressive memory and processor technologies. Details on these technologies will be made public as the project progresses.