Q&A with ORNL’s Bronson Messer, an HPCwire Person to Watch in 2022

HPCwire features our interview with Bronson Messer, Distinguished Scientist and Director of Science at the Oak Ridge Leadership Computing Facility (OLCF) at ORNL, and an HPCwire 2022 Person to Watch. Messer recaps ORNL’s journey to exascale, highlighting how all of the pieces align to support the most important science. The role of the Exascale Computing Project, perspectives on architectural directions, and the evolution of HPC-AI synergies are also discussed. This interview was conducted via email earlier this year.

Bronson, congratulations on being named an HPCwire 2022 Person to Watch! Can you give us a brief overview of your responsibilities at the Oak Ridge Leadership Computing Facility and what your role entails?

Bronson Messer

As Director of Science at the OLCF, I’m responsible for mobilizing all of our resources to ensure that the science that only leadership computing can enable gets done. This work begins before an allocation is ever made on the machines, continues throughout compute campaigns, and really has no formal end, as I keep communicating the impact of these projects to a wide variety of audiences even years after they conclude. It’s great work for a science junkie like me: I get to develop a better-than-pedestrian understanding of the full range of science we support at OLCF (i.e., almost every scientific discipline) while “living close” to some of the most powerful computers. The little Appalachian boy programming a TRS-80 Model 1 that I was in the early ’80s would be very jealous.

Please highlight some of Oak Ridge’s successes on the road to exascale. (HW, SW, apps, people – anything!)

I think our greatest successes on the path to exascale have to do with the chance we took with Titan at the beginning of the last decade. There was considerable skepticism when we first embraced hybrid CPU-GPU computing, going all-in with Titan. We continued down this path with Summit, a path that has proven fruitful as we now stand on the precipice of exascale.

This journey is as much about the people we have deployed around the machines, and their expertise, as it is about the hardware. I’ve been fortunate to work alongside some of the most knowledgeable and experienced people in HPC over the past decade and a half, across all the different aspects of the effort required to deploy resources at the scale we have. In particular, our liaison model – pairing domain scientists who have world-class HPC skills with individual projects – is a methodology that has enabled the arrival of exascale along the road of hybrid-node computing in a very real way.

How did your team interface with the Exascale Computing Project (ECP)? What can you share about the role of the ECP in supporting exascale readiness, from your perspective?

We are close partners with ECP. There is hardly any element of the project that OLCF is not deeply involved in, from application development to hardware and integration. We have provided the primary development and test platform for all ECP software technology and application development teams in the form of Summit, at a few million node-hours per year for the past few years. We also regard the ECP teams as part of our traditional early-science teams. We instantiated the third round of our Center for Accelerated Application Readiness (CAAR) to prepare a group of applications for Frontier, and we consider the ECP development teams to be part of it. Indeed, many of the same OLCF people working with our CAAR teams also work on ECP applications and other software. The ECP teams are also part of the first group of users on our test and development system for Frontier. I expect the ECP applications will provide some of our first science results on Frontier.

Milestones are inspiring and exciting. What excites you the most about entering the exascale era? What are some examples of the science and, hopefully, breakthroughs that will be unlocked? How will having exascale systems – and I mean the whole ecosystem, not just the hardware – be a game changer?

The great thing (for me, anyway) about supercomputing is that there isn’t just one “killer app.” Supercomputing is useful across the whole scientific enterprise, so the list of new ideas and questions that will be gleaned from exascale computing is… uncountable. But there are a few places where I think the effect will be particularly sharp and deep. The first concerns the design cycle for engineering in aerospace, CFD, and related fields. The ability to run design simulations with the physical fidelity required to deploy real machines, and to do so on human timescales (i.e., roughly a day or overnight), is a real game-changer for many researchers in academia and industry alike. Related to that is the continuing quest to understand turbulence, the last great problem in classical physics. Resolution – which is to say, memory – is required to make progress on this front, and Frontier will provide a significant leap. The ability to resolve convection in the atmosphere at scales of about a kilometer is one place where that extra resolution isn’t just a nicety; rather, it leads to new physics and new understanding.
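To make the “resolution equals memory” point concrete, here is a back-of-the-envelope sketch (an editorial illustration, not Messer’s own figures; the column, level, variable, and precision counts are all assumptions) of why a kilometer-scale global atmosphere pushes into leadership-class memory territory:

    #include <cstdio>

    int main() {
        // Assumed figures for a global km-scale atmospheric grid (illustrative only).
        const double columns   = 5.1e8;   // Earth's surface is ~5.1e8 km^2: one column per km^2
        const double levels    = 100.0;   // assumed vertical levels per column
        const double variables = 10.0;    // assumed prognostic fields per cell
        const double bytes     = 8.0;     // double precision

        const double state_bytes = columns * levels * variables * bytes;
        std::printf("One state snapshot: ~%.1f TB\n", state_bytes / 1e12);
        // ~4 TB for a single snapshot -- and a time-stepping code holds several
        // such states at once, which is why resolution is, in effect, a memory problem.
        return 0;
    }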

In addition, we are commissioning huge storage systems as part of Frontier. The ability to quickly query very large collections of data and to perform non-trivial computations on that data will yield insights in a number of areas, drug discovery being one important example.

Heterogeneous computing architecture, relying largely on accelerators (primarily GPUs), has become the dominant approach to supercomputing (with the notable exception of Top500 leader Fugaku) and forms the backbone of the US exascale program. Where do you see computing architecture headed? What will be the follow-on to today’s dominant heterogeneous landscape (CPU plus accelerator)?

I think the outline of CPU + accelerator computing probably still has some gas left in it. Most important for developers is the abstraction of the memory hierarchy into “near and fast” and “far and slow” memory spaces. This pattern has been with us for a while; it just became more apparent and, perhaps, more important with hybrid-node computing. The compute engines may change a bit, but having this kind of structure, and having heterogeneity on the node, is probably going to persist for a while. That doesn’t mean we won’t set up multiple partitions of different hardware in the future (i.e., raising heterogeneity above the node level), but I think that will be more a question of expediency in advancing science: making sure that every step in the process of gaining insight from a computational experiment, data analysis, or inference is carried out as efficiently as possible.
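As an editorial illustration of that “near and fast” versus “far and slow” abstraction, here is a minimal CUDA sketch (the kernel and sizes are hypothetical, not OLCF code): data is staged from host DRAM into the GPU’s high-bandwidth memory, computed on where it lives, and copied back, with the staging step being the one developers plan their codes around.

    #include <cuda_runtime.h>
    #include <vector>

    // Hypothetical kernel: works on data already resident in the GPU's
    // "near and fast" high-bandwidth memory.
    __global__ void scale(double* x, double a, size_t n) {
        size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
        if (i < n) x[i] *= a;
    }

    int main() {
        const size_t n = 1u << 24;               // ~16M doubles (~128 MB)
        std::vector<double> host(n, 1.0);        // "far and slow": host DRAM

        double* dev = nullptr;
        cudaMalloc(&dev, n * sizeof(double));    // "near and fast": device HBM

        // Staging across the hierarchy is the expensive step to amortize.
        cudaMemcpy(dev, host.data(), n * sizeof(double), cudaMemcpyHostToDevice);

        scale<<<(n + 255) / 256, 256>>>(dev, 2.0, n);  // compute where the data lives

        cudaMemcpy(host.data(), dev, n * sizeof(double), cudaMemcpyDeviceToHost);
        cudaFree(dev);
        return 0;
    }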

What is the opportunity in bringing HPC and AI capabilities together in a single architecture? I have heard it said (I forget by whom!) that Summit is (already) the world’s first large HPC-AI supercomputer. What is the adoption/implementation status of converged AI-HPC workflows? Do you also see a need for purpose-built AI architectures (like Cerebras, SambaNova, Groq, etc.)?

We recently looked at this idea of HPC and AI coming together, based on what we see in our user programs. The confluence is already here. Many of the projects we support on Summit use both “traditional” simulation (I really hate that moniker) and AI/ML techniques. These projects use AI/ML at a number of stages of their computational campaigns: before the first simulation to train surrogate models, for design of experiments, and for analysis after the data are generated.

If purpose-built architectures can be made adept at participating in all of these stages – through policy or software or both – then I think the acceleration they hope to deliver could be as impactful as, for example, the Tensor Cores on Summit turned out to be.

It has been proposed that in the not-so-distant future, quantum accelerators will be integrated into an HPC architecture or workflow. How do you see these technologies coming together? Is this something the OLCF is preparing for?

OLCF has an active Quantum Computing User Program through which we manage access to a number of commercial quantum computing providers. We are also actively soliciting proposals through our Director’s Discretionary allocation program for “hybrid” projects that want to use these resources in conjunction with an allocation on Summit.

I am very excited about the promise of quantum computing to help solve problems that are already “quantum.” Some of these problems are treated classically today because we do not know how to write software capable of solving the “real” quantum equations fast enough. One I am particularly interested in is quantum neutrino kinetics in dense astrophysical environments like neutron stars and core-collapse supernovae. I think we are years away from having “quantum accelerators” hanging off HPC nodes, solving the quantum kinetic equations that will tell us how neutrinos change flavor in these explosive environments, but maybe a student I train will see it happen.

Are there any other computing trends you would like to comment on? Are there any areas that concern you or need more attention/investment?

Moving numbers to and from memory is the most significant bottleneck in scientific computing. This has been known to HPC practitioners for a long time, and our current partners in AI and ML are quickly coming up against the same reality. There are no easy technical answers for increasing memory bandwidth while limiting the amount of energy needed to move those bits, but that should perhaps be the most motivating problem as we move forward.
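A quick roofline-style example of why this bottleneck is so motivating (an editorial sketch; the peak and bandwidth figures are placeholders, not any particular machine’s specs): for a bandwidth-bound kernel like the STREAM triad, even a generous memory system leaves nearly all of the floating-point peak idle.

    #include <algorithm>
    #include <cstdio>

    int main() {
        // STREAM triad: a[i] = b[i] + s * c[i]
        // Per element: 2 flops; 3 doubles moved (2 reads + 1 write) = 24 bytes.
        const double intensity = 2.0 / 24.0;      // ~0.083 flops per byte

        // Placeholder machine numbers (illustrative assumptions only).
        const double peak_tflops = 20.0;          // double-precision peak, TF/s
        const double mem_bw_tbs  = 1.5;           // memory bandwidth, TB/s

        // Roofline model: attainable rate = min(peak, bandwidth * intensity).
        const double attainable = std::min(peak_tflops, mem_bw_tbs * intensity);
        std::printf("Attainable: %.3f TF/s of %.0f TF/s peak (%.2f%%)\n",
                    attainable, peak_tflops, 100.0 * attainable / peak_tflops);
        // ~0.125 TF/s, well under 1% of peak: the kernel is bandwidth-bound,
        // and adding more flops would not change that.
        return 0;
    }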

Outside of the professional sphere, what can you tell us about yourself – unique hobbies, favorite places, etc.? Is there anything about you that your colleagues might be surprised to learn?

I wear my Appalachian background on my sleeve, so most people who know me know I grew up in the Great Smoky Mountains. A bit of an obsession with fly fishing goes along with that origin story. But not everyone knows that I am an avid lacrosse player and coach, that I finally graduated from high school (honorarily) last year, or that I am a multi-day Jeopardy! champion.

Messer is one of HPCwire’s 12 People to Watch for 2022. You can read interviews with the other honorees at this link.
