HPC Wire

Since 1987 - Covering the Fastest Computers in the World and the People Who Run Them

Vintage Cray Supercomputer Rolls Up to Auction

Tue, 11/28/2017 - 14:08

Where do you go to scratch your itch for a vintage Cray? Why eBay of course.

Our search wizards were up early this morning and spotted a listing for a “Vintage Cray C90/J916 Super Computer” which includes the 48-foot Ellis & Watts trailer that the supercomputer is mounted to. [Note: this is a J90 series system, not the earlier C90, as the included images confirm.]

Source: eBay auction item

Codenamed “Jedi” during development, the Cray J90 series was first sold by Cray Research in 1994. It was an entry-level, air-cooled vector processor supercomputer that evolved from the Cray Y-MP EL minisupercomputer. It is compatible with Y-MP software and runs the same UNICOS operating system, Cray’s version of Unix.

As Wikipedia notes, “the J90 supported up to 32 CMOS processors with a 10 ns (100 MHz) clock.” The J916 is the 16 processor model. There was also the J98 with up to eight processors, and the J932 with up to 32 processors. Fully configured with 4 GB of main memory and up to 48 GB/s of memory bandwidth, the J90 offered “considerably less performance than the contemporary Cray T90,” but was “a strong competitor to other technical computers in its price range,” according to the Wikipedia entry.
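For a sense of scale, those clock and processor counts translate into a modest peak by modern standards. Here is a back-of-the-envelope sketch, assuming the commonly cited figure of two floating-point results per 10 ns clock (one add plus one multiply) per J90 processor — a hypothetical simplification, not an official Cray specification:

```python
# Rough peak-performance estimate for a fully populated J916.
clock_hz = 100e6          # 10 ns cycle time -> 100 MHz
flops_per_cycle = 2       # assumed: one add + one multiply per clock
per_cpu = clock_hz * flops_per_cycle          # ~200 MFLOPS per processor

print(per_cpu / 1e6)        # MFLOPS per CPU
print(16 * per_cpu / 1e9)   # GFLOPS for all 16 processors
```

Even that aggregate figure is dwarfed by a single modern smartphone, which is part of the machine's charm as a collectible.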

The seller reports that the unit, which comes equipped with cooling systems (that “need some restoring work”), is untested and requires a 480V connection to hook up the trailer.

Source: eBay auction item

Currently, one person has bid $3,999 on the auction. Shipping is listed at $3,000 with free local pickup in San Jose, California.

If you want a piece of Cray history in time for the holidays but aren’t looking to spend quite that much, there’s other Cray memorabilia to choose from, like this vintage Cray Research champagne glass (buy it now: $27.50), a Cray Y-MP C90 bomber jacket direct from Wisconsin (buy it now: $99), or, if it’s hardware you crave, a Cray X-MP memory board (buy it now: $146).

As for the “portable” Cray J90, the eBay lister doesn’t say much about its provenance, stating only, “Pickup from company been storing for many years. Please send me an email for more info.”

The post Vintage Cray Supercomputer Rolls Up to Auction appeared first on HPCwire.

CH3 Data and Midas Green Technologies Announce High Density, High Performance Computing Partnership

Tue, 11/28/2017 - 08:49

AUSTIN, Texas, Nov. 28, 2017 — CH3 Data and Midas Green Technologies today announced a strategic partnership to deliver high density compute workloads at record efficiencies. With the partnership, Midas’ immersion cooling technology will power CH3’s data center expansion.

CH3’s data center expansion will enable its customers to power the latest High Performance Computing (HPC) and cryptocurrency applications at unparalleled efficiencies. Utilizing Midas’ XCI technology, customers will be able to deploy over 100kW in 50 rack units of space while maintaining a Power Usage Effectiveness (PUE) under 1.08, compared to the industry average of 1.7.
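PUE is simply total facility power divided by the power delivered to IT equipment, so the quoted figures imply how much overhead goes to cooling and power distribution. A minimal sketch (the function name and example wattages are illustrative, not from the announcement):

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power Usage Effectiveness: total facility power / IT equipment power."""
    return total_facility_kw / it_equipment_kw

# For a 100 kW IT load: at the industry-average PUE of 1.7 the facility
# draws 170 kW overall; at PUE 1.08 the same load draws only 108 kW,
# so nearly all non-IT (mostly cooling) overhead is eliminated.
print(pue(170.0, 100.0))  # 1.7
print(pue(108.0, 100.0))  # 1.08
```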

“Everyone from the oil and gas industry, to the military, to commercial data centers is learning the value of immersion cooling. It’s here to stay and growing worldwide!” says Jim Koen, CEO of Midas Green Technologies.

Midas XCI™ utilizes a dielectric fluid immersion system that is over 1,000 times more efficient than conventional air-cooling methods. With the elimination of air conditioning, Midas XCI can reduce data center operational expenses by up to 90%. Immersion cooling also eliminates the most common causes of maintenance issues, such as fan failure and dust accumulation. Finally, the technology enables high-density installations that significantly reduce the physical footprint previously required for similar capacities.

“We are excited for our newest data center expansion. As we see the demand for HPC and cryptocurrency applications continue to grow, CH3 Data will be positioned to take advantage of Midas’ immersion cooling technology for the benefit of our customers by saving them money, increasing the life of their equipment and giving them a safe, secure, constant environment for their servers,” says Chris Laguna, Director of Operations for CH3 Data.

CH3 Data delivers the highest value data center services while providing clients with solutions to meet their unique needs. In addition to traditional colocation services, CH3 Data offers dedicated, HPC and cryptocurrency hosting solutions in their immersion-cooled data center.

Midas Green Technologies provides Midas XCI, an innovative and efficient data center cooling system. Founded in March 2011, Midas is headquartered in Austin, Texas. Their mission is to slash the data center carbon footprint and provide customers with compelling savings and benefits based on the superior design and efficiency of the Midas XCI immersion cooling solution.

Source: Midas Green Technologies

The post CH3 Data and Midas Green Technologies Announce High Density, High Performance Computing Partnership appeared first on HPCwire.

V100 Good but not Great on Select Deep Learning Apps, Says Xcelerit

Mon, 11/27/2017 - 13:58

Wringing optimum performance from hardware to accelerate deep learning applications is a challenge that often depends on the specific application in use. A benchmark report released today by Xcelerit suggests Nvidia’s latest V100 GPU produces less speedup than expected on some finance applications when compared to Nvidia’s P100 GPU.

Specifically, the V100’s new Tensor Cores are not well suited to recurrent neural networks (RNNs) broadly, or to a specialized variant of them, long short-term memory (LSTM) models, according to Xcelerit; both are widely used in finance applications for handling time-series inputs.

“For the tested RNN and LSTM deep learning applications, we notice that the relative performance of V100 vs. P100 increases with network size (128 to 1024 hidden units) and complexity (RNN to LSTM). We record a maximum speedup in FP16 precision mode of 2.05x for V100 compared to the P100 in training mode – and 1.72x in inference mode. Those figures are many-fold below the expected performance for the V100 based on its hardware specifications,” reports Xcelerit, an Ireland-based provider of software tools for quantitative finance, engineering, and research.

The reason for this lower-than-expected performance, according to Xcelerit, is that the powerful Tensor Cores in the V100 are used only for matrix multiplications in half-precision (FP16) or mixed-precision mode. “Profiling the tested applications showed that matrix multiplications only account for around 20% of the overall training time in the LSTM case, and even lower in the other configurations. The other operations (e.g. softmax, scalar products, etc.) cannot use the powerful Tensor Cores. This is in contrast to the convolutional networks used for image recognition for example, where the runtime is dominated by large matrix multiplications and hence they can optimally leverage the Tensor Cores,” reports Xcelerit.
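The profiling result maps directly onto Amdahl’s law: if only ~20% of the runtime can use the Tensor Cores, the overall gain from them is tightly bounded no matter how fast the matmuls get. A small sketch illustrating this (the 8x per-matmul factor is an arbitrary assumption for illustration, not a number from the Xcelerit report):

```python
def amdahl_speedup(accelerated_fraction: float, factor: float) -> float:
    """Overall speedup when only a fraction of runtime is sped up (Amdahl's law)."""
    return 1.0 / ((1.0 - accelerated_fraction) + accelerated_fraction / factor)

# With matmuls at ~20% of LSTM training time, even an effectively infinite
# Tensor Core speedup on that portion caps the overall gain at 1/0.8 = 1.25x.
print(round(amdahl_speedup(0.20, 8.0), 3))   # assumed 8x faster matmuls
print(round(amdahl_speedup(0.20, 1e9), 3))   # limiting case: 1.25x
```

This is consistent with Xcelerit’s observation: most of the measured 2.05x must come from the V100’s other improvements (more SMs, higher memory bandwidth) rather than from the Tensor Cores themselves.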

It’s worth noting that both the P100 and V100 have been wildly successful, and there has been a flood of systems featuring the newer V100 introduced since mid-summer (see “Nvidia, Partners Announce Several V100 Servers”). Xcelerit reports, “While V100 displays impressive hardware improvements compared to P100, some deep learning applications, such as RNNs dealing with financial time series, might not be able to exploit the very specialized hardware in the V100, and hence will only get a limited performance boost.”

Link to Xcelerit report (Benchmarks: Deep Learning Nvidia P100 vs. V100 GPU): https://www.xcelerit.com/computing-benchmarks/insights/benchmarks-deep-learning-nvidia-p100-vs-v100-gpu/

Charts and V100/P100 specs: Xcelerit

The post V100 Good but not Great on Select Deep Learning Apps, Says Xcelerit appeared first on HPCwire.

Nuance and NVIDIA to Advance AI for Radiology

Mon, 11/27/2017 - 13:17

BURLINGTON, Mass., and SANTA CLARA, Calif., Nov. 27, 2017 –Nuance Communications, Inc. (NASDAQ: NUAN) and NVIDIA (NASDAQ: NVDA) today announced that they are working together to bring the power of machine learning to radiologists and data scientists working across the entire healthcare system.

Unveiled today at the Radiological Society of North America conference (RSNA) in Chicago, the Nuance AI Marketplace for Diagnostic Imaging combines the power of NVIDIA’s deep learning platform with Nuance’s PowerScribe radiology reporting and PowerShare image exchange network, used by 70 percent of all radiologists in the United States. This combination creates a unique end-to-end methodology that enables widespread development and rapid deployment of imaging AI models into the existing workflow of thousands of radiologists, helping them quickly detect key clinical findings and improve patient care.

“We stand on the edge of a new age in radiology, where artificial intelligence and machine learning will become a necessity in every radiologist’s essential toolkit,” said Dr. Luciano Prevedello, Division Chief of Medical Imaging Informatics at The Ohio State University Wexner Medical Center. “It is critical for the state of AI adoption and its potential to improve patient outcomes and operations that AI-based tools are more than just available – they must be valuable, validated and valued by the institution of radiology.”

With the AI Marketplace, Nuance will be the first company to bring together an ecosystem of researchers, developers, medical associations, hospitals and health IT companies, revolutionizing medical imaging with AI. The Marketplace will become a hub for thousands of medical-imaging AI applications that help radiologists interpret images, auto-populate reports and focus on the most important cases. It will create a ready market for researchers and developers, while making it easy for radiology departments to integrate multiple AI applications seamlessly into their existing workflow.

“Transforming the delivery of patient care and combating disease starts with the most advanced technologies being readily available when and where it counts – in every reading room, across the United States,” said Peter Durlach, senior vice president, Healthcare at Nuance. “Our AI Marketplace will bring together the leading technical, research and healthcare minds to create a collection of image processing algorithms that, when made accessible to the wide array of radiologists who use our solutions daily, has the power to exponentially impact outcomes and further drive the value of radiologists to the broader care team.”

“Medical imaging is an essential tool for delivering the best healthcare, and now we have the opportunity to massively enhance it with AI,” said Kimberly Powell, vice president of Healthcare at NVIDIA. “By working closely with Nuance, we are connecting the world’s AI developers to scalable and seamless deployment of AI applications for radiology.”

NVIDIA’s deep learning platform will power the training and publishing of applications to the Nuance AI Marketplace, as well as the deployment in the medical imaging workflow. NVIDIA’s DIGITS developer tool has a new feature to directly publish to the AI Marketplace, while NVIDIA’s TensorRT will provide low-latency, high-throughput inference for medical imaging. NVIDIA’s AI computing platform is available everywhere, which gives the AI Marketplace maximum flexibility, allowing hospitals to keep their data securely on premises to ensure strict confidentiality, or to take advantage of AI computing in the cloud.

Deep learning capabilities from NVIDIA augment intelligence and allow for faster, more accurate analysis and diagnosis. For example, NVIDIA’s deep learning platform can ingest and learn from normal and abnormal chest X-ray data and create an algorithm to detect and identify which images display pneumonia. Nuance can then integrate the images into different worklists, alerting radiologists to cases that should be prioritized.

Nuance and NVIDIA will show a live demo of the Nuance AI Marketplace for Diagnostic Imaging powered by NVIDIA at the RSNA conference. Nuance will be in South Hall A, Booth 2700, and NVIDIA will be in North Hall 3, Booth 8543.

About Nuance Communications, Inc.

Nuance Communications, Inc. (NASDAQ: NUAN) is a leading provider of voice and language solutions for businesses and consumers around the world. Its technologies, applications and services make the user experience more compelling by transforming the way people interact with devices and systems. Every day, millions of users and thousands of businesses experience Nuance’s proven applications. For more information, please visit www.nuance.com.


About NVIDIA

NVIDIA’s (NASDAQ: NVDA) invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI — the next era of computing — with the GPU acting as the brain of computers, robots and self-driving cars that can perceive and understand the world. More information at http://nvidianews.nvidia.com/.

Source: Nuance Communications

The post Nuance and NVIDIA to Advance AI for Radiology appeared first on HPCwire.

NIH Supercomputer Ranks No. 66 on Top500 List

Mon, 11/27/2017 - 13:04

FALLS CHURCH, Va., Nov. 27, 2017 – CSRA Inc. (NYSE:CSRA) announced today that the National Institutes of Health (NIH) Biowulf cluster has achieved a TOP500 ranking of No. 66. This distinction makes Biowulf the first supercomputer dedicated to health and biomedical research listed among the top 100 most powerful computers in the world.

Biowulf is designed to process the large number of simultaneous computations typical of genomics, image processing, statistical analysis, and other biomedical research areas. This achievement is the product of multiple phases of CSRA compute expansion for NIH. Additionally, CSRA has added 17 petabytes of storage, more than doubling storage capacity and increasing input/output (I/O) bandwidth to 200 gigabytes per second to support data-intensive applications.

“We are proud to expand NIH’s supercomputing capability in this latest update to Biowulf,” said Vice President Kamal Narang, head of CSRA’s Federal Health Group. “CSRA’s world-renowned HPC experts partnered with NIH to achieve this milestone and continue to empower NIH research. Together we can support the important mission of NIH to discover new cures and save lives.”

The Biowulf cluster’s dramatic increase in power enables NIH researchers to make important advances in biomedical fields that rely heavily on computation, such as whole-genome analysis, simulation of pandemic spread, analysis of human brain MRIs, and machine learning algorithms to study Alzheimer’s disease. Results from these analyses may enable new treatments for cancer, diabetes, heart conditions, infectious diseases, and mental health conditions.

Biowulf was profiled at 1.966 petaflops (1.966 thousand trillion floating-point operations per second) on the November 13, 2017 TOP500 list of supercomputing sites. It features:

  • Compute nodes from Hewlett Packard Enterprise
  • Intel processors and NVIDIA Graphics Processor Units (GPU)
  • Large-scale storage from Data Direct Networks
  • Infiniband interconnect components from Mellanox Technologies
  • Ethernet switches from Brocade Communication Systems
  • Cooling solution from the Capitol Power Group, including Motivair chilled doors

Biowulf includes:

  • 2,448 compute nodes
  • 96 GPU nodes
  • Enhanced Data Rate (EDR)/Fourteen Data Rate (FDR) Infiniband fabric
  • 30 petabytes of GPFS online storage

As the prime contractor, CSRA procured all of the components and managed the integration and installation of the equipment into the Biowulf system while collaborating with several industry partners. CSRA is also helping support the ongoing operation of the Biowulf cluster.

A leader in High Performance Computing (HPC) services, CSRA offers a wide variety of solutions for government customers to achieve important mission objectives. Last April, the company was awarded a $51 million contract to support the Environmental Protection Agency’s (EPA) HPC systems. CSRA is also a strong partner with NASA, supporting its Center for Climate Simulation (NCCS) HPC needs since 2000 – and NASA Ames’ HPC for an even longer period.

In addition to NIH, NASA, and the EPA, CSRA supports supercomputers used by NOAA, the CDC, and the Department of Defense. This technology is used for applications ranging from aerospace system design, climate and weather modeling, astrophysics, and ecosystems modeling to health and medical research.

For more information on CSRA’s High Performance Computing and storage projects, contact HPC.Sales@csra.com

About CSRA Inc. 

CSRA (NYSE: CSRA) solves our nation’s hardest mission problems as a bridge from mission and enterprise IT to Next Gen, from government to technology partners, and from agency to agency.  CSRA is tomorrow’s thinking, today. For our customers, our partners, and ultimately, all the people our mission touches, CSRA is realizing the promise of technology to change the world through next-generation thinking and meaningful results. CSRA is driving towards achieving sustainable, industry-leading organic growth across federal and state/local markets through customer intimacy, rapid innovation and outcome-based experience. CSRA has over 18,000 employees and is headquartered in Falls Church, Virginia. To learn more about CSRA, visit www.csra.com. Think Next. Now.

Source: CSRA Inc.

The post NIH Supercomputer Ranks No. 66 on Top500 List appeared first on HPCwire.

WekaIO’s High-Performance File System Now Available on AWS Marketplace

Mon, 11/27/2017 - 07:33

SAN JOSE, Calif., Nov. 27, 2017 — WekaIO, a leader in high-performance, scalable file storage for data-intensive applications, announced the latest version of its cloud-based scalable file system, WekaIO Matrix Version 3.1, which delivers an enterprise-class, feature-rich, high-performance file system for compute-intensive applications and technical computing workloads. The new snapshot functionality in Matrix Version 3.1 is designed to help users fully realize the promise of the cloud with a single-vendor solution: remote backup to Amazon S3, resource provisioning for cloud-bursting workflows, and a Pause/Resume feature to underpin a hybrid cloud strategy for cost-effective data archiving.

With Amazon S3, Matrix Version 3.1 supports cloud bursting, allowing users with on-premises compute clusters to elastically grow their environment in response to peak workload periods, or to move their entire high-performance workloads to the cloud. The snapshot-to-Amazon-S3 functionality simplifies periodic backup, delivering a single-vendor solution whereby an entire file system and its data can be backed up on Amazon S3 without the need to purchase additional backup tools. The remote cloud copy can be updated automatically as frequently as desired, without impacting application performance. It can also serve as an easy way to perform a “fire-drill” audit of the backup by simply rehydrating the file system on Amazon Web Services (AWS) as a test copy.

“There continues to be a strong trend within IT and research organizations to adopt a cloud strategy to accommodate the demands of high-performance applications,” said Sabina Joseph, Head of Global Storage Partnerships & Alliances at Amazon Web Services, Inc. “WekaIO has already demonstrated that high performance workloads can achieve outstanding performance on AWS through its SPEC SFS 2014 results. Leveraging AWS, customers can now benefit from a solution developed specifically to scale their businesses, making it easier to move high-performance applications to the cloud.”

WekaIO is fully self-service and provisionable on most Amazon Elastic Compute Cloud (Amazon EC2) instances with locally attached SSDs, and is ideal for analytics, genomics and HPC customers looking for a feature-rich, high-performance file system for their compute intensive applications running on AWS. The software can be automatically provisioned as a hyperconverged solution where applications and storage share the same instances, or as dedicated storage servers that connect other lower cost Amazon EC2 instances without local SSD storage.

Matrix Version 3.1’s snapshot-to-Amazon-S3 functionality incorporates a Pause/Resume feature that enables a cost-effective cloud-bursting and data-archiving strategy. Users may run Matrix Version 3.1 on Amazon EC2 instances for high application performance, stop the instances and pay only for cost-effective Amazon S3 storage, then resume computation later by spinning the Amazon EC2 instances back up.

“Using WekaIO Matrix Version 3.1 in conjunction with AWS allows enterprises to develop a true utility compute and storage model,” said Omri Palmon, Co-founder and Chief Product Officer at WekaIO. “The combined solution is designed to deliver extreme performance, high bandwidth and cloud scalability while delivering significant cost savings in CAPEX and OPEX.”

WekaIO will be demonstrating Matrix Version 3.1 at Booth #507 at AWS re:Invent 2017 in Las Vegas on Nov 27-Dec 1.

Get started by using WekaIO’s cluster planning calculator.


WekaIO Version 3.1 is available now on AWS Marketplace and through select resellers.

About WekaIO

WekaIO leapfrogs legacy infrastructures and improves IT agility by delivering faster access to data with software-centric storage solutions that accelerate application performance and unlock the true promise of the cloud. WekaIO Matrix software is ideally suited for performance-intensive workloads such as Web 2.0 application serving, financial modeling, life sciences research, media rendering, Big Data analytics, log management and government or university research. For more information, visit www.weka.io, email us at sales@weka.io, or watch our latest video here.

Source: WekaIO

The post WekaIO’s High-Performance File System Now Available on AWS Marketplace appeared first on HPCwire.

How you can Boost Acceleration with OpenCAPI, Today!

Mon, 11/27/2017 - 01:01

The challenges in silicon scaling and the demands of today’s data-intensive Artificial Intelligence (AI), High Performance Computing (HPC), and analytics workloads are forcing rapid growth in deployment of accelerated computing in our industry.  In addition, the next several years will see a new wave of disruptive memory technologies that will transform the economics of large memory deployments in support of these applications.  But this new wave of accelerators and disruptive technologies won’t add much value if the platform they’re running on wasn’t designed to unleash their potential.

OpenCAPI was developed to fuel this heterogeneous computing revolution by unleashing the potential of these new technologies!  New trends are emerging as accelerators become more commonplace and workloads are being re-written or developed from scratch with acceleration in mind.  Accelerators need improved access to the capacity and low cost per Gigabyte of system memory without the inefficiency of the IO subsystem; OpenCAPI achieves this by providing very high bandwidth with coherency.  The portion of the application running on the accelerator often requires fine-grained interaction with the portion of the application running on the CPU.  The programming complexity and CPU overhead required to communicate with a traditional I/O-attached accelerator makes this impractical, but OpenCAPI places the accelerator natively into the application’s user space to bring about this fine-grained interaction.  These trends led to the development of the OpenCAPI architecture.

To facilitate broad adoption, OpenCAPI was architected to minimize the amount and complexity of circuitry required in an accelerator; in the case of an FPGA, less than five percent of the logic is consumed. Placing the complexity in the CPU instead of the accelerator also allows OpenCAPI to be supported across all CPU architectures. Programming with OpenCAPI was also made easier with virtual addressing. The OpenCAPI architecture further enables a heterogeneous data center environment by supporting not only accelerators but also coherent network controllers and coherent storage controllers. In addition, OpenCAPI enables both advanced memory with a wide range of access semantics, from load/store to user-mode data transfer models, and access to classic DRAM memory with extremely low latencies.

With products launching now, OpenCAPI is becoming the open standard interface for high performance acceleration today.  As seen at SC17 in the OpenCAPI Consortium and development partners’ booths, there are a wide variety of OpenCAPI based products ranging from systems to components and additional hardware is being tested today in various laboratories.

Join a team that is driving to make a difference in our industry today! Start by visiting the OpenCAPI Consortium website at www.opencapi.org to learn more, including information about membership. You can also download the protocol specifications after a simple registration process, or stop by the OpenCAPI Consortium booth #1587 at SC17 for more details. The OpenCAPI Consortium is an open standards forum that is home to the OpenCAPI specifications and enablement efforts, with operating work groups covering the TL and DL protocols, PHY, enablement, and more. There are now over 30 members, many of which are engaged in product development that leverages OpenCAPI technology.

Development Partner Quotes and Blog Links

Companies that are making a difference today include the following:

Mellanox Technologies recently announced the Innova-2 FPGA-based Programmable Adapter, an OpenCAPI based solution for high-performance computing and deep learning applications. “We are happy to demonstrate our Innova-2 FPGA-based programmable adapter supporting the OpenCAPI interface at the Supercomputing Conference 2017,” said Gilad Shainer, vice president of marketing at Mellanox Technologies. “The deep collaborations among the OpenCAPI consortium members enable Mellanox to introduce OpenCAPI based solutions to the market in a short time, which will result in delivering innovative platforms for high-performance computing and deep learning applications.” SC17 booth #653.

Molex Electronic Solutions showcased at SC17 the Flash Storage Accelerator (FSA) development platform, which supports OpenCAPI natively and brings hyperconverged accelerated storage to Google’s and Rackspace’s Zaius/Barreleye-G2 POWER9 OCP platform. “FSA is designed to natively support the benefits of OpenCAPI by providing the lowest possible latency and highest bandwidth to NVMe storage, with the added benefits of OpenCAPI Flash functionality and near-storage FPGA acceleration,” said Allan Cantle, founder of Nallatech, which was recently purchased by Molex. “HPDA applications such as graph analytics, in-memory databases and bioinformatics are expected to benefit greatly from this platform.” SC17 booth #1263.

Alpha Data Inc. offers OpenCAPI enabled FPGA accelerator boards, featuring Xilinx UltraScale+ FPGAs on POWER9 for high-performance, compute intensive cognitive workloads. “Alpha Data’s Xilinx® FPGA based boards provide the highest performance per watt, lowest latency and simple programming methods for heterogeneous processing systems using POWER architecture,” said Adam Smith, Director of Marketing.  SC17 booth #1838.

Xilinx, Inc. is the leading accelerator platform for OpenCAPI enabled All-Programmable FPGAs, SoCs, and 3DICs.  “Xilinx is pleased to be the accelerator of choice furthering the adoption of the OpenCAPI interface which enables new datacenter and high performance computing workloads,” said Ivo Bolsens, Senior Vice President and Chief Technology Officer, Xilinx. SC17 booth #681.

Amphenol Corporation announced its OpenCAPI FPGA loopback cable and OpenCAPI cable. These enable testing and allow OpenCAPI accelerators to be connected in standard PCIe slots while signaling to the host processor through sockets attached to the main system board. “We are excited to work on OpenCAPI solutions by leveraging our interconnect technology for improved signal integrity and increased bandwidth,” said Greg McSorley, Business Development Manager.

IBM rolled out the CORAL program at SC17, demonstrating how acceleration is being leveraged by the U.S. Department of Energy’s Summit supercomputer. Summit is built from POWER9-based AC922 systems equipped with NVIDIA’s newest Volta-based Tesla GPU accelerators. “This system will be one of the fastest supercomputers in the world when fully operational next year,” said Brad McCredie, Vice President and IBM Fellow, Cognitive Systems Development. “It will push the frontiers of scientific computing, modeling and simulation.” SC17 booth #1525.

Western Digital is tracking OpenCAPI standards development while exploring OpenCAPI prototype memory and accelerator devices to standardize the process for key storage, memory and accelerator interfaces.  “OpenCAPI standardizes high speed serial, low latency interconnect for memory devices and accelerator devices, the key enablement technologies for new data center workloads focused on machine learning and artificial intelligence,” said Zvonimir Bandic, Sr. Director, Next Generation Platform Technologies, Western Digital.  SC17 booth #643.

Micron is working to unlock the next generation of memory technology with the development of new interface standards such as OpenCAPI in their current and future products. “Unlocking next generations of acceleration and machine learning will require the development of new interface standards such as OpenCAPI,” said Jon Carter, VP-Emerging Memory, Business Development.  “Micron continues to support these standards-setting activities to develop differentiated platforms that leverage Micron’s current and future products.” SC17 booth #1963.

Rackspace is at center stage in the OpenCAPI ecosystem, working with Google to make its two-socket POWER9 server, Zaius/Barreleye G2, an appealing development platform for accelerators in the Open Compute community. “The OpenCAPI accelerator and software ecosystem is growing rapidly,” said Adi Gangidi, Senior Design Engineer with Rackspace. “With design, manufacturing and firmware collateral available via the Open Compute website, accelerator developers find it easy to design and test their solutions on our platform.”

Tektronix offers test solutions applicable to OpenCAPI’s physical layer standards, capable of testing at 25Gbps and beyond, including best-in-class solutions for receiver and transmitter characterization, automation, and debug supporting data rates through 32Gb/s. “We are excited to be a Contributing Member of the OpenCAPI Consortium,” said Joe Allen, Market Segment Lead at Tektronix. “Tektronix provides unique tools that are widely used within the data center and are scalable for OpenCAPI, and we look forward to delivering comprehensive test and characterization solutions in this emerging market.”

Toshiba Electronic Devices & Storage Corp. is working on custom silicon design platforms that enable users to rapidly develop and deploy OpenCAPI based accelerator solutions in the computing, storage and networking space. “We are excited about the tremendous interest in custom silicon for machine learning and AI accelerators,” said Yukihiro Urakawa, Vice President, Logic LSI Division. “By offering pre-verified OpenCAPI sub-system IP in our custom silicon portfolio, we look forward to seeing very high performance, power optimized accelerators that take full advantage of the OpenCAPI interface.”

Wistron unveiled its POWER9 system design at SC17, which incorporates OpenCAPI technology through 25Gbps high-speed links.  “In order to provide the best backend architecture in AI, Big Data, and Cloud applications, Wistron POWER9 system design incorporates OpenCAPI technology through 25Gbps high speed link to dramatically change the traditional data transition method. This design not only improves GPU performance, but also utilizes next generation advanced memory, coherent network, storage, and FPGA. This is an ideal system infrastructure to meet next decade computing world challenges,” said Donald Hwang, Chief Technology Officer and President of EBG at Wistron Corporation.

Inventec introduced and demonstrated the Lan Yang system based on the Open Compute Project (OCP) platform.  “Two POWER9 processors and state of the art bus technology including OpenCAPI and PCIe Gen4 provides the basis for the most advanced technologies for 48V Open Power solutions,” said Lynn Chiu of Inventec.  “We took it one step further and added AST2500 for smart management and OCP Mezz 2.0 slots for expansion and heterogeneous infrastructure to support dedicated customer requirements for data center applications.”

The post How you can Boost Acceleration with OpenCAPI, Today! appeared first on HPCwire.

SC17 US Student Cluster Competition Teams: Defending the Home Turf

Fri, 11/24/2017 - 16:54

Nine US universities showed up to the SC17 Student Cluster Competition in an attempt to keep the trophies in the United States. Let’s use our video lens to get to know them a bit better….

Georgia Tech is a newbie team composed of budding computer scientists and other tech types. They’ve dubbed themselves “Team Swarm” in a nod to their school mascot, the yellow jacket. They’re mostly juniors and seniors, with the exception of one 5th year senior who took a redshirt year in order to, presumably, gain some more experience.

The team seems to be adjusting to the competition fairly well, but is facing some of the usual problems. The Yellow Jacket configuration isn’t quite what they need in order to compete in the upper echelon, but they’re moving ahead with determination. Check out the video for more details.

On another note, their team song still sucks….and it got stuck in my head again when they mentioned it.

University of Illinois Urbana-Champaign is a second time competitor at the big SC17 cluster dance. This team has some great experience: one member actually works on Blue Waters, and the others (with one exception) are seniors who are just about ready to graduate into the world of professional clustering.

The team feels pretty good about their performance, although they concede that they won’t have the highest benchmark scores. They’re happy that they put their best foot forward and feel that they got the most out of their box.

They also have been working on training their next team, with nine other members on the way to cluster stardom next year. What’s really gratifying about this team is that they all seem like they’re committed to careers in HPC, which is fantastic. We definitely need new blood in this industry and these kids are supplying it – hmm….that didn’t come out exactly like I hoped, but you get my point, right?

Northeastern University had to sit through two of my inane interviews because I screwed up the first one. Needless to say, they’re a patient bunch of students. Things have been going pretty smoothly with the NEU team, but they did do a last minute hardware change.

They were originally decked out with Radeon MI25 GPUs, which are pretty good at single-precision computation and great at machine learning. However, all of the benchmarks and applications at SC17 require double-precision. I did have an idea for the team, as you’ll see on the video. My idea? Run the apps in single precision but do them twice. The team wasn’t very impressed with my shortcut.

They ended up swapping their Radeon GPUs for four NVIDIA P100s, which was quite the last minute fire drill. Team Boston provided the NVIDIA cards out of their hardware stash. Nice job, Team Boston, way to help out another team.

San Diego State University/SW Oklahoma State University is another combination team that is competing together for the first time. When we catch up with the team in the video, they’re doing ok….but not great.

They’re having a problem getting MrBayes running on their GPUs, but they’re trying various workarounds to see if they can get it to work. In the meantime, they’re running the application on their CPU, although they’re pretty sure they won’t get close to finishing if they don’t get it running on GPUs.

The team had originally figured they’d be able to run a POWER8-based head node, storage node, and a couple of compute nodes. However, there’s no way that amount of hardware will run under the 3,000 watt power cap. So the team had to slim down their cluster considerably, as they discuss in the video.

William Henry Harrison High School is the first high school only team to compete at any major Student Cluster Competition. Who would have figured that we’d get to a place where a high school can compete credibly against university cluster teams? But WHHHS is doing just that, holding their own against older students who have much more experience and expertise.

When we got our cameras to their booth, the team was feeling good about its progress. They had just completed MrBayes and were driving on. The team was having a few problems with the Reproducibility task and MPAS, but felt confident that they would overcome the challenges.

One bad note was that the team wasn’t able to submit an HPCG score due to some unnamed problem. That’s not a good thing, but it’s not an unrecoverable error; they still have time and apps to make up for the shortfall on HPCG.

The Chicago Fusion Team is composed of students from the Illinois Institute of Technology, Maine South High School, and Adlai Stevenson High School. We couldn’t find the time to film a video introduction due to both of our busy schedules. However, this is a team that looks like a pretty solid competitor and has had very few problems as near as I can tell. We’ll see how they look when the benchmarks and apps are scored – that’ll tell the tale.

Team Boston is a frequent competitor in the Student Cluster Competitions. As usual, the Big Man, Evan DeNato (or is it DeNado?) is at the helm, leading the team as usual.

When we catch up with Team Chowder, they’re looking at their cloud budget and figuring out how to apportion it for the Born application. Boston is doing well on MrBayes and doesn’t seem to see any problems ahead of them.

In the video, we talk about various ways to present data (my favorite: spider chart). The team is also working on the mystery application (MPAS). In talking to them, we discover their HPL (LINPACK) benchmark would have been a record breaking score just a year ago, but this year? No such luck.

Team Texas is composed of students from University of Texas, Austin, and Texas State University. The team had just undergone the second of two (planned) power outages and was hard at work bringing up their cluster to complete the competition.

The team feels that their strongest application so far has been MPAS, the mystery application.

One of the students was all dressed up, having attended the job fair earlier in the day. He looks like a great candidate with a great attitude. I promised to post his resume in this video, but forgot. You can find his name in the video, if you have a position open and are looking for a motivated and knowledgeable employee with a great attitude.

Something a student said triggered me, and I went into my IEEE FP rant (it’s the shame of the computer industry) towards the end of the video, almost losing control when a student disagreed with me. But I managed to keep my rage bottled up so we could all part as friends.

Team Utah is looking to build upon their second place finish at SC16 last year. The newbie team burst onto the scene in a home court battle that saw them top all competitors but one. Can they do it again?

When our lenses catch up to them during SC17, the team seems calm and composed. They had a few problems setting up Born, but it’s running smoothly in the cloud at the time of the video. During the video, I discuss some personal GPU-enabling business with their HPL expert; I might be putting him to work at a later date.

The Utes have a couple of new faces, but the rest of the team competed at SC16 and are at least somewhat used to the pressure. I dangle the prospect of dropping out of college to participate in the underground cluster competitions that still exist in out-of-the-way locales. The team wasn’t buying it and figured they’d stay in school.

If you missed any of the other “Meet the Teams” features, you can find the European teams here, and the Asian team here.

Now that we’ve given the teams their fifteen minutes of video fame (more like 5 minutes, but who’s counting?), it’s time to see how they finished. In our next article, we’ll be exhaustively going over the results to see who won and how they won. Stay tuned……

The post SC17 US Student Cluster Competition Teams: Defending the Home Turf appeared first on HPCwire.

Long Flights to Cluster Fights: Meet the Asian Student Cluster Teams

Wed, 11/22/2017 - 18:36

Five teams from Asia traveled thousands of miles to compete at the SC17 Student Cluster Competition in Denver. Our cameras were there to meet ‘em, greet ‘em, and grill ‘em about their clusters and how they’re doing in the competition so far….


Team Nanyang got some great news on the second day of the competition:  They’d won the LINPACK Award and established a new Student Cluster Competition record score of 51.77 TFlop/s. This is way higher than the former record of 37.05 TFlop/s established just a few months before at ISC17.
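For a sense of scale, the jump over the ISC17 record works out like this (both figures are taken from the results above):

```python
# Student Cluster Competition LINPACK records, in TFlop/s (from the article).
isc17_record = 37.05   # previous record, set at ISC17
sc17_record = 51.77    # Team Nanyang's new SC17 record

improvement_pct = (sc17_record / isc17_record - 1) * 100
print(f"New record beats the old one by {improvement_pct:.0f}%")  # → 40%
```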

In the video, we talk about how Nanyang has been steadily improving their cluster competition performance. We also discuss how they pulled off that great LINPACK score (hint:  it has something to do with fans).

Team NTHU (Taiwan) is all smiles as we film their introductory video. We did a full intro of the team and found out  how they’re dividing up the work. They don’t seem to be having any problems with the applications, their system is running fine, and all is good.

NTHU has a heralded history in the SC cluster competition. They participated in the very first student cluster competition way back in 2007, taking home the LINPACK award. Team Taiwan also nabbed a couple of Overall Championship trophies back in 2010 and 2011. Although they haven’t been in the winner’s circle lately, they have to be considered one of the elite teams and a contender for the championship.

Peking University is competing in their second SC Student Cluster Competition. Team Peking turned in a solid performance in 2016, but is looking to do a whole lot better this year. In talking to the team, they were pleasantly surprised at the strength of their benchmark scores, as their system performed over spec.

The team is sporting 10 NVIDIA V100 accelerators on a single node configuration. Single-node? Huh? What’s the deal with that? How does a single-node system qualify as a cluster? Well, the team has answers for these questions, but you’ll have to watch the video to see it.

Tsinghua University is living in a pressure cooker. If they win the SC17 contest, they’ll be only the second team in history to earn the triple crown of student clustering, winning all three major tournaments (ASC, ISC, and SC) in a single year. The first team to accomplish this feat was another Tsinghua team back in 2015.

As you can see in the video, this is a seasoned and confident team. They’re not running their whole eight-node cluster, having decided to run only five nodes (with ten NVIDIA V100 GPUs).

Team Tsinghua sees Born as the most difficult application, with 1,136 ‘shots’ that have to be completed in order to win the task. In a typical Tsinghua move, they ported Born over to their GPUs, then optimized the bejesus out of it to bring the run time for each shot down to minutes rather than hours.

Next up, we’ll take a look at the US based teams participating in the SC17 Student Cluster Competition. Stay tuned for more….

The post Long Flights to Cluster Fights: Meet the Asian Student Cluster Teams appeared first on HPCwire.

Japan Unveils Quantum Neural Network

Wed, 11/22/2017 - 15:53

The U.S. and China are leading the race toward productive quantum computing, but it’s early enough that ultimate leadership is still something of an open question. The latest geo-region to throw its hat in the quantum computing ring is Japan. The nation will begin offering public access to a prototype quantum device over the internet for free starting Nov. 27 at https://qnncloud.com.

As reported by Japanese news outlets this week, Tokyo-based NTT along with the National Institute of Informatics and the University of Tokyo are working on a quantum computing device that exploits the properties of light. Backed with state investment, the quantum neural network (QNN) prototype is reported to be capable of prolonged operation at room temperature. The system consists of a 1km long optical fiber loop, a special optical amplifier called a PSA, and an FPGA. (See video below for a detailed explanation of how it all works.)

Source: NTT (YouTube)

The implementation, a type of Ising machine, is a departure from superconducting quantum efforts, which require exotic and expensive cooling apparatus. NTT’s prototype draws just 1 kW, about as much as an ordinary household appliance.

Unlike efforts from Google and IBM, this won’t be a universal quantum computer. The goal of the QNN is to find solutions to combinatorial optimization problems thousands of times faster than classical computers are able to (this is what Ising machines are theorized to excel at). Potential real-world use cases include easing traffic congestion, optimizing smart phone communications, and drug discovery. Project stakeholders are aiming to commercialize the system by March 2020 and are seeking participation from the community for testing and software development purposes.

“We will seek to further improve the prototype so that the quantum computer can tackle problems with near-infinite combinations that are difficult to solve, even by modern computers at high speed,” said project head Stanford University Professor Emeritus Yoshihisa Yamamoto.

Japan has quietly been building its quantum research portfolio and will kick off a ten-year, 30 billion yen ($267 million) quantum research program in April 2018.

For further background, see the Oct. 2016, Japan Science and Technology Agency (JST) announcement here: http://www.jst.go.jp/pr/announce/20161021/index.html

The Japanese news portal Nikkei has photos of the unveiling here.

The post Japan Unveils Quantum Neural Network appeared first on HPCwire.

Oakforest-PACS Ranks Number One on IO-500

Wed, 11/22/2017 - 15:00

Nov. 22 — The Joint Center for Advanced High Performance Computing (JCAHPC) announced that the storage performance of the Oakforest-PACS supercomputer ranked #1 in the first IO-500 list, released in November 2017. The IO-500 is a world ranking of storage performance, as evaluated by the IO-500 benchmark, which measures read/write bandwidth for large files and read/write/listing performance for small files. Storage performance in supercomputers is critical for large-scale simulation, big data analysis, and artificial intelligence. The IO-500 list encourages improvements in storage performance, which in turn helps systems make fuller use of their CPU performance.

The storage system of the Oakforest-PACS supercomputer comprises a parallel file system (DataDirect Networks ES14KX) and a file cache system (Infinite Memory Engine), the latter introduced to improve storage performance. On the file cache system, the IO-500 benchmark achieved 742 GiB/s[1] for file-per-process write access, in which each parallel process writes to its own file, and 600 GiB/s for single-shared-file write access, in which parallel processes write to different positions within one shared file; both are typical access patterns in high performance computing.
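The two access patterns named above differ only in where the data lands. Here is a minimal single-machine sketch in Python, standing in for what the benchmark’s parallel MPI-IO processes do (the process count, file names, and chunk size are illustrative):

```python
import os
import tempfile

NPROCS = 4            # stand-ins for parallel processes
CHUNK = b"x" * 1024   # each "process" writes 1 KiB

workdir = tempfile.mkdtemp()

# Pattern 1: file-per-process -- each rank writes to its own file.
for rank in range(NPROCS):
    with open(os.path.join(workdir, f"rank{rank}.dat"), "wb") as f:
        f.write(CHUNK)

# Pattern 2: single shared file -- each rank writes to its own
# non-overlapping offset within one file.
shared = os.path.join(workdir, "shared.dat")
with open(shared, "wb") as f:
    for rank in range(NPROCS):
        f.seek(rank * len(CHUNK))
        f.write(CHUNK)

# The shared file ends up exactly NPROCS chunks long.
assert os.path.getsize(shared) == NPROCS * len(CHUNK)
```

Shared-file writes are typically harder for a file system to handle at speed, since many writers contend for one file’s metadata and locks, which is why the benchmark reports the two patterns separately.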

Supporting Resources

More on the Joint Center for Advanced High Performance Computing (JCAHPC) –

More on the Information Technology Center, the University of Tokyo –

More on the Center for Computational Sciences, the University of Tsukuba –

More on IO-500 benchmark and list – http://io500.org/


JCAHPC is a world-leading supercomputer center jointly established by the Information Technology Center at the University of Tokyo and the Center for Computational Sciences at the University of Tsukuba. It has operated the Oakforest-PACS supercomputer, built by Fujitsu, since December 2016 to rapidly promote research and development in science and technology. The Oakforest-PACS supercomputer is provided as a shared resource for computational science research under programs of the High Performance Computing Infrastructure (HPCI) and each university. The Information Technology Center at the University of Tokyo and the Center for Computational Sciences at the University of Tsukuba contribute to further progress in computational science, big data analysis and artificial intelligence by operating the Oakforest-PACS supercomputer.

[1] GiB/s – a unit of file access performance equal to 1 GiByte of data accessed per second. 1 GiByte is 1024³ bytes, i.e., 1,073,741,824 bytes.
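The conversion in the footnote can be checked directly; the decimal-gigabyte comparison is added here for context and is not part of the source:

```python
GIBIBYTE = 1024 ** 3   # binary gibibyte, the unit used in the IO-500 results
GIGABYTE = 10 ** 9     # decimal gigabyte, for comparison

assert GIBIBYTE == 1_073_741_824  # matches the footnote

# A GiB is about 7.4% larger than a decimal GB, so 742 GiB/s ≈ 797 GB/s.
print(f"{(GIBIBYTE / GIGABYTE - 1) * 100:.1f}%")  # prints 7.4%
```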

Source: JCAHPC

The post Oakforest-PACS Ranks Number One on IO-500 appeared first on HPCwire.

Argonne Appoints Chief of Staff Megan Clifford

Wed, 11/22/2017 - 14:44

Nov. 22 — Megan Clifford has been named Chief of Staff at the U.S. Department of Energy’s (DOE) Argonne National Laboratory, effective Jan. 1, 2018.

Megan C. Clifford

Clifford joined Argonne in November 2013 as the Director of Strategy and Innovation for the Global Security Sciences division (GSS). She has developed and executed strategies and programs with multi-disciplinary and cross-institutional teams to address a range of energy and global security challenges. Clifford previously held a senior executive position at Booz Allen Hamilton in Washington, D.C., where she served on the leadership team responsible for growth and performance of the firm’s $790 million Justice and Homeland Security business.

Clifford will serve as an advisor to Laboratory Director Paul Kearns, helping him to maintain and grow the laboratory’s vital stakeholder relationships. Clifford’s strong leadership, sponsor engagement, business acumen and program development experience, along with her strategic background, are assets that position her to enable growth and achievement of the Argonne mission.

Clifford assumes the Chief of Staff role from Eleanor Taylor, who becomes Director of Board Relations for the University of Chicago on Dec. 4.

Argonne National Laboratory seeks solutions to pressing national problems in science and technology. The nation’s first national laboratory, Argonne conducts leading-edge basic and applied scientific research in virtually every scientific discipline. Argonne researchers work closely with researchers from hundreds of companies, universities, and federal, state and municipal agencies to help them solve their specific problems, advance America’s scientific leadership and prepare the nation for a better future. With employees from more than 60 nations, Argonne is managed by UChicago Argonne, LLC for the U.S. Department of Energy’s Office of Science.

The U.S. Department of Energy’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States and is working to address some of the most pressing challenges of our time. For more information, visit the Office of Science website.

Source: Argonne National Laboratory

The post Argonne Appoints Chief of Staff Megan Clifford appeared first on HPCwire.

David Womble Named ORNL AI Program Director

Wed, 11/22/2017 - 14:25

OAK RIDGE, Tenn., Nov. 21, 2017—The Department of Energy’s Oak Ridge National Laboratory has hired high-performance computing leader David Womble to direct its artificial intelligence (AI) efforts.

Womble began as AI Program Director on October 30. His responsibilities include guiding ORNL’s AI and machine learning strategy for high-performance computing (HPC); ensuring broad scientific impacts to the Department of Energy Office of Science mission; and long-range program planning and project leadership.

David Womble

In more than three decades in computing, Womble has won two R&D 100 awards and the Association for Computing Machinery’s Gordon Bell Prize, awarded each year “to recognize outstanding achievement in high-performance computing.”

Prior to joining ORNL, Womble spent 30 years at Sandia National Laboratories, where he served as a senior manager and Program Deputy for Advanced Simulation and Computing (ASC). The ASC program is responsible for developing and deploying modeling and simulation capabilities, including hardware and software, in support of Sandia’s nuclear weapons program.

During his tenure at Sandia, Womble made numerous contributions across the computing spectrum including in HPC, numerical mathematics, linear solvers, scalable algorithms, and I/O. He established the Computer Science Research Institute and led Sandia’s seismic imaging project in DOE’s Advanced Computational Technologies Initiative.

“Artificial intelligence and machine learning represent the next generation in data analytics and have tremendous potential in science and engineering and commercial applications,” Womble said. “I am excited to be part of ORNL’s world-class research team and look forward to spearheading the development and application of the laboratory’s new AI capabilities.”

ORNL currently boasts two R&D 100 awards and 10 patents related to its AI research, and the addition of Womble is one of several moves by the laboratory to build on that foundation to help shape the AI state of the art and apply it to problems of national and international importance.

The laboratory, which is currently home to Titan, the nation’s most powerful computer for science, will soon launch Summit, which thanks to its combination of IBM POWER9 CPUs and NVIDIA Volta graphics processing units (GPUs) is being billed as the world’s “smartest,” or AI-ready, HPC system. ORNL’s partnership with NVIDIA has also resulted in a series of Deep Learning workshops aimed at assisting the laboratory’s researchers in harnessing the power of deep learning to rapidly accelerate breakthroughs across the scientific spectrum. And the laboratory has recently partnered with the University of Tennessee’s Bredesen Center on a joint Data Science and Engineering Ph.D. program aimed at bringing an expanded data expertise to some of DOE’s most pressing problems.

UT-Battelle manages ORNL for the Department of Energy’s Office of Science. The Office of Science is the single largest supporter of basic research in the physical sciences in the United States, and is working to address some of the most pressing challenges of our time. For more information, please visit http://science.energy.gov/.

Source: ORNL

The post David Womble Named ORNL AI Program Director appeared first on HPCwire.

Perspective: What Really Happened at SC17?

Wed, 11/22/2017 - 14:05

SC is over. Now comes the myriad of follow-ups. Inboxes are filled with templated emails from vendors and other exhibitors hoping to win a place in the post-SC thinking of booth visitors. Attendees of tutorials, workshops and other technical sessions will be inundated with requests for feedback (please do provide feedback – this really helps the presenters and the SC committees). Mr Smarty-pants will have already sent all their personal post-SC emails to those who warranted specific feedback, following up on conversations in Denver while they are fresh in the minds of both parties. Ms Clever-clogs reasons that anything sent to US people before Thanksgiving will be ignored and so smugly holds back on follow-up emails until the first week of December.

The post-SC follow-ups are not limited to emails – HPC media and bloggers rush to assert their summaries of a week that is far too packed and varied to be fairly summarized. I’m no different and HPCwire has kindly agreed to publish my personal wrap-up of SC17.

Processor Wars

It got real.

Anticipated, hoped-for, hyped, and foretold for some time now, SC17 is when the competition undeniably broke free in the HPC processor space. There are now several very credible processor options for HPC systems – either available now, or baked into the roadmaps of HPC vendors for upcoming systems. Intel’s Skylake and its successors remain a strong option, GPUs (especially Nvidia) offer distinct advantages in some cases, while IBM’s Power9 and AMD’s EPYC are ready alternatives actively being considered by HPC buyers now. But it was the ARM family – notably Qualcomm and Cavium – that stole the show in terms of mindshare. The first benchmarks of Cavium’s ThunderX2, or TX2 as it became nicknamed, showed it competing well with the market leader Intel on benchmark performance. One set of benchmark charts from the GW4 Alliance was referred to several times a day across diverse meetings. EPYC and Power9 are also reputed to compete well on benchmarks. When taking pricing into account, the genuine choices for HPC processors look more open than for several years. Results from our (NAG’s) impartial HPC technology evaluation consulting for various customers support these conclusions. It will be interesting to see how this emerges into deployment reality as HPC procurements take place over the next few months.

It’s Official

Talking of processors, the rumors around the abandonment of Intel’s Knights Hill processor that have been circulating for a while were confirmed as true by Intel. (It seems to still be a secret how long those rumors have been running and the precise timing of Intel’s decisions.) However, Intel confirmed in various public statements that they still have roadmaps and plans for processors to support exascale supercomputers in the first year or two of the next decade. I’m constrained by confidentiality here, so I won’t say much more. Only that Intel’s sharing of long term roadmaps under strict NDA are very useful to those who get them, and we all understand that those roadmaps – and indeed all roadmaps – are subject to change. When helping our customers plan for large future HPC deployments, best guesses that might change are still much better than those vendors who are reluctant to share anything useful “because it might change”.

It’s Not Official

It has long been true that some of the most valuable parts of the SC week are actually those away from the official program. Vendor events on the first weekend (e.g., Intel DevCon, HPE HP-CAST, and more) have become staple parts of the SC week for some. The endless round of NDA vendor briefings in the hotels and restaurants near the convention center fill substantial portions of schedules for many of us at each SC. The networking receptions, evening and daytime, are essential to find out what is going on in the HPC world, to exchange ideas with peers, to unearth new solutions, and to spark collaborations. Sadly, the clashes of these many off-program events with valuable parts of the official program (e.g., workshops and tutorials) are getting harder to navigate. Simply put, the SC phenomenon has such a richness of opportunities that it is fundamentally impossible to participate in all of it.

Airline Points

Probably the second most popular topic of conversation in Denver was not actually HPC technology. I lost track of the number of conversations I was party to or overheard on the topic of travel. Airline and hotel points, “status”, misconnects, travel war stories (I don’t just mean United vs its passengers), and so on. Some of you will know I am personally a bit of a geek on the details of a range of airline and hotel “loyalty” programs and how to get the best out of them. At first this might seem a trivial topic for a supercomputing conference wrap-up, but I mention it because it actually highlights a critical aspect of HPC. The HPC community is a global one. Much of what we achieve in HPC is driven by multi-partner and international approaches. Some of it is (perceived) competition – the international race to exascale, economic competitiveness, etc. Much of the community aspect is collaboration – we deliver better when we work together on common issues – even across areas where there might be competitive undertones (e.g., early exascale capability). Conferences and workshops are a critical element in enabling the community’s diverse expertise and experience to drive the successful delivery and scientific impact of HPC. Plenty of travel thus becomes a natural indicator of the health of the HPC community.


Of course, the underlying reason we travel is to meet people. Ultimately, people are the most necessary and most differentiating component of any HPC capability. For this reason, when I do HPC strategy consulting, we look closely at the organizational and people planning aspects of the HPC services, as well as the technology elements. Thus, I was pleased to see the continued emphasis on the people aspects of SC. Explicit attention to diversity in the committees, presenters, and panels. A strong education and training program (including our three tutorials – HPC finance, HPC business cases, and HPC procurement). The student cluster competition. Mentoring programs. And more. The measure of success is surely when the SC community no longer feels the need for a dedicated Women-in-HPC agenda, nor a dedicated student program, etc., because those groups feel welcome and integral to all parts of the SC agenda. And, equally, the other side of the coin is when we have every part of the week’s activities enjoying a diverse participation. However, we are a long way from that end goal now and we need a clear plan to get there, which we can only achieve by keeping the issues prominent. NAG was delighted to support the Women-in-HPC networking reception at SC17 and I discussed this with several people there. I think the next step is to put in place actions to enable all individuals (whether they associate with the diversity issues, or are early career/student attendees, etc.) to be comfortable participating in the full breadth of the week’s activities, rather than focused into a subset of events.


Personally, I thought Denver was a great host city for SC and I look forward to returning in 2019. Next year SC18 is in Dallas – but don’t leave it that long to engage with your HPC community colleagues – there are plenty of excellent conferences and workshops before then. See you at one of the many HPC meetings in the coming months – and do reach out directly if I might be able to help with any free advice or mentoring.

Andrew Jones leads the international HPC consulting and services business at NAG. He is also active on twitter as @hpcnotes.

The post Perspective: What Really Happened at SC17? appeared first on HPCwire.

NCSA Announces GECAT Funding of Two International Seed Projects

Wed, 11/22/2017 - 11:35

Nov. 22, 2017 — The Global Initiative to Enhance @scale and Distributed Computing and Analysis Technologies (GECAT) project, led by the National Center for Supercomputing Applications (NCSA)’s NSF-funded Blue Waters Project, seeks to build connections across national borders as a way of improving global cyberinfrastructure for scientific advancement. The project has announced the funding of two seed projects that connect researchers on different continents to high performance computing resources that would not otherwise be attainable.

The first, newly funded project, “High-End Visualization of Coherent Structures and Turbulent Events in Wall-Bounded Flows with a Passive Scalar,” is led by Guillermo Araya (University of Puerto Rico) in collaboration with Guillermo Marin and Fernando Cucchietti (both of the Barcelona Supercomputing Center) and focuses on creating time-dependent, three-dimensional flow visualizations. As with most visual simulations, the amount of data required is massive and nearly impossible to sift through without robust parallel computing infrastructure. GECAT makes this visualization possible by connecting Dr. Araya at the University of Puerto Rico with one of Europe’s strongest HPC centers, the Barcelona Supercomputing Center, which provides the parallel processing necessary to perform these intricate, data-intensive visualizations at a speed that would otherwise be impossible.

The second, a continuation of a previous GECAT pilot project titled “Performance Scalability and Portability on Advanced HPC Systems,” features William Tang (PI) of Princeton University in collaboration with James Lin (PI) of Shanghai Jiao Tong University (SJTU), who seek to improve how code (specifically, the GTC-P code) runs on modern GPU/CPU systems. Deploying this code on systems like Pi, a supercomputer housed at SJTU, or NCSA’s Blue Waters system will in turn allow researchers to explore the code’s utility on GPU/CPU systems and find ways to make this and similar codes more scalable and portable. This collaboration, enabled by GECAT, could help researchers develop and share lessons learned about deploying application codes on GPU/CPU systems such as Pi at SJTU and Blue Waters at NCSA.


The Global Initiative to Enhance @scale and Distributed Computing and Analysis Technologies (GECAT) project is part of the National Science Foundation’s Science Across Virtual Institutes (SAVI) program and is an extension of the NSF-funded Blue Waters project, which provides access to one of the world’s most powerful supercomputers and enables investigators to conduct breakthrough computational and big data research. GECAT is led by William Kramer, Blue Waters project director and a research professor in the Department of Computer Science at the University of Illinois at Urbana-Champaign. John Towns, NCSA’s executive director for science and technology, is a co-principal investigator and will help connect GECAT to the Extreme Science and Engineering Discovery Environment (XSEDE) project.

GECAT supports International Virtual Research Organizations (IVROs) that promote and enable efficient collaboration among researchers and developers in multiple countries by seeding the development of new, innovative cyberinfrastructure features for international scientific collaboration. The project directly supports the participation of U.S. researchers (senior and junior faculty, postdoctoral fellows, graduate students) in workshops and in small research and development “seed” projects with international partners from multiple countries in Europe and Asia. The GECAT budget allows for small awards (travel, salary) to U.S. investigators attempting to create an IVRO, initially for one year with possible extensions for additional years. The funding is available to current XSEDE or Blue Waters users/PIs.

Source: NCSA

The post NCSA Announces GECAT Funding of Two International Seed Projects appeared first on HPCwire.

Turnaround Complete, HPE’s Whitman Departs

Wed, 11/22/2017 - 09:53

Having turned around the aircraft carrier the Silicon Valley icon had become, Meg Whitman is leaving the helm of a restructured Hewlett Packard Enterprise. Her successor, technologist Antonio Neri, will now guide what Whitman asserts is a more “nimble” company through the uncharted waters of unrelenting change and hyperscale disruption.

Whitman, 61, announced Tuesday (Nov. 21) she is resigning as CEO of Hewlett Packard Enterprise. She will remain on HPE’s board of directors. Neri will succeed her early next year.

Meg Whitman

Given Whitman’s recent acknowledgement of interest in heading the ride-sharing outfit Uber, her departure is abrupt but not unexpected.

Acknowledging her abiding interest in “disruptive” startups going back to her days at eBay, Whitman said in September that the timing of the Uber search was off and “in the end that wasn’t the right thing.” (By contrast, Whitman’s timing was perfect: Uber this week admitted it concealed a massive hack exposing the personal data of 57 million riders and drivers.)

“I have dedicated the last six years of my life to this company and there is more work to do and I am here to help make this company successful and I am excited about the new strategy. So, lots more work to do and I actually am not going anywhere,” she said in response to a query from Toni Sacconaghi, an analyst with Sanford C. Bernstein.

In fact, Whitman was contemplating her departure.

As HPE struggles to make the move from traditional IT hardware manufacturer to a provider of hybrid IT such as software-defined datacenters, edge computing and networking equipment (e.g., Internet of Things), Whitman asserted Tuesday, “The next CEO of this company needs to be a deeper technologist, and that’s exactly what Antonio is.” Neri, who currently serves as HPE’s president, will become CEO on Feb. 1.

Antonio Neri

Whitman’s accomplishment was reorienting a conglomerate that had only been getting bigger through disastrous moves like its 2011 acquisition of U.K. software vendor Autonomy. The result of her financial reorganization is a stable IT vendor capable of competing against rivals like Cisco Systems (NASDAQ: CSCO) and IBM (NYSE: IBM), which face the same competitive pressures.

Whitman took credit this week for making HPE “far more nimble, far more agile.”

Neri, an engineer by training, has been with Hewlett Packard for 22 years. “Antonio is going to lead the next phase of value creation,” Whitman asserted during a conference call with analysts.

Neri has lately assumed a higher profile in spearheading acquisitions and cloud collaborations. For example, he announced a partnership with Rackspace earlier this month to offer an OpenStack private cloud service running on “pay-per-use” infrastructure.

Meanwhile, company watchers reckoned HPE had a decent fourth quarter, expanding at a 5-percent clip. All-flash arrays were up 16 percent and networking equipment sales rose a healthy 21 percent over the same period last year. Server sales continued their steady decline, down 5 percent year-on-year. “HPE has gotten lean and mean and now the company needs to drive growth. This is what everyone is looking for,” noted technology analyst Patrick Moorhead.

HPE’s stock (NYSE: HPE) dropped more than one percent in reaction to Whitman’s departure.

The post Turnaround Complete, HPE’s Whitman Departs appeared first on HPCwire.

Inspur Wins Contract for NVLink V100 Based Petascale AI Supercomputer from CCNU

Wed, 11/22/2017 - 08:42

DENVER, Nov. 22, 2017 — On November 16 (US Mountain Time), Inspur announced at SC17 that it has been awarded a contract to design and build a petascale AI supercomputer based on “NVLink + Volta” for Central China Normal University (CCNU), part of the university’s ongoing research efforts in frontier physics and autonomous-driving AI.

The supercomputer will comprise 18 Inspur AGX-2 servers as computing nodes, 144 of Nvidia’s latest Volta-architecture Tesla V100 GPUs supporting NVLink 2.0, and the latest Intel Xeon Scalable (Skylake) processors. It will run Inspur ClusterEngine, AIStation and other cluster management suites, with high-speed interconnection via Mellanox EDR InfiniBand. The peak performance of the system will reach 1 petaflops. With NVLink 2.0 and the Tesla V100 GPU, the system will be able to support both HPC and AI computing simultaneously.
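The quoted node count and peak figure line up with a quick back-of-envelope check. The sketch below is not from the announcement; it assumes roughly 7.5 teraflops of double-precision peak per Tesla V100 (Nvidia quotes figures in the 7-7.8 TFLOPS range depending on clocks) and ignores the CPUs’ contribution.

```python
# Back-of-envelope check of the CCNU system's quoted ~1 PFLOPS peak.
# Assumed (not from the article): ~7.5 TFLOPS FP64 per Tesla V100;
# the Xeon hosts' contribution is ignored for simplicity.

V100_FP64_TFLOPS = 7.5   # assumed per-GPU double-precision peak
GPUS_PER_NODE = 8        # each AGX-2 holds eight V100s
NODES = 18               # 18 AGX-2 servers

total_gpus = GPUS_PER_NODE * NODES                  # 144 GPUs, as stated
peak_pflops = total_gpus * V100_FP64_TFLOPS / 1000  # TFLOPS -> PFLOPS

print(f"{total_gpus} GPUs -> ~{peak_pflops:.2f} PFLOPS peak")
```

Under these assumptions the GPUs alone land at about 1.08 PFLOPS, consistent with the announced 1-petaflops figure.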

The Inspur AGX-2 is billed as the world’s highest-density AI supercomputer, supporting eight NVIDIA Tesla V100 GPUs with NVLink 2.0 in a 2U form factor. NVLink 2.0 provides faster links between the GPUs, with a bi-section bandwidth of 300 GB/s. The AGX-2 also features strong I/O expansion capabilities, supporting eight NVMe/SAS/SATA hot-swap drives and up to four EDR InfiniBand HCAs, and supports both air cooling and on-chip liquid cooling to optimize power efficiency and performance.

The AGX-2 can significantly improve HPC computing efficiency, delivering 60 teraflops of double-precision performance per server. For VASP, software used extensively in physics and materials science, the AGX-2’s performance with one P100 GPU equals that of an eight-node cluster of mainstream two-socket CPU servers. The NVLink fabric in the AGX-2 also delivers excellent parallel efficiency across multiple GPU cards, with four P100 GPUs in parallel reaching the performance of nearly 20 mainstream two-socket CPU nodes.

For AI computing, the Tesla V100s in the AGX-2 are equipped with Tensor Cores for deep learning, achieving up to 120 teraflops to greatly improve the training performance of deep learning frameworks with NVLink 2.0 enabled. In deep learning training on the ImageNet dataset, the AGX-2 shows excellent scalability: configured with eight V100s, it delivers 1,898 images/s when training the GoogLeNet model with TensorFlow, seven times faster than a single card and 1.87 times faster than a P100 system in the same configuration.
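The quoted 8-GPU throughput and speedup imply a scaling efficiency that is easy to derive. In the sketch below, only the 1,898 images/s figure and the 7x speedup come from the announcement; the single-card rate and the efficiency are computed from them, not quoted.

```python
# Derive per-GPU throughput and scaling efficiency from the quoted
# GoogLeNet/TensorFlow numbers: 1898 images/s on 8x V100, 7x over one card.

eight_gpu_rate = 1898.0   # images/s on 8 GPUs (from the article)
speedup = 7.0             # 8-GPU vs. 1-GPU speedup (from the article)
gpus = 8

single_gpu_rate = eight_gpu_rate / speedup  # derived, not quoted
efficiency = speedup / gpus                 # fraction of ideal linear scaling

print(f"~{single_gpu_rate:.0f} images/s on one GPU, "
      f"{efficiency:.1%} scaling efficiency")
```

Under these figures a single V100 sustains roughly 271 images/s, and the eight-GPU configuration achieves 87.5 percent of ideal linear scaling.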

Central China Normal University plans to further upgrade the AI supercomputer to a multi-petaflops system.

About Inspur

Inspur is a leader in intelligent computing, ranked among the top four server manufacturers worldwide. The company provides cutting-edge hardware design and extensive AI product solutions, delivering purpose-built servers and AI solutions that are Tier 1 in quality and energy efficiency. Inspur’s products are optimized for applications and workloads built for data center environments. To learn more, visit http://www.inspursystems.com.

Source: Inspur

The post Inspur Wins Contract for NVLink V100 Based Petascale AI Supercomputer from CCNU appeared first on HPCwire.

InfiniBand Accelerates the World’s Fastest Supercomputers

Tue, 11/21/2017 - 19:52

BEAVERTON, Ore., Nov. 21, 2017 — The InfiniBand Trade Association (IBTA), a global organization dedicated to maintaining and furthering the InfiniBand specification, today highlighted the latest TOP500 List, which shows that the world’s first- and fourth-fastest supercomputers are accelerated by InfiniBand. The results also show that InfiniBand continues to be the most used high-speed interconnect on the TOP500 List, reinforcing its status as the industry’s leading high performance interconnect technology. The updated list reflects continued demand for InfiniBand’s unparalleled combination of network bandwidth, low latency, scalability and efficiency.

InfiniBand connects 77 percent of the new High Performance Computing (HPC) systems added since the June 2017 list, eclipsing the 55 percent gain from the previous six-month period. This upward trend indicates increasing InfiniBand usage by HPC system architects designing new clusters to solve larger, more complex problems. Additionally, InfiniBand is the preferred fabric of the leading Artificial Intelligence (AI) and Deep Learning systems currently featured on the list. As HPC demands continue to evolve, especially for AI and Deep Learning applications, the industry can rely on InfiniBand to meet its rigorous network performance and scalability needs.

The latest TOP500 List also featured positive developments for RDMA over Converged Ethernet (RoCE) technology. All 23 systems running Ethernet at 25Gb/s or higher are RoCE capable. We expect the number of RoCE enabled systems on the TOP500 List to rise as more systems look to take advantage of advanced high-speed Ethernet interconnects for further performance and efficiency gains.

“InfiniBand being the preferred interconnect for new HPC systems shows the increasing demand for the performance it can deliver. Its place at #1 and #4 is an excellent example of that performance,” said Bill Lee, IBTA Marketing Working Group Co-Chair. “Besides delivering world-leading performance and scalability, InfiniBand guarantees backward and forward compatibility, ensuring users the highest return on investment and future-proofing their data centers.”

The TOP500 List (www.top500.org) is published twice per year and ranks the top supercomputers worldwide based on the LINPACK benchmark rating system, providing valuable statistics for tracking trends in system performance and architecture.

About the InfiniBand Trade Association

The InfiniBand Trade Association was founded in 1999 and is chartered with maintaining and furthering the InfiniBand and the RoCE specifications. The IBTA is led by a distinguished steering committee that includes Broadcom, Cray, HPE, IBM, Intel, Mellanox Technologies, Microsoft, Oracle and QLogic. Other members of the IBTA represent leading enterprise IT vendors who are actively contributing to the advancement of the InfiniBand and RoCE specifications. The IBTA markets and promotes InfiniBand and RoCE from an industry perspective through online, marketing and public relations engagements, and unites the industry through IBTA-sponsored technical events and resources. For more information on the IBTA, visit www.infinibandta.org.

Source: InfiniBand Trade Association

The post InfiniBand Accelerates the World’s Fastest Supercomputers appeared first on HPCwire.

Five from ORNL Elected Fellows of American Association for the Advancement of Science

Tue, 11/21/2017 - 16:05

OAK RIDGE, Tenn., Nov. 21, 2017 — Five researchers at the Department of Energy’s Oak Ridge National Laboratory have been elected fellows of the American Association for the Advancement of Science (AAAS).

AAAS, the world’s largest multidisciplinary scientific society and publisher of the Science family of journals, honors fellows in recognition of “their scientifically or socially distinguished efforts to advance science or its applications.”

Budhendra Bhaduri, leader of the Geographic Information Science and Technology group in the Computational Sciences and Engineering Division, was elected by the AAAS section on geology and geography for “distinguished contributions to geographic information science, especially for developing novel geocomputational approaches to create high resolution geographic data sets to improve human security.”

Bhaduri’s research focuses on novel implementation of geospatial science and technology, namely the integration of population dynamics, geographic data science and scalable geocomputation to address the modeling and simulation of complex urban systems at the intersection of energy, human dynamics and urban sustainability. He is also the director of ORNL’s Urban Dynamics Institute, a founding member of the DOE’s Geospatial Sciences Steering Committee and was named an ORNL corporate fellow in 2011.

Sheng Dai, leader of the Nanomaterials Chemistry group in the Chemical Sciences Division, was elected by the AAAS section on chemistry for “significant and sustained contribution in pioneering and developing soft template synthesis and ionothermal synthesis approaches to functional nanoporous materials for energy-related applications.”

Dai’s research group synthesizes and characterizes novel functional nanomaterials, ionic liquids and porous materials for applications in catalysis, efficient chemical separation processes and energy storage systems. He is the director of the Fluid Interface Reactions, Structures and Transport (FIRST) Center, a DOE Energy Frontier Research Center, and was named an ORNL corporate fellow in 2011.

Mitchel Doktycz, leader of the Biological and Nanoscale Systems Group in the Biosciences Division, was elected by the AAAS section on biological sciences for “distinguished contributions to the field of biological sciences, particularly advancing the use of nanotechnologies for characterizing and interfacing to biological systems.”

Doktycz is also a researcher at ORNL’s Center for Nanophase Materials Sciences and specializes in the development of analytical technologies for post-genomics studies, molecular and cellular imaging techniques and nanomaterials used to study and mimic biological systems. He holds a joint faculty appointment in the UT-ORNL Bredesen Center for Interdisciplinary Research and Graduate Education and the Genome Science and Technology Program at the University of Tennessee, Knoxville.

Bobby G. Sumpter, deputy director of the Center for Nanophase Materials Sciences (CNMS), was elected by the AAAS section on physics for “distinguished contributions to the field of computational and theoretical chemical physics, particularly for developing a multifaceted approach having direct connections to experimental research in nanoscience and soft matter.”

Sumpter’s research combines modern computational capabilities with chemistry, physics and materials science for new innovations in soft matter science, nanomaterials and high-capacity energy storage. He is the leader of both the Computational Chemical and Materials Science Group in the Computational Sciences and Engineering Division and the Nanomaterials Theory Institute at CNMS, which is a DOE Office of Science User Facility. He was named an ORNL corporate fellow in 2013, is chair of the Corporate Fellows Council and holds a joint faculty appointment in the UT-ORNL Bredesen Center.

Robert Wagner, director of the National Transportation Research Center in the Energy and Transportation Science Division, was elected by the AAAS section on engineering for “distinguished contributions to the fields of combustion and fuel science, particularly for seminal research on combustion instabilities and abnormal combustion phenomena.”

Wagner is the lead of the Sustainable Mobility theme for ORNL’s Urban Dynamics Institute and the co-lead of the DOE’s Co-Optimization of Fuels and Engines Initiative, which brings together the unique research and development capabilities of nine national labs and industry partners to accelerate the introduction of efficient, clean, affordable and scalable high-performance fuels and engines. He also holds a joint faculty appointment in the UT-ORNL Bredesen Center and is a fellow of the Society of Automotive Engineers International and the American Society of Mechanical Engineers.

The new fellows will be formally recognized in February at the 2018 AAAS Annual Meeting in Austin, Texas.

ORNL is managed by UT-Battelle for the Department of Energy’s Office of Science, the single largest supporter of basic research in the physical sciences in the United States. DOE’s Office of Science is working to address some of the most pressing challenges of our time. For more information, please visit http://science.energy.gov/.

Source: ORNL

The post Five from ORNL Elected Fellows of American Association for the Advancement of Science appeared first on HPCwire.

HPE Announces Antonio Neri to Succeed Meg Whitman as CEO

Tue, 11/21/2017 - 15:51

PALO ALTO, Calif., Nov. 21, 2017 — Hewlett Packard Enterprise today announced that, effective February 1, 2018, Antonio Neri, current President of HPE, will become President and Chief Executive Officer, and will join the HPE Board of Directors.  Meg Whitman, current Chief Executive Officer, will remain on the HPE Board of Directors.

“I’m incredibly proud of all we’ve accomplished since I joined HP in 2011.  Today, Hewlett Packard moves forward as four industry-leading companies that are each well positioned to win in their respective markets,” said Meg Whitman, CEO of HPE. “Now is the right time for Antonio and a new generation of leaders to take the reins of HPE. I have tremendous confidence that they will continue to build a great company that will thrive well into the future.”

Meg Whitman was appointed President and CEO of HP in September 2011.  Since then, she has executed against a five-year turnaround strategy that has repositioned the company to better compete and win in today’s environment.  Under her leadership, the company rebuilt its balance sheet, reignited innovation, strengthened operations and improved customer and partner satisfaction.  It also made strategic moves to focus and strengthen its portfolio, most notably its separation from HP Inc., which was the largest corporate separation in history.  She also led the subsequent spin off and mergers of HPE’s Enterprise Services and Software businesses, as well as strategic acquisitions including Aruba, SGI, SimpliVity and Nimble Storage.

Under Whitman’s leadership, significant shareholder value has been created, including nearly $18 billion in share repurchases and dividends.  Since the birth of HPE on November 2, 2015, the company has delivered a total shareholder return of 89 percent, which is more than three times that of the S&P 500.

“During the past six years, Meg has worked tirelessly to bring stability, strength and resiliency back to an iconic company,” said Pat Russo, Chairman of HPE’s Board of Directors. “Antonio is an HPE veteran with a passion for the company’s customers, partners, employees and culture. He has worked at Meg’s side and is the right person to deliver on the vision the company has laid out.”

Neri, 50, joined HP in 1995 as a customer service engineer in the EMEA call center.  He went on to hold various roles in HP’s Printing business and then to run customer service for HP’s Personal Systems unit.  In 2011, Neri began running the company’s Technology Services business, then its Server and Networking business units, before running all of Enterprise Group beginning in 2015.  As the leader for HPE’s largest business segment, comprising server, storage, networking and services solutions, Neri was responsible for setting the R&D agenda, bringing innovations to market, and go-to-market strategy and execution.  Neri was appointed President of HPE in June 2017.  In addition to leading the company’s four primary lines of business, as President, Neri has been responsible for HPE Next, a program to accelerate the company’s core performance and competitiveness.

“The world of technology is changing fast, and we’ve architected HPE to take advantage of where we see the markets heading,” said Antonio Neri, President of HPE. “HPE is in a tremendous position to win, and we remain focused on executing our strategy, driving our innovation agenda, and delivering the next wave of shareholder value.”

HPE’s strategy is based on three pillars.  First, making Hybrid IT simple through its offerings in the traditional data center, software-defined infrastructure, systems software, private cloud and through cloud partnerships.  Second, powering the Intelligent Edge through offerings from Aruba in Campus and Branch networking, and the Industrial Internet of Things (IoT) with products like Edgeline and its Universal IoT software platform. Third, providing the services that are critical to customers today, including Advisory, Professional and Operational Services.

About Hewlett Packard Enterprise

Hewlett Packard Enterprise is an industry-leading technology company that enables customers to go further, faster. With the industry’s most comprehensive portfolio, spanning the core data center to the cloud to the intelligent edge, our technology and services help customers around the world make IT more efficient, more productive and more secure.

Source: HPE

The post HPE Announces Antonio Neri to Succeed Meg Whitman as CEO appeared first on HPCwire.