HPC Wire

Since 1987 - Covering the Fastest Computers in the World and the People Who Run Them

iRODS Consortium Announces University of Groningen as Newest Member

Tue, 01/30/2018 - 10:06

Jan. 30, 2018 — The University of Groningen (UG) Center for Information Technology (CIT) is the newest member of the iRODS Consortium, the membership-based organization that leads efforts to develop, support, and sustain the integrated Rule-Oriented Data System (iRODS).

UG, a research university with a global outlook, is deeply rooted in the northern Netherlands town of Groningen, known as the City of Talent. The University ranks among the top 100 in several important ranking lists. It has about 30,000 students, both local and international, and employs 5,500 full-time faculty and staff. Its Center for Information Technology (CIT) serves as the university’s IT center and promotes the sophisticated use of IT in higher education and research. CIT’s 200 employees manage the IT facilities and support processes for all students and staff members.

“For years there was a need for technical solutions for various problems in the field of research data management that researchers are running into,” said Jonas Bulthuis, IT consultant at the CIT. “iRODS offers building blocks with which we can offer solutions for managing research data.”

UG technology experts envision that in a few years every researcher at the university and the university hospital will use iRODS-based storage. This requires iRODS to be rich in features, easy to use, and cost-effective. Easy-to-use data management and data discovery through iRODS will allow researchers to focus on their work without investing too much time in technical IT skills or having to worry about the technical requirements related to privacy regulations (such as the European Union’s General Data Protection Regulation (GDPR), which takes effect in May).

The UG aims to offer a standardized research environment, in which half of researchers will be able to do their work, while the other half of the UG research community will be offered customized solutions. To achieve its goals, the CIT is building a team with developers, technical specialists, analysts, and researchers.

“With CIT as member of the iRODS Consortium, we would like to support the further development of iRODS because iRODS is essential to what we do,” said Haije Wind, technical director of the CIT.

“The University of Groningen is an important, respected research university, and we look forward to giving them the data infrastructure to manage, share, store, and keep their data safe and compliant,” said Jason Coposky, executive director of the iRODS Consortium. “They will add another important voice to our community that steers the continued development of iRODS.”

In addition to UG, iRODS Consortium members include Bayer, Dell/EMC, DDN, HGST, IBM, Intel, MSC, the U.S. National Institute of Environmental Health Sciences, OCF, RENCI, the Swedish National Infrastructure for Computing, University College London, Utrecht University, and the Wellcome Trust Sanger Institute. Consortium members direct the technology and governance of iRODS, which is used by thousands of businesses, research institutes, universities, and governments worldwide. Consortium members also receive priority access to support, training, and consulting.

For more on iRODS and the iRODS Consortium, visit the iRODS website.

About the University of Groningen

The University of Groningen has a rich academic tradition dating back to 1614. From this tradition arose the first female student and the first female lecturer in the Netherlands, the first Dutch astronaut and the first president of the European Central Bank. The university is also home to Ben Feringa, a professor of organic chemistry who won the 2016 Nobel Prize in chemistry for his work on the development of molecular machines. University students are challenged to excel, burgeoning talent is cultivated, and the keyword is quality. The University is committed to actively cooperating with its partners in society, with a special focus on its research themes of Healthy Aging, Energy, and Sustainable Society.

Source: iRODS Consortium

The post iRODS Consortium Announces University of Groningen as Newest Member appeared first on HPCwire.

KAUST Supercomputing Core Laboratory and ANSYS to Host Second Workshop

Tue, 01/30/2018 - 08:56

Jan. 30, 2018 — The KAUST Supercomputing Core Laboratory, together with ANSYS and Fluid Codes, the ANSYS channel partner for the Middle East, is co-organizing a one-day workshop on ANSYS-based engineering applications using Shaheen II on Thursday, February 8, 2018. ANSYS is a leader in developing engineering simulation software.

What: 2nd KAUST-ANSYS Workshop on “ANSYS Based Engineering Applications Using Shaheen II”

When: Thursday, February 8, 2018, 9:00 a.m. — 4:00 p.m.

Where: Computer Lab Room #3134, Level 3, University Library

At the workshop, you can find out more about:

  • Shaheen II and how ANSYS software can be used on this world class supercomputer
  • Presentations about ANSYS CFD
  • Presentations about ANSYS Mechanical
  • Impact of HPC in reducing the time to solution – some success stories
  • ANSYS Discovery Live – Real-time design-simulation made easy with GPUs
  • Reduced Order Modeling
  • Hands-on demos on using ANSYS software

The first KAUST-ANSYS Workshop was held at KAUST on April 16, 2017. In addition to KAUST participants, many participants from Saudi industry and universities attended that workshop. You can find more details on that event here. The objective of this workshop is to engage industrial partners and to educate engineering students.


Jose Ramon Rodriguez has been the Fluid Codes Technical Manager since January 2015. He previously worked for five years as an application engineer at ANSYS Inc. in its French and Spanish offices. While at ANSYS, he performed more than 100 FEA and CFD simulation analyses and spent more than three years on the ANSYS Field Testing Team, helping to improve the ANSYS software. During the first two years of his career, Jose Ramon was an R&D engineer at a renewable energy company in Spain and Italy, where he performed FEA and CFD simulations for solar plant machines. He holds an M.Sc. degree in Mechanical Engineering from the Technical University of Cartagena (Spain).

Rooh Khurram is a Staff Scientist at the Supercomputing Core Lab at KAUST. He has conducted research in finite element methods, high performance computing, multiscale methods, fluid structure interaction, detached eddy simulations, in-flight icing, and computational wind engineering. He has over 15 years of industrial and academic experience in CFD. He specializes in developing custom-made computational codes for industrial and academic applications. His industrial collaborators include Boeing, Bombardier, Bell Helicopter, and Newmerical Technologies Inc. Before joining KAUST in 2012, Rooh worked at the CFD Lab at McGill University and the National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign. Rooh received his Ph.D. from the University of Illinois at Chicago in 2005. In addition to a Ph.D. in Civil Engineering, Rooh has degrees in Mechanical Engineering, Nuclear Engineering, and Aerospace Engineering.

Click here to register for the event. Please note that seating for this hands-on event is limited.

Source: KAUST Supercomputing Core Laboratory


Nor-Tech Receives Major High Performance Computing Cluster Order for LIGO Project

Tue, 01/30/2018 - 08:35

MINNEAPOLIS, Minn., Jan. 30, 2018 — Nor-Tech, one of the primary high performance technology suppliers for the Nobel Physics Prize-winning LIGO Gravitational Wave Project, just announced that it has received another major order for high performance cluster updates from the University of Wisconsin, Milwaukee (UWM). This is in addition to a related in-process project from another LIGO institution.

UWM just finalized an order with Nor-Tech for a turnkey HPC cluster solution with nearly 100 Intel dual-CPU Skylake compute nodes and nearly 2,500 compute cores. In all, the order includes new racks, clean cable management and extensive testing. Nor-Tech will also provide onsite assistance with un-boxing and racking.

After a 20-year search, the first gravitational wave observation was officially announced on Feb. 11, 2016 by the LIGO team. Gravitational waves were picked up by the two LIGO detectors at Livingston, La. and Hanford, Wash. The Nobel Prize was awarded a year later. This is the first physical evidence of a phenomenon Albert Einstein predicted in 1915 in his General Theory of Relativity. LIGO is also the largest and most ambitious project ever funded by the National Science Foundation.

The LIGO team is an international consortium of leading technicians, engineers, scientists, and research institutions. Nor-Tech has been working with several major LIGO consortium members for more than 11 years on the project by designing, building, and upgrading a number of the clusters that made the original detection and subsequent detections possible.

A major service Nor-Tech is providing to LIGO team members is the availability of Nor-Tech’s demo cluster—which enables no-risk testing of the latest high performance hardware and software, including Intel’s Skylake. The no-cost, no-strings demo cluster is available to all of Nor-Tech’s clients and prospects.

Nor-Tech Executive Vice President and General Manager Jeff Olson said, “We continue to be humbled by the astounding achievements of the LIGO team. Our goal has been and always will be to provide support with the latest and best technology. In this case, the addition of Skylake processor functionality will significantly improve performance of UWM’s existing Nor-Tech cluster.”

In addition to seven LIGO supercomputers around the world, Nor-Tech continues to build powerful, leading-edge HPC clusters for premier research institutions. The company takes the intimidation factor out of high performance computing by delivering a turnkey solution and promising no wait-time service—two reasons that LIGO team members chose Nor-Tech in the first place and continue to value their collaboration.

About Nor-Tech

Nor-Tech is on CRN’s list of the top 40 Data Center Infrastructure Providers along with Dell, HPE and Lenovo; and is a cluster builder for 2015 and 2017 Nobel Physics Award-winning projects. Nor-Tech engineers average 20+ years of experience. This strong industry reputation and deep partner relationships also enable the company to be a leading supplier of cost-effective desktops, laptops, tablets and Chromebooks to schools and enterprises. All of Nor-Tech’s technology is made by Nor-Tech in Minnesota and supported by Nor-Tech around the world. The company has been in business since 1998 and is headquartered in Burnsville, Minn. just outside of Minneapolis.

Source: Nor-Tech


ASC Student Supercomputer Challenge to include AI and Nobel Prize-winning application

Tue, 01/30/2018 - 01:01

The 2018 ASC World Supercomputer Contest (ASC18) has seen more participating teams than ever before. In the following two months, over 300 teams from around the world will take on challenges including AI Answer Prediction for Search Query, the cryo-electron microscopy application RELION, and the supercomputer benchmarks HPL and HPCG. The top 20 teams will compete in the final round of the contest in May.

For the first time, this year’s ASC includes both AI and an application used in Nobel Prize-winning research. The AI challenge deals with Answer Prediction for Search Query in natural language reading and comprehension, and is provided by Microsoft. Teams are tasked with creating an AI answer prediction method and model based on massive amounts of data generated by real questions from search engines such as Bing or voice assistants such as Cortana.

ASC18 also includes a Nobel Prize-winning technology, cryo-electron microscopy, whose developers were awarded the 2017 prize in Chemistry. Cryo-electron microscopy allows scientists to solve challenges in structural biology beyond the scope of traditional X-ray crystallography, and it relies on 3D reconstruction software such as RELION. By including RELION among the challenges of this year’s ASC, the competition organizers aim to keep today’s computing students abreast of the latest cutting-edge developments in scientific discovery and spark their passion for exploring the unknown.

All participating students are provided with a free HPC learning platform, courtesy of the EasyHPC Supercomputing Study House, a key R&D project in China’s national high-performance computing plan. Combining software and hardware based on China’s national HPC environment, the Study House aims to establish a shared HPC educational platform. With this platform, students who lacked supercomputing resources now have the opportunity to study theoretical knowledge and perform high-performance computing coding, with leading resources at their disposal, including the world’s fastest supercomputer.

ASC will also host a two-day training camp for participants starting January 30, featuring speakers including HPC and AI experts from China’s State Key Laboratory of High-end Server & Storage Technology, the Chinese Academy of Sciences, Microsoft, Inspur, Intel, Nvidia, and Mellanox. The camp will include sessions on ASC18 rules, cluster building and evaluation, Internet optimization and selection, and an introduction to RELION and Answer Prediction for Search Query. Tsinghua University’s winning team from last year’s ASC17 will also be present to share their experiences in the competition. In this way, ASC hopes to provide this year’s participants with everything they need to perform their best.

About ASC

The ASC Student Supercomputer Challenge is the world’s biggest student supercomputer competition. The ASC was initiated by China, launched by experts and institutions from Japan, Russia, ROK, Singapore, Thailand, Taiwan, Hong Kong and other regions and countries, and has been supported by experts and institutions from the US and across Europe. Through promoting exchanges and furthering the development of talented young minds in the field of supercomputing around the world, the ASC aims to improve applications and R&D capabilities of supercomputing and accelerate technological and industrial innovation.

The first ASC challenge was launched in 2012. Since then, the competition has continued to grow in influence, with more than 1,100 teams and 5,500 young talents from around the world having participated.


Networking, Data Experts Design a Better Portal for Scientific Discovery

Mon, 01/29/2018 - 13:45

Jan. 29, 2018 — These days, it’s easy to overlook the fact that the World Wide Web was created nearly 30 years ago primarily to help researchers access and share scientific data. Over the years, the web has evolved into a tool that helps us eat, shop, travel, watch movies and even monitor our homes.

Figure: The Science DMZ includes multiple DTNs that provide for high-speed transfer between network and storage. Portal functions run on a portal server, located on the institution’s enterprise network. The DTNs need only speak the API of the data management service (Globus in this case).

Meanwhile, scientific instruments have become much more powerful, generating massive datasets, and international collaborations have proliferated. In this new era, the web has become an essential part of the scientific process, but the most common method of sharing research data remains firmly attached to the earliest days of the web. This can be a huge impediment to scientific discovery.

That’s why a team of networking experts from the Department of Energy’s Energy Sciences Network (ESnet), with the Globus team from the University of Chicago and Argonne National Laboratory, has designed a new approach that makes data sharing faster, more reliable and more secure. In an article published Jan. 15 in PeerJ Computer Science, the team describes its design, “The Modern Research Data Portal: a design pattern for networked, data-intensive science.”

“Both the size of datasets and the quantity of data objects has exploded, but the typical design of a data portal hasn’t really changed,” said co-author Eli Dart, a network engineer with the Department of Energy’s Energy Sciences Network, or ESnet. “Our new design preserves that ease of use, but easily scales up to handle the huge amounts of data associated with today’s science.”

Data portals, sometimes called science gateways, are web-based interfaces for accessing data storage and computing systems, allowing authorized users to access data and perform shared computations. As science becomes increasingly data-driven and collaborative, data portals are advancing research in materials, physics, astrophysics, cosmology, climate science and other fields.

The traditional portal is driven by a web server that is connected to a storage system and a database and processes users’ requests for data. While this simple design was straightforward to develop 25 years ago, it has increasingly become an obstacle to performance, usability and security.

“The problem with using old technology is that these portals don’t provide fast access to the data and they aren’t very flexible,” said lead author Ian Foster, who is the Arthur Holly Compton Professor at the University of Chicago and Director of the Data Science and Learning Division at Argonne National Laboratory. “Since each portal is developed as its own silo, the organization therefore must implement, and then manage and support, multiple complete software stacks to support each portal.”

The new portal design is built on two approaches developed to simplify and speed up transfers of large datasets.

  • The Science DMZ, which Dart developed, is a high-performance network design that connects large-scale data servers directly to high-speed networks and is increasingly used by research institutions to better manage data transfers.
  • Globus is a cloud-based service to which developers of data portals and other science services can outsource responsibility for complex tasks like authentication, authorization, data movement, and data sharing. Globus can be used, in particular, to drive data transfers into and out of Science DMZs.

Kyle Chard, David Shifflett, Steven Tuecke and Jason Williams are co-authors of the paper with Foster and Dart, and helped develop Globus at Argonne National Laboratory and the University of Chicago. In their paper, the authors note that the concept became feasible in 2015 as Globus and the Science DMZ became mature technologies.

“Together, Globus and the Science DMZ give researchers a powerful toolbox for conducting their research,” Dart said.

One portal incorporating the new design is the Research Data Archive managed by the National Center for Atmospheric Research, which contains a large and diverse collection of meteorological and oceanographic observations, operational and reanalysis model outputs, and remote sensing datasets to support atmospheric and geosciences research.

For example, a scientist working at a university could download data from the National Center for Atmospheric Research (NCAR) in Colorado and then use it to run simulations at DOE and NSF supercomputing centers in California and Illinois, and finally move the data to her home institution for analysis. To illustrate how the design works, Dart selected a 460-gigabyte dataset at NCAR, initiated a Globus transfer to DOE’s National Energy Research Scientific Computing Center at Lawrence Berkeley National Laboratory, logged in to his storage account and started the transfer. Four minutes later, the 5,141 files had been seamlessly transferred.
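As a rough check on the figures in this example, the average rate implied by moving 460 gigabytes in four minutes can be worked out directly. The sketch below assumes decimal units (1 GB = 10^9 bytes) and treats "four minutes" as exactly 240 seconds. Neither assumption is stated in the article, and the function names are illustrative only.

```python
def average_throughput_gbps(size_gigabytes: float, seconds: float) -> float:
    """Average rate in gigabits per second, assuming 1 GB = 10**9 bytes."""
    bits = size_gigabytes * 1e9 * 8
    return bits / seconds / 1e9


def average_file_size_mb(size_gigabytes: float, file_count: int) -> float:
    """Mean file size in megabytes, assuming 1 GB = 1000 MB."""
    return size_gigabytes * 1000 / file_count


# Figures quoted in the article: 460 GB, 5,141 files, about four minutes.
rate = average_throughput_gbps(460, 4 * 60)
mean_size = average_file_size_mb(460, 5141)

print(f"average rate: {rate:.1f} Gb/s")            # average rate: 15.3 Gb/s
print(f"average file size: {mean_size:.1f} MB")    # average file size: 89.5 MB
```

Under these assumptions the transfer averaged roughly 15 gigabits per second, which is more than a single 10-gigabit link could carry.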

How the design works

The Modern Research Data Portal takes the single-server model of the traditional portal design and divides it among three distinct components.

  • A portal web server handles the search for and access to the specified data, and similar tasks.
  • The data servers, often called Data Transfer Nodes, are connected to high-speed networks through a specialized enclave, in this case the Science DMZ. The Science DMZ provides a dedicated, secure link to the data servers, but avoids common performance bottlenecks caused by typical designs not optimized for high-speed transfers.
  • Globus manages the authentication, data access and data transfers. Globus makes it possible for users to manage data irrespective of the location or storage system on which data reside and supports data transfer, sharing, and publication directly from those storage systems.

“The design pattern thus defines distinct roles for the web server, which manages who is allowed to do what; data servers, where authorized operations are performed on data; and external services, which orchestrate data access,” the authors wrote.
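The division of roles described above can be sketched as a toy model. This is a simplified illustration, not the Globus API or the authors' implementation: the class names, methods, and in-memory "files" are all hypothetical, and the permission check is placed in the transfer service purely for brevity. The point it tries to show is that the portal only searches and requests, while bytes move directly between data transfer nodes.

```python
from dataclasses import dataclass, field


@dataclass
class DataTransferNode:
    """Hypothetical data server in the Science DMZ; it holds and moves bytes."""
    name: str
    files: dict = field(default_factory=dict)  # path -> bytes


@dataclass
class TransferService:
    """Hypothetical stand-in for the external service (Globus in the article)."""
    permissions: set = field(default_factory=set)  # (user, node name) pairs

    def grant(self, user: str, node: DataTransferNode) -> None:
        self.permissions.add((user, node.name))

    def transfer(self, user, source, destination, path) -> int:
        for node in (source, destination):
            if (user, node.name) not in self.permissions:
                raise PermissionError(f"{user} may not use {node.name}")
        # Data goes from node to node and never passes through the portal.
        destination.files[path] = source.files[path]
        return len(source.files[path])


@dataclass
class PortalWebServer:
    """Hypothetical portal: search and request handling only."""
    catalog: dict             # dataset title -> (node, path)
    service: TransferService
    bytes_served: int = 0     # stays at zero: the portal relays no data

    def search(self, term: str) -> list:
        return [t for t in self.catalog if term.lower() in t.lower()]

    def request_download(self, user, title, destination) -> int:
        source, path = self.catalog[title]
        return self.service.transfer(user, source, destination, path)


archive = DataTransferNode("archive-dtn", {"/ds/temps.nc": b"x" * 1024})
campus = DataTransferNode("campus-dtn")
service = TransferService()
service.grant("alice", archive)
service.grant("alice", campus)

portal = PortalWebServer({"Surface temperatures": (archive, "/ds/temps.nc")}, service)
hits = portal.search("temperature")
moved = portal.request_download("alice", hits[0], campus)
print(moved, portal.bytes_served)  # 1024 0
```

Because the web server never touches the payload, it can stay on the enterprise network while the data servers sit on the high-speed path.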

Globus is already used by tens of thousands of researchers worldwide with endpoints at more than 360 sites, so many researchers are familiar with its capabilities and rely on it on a regular basis. In fact, about 80 percent of major research universities and national labs in the U.S. use Globus.

At the same time, more than 100 research universities across the country have deployed Science DMZs, thanks to funding support through the National Science Foundation’s Campus Cyberinfrastructure Program.

A critical component of the system is “a little agent called Globus Connect, which is much like the Google Drive or Dropbox agents one would install on their own PCs,” Chard said. Globus Connect allows the Globus service to move data to and from the computer using high performance protocols and also HTTPS for direct access. It also allows users to share data dynamically with their peers.

According to Chard, the design provides research organizations with easy-to-use technology tools similar to those used by business startups to streamline development.

“If we look to industry, startup businesses can now build upon a suite of services to simplify what they need to build and manage themselves,” Chard said. “In a research setting, Globus has developed a stack of such capabilities that are needed by any research portal. Recently, we (Globus) have developed interfaces to make it trivial for developers to build upon these capabilities as a platform.”

“As a result of this design, users have a platform that allows them to easily place and transfer data without having to scale up the human effort as the amount of data scales up,” Dart said.

ESnet is a DOE Office of Science User Facility. Argonne and Lawrence Berkeley national laboratories are supported by the Office of Science of the U.S. Department of Energy. The Office of Science is the single largest supporter of basic research in the physical sciences in the United States, and is working to address some of the most pressing challenges of our time. For more information, please visit science.energy.gov.

About Computing Sciences at Berkeley Lab

The Lawrence Berkeley National Laboratory (Berkeley Lab) Computing Sciences organization provides the computing and networking resources and expertise critical to advancing the Department of Energy’s research missions: developing new energy sources, improving energy efficiency, developing new materials and increasing our understanding of ourselves, our world and our universe.

ESnet, the Energy Sciences Network, provides the high-bandwidth, reliable connections that link scientists at 40 DOE research sites to each other and to experimental facilities and supercomputing centers around the country. The National Energy Research Scientific Computing Center (NERSC) powers the discoveries of 6,000 scientists at national laboratories and universities, including those at Berkeley Lab’s Computational Research Division (CRD). CRD conducts research and development in mathematical modeling and simulation, algorithm design, data storage, management and analysis, computer system architecture and high-performance software implementation. NERSC and ESnet are DOE Office of Science User Facilities.

Lawrence Berkeley National Laboratory addresses the world’s most urgent scientific challenges by advancing sustainable energy, protecting human health, creating new materials, and revealing the origin and fate of the universe. Founded in 1931, Berkeley Lab’s scientific expertise has been recognized with 13 Nobel prizes. The University of California manages Berkeley Lab for the DOE’s Office of Science.



Quantum Computers Threaten Data Encryption

Mon, 01/29/2018 - 13:03

The promise of quantum computing comes with a major downside: “Cryptographically useful” quantum machines will threaten public key encryption used to secure data in the cloud, a new report warns.

The Cloud Security Alliance also found in a report released last week that companies are aware of the growing security risk associated with quantum computing but have so far done little to prepare. Among the reasons are the usual lack of resources and the perception that few if any security solutions exist.

Only 30 percent of those surveyed by the alliance said they were confident that current security approaches would protect encrypted data while another one-third were unaware of defenses. Forty percent of those polled said they are working to “future-proof” data against the quantum computing threat.

The Seattle-based group expects commercial quantum computers to arrive over the next decade. “With this considerable breakthrough will come a significant threat to the security of public key cryptography and the associated challenges of securing the global digital communications infrastructure,” said Bruno Huttner, co-chair of the alliance’s quantum security working group and product manager for a Swiss-based security firm.

Of particular concern is how cloud and infrastructure vendors will secure personal and financial data over time as quantum technologies hit the market. Some “quantum-safe cryptography” approaches have been proposed.

For example, the National Security Agency has announced plans to revamp its suite of crypto algorithms to include quantum safeguards. Similarly, the National Institute of Standards and Technology is sorting through proposals for quantum-resistant, public-key encryption algorithms.

Meanwhile, security specialists said they were familiar with at least some “quantum-safe” technologies, including longer symmetric keys and expanded cryptographic hash functions. The latter is an algorithm used to produce a checksum for verifying data integrity.

“If we want more protection, we need bigger numbers,” said Duncan Steel, a University of Michigan engineering and computer science professor developing quantum-safe technologies.
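To make the two measures mentioned above more concrete, the sketch below computes checksums of the same message with a shorter and a longer hash from Python's standard `hashlib` module. It also applies the commonly cited rule of thumb that Grover's algorithm roughly halves the effective strength of a brute-force search against a symmetric key or hash preimage. That halving rule is a simplification that does not come from the report, and the helper names are illustrative.

```python
import hashlib


def quantum_security_bits(classical_bits: int) -> int:
    """Rule-of-thumb strength against Grover-style search: about half the bits."""
    return classical_bits // 2


def verify(data: bytes, expected_hex: str, algorithm: str = "sha512") -> bool:
    """Integrity check: recompute the checksum and compare it to the stored one."""
    return hashlib.new(algorithm, data).hexdigest() == expected_hex


message = b"research data"

# A longer digest leaves more margin after the assumed halving.
for name in ("sha256", "sha512"):
    digest = hashlib.new(name, message).digest()
    bits = len(digest) * 8
    print(name, bits, quantum_security_bits(bits))
# sha256 256 128
# sha512 512 256

stored = hashlib.sha512(message).hexdigest()
print(verify(message, stored))       # True
print(verify(b"tampered", stored))   # False
```

The same reasoning is why longer symmetric keys are suggested: under this rule of thumb a 256-bit key retains about 128 bits of strength.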

“While there is still a tremendous amount of work to be done in convincing the industry of the importance of including the threat of quantum computing in enterprise security strategies, the good news is that there is a great deal of interest in learning more about the threat quantum presents and how it can be mitigated,” added Jane Melia, a quantum working group co-chair and vice president of another security vendor.

Still, the alliance found that many of those surveyed have done little to address security concerns raised by quantum computing. Most said they would not have a plan in place for at least three years, and few are including quantum-safe encryption as a requirement from their suppliers.

“This makes their data vulnerable to harvesting attacks, in which data is downloaded and then stored for later decryption by quantum computers,” the report warned.


Dell EMC Moves HPC, HCI Teams Under Server Division

Mon, 01/29/2018 - 13:00

Reverberations from Dell’s acquisition of EMC in October 2015 – at $67 billion, the biggest in the history of the technology industry – have extended into 2018 with the news that the company’s core server and storage business units will absorb its Converged Platforms and Solution Division (CPSD), formerly located within Dell EMC’s Infrastructure Solutions Group (ISG), according to published reports.

Matt Baker, Dell EMC SVP of strategy and planning, told CRN that “CPSD as we know it today as an independent organization is no longer.” The goal, he said, is to move the company’s hyperconverged and converged solutions “closer to our core product groups” with the intent of streamlining systems development and delivery.

A spokesperson for the company said Dell EMC’s org chart “is designed to help simplify our organization for clear lines of decision making, get our products to market faster and align our teams to our biggest priorities.”

The shakeup has major implications for Dell EMC’s HPC and hyperconverged infrastructure teams, which are being moved to the Server Division led by Ashley Gorakhpurwalla, divisional president and general manager; meanwhile, Gil Shneorson will remain in leadership of Dell’s VMware-aligned Hyperconverged Infrastructure (HCI) products and Dan McConnell will continue at the head of its ecosystem HCI offers, and both will report to Gorakhpurwalla, according to the CRN story.

Meanwhile, Dell EMC’s converged infrastructure team will be folded into the company’s Storage Division under the leadership of Jeffrey Boudreau, the division’s SVP/GM. Observers noted that the shakeup could be designed to jumpstart Dell EMC’s storage division, which company executives have conceded has lost market share of late.


Baker said that Chad Sakac, formerly head of CPSD, will continue to be an infrastructure leader at Dell EMC, though he has not yet been given an official title. The Ready Solutions group, except for the HPC and cloud groups, will come under the leadership of Kash Shaikh, VP product management and marketing.

The restructuring was carried out under the direction of ISG President Jeff Clarke, a 30-year Dell veteran who took over for David Goulden, a former top EMC executive, four months ago.

The restructuring has been favorably received by at least two industry analysts.

“The reorg makes good business sense,” said Chirag Dekate, research director, HPC, machine learning, emerging compute technologies, Gartner. “It will help Dell rationalize product portfolios and cross-pollinate technologies from HPC across a broader set of market segments. With digital business initiatives gathering steam across enterprises, HPC-inspired ideas can prove to be a real differentiator in the new infrastructure stack.”

“This reorg is all about speed of decision making and efficiency,” said Patrick Moorhead, president and principal analyst, Moor Insights & Strategy. “I’m looking at how it was architected, and on paper it definitely does that. There’s less people in the decision making process, which is good for a company of their size.”

A spokesperson from Dell had this to say: “For years our HPC team has partnered with the server organization to put our PowerEdge servers at the core of their offerings, and in the last year we recognized massive gains and ended the year in the #2 position with HPE in our sights. Now, we’ll be able to accelerate our strategy to continue our mission to lead that market and deliver our commitment to make HPC available to all who need it.”

The restructuring comes amid other reports in Bloomberg and the Wall Street Journal that Dell Technologies, which went private in 2013, may be considering going public again. The reports say that a Dell IPO would be intended to raise money to pay down debt, which currently amounts to $50 billion.

The post Dell EMC Moves HPC, HCI Teams Under Server Division appeared first on HPCwire.

PSSC Labs and Atmospheric Data Solutions Deliver HPC Clusters to Forecast Weather and Reduce Wildfires

Mon, 01/29/2018 - 09:11

LAKE FOREST, Calif., Jan. 29, 2018 — PSSC Labs, a developer of custom High Performance Computing and Big Data computing solutions, in collaboration with Atmospheric Data Solutions (ADS), designs, builds and implements custom High Performance Computing Clusters (HPCC) to assist utility companies as well as public and private agencies in predicting, mitigating and managing risk from severe weather patterns.

Accurate severe weather and wildfire potential predictions demand technologically advanced high performance servers and artificial intelligence (AI). PSSC Labs’ PowerWulf cluster solutions provide cutting-edge hardware that efficiently processes the large amounts of data needed to train complex AI models for valuable wildfire potential forecasts.

For example, a government website called the Santa Ana Wildfire Threat Index, developed in collaboration with a major southern California utility company, the U.S. Forest Service (USFS), and ADS, forecasts short-term and long-term large wildfire potential. The site, sawti.fs.fed.us, advises the USFS and the public on approaching wildfire potential events using forecasted weather and wildfire-centric variables including dead and live fuel moisture, all of which are generated on PSSC Labs PowerWulf Clusters.

This highly developed solution predicted large fire potential during the busy fall 2017 Santa Ana wind season, providing month-ahead and season-ahead forecasts that warned of above-normal Santa Ana winds. The public received forecasts and recommended actions well ahead of approaching critical fire weather. The utility agency used this advanced predictive technology to create forecasts to reduce the potential for accidental wildfire ignitions in its territories.

“Big Data is playing a crucial role in weather forecasting and wildfire analysis. This advanced information is important because we cannot just look outside for the weather,” said Alex Lesser, executive vice president at PSSC Labs. “With improved technology and the ability to process large amounts of data comes better extreme weather forecasts, which can save lives.”

How HPCC, Big Data and AI Impact Weather Modeling and Wildfire Analysis

Weather models require specialized computing platforms that allow parallel computations. The closer the model horizontal and vertical grid spacing, the more computational resources are needed. Timely dissemination of the high-impact forecast to stakeholders and the public also requires state-of-the-art computing hardware.
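As a rough illustration of why finer grid spacing drives up compute demand, the sketch below counts grid points and time steps for a hypothetical regional model. The domain size, reference spacing, level counts and the assumption that the time step shrinks in proportion to the horizontal spacing (a CFL-style stability limit) are illustrative choices, not figures from PSSC Labs or ADS.

```python
def relative_cost(dx_km, levels, domain_km=1000.0, ref_dx_km=12.0, ref_levels=40):
    """Return compute cost relative to a reference configuration.

    Assumes cost is proportional to (grid points) x (time steps) and that the
    time step scales linearly with horizontal spacing (CFL-style limit).
    All default values are hypothetical.
    """
    def work(dx, nz):
        nx = domain_km / dx      # points along each horizontal axis
        points = nx * nx * nz    # total 3D grid points
        steps = 1.0 / dx         # time steps per simulated hour scale as 1/dx
        return points * steps

    return work(dx_km, levels) / work(ref_dx_km, ref_levels)


for dx in (12.0, 6.0, 3.0):
    print(f"{dx:>4} km spacing: {relative_cost(dx, 40):.0f}x reference cost")
```

Under these assumptions, halving the horizontal spacing multiplies the work by about eight (four times the columns, twice the time steps), and doubling the number of vertical levels doubles it again.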

PSSC Labs works with ADS to deliver cluster servers that are individually customized to meet the specific needs of each client. Key features of the PSSC Labs PowerWulf HPC Clusters include pre-configured and fully validated blocks with the latest Intel HPC technology and all the necessary hardware, network settings, and cluster management software prior to shipping.

Thanks to the processing power of PSSC Labs clusters, AI and Big Data can now play a key role in developing forecasting and wildfire analysis solutions that can prevent casualties. ADS uses AI to find relationships in data that can transcend the current understanding. For example, ADS uses AI to forecast damage to infrastructure during severe downslope windstorms that may occur during Santa Ana wind events.

“AI solutions are being built using multi-decadal historical atmospheric and land surface data at a close-enough grid spacing. This data allows for an intelligent historical ranking of a forecast which is one of the most valuable analytics ADS can provide a stakeholder,” said Dr. Scott Capps, Principal and Founder of ADS.

“The solutions ADS are developing use a comprehensive blend of predictors that are important to large wildfire potential including dead fuel and live fuel moisture, near-surface atmospheric moisture and temperature, wind speed and gusts,” Dr. Capps continues. “Our partnership with PSSC Labs provides us with the hardware platform to meet the demands of high performance computing in order to maximize accuracy and maximize the number of times models can be run daily.”

According to Dr. Capps, the operational month-ahead and season-ahead Santa Ana wind forecasts are the first such solution.

For more information visit http://www.pssclabs.com/solutions/hpc-cluster/

About Atmospheric Data Solutions

Atmospheric Data Solutions, LLC (ADS) was founded to work with public and private agencies to develop atmospheric science products that help mitigate and manage risk from severe weather and future climate change. ADS works closely with clients to create products that are guided by the client’s needs while utilizing ADS’s expertise in atmospheric science. www.atmosdatasolutions.com

About PSSC Labs

For technology powered visionaries with a passion for challenging the status quo, PSSC Labs is the answer for hand-crafted HPC and Big Data computing solutions that deliver relentless performance with the absolute lowest total cost of ownership.  All products are designed and built at the company’s headquarters in Lake Forest, California.

Source: PSSC Labs

The post PSSC Labs and Atmospheric Data Solutions Deliver HPC Clusters to Forecast Weather and Reduce Wildfires appeared first on HPCwire.

IDC View: Inspur Leading the AI Infrastructure Development

Mon, 01/29/2018 - 01:01

Recently, IDC released two reports that highlight Inspur’s AI capabilities in products, solutions and technology innovation. The reports detail Inspur’s product portfolio geared toward a future powered by artificial intelligence (AI). The portfolio includes reconfigurable server systems, cluster management software, performance optimization tools, and open source deep learning frameworks. Meanwhile, Inspur will continue to devote itself to developing end-to-end industry-specific AI solutions, targeting a larger partner ecosystem.

IDC believes that the innovative design of Inspur’s series of artificial intelligence infrastructure products will influence not only the local market but also AI infrastructure products globally.

The Inspur AGX-2 supports eight NVLink or PCIe GPUs in a 2U chassis, making Inspur the first domestic vendor to offer a product that supports the NVLink 2.0 interface. It dramatically improves GPU computing density and provides higher computational efficiency for AI applications. The GX4 and SR-AI Rack decouple the coprocessor from the CPU. This lets GPU computing expand from a standard rack to expansion modules, breaking the traditional limit of eight adapters to achieve higher single-node computing performance. Their modular designs can meet the demands of scenarios such as deep learning training and online inference.

The ABC integrated system, launched by Inspur and Baidu, delivers a PaaS-level artificial intelligence solution. Based on Baidu’s investment in AI technology and applications, the ABC integrated system supports face recognition, text recognition, voice recognition and many other functions. Through rapid deployment of the PaaS layer, it can help traditional industries and small and medium-sized internet companies create their own AI applications, such as remote opening of financial accounts, intelligent security monitoring, intelligent voice assistants and credit card recognition. The system helps them overcome the technology threshold and focus on application development, and should make artificial intelligence more widely adopted.

According to IDC, Inspur has consolidated learning and developed a software-defined reconfigurable server and storage system specialized to support AI training and inference, in which performance can be accelerated, pooled, and reconfigured. The Inspur AGX-2, for example, supports 8 GPUs, NVLink, a 2U form factor, and 100Gbps InfiniBand. Users can choose to use either 4 or 8 GPUs depending on the workload requirement. Inspur’s smart computing solution powers close to 90% of China’s active cognitive/AI applications running on BAT platforms.

IDC believes the model brought in large revenue for Inspur not only in the domestic market but also in the U.S. market, where it was able to start from zero and accelerate rapidly. For now, Inspur touts a turnkey AI computing solution covering heterogeneous server systems, software-defined storage, managed services for resource optimization, software to facilitate AI model training, and a customized multi-node deep learning application framework. Inspur aims to become the business partner of choice for companies adopting cognitive/AI and to transfer best practices gathered by working with hyperscalers.

IDC reports:

Inspur Held the First Analyst Conference and Released AI Infrastructure Products Jointly with Baidu, Leading the Artificial Intelligence Infrastructure Innovation

Inspur Walks with Hyperscalers to Propel Artificial Intelligence Application Adoptions

The post IDC View: Inspur Leading the AI Infrastructure Development appeared first on HPCwire.

2018 RMACC HPC Symposium Issues Call for Participation

Fri, 01/26/2018 - 13:54

Jan. 26, 2018 — The 2018 Rocky Mountain Advanced Computing Consortium (RMACC) High Performance Computing (HPC) symposium is now welcoming proposals for presentations on a variety of topics in several formats.

The 2018 RMACC HPC Symposium is scheduled for August 7-9 at the University of Colorado Boulder. Open to the public, the symposium brings together computational scientists, system administrators, and users of high performance computing systems from universities, government laboratories, and industry throughout the 6-state Rocky Mountain Region. The symposium features a wide array of panel discussions, technical presentations, and tutorial sessions on research, education, and best practices in the areas of computational science and high performance computing.

Submission Deadlines

Presentation proposals must be received no later than 11:59 p.m. on February 28, 2018.  See below for submission instructions.

Session Types

  1. Tutorial – Tutorials give participants in-depth training in how to effectively use and manage advanced research computing resources and services. 1.5-, 3-, and 6-hr sessions are available.  Two or more 1.5- or 3-hr tutorials that build on each other may be combined into a sequence.  Tutorials that include some degree of hands-on content are encouraged.
  2. Panel Discussion – Expert panelists give a brief overview of their perceptions regarding the general topic, may discuss several preselected questions among themselves, and respond to audience questions.  Panels focused on the use of HPC in particular research domains (eg, genomics, polar science, fluid dynamics …) can be especially helpful for newcomers to HPC.
  3. Technical Presentation – These 55-min sessions may provide a general overview of a particular topic (eg, “scientific visualization” or “HPC in the cloud”) or give a more detailed examination of a specific sub-topic, application, or research result.
  4. Visualization Showcase – 8-min talks that present a novel data visualization and describe its meaning and production.
  5. There will be a separate call for student posters.

Topic Tracks or Categories

  1. Data Visualization, Management, and Transfer
  2. Topics aimed at students or new HPC users, including introductory tutorials, career development, etc.
  3. Topics for experienced HPC users, including advanced tutorials, detailed technical presentations, etc.
  4. User Support and Outreach, including to under-represented or diverse groups
  5. System/Storage/Network Administration
  6. Big Data / Data Analytics / Machine Learning

Submission Instructions

All submissions should include an abstract of up to five sentences in length, as well as a half-page document describing the proposed session in greater detail.  Please suggest the Type and Track that may be most suitable for your presentation, and also include a brief CV for the presenter.


Source: University of Colorado Boulder

The post 2018 RMACC HPC Symposium Issues Call for Participation appeared first on HPCwire.

Czech Republic Signs European Declaration on High-Performance Computing

Fri, 01/26/2018 - 08:06

Jan. 26, 2018 — The Czech Republic is the 14th country to sign the European declaration on high-performance computing (HPC). The initiative aims to build European supercomputers that would rank among the world’s top three by 2022-2023. With this signature, the Czech Republic announced that it will join the EuroHPC Joint Undertaking, a new legal and funding structure that was launched earlier this month. The cooperation, with a total budget of approximately EUR 1 billion, will acquire, build and deploy world-class HPC infrastructure across Europe and support the development of the technologies and machines (hardware) as well as the applications (software) that would run on these supercomputers.

Supercomputers handle highly demanding scientific and engineering calculations that cannot be performed using general-purpose computers. They are a strategic resource for the future of the EU’s scientific leadership and industrial competitiveness and are increasingly needed to process ever larger amounts of data. Supercomputing can benefit society in many areas, from health care and renewable energy to car safety and cybersecurity.

Other Member States and countries associated with the Horizon 2020 framework programme are encouraged to join forces and sign the declaration. Read more about the European supercomputing initiative here and in the recent press release, Q&A and factsheet.

Source: European Commission

The post Czech Republic Signs European Declaration on High-Performance Computing appeared first on HPCwire.

Intel Reports Fourth-Quarter and Full-Year 2017 Financial Results

Thu, 01/25/2018 - 21:24

SANTA CLARA, Calif., January 25, 2018 — Intel Corporation today reported full-year and fourth-quarter 2017 financial results. The company also announced that its board of directors has approved an increase in its cash dividend to $1.20 per-share on an annual basis, a 10 percent increase. The board also declared a quarterly dividend of $0.30 per-share on the company’s common stock, which will be payable on March 1 to shareholders of record on February 7.

“2017 was a record year for Intel with record fourth-quarter results driven by strong growth of our data-centric businesses,” said Brian Krzanich, Intel CEO. “The strategic investments we’ve made in areas like memory, programmable solutions, communications and autonomous driving are starting to pay off and expand Intel’s growth opportunity. In 2018, our highest priorities will be executing to our data-centric strategy and meeting the commitments we make to our shareholders and our customers.”

“The fourth quarter was an outstanding finish to another record year. Compared to the expectations we set, our revenue was stronger, our operating margins were higher, and our spending was lower,” said Bob Swan, Intel CFO. “Intel’s PC-centric business continued to execute well in a declining market while the growth of our data-centric businesses shows Intel’s transformation is on track.”

Intel’s fourth-quarter results reflect an income tax expense of $5.4 billion as a result of the U.S. corporate tax reform enacted in December. This includes a one-time, required transition tax on our previously untaxed foreign earnings, which was partially offset by the re-measurement of deferred taxes using the new U.S. statutory tax rate. Looking ahead, the company is forecasting a 2018 tax rate of 14 percent as the Tax Cuts and Jobs Act helps level the playing field for U.S. manufacturers like Intel that compete in today’s global economy.

“Intel has a rich history of investing in U.S.-led research and development and U.S. manufacturing,” said Swan. “The tax reform is further incentive to continue these investments and reinforces our decision to invest in the buildout of our Arizona factory. It also informed the dividend increase we’re announcing today.”

For the full year, the company generated a record $22.1 billion cash from operations, and paid dividends of $5.1 billion.

In the fourth quarter, Intel saw strong performance from data-centric businesses, which accounted for 47% of Intel’s fourth-quarter revenue, an all-time high. The Data Center Group (DCG), Internet of Things Group (IOTG) and Programmable Solutions Group (PSG) all achieved record quarterly revenue. Intel’s Client Computing Group (CCG) shipped a record volume of Intel Core i7 processors, launched the new 8th Gen Intel Core processor with Radeon RX Vega M Graphics, and announced an expanding line-up of LTE and 5G multi-mode modems. The Non-Volatile Memory Solutions Group (NSG) launched the new Intel Optane SSD DC P4800X Series for the data center.

The company is also advancing efforts to compete and win in artificial intelligence with the Intel Nervana Neural Network Processor, customer momentum for its Intel Movidius vision processing unit (VPU), and continued customer adoption of Intel Xeon Scalable processors. In autonomous driving, Mobileye had a strong finish to 2017 with a total of 30 ADAS customer design wins as well as design wins for advanced L2+ and L3 autonomous systems with 11 automakers.

Additional information regarding Intel’s results can be found in the Q4’17 Earnings Presentation available at: www.intc.com/results.cfm.

Full release here (PDF).

Source: Intel Corp.

The post Intel Reports Fourth-Quarter and Full-Year 2017 Financial Results appeared first on HPCwire.

EU, Brazil Energy Interests Position for Exascale

Thu, 01/25/2018 - 16:38

Will exascale computing support a greener energy future? The European-funded HPC4E project believes that is the case. The consortium of 13 research and industry partners from Europe and Brazil published a detailed white paper this week offering guidance on the use of exascale architectures for the energy sector. Based on two years of research, the paper covers exascale-readiness for applications in oil & gas, wind and biogas combustion, industries important to both the EU and Brazil.

The project holds that the “use of new exascale architectures and the corresponding advances in codes to fully exploit new chip capabilities will help address challenges for combustion technologies, wind power generation and hydrocarbon exploration, allowing a transition to greener and more advanced energy systems based on alternative fuels combined with renewable energy technologies.”

The paper is quite in-depth in presenting the technical challenges, successes and future opportunities involved in each of the three industries’ application sets. HPC4E notes that “the computational requirements arising from full wave-form modelling and inversion of seismic and electromagnetic data is ensuring that the O&G industry will be an early adopter of exascale computing technologies.”

Pointing to previous successes for HPC in the field, they cite findings that showed 3D acquisition alone boosted exploration drilling success from 13 percent to 44 percent between 1991 and 1996.

“Such success would have not been possible without HPC resources devoted to its processing,” the authors state. “But perhaps even more crucially, the existence of new technologies allowed opening areas previously thought impossible to explore into hugely successful business stories, as for example the Gulf of Mexico or the Brazilian Pre-salts. Without HPC there would have been no possibility to exploit hydrocarbons in these areas efficiently.”

Wind and biogas are also examined. HPC4E has identified that “the competitiveness of wind farms can be guaranteed only with accurate wind resource assessment, farm design and short-term micro-scale wind simulations to forecast the daily power production.” It also holds that exascale HPC systems will be essential to improving combustion for biogas and to designing more efficient furnaces, engines, clean-burning vehicles and power plants.

Illustrating how essential supercomputing has become for worldwide energy endeavors, HPC4E notes that 40 percent of China’s Tianhe-1A cycles went to petroleum-related activities in 2010-11.

The energy sector has long been the largest industrial user of HPC. Major energy companies BP, Total, Eni and Norwegian company PGS all operate petascale systems to accelerate production and reduce risk, but Eni took the lead last week when it announced an 18.6-petaflops (peak) system, two times faster than BP’s top machine.

Geert Wenes, senior practice leader at Cray, has said, “a perfect storm in seismic processing requirements is ensuring that the O&G industry will be an early adopter of exascale computing technologies.”

In making this research and guidance available to its partners, HPC4E anticipates “the exa-scale era will provide a significant platform for making important contributions in the power and transportation sectors towards more efficient, more flexible and with low emissions systems with direct impact on public health and climate change.”

The project’s key results have been summarized in a fact sheet. The full writeup, “Whitepaper about the use of Exascale computers in Oil & Gas, Wind Energy and Biogas Combustion industries” is available as a PDF.

The HPC for Energy project (HPC4E) launched in 2015 to promote the energy interests of Brazil and the European Union. The project website cites Brazil as having an estimated potential wind power of 145 GW, making it one of the largest potential wind energy producers in the world. Industry partners include REPSOL, TOTAL, Iberdrola and PETROBRAS. The research centers involved are Barcelona Supercomputing Center, CIEMAT, Inria, University of Lancaster (ULANC), Queen Mary University of London, COPPE, LNCC, ITA, Universidade Federal do Rio Grande do Sul and Universidade Federal de Pernambuco. The project coordinators are Barcelona Supercomputing Center (EU) and COPPE (Brazil).

The post EU, Brazil Energy Interests Position for Exascale appeared first on HPCwire.

ORNL Researchers Explore Supercomputing Workflow Best Practices

Thu, 01/25/2018 - 14:45

Jan. 25 — Scientists at the Department of Energy’s Oak Ridge National Laboratory are examining the diverse supercomputing workflow management systems in use in the United States and around the world to help supercomputers work together more effectively and efficiently.

Because supercomputers have largely developed in isolation from each other, existing modeling and simulation, grid/data analysis, and optimization workflows meet highly specific needs and therefore cannot easily be transferred from one computing environment to another.

Divergent workflow management systems can make it difficult for research scientists at national laboratories to collaborate with partners at universities and international supercomputing centers to create innovative workflow-based solutions that are the strength and promise of supercomputing.

Led by Jay Jay Billings, team lead for the Scientific Software Development group in ORNL’s Computing and Computational Sciences Directorate, the scientists have proposed a “building blocks” approach in which individual components from multiple workflow management systems are combined in specialized workflows.

Billings worked with Shantenu Jha of the Computational Science Initiative at Brookhaven National Laboratory and Rutgers University, and Jha presented their research at the 2017 Workshop on Open Source Supercomputing in Denver in November 2017. Their article appears in the workshop’s proceedings.

The researchers began by analyzing how existing workflow management systems work—the tasks and data they process, the order of execution, and the components involved. Factors that can be used to define workflow management systems include whether a workflow is long or short running, runs internal cycles or in linear fashion with an endpoint, and requires humans to complete. Long used to understand business processes, the workflow concept was introduced in scientific contexts where automation was useful for research tasks such as setting up and running problems on supercomputers and then analyzing the resulting data.

Viewed through the prism of today’s complex research endeavors, supercomputers’ workflows clearly have disconnects that can hamper scientific advancement. For example, Billings pointed out that a project might draw on multiple facilities’ work while acquiring data from experimental equipment, performing modeling and simulation on supercomputers, and conducting data analysis using grid computers or supercomputers. Workflow management systems with few common building blocks would require installation of one or more additional workflow management systems—a burdensome level of effort that also causes work to slow down.

“Poor or nonexistent interoperability is almost certainly a consequence of the ‘Wild West’ state of the field,” Billings said. “And lack of interoperability limits reusability, so it may be difficult to replicate data analysis to verify research results or adapt the workflow for new problems.”

The open building blocks workflows concept being advanced by ORNL’s Scientific Software Development group will enable supercomputers around the world to work together to address larger scientific problems that require workflows to run on multiple systems for complete execution.

Future work includes testing the hypothesis that the group’s approach is more scalable and sustainable and a better practice.

This research is supported by DOE and ORNL’s Laboratory Directed Research and Development program.

ORNL is managed by UT–Battelle for DOE’s Office of Science, the single largest supporter of basic research in the physical sciences in the United States. DOE’s Office of Science is working to address some of the most pressing challenges of our time. For more information, please visit http://science.energy.gov/.

Source: Laurie Varma, ORNL

The post ORNL Researchers Explore Supercomputing Workflow Best Practices appeared first on HPCwire.

Scientists Pioneer Use of Deep Learning for Real-Time Gravitational Wave Discovery

Thu, 01/25/2018 - 12:43

Jan. 25, 2018 — Scientists at the National Center for Supercomputing Applications (NCSA), located at the University of Illinois at Urbana-Champaign, have pioneered the use of GPU-accelerated deep learning for rapid detection and characterization of gravitational waves. This new approach will enable astronomers to study gravitational waves using minimal computational resources, reducing time to discovery and increasing the scientific reach of gravitational wave astrophysics. This innovative research was recently published in Physics Letters B.

Combining deep learning algorithms, numerical relativity simulations of black hole mergers (obtained with the Einstein Toolkit run on the Blue Waters supercomputer), and data from the LIGO Open Science Center, NCSA Gravity Group researchers Daniel George and Eliu Huerta produced Deep Filtering, an end-to-end time-series signal processing method. Deep Filtering achieves similar sensitivities and lower errors compared to established gravitational wave detection algorithms, while being far more computationally efficient and more resilient to noise anomalies. The method allows faster than real-time processing of gravitational waves in LIGO’s raw data, and also enables new physics, since it can detect new classes of gravitational wave sources that may go unnoticed with existing detection algorithms. George and Huerta are extending this method to identify in real-time electromagnetic counterparts to gravitational wave events in future LSST data.

NCSA’s Gravity Group leveraged NCSA resources from its Innovative Systems Laboratory, NCSA’s Blue Waters supercomputer, and collaborated with talented interdisciplinary staff at the University of Illinois. Also critical to this research were the GPUs (Tesla P100 and DGX-1) provided by NVIDIA, which enabled an accelerated training of neural networks. Wolfram Research also played an important role, as the Wolfram Language was used in creating this framework for deep learning.

George and Huerta worked with NVIDIA and Wolfram researchers to create this demo to visualize the architecture of Deep Filtering, and to get insights into its neuronal activity during the detection and characterization of real gravitational wave events. This demo highlights all the components of Deep Filtering, exhibiting its detection sensitivity and computational performance.

This work was awarded first place at the ACM Student Research Competition at SC17, and also received the Best Poster Award at the 24th IEEE International Conference on HPC, Data, and Analytics. This research was presented as a contributed talk at the 2017 Deep Learning Workshop for the Physical Sciences.

About NCSA

The National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign provides supercomputing and advanced digital resources for the nation’s science enterprise. At NCSA, University of Illinois faculty, staff, students, and collaborators from around the globe use advanced digital resources to address research grand challenges for the benefit of science and society. NCSA has been advancing one third of the Fortune 50 for more than 30 years by bringing industry, researchers, and students together to solve grand challenges at rapid speed and scale.

Source: NCSA

The post Scientists Pioneer Use of Deep Learning for Real-Time Gravitational Wave Discovery appeared first on HPCwire.

Dealing with HPC Correctness: Challenges and Opportunities

Thu, 01/25/2018 - 10:35

Developing correct and reliable HPC software is notoriously difficult. While effective correctness techniques for serial codes (e.g., verification, debugging and systematic testing) have been in vogue for decades, such techniques are in their infancy for HPC codes. Why is that?

HPC correctness techniques are burdened with all the well-known problems associated with serial software plus special challenges:

  • growing heterogeneity (e.g., architectures with CPUs, special purpose accelerators)
  • massive scales of computation (i.e., some bugs only manifest under very high degrees of concurrency)
  • use of combined parallel programming models (e.g., MPI+X) that often lead to non-intuitive behaviors
  • new scalable numerical algorithms (e.g., to leverage reduced precision in floating-point arithmetic)
  • use of different compilers and optimizations


HPC practitioners see additional demands on their time as they learn how to effectively utilize newer machine types that can support much larger problem scales. Developing new and scalable algorithms that work well on next-generation machines while also supporting new science imposes additional—and non-trivial—demands. Developers often don’t have time left to graduate beyond the use of printf debugging or traditional debuggers. Unfortunately, mounting evidence suggests that significant productivity losses due to show-stopper bugs do periodically occur, making the development of better debugging methods inevitable.

Two recent efforts took aim at these challenges. First, an HPC correctness summit sponsored by the U.S. Department of Energy (DOE) resulted in a report (50+ pages) covering a spectrum of issues that can help lay this missing foundation in HPC debugging and correctness.

Second, a well-attended workshop entitled Correctness 2017: First International Workshop on Software Correctness for HPC Applications took place at SC17. This article summarizes these two efforts and concludes with avenues for furthering HPC correctness research. We also invite reader comments on ideas and opportunities to advance this cause.

1. HPC Correctness Summit

Held on January 25–26, 2017, at the DOE headquarters (Washington, D.C.), the HPC Correctness Summit included discussions of several show-stopper bugs that have occurred during large-scale, high-stakes HPC projects. Each bug took several painstaking months of debugging to rectify, revealing the potential for productivity losses and uncertainties of much more severe proportions awaiting the exascale era.

The DOE report distills many valuable nuggets of information not easily found elsewhere. For instance, it compiles one of the most comprehensive tables capturing existing debugging and testing solutions, the family of techniques they fall under, and further details of the state of development of these tools.

The report concludes that we must aim for rigorous specifications, go after debugging automation by emphasizing bug-hunting over formal proofs, and launch a variety of activities that address the many facets of correctness.

These facets include reliable compilation; detecting data races; root-causing the sources of floating-point result variability brought in by different algorithms, compilers, and platforms; combined uses of static and dynamic analysis; focus on libraries; and smart IDEs.
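One source of the floating-point result variability mentioned above is that floating-point addition is not associative, so a different summation order (for example, from a different compiler optimization or parallel reduction) can change the answer. The following is a minimal Python sketch of that effect; the input values and function name are illustrative assumptions and do not come from the DOE report.

```python
import math

def sum_in_order(values):
    """Accumulate left to right, as a simple serial loop would."""
    total = 0.0
    for v in values:
        total += v
    return total

# Illustrative values chosen so that rounding is visible in double precision.
values = [1e16, 1.0, -1e16, 1.0]

forward = sum_in_order(values)                   # one evaluation order
backward = sum_in_order(list(reversed(values)))  # same data, reversed order
exact = math.fsum(values)                        # correctly rounded reference sum

print("forward :", forward)
print("backward:", backward)
print("fsum    :", exact)
```

The three results differ even though the inputs are identical, which is why root-causing this kind of variability across algorithms, compilers, and platforms is hard to do by inspection alone.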

Last but not least, the DOE report laments a near-total absence of a community culture of sharing bug repositories, developing common debugging solutions, and even talking openly about bugs (and not merely about performance and scalability successes). Dr. Leslie Lamport, the 2013 ACM Turing Award winner, observes that the difficulty of verification can be an indirect measure of how ill-structured the software design is. A famous verification researcher, Dr. Ken McMillan, states it even more directly: We design through debugging. Promoting this culture of openness calls for incentives through well-targeted research grants, as it takes real work to reach a higher plane of rigor. While some of the best creations in HPC-land were acts of altruism, experience suggests that more than altruism is often needed.

The recommendation to sponsor the Summit was made by the DOE ASCR program manager Dr. Sonia R. Sachs, under the leadership of research director Dr. William Harrod. In addition to the authors of this article, participating researchers were Paul Hovland (Argonne National Lab), Costin Iancu (Lawrence Berkeley National Lab), Sriram Krishnamoorthy (Pacific Northwest National Lab), Richard Lethin (Reservoir Labs), Koushik Sen (UC Berkeley), Stephen Siegel (University of Delaware), and Armando Solar-Lezama (MIT).

2. HPC Correctness Workshop

As correctness becomes an increasingly important aspect of HPC applications, the research and practitioner community has begun to discuss ways to address the problem. Correctness 2017: The First International Workshop on Software Correctness for HPC Applications debuted at the SC conference series on November 12, 2017, demonstrating growing interest in this topic. The goal was to discuss ideas for HPC correctness, including novel research methods to solve challenging problems as well as tools and techniques that can be used in practice today.

A keynote address by Stephen Siegel (Associate Professor, University of Delaware) on the CIVL verification language opened the workshop, followed by seven paper presentations grouped into three categories: applications and algorithms correctness; runtime systems correctness; and code generation and code equivalence correctness.

Topics of discussion included static analysis for finding the root-cause of floating-point variability, how HPC communities like climate modeling deal with platform-dependent result variability, and ambitious proposals aimed at in situ model checking of MPI applications. Participants also examined automated synthesis of HPC algorithms and successes in detecting extremely tricky cases of OpenMP errors by applying rigorous model-level analysis.

While using formal methods to verify large HPC applications is perhaps too ambitious today, a question arose: Can formal methods be applied to verify properties of small HPC programs? (For example, small programs like DOE proxy applications extracted from large production applications could be used to mimic some features of large-scale applications.) Workshop participants agreed that this may be a possibility—at least for some small proxy applications or for some of their key components.
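To give a flavor of what checking a property of a small program can mean, here is a toy Python sketch that exhaustively explores every interleaving of two threads performing an unsynchronized increment on a shared counter. It is an illustration only: the thread names, the two-step load/store model, and the helper functions are assumptions made for this example, and it is not CIVL or any tool presented at the workshop.

```python
def interleavings(a, b):
    """Yield every merge of step lists a and b that keeps each list's own order."""
    if not a:
        yield list(b)
        return
    if not b:
        yield list(a)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest

def thread_ops(name):
    """An unsynchronized increment modeled as two steps: load, then store."""
    return [(name, "load"), (name, "store")]

def run(schedule):
    """Execute one schedule against a shared counter and return its final value."""
    shared = 0
    local = {}
    for thread, op in schedule:
        if op == "load":
            local[thread] = shared       # read the shared value into a private copy
        else:
            shared = local[thread] + 1   # write back the private copy plus one
    return shared

schedules = list(interleavings(thread_ops("T0"), thread_ops("T1")))
outcomes = {run(s) for s in schedules}

print(len(schedules), "schedules explored")
print("possible final values:", sorted(outcomes))
```

Because the state space here is tiny, every schedule can be enumerated, and the lost-update outcome shows up alongside the intended one. The number of schedules grows combinatorially with threads and steps, which is one reason this style of analysis is more plausible for small proxy applications than for full production codes.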

The audience voiced enthusiastic support for continuing correctness workshops at SC. This inaugural workshop was organized by Ignacio Laguna (Lawrence Livermore National Laboratory) and Cindy Rubio-González (University of California at Davis).

3. What’s Next?

As the community depends on in silico experiments for large-scale science and engineering projects, trustworthy platforms and tools will ensure that investments in HPC infrastructures and trained personnel are effective and efficient. While further experience is yet to be gained on cutting-edge exascale machines and their productive use, waiting for the machines to be fully operational before developing effective debugging solutions is extremely short-sighted. Today’s petaflop machines can—and should—be harnessed for testing and calibrating debugging solutions for the exascale era.

Initiatives to address the correctness problem in HPC, such as the DOE summit and the SC17 workshop, are only the beginning of many more such studies and events to follow. In addition to the DOE, the authors thank their own organizations for their support and for facilitating these discussions.

Overall, we encourage the HPC community to acknowledge that debugging is fundamentally an enabler of performance optimizations. While this question was not settled in any formal way at the Correctness workshop, the level of interest exhibited by the attendees coupled with their keen participation suggested that research on rigorous methods at all levels must be encouraged and funded. There was however widespread agreement that conventional methods aren’t bringing in the requisite levels of incisiveness with respect to defect elimination in HPC.

Ganesh Gopalakrishnan’s work is supported by research grants from divisions under the NSF directorate for Computer and Information Science and Engineering. Ignacio Laguna’s work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under contract DE-AC52-07NA27344 (LLNL-MI-744729).

The post Dealing with HPC Correctness: Challenges and Opportunities appeared first on HPCwire.

HPC and AI – Two Communities Same Future

Thu, 01/25/2018 - 10:30

According to Al Gara (Intel Fellow, Data Center Group), high performance computing and artificial intelligence will increasingly intertwine as we transition to an exascale future using new computing, storage, and communications technologies as well as neuromorphic and quantum computing chips. Gara observes that, “The convergence of AI, data analytics and traditional simulation will result in systems with broader capabilities and configurability as well as cross pollination.”

Gara sees very aggressive hardware targets being set for this intertwined HPC and AI future, where the hardware will deliver usable performance exceeding one exaflops of double precision performance (and much more for lower and reduced precision arithmetic). He believes a user focus on computation per memory capacity will pay big dividends across architectures and provide systems software and user applications the opportunity to stay on the exponential performance growth curve through exascale and beyond as shown in the performance table below.

Figure 1: Architectural targets for future systems that will support both HPC and AI. Note: PiB is pebibytes of memory capacity

Unification of the “3 Pillars”

The vision Gara presented is based on a unification of the “3 Pillars” of HPC: Artificial Intelligence (AI) and Machine Learning (ML); Data Analytics and Big Data; plus High Performance Computing (HPC). What this means is that users of the future will program using models that leverage each other and that interact through memory.

Figure 2: Unifying the “3 Pillars” (Source Intel)

More concretely, Intel is working towards exascale systems that are highly configurable and can support upgrades to fundamentally new technologies including scalable processors, accelerators, neural network processors, neuromorphic chips, FPGAs, Intel persistent memory, 3D NAND, and custom hardware.

Figure 3: Working towards a highly configurable future (Source Intel)

The common denominator in Gara’s vision is that the same architecture will cover HPC, AI, and Data Analytics through configuration, which means there needs to be a consistent software story across these different hardware backends to address HPC plus AI workloads.

A current, very real instantiation of Gara’s vision is happening now through the use of the Intel nGraph library in popular machine learning packages such as TensorFlow. Essentially, the Intel nGraph library is being used as an intermediate language (in a manner analogous to LLVM) that can deliver optimized performance across a variety of hardware platforms from CPUs to FPGAs, dedicated neural network processors, and more.

Jason Knight (CTO office, Intel Artificial Intelligence Products Group) writes, “We see the Intel nGraph library as the beginning of an ecosystem of optimization passes, hardware backends and frontend connectors to popular deep learning frameworks.”

Figure 4: XLA support for TensorFlow

Overall, Gara noted that “HPC is truly the birthplace of many architectures … and the testing ground” as HPC programmers, researchers, and domain scientists explore the architectural space and map the performance landscape:

  • Data level parallel (from fine grain to coarse grain)
  • Energy efficient accelerators (compute density and energy efficiency often are correlated)
  • Exploiting predictable execution at all levels (cache to coarse grain)
  • Integrated fixed function data flow accelerators
  • General purpose data flow accelerators

Technology Opportunities

HPC and AI scientists will have access and the ability to exploit the performance capabilities of a number of new network, storage, and computing architectures.

In particular, HPC is a big driver of optical technology as fabrics represent one of the most challenging and costly elements of a supercomputer. For this reason, Gara believes that silicon photonics is game changing as the ability to integrate silicon and optical devices will deliver significant economic and performance advantages including room to grow (in a technology sense) as we transition to linear and ring devices and optical devices that communicate using multiple wavelengths of light.

New non-volatile storage technologies such as Intel persistent memory are blurring the line between memory and storage. Gara describes a new storage stack for exascale supercomputers, but of course this stack can be implemented on general compute clusters as well.

The key, Gara observes, is that this stack is designed from the ground up to use NVM storage. The result will be high throughput IO operations at arbitrary alignment and transaction sizes because applications can perform ultra-fine grained IO through a new userspace NVMe/pmem software stack. At a systems level, this means that users will be able to manage massively distributed NVM storage using scalable communications and IO operations across homogeneous, shared-nothing servers in a software managed redundant, self-healing environment. In other words, the stack provides high-performance, big-capacity scalable storage to support big-data and in-core algorithms such as logarithmic-runtime algorithms and data analytics on sparse and unstructured data sets.

Researchers are exploiting the advances in memory performance and capacity to change the way that we approach AI and HPC problems. Examples of such work range from the University of Utah to King Abdullah University of Science and Technology (KAUST) in Saudi Arabia.

For example, Dr. Aaron Knoll (research scientist, Scientific Computing and Imaging Institute at the University of Utah) stresses the importance of logarithmic runtime algorithms in the OSPRay visualization package. Logarithmic runtime algorithms are important for big visualizations and exascale computing because their runtime increases slowly even when the data size increases by orders of magnitude. Otherwise, runtime growth can prevent computations from finishing in a reasonable time, negating the benefits of a large memory capacity computer.
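The practical meaning of logarithmic runtime can be seen with a small Python sketch that counts the steps a binary search takes as the data grows from a thousand to a billion sorted items. Binary search is used here only as a familiar stand-in for logarithmic behavior; it is an assumption of this example and is not a description of the algorithms inside the visualization package.

```python
def binary_search_steps(data, target):
    """Return (index, steps) for the first position whose value is >= target."""
    lo, hi, steps = 0, len(data), 0
    while lo < hi:
        steps += 1
        mid = (lo + hi) // 2
        if data[mid] < target:
            lo = mid + 1   # target lies in the upper half
        else:
            hi = mid       # target lies in the lower half (or at mid)
    return lo, steps

# range objects are sorted and indexable, so no large allocation is needed.
for n in (10**3, 10**6, 10**9):
    index, steps = binary_search_steps(range(n), n - 1)
    print(f"n = {n:>13,}  found index = {index:>11,}  steps = {steps}")
```

The data size grows by a factor of a million between the first and last rows, while the step count grows by only a few tens. A linear scan for the same last element would need on the order of n steps.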

As a result, large memory capacity (i.e., “fat”) compute nodes that provide low latency access to data are the enabling technology that can compete with and beat massively parallel accelerators at their own game. Research at the University of Utah [PDF] shows a single large memory (three terabyte) workstation can deliver competitive and even superior interactive rendering performance compared to a 128-node GPU cluster. The University of Utah group is also exploring in-situ visualization using P-k-d trees and other fast, in-core approaches [PDF] to show that large “direct” in-core techniques are viable alternatives to traditional HPC visualization approaches.

In a second example, KAUST has been enhancing the ecosystem of numerical tools for multi-core and many-core processors in collaboration with Intel and the Tokyo Institute of Technology. Think of processing really big billion by billion sized matrices using CPU technology in a mathematically and computationally efficient manner.

The importance of these contributions in linear algebra and Fast Multipole Methods (FMM) can be appreciated by non-HPC scientists as numerical linear algebra is at the root of nearly all applications in engineering, physics, data science, and machine learning. FMM has been listed as one of the top ten algorithms of the 20th century.

Results show that HPC scientists can now solve dense linear algebra problems and FMM-related numerical problems that are larger, and solve them faster, than is possible using current highly optimized libraries such as the Intel Math Kernel Library (Intel MKL) running on the same hardware. These methods have been made available in highly optimized libraries named ExaFMM and HiCMA.

Looking to the future: Neuromorphic and Quantum Computing

The new neuromorphic test chips codenamed Loihi may represent a phase change in AI because they “self-learn”. Currently, data scientists spend a significant amount of time working with data to create training sets that are used to train a neural network to solve a complex problem. Neuromorphic chips eliminate the need for a human to create a training set (e.g., no human in the loop). Instead, humans need to validate the accuracy once the neuromorphic hardware has found a solution.

Succinctly, neuromorphic computing utilizes an entirely different computational model than traditional neural networks used in machine and deep learning. This model more accurately mimics how biological brains operate so neuromorphic chips can “learn” in an event driven fashion simply by observing their environment. Further, they operate in a remarkably energy efficient manner. Time will tell if and when this provides an advantage. The good news is that neuromorphic hardware is now becoming available.

Gara states that the goal is to create a programmable architecture that delivers >100x energy efficiency over current architectures to solve hard AI problems efficiently. He provided examples such as sparse coding, dictionary learning, constraint satisfaction, pattern matching, and dynamic learning and adaptation.

Finally, Gara described advances in quantum computing that are being made possible through a collaboration with Delft University to make better Qubits (a Quantum Bit), improve connectivity between Qubits, and develop scalable IO. Quantum computing is non-intuitive because most people don’t intuitively grasp the idea of entanglement or something being in multiple states at the same time. Still the web contains excellent resources such as Quantum computing 101 at the University of Waterloo to help people make sense of this technology that is rapidly improving and, if realized, will change our computing universe forever.

Quantum computing holds the possibility of solving problems that are currently intractable on general purpose computers. Gara highlighted applications of the current Intel quantum computing efforts in quantum chemistry, microarchitecture and algorithm co-design, and post-quantum secure cryptography.


We are now seeing the introduction of new computing, storage, and manufacturing technologies that are forcing the AI and HPC communities to rethink their traditional approaches so they can use these ever more performant, scalable, and configurable architectures. As Al Gara pointed out, these technologies are causing a unification of the “3 pillars” which, in turn, makes the futures of AI and HPC in the data center indistinguishable from each other.

Rob Farber is a global technology consultant and author with an extensive background in HPC and in developing machine learning technology that he applies at national labs and commercial organizations. Rob can be reached at info@techenablement.com

The post HPC and AI – Two Communities Same Future appeared first on HPCwire.

Purdue-Affiliated Startup Designing Hardware and Software for Deep Learning

Thu, 01/25/2018 - 10:22

WEST LAFAYETTE, Ind., Jan. 25, 2018 – A Purdue University-affiliated startup is designing next-generation hardware and software for deep learning aimed at enabling computers to understand the world in the same way humans do.

FWDNXT, based in the Purdue Research Park, has developed a low-power mobile coprocessor called Snowflake for accelerating deep neural networks effective at image recognition and classification. Snowflake was designed with the primary goal of optimizing computational efficiency by processing multiple streams of information to mix deep learning and artificial intelligence techniques with augmented reality applications.

“Everybody was looking for a solution like this,” said Eugenio Culurciello, an associate professor in the Weldon School of Biomedical Engineering at Purdue. “We have a special computer that can operate on large data very fast with low power consumption. Our mission is to propel machine intelligence to the next level.”

Culurciello says the goal of FWDNXT is to enable computers to understand the environment so computers, phones, tablets, wearables and robots can be helpful in daily activities. FWDNXT uses innovative algorithms to differentiate items the same ways humans do.

This is a Field Programmable Gate Array module from Micron, where FWDNXT installs Snowflake, a deep neural network accelerator. Snowflake was designed with the primary goal of optimizing computational efficiency by processing multiple streams of information to mix deep learning and artificial intelligence techniques with augmented reality applications. Snowflake is effective at image recognition and classification. (Image provided by FWDNXT)

Culurciello says Snowflake is able to achieve a computational efficiency of more than 91 percent on entire convolutional neural networks, which are the deep learning model of choice for performing object detection, classification, semantic segmentation and natural language processing tasks. Snowflake also is able to achieve 99 percent efficiency on some individual layers.

FWDNXT has shown expertise in scene analysis and scene parsing, which allows the computer to perceive the outside environment. That ability is among the most difficult challenges in augmented reality content, Culurciello says.

FWDNXT’s innovation in hardware and software will be used to drive cars autonomously, to recognize faces for security and other purposes and numerous other day-to-day purposes, such as helping people find items on their shopping lists as they walk down a store aisle or smart appliances recognizing a user’s preferences.

Culurciello says there has been remarkable development in machine learning in the past five years as computer scientists turned to specialized chips to do more complex computing instead of depending on a central processing unit.

“It can operate on large data very fast with low power consumption,” Culurciello said. “We want to have the maximum performance with the minimal energy.”

FWDNXT was able to create a new, efficient computer architecture with funding originally from grants it received from Purdue and the Navy, including one worth nearly $1 million.

Culurciello says FWDNXT wants to make microchips that will be used in virtually all smart devices – for instance, in cars to start them, in appliances to recognize people’s preferences, in mobile phones to listen to voices.

Culurciello says FWDNXT has found a strategic partner and has obtained multimillion-dollar funding, and the next step is to seek Series A funding. FWDNXT also is looking to add to its team, which already includes Ali Zaidy, the lead architect designer of Snowflake and a deep learning expert; Abhishek Chaurasia, the team’s deep learning lead developer; Andre Chang, architect and compiler of deep learning; and Marko Vitez, neural network optimization wizard.

Another big milestone will be to develop a prototype microchip due in the first half of 2018. FWDNXT has shown it can run on FPGA, but a custom microchip would make it even more efficient.

FWDNXT hopes to be selling programmable logic prototypes soon and hopes to be able to sell the initial microchips to preferred customers in the next year or so.

FWDNXT has filed patent applications related to Snowflake through the Purdue Office of Technology Commercialization. Culurciello also credits the Purdue Foundry, an entrepreneurship and commercialization accelerator in Discovery Park’s Burton D. Morgan Center for Entrepreneurship at Purdue, with helping FWDNXT come up with a business plan that will help attract investors and expand its team.

About Purdue Foundry

The Purdue Foundry is an entrepreneurship and commercialization accelerator in Discovery Park’s Burton D. Morgan Center for Entrepreneurship whose professionals help Purdue innovators create startups. Managed by the Purdue Research Foundation, the Purdue Foundry was named a top recipient at the 2016 Innovation and Economic Prosperity Universities Designation and Awards Program by the Association of Public and Land-grant Universities for its work in entrepreneurship. For more information about funding and investment opportunities in startups based on a Purdue innovation, contact the Purdue Foundry at foundry@prf.org.

About Purdue Office of Technology Commercialization

The Purdue Office of Technology Commercialization operates one of the most comprehensive technology transfer programs among leading research universities in the U.S. Services provided by this office support the economic development initiatives of Purdue University and benefit the university’s academic activities. The office is managed by the Purdue Research Foundation, which received the 2016 Innovation and Economic Prosperity Universities Award for Innovation from the Association of Public and Land-grant Universities. For more information about funding and investment opportunities in startups based on a Purdue innovation, contact the Purdue Foundry at foundry@prf.org. For more information on licensing a Purdue innovation, contact the Office of Technology Commercialization at innovation@prf.org.

Source: Purdue University

The post Purdue-Affiliated Startup Designing Hardware and Software for Deep Learning appeared first on HPCwire.

TACC Highlights Science and Engineering Problems Solved with Supercomputers and AI

Thu, 01/25/2018 - 08:20

Jan. 25, 2018 – Artificial intelligence represents a new approach scientists can use to interrogate data, develop hypotheses, and make predictions, particularly in areas where no overarching theory exists.

Traditional applications on supercomputers (also known as high-performance computers [HPC]) start from “first principles,” typically mathematical formulas representing the physics of a natural system, and then transform them into a problem that can be solved by distributing the calculations to many processors.

By contrast, machine learning and deep learning — two subsets of the field of artificial intelligence — take advantage of the availability of powerful computers and very large datasets to find subtle correlations in data and rapidly simulate, test and optimize solutions. These capabilities enable scientists to derive the governing models (or workable analogs) for complex systems that cannot be modeled from first principles.

Machine learning involves using a variety of algorithms that “learn” from data and improve performance based on real-world experience. Deep learning, a branch of machine learning, relies on large data sets to iteratively “train” many-layered neural networks, inspired by the human brain. These trained neural networks are then used to “infer” the meaning of new data.

Training can be a complex and time-consuming activity, but once a model has been trained, it is fast and easy to interpret each new piece of data accordingly in order to recognize, for example, cancerous versus healthy brain tissue or to enable a self-driving vehicle to identify a pedestrian crossing a street.
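The split between slow iterative training and fast inference can be sketched in a few lines of Python. The example below trains a one-parameter-pair logistic classifier by gradient descent on a tiny hand-made data set and then classifies new values. The data, learning rate, epoch count, and function names are all assumptions made for illustration, and a real deep network has many layers rather than a single weight and bias.

```python
import math

def sigmoid(z):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, epochs=200, lr=0.5):
    """Iteratively adjust weight w and bias b to fit labeled (x, y) samples."""
    w, b = 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        # Mean gradient of the logistic loss over the whole training set.
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in samples) / n
        grad_b = sum((sigmoid(w * x + b) - y) for x, y in samples) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def infer(model, x):
    """Apply the trained model to a new input: a single cheap evaluation."""
    w, b = model
    return 1 if sigmoid(w * x + b) >= 0.5 else 0

# Hypothetical training set: label 1 for positive x, label 0 for negative x.
samples = [(-2.0, 0), (-1.5, 0), (-1.0, 0), (-0.5, 0),
           (0.5, 1), (1.0, 1), (1.5, 1), (2.0, 1)]

model = train(samples)
print("learned (w, b):", model)
print("infer(3.0)  ->", infer(model, 3.0))
print("infer(-3.0) ->", infer(model, -3.0))
```

Training loops over the whole data set many times, while each inference is one function evaluation, which mirrors the cost asymmetry described above.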

In Search of Deep Learning Trainers: Heavy Computation Required

Researchers are using Stampede2 — a Dell/Intel system at the Texas Advanced Computing Center (TACC) that is one of the world’s fastest supercomputers and the fastest at any U.S. university — to advance machine and deep learning. Image courtesy of TACC.

Just like traditional HPC, training a deep neural network or running a machine learning algorithm requires extremely large numbers of computations (quintillions!) – theoretically making them a good fit for supercomputers and their large numbers of parallel processors.

Training a deep neural network to act as an image classifier, for instance, requires roughly 10^18 single precision operations (an exaFLOP). Stampede2, a Dell/Intel system at the Texas Advanced Computing Center (TACC) that is one of the world’s fastest supercomputers and the fastest at any U.S. university, can perform approximately two times 10^16 operations per second.

Logically, supercomputers should be able to train deep neural networks rapidly. But in the past, such training has required hours, days or even months to complete (as was the case with Google’s AlphaGo).
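A back-of-the-envelope calculation shows why rapid training seems plausible on paper. The Python sketch below divides the roughly 10^18 operations cited for training an image classifier by a machine rate of about two times 10^16 operations per second. Treating the second figure as a per-second peak rate, and the efficiency values used, are assumptions of this sketch.

```python
TRAINING_OPS = 10**18        # operations to train the classifier (figure from the article)
PEAK_OPS_PER_SEC = 2 * 10**16  # assumed peak machine rate, in operations per second

def ideal_training_seconds(total_ops, ops_per_sec, efficiency=1.0):
    """Time to finish total_ops if only a fraction `efficiency` of the peak rate is achieved."""
    return total_ops / (ops_per_sec * efficiency)

best_case = ideal_training_seconds(TRAINING_OPS, PEAK_OPS_PER_SEC)
print(f"at 100% of peak: {best_case:.0f} seconds")

# Hypothetical efficiencies, to show how quickly the estimate stretches.
for eff in (0.10, 0.01):
    seconds = ideal_training_seconds(TRAINING_OPS, PEAK_OPS_PER_SEC, eff)
    print(f"at {eff:.0%} of peak: {seconds / 60:.1f} minutes")
```

The gap between this idealized figure and real training times comes from everything the division ignores, such as communication between processors, memory bandwidth, and how well the software uses the hardware.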

Overcoming Bottlenecks in Neural Networks

With frameworks optimized for modern CPUs, however, experts have recently been able to train deep neural network models in minutes. For instance, researchers from TACC, the University of California, Berkeley and the University of California, Davis used 1024 Intel Xeon Scalable processors to complete a 100-epoch ImageNet training with AlexNet in 11 minutes, the fastest that such training has ever been reported. Furthermore, they were able to scale to 1600 Intel Xeon Scalable processors and finish the 90-epoch ImageNet training with ResNet-50 in 31 minutes without losing accuracy.

These efforts at TACC (and similar ones elsewhere) show that one can effectively overcome bottlenecks in fast deep neural network training with high-performance computing systems by using well-optimized kernels and libraries, employing hyper-threading, and sizing the batches of training data properly.
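Batch sizing matters because, in a common synchronous data-parallel scheme, each processor computes a gradient on its own shard of a global batch and the results are averaged. The minimal Python sketch below shows that, for equal-size shards, the averaged gradient matches the gradient of the full batch. The one-parameter model, the data, and the helper names are assumptions for illustration and do not reproduce the researchers' actual setup.

```python
def gradient(w, batch):
    """Mean gradient of the squared error (w*x - y)**2 with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)

def split(batch, workers):
    """Split a batch into equal-size shards, one per worker."""
    size = len(batch) // workers
    return [batch[i * size:(i + 1) * size] for i in range(workers)]

# Hypothetical global batch: 8 samples drawn from y = 3x.
batch = [(float(x), 3.0 * x) for x in range(1, 9)]
w = 0.5  # current model parameter

full_gradient = gradient(w, batch)

# Each "worker" sees only its shard; the per-worker gradients are then averaged.
shards = split(batch, workers=4)
averaged_gradient = sum(gradient(w, s) for s in shards) / len(shards)

print("full-batch gradient:", full_gradient)
print("averaged over shards:", averaged_gradient)
```

Adding processors at a fixed per-processor shard size therefore enlarges the global batch, which is one reason the batch size has to be chosen with care when scaling out.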

In addition to Caffe, which the researchers used for the ImageNet training, TACC also supports other popular CPU- and GPU-optimized deep learning frameworks, such as MXNet and TensorFlow, and is creating an extensive environment for machine and deep learning research.

Though mostly done as a proof-of-concept showing how HPC can be used for deep learning, high-speed, high-accuracy image classification can be useful in characterizing satellite imagery for environmental monitoring or labeling nanoscience images obtained by scanning electron microscope.

This fast training will impact the speed of science, as well as the kind of science that researchers can explore with these new methods.

Successes in Critical Applications

While TACC staff explore the potential of HPC for artificial intelligence, researchers from around the country are using TACC supercomputers to apply machine learning and deep learning to science and engineering problems ranging from healthcare to transportation.

For instance, researchers from Tufts University and the University of Maryland, Baltimore County, used Stampede1 to uncover the cell signaling network that determines tadpole coloration. The research helped identify the various genes and feedback mechanisms that control this aspect of pigmentation (which is related to melanoma in humans) and reverse-engineered never-before-seen mixed coloration in the animals.

They are exploring the possibility of using this method to uncover the cell signaling that underlies various forms of cancer so new therapies can be developed.

In another impressive project, deep learning experts at TACC collaborated with researchers at the University of Texas Center for Transportation Research and the City of Austin to automatically detect vehicles and pedestrians at critical intersections throughout the city using machine learning and video image analysis.

The work will help officials analyze traffic patterns to understand infrastructure needs and increase safety and efficiency in the city. (Results of the large-scale traffic analyses were presented at IEEE Big Data in December 2017 and the Transportation Research Board Annual Meeting in January 2018.)

In another project, George Biros, a mechanical engineering professor at the University of Texas at Austin, used Stampede2 to train a brain tumor classification system that can identify brain tumors (gliomas) and different types of cancerous regions with greater than 90 percent accuracy, roughly equivalent to that of an experienced radiologist.

The image analysis framework will be deployed at the University of Pennsylvania for various clinical studies of gliomas.

Through these and other research and research-enabling efforts, TACC has shown that HPC architectures are well suited to machine learning and deep learning frameworks and algorithms. Using these approaches in diverse fields, scientists are beginning to develop solutions that will have near-term impacts on health and safety, not to mention materials science, synthetic biology and basic physics.

The Artificial Intelligence at TACC Special Report showcases notable examples for this growing area of research. Check back for more advances and applications.

Source: Aaron Dubrow, TACC

The post TACC Highlights Science and Engineering Problems Solved with Supercomputers and AI appeared first on HPCwire.

San Diego State Researchers Advance 3D Earthquake Simulation

Wed, 01/24/2018 - 11:39

Working with data from the Landers earthquake that shook Southern California in 1992, a team of researchers from San Diego State University has advanced earthquake simulation capability by improving a widely-used wavefield simulation code and adapting it for improved 3D modeling and for use on high-end HPC systems. Their research delivered new insight into strike-slip earthquakes such as the magnitude 7.3 Landers earthquake, which leveled homes, sparked fires, cracked roads and caused one death.

The code used by the team is their updated version of anelastic wave propagation (AWP-ODC) – the ODC standing for developers Kim Olsen and Steven Day at San Diego State University (SDSU) and Yifeng Cui at the San Diego Supercomputer Center. Daniel Roten, a computational research seismologist at SDSU, led the studies.

The research showed how one earthquake can deliver building-collapsing shakes to some areas but not to others, and the Landers simulation helps solve a long-standing puzzle (more below) of earthquake science. The work was supported by a Department of Energy Innovative and Novel Computational Impact on Theory and Experiment (INCITE) award and by the Southern California Earthquake Center (SCEC). A full account of the work is posted on the DoE Office of Science web site.

Many earthquake simulations use either linear models of forces in three dimensions or nonlinear models of forces in one or two dimensions. These demand less computer time and memory than three-dimensional nonlinear models but they do not capture true relationships between the forces and their effects. For example, linear models typically predict more violent shaking than actually occurs, producing inaccurate risk and hazard assessments.
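
As a rough illustration of why a linear model can overpredict shaking, the sketch below compares a purely linear elastic stress-strain law with an elastic-perfectly-plastic one in which stress is capped at a yield value. This is not the AWP-ODC formulation; the function names and the numerical values (shear modulus, yield stress, strain levels) are assumptions chosen only for illustration.

```python
# Illustrative only: not the method used in AWP-ODC.
# All names and numerical values here are hypothetical.

def linear_stress(strain, shear_modulus):
    """Linear elastic response: stress keeps growing in proportion to strain."""
    return shear_modulus * strain


def plastic_stress(strain, shear_modulus, yield_stress):
    """Elastic-perfectly-plastic response under monotonic loading.

    Stress follows the linear law until it reaches the yield stress,
    then stays capped at that value (in either direction).
    """
    elastic = shear_modulus * strain
    return max(-yield_stress, min(yield_stress, elastic))


if __name__ == "__main__":
    shear_modulus = 1.0e9  # Pa, assumed value
    yield_stress = 5.0e5   # Pa, assumed value
    for strain in (1.0e-4, 5.0e-4, 1.0e-3):
        lin = linear_stress(strain, shear_modulus)
        non = plastic_stress(strain, shear_modulus, yield_stress)
        print(f"strain={strain:.0e}  linear={lin:.3g} Pa  capped={non:.3g} Pa")
```

Below the yield point the two laws agree; beyond it the linear law keeps increasing while the capped law does not, which is the sense in which a linear model can overstate the forces that drive shaking.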

Key to the team’s success was its ability to model nonlinear phenomena in three dimensions using a code that can be scaled up to run well on large HPC systems, namely the Blue Waters machine at the University of Illinois at Urbana-Champaign and Titan at the Oak Ridge Leadership Computing Facility (OLCF), a DOE Office of Science user facility. The team also used two additional OLCF systems to process data and to store results and input data files.

Snapshots from linear (left) and nonlinear (right) simulations showing wave propagation during a magnitude 7.7 earthquake rupturing the San Andreas fault from southeast to northwest. Depicted here: intensity of shaking (red-blue) and permanent ground deformation (green, nonlinear only). The jagged lines at center represent a seismogram reading near San Bernardino. Plastic deformation reduces the intensity of shaking in the San Bernardino valley; the nonlinear simulation allows a more accurate ground-motion prediction than the linear one.

The researchers plan to continue to develop the code for faster HPC systems. Roten says they developed the nonlinear method for CPU systems, which use standard central processing units, before they ported the implementation to the GPU version of the code, one employing graphics processing units to accelerate calculations. The code runs about four times faster on a GPU system than on CPUs, he adds. The team optimized the code to run even faster and to reduce the amount of memory it required, since GPUs have less memory available than CPUs.

“Parallel file systems and parallel I/O (input/output) are also important for these simulations, as we are dealing with a lot of input and output data,” Roten says. “Our input source file alone had a size of 52 terabytes.”

The GPU version does not presently handle all of the features needed. For now, the code takes advantage of Blue Waters’ mix of CPU and faster GPU nodes while the team develops the code to work exclusively on Titan’s GPU nodes. “Titan would have enough GPUs to further scale up the problem, which is what we plan to do,” Roten says.

The long-standing puzzle the Landers quake simulation addressed pertained to strike-slip earthquakes, in which rocks deep underground slide past each other, or slip, causing surface rock and soils to shift with them. Slips can cause dramatic surface changes, such as broken roads that are shifted so the lanes no longer line up.

But after studying the Landers earthquake and other strike-slip quakes with magnitudes higher than 7, scientists realized these observations were not as straightforward as they seemed. “Geologists and geophysicists were surprised to see that the slip at depth, inferred from satellite observations, is larger than slip observed at the surface, from shifts measured by geologists in the field,” Roten says.

Source: DoE
Link to full article: http://ascr-discovery.science.doe.gov/2018/01/the-big-one-in-3-d/

The post San Diego State Researchers Advance 3D Earthquake Simulation appeared first on HPCwire.