HPC Wire

Since 1987 - Covering the Fastest Computers in the World and the People Who Run Them

Tsinghua University Wins ASC17 Championship Big Time

Fri, 04/28/2017 - 09:45

On April 28, the final round of the 2017 ASC Student Supercomputer Challenge (ASC17) ended in Wuxi. Tsinghua University stood out from 20 teams from around the world after a fierce one-week competition, becoming grand champion and winning the e Prize.

Tsinghua University secured the ASC17 championship

As the world’s largest supercomputing competition, ASC17 received applications from 230 universities around the world, 20 of which made it through the qualifying rounds to the final held this week at the National Supercomputing Center in Wuxi. During the final round, the student teams were required to independently design a supercomputing system within a 3,000 W power cap. They also had to run and optimize standard international benchmarks and a variety of cutting-edge scientific and engineering applications, including AI-based traffic prediction, genome assembly, and materials science. Moreover, they were required to complete a high-resolution maritime simulation on the world’s fastest supercomputer, “Sunway TaihuLight.”

The grand champion, the Tsinghua University team, completed a deep parallel optimization of the high-resolution maritime simulation model MASNUM on TaihuLight, scaling the original program to 10,000 cores and speeding it up by a factor of 392. This work earned the Tsinghua University team the e Prize. MASNUM was nominated in 2016 for the Gordon Bell Prize, the top international prize in the supercomputing applications field.

The runner-up, Beihang University, gave an outstanding performance in the popular AI field. After building a supercomputing system and training their self-developed deep neural network on a large set of historical traffic data provided by Baidu, the team produced the most accurate prediction of road conditions during the morning peak.

First-time finalist Weifang University built a highly optimized heterogeneous supercomputing system from Inspur supercomputing servers and ran the international HPL benchmark, setting a new student-competition world record of 31.7 TFLOPS for floating-point performance. The team turned out to be the biggest surprise of the event and won the award for best computing performance.

Moreover, Ural Federal University, National Tsing Hua University, Northwestern Polytechnical University and Shanghai Jiao Tong University won the application innovation award. The popular choice award was shared by Saint-Petersburg State University and Zhengzhou University.

“It is great to see the presence of global teams in this event,” Jack Dongarra, chairman of the ASC Expert Committee, founder of the TOP500 list that ranks the 500 most powerful supercomputer systems in the world, a researcher at Oak Ridge National Laboratory, and a professor at the University of Tennessee, said in an interview. “This event inspired students to gain advanced scientific knowledge. TaihuLight is an amazing platform for this event. Just imagine the interconnected computation of everyone’s computer in a gymnasium housing 100,000 persons; TaihuLight’s capacity is 100 times that of such a gym. This is something none of the teams will ever be able to experience again.”

According to Wang Endong, initiator of the ASC competition, academician of the Chinese Academy of Engineering, and chief scientist of Inspur Group, the rapid development of AI is profoundly changing human society, and at the core of that development are computing, data, and algorithms. With this trend, supercomputers will become an important part of the infrastructure of an intelligent society, and their pace of development and level of capability will be closely tied to social development, improvements in livelihood, and the progress of civilization. The ASC competition is committed to cultivating future-oriented, interdisciplinary supercomputing talent and to extending the benefits of supercomputing to the greater population.

ASC17 is jointly organized by the Asian Supercomputing Community, Inspur Group, the National Supercomputing Center in Wuxi, and Zhengzhou University. Initiated by China, the ASC supercomputing challenge aims to be the platform to promote exchanges among young supercomputing talent from different countries and regions, as well as to groom young talent. It also aims to be the key driving force in promoting technological and industrial innovations by improving the standards in supercomputing applications and research.


ISC 2017 Early-Bird Registration Ends May 10

Fri, 04/28/2017 - 09:07

FRANKFURT, Germany, April 28, 2017 — The early bird registration for the 2017 ISC High Performance conference will come to an end in less than two weeks and we would like to encourage participants not to wait until the last minute and miss out on the opportunity to save over 45 percent off the on-site rates.

The conference will once again be held at Messe Frankfurt in Germany and is expected to have an attendance of over 3,000 participants from around the globe. So far 146 companies have signed up to exhibit, and with very few booth spaces still left for booking, we will likely end up with a total of 150 exhibitors on the 2017 show floor.

Here is an overview of the different passes available to ISC 2017 attendees:

  • Conference Pass: The conference pass gives access to all sessions from Monday, June 19, through Wednesday, June 21, as well as the exhibition and all social events.
  • Exhibition Pass: Pass holders have access to the exhibition (and all related activities), all Birds-of-a-Feather (BoF) sessions, the PhD Forum, the Vendor Showdown, the ISC Student Cluster Competition awarding and the Welcome Party on Monday, June 19.
  • Tutorial Pass: This pass provides access to the 13 interactive tutorials on Sunday, June 18.
  • Workshop Pass: This pass gives access to the 21 workshops that will take place on Thursday, June 22 at the Frankfurt Marriott Hotel.

This year the conference will cover a broad array of topics, such as exascale development, deep learning, big data, extreme-scale algorithms and new programming models. The conference also provides insights into a wide range of performance-demanding applications, including product prototyping, earthquake prediction, transportation logistics, energy exploration, and drug design, to name a few.

ISC Tutorials (Sunday, June 18)

If you are interested in broadening your knowledge in key HPC, networking and storage topics, consider attending the ISC tutorials. Renowned experts in their respective fields will give attendees a comprehensive introduction to the topic as well as providing a closer look at specific problems. They will also incorporate hands-on components where appropriate. The organizers are expecting over 300 attendees. Here is an overview of the five full-day and eight half-day tutorials.

ISC Workshops (Thursday, June 22)

The ISC Workshops are well known to attract over 600 researchers and commercial users interested in learning more about current developments in specific areas of HPC. There are 21 unique workshops to choose from, and among them eight are full-day workshops. Click here to find out more about the individual workshops.

Other interesting program elements include:

About ISC High Performance

First held in 1986, ISC High Performance is the world’s oldest and Europe’s most important conference and networking event for the HPC community. It offers a strong five-day technical program focusing on HPC technological development and its application in scientific fields, as well as its adoption in commercial environments.

ISC High Performance attracts engineers, IT specialists, system developers, vendors, scientists, researchers, students, journalists, and other members of the global HPC community. The exhibition draws decision-makers from automotive, finance, defense, aeronautical, gas & oil, banking, pharmaceutical and other industries, as well as those providing hardware, software and services for the HPC community. Attendees will learn firsthand about new products and applications, in addition to the latest technological advances in the HPC industry.

Source: ISC


Intel, JSC Collaborate to Deploy Next-Gen Modular Supercomputer

Fri, 04/28/2017 - 08:06

JÜLICH, April 28, 2017 — Intel and Forschungszentrum Jülich, together with ParTec and Dell, today announced a cooperation to develop and deploy a next-generation modular supercomputing system. Leveraging the experience and results gained in the EU-funded DEEP and DEEP-ER projects, in which three of the partners have been strongly engaged, the group will develop the mechanisms required to augment JSC’s JURECA cluster with a highly scalable component named “Booster,” based on Intel’s Scalable Systems Framework (Intel SSF).

“This will be the first-ever demonstration in a production environment of the Cluster-Booster concept, pioneered in DEEP and DEEP-ER at prototype level, and a considerable step towards the implementation of JSC’s modular supercomputing concept,” explains Prof. Thomas Lippert, Director of the Jülich Supercomputing Centre. Modular supercomputing is a new paradigm that reflects the diversity of execution characteristics found in modern simulation codes directly in the architecture of the supercomputer. Instead of a homogeneous design, different modules with distinct hardware characteristics are exposed via a homogeneous global software layer that enables optimal resource assignment.

Code parts of a simulation that can only be parallelized up to a limited concurrency level stay on the Cluster – equipped with faster general-purpose processor cores – while the highly parallelizable parts are to run on the weaker Booster cores but at much higher concurrency. In this way increased scalability and significantly higher efficiency with lower energy consumption can be reached, addressing both big data analytics and Exascale simulation capabilities.
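To make that division of labor concrete, here is a minimal MPI sketch, an illustration only and not the actual DEEP/ParaStation software, of how an application might split its ranks between a cluster module and a booster module; the cutoff of four cluster ranks is an arbitrary assumption for the example.

```c
/* Illustrative sketch only (not JSC's actual Cluster-Booster software):
 * split MPI ranks into a low-concurrency "cluster" group and a highly
 * parallel "booster" group, each with its own communicator. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int world_rank, world_size;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    /* Assumption for the sketch: the first 4 ranks play the cluster role. */
    int is_booster = (world_rank >= 4) ? 1 : 0;

    MPI_Comm module_comm;
    MPI_Comm_split(MPI_COMM_WORLD, is_booster, world_rank, &module_comm);

    if (!is_booster) {
        /* Limited-concurrency phase: setup, I/O, implicit solver, etc. */
        printf("cluster rank %d: low-concurrency work\n", world_rank);
    } else {
        /* Highly parallel phase: e.g., an explicit sweep at high concurrency. */
        printf("booster rank %d: highly parallel work\n", world_rank);
    }

    MPI_Comm_free(&module_comm);
    MPI_Finalize();
    return 0;
}
```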

Technical Specifications

The JURECA Booster will use Intel Xeon Phi 7250-F processors with on-package Intel Omni-Path Architecture interfaces. The system will be delivered by Intel with its subcontractor Dell, utilizing Dell’s PowerEdge C6230P servers. Once installed, it will provide a peak performance of 5 Petaflop/s. The system was co-designed by Intel and JSC to enable maximum scalability for large-scale simulations. The JURECA Booster will be directly connected to the JURECA cluster, a system delivered by T-Platforms in 2015, and both modules will be operated as a single system. As part of the project a novel high-speed bridging mechanism between JURECA’s InfiniBand EDR and the Boosters’ Intel Omni-Path Architecture interconnect will be developed by the group of partners. Together with the modularity features of ParTec’s ParaStation ClusterSuite, this will enable efficient usage of the whole system by applications flexibly distributed across the modules.

Supercomputer JURECA at the Jülich Supercomputing Centre (JSC)
Copyright: Forschungszentrum Jülich / Ralf-Uwe Limbach

Source: Jülich Supercomputing Centre


Supermicro Announces 3rd Quarter 2017 Financial Results

Fri, 04/28/2017 - 07:53

SAN JOSE, Calif., April 28, 2017 — Super Micro Computer, Inc. (NASDAQ:SMCI), a global leader in high-performance, high-efficiency server, storage technology and green computing, today announced third quarter fiscal 2017 financial results for the quarter ended March 31, 2017.

Fiscal 3rd Quarter Highlights

  • Quarterly net sales of $631.1 million, down 3.2% from the second quarter of fiscal year 2017 and up 18.5% from the same quarter of last year.
  • GAAP net income of $16.7 million, down 24.2% from the second quarter of fiscal year 2017 and equal to the same quarter of last year.
  • GAAP gross margin was 14.0%, down from 14.3% in the second quarter of fiscal year 2017 and down from 14.9% in the same quarter of last year.
  • Server solutions accounted for 70.0% of net sales compared with 68.1% in the second quarter of fiscal year 2017 and 69.9% in the same quarter of last year.

Net sales for the third quarter ended March 31, 2017 totaled $631.1 million, down 3.2% from $652.0 million in the second quarter of fiscal year 2017. No customer accounted for more than 10% of net sales during the quarter ended March 31, 2017.

GAAP net income for the third quarter of fiscal year 2017 and for the same period a year ago was $16.7 million, or $0.32 per diluted share, in both periods. Included in net income for the quarter is $4.8 million of stock-based compensation expense (pre-tax). Excluding this item and the related tax effect, non-GAAP net income for the third quarter was $20.3 million, or $0.38 per diluted share, compared to non-GAAP net income of $19.0 million, or $0.36 per diluted share, in the same quarter of the prior year. On a sequential basis, non-GAAP net income decreased from the second quarter of fiscal year 2017 by $4.7 million, or $0.10 per diluted share.

GAAP gross margin for the third quarter of fiscal year 2017 was 14.0% compared to 14.9% in the same period a year ago. Non-GAAP gross margin for the third quarter was 14.0% compared to 14.9% in the same period a year ago. GAAP gross margin for the second quarter of fiscal year 2017 was 14.3% and Non-GAAP gross margin for the second quarter of fiscal year 2017 was 14.4%.

The GAAP income tax provision for the third quarter of fiscal year 2017 was $5.1 million or 23.6% of income before tax provision compared to $7.4 million or 30.7% in the same period a year ago and $9.3 million or 29.7% in the second quarter of fiscal year 2017. The effective tax rate for the third quarter of fiscal year 2017 was lower primarily due to a tax benefit resulting from the completion of an income tax audit in a foreign jurisdiction.

The Company’s cash and cash equivalents and short and long term investments at March 31, 2017 were $110.5 million compared to $183.7 million at June 30, 2016. Free cash flow for the nine months ended March 31, 2017 was $(113.5) million, primarily due to an increase in the Company’s cash used in operating activities.

Business Outlook & Management Commentary

The Company expects net sales of $655 million to $715 million for the fourth quarter of fiscal year 2017 ending June 30, 2017. The Company expects non-GAAP earnings per diluted share of approximately $0.40 to $0.50 for the fourth quarter.

“We are pleased to report third quarter revenues that exceeded our guidance in a quarter complicated by shortages in memory and SSD. Our resurgent revenue growth and market share gains are a result of our strategy of developing vertical markets that expand our TAMs. Storage, IOT, Accelerated Computing, Enterprise and Asia contributed to the 18.5% growth from last year,” said Charles Liang, Chairman and Chief Executive Officer. “Supermicro’s preparation for the upcoming new Xeon processor launches has never been stronger and our traction with new customer engagement for seeding and early deployment has been outstanding. We expect to lead the industry with the most innovative platform architectures, the broadest product array and total solutions during the upcoming technology transitions.”

It is currently expected that the outlook will not be updated until the Company’s next quarterly earnings announcement, notwithstanding subsequent developments. However, the Company may update the outlook or any portion thereof at any time. Such updates will take place only by way of a news release or other broadly disseminated disclosure available to all interested parties in accordance with Regulation FD.

 

 

Use of Non-GAAP Financial Measures

Non-GAAP gross margin discussed in this press release excludes stock-based compensation expense. Non-GAAP net income and net income per share discussed in this press release exclude stock-based compensation expense and the related tax effect of the applicable items. Management presents non-GAAP financial measures because it considers them to be important supplemental measures of performance. Management uses the non-GAAP financial measures for planning purposes, including analysis of the Company’s performance against prior periods, the preparation of operating budgets and to determine appropriate levels of operating and capital investments. Management also believes that the non-GAAP financial measures provide additional insight for analysts and investors in evaluating the Company’s financial and operational performance. However, these non-GAAP financial measures have limitations as analytical tools and are not intended to be an alternative to financial measures prepared in accordance with GAAP. Pursuant to the requirements of SEC Regulation G, detailed reconciliations between the Company’s GAAP and non-GAAP financial results are provided at the end of this press release. Investors are advised to carefully review and consider this information as well as the GAAP financial results that are disclosed in the Company’s SEC filings.
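As a rough, unofficial illustration of that reconciliation, the following minimal C sketch applies the stated exclusions to the Q3 figures quoted above; the implied tax effect and share count are derived from the reported numbers, not separately disclosed by the Company.

```c
/* Back-of-the-envelope reconciliation using the Q3 FY2017 figures quoted above.
 * The implied tax effect and share count are inferred, not reported values. */
#include <stdio.h>

int main(void) {
    double gaap_net_income = 16.7;  /* $ millions, as reported            */
    double sbc_expense     = 4.8;   /* pre-tax stock-based compensation   */
    double non_gaap_income = 20.3;  /* $ millions, as reported            */
    double non_gaap_eps    = 0.38;  /* per diluted share, as reported     */

    double implied_tax_effect = gaap_net_income + sbc_expense - non_gaap_income;
    double implied_shares     = non_gaap_income / non_gaap_eps;

    printf("implied tax effect of the SBC add-back: $%.1f million\n", implied_tax_effect);
    printf("implied diluted share count: ~%.1f million\n", implied_shares);
    printf("non-GAAP EPS check: $%.2f per diluted share\n", non_gaap_income / implied_shares);
    return 0;
}
```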

About Super Micro Computer, Inc.

Supermicro is a provider of end-to-end green computing solutions for Data Center, Cloud Computing, Enterprise IT, Hadoop/Big Data, HPC and Embedded Systems worldwide. Supermicro’s advanced Server Building Block Solutions offer a vast array of components for building energy-efficient, application-optimized, computing solutions. Architecture innovations include Twin, TwinPro, FatTwin, Ultra Series, MicroCloud, MicroBlade, SuperBlade, Simply Double, Double-sided Storage, Battery Backup Power (BBP) modules and WIO/UIO. Products include servers, blades, GPU systems, workstations, motherboards, chassis, power supplies, storage, networking, server management software and SuperRack cabinets/accessories delivering unrivaled performance and value.

Source: Supermicro


IARPA Launches ‘RAVEN’ to Develop Rapid Integrated Circuit Imaging Tools

Thu, 04/27/2017 - 09:46

WASHINGTON, D.C., April 27, 2017 — The Intelligence Advanced Research Projects Activity, within the Office of the Director of National Intelligence, announced today the Rapid Analysis of Various Emerging Nano-electronics—“RAVEN”—program, a multi-year research effort to develop tools to rapidly image current and future integrated circuit chips.

“As semiconductor technology continues to follow Moore’s Law, each new generation of chips has smaller geometries and more transistors. The ability to quickly image advanced chips has become extremely challenging. Maintaining this capability is critical for failure analysis, process manufacturing verification, and identification of counterfeit chips in these latest technologies,” said Carl E. McCants, RAVEN program manager at IARPA.

The goal of the RAVEN program is to develop a prototype analysis tool for acquiring images and reconstructing all layers (up to 13 metal layers) of a 10-nanometer integrated circuit chip within an analysis area of 1 square centimeter in less than 25 days. To be successful, the performer teams must create and integrate solutions to four primary challenges: image acquisition speed and resolution, rapid processing of extremely large files for image reconstruction, file manipulation and storage, and sample preparation.

The RAVEN program is divided into three phases. While each IARPA-funded research team offers a unique approach, the teams must achieve a demanding set of metrics for time, resolution, accuracy, and repeatability by the end of each phase.

Through a competitive Broad Agency Announcement process, IARPA has awarded research contracts in support of the RAVEN program to teams led by the University of Southern California-Information Sciences Institute, Varioscale, Inc., BAE Systems, and the Massachusetts Institute of Technology.

About IARPA

IARPA invests in high-risk, high-payoff research programs to tackle some of the most difficult challenges of the agencies and disciplines in the Intelligence Community. Additional information on IARPA and its research may be found on https://www.iarpa.gov.

Source: IARPA


Inspur Launches 16-GPU-Capable AI Computing Box

Thu, 04/27/2017 - 09:40

On April 26, 2017, Inspur and Baidu jointly launched the super-large-scale AI computing platform “SR-AI Rack” (SR-AI), designed for very large data sets and deep neural networks, at Inspur Partner Forum 2017 (IPF2017).

Inspur SR-AI Rack Computing Module

Compatible with China’s latest Scorpio 2.5 standard, Inspur SR-AI is the world’s first AI solution based on a PCIe fabric interconnect architecture. Coordination between the PCIe switch and the I/O box, together with the physical decoupling and pooling of GPUs and CPUs, allows large expansion nodes of 16 GPUs. The solution can support a maximum of 64 GPUs with a peak processing capability of 512 TFLOPS, 5-10 times that of regular AI solutions, making it possible to support model training with hundreds of billions of samples and trillions of parameters.

Breaking away from the tight GPU/CPU coupling of traditional servers, Inspur SR-AI connects the uplink CPU computing/scheduling nodes and the downlink GPU box through PCIe switch nodes. This arrangement allows CPUs and GPUs to be expanded independently and avoids the excessive component redundancy of traditional architecture upgrades. As a result, more than 5% of the cost can be saved, and the advantage grows as the scale expands, since no high-cost IT resources are needed for GPU expansion.

Meanwhile, Inspur SR-AI is also a 100G RDMA GPU cluster. Its RDMA (Remote Direct Memory Access) technology lets GPUs and memory exchange data directly without CPU involvement, achieving nanosecond-level network latency within the cluster, an improvement of 50% over traditional GPU expansion methods.

SR-AI Rack Topology

Through continuous exploration of AI in recent years, Inspur has built strong computing platforms and innovation capabilities. Inspur currently supplies the most diversified range of GPU servers (2U with 2/4/8 GPUs) and accounted for more than 60% of the AI computing market in 2016. Thanks to deep cooperation on systems and applications with Baidu, Alibaba, Tencent, iFLYTEK, Qihoo 360, Sogou, Toutiao, Face++, and other leading AI companies, Inspur helps customers achieve substantial improvements in application performance for voice, image, video, search, and networking workloads.

Inspur provides users and partners with advanced computing platforms, system management tools, performance optimization tools and basic algorithm integration platform software, such as face and voice recognition and other regular algorithm components, as well as Caffe-MPI deep learning computing framework, and AI-Station deep learning system management tools. In addition, Inspur offers integrated solutions for scientific research institutions and other general users. The integrated deep learning machine D1000 released in 2016 is a multi-GPU server cluster system carrying Caffe-MPI.

The annual Inspur Partner Forum is an important event for Inspur’s partners. The IPF in 2017 was held in the Wuzhen Internet International Conference & Exhibition Center of Zhejiang province, China. The forum attracted around 2000 partners across the nation, including ISVs, SIs, and distributors from various sectors.


Mellanox Reports First Quarter 2017 Results

Thu, 04/27/2017 - 08:26

SUNNYVALE, Calif. & YOKNEAM, Israel, April 27, 2017 — Mellanox Technologies, Ltd. (NASDAQ: MLNX) has announced financial results for its first quarter ended March 31, 2017.

“Our first quarter InfiniBand revenues were down year-over-year, impacted by delays in the general availability of next generation x86 CPUs, seasonal trends in high-performance computing, and technology transitions occurring across several end users and OEM customers. We believe InfiniBand has maintained share in HPC, and expect revenues will see sequential growth in the coming quarters driven by current backlog and additional pipeline opportunities,” said Eyal Waldman, president and CEO of Mellanox Technologies. “Our first quarter Ethernet revenues grew across all product families sequentially, driven by the adoption of our 25/50/100 gigabit solutions. We expect 2017 to be a growth year for Mellanox.”

First Quarter 2017 -Highlights

  • Revenues of $188.7 million decreased 4.1 percent, compared to $196.8 million in the first quarter of 2016.
  • GAAP gross margins of 65.8 percent in the first quarter, compared to 64.2 percent in the first quarter of 2016.
  • Non-GAAP gross margins of 71.7 percent in the first quarter, compared to 71.4 percent in the first quarter of 2016.
  • GAAP operating loss was $12.6 million, compared to $3.9 million in the first quarter of 2016.
  • Non-GAAP operating income was $15.7 million, or 8.3 percent of revenue, compared to $41.3 million, or 21.0 percent of revenue in the first quarter of 2016.
  • GAAP net loss was $12.2 million, compared to $7.2 million in the first quarter of 2016.
  • Non-GAAP net income was $14.7 million, compared to $39.3 million in the first quarter of 2016.
  • GAAP net loss per diluted share was $0.25 in the first quarter, compared to $0.15 in the first quarter of 2016.
  • Non-GAAP net income per diluted share was $0.29 in the first quarter, compared to $0.81 in the first quarter of 2016.
  • $35.0 million in cash was provided by operating activities, compared to $48.6 million in the first quarter of 2016.
  • Cash and investments totaled $325.2 million at March 31, 2017, compared to $328.4 million at December 31, 2016.

Second Quarter 2017 Outlook

We currently project:

  • Quarterly revenues of $205 million to $215 million
  • Non-GAAP gross margins of 70.5 percent to 71.5 percent
  • An increase in non-GAAP operating expenses of 3 percent to 5 percent
  • Share-based compensation expense of $17.3 million to $17.8 million
  • Non-GAAP diluted share count of 50.8 million to 51.3 million shares

Recent Mellanox Press Release Highlights

  • April 24, 2017: Mellanox InfiniBand Delivers up to 250 Percent Higher Return on Investment for High Performance Computing Platforms
  • April 19, 2017: Mellanox Announces New Executive Appointments
  • April 18, 2017: Mellanox 25Gb/s Ethernet Adapters Chosen By Major ODMs to Enable Next Generation Hyperscale Data Centers
  • March 20, 2017: Mellanox Doubles Silicon Photonics Ethernet Transceiver Speeds to 200Gb/s
  • March 20, 2017: Mellanox Introduces New 100Gb/s Silicon Photonics Optical Engine Product Line
  • March 16, 2017: Mellanox Ships More Than 200,000 Optical Transceiver Modules for Next Generation 100Gb/s Networks
  • March 8, 2017: Mellanox to Showcase Cloud Infrastructure Efficiency with Production-Ready SONiC over Spectrum Open Ethernet Switches
  • March 7, 2017: Mellanox Enables Industry’s First PCIe Gen-4 OpenPOWER-Based Rackspace OCP Server with 100Gb/s Connectivity
  • March 7, 2017: Mellanox Announces Industry-Leading OCP-Based ConnectX-5 Adapters for Qualcomm Centriq 2400 Processor-Based Platforms
  • Feb 26, 2017: Mellanox and ECI Smash Virtual CPE Performance Barriers with Indigo-Based Platform

 

About Mellanox

Mellanox Technologies is a leading supplier of end-to-end InfiniBand and Ethernet interconnect solutions and services for servers and storage. Mellanox interconnect solutions increase data center efficiency by providing the highest throughput and lowest latency, delivering data faster to applications and unlocking system performance capability. Mellanox offers a choice of fast interconnect products: adapters, switches, software, cables and silicon that accelerate application runtime and maximize business results for a wide range of markets including high-performance computing, enterprise data centers, Web 2.0, cloud, storage and financial services. More information is available at www.mellanox.com.

Source: Mellanox


Parallware Trainer 0.3 Now Available for Early Access

Thu, 04/27/2017 - 08:21

April 27, 2017 — Appentra today announced that Parallware Trainer 0.3 is now available through its Early Access Program. Users will have full and free access to the tool and will be able to learn parallel programming while improving their code.

New Features in Parallware Trainer 0.3

  • New support for offloading to GPUs and Xeon Phi using OpenMP 4.5
  • New suggestions for guided parallelization:
      • Ranking of strategies to parallelize reduction operations (scalar and sparse reductions); see the OpenMP sketch after this list
      • List of variables to offload to the accelerator that need user intervention (e.g. specifying array ranges for data transfers)
  • New support for multiple compiler suites: Intel, GNU and PGI
  • Improvements in usability for project management:
      • Show the number of parallel versions of a sequential source file
      • Exclude files using regular expressions
      • Drag and drop from several file managers (e.g. Nautilus)
  • Bug fixes in compilation and execution of Fortran source code
  • Bug fixes in the GUI
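To illustrate the kind of pattern these features target, here is a small hand-written C example, not actual Parallware Trainer output, of a scalar reduction offloaded with OpenMP 4.5; the array size and mapping clauses are arbitrary choices for the sketch.

```c
/* Hand-written illustration of an OpenMP 4.5 offloaded scalar reduction,
 * the kind of construct Parallware Trainer 0.3 can suggest; this is not
 * output generated by the tool. */
#include <stdio.h>

#define N (1 << 20)

static double x[N];

int main(void) {
    for (int i = 0; i < N; i++)
        x[i] = 1.0 / (i + 1);

    double sum = 0.0;

    /* Offload the reduction loop to a GPU or Xeon Phi device; falls back
     * to the host when no device is available. */
    #pragma omp target teams distribute parallel for \
            map(to: x[0:N]) map(tofrom: sum) reduction(+:sum)
    for (int i = 0; i < N; i++)
        sum += x[i];

    printf("sum = %f\n", sum);
    return 0;
}
```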

Click here to join the Parallware Trainer Early Access Program.

Source: Appentra


HERMES Team Simulates Health, Economic Impacts of Heat-Stable Vaccines

Wed, 04/26/2017 - 15:46

PITTSBURGH, April 26, 2017 — Health care workers in low-income nations often have to deliver vaccines on rugged footpaths, via motorcycle or over river crossings. On top of this, vaccines need to be kept refrigerated or they may degrade and become useless, which can make getting vaccines to mothers and children that need them challenging.

That’s why researchers at Doctors Without Borders and the HERMES Logistics Team of the Global Obesity Prevention Center at the Johns Hopkins Bloomberg School of Public Health and the Pittsburgh Supercomputing Center at Carnegie Mellon University carried out the first computer simulation of the health and economic impacts of introducing heat-stable vaccines in India and in Benin and Niger in Africa. The simulation offered good news. Not only would vaccines that don’t require refrigeration help increase vaccination rates in these countries, the cost savings of decreased spoilage and improved health would more than cover the cost of making the vaccines stable, even at twice or three times the current cost per dose.

Click here to read the full release from Doctors Without Borders.

About PSC 

The Pittsburgh Supercomputing Center is a joint effort of Carnegie Mellon University and the University of Pittsburgh. Established in 1986, PSC is supported by several federal agencies, the Commonwealth of Pennsylvania and private industry, and is a leading partner in XSEDE (Extreme Science and Engineering Discovery Environment), the National Science Foundation cyberinfrastructure program.

Source: PSC


TACC Helps ROSIE Bioscience Gateway Expand its Impact

Wed, 04/26/2017 - 14:18

Biomolecule structure prediction has long been challenging, not least because the relevant software and workflows often require high-end HPC systems that many bioscience researchers lack easy access to. One bioscience gateway – ROSIE – has been established as part of XSEDE (Extreme Science and Engineering Discovery Environment) to expand access to the popular Rosetta suite of prediction software; so far 5,000 users have run more than 30,000 jobs, and ROSIE’s organizers are hoping recent additions will further expand use.

A fundamental issue here is that bioscience researchers often face the twin hurdles of possessing limited computational expertise and having limited access to HPC. ROSIE – the Rosetta Online Server that Includes Everyone (quite the name) – lets researchers run their jobs using a straightforward interface and without necessarily knowing the work is being done on supercomputer resources such as TACC’s Stampede. The idea isn’t brand new. ROSIE is the latest morphing of what was the RosettaCommons.

An account (Rosetta Modeling Software and the ROSIE Science Gateway) of the expansion of ROSIE is posted on the TACC site.

Structure prediction is fundamental to much of bioscience research. Think of biomolecules as expert contortionists whose shape critically influences their function. For example, the 3D shape of a protein is critical to its function and is determined by the sequence of its constituent amino acids; however, predicting the shape from the amino acid sequence is (still) challenging and computationally intensive. The same can be said for many classes of biomolecules.

“One of the most widely used such [structure prediction] programs is Rosetta. Originally developed as a structure prediction tool more than 17 years ago in the laboratory of David Baker at the University of Washington, Rosetta has been adapted to solve a wide range of common computational macromolecular problems. It has enabled notable scientific advances in computational biology, including protein design, enzyme design, ligand docking, and structure predictions for biological macromolecules and macromolecular complexes,” according to the TACC article.

Jeffrey Gray, Johns Hopkins University

“The structure prediction problem is to take a sequence and ask, ‘What does it look like?'” said Jeffrey Gray, a professor of Chemical and Biomolecular Engineering at Johns Hopkins University and a collaborator on the project. “The design problem asks ‘What sequence would fold into this structure?’ That’s at the heart of Rosetta, but Rosetta does a lot of other things,” Gray said. Over the years, Rosetta evolved from a single tool, to a collection of tools, to a large collaboration called RosettaCommons, which includes more than 50 government laboratories, institutes, and research centers (only nonprofits).

Gray had used TACC resources as a graduate student in Texas in the late 1990s, so he knew about TACC and some of the other NSF supercomputing facilities. “We’ve been using Stampede and applied for it through XSEDE,” Gray said. “We have a Stampede allocation for my lab and we have a separate allocation for ROSIE.”

First described in PLOS One in May 2013, ROSIE continues to add new elements. In January 2017, a team of researchers, including Gray, reported in Nature Protocols on the latest additions to the gateway: antibody modeling and docking tools called RosettaAntibody and SnugDock that can run fully automated via the ROSIE web server or manually, with user control, on a personal computer or cluster.

Link to TACC article: https://www.tacc.utexas.edu/-/rosetta-modeling-software-and-the-rosie-science-gateway


ASC17 Challenge Establishes a New HPL Record

Wed, 04/26/2017 - 09:40

On April 26, the very first day of the ASC Student Supercomputer Challenge (ASC17) finals, Weifang University of China set a new student competition HPL record of 31.70 TFLOPS. Weifang University is an ordinary college in China’s Shandong Province; this is its second time participating in the ASC challenge and its first time reaching the final.

Final of ASC17 Challenge

The HPL test in ASC17 has strict rules: competing teams must build a supercomputing system within a total power constraint of 3,000 W, using equipment provided by the organizing committee, such as Inspur supercomputing nodes and high-speed networks, along with accelerator cards configured by the teams themselves. The team from Weifang University designed a heterogeneous supercomputing system using 5 Inspur supercomputing servers and 10 P100 GPU accelerator cards to achieve a sustained floating-point performance of 31.7 TFLOPS.

The Weifang University team leader, Cao Jian, said that they only switched their team’s supercomputer design from a CPU to a GPU system at the very last minute just one day before the competition. Nevertheless, the team had prior experience with power consumption testing on GPU cards in the cluster back at school and they were very confident even with the last minute change in strategy. Ultimately, Weifang University was able to successfully control the power consumption of individual GPU cards to within 173W, more than 30% lower than the 250W power rating, proving the team’s strong hands-on capabilities. According to Cao Jian, even though it gives them great joy to break the world record for HPL computing performance, team members were most attracted by the opportunity to operate Sunway TaihuLight, China’s very own world number one supercomputer, as well as to exchange notes and learn from supercomputing talents the world over.
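A quick back-of-the-envelope check of those figures is shown below; the breakdown is our own arithmetic on the numbers reported above, not data from the team.

```c
/* Back-of-the-envelope check of the reported ASC17 HPL figures; all inputs
 * are the numbers quoted in the article, the breakdown is our own arithmetic. */
#include <stdio.h>

int main(void) {
    const double hpl_tflops    = 31.7;   /* reported HPL result           */
    const double budget_watts  = 3000.0; /* competition power cap         */
    const double gpu_cap_watts = 173.0;  /* per-GPU cap the team enforced */
    const double gpu_tdp_watts = 250.0;  /* P100 rated power              */
    const int    num_gpus      = 10;

    printf("energy efficiency: %.1f GFLOPS/W against the 3000 W cap\n",
           hpl_tflops * 1000.0 / budget_watts);
    printf("per-GPU cap is %.0f%% below the %.0f W rating\n",
           100.0 * (1.0 - gpu_cap_watts / gpu_tdp_watts), gpu_tdp_watts);
    printf("GPUs draw at most %.0f W, leaving %.0f W for CPUs, memory, network\n",
           num_gpus * gpu_cap_watts, budget_watts - num_gpus * gpu_cap_watts);
    return 0;
}
```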

Team Weifang University

Initiated by China, the ASC Student Supercomputer Challenge is the largest student supercomputer competition in the world. This year, ASC17 is organized by the ASC Community, Inspur, the National Supercomputing Centre in Wuxi, and Zhengzhou University, with 230 teams from all over the world having taken part. The 20 finalist teams will design and build a cluster under 3,000 W using Inspur supercomputing nodes to run AI-based traffic prediction, Falcon, LAMMPS, Saturne, and the HPL and HPCG benchmarks, as well as MASNUM on Sunway TaihuLight.


Messina Update: The U.S. Path to Exascale in 16 Slides

Wed, 04/26/2017 - 08:10

Paul Messina, director of the U.S. Exascale Computing Project, provided a wide-ranging review of ECP’s evolving plans last week at the HPC User Forum. The biggest change, of course, is ECP’s accelerated timetable with delivery of the first exascale machine now scheduled for 2021. While much of the material covered by Messina wasn’t new there were a few fresh details on the long awaited Path Forward hardware contracts and on progress to-date in other ECP fronts.

Paul Messina, ECP Director

“We have selected six vendors to be primes, and in some cases they have had other vendors involved in their R&D requirements. [We have also] been working on detailed statements of work because the dollar amounts are pretty hefty, [and] the approval process [reaches] high up in the Department of Energy,” said Messina of the Path Forward awards. Five of the contracts are signed and the sixth is not far off. His slide even listed the announcement as ready by COB April 14, 2017. “It would have been great to announce them at this HPC User Forum but it was not meant to be.” He said the announcements will be made public soon.

The duration of the ECP project has been shortened from ten years to seven, although there’s a 12-month schedule contingency built in to accommodate changes, said Messina. Interestingly, during Q&A Messina was asked about U.S. willingness to include ‘individuals’ not based in the U.S. in the project. The question was a little ambiguous as it wasn’t clear if ‘individuals’ was intended to encompass foreign interests broadly, but Messina answered directly, “[For] people who are based outside the U.S. I would say the policy is they are not included.”

Presented here are a handful of Messina’s slides updating the U.S. march towards exascale computing – most of the talk dwelled on software related challenges – but first it’s worth stealing a few Hyperion Research (formerly IDC) observations on the global exascale race that were also presented during the forum. The rise of national and regional competitive zeal in HPC and the race to exascale is palpable as evidenced by Messina’s comment on U.S. policy.

China is currently ahead in the race to stand up an exascale machine first, according to Hyperion. That’s perhaps not surprising given its recent dominance of the Top500 list. Japan is furthest along in settling on a design, key components, and contractor. Here are two Hyperion slides summing up the world race. (see HPCwire article, Hyperion (IDC) Paints a Bullish Picture of HPC Future, for full rundown of HPC trends)

Messina emphasized the three-year R&D projects (Path Forward) are intended to result in better hardware at the node level, memory, system level and energy consumption, and programmability. Moreover, ECP is looking past the initial exascale systems. “The idea is that after three years hopefully the successful things will become part of [vendors’] product lines and result in better HPC systems for them not just for the initial exascale systems,” he said. The RFPs for the exascale systems themselves will come from the labs doing the buying.

The ECP is a collaborative effort of two U.S. Department of Energy organizations, the Office of Science (DOE-SC) and the National Nuclear Security Administration (NNSA). Sixteen of seventeen national labs are participating in ECP, and the six that have traditionally fielded leadership HPC systems – Argonne, Oak Ridge, Lawrence Livermore, Sandia, Los Alamos, and Lawrence Berkeley National Laboratories – form the core partnership and signed a memorandum of agreement on cooperation defining roles and responsibilities.

Under the new schedule, “We will have an initial exascale system delivered in 2021 ready to go into production in 2022 and which will be based on advanced architecture which means really that we are open to something that is not necessarily a direct evolution of the systems that are currently installed at the NL facilities,” explained Messina.

“Then the ‘Capable Exascale’ systems, which will benefit from the R&D we do in the project, we currently expect them to be delivered in 2022 and available in 2023. Again these are at the facilities that normally get systems. Lately it’s been a collaboration of Argonne, Oak Ridge and Livermore, that roughly every four years establish new systems. Then [Lawrence] Berkeley, Los Alamos and Sandia, which during the in between years installs systems.”  Messina again emphasized, “It is the facilities that will be buying the systems. [The ECP] be doing the R&D to give them something that is hopefully worth buying.”

Four key technical challenges are being addressed by the ECP to deliver capable exascale computing:

  • Parallelism a thousand-fold greater than today’s systems
  • Memory and storage efficiencies consistent with increased computational rates and data movement requirements
  • Reliability that enables system adaptation and recovery from faults in much more complex system components and designs
  • Energy consumption beyond current industry roadmaps, which would be prohibitively expensive at this scale

Another important ECP goal, said Messina, is to kick the development of U.S. advanced computing into a new higher trajectory (see slide below).

From the beginning the exascale project has steered clear of FLOPS and LINPACK as the best measure of success. That theme has only grown stronger with attention focused on defining success as performance on useful applications and the ability to tackle problems that are intractable on today’s Petaflops machines.

“We think of 50x times that performance on applications [as the exascale measure of merit], unfortunately there’s a kink in this,” said Messina. “The kink is people won’t be running todays jobs in these exascale systems. We want exascale systems to do things we can’t do today and we need to figure out a way to quantify that. In some cases it will be relatively easy – just achieving much greater resolutions – but in many cases it will be enabling additional physics to more faithfully represent the phenomena. We want to focus on measuring every capable exascale based on full applications tackling real problems compared to what they can do today.”

“This list is a bit of an eye chart (above) and represents the 26 applications that are currently supported by the ECP. Each of them, when selected, specified a challenge problem. For example, it wasn’t just a matter of saying they’ll do better chemistry but here’s a specific challenge that we expect to be able to tackle when the first exascale systems are available,” said Messina.

One example is GAMESS (General Atomic and Molecular Electronic Structure System), an ab initio quantum chemistry package that is widely used. The team working on GAMESS has spelled out specific problems to be attacked. “It’s not only very good code but they have ambitious goals; if we can help that team achieve its exascale goals for the GAMESS community code, the leverage is huge because it has all of those users. Now not all of them need exascale to do their work but those that do will be able to do it quickly and more easily,” said Messina.

GAMESS is also a good example of a traditional FLOPS heavy numerical simulation application. Messina reviewed four other examples (earthquake simulation, wind turbine applications, energy grid management optimization, and precision medicine). “The last one that I’ll mention is a collaboration between DoE and NIH and NCI on cancer as you might imagine,” said Messina. “It is extremely important for society and also quite different to traditional partial differential equation solving because this one will rely on deep learning and use of huge amounts of data – being able to use millions of patient records on types of cancer and the treatments they received and what the outcome was as well as millions of potential cures.”

Data analytics is a big part of these kinds of precision medicine applications, said Messina. When pressed on whether the effort to combine traditional simulation with deep learning would inevitably lead to diverging architectures, Messina argued for the contrary: “One of our second level goals is try to promote convergence as opposed divergence. I don’t know that we’ll be successful in that but that’s what we are hoping. [We want] to understand that better because we don’t have a good understanding of deep learning and data analytics.”

Co-design has also been a priority and received a fair amount of attention. Doug Kothe, ECP director of applications development, is spearheading those efforts. Currently there are five co-design centers, including a new one focused on graph analytics. All of the teams have firm milestones, including some shared with other ECP efforts to ensure productive cooperation.

Messina noted that, “Although we will be measuring our success based on whole applications, in the meantime you can’t always deal with the whole application, so we have proxies and sub projects. The vendors need this and we will need it to guide our software development.”

Ensuring resilience is a major challenge given the exascale system complexity. “On average the user should notice a fault on the order of once a week. There may be faults every hour but the user shouldn’t see them more than once a week,” said Messina. This will require, among other things, a robust capable software stack, “otherwise it’s a special purpose system or a system that is very difficult to use.”

Messina showed a ‘notional’ software stack slide (below). “Resilience and workflows are on the side because we believe they influence all aspects of the software stack. In a number of areas we are investing in several approaches to accomplish the same functionality. At some point we will narrow things down. At the same time we feel we probably have some gaps, especially in the data issues, and are in the process of doing a gap analysis for our software stack,” he said.

Clearly it’s a complex project with many parts. Integration of all these development activities is an ongoing challenge.

“You have to work together so the individual teams have shared milestones. Here’s one that I selected simply because it was easy to describe. By the beginning of next calendar year [we should have] new prototype APIs to have coordination between MPI and OpenMP runtimes because this is an issue now in governing the threads and messages when you use both programming models which a fair number of applications do. How is this going to work? So the software team doing this will interact with a number of application development teams to make sure we understand their runtime requirements. We can’t wait until we have the exascale systems to sort things out.”
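As a generic illustration of the hybrid pattern that milestone targets, the following C sketch shows MPI and OpenMP sharing a process in the conventional way; it is ordinary MPI/OpenMP usage, not the prototype coordination API ECP is developing.

```c
/* Conventional hybrid MPI + OpenMP usage: OpenMP threads do node-local work,
 * the master thread performs MPI communication. Illustrative only; this is
 * not the prototype coordination API developed by ECP. */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int provided;
    /* MPI_THREAD_FUNNELED: only the thread that called MPI_Init_thread
     * will make MPI calls. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double local = 0.0;
    #pragma omp parallel reduction(+:local)
    {
        /* Stand-in for per-thread computation. */
        local += omp_get_thread_num() + 1.0;
    }

    double global = 0.0;
    /* Back on the master thread, combine results across ranks with MPI. */
    MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("global sum = %.1f (MPI provided thread level %d)\n",
               global, provided);

    MPI_Finalize();
    return 0;
}
```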

“We also want to be able to measure how effective the new ideas are likely to be and so we are also launched a design space evaluation effort,” said Messina. The ECP project has actively sought early access to several existing resources for use during development.

These are just a few of the topics Messina touched on. Workforce training is another key issue that ECP is tackling. It is also increasing its communications and outreach efforts as shown below. There is, of course, an ECP web site, and ECP recently launched a newsletter expected to be published roughly monthly.

With the Path Forward awards coming soon, several working group meetings having been held, and the new solidified plan, the U.S. effort to reach exascale computing is starting to feel concrete. It will be interesting to see how well ECP’s various milestones are hit. The last slide below depicts the overall program. Hyperion indicated it will soon post all of Messina’s slides (and other presentations from HPC User Forum) on the HPC User Forum site.


R Systems Sponsors rLoop Team as Part of SpaceX Hyperloop Pod Competition

Wed, 04/26/2017 - 08:08

CHAMPAIGN, Ill., April 26, 2017 — R Systems NA, Inc. confirmed today its sponsorship of the rLoop Hyperloop team participating in SpaceX Competition II.

Founded in 2002, SpaceX is a private space exploration company sponsoring the Hyperloop Pod Competition. The competition aims to facilitate the development of the high speed Hyperloop transportation system by encouraging independent groups to develop functional prototypes. While many teams participating in the competition are student organizations, rLoop is a unique collaboration of over 100 members spanning more than 14 countries.

R Systems, a bare metal high performance computing provider, began sponsoring rLoop’s pod design in March 2016. “We were very excited when rLoop won the Pod Innovation Award at the SpaceX Competition in January,” said R Systems co-Founder Brian Kucic. “We are pleased to be able to support their ongoing efforts to revolutionize transportation, and to be a part of rLoop’s impressive global collaborative efforts.”

As the only non-student team among the 20 finalist teams to participate in SpaceX’s January Hyperloop competition, rLoop did not have ready access to university HPC resources. “We were looking for more CPU power to run the advanced simulations being developed by our team, and R Systems offered to provide exactly what we needed,” said rLoop Project Manager, Brent Lessard. “The R Systems support team had us running on their utility cluster in no time. We now have easy access to significant HPC resources to run our jobs at any time of the day or night.”

Amir Khan, rLoop’s design and analysis lead, is pleased to hear about R Systems’ continued sponsorship. Khan said, “The ability of our engineers to access this powerful resource whenever needed provides us with a valuable advantage.”

According to R Systems Senior Systems Engineer Steve Pritchett, the relationship with rLoop has been very positive. Pritchett commented, “After we provided login access and installed necessary software, the rLoop users were able to focus on their pod design rather than dealing with system management issues.”

The second phase of the Hyperloop Competition focuses on maximum pod speed. R Systems co-Founder Greg Keller says the company is a perfect match for the rLoop team. “At R Systems, we are intimately familiar with the goal of achieving maximum speed,” Keller said. “Our people and clusters quickly move our customer’s vision forward to conquer big challenges,” Keller added.

The SpaceX Competition II is scheduled for completion in mid-2017.

About Hyperloop

Hyperloop is a conceptual, high-speed transportation system proposed by SpaceX CEO Elon Musk. The concept involves passenger or cargo boarded pods being propelled in a low pressure tube using sustainable and cost-efficient energy. To accelerate its development, SpaceX is hosting a competition for engineering teams to design and test their own Hyperloop pods. More information about this innovative technology is available at http://www.spacex.com/hyperloop.

About rLoop

rLoop is a non-profit, open source participant in the SpaceX Hyperloop competition. With a mission to democratize the Hyperloop through collective design, rLoop has gained more than 100 members from over 14 countries. Learn more about how rLoop is revolutionizing transportation at http://rloop.org/.

About R Systems

R Systems is a service provider of high performance computing resources. The company empowers research by providing leading edge technology with a knowledgeable tech team, delivering the best performing result in a cohesive working environment. Offerings include lease-time for bursting as well as for short-term and long-term projects, available at industry-leading prices.

Source: R Systems


Asetek Reports Financial Results for Q1 2017

Wed, 04/26/2017 - 08:04

OSLO, Norway, April 26, 2017 — Asetek reported revenue of $11.5 million in the first quarter of 2017, a 10% increase from the first quarter 2016. The change from prior year reflects an increase in desktop revenue driven by shipments in the DIY market, partly offset by a decline in data center revenue.

  • Quarterly revenue growth of 10% driven by high-end gaming cooling demand
  • New orders and development agreement with undisclosed major player reflect increased end-user adoption in data center segment
  • Shipments of Asetek’s sealed loop coolers surpassed 4 million units since inception
  • Continued positive EBITDA
  • Cash dividend of NOK 1.00 per share approved at AGM
  • Reaffirming 2017 expectations of moderate desktop segment and significant data center revenue growth

“We are making good progress in our emerging data center business with several new orders and the announced signing of a development agreement with an undisclosed partner. It confirms that we are on track to meet our ambition of increasing end-user adoption in the data center market. Our desktop business segment delivered another quarter of revenue growth on demand from the high-end gaming market,” says André Sloth Eriksen, Chief Executive Officer.

Gross margin for the first quarter was 38.5%, compared with 39% in the first quarter of 2016 and 37% in the fourth quarter 2016. The EBITDA was $0.7 million in the first quarter 2017, compared with EBITDA of $1.2 million in the first quarter of 2016.

Desktop revenue was $11.1 million in the first quarter, an increase of 17% from the same period of 2016. Operating profit from the desktop segment was $3.4 million, an increase from $2.8 million in the same period last year, due to an increase in DIY product sales.

Data center revenue was $0.4 million, a decrease from $1.0 million in the prior year due to fewer shipments to OEM customers. This variability is expected while the Company secures new OEM partners and growth of end-user adoption through existing OEM partners.

Asetek continued to invest in its data center business and the segment operating loss was $1.8 million for the first quarter, compared with $1.0 million in the same period of 2016. Expenditures relate to technology development, and sales and marketing activities with data center partners and OEM customers.

Through new and repeat orders received from existing data center OEM partners in the first quarter and more recently in April, Asetek is increasing its end-user adoption with technology deployed to new HPC installations.

In February Asetek signed a development agreement with an undisclosed major player in the data center market and expects this agreement to result in new products in 2017 which will drive long-term data center revenue.

Asetek reaffirms its annual outlook for the full year 2017, anticipating moderate growth in the desktop business and significant revenue growth in the data center segment, when comparing to 2016.

The proposal of a cash dividend of NOK 1.00 per share was approved by the AGM.

Source: Asetek

The post Asetek Reports Financial Results for Q1 2017 appeared first on HPCwire.

Supermicro Announces 25/100Gbps Networking Solutions

Wed, 04/26/2017 - 08:00

SAN JOSE, Calif., April 26, 2017 — Super Micro Computer, Inc. (NASDAQ: SMCI) has announced general availability of Mellanox, Broadcom and Intel-based 100Gbps and 25Gbps standard networking cards and onboard SIOM solutions, 25Gbps MicroLP networking cards, and onboard riser cards optimized for the Ultra SuperServer.

Supermicro networking modules deliver high bandwidth and industry-leading connectivity for performance-driven server and storage applications in the most demanding Data Center, HPC, Cloud, Web2.0, Machine Learning and Big Data environments. Clustered databases, web infrastructure, and high frequency trading are just a few applications that will achieve significant throughput and latency improvements resulting in faster access, real-time response and virtualization enhancements with this generation of industry leading Supermicro solutions.

Supermicro’s range of 25Gbps and 100Gbps interface solutions:
https://www.supermicro.com/products/accessories/index.cfm?Type=20

Supermicro’s 25/100Gbps networking solutions offer high performance and efficient network fabrics, covering a range of application optimized products. These interfaces provide customers with networking alternatives optimized for their applications and data center environments. The AOC-S Standard LP series cards are designed for any Supermicro server with a PCI-E x8 (for 25G) or PCI-E x16 (for 100G) expansion slot. The AOC-C MicroLP add-on card is optimized for Supermicro high-density FatTwin and MicroCloud SuperServers. The Supermicro AOC-M flexible, cost-optimized 25/100Gbps onboard SIOM series cards support the Supermicro TwinPro, BigTwin, Simply Double and 45/60/90-Bay Top-Load SuperStorage, plus 7U 8-Way SuperServer. The Supermicro Ultra series utilizes the AOC-U series onboard riser cards. These 25G and 100G modules are fully compatible with Supermicro and other comparable industry switch products.

25/100G Module | Type | Description | Interface | Ports | Controller | Supported Supermicro Servers
AOC-SHFI-i1C | Standard LP | Omni-Path 100Gbps | PCI-E 3.0 x16 | 1 QSFP28 | Intel | All with PCI-E 3.0 x16
AOC-MHFI-i1C/M | Onboard | Omni-Path 100Gbps | PCI-E 3.0 x16 | 1 QSFP28 | Intel | BigTwin, TwinPro, SuperStorage, 8-Way
AOC-S100G-m2C | Standard LP | Dual-port 100GbE | PCI-E 3.0 x16 | 2 QSFP28 | Mellanox | All with PCI-E 3.0 x16
AOC-S25G-b2S | Standard LP | Dual-port 25GbE | PCI-E 3.0 x8 | 2 SFP28 | Broadcom | All with PCI-E x8
AOC-S25G-m2S | Standard LP | Dual-port 25GbE | PCI-E 3.0 x8 | 2 SFP28 | Mellanox | All with PCI-E 3.0 x8
AOC-S25G-i2S | Standard LP | Dual-port 25GbE | PCI-E 3.0 x8 | 2 SFP28 | Intel | All with PCI-E 3.0 x8
AOC-C25G-m1S | MicroLP | Single-port 25GbE | PCI-E 3.0 x8 | 1 SFP28 | Mellanox | FatTwin, MicroCloud
AOC-MH25G-m2S2T/M | Onboard | Dual-port 25GbE | PCI-E 3.0 x16 | 2 SFP28 | Mellanox | BigTwin, TwinPro, SuperStorage, 8-Way
AOC-M25G-m4S/M | Onboard | Quad-port 25GbE | PCI-E 3.0 x16 | 4 SFP28 | Mellanox | BigTwin, TwinPro, SuperStorage, 8-Way
AOC-URN4-m2TS | Onboard | Dual-port 25GbE | PCI-E 3.0 x16 | 2 SFP28 | Mellanox | 1U Ultra
AOC-URN4-i2TS | Onboard | Dual-port 25GbE | PCI-E 3.0 x8 | 2 SFP28 | Intel | 1U Ultra
AOC-2UR68-m2TS | Onboard | Dual-port 25GbE | PCI-E 3.0 x8 | 2 SFP28 | Mellanox | 2U Ultra

Supermicro 25/100G Ethernet Modules
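
To make the compatibility notes above concrete, here is a minimal sketch that filters an abridged version of the module table by slot type and link speed. The dictionary and the pick_modules() helper are purely illustrative and are not part of any Supermicro tool or API.

# Minimal sketch: filter an abridged version of the Supermicro module table
# by slot type and link speed. Illustrative only; not a Supermicro tool or API.
MODULES = {
    "AOC-S100G-m2C": {"form": "standard LP", "speed_gbps": 100, "slot": "PCI-E 3.0 x16"},
    "AOC-S25G-m2S":  {"form": "standard LP", "speed_gbps": 25,  "slot": "PCI-E 3.0 x8"},
    "AOC-S25G-i2S":  {"form": "standard LP", "speed_gbps": 25,  "slot": "PCI-E 3.0 x8"},
    "AOC-C25G-m1S":  {"form": "MicroLP",     "speed_gbps": 25,  "slot": "PCI-E 3.0 x8"},
}

def pick_modules(free_slot, min_speed_gbps):
    """Return module names that fit the free slot and meet the minimum link speed."""
    return [name for name, spec in MODULES.items()
            if spec["slot"] == free_slot and spec["speed_gbps"] >= min_speed_gbps]

# Example: a server with a free PCI-E 3.0 x8 slot that needs at least 25GbE.
print(pick_modules("PCI-E 3.0 x8", 25))   # ['AOC-S25G-m2S', 'AOC-S25G-i2S', 'AOC-C25G-m1S']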

“With 2.5 times the bandwidth of 10G, less than half the cost of 40G, and incorporating Remote Direct Memory Access for low latency with backward compatibility with 10G switches, the industry leading 25GbE capability that Supermicro offers our customers provides the highest scalability and potential for future growth,” said Charles Liang, President and CEO of Supermicro. “We believe that 100G, having a clear upgrade path from 25G, is the natural next step in the evolution of modern high-performance converged data center server/storage deployments for our customers as they experience ever higher demands on their data center I/O infrastructures.”

Dual- and Single-port Modules supporting 100Gbps
AOC-SHFI-i1C Omni-Path Standard Card
Designed for HPC, this card uses an advanced “on-load” design that automatically scales fabric performance with rising server core counts, making it ideal for today’s increasingly demanding workloads. It provides a 100Gbps link speed, a single QSFP28 connector, a PCI-E 3.0 x16 interface, and a standard low-profile form factor.

AOC-MHFI-i1C/M Onboard Omni-Path SIOM Card
Designed specifically for HPC utilizing the Intel OP HFI ASIC, this card offers 100Gbps link speeds for Supermicro servers that support the SIOM interface.

AOC-S100G-m2C Standard Card
This card offers dual-port QSFP28 connectivity in a low-profile, short-length standard form factor with a PCI-E 3.0 x16 interface. Built on the Mellanox ConnectX-4 EN chipset with features such as VXLAN and NVGRE, it offers network flexibility and high bandwidth with hardware offloads for I/O virtualization, efficiently handling bandwidth demand from virtualized infrastructure in data center or cloud deployments.

Quad-, Dual- and Single-port Modules supporting 25Gbps
AOC-S25G-b2S Standard Card
Based on the Broadcom BCM57414 chipset with features such as RDMA, NPAR, VXLAN and NVGRE, it is backward compatible with 10GbE networks and offers a cost-effective upgrade path from 10GbE to 25GbE in data center or cloud deployments.

AOC-S25G-m2S Standard Card
This is a dual-port 25GbE controller that can be used in any Supermicro server with a PCI-E 3.0 x8 expansion slot. Based on the Mellanox ConnectX-4 Lx EN chipset with features such as RDMA and RoCE, it is backward compatible with 10GbE networks and addresses bandwidth demand from virtualized infrastructure.

AOC-S25G-i2S Standard Card
This card is implemented with the Intel XXV710. It is fully compatible with existing 10GbE networking infrastructures but doubles the available bandwidth. The 25GbE bandwidth enables rapid networking deployment in an agile data center environment.

AOC-C25G-m1S MicroLP Card
This card is based on the Mellanox ConnectX-4 Lx EN controller. It is the solution for Supermicro high density MicroCloud and Twin series servers.

AOC-MH25G-m2S2T/M  Onboard SIOM Card
This is a proprietary SIOM (Supermicro I/O module) card based on the Mellanox ConnectX-4 Lx EN, designed for SuperServers with SIOM support such as the Supermicro BigTwin, TwinPro, and SuperStorage products.

AOC-M25G-m4S/M Onboard SIOM Card
This is one of the most feature-rich 25GbE controllers on the market. Based on the Mellanox ConnectX-4 Lx EN, with four ports of 25GbE SFP28 connectivity in the small SIOM form factor, it provides density, performance, and functionality, and is optimized for Supermicro BigTwin, TwinPro, and SuperStorage products.

AOC-URN4-m2TS Onboard 1U Ultra Riser Card
Mellanox ConnectX-4 Lx EN, 2 ports, 2 SFP28, onboard 1U Ultra Riser

AOC-URN4-i2TS Onboard 1U Ultra Riser Card
Intel XXV710, 2 ports, 2 SFP28, onboard 1U Ultra Riser

AOC-2UR68-m2TS Onboard 2U Ultra Riser Card
Mellanox ConnectX-4 Lx EN, 2 ports, 2 SFP28, onboard 2U Ultra Riser

About Super Micro Computer, Inc.

Supermicro (NASDAQ: SMCI), the leading innovator in high-performance, high-efficiency server technology, is a premier provider of advanced server Building Block Solutions for Data Center, Cloud Computing, Enterprise IT, Hadoop/Big Data, HPC and Embedded Systems worldwide. Supermicro is committed to protecting the environment through its “We Keep IT Green” initiative and provides customers with the most energy-efficient, environmentally-friendly solutions available on the market.

Source: Supermicro

The post Supermicro Announces 25/100Gbps Networking Solutions appeared first on HPCwire.

Cray Signs Solutions Provider Agreement With Mark III Systems

Wed, 04/26/2017 - 01:46

SEATTLE and HOUSTON, April 26, 2017 — Global supercomputer leader Cray Inc. today announced the Company has signed a solutions provider agreement with Mark III Systems, Inc. to develop, market and sell solutions that leverage Cray’s portfolio of supercomputing and big data analytics systems.

Headquartered in Houston, Texas, Mark III Systems is a leading enterprise IT solutions provider focused on delivering IT infrastructure, software, services, cloud, digital, and cognitive solutions to a broad array of enterprise clients. The company’s BlueChasm digital development unit is focused on building and running open digital, cognitive, and AI platforms in partnership with enterprises, institutions, service providers, and software and cloud partners.

Mark III Systems can now combine the design, development, and engineering expertise of its BlueChasm team with the data-intensive computing capabilities of the Cray® XC™, Cray CS™, and Urika®-GX systems, and offer enterprise IT customers customized solutions across a wide range of commercial use cases.

“We’re very excited to be partnering with Cray to deliver unique platforms and data-driven solutions to our joint clients, especially around the key opportunities of data analytics, artificial intelligence, cognitive compute, and deep learning,” said Chris Bogan, Mark III’s director of business development and alliances.  “Combined with Mark III’s full stack approach of helping clients capitalize on the big data and digital transformation opportunities, we think that this partnership offers enterprises and organizations the ability to differentiate and win in the marketplace in the digital era.”

“Solution providers are a key part of Cray’s go-to-market strategy,” said Fred Kohout, Cray’s senior vice president of products and chief marketing officer. “We’re thrilled to be partnering with Mark III as they bring the expertise to develop and deliver differentiated solutions that leverage Cray’s supercomputing infrastructure and deliver superior value to our respective customers.”

For more information on Cray’s partner initiatives, please visit the Cray website at www.cray.com.

About Mark III Systems

Mark III Systems is a long-time, industry-leading IT solutions provider delivering IT infrastructure, software, services, cloud, digital, and cognitive solutions to enterprises, institutions, and service provider clients across North America.  With a diverse team of developers, DevOps engineers, enterprise architects, and systems engineers, Mark III’s areas of expertise include IT infrastructure, datacenter, HPC, data analytics, security, DevOps, IoT, AI, cognitive, and cloud.  Whether it be optimizing the performance and resiliency of an existing business-critical tech stack, or building a next-generation digital stack for data analytics, AI, IoT, or mobile use cases, Mark III’s “full stack” approach helps clients stand out and win in the era of digital transformation.  For more information, visit www.markiiisys.com.

About Cray Inc.

Global supercomputing leader Cray Inc. (Nasdaq:CRAY) provides innovative systems and solutions enabling scientists and engineers in industry, academia and government to meet existing and future simulation and analytics challenges. Leveraging more than 40 years of experience in developing and servicing the world’s most advanced supercomputers, Cray offers a comprehensive portfolio of supercomputers and big data storage and analytics solutions delivering unrivaled performance, efficiency and scalability. Cray’s Adaptive Supercomputing vision is focused on delivering innovative next-generation products that integrate diverse processing technologies into a unified architecture, allowing customers to meet the market’s continued demand for realized performance. Go to www.cray.com for more information.

The post Cray Signs Solutions Provider Agreement With Mark III Systems appeared first on HPCwire.

MSST 2017 Announces Conference Themes, Keynote

Tue, 04/25/2017 - 15:00

April 25, 2017 — The 33rd International Conference on Massive Storage Systems and Technology (MSST 2017) will dedicate five days to computer-storage technology, including a day of tutorials, two days of invited papers, two days of peer-reviewed research papers, and a vendor exposition. The conference will be held on the beautiful campus of Santa Clara University, in the heart of Silicon Valley, May 15-19, 2017.

Kimberly Keeton, Hewlett Packard Enterprise, will keynote:

Data growth and data analytics requirements are outpacing the compute and storage technologies that have provided the foundation of processor-driven architectures for the last five decades. This divergence requires a deep rethinking of how we build systems, and points towards a memory-driven architecture, where memory is the key resource and everything else, including processing, revolves around it.

Memory-driven computing (MDC) brings together byte-addressable persistent memory, a fast memory fabric, task-specific processing, and a new software stack to address these data growth and analysis challenges. At Hewlett Packard Labs, we are exploring MDC hardware and software design through The Machine. This talk will review the trends that motivate MDC, illustrate how MDC benefits applications, provide highlights from our Machine-related work in data management and programming models, and outline challenges that MDC presents for the storage community.
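
As a loose illustration of what “byte-addressable persistent memory” means to software, the sketch below memory-maps a file and reads and writes it with ordinary byte operations; on a DAX-mounted persistent-memory filesystem the same pattern maps loads and stores directly onto NVM. The mount path is hypothetical, and this is a generic example, not part of HPE’s Machine software stack.

# Minimal sketch of byte-addressable access to a memory-mapped file. On a
# DAX-mounted persistent-memory filesystem this pattern maps loads/stores
# straight onto NVM. The path is a hypothetical placeholder; this is a generic
# illustration, not part of HPE's Machine software stack.
import mmap, os

PATH = "/mnt/pmem0/example.dat"   # hypothetical DAX mount point
SIZE = 4096

fd = os.open(PATH, os.O_CREAT | os.O_RDWR, 0o600)
os.ftruncate(fd, SIZE)            # make sure the backing region exists

buf = mmap.mmap(fd, SIZE)         # byte-addressable view of the region
buf[0:5] = b"hello"               # store: ordinary byte assignment
print(bytes(buf[0:5]))            # load:  b'hello'

buf.flush()                       # ask the OS to persist the mapped range
buf.close()
os.close(fd)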

Themes for the conference this year include:

  • Emerging Open Source Storage System Design for Hyperscale Computing
  • Leveraging Compression, Encryption, and Erasure Coding Chip
  • Hardware Support to Construct Large Scale Storage Systems
  • The Limits of Open Source in Large-Scale Storage Systems Design
  • Building Extreme-Scale SQL and NoSQL Processing Environments
  • Storage Innovation in Large HPC Data Centers
  • How Large HPC Data Centers Can Leverage Public Cloud for Computing and Storage
  • Supporting Extreme-Scale Name Spaces with NAS Technology
  • Storage System Designs Leveraging Hardware Support
  • How Can Large Scale Storage Systems Support Containerization?
  • Trends in Non-Volatile Media

For registration and the full agenda visit the MSST 2017 website: http://storageconference.us

Source: MSST

The post MSST 2017 Announces Conference Themes, Keynote appeared first on HPCwire.

Cycle Computing Flies Into HTCondor Week

Tue, 04/25/2017 - 07:38

NEW YORK, April 25, 2017 — Cycle Computing today announced that it will address attendees at HTCondor Week 2017, to be held May 2-5 in Madison, Wisconsin. Cycle will also be sponsoring a reception for attendees, slated for Wednesday, May 3rd from 6:00 pm to 7:00 pm at the event in Madison.

Cycle’s Customer Operations Manager, Andy Howard, will present “Using Docker, HTCondor, and AWS for EDA model development” on Thursday, May 4th at 1:30 pm. Andy’s session will detail how a Cycle Computing customer used HTCondor to manage Docker containers in AWS to increase productivity and throughput and reduce overall time-to-results.
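
For readers unfamiliar with the combination, the sketch below shows roughly what an HTCondor “docker universe” job description looks like, written out from a Python dict of standard condor_submit keywords. The container image, command, and resource figures are placeholders, not Cycle Computing’s actual EDA workflow.

# Minimal sketch of an HTCondor "docker universe" job description, expressed as
# a Python dict of standard condor_submit keywords. The image name, command,
# and resource requests are placeholders, not Cycle Computing's actual workflow.
submit_description = {
    "universe": "docker",
    "docker_image": "example.com/eda/model-build:latest",  # hypothetical image
    "executable": "/usr/bin/run_model",                     # hypothetical command inside the image
    "arguments": "--config model.cfg",
    "request_cpus": "4",
    "request_memory": "8GB",
    "should_transfer_files": "YES",
    "transfer_input_files": "model.cfg",
    "output": "model_$(Cluster).$(Process).out",
    "error":  "model_$(Cluster).$(Process).err",
    "log":    "model.log",
}

# Write it out in condor_submit syntax; actually submitting it requires access
# to an HTCondor pool (e.g. via condor_submit or the htcondor Python bindings).
with open("model.sub", "w") as f:
    for key, value in submit_description.items():
        f.write(f"{key} = {value}\n")
    f.write("queue 1\n")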

The HTCondor project develops, implements, deploys, and evaluates mechanisms and policies that support High Throughput Computing (HTC). Guided by both the technological and sociological challenges of such a computing environment, the Center for High Throughput Computing at UW-Madison continues to build the open source HTCondor distributed computing software and related technologies to enable scientists and engineers to increase their computing throughput. An extension of that research is HTCondor Week, the annual conference for the HTCondor batch scheduler, featuring presentations from developers and users in academia and industry. The conference gives collaborators and users the chance to exchange ideas and experiences, to learn about the latest research, to experience live demos, and to influence HTCondor’s short- and long-term research and development directions.

“At Cycle we have a great deal of history and context for HTCondor. Even today, some of our largest customers are using HTCondor under the hood in their cloud environments,” said Jason Stowe, CEO, Cycle Computing. “Simply put, HTCondor is an important scheduler to us and to our customers. We’re happy to remain part of the HTCondor community and support it with our presentation and the reception.”

Cycle Computing’s CycleCloud orchestrates Big Compute and Cloud HPC workloads, enabling users to overcome the challenges typically associated with large workloads. CycleCloud takes the delays, configuration, administration, and sunk hardware costs out of HPC clusters. CycleCloud easily leverages multi-cloud environments, moving seamlessly between internal clusters, Amazon Web Services, Google Cloud Platform, Microsoft Azure and other cloud environments.

More information about the CycleCloud cloud management software suite can be found at www.cyclecomputing.com.

ABOUT CYCLE COMPUTING

Cycle Computing is the leader in Big Compute software to manage simulation, analytics, and Big Data workloads. Cycle turns the Cloud into an innovation engine for your organization by providing simple, managed access to Big Compute. CycleCloud is the enterprise software solution for managing multiple users, running multiple applications, across multiple clouds, enabling users to never wait for compute and solve problems at any scale. Since 2005, Cycle Computing software has empowered customers in many Global 2000 manufacturing, Big 10 Life Insurance, Big 10 Pharma, Big 10 Hedge Funds, startups, and government agencies, to leverage hundreds of millions of hours of cloud based computation annually to accelerate innovation. For more information visit: www.cyclecomputing.com

Source: Cycle Computing

The post Cycle Computing Flies Into HTCondor Week appeared first on HPCwire.

IBM, Nvidia, Stone Ridge Claim Gas & Oil Simulation Record

Tue, 04/25/2017 - 06:30

IBM, Nvidia, and Stone Ridge Technology today reported setting the performance record for a “billion cell” oil and gas reservoir simulation. Using IBM Minsky servers with Nvidia P100 GPUs and Stone Ridge’s ECHELON petroleum reservoir simulation software, the trio say their effort “shatters previous (Exxon) results using one-tenth the power and 1/100th of the space. The results were achieved in 92 minutes with 60 Power processors and 120 GPU accelerators and broke the previous published record (Aramco) of 20 hours using thousands of processors.”

The “billion cell” simulation represents a heady challenge typically tackled with supercomputer-class HPC infrastructure. The Minsky, of course, is the top of IBM’s Power server line and leverages Nvidia’s fastest GPU and NVLink interconnect. This simulation used 60 processors and 120 accelerators across 30 Minsky servers; IBM owned the systems, and each Minsky had two Power8 CPUs with 256GB of memory, four Nvidia P100 GPUs, and InfiniBand EDR.

Reservoir simulation
Source: Stone Ridge
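
For scale, a quick back-of-the-envelope comparison of the two published wall-clock times (the hardware counts differ too much for a meaningful per-processor comparison):

# Back-of-the-envelope comparison of the two published wall-clock times.
previous_record_hours = 20          # prior published result (Aramco), thousands of processors
new_result_minutes = 92             # IBM/Nvidia/Stone Ridge run on 30 Minsky nodes

speedup = (previous_record_hours * 60) / new_result_minutes
print(f"Wall-clock speedup: {speedup:.1f}x")   # ~13x faster end-to-end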

“This calculation is a very salient demonstration of the computational capability and density of solution that GPUs offer. That speed lets reservoir engineers run more models and ‘what-if’ scenarios than previously so they can produce oil more efficiently, open up fewer new fields and make responsible use of limited resources,” said Vincent Natoli, president of Stone Ridge Technology, in the official announcement. “By increasing compute performance and efficiency by more than an order of magnitude, we’re democratizing HPC for the reservoir simulation community.”

According to the collaborators, the data set was taken from public information and used to mimic large oil fields like those found in the Middle East. Key code optimizations included taking advantage of the CPU-GPU NVLink and GPU-GPU NVLink in the Power systems, and scaling the software to take advantage of tens of Minsky systems in an HPC cluster.

The new solution, say the collaborators, is intended “to transform the price and performance for business critical High Performance Computing (HPC) applications for simulation and exploration.” The performance is impressive, but the solution is not cheap: IBM estimates the cost of the 30 Minsky systems at $1.5 million to $2 million. ECHELON is a standard Stone Ridge product, and IBM and Stone Ridge plan to jointly sell the new solution into the oil and gas market.

Sumit Gupta, IBM

Sumit Gupta, IBM vice president, High Performance Computing & Analytics, said, “The bottom line is that by running ECHELON on Minsky, users can achieve faster run-times using a fraction of the hardware. One recent effort used more than 700,000 processors in a server installation that occupies nearly half a football field. Stone Ridge did this calculation on two racks of IBM machines that could fit in the space of half a ping-pong table.”  

IBM has been steadily ratcheting up efforts to showcase its Power systems – including Minsky – as it tries to wrest market share in an x86-dominated landscape. Last month, the company spotlighted another Power8-based system – VOLTRON at Baylor College – which researchers used to assemble the 1.2 billion letter genome of the mosquito that carries the West Nile virus.

IBM and its collaborators argue “this latest advance” challenges misconceptions that GPUs can’t be efficient on complex application codes such as reservoir simulators and are better suited to simple, more naturally parallel applications such as seismic imaging.

They do note, “Billion cell models in the industry are rare in practice, but the calculation was accomplished to highlight the growing disparity in performance between new fully GPU based codes like ECHELON and equivalent legacy CPU codes. ECHELON scales from the cluster to the workstation and while it can turn over a billion cells on 30 servers, it can also run smaller models on a single server or even on a single Nvidia P100 board in a desktop workstation, the latter two use cases being more in the sweet spot for the industry.”

The post IBM, Nvidia, Stone Ridge Claim Gas & Oil Simulation Record appeared first on HPCwire.

ASC17 Championship to Challenge Front-end Science

Tue, 04/25/2017 - 01:01

Train an AI, challenge a Gordon Bell Prize application, optimize the latest third-generation sequencing assembly tool, and attempt to revitalize traditional scientific computing software on a cutting-edge many-core platform. All of these sound like tasks for a team of top engineers, but in fact they are the challenges that groups of university students, with an average age of 20, must overcome in the finals of the 2017 ASC Student Supercomputer Challenge (ASC17). The finals are scheduled to be held at the National Supercomputing Center in Wuxi, China, from April 24 to 28, where 20 teams from around the world will compete to be crowned champion.

In the ASC17 finals, the competitors have to use the PaddlePaddle framework to accurately predict the traffic situation in a city on a particular day in the future. This requires each team to design and build an intelligent “brain” of their own and then employ high-intensity training to coach this “brain” to produce results. They also need to ensure that the training is efficient and that the trained “brain” achieves high prediction accuracy.

MASNUM is a third-generation oceanic wave numerical model developed in China that was nominated for the Gordon Bell Prize. To take on this top-class application, the finalists will perform their calculations on the world’s fastest supercomputer, Sunway TaihuLight, attempting to extend the software’s parallel calculations to 10,000 computing cores or more.
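
For a sense of why pushing a legacy code to 10,000 or more cores is demanding, Amdahl’s law gives a rough bound: even a tiny serial fraction caps the achievable speedup. The sketch below is a generic illustration; the serial fractions are hypothetical, not measurements of MASNUM.

# Amdahl's law: an upper bound on parallel speedup when a fraction `serial`
# of the work cannot be parallelized. Generic illustration of why scaling to
# 10,000+ cores is hard; the fractions below are hypothetical, not MASNUM data.
def amdahl_speedup(cores, serial_fraction):
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)

for serial in (0.01, 0.001, 0.0001):
    print(f"serial fraction {serial:>6}: "
          f"speedup on 10,000 cores = {amdahl_speedup(10_000, serial):,.0f}x")
# serial 0.01   ->    ~99x
# serial 0.001  ->   ~909x
# serial 0.0001 -> ~5,000x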

Current third-generation gene sequencers can generate hundreds of thousands of gene fragments in each sequencing run. Once sequencing is complete, a more critical challenge emerges: scientists must assemble millions of gene fragments into a complete and correct genome and chromosome sequence. The ASC17 finalists will attempt to optimize Falcon, a third-generation gene sequencing assembly tool, and the results will help advance research in human genetics and even the origin of life.

LAMMPS is the abbreviation for Large-scale Atomic/Molecular Massively Parallel Simulator and is the most widely used molecular dynamics simulation software worldwide. It is key software for research in many cutting-edge disciplines including chemistry, materials, and molecular biology. The challenge for ASC17 finalists is to port this very mature software to the latest “Knights Landing” architecture platform and to improve its operational efficiency.
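
As a rough illustration of what tuning LAMMPS for Knights Landing involves in practice, the sketch below launches the code with its Intel acceleration package, which switches supported styles to vectorized variants and adds OpenMP threading. The launcher, binary name, rank/thread counts, and input deck are placeholders, and this is a generic example rather than any team’s actual optimization work.

# Minimal sketch of launching LAMMPS with its Intel acceleration package on a
# Knights Landing node. The MPI launcher, binary name, rank/thread counts and
# input deck are placeholders; -sf/-pk are the documented USER-INTEL options
# that select the vectorized style variants and set OpenMP threads per rank.
import subprocess

cmd = [
    "mpirun", "-np", "64",            # MPI ranks per KNL node (placeholder)
    "lmp",                            # LAMMPS executable name (placeholder)
    "-in", "in.lj",                   # input script (placeholder)
    "-sf", "intel",                   # use the Intel-accelerated style variants
    "-pk", "intel", "0", "omp", "4",  # 0 coprocessors, 4 OpenMP threads per rank
]
print(" ".join(cmd))
# subprocess.run(cmd, check=True)     # uncomment on a system with LAMMPS and MPI installed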

In addition, the teams in the ASC17 finals are required by the organizing committee to use supercomputing nodes from Inspur to design and build a supercomputer of their own within a 3000W power budget, and to optimize HPL, HPCG and one mystery application. Each team must also deliver a presentation in English.
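
As a back-of-the-envelope illustration of the design trade-off teams face, the sketch below sizes a cluster under the 3000W cap. Every figure in it (per-node power draw, per-node peak) is a hypothetical placeholder rather than data from the competition.

# Back-of-the-envelope cluster sizing under the 3000W cap. All per-node numbers
# are hypothetical placeholders; real values depend on the CPUs/accelerators chosen.
POWER_CAP_W = 3000

node_power_w = 450          # hypothetical draw of one node under HPL load
node_peak_tflops = 3.0      # hypothetical double-precision peak per node

nodes = POWER_CAP_W // node_power_w
peak_tflops = nodes * node_peak_tflops
used_power_w = nodes * node_power_w

print(f"nodes under cap: {nodes}")                                    # 6
print(f"theoretical peak: {peak_tflops:.1f} TFLOPS")                  # 18.0
print(f"peak efficiency: {peak_tflops * 1000 / used_power_w:.1f} GFLOPS/W")  # ~6.7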

The ASC Student Supercomputer Challenge was initiated by China and is supported by experts and institutions worldwide. The competition aims to be a platform for exchanges among young supercomputing talent from different countries and regions, as well as to groom young talent. It also aims to be a key driving force in promoting technological and industrial innovation by improving the standards in supercomputing applications and research. The ASC Challenge has been held for six years. This year, ASC17 is co-organized by Zhengzhou University, the National Supercomputing Centre in Wuxi, and Inspur, with 230 teams from around the world having taken part in the competition.

The post ASC17 Championship to Challenge Front-end Science appeared first on HPCwire.
