Feed aggregator

Human Genetics Centre at University of Oxford Deploys Univa Solutions

HPC Wire - Tue, 11/14/2017 - 08:51

CHICAGO, Nov. 14, 2017 — Univa, a leading innovator of workload management products, today announced its Univa Grid Engine distributed resource management system is powering the Wellcome Centre for Human Genetics’ (WHG) high performance computing (HPC) environment.

WHG is a research institute within the Nuffield Department of Medicine at the University of Oxford. The Centre is an international leader in genetics, genomics, statistics and structural biology with more than 400 researchers and 70 administrative and support personnel. WHG’s mission is to advance the understanding of genetically-related conditions through a broad range of multi-disciplinary research.

To support its research community, the Centre operates a shared HPC cluster comprising over 4,000 InfiniBand-connected, high-memory compute cores and 4PB of high-performance, parallel storage running 250 applications. WHG’s previous open source scheduler lacked practical software support and did not address the increasing use of containerized machine learning applications. To plan for growth and accommodate mixed workload types (serial-batch, array, MPI, container, Spark) on the same shared cluster, the Centre evaluated a variety of open source and commercial options. The review committee selected Univa Grid Engine as the replacement, citing its modern scheduler, expert technical support and the minimal user re-training required.
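As an illustration of what those mixed workload types look like to a Grid Engine-family scheduler, a minimal array-job script might resemble the sketch below; the job name, resource names and helper script are hypothetical, not WHG’s actual configuration:

```shell
#!/bin/bash
# Hypothetical Grid Engine array-job script. The "#$" lines are
# directives read by qsub at submission time, not executed by bash.
#$ -N variant_qc          # job name (illustrative)
#$ -t 1-100               # array job: 100 tasks, one per sample
#$ -l h_vmem=8G           # per-slot memory request (site-specific name)
#$ -cwd                   # run from the submission directory

# Each array task receives its own index in $SGE_TASK_ID.
./process_sample.sh "sample_${SGE_TASK_ID}.bam"
```

An MPI job submitted to the same cluster would swap the `-t` directive for a parallel-environment request such as `-pe mpi 64`, which is what lets serial, array and MPI work share one scheduler.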

“The conversion from the previous scheduler to Univa Grid Engine was virtually painless. Our users are happy that their hard-won knowledge continues to be relevant, significant scheduler bugs and vulnerabilities were fixed, and we also save on our own precious system administration time,” said Dr. Robert Esnouf, Head of Research Computing Core, Wellcome Centre for Human Genetics. “We can now plan for significant future growth with Univa as a key component of our infrastructure offering.”

The transition to Univa Grid Engine also provided WHG with new capabilities such as GPU-aware scheduling, DRMAA2, and container support, placing WHG in a position to embrace emerging research techniques and support a wider range of research. To learn more about how WHG expanded workloads for its life-science research, download the detailed case study.

About Wellcome Centre for Human Genetics

The Wellcome Centre for Human Genetics is a large interdisciplinary research centre comprising 400 scientists in 45 research groups, within the University of Oxford. It is one of the leading institutes, globally, in human genetics. Since its founding 21 years ago, the WHG has played a pioneering role in the progress and success of human disease genetics. The Centre’s focus is the development and implementation of novel approaches to exploit human genetics and uncover disease biology so as to improve healthcare.

About Univa Corporation

Univa is the largest independent provider of workload management products that optimize performance of applications, services and containers. Univa enables enterprises to modernize their scaled compute resources and run mixed workloads across on-premise, cloud, and hybrid infrastructures. Over 2 million computer cores are currently managed by Univa products in industries such as life sciences, manufacturing, oil and gas, transportation and financial services. Univa is headquartered in Chicago, with offices in Canada and Germany. For more information, please visit www.univa.com.

Source: Univa Corporation

The post Human Genetics Centre at University of Oxford Deploys Univa Solutions appeared first on HPCwire.

HPE Launches ARM-based Apollo System for HPC, AI

HPC Wire - Tue, 11/14/2017 - 08:12

HPE doubled down on its memory-driven computing vision while expanding its processor portfolio with the announcement yesterday of the company’s first ARM-based high performance computing system (not counting the ARM-based Moonshot, which targeted the datacenter and didn’t pan out), along with other purpose-built solutions designed to help enterprises adopt HPC and AI applications.

HPE’s new Apollo 70 system uses Cavium’s 64-bit ARMv8-A ThunderX2 Server Processor and is designed for memory-intensive HPC workloads, with up to 33 percent more memory bandwidth than industry standard servers, according to Bill Mannel, HPE’s vice president and general manager, HPC and AI Segment Solutions. He said the Cavium chip reverses recent trends in which “pretty much all of the characteristics of memory – Gbytes per core, bytes per FLOP – have been declining.”

HPE describes the Apollo 70 as a dense, scalable platform that supports standard HPE provisioning, cluster management and performance software. It provides access to Red Hat Enterprise Linux, SUSE Linux Enterprise Server for ARM, and Mellanox InfiniBand and Ethernet fabrics.

“Enterprises are looking for ways to leverage HPC and AI in their business processes to gain faster insights and intelligence for competitive advantage,” said Steve Conway, SVP of Research, Hyperion Research. “HPE’s next-generation and new Apollo systems will facilitate that adoption by providing easier integration and management while delivering extreme density to reduce data center footprint and extend the range of HPC and AI use cases.”

HPE also introduced the Apollo 4510 Gen10 System, built for object storage and, according to HPE, delivering one of the highest storage capacities in any standard depth 4U server, with up to 600TB per system and 16 percent more cores than the previous generation of the product.

“The platform is ideal for customers looking to optimize the retention and placement of massive amounts of data, using object storage as an active archive with immediate access to structured and unstructured data,” HPE said. The system supports NVMe cards that can be used as a Scality RING metadata cache, freeing 100 percent of the bulk drive bays for object data storage.

HPE’s new Apollo 2000 Gen10 System is a multi-server platform with a “plug and play” system configuration designed for customers with limited data center space who need to support enterprise HPC and deep learning applications.

The system has a scale-out architecture and a shared infrastructure. It supports NVIDIA Tesla V100 GPU accelerators for deep learning training and inference, and uses Intelligent System Tuning to accelerate application performance. It also includes proprietary HPE firmware security features, such as the HPE iLO5 server management and Silicon Root of Trust.


HPE’s StoreEver LTO-8 Tape is designed to provide an added layer of protection against cybercrime and ransomware attacks with offline and off-premises data protection. HPE called it a long-term retention solution that allows customers to offload primary storage while reducing the cost of storing data over time. It offers up to 30 terabytes of capacity per tape cartridge, double the capacity of the previous generation of tape in the same data center footprint, making it suitable for HPC data centers, HPE said. The system stores up to 300 petabytes of data in the HPE T950 tape library and 1.6 exabytes of data in the HPE TFinity tape libraries. Full-height drives offer transfer rates of up to 360MB/s, a 20 percent increase over the LTO-7 generation, according to HPE.
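Those capacity and throughput figures are mutually consistent, as a quick back-of-the-envelope check shows; the cartridge counts below are derived here for illustration and are not quoted by HPE:

```python
TB_PER_CARTRIDGE = 30  # quoted LTO-8 capacity per cartridge

# Cartridges implied by the quoted library capacities (decimal units).
t950_cartridges = 300 * 1000 / TB_PER_CARTRIDGE          # 300 PB -> TB
tfinity_cartridges = 1.6 * 1_000_000 / TB_PER_CARTRIDGE  # 1.6 EB -> TB

# The quoted 360 MB/s is a 20 percent uplift over the LTO-7
# full-height drive rate.
lto7_rate_mb_s = 360 / 1.2

print(int(t950_cartridges), int(tfinity_cartridges), lto7_rate_mb_s)
```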

According to Hyperion Research (formerly a group within industry analyst firm IDC), HPE is the HPC market leader, with 36.4 percent market share and a 4.4 percent share gain in CYQ2. The company, along with other systems providers, has attempted to expand its market reach by offering systems designed to accelerate adoption of advanced-scale and HPC-class technologies in the enterprise. But Mannel admitted that these attempts, by HPE and its competitors, have met with mixed results.

The problem, he said, lies in the nature and accessibility of the lower end of the HPC market, for which pre-packaged solutions are best suited. The upper portions of the HPC market are organizations with advanced computing and sophisticated staff resources, some of which even run their own HPC stacks and cluster managers. The middle market tends to buy packaged solutions on an a la carte basis, Mannel said, tying the elements together themselves into an integrated HPC capability.

He said it’s the lower portion of the market – start-ups or companies with 50-100 employees that are new to HPC and to AI – that has the strongest need for help with pre-built solutions.

“But can I say any of us have been successful at that?” Mannel said. “I’d say not as much as we’d like to be… I think all the companies, including ourselves, have a challenge getting to those customers. The format is right, they want these kinds of turnkey solutions. But how we get to them is a challenge.”

HPE said the Apollo 2000 and Apollo 4510 are available today. The StoreEver LTO-8 Tape will be available next month and the Apollo 70 in 2018.

The post HPE Launches ARM-based Apollo System for HPC, AI appeared first on HPCwire.

ESnet Renews, Upgrades Transatlantic Network Connections

HPC Wire - Tue, 11/14/2017 - 07:49

Nov. 14, 2017 — Three years after ESnet first deployed its own transatlantic networking connection, the project is now being upgraded to four 100 gigabits-per-second links. These links give researchers at America’s national laboratories and universities ultra-fast access to scientific data from the Large Hadron Collider (LHC) and other research sites in Europe.

The original configuration that went into service in December 2014 consisted of three 100 Gbps and one 40 Gbps links. Since December 2014, the LHC traffic being carried by ESnet alone has grown nearly 1600%, from 1.7 Petabytes per month in January 2015, to nearly 30 Petabytes per month in August 2017.
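A quick sanity check on those traffic figures, using only the numbers quoted above:

```python
jan_2015_pb_per_month = 1.7   # quoted ESnet LHC traffic, January 2015
aug_2017_pb_per_month = 30.0  # quoted traffic, August 2017

growth_factor = aug_2017_pb_per_month / jan_2015_pb_per_month
percent_increase = (growth_factor - 1) * 100

# Roughly a 17.6x increase, on the order of 1,600-1,700 percent.
print(round(growth_factor, 1), round(percent_increase))
```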

The four new connections link peering points in New York City and London, Boston and Amsterdam, New York and London, and Washington, D.C. and CERN in Switzerland. The contracts are with three different telecom carriers.

“Our initial approach was to build in redundancy in terms of both infrastructure and vendors and the past three years proved the validity of that idea,” said ESnet Director Inder Monga. “So, we stuck with those design principles while upgrading the fourth link to 100G.”

The new agreements accomplish several overall goals:

  • Increase in overall capacity to meet projected demand
  • Reduction in the overall cost
  • Increase in the diversity of the cable systems providing ESnet circuits, and
  • Maintenance of as much R&E network community transatlantic cable diversity as possible, including that of the Advanced North Atlantic Collaboration.

Another new component is a collaboration with Indiana University, funded by the National Science Foundation through its Networks for European, American and African Research (NEAAR) award within the International Research Network Connections (IRNC) program. The goal of NEAAR is to make science data from Africa, such as that collected by the Square Kilometer Array, and from Europe, such as data from CERN’s Large Hadron Collider, available to a broader research community.

With the upgrade, the total transatlantic capacity for Research and Education networks is now 800 Gbps, continuing the close collaboration between the seven partners providing transatlantic connectivity under the broader umbrella of the Global Network Architecture Initiative (GNA).

Source: ESnet

The post ESnet Renews, Upgrades Transatlantic Network Connections appeared first on HPCwire.

CoolIT Systems Announces Liquid Cooled Intel Buchanan Pass Server

HPC Wire - Tue, 11/14/2017 - 07:16

CALGARY, Alberta, Nov. 14, 2017 — CoolIT Systems (CoolIT), a global leader in energy efficient liquid cooling solutions for HPC, Cloud and Hyperscale markets, today announced a liquid cooling solution to support Intel Compute Module HNS2600BPB (Buchanan Pass).

CoolIT has developed an innovative Rack DCLC coldplate solution, featuring its patented Split-Flow design, for the Intel Buchanan Pass. The liquid cooling solution for this 2U, four-node server manages heat from the dual Intel Xeon Scalable processors (Skylake), voltage regulators, and memory. A sample of the DCLC-enabled Buchanan Pass server will be showcased by CoolIT during SC17 from November 13-16 in Denver, Colorado (booth 1601).

“When Intel approached us to support this server with a high efficiency liquid cooling solution, our team was excited to accept the challenge,” says CoolIT Systems VP of Product Marketing, Patrick McGinn. “This tightly integrated solution creates a very dense, energy saving platform that showcases how liquid cooling can be implemented without sacrificing serviceability.”

CoolIT’s Rack DCLC technology has quickly become the leading choice for tier 1 server OEMs around the world. HPC data centers are using CoolIT’s coldplates, the highest performing on the market, to enable higher-performance servers, increase rack density, and lower their total cost of ownership.

About CoolIT Systems

CoolIT Systems, Inc. is a world leader in energy efficient liquid cooling technology for the Data Center, Server and Desktop markets. CoolIT’s Rack DCLC platform is a modular, rack-based, advanced cooling solution that allows for dramatic increases in rack densities, component performance, and power efficiencies. The technology can be deployed with any server and in any rack making it a truly flexible solution. For more information about CoolIT Systems and its technology, visit www.coolitsystems.com.

About Supercomputing Conference (SC17)

Established in 1988, the annual SC conference continues to grow steadily in size and impact each year. Approximately 5,000 people participate in the technical program, with about 11,000 people overall. SC has built a diverse community of participants including researchers, scientists, application developers, computing center staff and management, computing industry staff, agency program managers, journalists, and congressional staffers. This diversity is one of the conference’s main strengths, making it a yearly “must attend” forum for stakeholders throughout the technical computing community. For more information, visit http://sc17.supercomputing.org/.

Source: CoolIT Systems

The post CoolIT Systems Announces Liquid Cooled Intel Buchanan Pass Server appeared first on HPCwire.

Mines Residence Life staff win major conference awards

Colorado School of Mines - Mon, 11/13/2017 - 11:37

Colorado School of Mines residence life staff took home multiple major awards from the Intermountain Affiliate of College and University Residence Halls Regional Leadership Conference, held Nov. 3-6 in Albuquerque, New Mexico.

Mines received the Program of the Year award for bringing members of the U.S. Paralympic goalball team to campus to teach students how to play the game and then organizing a campus tournament. Goalball, designed for athletes with impaired vision, has teams competing to throw a ball with bells inside into their opponents’ goal.

Chase Schumacher, an engineering physics major and a second-year resident assistant in Weaver Towers, was named Student Staff Member of the Year.

Mary F. Elliott, Mines’ director of housing and residence life, was named Advisor of the Year.

Mines students were also honored for presenting two of the top 12 programs at the conference. Brandon Bakka, a chemical engineering student, was recognized for “How LGBTQ+ People Navigate the Jungle of College Campuses.” Schumacher and Keenan Urmann, a mechanical engineering student, were recognized for “Miracle Gro Fer(Tea)lizer,” a weekly Tuesday Tea program in the residence halls.

CONTACT
Mark Ramirez, Managing Editor, Communications and Marketing | 303-273-3088 | ramirez@mines.edu
Emilie Rusch, Public Information Specialist, Communications and Marketing | 303-273-3361 | erusch@mines.edu

Categories: Partner News

Netlist, Nyriad and TYAN to Accelerate the Adoption of NVDIMMs and GPUs for Storage

HPC Wire - Mon, 11/13/2017 - 10:49

DENVER, Nov. 13, 2017 — Netlist, Inc. (NASDAQ: NLST), Nyriad and TYAN today announced a solution to support Netlist NVvault non-volatile memory for cache acceleration in Nyriad’s graphics processing unit (GPU)-accelerated storage platform, NSULATE on a TYAN Thunder server.

By adopting Netlist’s NVvault DDR4 NVDIMM-N non-volatile memory, Nyriad NSULATE-based storage systems can be configured to achieve millions of IOPS, sustaining high throughput while also enabling levels of storage resilience and integrity that are impossible with traditional central processing unit (CPU)- or redundant array of independent disks (RAID)-based solutions.

The Netlist and Nyriad technologies will be showcased on a TYAN Thunder HX FT77D-B7109 dual root complex 4U 8GPU server configured with Netlist’s NVvault at the SuperComputing 2017 Conference Exhibition taking place in Denver, CO from November 13-16. Additional information on the demonstration will be available at Netlist’s booth #2069 and TYAN’s booth #1269.

C.K. Hong, Netlist Chief Executive Officer, said, “NVvault, which is part of our storage-class memory family of solutions, is vital to Nyriad’s NSULATE accelerated and resilient storage-processing architecture. When combined with TYAN’s latest server targeted at big data and high-performance computing applications, we have created a game changing platform to drive improved IOPS (input/output operations per second), security, scale, performance and total storage array cost per terabyte. The solution enables NVvault to bring substantial performance benefits to end user applications such as big-data analytics by storing data in a way that is directly accessible to high performance GPUs.”

Nyriad Chief Executive Officer Matthew Simmons stated, “Processing and storing large volumes of data has become so I/O (input/output) intensive that traditional storage and network fabrics can’t cope with the volume of information that needs to be processed and stored in real-time. However, GPUs have become the dominant solution for modern high-performance computing, big-data and machine learning applications.  Our collaboration with Netlist and TYAN has broken this bottleneck and will enable major leaps in exascale storage performance and efficiency.”

Danny Hsu, Vice President of MiTAC Computing Technology Corporation’s TYAN Business Unit stated, “For many years, TYAN has met the ongoing challenge to provide efficient and powerful products that can support demanding applications in many areas, including the storage and high-performance computer space. Towards this goal, we are working with Netlist and Nyriad to define a new kind of computing solution to address vastly larger data sets and analytics, offering huge performance gains for customers worldwide.”

Netlist’s NVvault DDR4 is an NVDIMM-N that provides data acceleration and protection in a JEDEC standard DDR4 interface. It is designed to be integrated into industry standard server or storage solutions.  NVvault is a persistent memory technology that has been widely adopted by industry standard servers and storage systems.  By combining the high performance of DDR4 DRAM with the non-volatility of NAND Flash, NVvault improves the performance and data preservation found in storage virtualization, RAID, cache protection, and data logging applications requiring high-throughput.

Nyriad’s NSULATE solves these problems by replacing RAID controllers with GPUs for all Linux storage applications. This enables the GPUs to perform double duty as both I/O controllers and compute accelerators in the same integrated solution. The combination of Netlist NV Memory with NSULATE produces the best of both worlds, with low-latency IOPS achievable by any storage solution combined with maximum data resilience, security, throughput and efficiency in the same architecture.

The first next-generation solutions based on the Netlist and Nyriad technology are expected to appear in the market from leading industry partners early next year.

About Netlist

Netlist is a leading provider of high-performance modular memory subsystems serving customers in diverse industries that require superior memory performance to empower critical business decisions. Flagship products NVvault and EXPRESSvault enable customers to accelerate data running through their servers and storage and reliably protect enterprise-level cache, metadata and log data by providing near instantaneous recovery in the event of a system failure or power outage. HybriDIMM, Netlist’s next-generation storage class memory product, addresses the growing need for real-time analytics in Big Data applications and in-memory databases. Netlist holds a portfolio of patents, many seminal, in the areas of hybrid memory, storage class memory, rank multiplication and load reduction. Netlist is part of the Russell Microcap Index.  To learn more, visit www.netlist.com.

About Nyriad

Nyriad is a New Zealand-based exascale computing company specializing in advanced data storage solutions for big data and high-performance computing. Born out of its consulting work on the Square Kilometre Array Project, the company was forced to rethink the relationship between storage, processing and bandwidth to achieve a breakthrough in system stability and performance, capable of processing and storing over 160Tb/s of radio antenna data in real time within a power budget that would be impossible to meet with conventional IT solutions.

About TYAN

TYAN, as a leading server brand of MiTAC Computing Technology Corporation under the MiTAC Group (TSE:3706), designs, manufactures and markets advanced x86 and x86-64 server/workstation board technology, platforms and server solution products. Its products are sold to OEMs, VARs, System Integrators and Resellers worldwide for a wide range of applications. TYAN enables its customers to be technology leaders by providing scalable, highly-integrated, and reliable products for a wide range of applications such as server appliances and solutions for HPC, hyper-scale/data center, server storage and security appliance markets. For more information, visit MiTAC’s website at http://www.mic-holdings.com or TYAN’s website at http://www.tyan.com.

Source: Netlist

The post Netlist, Nyriad and TYAN to Accelerate the Adoption of NVDIMMs and GPUs for Storage appeared first on HPCwire.

Red Hat Introduces Arm Server Support for Red Hat Enterprise Linux

HPC Wire - Mon, 11/13/2017 - 10:31

Nov. 13, 2017 — Today marks a milestone for Red Hat Enterprise Linux with the addition of a new architecture to its list of fully supported platforms. Red Hat Enterprise Linux for ARM is a part of its multi-architecture strategy and the culmination of a multi-year collaboration with the upstream community and its silicon and hardware partners.

The Arm ecosystem has emerged over the last several years with server-optimized SoC (system on chip) products that are designed for cloud and hyperscale, telco and edge computing, as well as high-performance computing applications. Arm SoC designs take advantage of advances in CPU technology, system-level hardware, and packaging to offer additional choices to customers looking for tightly integrated hardware solutions.

Red Hat took a pragmatic approach to Arm servers by helping to drive open standards and develop communities of customers, partners and a broad ecosystem. Its goal was to develop a single operating platform across multiple 64-bit ARMv8-A server-class SoCs from various suppliers, using the same sources to build user functionality and a consistent feature set that enables customers to deploy across a range of server implementations while maintaining application compatibility.

In 2015, Red Hat introduced a Development Preview of the operating system to silicon partners, such as Cavium and Qualcomm, and OEM partners, like HPE, that designed and built systems based on a 64-bit Arm architecture. A great example of this collaboration was the advanced technology demonstration by HPE, Cavium, and Red Hat at the International Supercomputing conference in June 2017. That prototype solution became part of HPE’s Apollo 70 system, announced today. If you are attending SuperComputing17 this week, stop by Red Hat’s booth (#1763) to learn more about this new system.

Red Hat’s focus is to provide software support for multiple architectures powered by a single operating platform – Red Hat Enterprise Linux – and driven by open innovation. Red Hat Enterprise Linux 7.4 for ARM, the first commercial release for this architecture, provides a proven and more secure enterprise-grade platform both for customers who have been planning to run their workloads on Arm and for software and hardware partners that require a stable operating environment for continued development. Red Hat plans to continue working with the ecosystem to expand the reach of Red Hat Enterprise Linux 7.4 for ARM.

In addition to Red Hat Enterprise Linux, Red Hat is also shipping Red Hat Software Collections 3, Red Hat Developer Toolset 7, and single-host KVM virtualization (as an unsupported Development Preview) for this architecture.

To learn more about Red Hat Enterprise Linux 7.4 for ARM, see the release notes at https://access.redhat.com/articles/3158541

Source: Red Hat

The post Red Hat Introduces Arm Server Support for Red Hat Enterprise Linux appeared first on HPCwire.

Penguin Computing Announces Intel Xeon Scalable Processor Availability for On-Demand HPC Cloud

HPC Wire - Mon, 11/13/2017 - 09:04

FREMONT, Calif., Nov. 13, 2017 — Penguin Computing, provider of high performance computing, enterprise data center and cloud solutions, today announced that more than 11,500 cores of the latest Intel Xeon Scalable processor (codenamed: Skylake-SP) will be available in December 2017 on Penguin Computing On-Demand (POD) HPC cloud. The new POD HPC cloud compute resources use Intel Xeon Gold 6148 processors, a cluster-wide Intel Omni-Path Architecture low-latency fabric and are integrated with Penguin Computing Scyld Cloud Workstation for web-based, remote desktop access into the public HPC cloud service.

“As an HPC cloud provider, we know that it is critical to provide our customers with the latest processor technologies,” said Victor Gregorio, Senior Vice President, Cloud Services, Penguin Computing. “The latest Intel Xeon Scalable processor expansion will provide an ideal compute environment for MPI workloads that can leverage thousands of cores for computation. We have significant customer demand for POD HPC cloud in applicable areas like high-resolution weather forecasting and computational fluid dynamics, including solutions from software partners like ANSYS, Flow Science and CD-adapco.”

“Intel offers a balanced portfolio of HPC optimized components like the Intel Xeon Scalable processor and Intel Omni-Path Architecture, which provides the foundation for researchers and innovators to drive new discoveries and build new products faster than ever before,” said Trish Damkroger, Vice President of technical computing at Intel. “Penguin Computing On-Demand provides an easy and flexible path to access the latest technology so more users can realize the benefits of HPC.”

Scientists and engineers at every company are trying to innovate faster while holding down costs. Modeling and simulation are the backbone of these efforts. Customers may wish to run simulations at scale, or many different permutations simultaneously, but may require more computing resources than are readily available in-house. The POD HPC cloud offers organizations a flexible, cost-effective approach to meeting these requirements.

The Intel Xeon Scalable processor provides increased performance, a unified stack optimized for key workloads including data analytics, and integrated technologies including networking, acceleration and storage. The processor’s increased performance is realized through innovations including Intel® AVX-512 extensions that can deliver up to 2x FLOPS per clock cycle, which is especially important for HPC, data analytics and hardware-enhanced security/cryptography workloads. Along with numerous acceleration refinements, the new processor offers integrated 100 Gb/s Intel® Omni-Path Architecture fabric options. With these improvements, the Intel Xeon Scalable Platinum 8180 processor yielded an increase of up to 8.2x more double precision GFLOPS/sec when compared to the Intel Xeon processor E5-2690 (codenamed Sandy Bridge) common in the server installed base, and a 2.27x increase over the previous-generation Intel Xeon processor E5-2699 v4 (codenamed Broadwell).
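A rough way to see where those per-clock gains come from is the standard peak-FLOPS formula: cores x clock x double-precision FLOPs per core per cycle. The sketch below uses nominal base clocks and published per-cycle throughputs for illustration only; measured speedups such as the 8.2x above are lower than the theoretical ratio because AVX-heavy code runs at reduced clock speeds and real workloads are not pure FMA:

```python
def peak_dp_gflops(cores, base_ghz, dp_flops_per_cycle):
    """Theoretical peak double-precision GFLOPS for one socket."""
    return cores * base_ghz * dp_flops_per_cycle

# DP FLOPs per core per cycle: Sandy Bridge (256-bit AVX, separate
# add and multiply units) = 8; Skylake-SP (two 512-bit FMA units) = 32.
sandy_bridge = peak_dp_gflops(8, 2.9, 8)    # Xeon E5-2690
skylake_sp = peak_dp_gflops(28, 2.5, 32)    # Xeon Platinum 8180

print(skylake_sp, skylake_sp / sandy_bridge)
```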

The doubling of cores in the publicly available POD HPC cloud resources in 2017 was preceded by a 50 percent increase in capacity in 2016. As customer demand continues to increase, POD HPC cloud will continue to grow, using the most current technologies to deliver the actionable insights that organizations require.

Visit Penguin Computing at Booth 1801 during SC17 in Denver.

About Penguin Computing

Penguin Computing is one of the largest private suppliers of enterprise and high-performance computing solutions in North America and has built and operates the leading specialized public HPC cloud service, Penguin Computing On-Demand (POD). Penguin Computing pioneers the design, engineering, integration and delivery of solutions that are based on open architectures and comprise non-proprietary components from a variety of vendors. Penguin Computing is also one of a limited number of authorized Open Compute Project (OCP) solution providers leveraging this Facebook-led initiative to bring the most efficient open data center solutions to a broader market, and has announced the Tundra product line which applies the benefits of OCP to high performance computing. Penguin Computing has systems installed with more than 2,500 customers in 40 countries across eight major vertical markets.

Source: Penguin Computing

The post Penguin Computing Announces Intel Xeon Scalable Processor Availability for On-Demand HPC Cloud appeared first on HPCwire.

Cavium and Leading Partners to Showcase ThunderX2 Arm-Based Server Platforms and FastLinQ Ethernet Adapters for HPC at SC17

HPC Wire - Mon, 11/13/2017 - 08:52

SAN JOSE, Calif. and DENVER, Nov. 13, 2017 — Cavium, Inc. (NASDAQ: CAVM), a leading provider of semiconductor products that enable secure and intelligent processing for enterprise, data center, wired and wireless networking, will showcase various ThunderX2 Arm-based server platforms for high performance computing at this year’s Supercomputing (SC17) conference taking place in the Colorado Convention Center in Denver, Colorado from November 13th to 16th.

The ThunderX2 server SoC integrates fully out-of-order, high-performance custom cores and supports single- and dual-socket configurations. ThunderX2 is optimized to drive high computational performance, delivering outstanding memory bandwidth and memory capacity. The new line of ThunderX2 processors includes multiple SKUs for both scale-up and scale-out applications and is fully compliant with the Armv8-A architecture specifications as well as the Arm Server Base System Architecture and Arm Server Base Boot Requirements standards.

The ThunderX2 SoC family is supported by a comprehensive software ecosystem ranging from platform-level systems management and firmware to commercial operating systems, development environments and applications. Cavium has actively engaged in server industry standards groups such as UEFI and delivered numerous reference platforms to a broad array of community and corporate partners. Cavium has also demonstrated its leadership role in the open source software community, driving upstream kernel enablement and toolchain optimization, actively contributing to Linaro’s Enterprise and Networking Groups, investing in key Linux Foundation projects such as DPDK, OpenHPC, OPNFV and Xen, and sponsoring the FreeBSD Foundation’s Armv8 server implementation.

SC17 Show Highlights and Product Demonstrations

Cavium’s executive leaders and technology experts will be available to discuss the company’s ThunderX2 processor technology, platforms, roadmap and HPC target solutions while demonstrating a range of platforms and configurations. Many of Cavium’s key partners will also be present with demonstrations that include system implementation, system software, tools and applications. In addition to the ThunderX2-based ODM and OEM platforms and Cavium’s FastLinQ Ethernet Adapters, the following product demonstrations will be on display on the show floor and at Cavium’s booth #349.

  • Cavium ThunderX2 – 64-bit Armv8-based SoC family that significantly increases performance, memory bandwidth and memory capacity. We will be demonstrating various applications running on ThunderX2 in both single- and dual-socket configurations. Cavium’s systems partners Bull/Atos (Booth #1925), Cray (Booth #625), Gigabyte (Booth #2151), HPE (Booth #925), and Penguin (Booth #1801) will be showcasing HPC platforms based on ThunderX2. Cavium’s software partners will be demonstrating a variety of software tools and applications optimized for ThunderX2. In addition, a full rack of ThunderX2-based systems will be showcased in HPE’s Comanche collaboration booth #494.
  • Cavium FastLinQ – 10/25/40/50/100Gb Ethernet adapters that enable the highest level of application performance with the industry’s only Universal RDMA capability, supporting RoCE v1, RoCE v2 and iWARP concurrently. With the explosion of data there is a critical need for fast and intelligent I/O throughout the data center. Cavium FastLinQ products enable machine learning, data analytics and NVMe over Fabrics storage while maximizing system performance.

The following additional presentations by Cavium will cover ThunderX2 updates, the Arm ecosystem, and end-user optimizations focused on HPC.

  • On Monday, November 13, 2017 at 3:30 pm, Surya Hotha, Director of Product Marketing for Cavium’s Datacenter Processor Group, will present ThunderX2 in HPC applications at the third annual Arm SC HPC User Forum.
  • On Tuesday, November 14, 2017 at 10:30 am, Giri Chukkapalli, Distinguished Engineer, will present a ThunderX2 technology overview at the Red Hat Theater.
  • On Tuesday, November 14, 2017 at 2:30 pm, Varun Shah, Product Marketing Manager for Cavium’s Datacenter Processor Group, will present ThunderX2 advantages for the HPC market at the Exhibitor Forum.
  • On Tuesday, November 14, 2017 at 2:30 pm, Giri Chukkapalli, Distinguished Engineer, will present a ThunderX2 technology overview at the HPE Theater.
  • On Wednesday, November 15, 2017 at 2:30 pm, Cavium experts will present at the SUSE booth.

To schedule a meeting at SC17, please send an email to sales@cavium.com and enter SC17 Meeting Request in the subject line.

About Cavium

Cavium, Inc. (NASDAQ: CAVM), offers a broad portfolio of infrastructure solutions for compute, security, storage, switching, connectivity and baseband processing. Cavium’s highly integrated multi-core SoC products deliver software compatible solutions across low to high performance points enabling secure and intelligent functionality in Enterprise, Data Center and Service Provider Equipment. Cavium processors and solutions are supported by an extensive ecosystem of operating systems, tools, application stacks, hardware-reference designs and other products. Cavium is headquartered in San Jose, CA with design centers in California, Massachusetts, India, Israel, China and Taiwan. For more information, please visit: http://www.cavium.com.

Source: Cavium

The post Cavium and Leading Partners to Showcase ThunderX2 Arm-Based Server Platforms and FastLinQ Ethernet Adapters for HPC at SC17 appeared first on HPCwire.

Oak Ridge National Laboratory Acquires Atos Quantum Learning Machine

HPC Wire - Mon, 11/13/2017 - 08:31

PARIS and IRVING, Tex., Nov. 13, 2017 — Atos, a global leader in digital transformation, today announces a new contract with US-based Oak Ridge National Laboratory (ORNL) for a 30-Qubit Atos Quantum Learning Machine (QLM), the world’s highest-performing quantum simulator.

Designed by the ‘Atos Quantum’ laboratory, the first major quantum industry program in Europe, the Atos QLM combines an ultra-compact machine with a universal programming language. The appliance enables researchers and engineers to develop and test today the quantum applications and algorithms of tomorrow.

As the Department of Energy’s largest multi-program science and energy laboratory, ORNL employs almost 5,000 people, including scientists and engineers in more than 100 disciplines. The Atos QLM-30, which processes up to 30 quantum bits (qubits) in memory, was operational at ORNL within hours thanks to Atos’ fast-start process. Set up as a stand-alone appliance, the Atos QLM runs on premises, ensuring the confidentiality of clients’ research programs and data.

Dr. Travis Humble, Director of ORNL’s Quantum Computing Institute, says:

“At ORNL, we are preparing for the next-generation of high-performance computing by investigating unique technologies such as quantum computing.

We are researching how quantum computing can provide new methods for advancing scientific applications important to the Department of Energy.

Our researchers focus on applications in the physical sciences, such as chemistry, materials science, and biology, as well as the applied and data sciences. Numerical simulation helps to guide the development of these scientific applications and supports understanding of program correctness. The Atos Quantum Learning Machine provides a unique platform for testing new quantum programming ideas.”

Thierry Breton, CEO and Chairman of Atos, adds:

“We are glad to accompany Oak Ridge National Laboratory from the outset in what is likely to be the next major technological evolution. Thanks to our Atos Quantum Learning Machine, designed by our quantum lab supported by an internationally renowned Scientific Council, researchers from the Department of Energy will benefit from a simulation environment which will enable them to develop quantum algorithms to prepare for the major accelerations to come.”

In the coming years, quantum computing should be able to tackle the explosion of data brought about by Big Data and the Internet of Things. Thanks to its innovative targeted computing acceleration capabilities, based in particular on the exascale-class supercomputer Bull Sequana, quantum computing should also foster developments in deep learning, algorithms and artificial intelligence for domains as varied as pharmaceuticals or new materials. To move forward on these issues, Atos plans to set up several partnerships with research centers and universities around the world.

About Atos

Atos is a global leader in digital transformation with approximately 100,000 employees in 72 countries and annual revenue of around € 12 billion. European number one in Big Data, Cybersecurity, High Performance Computing and Digital Workplace, the Group provides Cloud services, Infrastructure & Data Management, Business & Platform solutions, as well as transactional services through Worldline, the European leader in the payment industry. With its cutting-edge technologies, digital expertise and industry knowledge, Atos supports the digital transformation of its clients across various business sectors: Defense, Financial Services, Health, Manufacturing, Media, Energy & Utilities, Public sector, Retail, Telecommunications and Transportation. The Group is the Worldwide Information Technology Partner for the Olympic & Paralympic Games and operates under the brands Atos, Atos Consulting, Atos Worldgrid, Bull, Canopy, Unify and Worldline. Atos SE (Societas Europaea) is listed on the CAC40 Paris stock index.

Source: Atos

The post Oak Ridge National Laboratory Acquires Atos Quantum Learning Machine appeared first on HPCwire.

DDN Announces New Solutions and Next Generation Monitoring Tools

HPC Wire - Mon, 11/13/2017 - 08:26

DENVER and SANTA CLARA, Calif., Nov. 13, 2017 — DataDirect Networks (DDN) today announced new high-performance computing (HPC) storage solutions and capabilities, which it will feature this week at the International Conference for High Performance Computing, Networking, Storage and Analysis (SC17) in Denver, Colorado. The new solutions include an entry-level burst buffer appliance (IME140) for cost-effective I/O acceleration, a next generation monitoring software (DDN Insight), and the company’s new declustered RAID solution (SFA Declustered RAID “DCR”) for increased data protection in massive storage pools. DDN also announced recent HPC customer wins in some of the world’s largest supercomputing centers.

“Modern HPC workflows require new levels of performance, flexibility and reliability to turn data and ideas into value,” said John Abbott, founder and research VP, 451 Research. “With its long-standing HPC storage heritage, DDN is strongly positioned with closely integrated components that can deliver extreme I/O performance, comprehensive monitoring at scale and new levels of data protection.”

New DDN Solutions and Next Generation Monitoring Tools

HPC and data-intensive enterprise environments are facing new pressures that stem from higher application diversity and sophistication along with steep growth in the volume of active datasets. These trends present a tough challenge to today’s filesystems in delivering the performance and economics to match business needs and compute capability. In addition, as rotational drive capacities grow, the risk of data loss increases due to longer drive rebuild times. DDN’s latest technology innovations deliver the enhanced performance, flexibility and management simplicity needed to solve these challenges and to accelerate large-scale workflows for greater operational efficiency and ROI.

  • DDN IME140 
    DDN has expanded its IME product line with the new IME140, which makes IME scale-out flash accessible to more organizations at lower cost. The IME140 supports extreme file performance in a small 1U flash data appliance. Each appliance can deliver more than 11GB/s write and 20GB/s read throughput and more than 1M file IOPS (read and write). Starting with a resilient solution as small as four units, the IME140 allows organizations to cost-effectively scale performance independent of the amount of capacity required. Traditional parallel file systems often cannot keep pace with the mixed I/O requirements of modern workloads and fail to deliver the potential of flash. The IME software implements a faster, leaner data path that delivers to applications the low latencies and high throughputs of NVMe. The IME140 1U building block allows organizations to intelligently apply fast flash where it is needed, while maintaining cost-effective capacity on HDD within the file system.
  • DDN Insight 
    DDN Insight is DDN’s next-generation monitoring software.  Easy to deploy, DDN Insight allows customers to monitor the most challenging environments at scale, across multiple file systems and storage appliances. With DDN Insight customers can quickly identify and address hot spots, bottlenecks and misbehaving applications. Tightly integrated with SFA, EXAScaler, GRIDScaler and IME, DDN Insight delivers an intuitive way for customers to comprehensively monitor their complete DDN-based ecosystem.
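
The per-appliance throughput figures quoted for the IME140 above lend themselves to simple sizing arithmetic. A minimal sketch, with illustrative targets and a hypothetical helper (this is back-of-envelope math from the quoted specs, not a DDN sizing tool):

```python
import math

# Per-1U-appliance figures quoted above: >11 GB/s write, >20 GB/s read.
# Resilient configurations start at four units, per the announcement.
WRITE_GBS, READ_GBS, MIN_UNITS = 11, 20, 4

def units_for(target_write_gbs, target_read_gbs):
    """Smallest appliance count meeting both throughput targets."""
    need = max(math.ceil(target_write_gbs / WRITE_GBS),
               math.ceil(target_read_gbs / READ_GBS))
    return max(need, MIN_UNITS)  # never fewer than the minimum resilient config

print(units_for(100, 150))  # 10 units for 100 GB/s write and 150 GB/s read
```

The point of the exercise is the one DDN makes in prose: performance scales by adding 1U building blocks, independent of how much HDD capacity sits behind them.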

Availability

The DDN IME140 will ship in volume in the first quarter of 2018. The SFA DCR is shipping today with the SFA14KX, and DDN Insight monitoring software is integrated and shipping today with DDN’s SFA, EXAScaler, GRIDScaler and IME solutions.

About DDN

DataDirect Networks (DDN) is a leading big data storage supplier to data-intensive, global organizations. For almost 20 years, DDN has designed, developed, deployed and optimized systems, software and storage solutions that enable enterprises, service providers, universities and government agencies to generate more value and to accelerate time to insight from their data and information, on premise and in the cloud. Organizations leverage the power of DDN storage technology and the deep technical expertise of its team to capture, store, process, analyze, collaborate and distribute data, information and content at the largest scale in the most efficient, reliable and cost-effective manner. DDN customers include many of the world’s leading financial services firms and banks, healthcare and life science organizations, manufacturing and energy companies, government and research facilities, and web and cloud service providers. For more information, go to www.ddn.com or call 1-800-837-2298.

Source: DDN

The post DDN Announces New Solutions and Next Generation Monitoring Tools appeared first on HPCwire.

CoolIT Systems Showcases Newest Datacenter Liquid Cooling Innovations for OEM and Enterprise Customers at SC17

HPC Wire - Mon, 11/13/2017 - 08:04

DENVER, Nov. 13, 2017 — CoolIT Systems (CoolIT), a global leader in energy efficient liquid cooling solutions for HPC, Cloud and Hyperscale markets, returns to the highly-anticipated Supercomputing Conference 2017 (SC17) in Denver, Colorado for the sixth consecutive year with its latest Rack DCLC and Closed-Loop DCLC innovations for data centers and servers.

As the most popular integration partner for OEM server manufacturers, CoolIT will showcase liquid-enabled servers from Intel, Dell EMC, HPE, and Huawei. Combined with the broadest range of heat exchangers and supporting liquid infrastructure, CoolIT and their OEM partners are delivering the most complete and robust liquid cooling solutions to the HPC market. CoolIT OEM solutions being shown at booth 1601 include:

  • Intel Buchanan Pass – CoolIT is pleased to announce the liquid-enabled Buchanan Pass server with coldplates managing heat from the processor, voltage regulator, and memory.
  • Dell EMC PowerEdge C6420 – this liquid-enabled server will be on display at the Dell EMC booth (#913) within a fully populated rack, including stainless steel Manifold Modules and the best-in-class CHx80 Heat Exchange Module. With factory-installed liquid cooling, this server is purpose-built for high performance and hyperscale workloads.
  • HPE Apollo 2000 Gen9 System – optimized with Rack DCLC to significantly enhance overall data center performance and efficiency.
  • HPE Apollo Trade and Match Server Solution – optimized with Closed-Loop DCLC to increase density, decrease TCO and take advantage of enhanced performance to capitalize on High Frequency Trading trends.
  • STULZ Micro Data Center – combining CoolIT’s Rack DCLC with STULZ’ world-renowned mission critical air cooling products to create a single enclosed solution for managing high-density compute requirements.

Debuting at SC17 are two industry-first Heat Exchange Modules:

  • Rack DCLC AHx10, CoolIT’s new Liquid-to-Air CDU that delivers the benefits of rack level liquid cooling without the requirement for facility water. The standard 5U system manages 7kW at 25°C ambient air temperature and is expandable to 6U or 7U configurations (via the available expansion kit) to scale capacity up to 10kW of heat load.
  • Rack DCLC AHx2, CoolIT’s new Liquid-to-Air heat exchanger tool for OEMs and System Integrators for DCLC enabled servers to be thermally tested during the factory burn-in process, without liquid cooling infrastructure.

CoolIT will also showcase its Liquid-to-Liquid heat exchangers, including the stand-alone Rack DCLC CHx650 and the 4U Rack DCLC CHx80, which provides 80-100kW cooling capacity with N+1 reliability to manage the most challenging, high-density HPC racks.

For the first time, CoolIT will showcase its advanced Rack DCLC Command2 Control System for Heat Exchange Modules. Attendees can experience the plug-and-play functionality of Command2, including built-in autonomous controls and sophisticated safety features.

The latest CPU and GPU coldplate assemblies to support CoolIT’s passive Rack DCLC platform will be displayed, including the RX1 for Intel Xeon Scalable Processor Family (Skylake), the GP1 for NVIDIA Tesla P100 and GP2 for NVIDIA Tesla V100. Additionally, CoolIT’s full coverage MX1, MX2 and MX3 memory cooling coldplates will be featured.

CoolIT will highlight customer installations including:

  • Canadian Hydrogen Intensity Mapping Experiment (CHIME), the world’s largest low-frequency radio telescope. Deployed inside a containerized environment, CoolIT’s liquid cooled system consists of 256 rack-mounted General Technics GT0180 custom 4U servers housed in 26 racks managed by Rack DCLC CHx40 Heat Exchange Modules. Featuring liquid-cooled Intel Xeon Processor E5-2620 v3 CPUs and dual AMD FirePro S9300x2 GPUs, the system significantly lowers operating temperatures and improves performance and power efficiency.
  • Poznan Supercomputing and Networking Center (PSNC). The PSNC “Eagle” cluster uses 1,232 liquid cooled Huawei CH121 servers to increase density and reduce energy consumption. PSNC was able to deploy this new cluster within their existing data center without having to invest in additional air cooling infrastructure. The heated liquid is also being reused for local heating needs.

In partnership with STULZ, CoolIT will host an SC17 Exhibitor Forum presentation on high-density Chip-to-Atmosphere data center cooling solutions on Thursday, November 16 at 11:00 am. CoolIT encourages all attendees to join the Chip-to-Atmosphere: Providing Safe and Effective Cooling for High-Density, High-Performance Data Center Environments presentation in room 503-504. During the session, David Meadows, Director of Industry, Standards and Technology at STULZ Air Technology Systems, Inc., and Geoff Lyon, CEO and CTO at CoolIT Systems, will discuss the efficiency gains and performance enhancements made possible by liquid cooling solutions.

“Liquid cooling in the data center continues to grow in adoption and delivers more compelling ROIs. Our collaboration with OEM partners such as Dell EMC, HPE, Intel and STULZ provides further evidence that the future of the data center is destined for liquid cooling,” said Geoff Lyon, CEO and CTO at CoolIT Systems.

To learn more about how CoolIT’s products and solutions maximize data center performance and efficiency, visit booth 1601 at SC17. Executives and technical staff will be on site to guide attendees through new product showcases and live demos. To set up an appointment, contact Lauren Macready at marketing@coolitsystems.com.

About CoolIT Systems

CoolIT Systems, Inc. is a world leader in energy efficient liquid cooling technology for the Data Center, Server and Desktop markets. CoolIT’s Rack DCLC platform is a modular, rack-based, advanced cooling solution that allows for dramatic increases in rack densities, component performance, and power efficiencies. The technology can be deployed with any server and in any rack making it a truly flexible solution. For more information about CoolIT Systems and its technology, visit www.coolitsystems.com.

About Supercomputing Conference (SC17) 

Established in 1988, the annual SC conference continues to grow steadily in size and impact each year. Approximately 5,000 people participate in the technical program, with about 11,000 people overall. SC has built a diverse community of participants including researchers, scientists, application developers, computing center staff and management, computing industry staff, agency program managers, journalists, and congressional staffers. This diversity is one of the conference’s main strengths, making it a yearly “must attend” forum for stakeholders throughout the technical computing community. For more information, visit http://sc17.supercomputing.org/.

Source: CoolIT Systems

The post CoolIT Systems Showcases Newest Datacenter Liquid Cooling Innovations for OEM and Enterprise Customers at SC17 appeared first on HPCwire.

Flipping the Flops and Reading the Top500 Tea Leaves

HPC Wire - Mon, 11/13/2017 - 07:58

The 50th edition of the Top500 list, the biannual publication of the world’s fastest supercomputers based on public Linpack benchmarking results, was released from SC17 in Denver, Colorado, this morning and once again China is in the spotlight, having taken what is on the surface at least a definitive lead in multiple dimensions. China now claims the most systems, biggest flops share and the number one machine for 10 consecutive lists. It’s a coup-level achievement to pull off in five years, disrupting 20 years of US dominance on the Top500, but reading deeper into the Top500 tea leaves reveals a more nuanced analysis that has as much to do with China’s benchmarking chops as it does its supercomputing flops.

PEZY-SC2 chip at ISC 2017

Before we thread that needle, let’s take a moment to review the movement at the top of the list. There are no new list entrants in the top ten and no change in the top three, but the upgraded ZettaScaler-2.2 “Gyoukou” stuck its landing for a fourth place ranking. Vaulting 65 spots, the supersized Gyoukou combines Xeons and PEZY-SC2 accelerators to achieve 19.14 petaflops, up from 1.68 petaflops on the previous list. The Top500 authors point out that the system’s 19,860,000 cores represent the highest level of concurrency ever recorded on the Top500 rankings.

Gyoukou also had the honor of being the fifth greenest supercomputer. Fellow ZettaScaler systems Shoubu system B, Suiren2 and Sakura, placed first, second and third respectively (see perf-per-watt numbers below). Nvidia’s DGX SaturnV Volta system, installed at Nvidia headquarters in San Jose, Calif., was the fourth greenest supercomputer.

Nov. 2017 Green500 top five and Nov. 2017 Top500 top 10

Another upgraded machine, Trinity, moved up three positions to seventh place thanks to a recent infusion of Intel Knights Landing Xeon Phi processors that raised its Linpack score from 8.10 petaflops to 14.14 petaflops. Trinity is a Cray XC40 supercomputer operated by Los Alamos National Laboratory and Sandia National Laboratories.

China still has a firm grip on the top of the list with 93-petaflops Sunway TaihuLight and 33.86-petaflops Tianhe-2, the number one and two systems respectively, which together provide the new list with 15 percent of its flops. Piz Daint, the Cray XC50 system installed at the Swiss National Supercomputing Centre (CSCS), remains the third fastest system with 19.6 petaflops. With Gyoukou in fourth position, the fastest US system, Titan, slips another notch to fifth place, leaving the United States without a claim to any of the top four rankings. Benchmarked at 17.59 petaflops, the five-year-old Cray XK7 system installed at the Department of Energy’s Oak Ridge National Laboratory captured the top spot for one list iteration before being knocked off its perch in June 2013 by China’s Tianhe-2. This is the first time in the list’s 24-year history that the US has not held at least a number four ranking.

Although China has enjoyed number one bragging rights for nearly four years, this is the first list that it also dominates by both system count and aggregate performance share. China has the most installed systems: 202, compared to 159 on the last list, while the US is in second place with 144, down from 169 six months ago (Japan ranks third with 35, followed by Germany with 20, France with 18, and the UK with 15). Aggregate performance is similar: China holds 35.3 percent of list flops and the US is second with 29.8 percent (then Japan with 10.8 percent, Germany with 4.5 percent, the UK with 3.8 percent and France with 3.6 percent).

Based on these metrics, undoubtedly some publications will proclaim China’s supercomputing supremacy, but that would be premature. When China expanded its Top500 toehold by a factor of three at SC15, Intersect360 Research CEO Addison Snell remarked that it wasn’t so much that China discovered supercomputing as it discovered the Top500 list. This observation continues to hold water.

An examination of the new systems China is adding to the list indicates concerted efforts by Chinese vendors Inspur, Lenovo, Sugon and more recently Huawei to benchmark loosely coupled Web/cloud systems, which are not true HPC machines. To wit, 68 out of the 96 systems that China introduced onto the latest list utilize 10G networking and none are deployed at research sites. The benchmarking of Internet and telecom systems for Top500 glory is not new. You can see similar fingerprints on the list (current and historical) from HPE and IBM, but China has doubled down. For comparison’s sake, the US put 19 new systems on the list and eight of those rely on 10G networking.
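
Breakdowns like these (68 of China’s 96 new entries on 10G Ethernet, eight of the 19 new US entries) are the kind of tally anyone can reproduce from the published list data. A minimal sketch using made-up toy records rather than actual top500.org entries:

```python
# Toy records standing in for Top500 list entries; the real list is
# published at top500.org with many more fields. Values are illustrative.
systems = [
    {"country": "China", "new": True,  "interconnect": "10G Ethernet"},
    {"country": "China", "new": True,  "interconnect": "InfiniBand"},
    {"country": "USA",   "new": True,  "interconnect": "10G Ethernet"},
    {"country": "USA",   "new": False, "interconnect": "InfiniBand"},
]

def tally_new_10g(records, country):
    """Return (new systems from country, how many of those use 10G Ethernet)."""
    new = [r for r in records if r["country"] == country and r["new"]]
    on_10g = [r for r in new if r["interconnect"] == "10G Ethernet"]
    return len(new), len(on_10g)

print(tally_new_10g(systems, "China"))  # (2, 1) for this toy data
```

Pointing the same tally at a real list export is a one-line change of input.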

Top500 development over time – countries by performance share. US is red; China is dark blue.

Not only has the Linpacking of non-HPC systems inflated China’s list presence, it’s changed the networking demographics as the number of Ethernet-based machines climbs steadily. As the Top500 authors note, Gigabit Ethernet now connects 228 systems with 204 systems using 10G interfaces. InfiniBand technology is now found on 163 systems, down from 178 systems six months ago, and is the second most-used internal system interconnect technology.

Snell provided additional perspective: “What we’re seeing is a concerted effort to list systems in China, particularly from China-based system vendors. The submission rules allow for what is essentially benchmarking by proxy. If Linpack is run and verified on one system, the result can be assumed for other systems of the same (or greater) configuration, so it’s possible to put together concerted efforts to list more systems, whether out of a desire to show apparent market share, or simply for national pride.”

Discussions of list purity and benchmarking by proxy aside, the High Performance Linpack or any one-dimensional metric has limited usefulness across today’s broad mix of HPC applications. This truth, well understood in HPC circles, is not always appreciated outside the community or among government stakeholders who want “something to show” for public investment.
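
For context on what that one-dimensional metric measures: Linpack times the solution of a dense n-by-n linear system and divides the conventional operation count, 2/3·n³ + 2n², by the wall time. A sketch with made-up run parameters (not figures from any actual Top500 submission):

```python
def hpl_flops(n):
    """Conventional HPL operation count for solving a dense n x n system."""
    return (2.0 / 3.0) * n**3 + 2.0 * n**2

def rmax_petaflops(n, seconds):
    """Achieved Linpack performance in petaflops for a run of given size and time."""
    return hpl_flops(n) / seconds / 1e15

# Illustrative only: a problem of order n = 10,000,000 solved in 8 hours
# corresponds to roughly 23 petaflops sustained.
print(round(rmax_petaflops(10_000_000, 8 * 3600), 2))
```

The formula also explains the benchmark's limits: it rewards sustained dense floating-point throughput and says nothing about the irregular memory access or I/O behavior of many real HPC applications.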

“Actual system effectiveness is getting more difficult to compare, as the industry swings back toward specialized hardware,” Snell commented. “Just because one architecture outperforms another on one benchmark doesn’t make it the best choice for all workloads. This is particularly challenging for mixed-workload research environments trying to serve multiple domains. 88 percent of all HPC users say they will need to support multiple architectures for the next few years, running applications on the most appropriate systems for their requirements.”

Chip technology (Source: Top500)

There has been stagnation on the list for several iterations and turnover is historically low. Neither Summit nor Sierra (the US CORAL machines, projected to achieve ~180 petaflops) nor the upgraded Tianhe-2A (projected 94.97 petaflops peak) made the cut for the 50th list as had been speculated. While HPC is seeing a time of increased architectural diversity at the system and processor level, the current list is less diverse by some measures. To wit, of the 136 new systems on the list, Intel is foundational to all of them (36 of these utilize accelerators*). So no new Power, no new AMD (it’s still early for EPYC) and nothing from Arm yet. In total 471 systems, or 94.2 percent, are now using Intel processors, up a notch from 92.8 percent six months ago. The share of IBM Power processors is at 14 systems, down from 21 systems in June. There are five AMD-based systems remaining on the list, down from seven one year ago.

Nvidia’s new SaturnV Volta system.

In the US, IBM Power9 systems Summit and Sierra are on track for 2018 installation at Oak Ridge and Livermore labs (respectively), and multiple other exascale-focused systems are in play in China, Europe and Japan, showcasing a new wave of architectural diversity. We expect there will be more exciting supercomputing trends to report on from ISC 2018 in Frankfurt.

*Breakdown of the 36 new accelerated systems: 29 have P100s (one with NVLink, an HPE SGI system at number 292 (Japan)), one internal Nvidia V100 Volta system (#149, SaturnV Volta); one K80-based system (#267, Lenovo); two Sugon-built P40 systems (#161, #300), and three PEZY systems (#260, #277, #308). Further, out of the 36, only the internal Nvidia machine is US-based. 30 are Chinese (by Lenovo, Inspur, Sugon); the remaining five are Japanese (by NTT, HPE, PEZY).

The post Flipping the Flops and Reading the Top500 Tea Leaves appeared first on HPCwire.

Ellexus Releases I/O Profiling Tool Suites Based on the Arm Architecture

HPC Wire - Mon, 11/13/2017 - 07:38

CAMBRIDGE, England, Nov. 13, 2017 — Ellexus, the I/O profiling company, has released versions of its flagship products Breeze, Healthcheck and Mistral, all based on the Armv8-A architecture. The move comes as part of the company’s strategy to provide cross-platform support that gives engineers a uniform tooling experience across different hardware platforms.

Accompanying the release, Ellexus is also announcing that its tools will be integrated with Arm Forge and Arm Performance Reports, market-leading tools for debugging, profiling and optimizing high performance applications, previously known as Allinea.

The integration takes advantage of a custom metrics API in the Arm tools, allowing third parties to plug into them and enable contextual analysis of more targeted performance metrics. The integration with Arm tools will provide an even more comprehensive suite of I/O profiling tools at a time when optimization has never been so important.

Unlike other profiling tools, Ellexus’ technology can be run continuously at scale. The reports generated give enough information to make every engineer an I/O expert. These tools will help organizations to deploy an I/O profiling solution as part of software qualification, as a live monitoring tool, or as a way to understand and prevent I/O problems from returning.

Ellexus Mistral is designed to run in real time on a cluster, identifying rogue jobs before they can cause a problem. In contrast, Ellexus Breeze provides an extremely detailed profile of a job or application, providing dependency analysis that makes cloud migration or migration to a different architecture easy. Ellexus’ latest tool, Healthcheck, produces a simple I/O report that tells the user what their application is doing wrong and why, giving all users the power to optimise I/O for the cluster.

Ellexus Mistral, Breeze and Healthcheck add a comprehensive layer of I/O profiling information to what is already on offer from the Arm tool suite, and can drill down to which files have been accessed. They provide additional monitoring for IT managers and dev ops engineers, in particular those who run continuous integration and testing frameworks.

Tim Whitfield, vice president and general manager, Technology Services Group, Arm, said: “Arm is always looking for ways to further optimize our high-performance application estate, and as we continue to scale up and out this has never been more important. Arm and Ellexus are continuing a deep collaboration in this space to provide a comprehensive tools suite for HPC.”

On the decision to release versions based on Arm, Dr Rosemary Francis, CEO of Ellexus, said, “As the high-performance computing industry targets new compute architectures and cloud infrastructures, it’s never been more important to optimise the way programs access large data sets. Bad I/O patterns can harm shared storage and will limit application performance, wasting millions in lost engineering time.

“We are extremely excited to announce the integration of our tools with the Arm tool suite. Together we will be able to help more organisations to get the most out of their compute clusters.”

About Ellexus

Ellexus is an I/O profiling company. From a detailed analysis of one application or workflow pipeline to whole-cluster, lightweight monitoring and reporting, it provides solutions that solve all I/O profiling needs.

Source: Ellexus

The post Ellexus Releases I/O Profiling Tool Suites Based on the Arm Architecture appeared first on HPCwire.

Tensors Come of Age: Why the AI Revolution Will Help HPC

HPC Wire - Mon, 11/13/2017 - 07:00

A Quick Retrospect

Thirty years ago, parallel computing was coming of age. A bitter battle began between stalwart vector computing supporters and advocates of various approaches to parallel computing. IBM skeptic Alan Karp, reacting to announcements of nCUBE’s 1024-microprocessor system and Thinking Machines’ 65,536-element array, made a public $100 wager that no one could get a parallel speedup of over 200 on real HPC workloads. Gordon Bell softened that to an annual award for the best speedup, what we now know as the Gordon Bell Prize.

John Gustafson

This year also marks the 30th Supercomputing Conference. At the first SC in 1988, Seymour Cray gave the keynote, and said he might consider combining up to 16 processors. Just weeks before that event, Sandia researchers had managed to get thousand-fold speedups on the 1024-processor nCUBE for several DOE workloads, but those results were awaiting publication.

The magazine Supercomputing Review was following the battle with interest, publishing a piece by a defender of the old way of doing things, Jack Worlton, titled “The Parallel Processing Bandwagon.” It declared parallelism a nutty idea that would never be the right way to build a supercomputer. Amdahl’s law and all that. A rebuttal by Gustafson titled “The Vector Gravy Train” was to appear in the next issue… but there was no next issue of Supercomputing Review. SR had made the bold step of turning into the first online magazine, back in 1987, with a new name.

Lenore Mullin

Happy 30th Anniversary, HPCwire!

What better occasion than to write about another technology that is coming of age, one we will look back on as a watershed? That technology is tensor computing: Optimized multidimensional array processing using novel arithmetic[1].

Thank you, AI

You can hardly throw a tchotchke on the trade show floor of SC17 without hitting a vendor talking about artificial intelligence (AI), deep learning, and neural nets. Google recently open-sourced its TensorFlow AI library and Tensor Processing Unit. Intel bought Nervana. Micron, AMD, ARM, Nvidia, and a raft of startups are suddenly pursuing an AI strategy. Two key ideas keep appearing:

  • An architecture optimized for tensors
  • Departure from 32-bit and 64-bit IEEE 754 floating-point arithmetic

What’s going on? And is this relevant to HPC, or is it unrelated? Why are we seeing convergent evolution to the use of tensor processors, optimized tensor algebras in languages, and nontraditional arithmetic formats?

What’s going on is that computing is bandwidth-bound, so we need to make much better use of the bits we slosh around a system. Tensor architectures place data closer to where it is needed. New arithmetic represents the needed numerical values using fewer bits. This AI-driven revolution will have a huge benefit for HPC workloads. Even if Moore’s law stopped dead in its tracks, these approaches increase computing speed and cut space and energy consumption.

Tensor languages have actually been around for years. Remember APL and Fortran 90, all you old-timers? However, now we are within reach of techniques that can automatically optimize arbitrary tensor operations on tensor architectures, using an augmented compilation environment that minimizes clunky indexing and unnecessary scratch storage[2]. That’s crucial for portability.

Portability suffers, temporarily, as we break free from standard numerical formats. You can turn float precision down to 16-bit, but then the shortcomings of IEEE format really become apparent, like wasting over 2,000 possible bit patterns on “Not a Number” instead of using them for numerical values. AI is providing the impetus to ask what comes after floats, which are awfully long in the tooth and have never followed algebraic laws. HPC people will someday be grateful that AI researchers helped fix this long-standing problem.

The Most Over-Discovered Trick in HPC

As early as the 1950s, according to the late numerical analyst Herb Keller, programmers discovered they could make linear algebra go faster by blocking the data to fit the architecture. Matrix-matrix operations in particular run best when the matrices are tiled into submatrices, and even sub-submatrices. That was the beginning of dimension lifting, an approach that seems to get re-discovered by every generation of HPC programmers. It’s time for a “grand unification” of the technique.
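
To make the blocking idea concrete, here is a minimal sketch (ours, not from the article) of a matrix-matrix multiply tiled into submatrices. The function name and the tile size are illustrative; a real implementation would choose the tile size to fit the cache hierarchy of the target machine.

```python
# Illustrative sketch of blocking ("dimension lifting"): multiply square
# matrices by iterating over tile-sized submatrices so that each innermost
# loop works on a small, cache-friendly working set.

def matmul_tiled(A, B, tile=2):
    """Multiply square matrices A and B (lists of lists) block by block."""
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, tile):          # blocks of rows of A
        for jj in range(0, n, tile):      # blocks of columns of B
            for kk in range(0, n, tile):  # blocks of the shared dimension
                # Multiply one pair of tiles and accumulate into C.
                for i in range(ii, min(ii + tile, n)):
                    for j in range(jj, min(jj + tile, n)):
                        s = C[i][j]
                        for k in range(kk, min(kk + tile, n)):
                            s += A[i][k] * B[k][j]
                        C[i][j] = s
    return C
```

Because the k-summation still proceeds in the same order as the naive triple loop, the tiled version produces identical results; only the memory access pattern changes.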

Level N BLAS

The BLAS developers started in the 1970s with loops over lists (level 1), then realized doubly nested loops were needed (level 2), then triply nested loops (level 3); LAPACK and ScaLAPACK later introduced blocking to better fit computer architectures. In other words, we’ve been computing with tensors for a long time, but not admitting it! Kudos to Google for naming their TPU the way they did. What we need now is “level N BLAS.”

Consider this abstract way of thinking about a dot product of four-element vectors:

Notice the vector components are not numbered; think of them as a set, not a list, because that allows us to rearrange them to fit any memory architecture. The components are used once in this case, multiplied, and summed to some level (in this case, all the way down to a single number). Multiplications can be completely parallel if the hardware allows, and summation can be as parallel as binary sum reduction allows.
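
As a sketch of that idea (ours; names are illustrative), the multiplications can be done as one independent batch and the sum as a pairwise binary tree reduction, which has logarithmic depth when run in parallel:

```python
# Dot product with parallelizable multiplies and a binary-tree sum,
# rather than a left-to-right accumulation loop.

def tree_sum(values):
    """Sum by pairwise (binary tree) reduction: log-depth if parallelized."""
    values = list(values)
    while len(values) > 1:
        paired = [values[i] + values[i + 1] for i in range(0, len(values) - 1, 2)]
        if len(values) % 2:          # odd element carries over to the next level
            paired.append(values[-1])
        values = paired
    return values[0]

def dot(a, b):
    products = [x * y for x, y in zip(a, b)]  # all multiplies are independent
    return tree_sum(products)
```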

Now consider the same inputs, but used for 2-by-2 matrix-matrix multiplication:

Each input is used twice, either by a broadcast method or re-use, depending on what the hardware supports. The summation is only one level deep this time.

Finally, use the sets for an outer product, where each input is used four times to create 16 parallel multiplications, which are not summed at all.
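
A small illustration of our own may help here: the same eight inputs reused three ways, differing only in how many times each input is used and how deeply the products are summed. The variable names are ours, not notation from the article.

```python
# The same eight inputs a1..a4, b1..b4 feed three different operations.

a = [1.0, 2.0, 3.0, 4.0]
b = [5.0, 6.0, 7.0, 8.0]

# Dot product: each input used once; products summed all the way down.
dot = sum(x * y for x, y in zip(a, b))

# 2-by-2 matrix multiply: view a and b as 2x2 matrices; each input is used
# twice, and each sum is only one level deep (two products per entry).
A = [a[0:2], a[2:4]]
B = [b[0:2], b[2:4]]
C = [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
     for i in range(2)]

# Outer product: each input used four times, 16 products, no summation.
outer = [[x * y for y in b] for x in a]
```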

All these operations can be captured in a single unified framework, and that is what we mean by “Level N BLAS.” The sets of numbers are best organized as tensors that fit the target architecture and its cost functions. A matrix really isn’t two-dimensional in concept; that’s just for human convenience, and semantics treat it that way. An algebra exists for index manipulation that can be part of the compiler smarts, freeing the programmer from having to worry about details like “Is this row-major or column-major order[4]?” Tensors free you from imposing linear ordering that isn’t required by the algorithm and that impedes optimal data placement.

Besides linear algebra, tensors are what you need for Fast Fourier Transforms (FFTs), convolutions for signal and image processing, and yes, neural networks. Knowledge representation models like PARAFAC or CANDECOMP use tensors. Most people aren’t taught tensors in college math, and tensors admittedly look pretty scary with all those subscripts. One of Einstein’s best inventions was a shorthand notation that gets rid of a lot of the subscripts (because General Relativity requires tensor math), but it still takes a lot of practice to get a “feel” for how tensors work. The good news is, computer users don’t have to learn that skill, and only a few computer programmers have to. There now exists a theory[4], and many prototypes[5], for handling tensors automatically. We just need a few programmers to make use of the existing theory of array indexing to build and maintain those tools for distribution to all[6]. Imagine being able to automatically generate a Fast Fourier Transform (FFT) without having to worry about the indexing! That’s already been prototyped[7].

Which leads us to another HPC trend that we need for architecture portability…

The Rise of the Installer Program

In the old days, code development meant edit, compile, link, and load. Nowadays, people never talk about “linkers” and “loaders.” But we certainly talk about precompilers, makefiles and installer programs. We’ve also seen the rise of just-in-time compilation in languages like Java, with system-specific byte codes to get both portability and sometimes, surprisingly high performance. The nature of who-does-what has changed quite a bit over the last few decades. Now, for example, HPC software vendors cannot ship a binary for a cluster supercomputer because they cannot know which MPI library is in use; the installer links that in.

The compiler, or preprocessor, doesn’t have to guess what the target architecture is; it can instead specify what needs to be done, but not how, stopping at an intermediate language level. The installer knows what the costs are of all the data motions in the example diagrams above, and can predict precisely what the cost of a particular memory layout is. What you can predict, you can optimize. The installer takes care of the how.

James Demmel has often described the terrible challenge of building a ScaLAPACK-like library that gets high performance for all possible situations. Call it “The Demmel Dilemma.” It appears we are about to resolve that dilemma. With tensor-friendly architectures, and proper division of labor between the human programmer and the preprocessor, compiler, and installer, we can look forward to a day when we don’t need 50 pages of compiler flag documentation, or endless trial-and-error experimentation with ways to lay out arrays in storage that is hierarchical, parallel, and complicated. Automation is feasible, and essential.

The Return of the Exact Dot Product

There is one thing we’ve left out though, and it is one of the most exciting developments that will enable all this to work. You’ve probably never heard of it. It’s the exact dot product approach invented by Ulrich Kulisch, back in the late 1960s, but made eminently practical by some folks at Berkeley just this year[8].

With floats, because of rounding errors, you will typically get a different result when you change the way a sum is grouped. Floats disobey the associative law: (a + b) + c, rounded, is not the same as a + (b + c). That’s particularly hazardous when accumulating a lot of small quantities into a single sum, like when doing Monte Carlo methods, or a dot product. Just think of how often a scientific code needs to do the sum of products, even if it doesn’t do linear algebra. Graphics codes are full of three-dimensional and two-dimensional dot products. Suppose you could calculate sums of products exactly, rounding only when converting back to the working real number format?
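
The failure of the associative law is easy to see in any language with IEEE doubles; `math.fsum` shows the correctly rounded answer an exact accumulator would give:

```python
# Floats disobey the associative law: regrouping a sum changes the result.
import math

a, b, c = 0.1, 0.2, 0.3
left = (a + b) + c       # 0.6000000000000001
right = a + (b + c)      # 0.6
print(left == right)     # False: grouping changed the rounded result

# math.fsum computes the sum as if exactly, rounding only once at the end.
print(math.fsum([a, b, c]))  # 0.6
```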

You might think that would take a huge, arbitrary precision library. It doesn’t. Kulisch noticed that for floating-point numbers, a fixed-size register with a few hundred bits suffices as scratch space for perfect accuracy results even for vectors that are billions of floats long. You might think it would run too slowly, because of the usual speed-accuracy tradeoff. Surprise: It runs 3–6 times faster than a dot product with rounding after every multiply-add. Berkeley hardware engineers discovered this and published their result just this summer. In fact, the exact dot product is an excellent way to get over 90 percent of the peak multiply-add speed of a system, because the operations pipeline.
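
The effect of Kulisch's idea can be sketched in a few lines (ours, and greatly simplified): keep the accumulator exact and round only once at the end. Here Python's exact `Fraction` stands in for the wide fixed-point register; real hardware uses a few hundred bits of carry-save fixed point instead.

```python
# Simplified model of an exact dot product: no rounding until the final
# conversion back to float.
from fractions import Fraction

def exact_dot(a, b):
    acc = Fraction(0)
    for x, y in zip(a, b):
        # Fraction(float) is exact, so every product and the running sum
        # are computed with no rounding at all.
        acc += Fraction(x) * Fraction(y)
    return float(acc)  # round exactly once, at the end

def naive_dot(a, b):
    acc = 0.0
    for x, y in zip(a, b):
        acc += x * y   # rounds after every multiply-add
    return acc
```

For example, `exact_dot([1e16, 1.0, -1e16], [1.0, 1.0, 1.0])` returns 1.0, while the naive loop loses the middle term to rounding and returns 0.0.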

Unfortunately, the exact dot product idea has been repeatedly and firmly rejected by the IEEE 754 committee that defines how floats work. Fortunately, it is an absolute requirement in posit arithmetic[9] and can greatly reduce the need for double precision quantities in HPC programs. Imagine doing a structural analysis program with 32-bit variables throughout, yet getting 7 correct decimals of accuracy in the result, guaranteed. That’s effectively like doubling bandwidth and storage compared to the 64-bits-everywhere approach typically used for structural analysis.

A Scary-Looking Math Example

If you don’t like formulas, just skip this. Suppose you’re using a conjugate gradient solver, and you want to evaluate its kernel as fast as possible:

A theory exists to mechanically transform these formulas to a “normal form” that looks like this:

That, plus hardware-specific information, allows automatic data layout that minimizes indexing and temporary storage, and maximizes locality of access for any architecture. And with novel arithmetic like posits that supports the exact dot product, you get a bitwise identical result no matter how the task is organized to run in parallel, and at near-peak speed. Programmers won’t have to wrestle with data placement, nor will they have to waste hours trying to figure out if the parallel answer is different because of a bug or because of rounding errors.

What People Will Remember, 30 Years from Now

By 2047, people may look back on the era of IEEE floating-point arithmetic the way we now regard the EBCDIC character set used on IBM mainframes (which many readers may never have heard of, but it predates ASCII). They’ll wonder how people ever tolerated the lack of repeatability and portability and the rounding errors that were indistinguishable from programming bugs, and they may reminisce about how people wasted 15-decimal accuracy on every variable as insurance, when they only needed four decimals in the result. Not unlike the way some of us old-timers remember “vectorizing” code in 1987 to get it to run faster, or “unrolling” loops to help out the compiler.

Thirty years from now, the burden of code tuning and portability for arrays will be back where it belongs: on the computer itself. Programmers will have long forgotten how to tile matrices into submatrices because the compiler-installer combination will do that for tensors for any architecture, and will produce bitwise-identical results on all systems.
The big changes that are permitting this watershed are all happening now. This year. These are exciting times! □

[1] A. Acar et al., “Tensor Computing for Internet of Things,” Dagstuhl Reports, Vol. 6, No. 4, 2016, Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, doi:10.4230/DagRep.6.4.57, http://drops.dagstuhl.de/opus/volltexte/2016/6691 pp. 57–79.

[2] Rosencrantz et al., “On Minimizing Materializations of Array-Valued Temporaries,” ACM Trans. Program. Lang. Syst., Vol. 28, No. 6, 2006, http://doi.acm.org/10.1145/118663, pp. 1145–1177.

[3] L. Mullin and S. Thibault, “Reduction Semantics for Array Expressions: The Psi Compiler,” Technical Report, University of Missouri-Rolla Computer Science Dept., 1994.

[4] K. Berkling, Arrays and the Lambda Calculus, SU0CIS-90-22, CASE Center and School of CIS, Syracuse University, May 1990.

[5] S. Thibault et al., “Generating Indexing Functions of Regularly Sparse Arrays for Array Compilers,” Technical Report CSC-94-08, University of Missouri-Rolla, 1994.

[6] L. Mullin and J. Raynolds, Conformal Computing: Algebraically Connecting the Hardware/Software Boundary using a Uniform Approach to High-Performance Computation for Software and Hardware Applications, arXiv:0803.2386, 2008.

[7] H. Hunt et al., “A Transformation-Based Approach for the Design of Parallel/Distributed Scientific Software: The FFT,” CoRR, 2008, http://dblp.uni-trier.de/rec/bib/journals/corr/abs-0811-2535.

[8] http://arith24.arithsymposium.org/slides/s7-koenig.pdf.

[9] http://www.posithub.org.

About the Authors

John L. Gustafson
john.gustafson@nus.edu.sg

John L. Gustafson, Ph.D., is currently Visiting Scientist at A*STAR and Professor of Computer Science at National University of Singapore. He is a former Senior Fellow and Chief Product Architect at AMD, and a former Director at Intel Labs. His work showing practical speedups for distributed memory parallel computing in 1988 led to his receipt of the inaugural Gordon Bell Prize, and his formulation of the underlying principle of “weak scaling” is now referred to as Gustafson’s law. His 2015 book, “The End of Error: Unum Computing,” has been an Amazon best-seller in its category. He is a Golden Core member of IEEE. He is also an “SC Perennial” who has been to every Supercomputing conference since the first one in 1988. He is an honors graduate of Caltech and received his MS and PhD from Iowa State University.

Lenore Mullin
lenore@albany.edu

Lenore M. Mullin, Ph.D., is an Emeritus Professor, Computer Science, University at Albany, SUNY, a Research Software Consultant to REX Computing, Inc. and Senior Computational Mathematician at Etaphase, Inc. Dr. Mullin invented a new theory of n-dimensional tensors/arrays in her 1988 dissertation, A Mathematics of Arrays (MoA), which includes an indexing calculus, the Psi Calculus. This theory built on her tenure at IBM Research working with Turing Award winner Kenneth Iverson. She has built numerous software and hardware prototypes illustrating both the power and mechanization of MoA and the Psi Calculus. MoA was recognized by NSF with the 1992 Presidential Faculty Fellowship, entitled “Intermediate Languages for Enhanced Parallel Performance”, awarded to only 30 nationally. Her binary transpose was accepted and incorporated into Fortran 90. On sabbatical at MIT Lincoln Laboratory, she worked to improve the standard missile software through MoA design. As an IPA, she ran the Algorithms, Numerical and Symbolic Computation program in NSF’s CISE CCF. While on another leave she was Program Director in DOE’s ASCR Program. She lives in Arlington, Va.

The post Tensors Come of Age: Why the AI Revolution Will Help HPC appeared first on HPCwire.

CoolIT Systems Launches Rack DCLC AHx2 Heat Exchange Module

HPC Wire - Sat, 11/11/2017 - 21:26

CALGARY, AB, November 10, 2017 – CoolIT Systems (CoolIT), world leader in energy efficient liquid cooling solutions for HPC, Cloud, and Enterprise markets, has expanded its Rack DCLC product line with the release of the AHx2 Heat Exchange Module. This compact Liquid-to-Air heat exchanger makes it possible for Direct Contact Liquid Cooling (DCLC) enabled servers to be thermally tested during the factory burn-in process, without additional liquid cooling infrastructure. CoolIT will officially launch the AHx2 at the Supercomputing Conference 2017 (SC17) in Denver, Colorado.

The AHx2 is a vital addition to CoolIT’s broad range of liquid cooling products. It is a compact, easy to transport air heat exchanger designed to enable factory server burn-in when liquid is not present in the facility. As a Liquid-to-Air heat exchanger, the AHx2 dissipates heat from the coolant in the server loop to the ambient environment. AHx2 provides direct liquid cooling to four DCLC enabled servers, and provides 2kW of heat load management. The design and size allows the unit to safely sit on top or adjacent to a server chassis during manufacturing.

“The Rack DCLC AHx2 Module is the ideal way for OEMs and System Integrators to conduct thermal testing during the factory burn-in process,” said Patrick McGinn, VP of Product Marketing, CoolIT Systems. “Our customers will appreciate having access to such robust testing potential in such a compact design, without needing to invest in supplementary liquid cooling infrastructure.”

The AHx2 Heat Exchange Module is a product designed to meet a critical customer need, and as such, is an important part of CoolIT’s modular product array. SC17 attendees can learn more about the solution by visiting CoolIT at booth 1601 from November 13-16. To set up an appointment, contact Lauren Macready at lauren.macready@coolitsystems.com.

About CoolIT Systems

CoolIT Systems, Inc. is the world leader in energy efficient liquid cooling technology for the Data Center, Server and Desktop markets. CoolIT’s Rack DCLC platform is a modular, rack-based, advanced cooling solution that allows for dramatic increases in rack densities, component performance, and power efficiencies. The technology can be deployed with any server and in any rack making it a truly flexible solution. For more information about CoolIT Systems and its technology, visit www.coolitsystems.com.

About Supercomputing Conference (SC17)

Established in 1988, the annual SC conference continues to grow steadily in size and impact each year. Approximately 5,000 people participate in the technical program, with about 11,000 people overall. SC has built a diverse community of participants including researchers, scientists, application developers, computing center staff and management, computing industry staff, agency program managers, journalists, and congressional staffers. This diversity is one of the conference’s main strengths, making it a yearly “must attend” forum for stakeholders throughout the technical computing community. For more information, visit https://sc17.supercomputing.org/.

Source: CoolIT Systems, Inc.

The post CoolIT Systems Launches Rack DCLC AHx2 Heat Exchange Module appeared first on HPCwire.

IBM Announces Advances to IBM Quantum Systems & Ecosystem

HPC Wire - Sat, 11/11/2017 - 15:33

YORKTOWN HEIGHTS, N.Y., Nov. 11, 2017 — IBM announced two significant quantum processor upgrades for its IBM Q early-access commercial systems. These upgrades represent rapid advances in quantum hardware as IBM continues to drive progress across the entire quantum computing technology stack, with focus on systems, software, applications and enablement.

  • The first IBM Q systems available online to clients will have a 20 qubit processor, featuring improvements in superconducting qubit design, connectivity and packaging. Coherence times (the amount of time available to perform quantum computations) lead the field with an average value of 90 microseconds, and allow high-fidelity quantum operations.
  • IBM has also successfully built and measured an operational prototype 50 qubit processor with similar performance metrics. This new processor expands upon the 20 qubit architecture and will be made available in the next generation IBM Q systems.

Clients will have online access to the computing power of the first IBM Q systems by the end of 2017, with a series of planned upgrades during 2018. IBM is focused on making available advanced, scalable universal quantum computing systems to clients to explore practical applications. The latest hardware advances are a result of three generations of development since IBM first launched a working quantum computer online for anyone to freely access in May 2016. Within 18 months, IBM has brought online 5 and 16 qubit systems for public access through the IBM Q experience and developed the world’s most advanced public quantum computing ecosystem.

An IBM cryostat wired for a prototype 50 qubit system. (PRNewsfoto/IBM)

“We are, and always have been, focused on building technology with the potential to create value for our clients and the world,” said Dario Gil, vice president of AI and IBM Q, IBM Research. “The ability to reliably operate several working quantum systems and putting them online was not possible just a few years ago. Now, we can scale IBM processors up to 50 qubits due to tremendous feats of science and engineering. These latest advances show that we are quickly making quantum systems and tools available that could offer an advantage for tackling problems outside the realm of classical machines.”

Over the next year, IBM Q scientists will continue to work to improve its devices including the quality of qubits, circuit connectivity, and error rates of operations to increase the depth for running quantum algorithms. For example, within six months, the IBM team was able to extend the coherence times for the 20 qubit processor to be twice that of the publicly available 5 and 16 qubit systems on the IBM Q experience.

In addition to building working systems, IBM continues to grow its robust quantum computing ecosystem, including open-source software tools, applications for near-term systems, and educational and enablement materials for the quantum community. Through the IBM Q experience, over 60,000 users have run over 1.7M quantum experiments and generated over 35 third-party research publications. Users have registered from over 1500 universities, 300 high schools, and 300 private institutions worldwide, many of whom are accessing the IBM Q experience as part of their formal education. This form of open access and open research is critical for accelerated learning and implementation of quantum computing.

“I use the IBM Q experience and QISKit as an integral part of my classroom teaching on quantum computing, and I cannot emphasize enough how important it is. In prior years, the course was interesting theoretically, but felt like it described some far off future,” said Andrew Houck, professor of electrical engineering, Princeton University. “Thanks to this incredible resource that IBM offers, I have students run actual quantum algorithms on a real quantum computer as part of their assignments! This drives home the point that this is a real technology, not just a pipe dream. What once seemed like an impossible future is now something they can use from their dorm rooms. Now, our enrollments are skyrocketing, drawing excitement from top students from a very wide range of disciplines.”

To augment this ecosystem of quantum researchers and application development, IBM rolled out earlier this year its QISKit (www.qiskit.org) project, an open-source software developer kit to program and run quantum computers. IBM Q scientists have now expanded QISKit to enable users to create quantum computing programs and execute them on one of IBM’s real quantum processors or quantum simulators available online. Recent additions to QISKit also include new functionality and visualization tools for studying the state of the quantum system, integration of QISKit with the IBM Data Science Experience, a compiler that maps desired experiments onto the available hardware, and worked examples of quantum applications.

“Being able to work on IBM’s quantum hardware and have access through an open source platform like QISKit has been crucial in helping us to understand what algorithms–and real-world use cases–might be viable to run on near-term processors,” said Matt Johnson, CEO, QC Ware. “Simulators don’t currently capture the nuances of the actual quantum hardware platforms, and nothing is more convincing for a proof-of-concept than results obtained from an actual quantum processor.”

Quantum computing promises to be able to solve certain problems – such as chemical simulations and types of optimization – that will forever be beyond the practical reach of classical machines. In a recent Nature paper, the IBM Q team pioneered a new way to look at chemistry problems using quantum hardware that could one day transform the way new drugs and materials are discovered. A Jupyter notebook that can be used to repeat the experiments that led to this quantum chemistry breakthrough is available in the QISKit tutorials. Similar tutorials are also provided that detail implementation of optimization problems such as MaxCut and Traveling Salesman on IBM’s quantum hardware.

This ground-breaking work demonstrates it is possible to solve interesting problems using near term devices and that it will be possible to find a quantum advantage over classical computers. IBM has made significant strides tackling problems on small scale universal quantum computing systems. Improvements to error mitigation and to the quality of qubits are our focus for making quantum computing systems useful for practical applications in the near future. In addition, IBM has industrial partners exploring practical quantum applications through the IBM Research Frontiers Institute, a consortium that develops and shares a portfolio of ground-breaking computing technologies and evaluates their business implications. Founding members include Samsung, JSR, Honda, Hitachi Metals, Canon, and Nagase.

These quantum advances are being presented today at the IEEE Industry Summit on the Future Of Computing as part of IEEE Rebooting Computing Week.

IBM Q is an industry-first initiative to build commercially available universal quantum computing systems for business and science applications. For more information about IBM’s quantum computing efforts, please visit www.ibm.com/ibmq.

Source: IBM

The post IBM Announces Advances to IBM Quantum Systems & Ecosystem appeared first on HPCwire.

Early Cluster Comp Betting Odds Favor China, Taiwan, and Poland

HPC Wire - Sat, 11/11/2017 - 10:15

So far the early action in the betting pool favors Taiwan’s NTHU, China’s Tsinghua, and, surprisingly, Poland’s University of Warsaw. Other notables include Team Texas at 9 to 1, the German juggernaut FAU/TUC team at 12 to 1, and the University of Illinois at 13 to 1.

There are several teams that haven’t seen any action yet, including last year’s winner USTC, 2016 third-place finisher Team Peking, and up-and-comer Nanyang University.

I’m also not seeing any betting love for perennial favorite Team Chowder (Boston).

If you want to find out more about the teams before laying down your (virtual) money, you can see our exhaustive profiles of each team here. That should give you enough info to place your bets on the win line.

The betting window will be open until this coming Tuesday, so get in and get paid. Here’s a link to the betting pool.

The post Early Cluster Comp Betting Odds Favor China, Taiwan, and Poland appeared first on HPCwire.

Indiana University Showcases SC17 Activities

HPC Wire - Sat, 11/11/2017 - 09:41

DENVER, Colo., Nov. 11 — Computing and networking experts from Indiana University will gather in the Mile High City next week for SC17, the International Conference for High Performance Computing, Networking, Storage and Analysis taking place November 12-17 in Denver. SC17 is one of the world’s foremost tech events, annually attracting thousands of scientists, researchers, and IT experts from across the world.

IU’s Pervasive Technology Institute, Global Research Network Operations Center, and School of Informatics, Computing and Engineering (SICE) will team up to host a research-oriented booth (#601) in the exhibition portion of the conference, showcasing current research and educational initiatives.

With the theme “We put the ‘super’ in computing,” the IU booth will showcase staff and faculty members and projects that are pushing the boundaries of what’s possible in computing and networking. Although they may not sport capes, the IU team devotes its considerable abilities to harnessing the cloud, achieving maximum throughput, engineering intelligent systems, and thwarting real-life cybervillains.

“SC17 marks the 20th anniversary of IU’s first display at the Supercomputing Conference, a milestone that underscores our deep commitment to leveraging high performance computing and networking to benefit the IU community, the state of Indiana, and the world,” said Brad Wheeler, IU vice president for IT and chief information officer. “In that time span, our researchers, scientists, and technologists have not only put IU on the map in the world of HPC, but their talents and discoveries have made IU a true leader in this increasingly important realm.”

One highlight of IU’s participation in SC17 is Judy Qiu’s invited talk, “Harp-DAAL: A Next Generation Platform for High Performance Machine Learning on HPC-Cloud.” Qiu is an associate professor in the intelligent systems engineering department in SICE. She will discuss growth in HPC and machine learning for big data with cloud infrastructure, and introduce Harp-DAAL, a high performance machine learning framework.

“The Supercomputing Conference is always a fantastic opportunity to showcase the work that is being conducted at SICE and provides a spotlight for our wonderful faculty,” said Raj Acharya, dean of SICE. “The conference itself is so valuable because it brings together the greatest minds in supercomputing in an atmosphere of collaboration that is as inspiring as it is informative. We’re always thrilled to be a part of it.”

This year, the IU team continues its leadership role in organizing the conference. Matt Link, associate vice president and director of systems for IU Research Technologies, serves as a member of the SC Steering Committee. Scott Michael, manager of research analytics, is vice chair of the Students@SC committee, and Jenett Tillotson, senior system administrator for high performance systems, is a member of the Student Cluster Competition committee.

Additionally, IU network engineers will continue a decades-long tradition of helping to operate SCinet, one of the most powerful and advanced networks in the world. Created anew each year for the conference, SCinet is a high-capacity network built to support the applications and experiments that are the hallmark of the SC conference. Laura Pettit, SICE director of intelligent systems engineering research operations, is the SCinet volunteer services co-chair, and ISE doctoral students Lucas Brasilino and Jeremy Musser are also volunteering with SCinet.

This year, the IU booth will include a range of presentations and demonstrations:

  • Current Trends and Future Challenges in HPC by Jack Dongarra, University of Tennessee and Oak Ridge National Laboratory.
  • Special event: Jetstream and OpenStack by Dave Hancock and partners. OpenStack is the emerging standard for deploying cloud computing capabilities, and cloud-based infrastructure is increasingly able to handle HPC workloads. During this special event, members of the Jetstream team and the OpenStack Foundation Scientific Working Group will discuss how they use OpenStack to serve HPC customers.
  • Science Gateways with Apache Airavata by Marlon Pierce, Eroma Abeysinghe and Suresh Marru. Science gateways are user interfaces and user-supporting services that simplify access to advanced resources for novice users and provide new modes of usage for power users. Apache Airavata is open source cyberinfrastructure software for building science gateways. During this demonstration, the presenters provide an overview of recent developments.
  • Big Data Toolkit Spanning HPC, Grid, Edge and Cloud Computing by Geoffrey Fox. This demonstration looks at big data programming environments such as Hadoop, Spark, Flink, Heron, Pregel; HPC concepts such as MPI and asynchronous many-task runtimes; and cloud/grid/edge ideas such as event-driven computing, serverless computing, workflow and services.
  • Cybersecurity for Science by Von Welch. The Center for Applied Cybersecurity Research, affiliated with the Pervasive Technology Institute at Indiana University, specializes in cybersecurity for R&D. In this scope, the center works with science communities across the country, including leading the National Science Foundation’s Cybersecurity Center of Excellence. This talk will provide an overview of what cybersecurity means in the context of science and how it can enable productive, trusted scientific research.
  • Enabling High-Speed Networking for Researchers by Chris Robb. With data networking becoming increasingly complex and opaque, researchers are often unsure how to address poor performance between their endpoints. This talk will introduce the IRNC NOC Performance Engagement Team (PET) and show how it can help researchers determine the best approach to achieving their maximum bandwidth potential.
  • Scientific Workflow Integrity for Pegasus by Von Welch and partners. The Pegasus Workflow Management System is a popular system for orchestrating complex scientific workflows. In this talk, the PIs of the NSF-funded Scientific Workflow Integrity for Pegasus project will talk about scientific data integrity challenges and their work to add greater assurances to Pegasus for data integrity.
  • Macroscopes from the “Places & Spaces: Mapping Science” Exhibition by Katy Börner. See up to 100 large-format maps that showcase effective visualization techniques to communicate science to the general public. These interactive visualizations, called macroscopes, help people see patterns in data that are too large or complex to view unaided.
  • Proteus: A Configurable FPGA Cluster for High Performance Networking by Martin Swany. Proteus is a new HPC cluster and research testbed that will enable investigation of novel and advanced architectures in HPC. By using FPGAs to optimize the performance of common parallel operations, Proteus serves as a model for hardware-accelerated network “microservices.”
  • International Networks at IU by Jennifer Schopf. International Networks at IU is a multi-million dollar NSF-funded program that supports the use of international links between the United States, Europe, Asia and Africa. Demos will review our currently supported links, as well as the measurement and monitoring services deployed on the links.

About the IU School of Informatics, Computing, and Engineering
The School of Informatics, Computing, and Engineering’s rare combination of programs—including informatics, computer science, library science, information science and intelligent systems engineering—makes SICE one of the largest, broadest and most accomplished of its kind. The extensive programs are united by a focus on information and technology.

About the Pervasive Technology Institute
The Pervasive Technology Institute (PTI) at Indiana University is a world-class organization dedicated to the development and delivery of innovative information technology to advance research, education, industry and society. Since 2000, PTI has received more than $50 million from the National Science Foundation to advance the nation’s research cyberinfrastructure.

About the Global Research Network Operations Center
The Global Research Network Operations Center (GlobalNOC) supports advanced international, national, regional and local high-performance research and education networks. GlobalNOC plays a major role in transforming the face of digital science, research and education in Indiana, the United States, and the world by providing unparalleled network operations and engineering needed for reliable and cost-effective access to specialized facilities for research and education.

Source: Indiana University

The post Indiana University Showcases SC17 Activities appeared first on HPCwire.
