XSEDE News

This is the XSEDE user news RSS feed. You can view news in the XSEDE User Portal at https://www.xsede.org/news?p_p_id=usernews_WAR_usernewsportlet&p_p_lifecycle=0&_usernews_WAR_usernewsportlet_view=default.

SDSC Comet maintenance (06/28/2017) and start of GPU resource allocations (07/01/2017)

Thu, 06/22/2017 - 09:43

SDSC Comet will be under maintenance 9AM-5PM (PT), 06/28/2017. During this time we will continue the work of integrating the new P100 GPU nodes into Comet. We have a reservation in place to prevent jobs from running during the maintenance period. Jobs that do not fit in the time window before the maintenance will run after the maintenance is complete.

Also, as a reminder: starting July 1, 2017, the GPU nodes at SDSC will be allocated as a separate resource. If you are a current GPU node user and your group wants to continue using the GPU nodes on Comet at SDSC, the PI or allocation manager on your project will need to request a transfer of time from ‘SDSC Dell Cluster with Intel Haswell Processors (Comet)’ to the new resource ‘SDSC Comet GPU Nodes (Comet GPU)’ via the XSEDE portal at:

https://portal.xsede.org/group/xup/submit-request#/

When transferring time, please use the conversion factor: 1 Comet GPU SU == 14 Comet SUs.
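
For illustration only, the short sketch below applies that published conversion factor to a hypothetical remaining balance; the helper function and the example numbers are not part of any XSEDE tooling.

    COMET_SUS_PER_GPU_SU = 14  # published conversion: 1 Comet GPU SU == 14 Comet SUs

    def comet_sus_to_gpu_sus(comet_sus: float) -> float:
        """Comet GPU SUs obtained by transferring `comet_sus` Comet SUs."""
        return comet_sus / COMET_SUS_PER_GPU_SU

    # Hypothetical example: transferring a remaining balance of 70,000 Comet SUs.
    print(comet_sus_to_gpu_sus(70_000))  # -> 5000.0 Comet GPU SUs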

As part of the GPU resource transition we will put in a reservation to prevent GPU jobs from running on 06/30/2017. Any jobs that don’t fit in the window before the reservation will be run after the transition is complete.

Please email help@xsede.org if you have any questions.

Short XSEDE SSO Hub maintenance, 11:00 EDT today

Thu, 06/22/2017 - 08:24

There will be a brief maintenance outage of the XSEDE SSO Hub, starting at 11:00 AM EDT today. Service should return within a few minutes. Please email help@xsede.org if you have any questions.

TACC status 18 July 2017

Wed, 06/21/2017 - 10:06

Access to all TACC systems will be unavailable from 7:00 AM CDT on July 18 until 8:00 AM CDT on July 19 to allow for upgrades to the core network switch hardware and to perform system maintenance on the Stockyard global filesystem. Users will have intermittent access to all TACC services and systems until the core network switch upgrade is complete. During this maintenance window, the production systems Stampede2, Stampede, Lonestar5, Maverick, Wrangler and Hikari will also be down for system maintenance and unavailable to users until after the Stockyard global filesystem maintenance has been completed. Updates to user news will be sent once network services have been restored and when the production systems are restored to normal operations.

SDSC Comet maintenance 10AM-Noon (PT), Monday, 06/19/2017

Sat, 06/17/2017 - 07:10

Comet will have a short maintenance window from 10AM to noon (PT) on Monday, 06/19/2017. We have a reservation in place to prevent jobs from running during this period. Any jobs that do not fit in the time window before the maintenance will wait until the reservation is released. The maintenance is part of the work to add 36 new GPU nodes. Please email help@xsede.org if you have any questions.

Ranch Status 6/15

Thu, 06/15/2017 - 16:35

TACC’s Ranch system experienced a hardware failure around 4:05PM, and the administrators are working to resolve the problem. We will update this announcement when the system is back in production.

Thanks, TACC Team

XSEDE ALLOCATION REQUESTS Open Submission, Guidelines, Resource and Policy Changes

Thu, 06/15/2017 - 07:04

XSEDE is now accepting Research Allocation Requests for the allocation period October 1, 2017 to September 30, 2018. The submission period runs from June 15, 2017 through July 15, 2017. Please review the new XSEDE systems and important policy changes (see below) before you submit your allocation request through the XSEDE User Portal.
————————————————————
Important
A recent change to proposal submission: the allocations proposal submission system (XRAS) will now enforce the page limits for uploaded documents. These limits can be found at https://portal.xsede.org/group/xup/allocation-policies#63

NEW XSEDE Resources:
See the Resource Catalog for a list of XSEDE compute, visualization and storage resources, and more details on the new systems (https://portal.xsede.org/web/guest/resources/overview).

  • The Texas Advanced Computing Center (TACC) introduces its new resource, Stampede 2. Stampede 2 will enter full production in Fall 2017 as an 18-petaflop national resource that builds on the successes of the original Stampede system it replaces. The first phase of the Stampede 2 rollout, available now, features the second generation of processors based on Intel’s Many Integrated Core (MIC) architecture. These 4,200 Knights Landing (KNL) nodes represent a radical break with the first-generation Knights Corner (KNC) MIC coprocessor. Unlike the legacy KNC, a Stampede 2 KNL is not a coprocessor: each 68-core KNL is a stand-alone, self-booting processor that is the sole processor in its node. Phase 2, scheduled for deployment in October 2017, will add approximately 50% additional compute power to the system as a whole via the integration of 1,736 Intel Xeon (Skylake) nodes. When fully deployed, Stampede 2 will deliver twice the performance of the original Stampede system. Please note that Stampede 2 is allocated in service units (SUs); an SU is defined as one wall-clock node hour, not core hours (see the sketch below).
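
As an illustration of node-hour accounting only (the job shape below is hypothetical and the helper is not an official XSEDE calculator), the SUs charged for a Stampede 2 job depend solely on the number of nodes and the wall-clock time, not on the cores used per node:

    def stampede2_sus(nodes: int, wall_clock_hours: float, runs: int = 1) -> float:
        """SUs charged: nodes x wall-clock hours per run x number of runs."""
        return nodes * wall_clock_hours * runs

    # Hypothetical job shape: 50 runs, each on 32 KNL nodes for 12 wall-clock hours.
    print(stampede2_sus(nodes=32, wall_clock_hours=12, runs=50))  # -> 19200 SUs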

Continuing this submission period, both the Pittsburgh Supercomputing Center (PSC) and the San Diego Supercomputer Center (SDSC) will allocate their GPU compute resources separately from their standard compute nodes; please see the details below:

  • SDSC’s Comet GPU has 36 general-purpose GPU nodes, with 2 Tesla K80 GPU cards per node, each with 2 GK210 GPUs (144 GPUs in total). Each GPU node also features 2 Intel Haswell processors of the same design and performance as the standard compute nodes (described separately). The GPU nodes are integrated into the Comet resource and available through the SLURM scheduler for either dedicated or shared node jobs (i.e., a user can run on 1 or more GPUs per node and will be charged accordingly; a rough budgeting sketch follows this list). Like the Comet standard compute nodes, the GPU nodes feature a local SSD which can be specified as a scratch resource during job execution; in many cases using SSDs can alleviate I/O bottlenecks associated with the shared Lustre parallel file system. Comet’s GPUs are a specialized resource that performs well for certain classes of algorithms and applications. There is a large and growing base of community codes that have been optimized for GPUs, including those in molecular dynamics and machine learning. GPU-enabled applications on Comet include Amber, Gromacs, BEAST, OpenMM, TensorFlow, and NAMD.
  • PSC introduces Bridges GPU, a newly allocatable resource within Bridges that features 32 NVIDIA Tesla K80 GPUs and 64 NVIDIA Tesla P100 GPUs. Bridges GPU complements Bridges Regular, Bridges Large, and the Pylon storage system to accelerate deep learning and a wide variety of application workloads. It comprises 16 GPU nodes, each with 2 NVIDIA Tesla K80 GPU cards, 2 Intel Xeon CPUs (14 cores each), and 128 GB of RAM, and 32 GPU nodes, each with 2 NVIDIA Tesla P100 GPU cards, 2 Intel Xeon CPUs (16 cores each), and 128 GB of RAM. Bridges is a uniquely capable resource for empowering new research communities and bringing together HPC and Big Data. It integrates a flexible, user-focused, data-centric software environment with very large shared memory, a high-performance interconnect, and rich file systems to empower new research communities, bring desktop convenience to HPC, and drive complex workflows. Bridges supports new communities through extensive interactivity, gateways, persistent databases and web servers, high-productivity programming languages, and virtualization. The software environment is extremely robust, supporting enabling capabilities such as Python, R, and MATLAB on large-memory nodes, genome sequence assembly on nodes with up to 12 TB of RAM, machine learning and especially deep learning, Spark and Hadoop, complex workflows, and web architectures to support gateways.
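
As noted in the Comet GPU item above, shared node jobs are charged in proportion to the GPUs they use. The sketch below is a rough budgeting aid under the assumption that the charge scales with GPUs used times wall-clock hours; the announcement does not spell out the exact charging unit, so treat the numbers as illustrative only.

    GPUS_PER_COMET_GPU_NODE = 4  # 2 Tesla K80 cards x 2 GK210 GPUs each

    def estimated_gpu_hours(gpus_used: int, wall_clock_hours: float) -> float:
        """Estimated GPU-hours for one shared-node job (ASSUMPTION: gpus x hours)."""
        if not 1 <= gpus_used <= GPUS_PER_COMET_GPU_NODE:
            raise ValueError("Comet GPU nodes expose 1-4 GPUs per node")
        return gpus_used * wall_clock_hours

    # Hypothetical shared job: 2 of the 4 GPUs on one node for 10 wall-clock hours.
    print(estimated_gpu_hours(gpus_used=2, wall_clock_hours=10))  # -> 20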

Storage Allocations: Continuing this submission period, access to XSEDE storage resources, along with compute resources, will need to be requested and justified, both in the XRAS application and in the body of the proposal’s main document. The following XSEDE sites will be offering allocatable storage facilities:

    • SDSC (Data Oasis)
    • TACC (Ranch)
    • TACC (Wrangler storage)
    • PSC (Pylon)
    • IU-TACC (Jetstream)

Storage needs have always been part of allocation requests; however, XSEDE will now enforce storage awards in coordination with the storage sites. Please see https://www.xsede.org/storage.

Estimated available Service Units (SUs for compute resources, GB for storage resources) for the upcoming meeting:
Indiana University/TACC (Jetstream) 5,000,000
LSU (SuperMIC) 6,500,000
Open Science Grid (OSG) 2,000,000
PSC Bridges(Regular Memory) 44,000,000
PSC Bridges (Large Memory) 700,000
PSC Bridges (Bridges GPU) 500,000
PSC Persistent disk storage (Pylon) 2,000,000
SDSC Dell Cluster with Intel Haswell Processors (Comet) 80,000,000
SDSC Comet GPU Nodes (Comet GPU) 800,000
SDSC Medium-term disk storage (Data Oasis) 300,000
Stanford Cray CS-Storm GPU Supercomputer(XStream) 500,000
TACC HP/NVIDIA Interactive Visualization and Data Analytics System (Maverick) 4,000,000
TACC Dell/Intel Knights Landing System (Stampede2 – Phase 1) 10,000,000 node hours
TACC Data Analytics System (Wrangler) 180,000 node hours
TACC Long-term Storage (Wrangler Storage) 500,000
TACC Long-term tape Archival Storage (Ranch) 2,000,000

Allocation Request Procedures:

  • In the past, code performance and scaling was to be addressed in a section of every research request’s main document. This section has been overlooked by many PIs in recent quarterly research submission periods, which has led to severe reductions or even complete rejection of both new and renewal requests. Continuing with this quarterly submission period, it is mandatory to upload a scaling and code performance document detailing your code’s efficiency. Please see Section 7.2, Review Criteria, of the Allocations Policy document (https://portal.xsede.org/group/xup/allocation-policies).
  • It is also mandatory to disclose, in the main document, access to other cyberinfrastructure resources (e.g., NSF Blue Waters, DOE INCITE resources, local campus, …). Please see Section 7.3, Review Criteria, of the Allocations Policy document (https://portal.xsede.org/group/xup/allocation-policies). Failure to disclose access to these resources could lead to severe reductions or even complete rejection of both new and renewal requests. If there is no access to other cyberinfrastructure resources, this should be made clear as well.
  • The XRAC review panel has asked that PIs include the following: "The description of the computational methods must include explicit specification of the integration time step value, if relevant (e.g., Molecular Dynamics simulations). If these details are not provided, a 1 femtosecond (1 fs) time step will be assumed, and this information will be used accordingly to evaluate the proposed computations." A brief numerical illustration of this default appears after this list.
  • All funding used to support the Research Plan of an XRAC Research Request must be reported in the Supporting Grants form in the XRAS submission. Reviewers use this information to assess whether the PI has enough support to accomplish the Research Plan, analyze data, prepare publications, etc.
  • Publications that have resulted from the use of XSEDE resources should be entered into your XSEDE portal profile which you will be able to attach to your Research submission.
  • Also note that the scaling and code performance information is expected to be from the resource(s) being requested in the research request.
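
To illustrate the 1 fs time-step default mentioned above (the 100 ns target below is hypothetical and the helper is not part of XRAS), the assumed step size directly determines how many integration steps a given amount of simulated time requires:

    FS_PER_NS = 1_000_000  # femtoseconds per nanosecond

    def md_steps(simulated_ns: float, timestep_fs: float = 1.0) -> int:
        """Integration steps needed to cover `simulated_ns` at `timestep_fs`."""
        return int(simulated_ns * FS_PER_NS / timestep_fs)

    # Hypothetical target: 100 ns of simulated time.
    print(md_steps(100))                   # reviewers' default 1 fs -> 100,000,000 steps
    print(md_steps(100, timestep_fs=2.0))  # a stated 2 fs step      ->  50,000,000 steps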

Policy Changes: Allocations Policy document (https://portal.xsede.org/group/xup/allocation-policies)

  • Storage allocation requests for Archival Storage in conjunction with compute and visualization resources, and/or for Stand-Alone Storage, need to be requested explicitly both in your proposal (for research proposals) and in the resource section of XRAS.
  • Furthermore, the PI must describe the peer-reviewed science goal that the resource award will facilitate. These goals must match or be sub-goals of those described in the listed funding award for that year.
  • After the Panel Discussion of the XRAC meeting, the total Recommended Allocation is determined and compared to the total Available Allocation across all resources. Transfers of allocations may be made for projects that are more suitable for execution on other resources; transfers may also be made for projects that can take advantage of other resources, hence balancing the load. When the total Recommended considerably exceeds Available Allocations a reconciliation process adjusts all Recommended Allocations to remove oversubscription. This adjustment process reduces large allocations more than small ones and gives preference to NSF-funded projects or project portions. Under the direction of NSF, additional adjustments may be made to achieve a balanced portfolio of awards to diverse communities, geographic areas, and scientific domains.
  • Conflict of Interest (COI) policy will be strictly enforced for large proposals. For small requests, the PI/reviewer may participate in the respective meeting, but leave the room during the discussion of their proposal.
  • XRAC proposals request resources that represent a significant investment by the National Science Foundation. The XRAC review process therefore strives to be as rigorous as that for equivalent NSF proposals.
  • The actual availability of resources is not considered in the review. Only the merit of the proposal is. Necessary reductions due to insufficient resources will be made after the merit review, under NSF guidelines, as described in Section 6.4.1.
  • A maximum 10% advance is available on all research requests, as described in Section 3.5.4.

Examples of well-written proposals:
For more information about writing a successful research proposal as well as examples of successful research allocation requests please see: (https://portal.xsede.org/allocations/research#examples)

If you would like to discuss your plans for submitting a research request please send email to the XSEDE Help Desk at help@xsede.org. Your questions will be forwarded to the appropriate XSEDE Staff for their assistance.

Ken Hackworth
XSEDE Resource Allocations Coordinator
help@xsede.org

SDSC Comet: Scheduler and compiler license issues resolved

Wed, 06/14/2017 - 16:03

We are currently seeing problems with one of the Comet service nodes and that is impacting the scheduler and compiler license availability. We will update as soon as the issue is resolved. Please email help@xsede.org if you have any questions.

June ECSS Symposium

Tue, 06/13/2017 - 13:59

Please join us for the June ECSS Symposium.

Tuesday 6/20, 10am Pacific/1pm Eastern

Join from PC, Mac, Linux, iOS or Android: https://zoom.us/j/350667546
 
Or iPhone one-tap (US Toll):  +14086380968,350667546# or +16465588656,350667546#
 
Or Telephone:
    Dial: +1 408 638 0968 (US Toll) or +1 646 558 8656 (US Toll)
    Meeting ID: 350 667 546
    International numbers available: https://zoom.us/zoomconference?m=4t505OnHv36W845OhVQ8SlxRAP07A-_n

David Bock (NCSA) will be talking about his visualization work for PI Doron Kushnir, who is simulating white dwarf collisions. An adaptive mesh refinement grid captures the varying levels of detail, and a custom volume renderer is used to visualize density, temperature, and the resulting nickel production during the collision. Campus Champion Fellow Emily Dragowsky (Case Western) was involved in this work as well.

Alan Craig (Shodor) will be talking about his work with a variety of digital humanities projects. Alan has been instrumental in bringing many projects to XSEDE and will answer questions such as “Where do these projects come from?”, “What kinds of things are humanities scholars doing with XSEDE?”, “What are some hurdles that need to be overcome for successful projects?”, “How do the ECSS collaborations work?”, and “How do we know if a project is successful?” in the context of several example projects.

For further information, see http://www.xsede.org/ecss-symposium.

HPC@LSU SuperMIC Maintenance June 20

Mon, 06/12/2017 - 13:40

SuperMIC’s XSEDE login node will be down on Tuesday, June 13 for upgrades to XSEDE software.

Thank you,
HPC@LSU System Administration

Jetstream Atmosphere Upgrades - Tuesday, June 13, 2017 - 12p-8p

Mon, 06/12/2017 - 05:20

Jetstream’s Atmosphere interface will be offline on Tuesday, June 13, 2017 from 12pm Eastern through 8pm Eastern for upgrades/enhancements.

Existing/running instances will still be available via ssh. API users will not be affected by this upgrade.
Please contact help@xsede.org with any questions or concerns.

Recent updates from XCI

Fri, 06/09/2017 - 11:45

A new XSEDE InCommon Identity Provider (IdP), idp.xsede.org, allows XSEDE users to sign in to web sites that are part of the InCommon Federation (for example, GENI and ORCID) using their XSEDE accounts. This capability is especially useful for users who do not have an existing InCommon IdP provided through their home institution. When signing in to a service that supports InCommon IdPs, simply select XSEDE from the list of identity providers. Your web browser will be redirected to idp.xsede.org to complete the sign-in process. The XSEDE IdP will prompt for Duo authentication if you have enabled it via the XUP (see https://portal.xsede.org/mfa) and you have not logged in to idp.xsede.org recently. The XSEDE IdP implements optional single sign-on (SSO), meaning that if you have already authenticated at idp.xsede.org recently, you will not be prompted again for your password.

XSEDE has published two new sets of use cases describing how XSEDE enables an open and extensible “community infrastructure” by enabling infrastructure information sharing and discovery, and how XSEDE enables Service Providers (SPs) to integrate their resources with the rest of the XSEDE system through an explicit, transparent, and tracked integration process. These use cases are publicly available through the University of Illinois’ IDEALS system.

We want your software ideas! If you have ideas for new software and software-based services that would help you use cyberinfrastructure more effectively, or for how XSEDE could improve existing software and services, please send them to help@xsede.org. The mission of the XSEDE Community Infrastructure (XCI) team is to enable users to leverage national CI using advanced software and services. We look forward to your input.

HPC Systems Engineer

Thu, 06/08/2017 - 08:48

HPC Systems Engineer:

HPC Systems Engineer is a newly created position at Advanced Research Computing (ARC), a unit of Information Technology at Virginia Tech. The HPC Systems Engineer will work in concert with computational scientists at ARC and with the servers, storage, and networking team members to advance HPC infrastructure, help develop web and cloud services for VT researchers, and maximize the productive use of HPC and visualization systems, both on campus and at federally funded national centers and cloud-based commercial providers. The successful candidate will investigate emerging technologies in the operation of research computing, storage, and networking. They will help develop web-based portals and visualization tools, and implement private and hybrid cloud technologies for increased productivity and collaboration.

To apply/more information: https://listings.jobs.vt.edu/postings/76194

Senior Data and HPC Specialist

Mon, 06/05/2017 - 08:49

Title: Senior Data and HPC Specialist
Deadline to Apply: 2017-07-04
Deadline to Remove: 2017-07-05
Job Summary: The Senior Research Technology Specialist, with a focus on High Performance Computing (HPC) and data technology, is responsible for providing consulting, supporting faculty, students and staff utilizing research technology services including our HPC cluster, research software, research storage and research data management with a focus on large datasets. S/he will participate and contribute to the planning, development and management of research technology services at Tufts. Working closely with the Director of Research Technology, this position will assist in developing the vision and support for existing HPC solutions as well as evolving paradigms in big data.
Job URL: http://tufts.taleo.net/careersection/ext/jobdetail.ftl?job=17001284&lang=en#.WSbrZ5McOtY.link
Job Location: Medford, MA
Institution: Tufts University
Requisition Number:
Posting Date: 2017-06-04
Job Posting Type: Job
Please visit http://hpcuniversity.org/careers/ to view this job on HPCU.
Please contact jobs@hpcuniversity.org with questions.

Systems Architect

Mon, 06/05/2017 - 08:48

Title: Systems Architect
Deadline to Apply: 2017-06-18
Deadline to Remove: 2017-06-19
Job Summary:
The Information Technology Services organization (ITS), a campus-wide provider of technology services for academic, research, and service missions, is seeking a Research Data Storage Administrator (PIF2), who will function under the direction of the Associate Director of Research Services. The successful candidate will have a strong customer focus and will represent Research Services in partnerships with campus academic and IT departments. This will require understanding and cooperation with other campus staff to identify and implement the best shared or centrally hosted solutions.
Job URL: https://jobs.uiowa.edu/pands/view/71166
Job Location: Iowa City, IA
Institution: The University of Iowa
Requisition Number: 71166
Posting Date: 2017-06-04
Job Posting Type: Job
Please visit http://hpcuniversity.org/careers/ to view this job on HPCU.
Please contact jobs@hpcuniversity.org with questions.

Senior Scientific Applications Engineer

Mon, 06/05/2017 - 08:46

Title: Senior Scientific Applications Engineer
Deadline to Apply: 2017-06-18
Deadline to Remove: 2017-06-19
Job Summary: The Ohio Supercomputer Center (OSC) seeks a Senior Scientific Applications Engineer to support the overall mission of the group and, in particular, to contribute to the overall mission of the Center, with a focus on application support for OSC’s industrial engagement and research programs. Role and responsibilities: the individual in this position will optimize relevant open-source modeling and simulation codes for high-performance execution on OSC and other HPC systems; provide consulting on code optimization, parallel programming, or accelerator programming to OSC clients; perform application software builds; investigate software development tools; make improvements to OSC’s software deployment infrastructure; and create user-facing documentation.
Job URL: https://www.jobsatosu.com/postings/78836
Job Location: Columbus, OH
Institution: The Ohio State University
Requisition Number: 428699
Posting Date: 2017-06-04
Job Posting Type: Job
Please visit http://hpcuniversity.org/careers/ to view this job on HPCU.
Please contact jobs@hpcuniversity.org with questions.

Computational Scientist/Researcher

Mon, 06/05/2017 - 08:44

Title: Computational Scientist/Researcher
Deadline to Apply: 2017-06-16
Deadline to Remove: 2017-06-17
Job Summary: Applications are invited from suitably qualified candidates for a full-time, fixed-term position as a computational scientist with the Irish Centre for High End Computing (ICHEC) at the National University of Ireland, Galway. This position is funded as part of ICHEC’s membership of the Horizon 2020 (H2020) READEX project and can be based in either ICHEC’s Dublin or Galway offices. ICHEC is the national High-Performance Computing (HPC) centre in Ireland and is a centre of domain, systems, and software expertise that provides high-performance software solutions to academia, industry, and state bodies through partnership, knowledge transfer, project delivery, and service provision. ICHEC operates the national HPC service, providing compute resources and software expertise for research communities across all the main science disciplines through collaborative partnerships and programmes of education and outreach.
Job URL: https://www.ichec.ie/job_specs/NUIG%20092-17_Comp_Scientist_READEX_2017.pdf
Job Location: Dublin, Ireland or Galway, Ireland
Institution: Irish Centre for High End Computing
Requisition Number:
Posting Date: 2017-06-04
Job Posting Type: Job
Please visit http://hpcuniversity.org/careers/ to view this job on HPCU.
Please contact jobs@hpcuniversity.org with questions.

Analyst

Mon, 06/05/2017 - 08:43

Title: Analyst
Deadline to Apply: 2017-07-04
Deadline to Remove: 2017-07-05
Job Summary: The Analyst will provide end-user support services to faculty, staff, and students on the use of Research Computing hardware and software as well as scientific applications. Responsibilities of this position include creating and maintaining training materials for Research Computing-related software and hardware, conducting training, installing, upgrading, and maintaining scientific software, and communicating information pertaining to scientific software upgrades and installations.
Job URL: https://lehigh.hiretouch.com/position-details?jobID=40727
Job Location: Bethlehem, PA
Institution: Lehigh University
Requisition Number: 85750
Posting Date: 2017-06-04
Job Posting Type: Job
Please visit http://hpcuniversity.org/careers/ to view this job on HPCU.
Please contact jobs@hpcuniversity.org with questions.

HPC System Administrator

Mon, 06/05/2017 - 08:41

Title: HPC System Administrator
Deadline to Apply: 2017-08-04
Deadline to Remove: 2017-08-05
Job Summary: The University of Chicago is seeking a highly qualified HPC system administrator to join its system and operation team that builds and manages RCC HPC systems and facility operations. The individual in this position will be involved in the procurement and management of HPC hardware and software. Responsibilities include but are not limited to installing, configuring, and maintaining large computer clusters/servers and software, day-to-day operations of the systems including systems administration, monitoring and storage performance up to and including network components, management of the system’s network switch, parallel file system and HPC software stack and tools, and configuration of the scheduling and queuing system.
Job URL: https://jobopportunities.uchicago.edu/applicants/jsp/shared/position/JobDetails_css.jsp?postingId=671278
Job Location: Chicago, IL
Institution: University of Chicago
Requisition Number: 102411
Posting Date: 2017-06-04
Job Posting Type: Job
Please visit http://hpcuniversity.org/careers/ to view this job on HPCU.
Please contact jobs@hpcuniversity.org with questions.

Stampede status 2 June 2017

Fri, 06/02/2017 - 16:02

The $WORK filesystem has been intermittently unavailable on Stampede since 3:20 PM CDT today. TACC staff are working to resolve this issue as quickly as possible.
