Conference Workshops & Tutorials Schedule
PEARC21’s theme is “Evolution Across All Dimensions” and we look forward to this year’s virtual meeting, where we can share scientific discovery and craft the future infrastructure.
The workshops and tutorials below will be featured at PEARC21’s virtual conference.
All times listed below are in Pacific Time (PT).
|7/19/21||8:00 am – 11:00 am||WORKSHOP: ACM SIGHPC SYSPROS Symposium 2021||John Blaas and Matthew Cawood|
In order to meet the demands of high-performance computing (HPC) researchers, large-scale computational and storage machines require many staff members who design, install, and maintain these systems. These HPC systems professionals include system engineers, system administrators, network administrators, storage administrators, and operations staff who face problems unique to high-performance computing systems. While many conferences exist for the HPC field and the system administration field, none focus on the needs of HPC systems professionals, and support resources for the issues encountered in this specialized field can be difficult to find. Systems staff often turn to the community as a support resource, so opportunities to strengthen and grow those relationships are highly beneficial.
This workshop is designed to share solutions to common problems, provide a platform to discuss upcoming technologies, and present state-of-the-practice techniques so that HPC centers get a better return on their investment, improve the performance and reliability of their systems, and help their researchers be more productive. Additionally, this workshop is affiliated with the systems professionals’ chapter of ACM SIGHPC (SIGHPC SYSPROS Virtual ACM Chapter). The session will serve as an opportunity for chapter members to meet face-to-face, discuss the chapter’s yearly workshop held at SC, and continue building our community’s shared knowledge base.
|7/19/21||8:00 am – 11:00 am||WORKSHOP: Building a Strategic Plan for your Research Computing and Data Program||Patrick Schmitz, Claire Mizumoto, Dana Brunson, Doug Jennewein and Scotty Strachan||This workshop will gather Research Computing and Data (RCD) professionals to learn leading practices for developing effective strategic plans for their Research Computing and Data programs. The workshop is open to RCD professionals who are familiar with issues around supporting Research Computing and Data, have experience contributing to strategic planning, and have some exposure to the RCD Capabilities Model. Attendees will hear the experiences of universities (Arizona State University, University of Nevada, Reno, University of Hawaii, plus one more) who are currently using the RCD Capabilities Model as part of their strategic planning work, including lessons learned. Attendees will discuss the range of RCD strategic planning models across the community, and approaches to building a strong strategic planning practice. Finally, participants will define requirements for a new effort to develop a shared community resource to support strategic planning for RCD, identifying potential elements of such a resource and a near-term roadmap for development. Workshop organizers will document the findings of the workshop in a report shared with the community.|
|7/19/21||8:00 am – 11:00 am||WORKSHOP: Fifth Workshop on Trustworthy Scientific Cyberinfrastructure (TrustedCI@PEARC21)||Jim Basney|
The Fifth Workshop on Trustworthy Scientific Cyberinfrastructure (TrustedCI@PEARC21) provides an opportunity for sharing experiences, recommendations, and solutions for addressing cybersecurity challenges in research computing. The half-day workshop provides a forum for information sharing and discussion among a broad range of attendees, including cyberinfrastructure operators, developers, and users.
Implementing cybersecurity for open science across the diversity of scientific research projects presents a significant challenge. There is no one-size-fits-all approach to cybersecurity for open science that the research community can adopt. Even NSF Major Facilities, the largest of the NSF projects, struggle to develop effective cybersecurity programs. To address this challenge, practical approaches are needed to manage risks while providing both flexibility for project-specific adaptations and access to the necessary knowledge and human resources for implementation. This workshop brings community members together to further develop a cybersecurity ecosystem, formed of people, practical knowledge, processes, and cyberinfrastructure, that enables research projects to both manage cybersecurity risks and produce trustworthy science.
|7/19/21||12:00 pm – 3:00 pm||WORKSHOP: Fourth Workshop on Strategies for Enhancing HPC Education and Training (SEHET21)||Nitin Sukhija, Scott Lathrop, Julia Mullen, Kate Cahill, Susan Mehringer and Dana Brunson|
High performance computing is becoming central to scientific progress in fundamental research across science, engineering, and societal domains. Rapid advances in mainstream computing technology have made it possible to solve complex, large-scale scientific applications that run advanced simulations of numerical models of phenomena across diverse scientific fields. The wide distribution, heterogeneity, and dynamism of today’s and future computing and software environments present both challenges and opportunities for cyberinfrastructure facilitators, trainers, and educators who develop, deliver, and support training and prepare a diverse community of students and professionals for careers that use high performance computing to advance discovery.
The SEHET21 workshop is an ACM SIGHPC Education Chapter coordinated effort aimed at fostering collaborations among practitioners from traditional and emerging fields to explore strategies to enhance computational, data-enabled, and HPC educational needs. Attendees will discuss approaches for developing and deploying HPC training, as well as identifying new challenges and opportunities for keeping up with the rapid pace of technological advances – from collaborative and online learning tools to new HPC platforms. The workshop will provide opportunities for learning about methods for conducting effective HPC education and training; promoting collaborations among HPC educators, trainers, and users; and disseminating resources, materials, lessons learned, and good/best practices.
|7/19/21||12:00 pm – 3:00 pm||WORKSHOP: Refining Your Research Computing Pitch||Jane Combs and Richard Knepper||Speak above the noise. In an age of over-communication across many channels, one common need among research service providers, particularly research computing centers, is reaching their audience and getting the word out that their services exist. This workshop is for professionals (center leaders, facilitators, faculty champions, etc.) of existing or emerging research computing organizations (even “one-person shops”) to get feedback on and develop their communication materials: new-faculty handouts or introductory slides used as a first communication to familiarize faculty, students, and administrators with the research computing resources available at their institutions. Some existing materials will be reviewed and workshop members can provide feedback; individual groups will then convene to improve sample materials provided by collaborators. The group will reconvene to identify lessons learned and gather feedback on the overall workshop. Through this workshop we hope to provide a clearing-house of template materials, which participants can review and improve over time, to make campus outreach easier for research computing professionals.|
|7/19/21||12:00 pm – 3:00 pm||WORKSHOP: What Does it Mean to be a Campus Champion?||Shelley Knuth, Jay Alameda, Torey Battelle, Dana Brunson, Thomas Cheatham, Douglas Jennewein, Bj Lougee, Julie Ma, Timothy Middelkoop and Henry Neeman|
The importance of research computing and data infrastructure for scientific discovery and scholarly achievement has grown as research questions become more complicated and datasets get larger. Many higher education institutions have built infrastructure utilized by researchers and supported by local research computing staff. Some of these research computing groups are large, with several infrastructure and support staff, while others may be supported by only one or two staff members. For both of these groups, the Campus Champions program has provided an opportunity for knowledge exchange, professional development, and growth (Brazil 2019). Over the past twelve years, the Campus Champions program has grown to nearly 720 Champions at over 300 research institutions. A significant number of Champions attend the PEARC conference; in 2020, over 23% of PEARC attendees were Champions.
With an average net gain of 40 new Champions per year, a cohesive and all-encompassing onboarding program must be in place to ensure that Champions get exposure to the information and resources they need to assist members at their institutions as well as their peers in the community. For several years, an informal onboarding process has served as a basic introduction, with members of the community relying on the web or their colleagues to locate appropriate resources and information. A more extensive process that includes mentoring and exposure to resources is desired.
Given the number of Campus Champions who attend PEARC, and its vision to foster “the creation of a dynamic and connected community of advanced research computing professionals who promote leading practices and the frontiers of research, scholarship, teaching, and industry application” [PEARC 2019], hosting a workshop dedicated to the onboarding of Campus Champions would have maximum impact.
At PEARC 2020, this workshop was hosted for the first time (also entitled, “What Does It Mean to be a Campus Champion?”), with approximately 60 participants remaining engaged throughout the day. Anecdotally, we received a lot of positive feedback about this workshop from the participants. While the targeted audience was new or recent Campus Champions, more experienced Champions as well as those who were not Champions benefitted from the workshop. The more experienced Champions learned about changes in the program or about resources of which they were not aware, while people who were not Champions were either introduced to the community or learned practices to best support researchers at their institutions (if they support infrastructure) or to best utilize infrastructure (if they do not).
|7/19/21||8:00 am – 3:00 pm||TUTORIAL: A Deep Dive into Constructing Containers for Scientific Computing and Gateways||Eric Coulter, Peter Vaillancourt, Jeremy Fischer, Stephen Bird, Sanjana Sudarshan and Suresh Marru||In recent years, containers have rapidly gained traction as a way to lower the barriers to using more software on HPC and cloud resources. However, significant barriers still exist to doing this in practice, particularly for well-established community codes that expect to run on a particular operating system version or resource. Additional barriers exist for researchers unfamiliar with containerization technologies. While many beginner tutorials are available for building containers, they often stop short of covering the complexities that can arise when containerizing scientific computing software. The goal of this full-day tutorial is to demonstrate and work through building and running non-trivial containers with users. We will containerize community scientific software, show how to share it with a larger community via a container registry, and then run it on a completely separate HPC resource, with and without the use of a science gateway. The subject matter will be approachable for intermediate to advanced users, and is expected to be of interest to a diverse audience including researchers, support staff, and teams building science gateways.|
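To give a flavor of the material, a minimal Singularity/Apptainer definition file for a scientific code might look like the sketch below. The base image, packages, and the `mycode` binary are illustrative placeholders, not the tutorial’s actual materials.

```
Bootstrap: docker
From: ubuntu:20.04

%post
    # Install a compiler toolchain and a hypothetical dependency.
    apt-get update && apt-get install -y build-essential libfftw3-dev

%environment
    export LC_ALL=C

%runscript
    # Command executed when the container image is run.
    exec mycode "$@"
```

Such a file would typically be built into an image with a command like `apptainer build mycode.sif mycode.def` and then copied to, or pulled on, the target HPC system.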
|7/19/21||8:00 am – 3:00 pm||TUTORIAL: Build a quick, effective coding tutorial||Christina Maimone and Colby Witherup Wood||Research Computing and Data professionals are often called upon to train others in technical skills, but creating effective workshop and training materials is time-consuming, especially when instructors need to adapt to both virtual and in-person formats. This tutorial will introduce participants to a template for creating a successful, one-hour, interactive coding workshop that can easily transition between remote and in-person instruction. Through a combination of individual work sessions, peer feedback, and instructor advice, participants will refine their topic, outline their examples, and write effective exercises. They will also write a description of their workshop, think through the technology needed, and share tips and ideas with others. In addition to ending the day with materials for their new workshop, participants will leave with a quick template for building future effective workshops.|
|7/19/21||8:00 am – 3:00 pm||TUTORIAL: Lucata Pathfinder-S Tutorial: Next-generation Computation with the Rogues Gallery||Jeffrey Young, Semir Sarajlic, William Powell, Janice McMahon and Jason Riedy|
The Rogues Gallery is a new experimental testbed hosted at Georgia Tech that is focused on tackling “rogue” architectures for the post-Moore era of computing, including those in areas like high-performance, near-memory, neuromorphic, and quantum computing. More recently, the Rogues Gallery has been awarded an NSF grant to serve as a novel architecture testbed as part of the CISE Community Research Infrastructure (CCRI) program. This tutorial will provide an introduction to this new community resource and will focus on hands-on development with its signature architecture, Lucata’s newly designed Pathfinder-S.
This virtually hosted tutorial will present a brief overview of the Rogues Gallery testbed and how NSF researchers can access and utilize unique hardware in the neuromorphic, smart networking, HPC, and near memory spaces to carry out related research goals for CISE-oriented research. Attendees will have an opportunity to learn about and program for the Lucata Pathfinder-S system, a near-memory computing architecture for sparse applications that has applications for database operations, graph analytics, and machine-learning related techniques. We will provide and work through a set of demonstration codes and will provide details on potential workflows that users can explore after the tutorial. Attendees will have an opportunity to continue their investigation into using the Pathfinder-S by requesting a free account to access the Rogues Gallery at the end of the tutorial.
|7/19/21||8:00 am – 3:00 pm||TUTORIAL: Managing HPC Software Complexity with Spack||Gregory Becker, Robert Blake, Massimiliano Culpo, Tammy Dahlgren, Adam Stewart and Todd Gamblin||The modern scientific software stack includes thousands of packages, from C, C++, and Fortran libraries, to packages written in interpreted languages like Python and R. HPC applications may depend on hundreds of packages spanning all of these ecosystems. To achieve high performance, they must also leverage low-level and difficult-to-build libraries such as MPI, BLAS, and LAPACK. Integrating this stack is extremely challenging. The complexity can be an obstacle to deployment at HPC sites and deters developers from building on each other’s work.|
Spack is an open source tool for HPC package management that simplifies building, installing, customizing, and sharing HPC software stacks. In the past few years, its adoption has grown rapidly: by end-users, by HPC developers, and by the world’s largest HPC centers. Spack provides a powerful and flexible dependency model, a simple Python syntax for writing package build recipes, and a repository of over 5,000 community-maintained packages. This tutorial provides a thorough introduction to Spack’s capabilities: installing and authoring packages, integrating Spack with development workflows, and using Spack for deployment at HPC facilities. Attendees will learn foundational skills for automating day-to-day tasks, as well as deeper knowledge of Spack for advanced use cases.
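For readers who have not seen a Spack package, a build recipe is a small Python class that runs inside Spack rather than standalone. The sketch below is a hypothetical recipe (the package name, URL, and checksum are placeholders, not a real package), following the general pattern of Spack’s packaging documentation:

```python
# Hypothetical Spack package recipe; executed by Spack, not standalone.
from spack.package import *

class Mylib(Package):
    """Example scientific library (hypothetical)."""

    homepage = "https://example.org/mylib"
    url = "https://example.org/mylib-1.0.tar.gz"

    version("1.0", sha256="0" * 64)  # placeholder checksum
    variant("shared", default=True, description="Build shared libraries")

    # Spack resolves virtual dependencies like "mpi" and "blas"
    # to concrete providers (e.g., OpenMPI, OpenBLAS) at install time.
    depends_on("mpi")
    depends_on("blas")

    def install(self, spec, prefix):
        configure("--prefix={0}".format(prefix))
        make()
        make("install")
```

A user would then install it with a spec such as `spack install mylib +shared ^openmpi`, letting Spack resolve and build the entire dependency graph.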
|7/19/21||8:00 am – 3:00 pm||TUTORIAL: Open OnDemand, Open XDMoD, and ColdFront: an HPC center management toolset||Dori Sajdak, Andrew Bruno, Eric Franz, Jeff Ohrstrom and Ryan Rathsam|
The University at Buffalo Center for Computational Research (UB CCR) and Ohio Supercomputer Center (OSC) team up to offer HPC systems personnel a step-by-step tutorial for installing, configuring and using what many centers now consider vital software products for managing and enabling access to their resources. UB CCR offers two open source products – an allocations management system, ColdFront, and an HPC metrics & data analytics tool, Open XDMoD. OSC provides the open source OnDemand portal for easy, seamless web-based access for users to HPC resources. These three products have been designed to work together to provide a full package of HPC center management and access tools. In this tutorial the system administrators and software developers from OSC and UB CCR will walk attendees through the installation and configuration of each of these software packages. We’ll show how to use these three products in conjunction with each other and the Slurm job scheduler.
We will begin the tutorial with a short overview of each software product and how they tie together to provide seamless management of an HPC center. We’ll spend the first part of the tutorial demoing the installation and configuration of ColdFront and Open XDMoD. The second half will be spent on the installation of Open OnDemand and examples of configuring interactive apps. We’ll end with instructions on how to tie together Open XDMoD with Open OnDemand for access to job metrics within OnDemand.
|7/19/21||8:00 am – 3:00 pm||TUTORIAL: Programming and Profiling Modern Multicore Processors||Amit Ruhela, Lars Koesterke, John Cazes and Matthew Cawood|
Modern processors, such as Intel’s Xeon Scalable line, AMD’s EPYC architecture, ARM’s ThunderX2 design, and IBM’s Power9 architecture are scaling out rather than up and increasing in complexity. Because the base frequencies for large core-count chips hover between 2 and 3 GHz, researchers can no longer rely on frequency scaling to increase the performance of their applications. Instead, developers must learn to take advantage of the increasing core count per processor and learn how to eke out more performance per core.
To achieve good performance on modern processors, developers must write code amenable to vectorization, be aware of memory access patterns to optimize cache usage, and understand how to balance multi-process programming (MPI) with multi-threaded programming (OpenMP). This tutorial will cover serial and thread-parallel optimization including introductory and intermediate concepts of vectorization and multi-threaded programming principles. We will address CPU as well as GPU profiling techniques and tools and give a brief overview of modern HPC architectures.
The tutorial will include hands-on exercises in parallel optimization, and profiling tools will be demonstrated on TACC systems. This tutorial is designed for intermediate programmers, familiar with OpenMP and MPI, who wish to learn how to program for performance on modern architectures.
|7/19/21||8:00 am – 3:00 pm||TUTORIAL: Python 201: Building Better Scientific Software in Python||Geoffrey Lentner, Amiya Maji and Lev Gorenstein|
Scientists, engineers, researchers, and other CI professionals are increasingly put in the position of being software developers. Nearly every pursuit includes the development of some final domain-specific code, even when it is built on top of robust core libraries and frameworks.
Writing code in the pursuit of science and data analysis brings with it the challenge of making that code deployable and accessible to collaborators. Novice programmers often find themselves writing difficult-to-maintain, difficult-to-manage code that their peers and collaborators have trouble using. This challenge is a hurdle to open, reproducible science.
The goal of this tutorial is to expose researchers to several best practices in scientific software engineering that may otherwise take several years to become acquainted with. Though the implementation of these lessons is Python-specific, the essential ideas can be exported to other languages or platforms.
The tutorial builds on several years of iteration by multiple instructors, with a polished set of materials hosted on GitHub (glentner.github.io/python201). It has been delivered in previous years, including at PEARC ’18 and PEARC ’20; the content has been well received and remains as relevant and in demand as ever.
The tutorial is hands-on with full examples outlined in a “readthedocs” style website of materials. Participants are expected to already be familiar with the Python language to the extent that they understand what Python is, how to write code files in a text editor that they are comfortable with, and run that code at the command-line on their platform of choice (Linux, macOS, Windows).
Topics covered: Python packaging, automated testing, documentation management, logging, command-line interfaces, performance profiling and optimization.
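As a taste of the practices covered, the snippet below sketches a small, testable function together with logging and a command-line entry point. It is a generic illustration (the `moving_average` function and its flags are invented for this example, not taken from the tutorial materials):

```python
import argparse
import logging

logger = logging.getLogger(__name__)


def moving_average(values, window):
    """Return the simple moving average of `values` over `window` points."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def main(argv=None):
    """Command-line entry point: smooth a series of numbers."""
    parser = argparse.ArgumentParser(description="Smooth a series of numbers.")
    parser.add_argument("values", nargs="+", type=float)
    parser.add_argument("--window", type=int, default=3)
    args = parser.parse_args(argv)
    logging.basicConfig(level=logging.INFO)
    logger.info("smoothing %d values", len(args.values))
    print(moving_average(args.values, args.window))


if __name__ == "__main__":
    main()
```

Keeping the computation in a plain function (separate from argument parsing and I/O) is what makes it straightforward to unit-test and to package later.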
|7/19/21||8:00 am – 11:00 am||TUTORIAL: An Introduction to Advanced Features in MPI||Victor Eijkhout and Amit Ruhela|
The MPI standard is now at version 3, with version 4 forthcoming, but most programmers use mechanisms only from MPI-1 or, at best, MPI-2.
This tutorial was taught at PEARC20 where it was attended by 50 people, receiving a 4.3/5 rating.
New for this year will be a brief discussion of some features of the upcoming MPI-4 standard.
|7/19/21||12:00 pm – 3:00 pm||TUTORIAL: Boosting Performance of Machine Learning/Deep Learning and Dask Applications using the MVAPICH2-GDR Library||Dhabaleswar Panda, Hari Subramoni, Arpan Jain and Aamir Shafi||The recent advances in Machine Learning (ML) and Deep Learning (DL) have led to many exciting challenges and opportunities for CS and AI researchers alike. Modern ML/DL and Data Science frameworks, including TensorFlow, PyTorch, Dask, and several others, have emerged that offer ease of use and flexibility to train and deploy various types of ML models and Deep Neural Networks (DNNs). In this tutorial, we will provide an overview of interesting trends in ML/DL and how cutting-edge hardware architectures and high-performance interconnects are playing a key role in moving the field forward. We will also present an overview of different DNN architectures and ML/DL frameworks. Most ML/DL frameworks started with a single-node design; however, approaches to parallelize the process of model training are being actively explored. The AI community has adopted different distributed training designs that exploit communication runtimes like gRPC, MPI, and NCCL. We highlight new challenges and opportunities for communication runtimes to exploit high-performance CPU and GPU architectures to efficiently support large-scale distributed training. We also highlight some of our co-design efforts to utilize MPI for large-scale DNN training on cutting-edge CPU and GPU architectures available on modern HPC clusters. The tutorial covers training traditional ML models, including K-Means, linear regression, and nearest neighbors, using the cuML framework accelerated with MVAPICH2-GDR. The tutorial also presents accelerating GPU-based data science applications using the MPI4Dask package, which provides an MPI-based backend for Dask.
Throughout the tutorial, we include hands-on exercises to enable attendees to gain first-hand experience of running distributed ML/DL training and Dask on a modern GPU cluster.|
|7/19/21||8:00 am – 11:00 am||TUTORIAL: Deploying Science Gateways with Apache Airavata||Marlon Pierce, Eroma Abeysinghe, Marcus Christie, Dimuthu Wannipurage, Sudhakar Pamidighantam and Suresh Marru||The authors present the Apache Airavata framework for deploying science gateways, illustrating how to request, administer, modify, and extend a basic science gateway tenant to hosted Apache Airavata middleware. We further show how to use this gateway to execute scientific software on XSEDE supercomputers and university-operated computing clusters. This tutorial builds on successful tutorials that the authors have presented at previous PEARC and Gateways conferences, including a completely online version at Gateways 2020; see https://www.youtube.com/watch?v=FhAHkOoVGh4 for a recording.|
|7/19/21||12:00 pm – 3:00 pm||TUTORIAL: Deploying XSEDE Endpoints Using Globus Connect Server version 5||Vas Vasiliadis, John-Paul Navarro and Lee Liming|
The XSEDE project operates Globus endpoints as a means for managing data on storage systems at service providers such as TACC and SDSC. The XSEDE community moves and shares tens of petabytes of data each year via these endpoints, using the Globus Connect Server (GCS) software to enable access to the Globus service. XSEDE endpoints currently run Globus Connect Server version 4 (GCSv4) and, in some cases, bespoke deployments of GridFTP. In 2021 XSEDE will introduce Globus Connect Server version 5 (GCSv5), with the intent of simplifying deployment and administration, en route to phasing out support for GCSv4 and other legacy GridFTP implementations in the future.
This tutorial will explore important new features of the GCSv5 software and help Globus endpoint operators prepare for the transition to GCSv5. We will compare and contrast the differences between GCSv4 and GCSv5, highlighting those changes that may require changes to user-facing and systems management processes. Participants will deploy a Globus endpoint using GCSv5 and will experiment with common configuration options. We will illustrate concepts using examples from the XSEDE ecosystem, but the material is equally relevant to system administrators at university research computing centers, national laboratories, and other advanced computing facilities that use Globus for data management.
|7/19/21||8:00 am – 11:00 am||TUTORIAL: Empowering Research Computing at Your Organization Through the Open Science Grid||Christina Koch, Lauren Michael, Mats Rynge and Emelie Fuchs||This training will provide researcher-facing cyberinfrastructure professionals with the information and hands-on skills they need to engage with the Open Science Grid. Attendees will explore the capabilities of the OSG through hands-on activities, discuss strategies for engaging researchers, and hear from organizations who have already partnered with the OSG. Attendees should leave this training with a clear understanding of where the OSG can be transformative for research, the process for moving work onto the OSG, and what options exist for engaging further with the OSG.|
|7/19/21||12:00 pm – 3:00 pm||TUTORIAL: Engineering your Application for Peak Performance with TAU and MVAPICH2||Sameer Shende and Hari Subramoni||This tutorial presents tools and techniques to optimize the runtime tunable parameters exposed by the MPI using the TAU Performance System® [http://tau.uoregon.edu]. MVAPICH2 [http://mvapich.cse.ohio-state.edu] exposes MPI performance and control variables using the MPI_T interface that is now part of the MPI-3 standard. The tutorial will describe how to use TAU and MVAPICH2 for assessing the application and runtime system performance. We present the complete workflow of performance engineering, including instrumentation, measurement (profiling and tracing, timing, and PAPI hardware counters), data storage, analysis, and visualization. Emphasis is placed on how tools are used in combination for identifying performance problems and investigating optimization alternatives. We will request remote access to the Stampede system at TACC for hands-on exercises. We will also provide the ECP E4S OVA image [https://e4s.io] containing all of the necessary tools (running within a virtual machine) for the hands-on sessions. Participants will learn how to use the TAU Performance System with MPI, OpenMP (OMPT), CUDA, HIP, and OneAPI runtimes, and use the MPI-T interface from the MVAPICH2 library on the Frontera system at TACC and on the VM. This will help to prepare participants to locate and diagnose performance bottlenecks in their own parallel programs.|
|7/19/21||12:00 pm – 3:00 pm||TUTORIAL: Interactive Scientific Computing on the Anvil Composable Platform||Erik Gough, Eric Adams, Alan Chalker, Brian Werts, Sam Weekly and Stephen Kelley||XSEDE capacity systems have traditionally provided batch access to large scale computing systems, meeting the high-performance computing (HPC) needs of domain scientists across numerous disciplines. New usage patterns have emerged in research computing that depend on the availability of custom services such as notebooks, databases, elastic software stacks, and science gateways alongside traditional batch HPC. Anvil, an XSEDE capacity system being deployed at Purdue University, integrates a high capacity, high performance computing cluster with a comprehensive ecosystem of software, access interfaces, programming environments, and composable services to form a seamless environment able to support a broad range of science and engineering applications. In this introductory-level tutorial, participants will get hands-on experience with interactive scientific computing using Anvil’s ThinLinc remote desktop and Open OnDemand (OOD) services as well as the Anvil Composable Platform, a service providing web-based access to a Kubernetes-based private cloud.|
|7/19/21||12:00 pm – 3:00 pm||TUTORIAL: Modern Tools for Supercomputers||Chun-Yaung Lu, Si Liu, Robert McLay and Kent Milfeld|
Powerful supercomputers have played an important role in the computational research community. However, the increasing complexity of modern systems can hinder researchers’ work: a large amount of precious time and effort is spent unnecessarily on managing the user environment, reproducing standard workflows, handling large-scale I/O, profiling and monitoring jobs, and understanding and resolving avoidable system issues. To help supercomputer users focus on their scientific and technical work, and to minimize the workload for the consulting team, we designed and developed a series of powerful tools for supercomputer users. These tools are portable and effective on almost all supercomputers, and they now serve thousands of users at TACC, XSEDE, and other institutions every day.
In this tutorial, we will present and practice with supercomputer tools designed for managing complex user environments (Lmod, Sanity Tool), tools for workflow management (ibrun, launcher, launcher-GPU, pylauncher), tools for job monitoring and profiling (Remora, TACC-Stat, core_usage, amask, etc.), and several other convenient tools. Attendees will learn how these tools are designed and how to use them in their daily work. Detailed hands-on exercises have been prepared and will be executed mainly on the Stampede2 and Frontera supercomputers at the Texas Advanced Computing Center (TACC).
|7/19/21||12:00 pm – 3:00 pm||TUTORIAL: Securing Science Gateways with Custos Services||Marlon Pierce, Suresh Marru, Enis Afgan, Jim Basney and Eroma Abeysinghe||The authors present a tutorial on Custos, a cybersecurity service based on open source software that helps science gateways manage user identities, integrate with federated authentication systems, manage secrets such as OAuth2 access tokens and SSH keys needed to connect to remote resources, and manage groups and access permissions to digital objects. This tutorial will provide an overview of Custos’s capabilities, provide hands-on exercises on using its features, demonstrate to gateway providers how to integrate the services into their gateways with software development kits for the Custos API, introduce developers to the code and how to review and contribute to it, supply gateway providers with information on how Custos services are deployed for high availability and fault tolerance, and how Custos operations handle incidence response. This tutorial builds on a successful online tutorial presented by the authors at Gateways 2020; see https://www.youtube.com/watch?v=CuBvFj194Kg for a recording.|
|7/19/21||8:00 am – 11:00 am||TUTORIAL: Security Log Analysis: Real world hands on methods and techniques to detect attacks||Mark Krenz and Ishan Abhinit||The goal of security log analysis is to leverage log collection more efficiently in order to identify threats and anomalies in your research organization. This half-day training will help you tie together various log and data sources to provide a more rounded, coherent picture of a potential security event. It will also help you understand log analysis as a life cycle (collection, event management, analysis, response) that becomes more efficient over time. Interactive demonstrations will cover both automated and manual analysis using multiple log sources, with examples from real security incidents. About 45% of the session will be devoted to hands-on exercises in which students analyze real log files to find security incidents. Knowledge of Unix commands such as grep, awk, and wc is helpful for this class but not required, as the analytic methods can be applied on other systems; a brief primer on these commands will be provided. We have expanded the exercises this time to include both command-line and Elastic Stack based analysis. The session will be interactive, allowing Q&A, and will feature live polls to enhance the audience’s learning experience.|
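As a small taste of the command-line techniques involved, the sketch below builds a toy log file (the entries, hostnames, and IP addresses are fabricated) and uses grep, awk, and wc-style counting to total and rank failed SSH logins:

```shell
# Create a tiny sample log (a stand-in for /var/log/auth.log; entries are made up).
cat > sample_auth.log <<'EOF'
Jul 19 08:01:02 host sshd[101]: Failed password for root from 203.0.113.5 port 4100 ssh2
Jul 19 08:01:04 host sshd[102]: Failed password for admin from 203.0.113.5 port 4101 ssh2
Jul 19 08:01:07 host sshd[103]: Accepted password for alice from 198.51.100.7 port 4200 ssh2
Jul 19 08:01:09 host sshd[104]: Failed password for root from 192.0.2.9 port 4300 ssh2
EOF

# Count total failed logins (grep -c counts matching lines, like grep | wc -l).
grep -c 'Failed password' sample_auth.log
# -> 3

# Rank source IPs by number of failures: awk pulls the 11th field (the IP),
# then sort | uniq -c | sort -rn tallies and orders them.
grep 'Failed password' sample_auth.log \
  | awk '{print $11}' \
  | sort | uniq -c | sort -rn
# -> 2 203.0.113.5
#    1 192.0.2.9
```

The same count-and-rank pattern generalizes to other log formats by adjusting the grep pattern and the awk field number.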
|7/19/21||8:00 am – 11:00 am||TUTORIAL: Visualize, Analyze, and Correlate Networking Activities for Parallel Programs on InfiniBand HPC Clusters using the OSU INAM Tool||Dhabaleswar Panda, Hari Subramoni and Pouya Kousha||As computing, networking, heterogeneous hardware, and storage technologies continue to evolve in HEC platforms, understanding the full-stack performance tradeoffs and the interplay between HPC applications, MPI libraries, the communication fabric, the file system, and the job scheduler becomes an increasingly challenging endeavor. Such understanding enables all involved parties to identify bottlenecks and to maximize the efficiency and performance of the individual components that comprise a modern HPC system, in service of solving grand challenge problems. Through this tutorial, participants will learn how to use the OSU InfiniBand Network Analysis and Monitoring (INAM) tool in conjunction with live jobs running on remote clusters at OSC and OSU to visualize, analyze, and correlate, online, how the MPI runtime, high-performance network, I/O filesystem, and job scheduler interact, and to identify potential bottlenecks. Emphasis is placed on how the tools are used in combination to identify performance problems and investigate optimization alternatives. We will request remote access to the Pitzer system at OSC and the RI/RI2 clusters at OSU for hands-on exercises. This will prepare participants to locate and diagnose performance bottlenecks in their own clusters and parallel programs.|
|7/19/21||8:00 am – 11:00 am||WORKSHOP: Dell – Win More Grants||TBA||Find out how you can win more grants! Get tips and strategies from Dell Technologies, AMD, Grants Office LLC, and your PEARC peers. There will be short, impactful presentations, panel discussions, and audience Q&A throughout the workshop. Also hear from the Dell Technologies HPC & AI Innovation Lab and from the Chief Technology Office regarding their research and forward-looking perspectives on new technologies.|
|7/19/21||12:00 pm – 3:00 pm||WORKSHOP: AMD||TBA||TBA|
|7/19/21||8:00 am – 11:00 am||WORKSHOP: Intel||James Reinders, Chris Allison, Varsha Madananth & Anoop Madhusoodhanan Prabha|
In this workshop, PEARC21 conference attendees will learn how to take advantage of oneAPI for modern cross-architecture application development, analysis, and tuning.
Attendees will learn how Intel’s oneAPI Toolkits provide a useful upgrade to the popular and widely used Parallel Studio XE tools, embracing the vision of oneAPI: making heterogeneous programming open and ubiquitous.
The session will educate participants on current resources, oneAPI Toolkits, and development vehicles available to the PEARC21 community. Join us to learn more!
NOTE: This session is ideal for research computing system administrators, application specialists, application developers, and other HPC and AI software stakeholders.
Speakers: James Reinders, co-author of the recent book on Data Parallel C++ (and nine other HPC-focused books), will be joined by other Intel engineers and experts to teach this workshop and answer questions from participants.
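As a minimal sketch of the cross-architecture workflow that oneAPI enables (the source file name saxpy.cpp and the installation path are assumptions, reflecting a typical default oneAPI installation rather than workshop materials), a session might look like:

```shell
# Hypothetical session; assumes a standard oneAPI installation under /opt/intel.
source /opt/intel/oneapi/setvars.sh   # load oneAPI compilers and libraries into the environment
icpx -fsycl saxpy.cpp -o saxpy        # compile SYCL/DPC++ source with the oneAPI DPC++/C++ compiler
./saxpy                               # the SYCL runtime selects an available device at execution
```

The same source can then target CPUs, GPUs, and other accelerators without code changes, which is the heterogeneous-programming model the workshop covers.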
|7/19/21||8:00 am – 11:00 am||WORKSHOP: NVIDIA: Best Practices for Operating a GPU System|
Hear about all the tools and techniques we use at NVIDIA to operate our multiple research computing clusters. We will demonstrate how you can use these same tools and techniques on your systems to help solve your users’ research computing challenges.
We will describe the toolsets that system administrators can use to improve utilization of GPUs and GPU nodes, including some live demonstrations. We will also review how research computing support staff can help their users make more efficient and effective use of GPU acceleration.
|7/19/21||8:00 am – 11:00 am||WORKSHOP: Google and NVIDIA||TBA||Join Google Cloud experts and our partners at NVIDIA and SchedMD to learn more about ML and GPUs and to get hands-on experience running popular workloads on GCP.|