2024 Girls Astronomy Summer Camp Report

T. Andrew Manning, Research Scientist, National Center for Supercomputing Applications

Summary

I supported another successful UIUC Girls Astronomy Summer Camp (GASC), which was held July 22-26, 2024. This year I again deployed a dedicated JupyterHub service that allowed the students to run code in Jupyter notebooks independently on their own JupyterLab servers during the five one-hour tutorial sessions. My points of contact were the camp instructors Maggie Verrico, Janiris Rodriguez-Bueno, and Aadya Agrawal.

Cluster provisioning

Because the cluster that originally hosted the GASC services failed at the last minute, I provisioned a new, dedicated cluster on Jetstream2 (JS2) using our Terraform and DecentCI recipe. JS2 was stable and provided all the resources we needed.

Identity and access management

Because the students are in high school, they needed local accounts that do not rely on an email address or a third-party service. Unlike in previous years, I used the NativeAuthenticator plugin, which lets hub admins authorize users as they sign up, so the students created their local JupyterHub accounts during the first tutorial session. This was the simplest solution and gave the GASC instructors the most flexibility and control. However, if additional web services requiring authentication are needed, the Keycloak server might be a better option because it provides a central source of user accounts. That said, given the small number of students at the event, manually creating local accounts could still be preferable to the complexity of a Keycloak deployment.
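For reference, a NativeAuthenticator setup of this kind usually amounts to a few lines in jupyterhub_config.py. The fragment below is a sketch based on the plugin's documented options, not the configuration actually deployed for the camp. The admin username is a placeholder, and a Helm-based deployment would express the same settings in its values file instead.

```python
# jupyterhub_config.py fragment (sketch, not the camp's actual configuration).
# JupyterHub provides get_config() when it loads this file.
import os

import nativeauthenticator

c = get_config()  # noqa: F821

# Use NativeAuthenticator so accounts live in the hub's own database.
c.JupyterHub.authenticator_class = "native"

# Make the plugin's signup and authorization pages available to the hub.
c.JupyterHub.template_paths = [
    f"{os.path.dirname(nativeauthenticator.__file__)}/templates/"
]

# Keep signup closed: new accounts stay pending until an admin authorizes them.
c.NativeAuthenticator.open_signup = False

# Do not ask for an email address on the signup form.
c.NativeAuthenticator.ask_email_on_signup = False

# Placeholder admin account (hypothetical name) that can authorize signups.
c.Authenticator.admin_users = {"camp-admin"}
```

With settings like these, students would register at /hub/signup and an admin would approve each pending account at /hub/authorize, as described in the plugin documentation.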

Resource allocation and limits

If the students are all going to run the same code in a given session, the per-server memory limits can be set much higher than T/N, where T is the total node memory and N is the number of JupyterLab servers on the node, because multiple servers access the same memory addresses when opening the same files.

This year we had 3 worker nodes (8 CPU / 30 GB each) and about 5 active servers on each node (19 total servers). Peak usage per node was roughly 3 CPU / 18 GB.
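The T/N reasoning can be made concrete with this year's per-node figures. The sketch below is illustrative only: it takes "about 5" servers per node at face value, and the 8 GB per-server limit is a hypothetical value chosen for the example, not the limit that was configured.

```python
def even_split_gb(node_memory_gb: float, servers_on_node: int) -> float:
    """Per-server limit T/N: node memory divided evenly, with no overcommit."""
    if servers_on_node <= 0:
        raise ValueError("servers_on_node must be positive")
    return node_memory_gb / servers_on_node


def overcommit_ratio(limit_gb: float, servers_on_node: int,
                     node_memory_gb: float) -> float:
    """Sum of per-server limits over node memory; above 1.0 is overcommitted."""
    if node_memory_gb <= 0:
        raise ValueError("node_memory_gb must be positive")
    return limit_gb * servers_on_node / node_memory_gb


# Per-node figures reported for this year's camp.
NODE_MEMORY_GB = 30
SERVERS_PER_NODE = 5       # "about 5" active servers per node
PEAK_NODE_USAGE_GB = 18

# Hypothetical per-server limit, chosen only for illustration.
EXAMPLE_LIMIT_GB = 8

baseline = even_split_gb(NODE_MEMORY_GB, SERVERS_PER_NODE)
ratio = overcommit_ratio(EXAMPLE_LIMIT_GB, SERVERS_PER_NODE, NODE_MEMORY_GB)
headroom = NODE_MEMORY_GB - PEAK_NODE_USAGE_GB

print(f"T/N baseline: {baseline:.1f} GB per server")
print(f"Overcommit at a {EXAMPLE_LIMIT_GB} GB limit: {ratio:.2f}x")
print(f"Headroom at observed peak: {headroom} GB per node")
```

With these inputs the even split is 6 GB per server, and an 8 GB limit would overcommit each node by about a third while the observed peak still left 12 GB free per node.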

Two 200 GB OpenStack volumes were attached to two of the worker nodes, providing fast local storage for the individual server files (1 GB) and for the shared files (100 GB), which were mounted at /home/jovyan/shared. By the end of the five tutorial sessions, 35 GB of the shared storage had been used:

Size  Used  Avail  Use%   Mounted on
 98G   35G    64G   36%   /home/jovyan/shared
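The same check can be run from inside a notebook with the Python standard library. This is a minimal sketch, not the command used to produce the figures above. Rounding the percentage up is an assumption about how df reports Use%, though it does reproduce the 36% shown above from 35G used and 64G available. The fallback to the current directory only exists so the example runs outside the camp environment.

```python
import math
import os
import shutil


def use_percent(used: float, avail: float) -> int:
    """Used share of used + available space, rounded up to a whole percent."""
    total = used + avail
    if total <= 0:
        return 0
    return math.ceil(100 * used / total)


def usage_report(path: str) -> dict:
    """Return size, used, and available space for the filesystem holding path."""
    usage = shutil.disk_usage(path)
    gib = 1024 ** 3
    return {
        "path": path,
        "size_gib": round(usage.total / gib, 1),
        "used_gib": round(usage.used / gib, 1),
        "avail_gib": round(usage.free / gib, 1),
        "use_pct": use_percent(usage.used, usage.free),
    }


SHARED_DIR = "/home/jovyan/shared"

# Fall back to the current directory when the shared mount is not present.
target = SHARED_DIR if os.path.isdir(SHARED_DIR) else "."
print(usage_report(target))
```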

Planning for next year

Instructors

  • Require estimates of file storage capacity and CPU & memory needs well in advance.

  • Give instructors a deadline for installing files and testing the performance.

NCSA staff

  • Create a test harness for running all notebooks simultaneously for robust testing before the event.

  • Deploy a real-time load monitor showing CPU and memory usage per server over time.

  • Enhance the Terraform deployment repo to automatically provision Cinder volumes for shared storage so that manual OpenStack volume provisioning and Longhorn are unnecessary.
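As a starting point for the test harness item above, the sketch below launches every notebook in a directory at the same time using the jupyter nbconvert command line. It is a rough outline under stated assumptions, not a finished tool: the function names and the output directory are hypothetical, and it runs all notebooks on one machine rather than on separate JupyterLab servers, so it approximates the combined load rather than reproducing the camp setup.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Dict, Iterable


def execute_notebook(path: Path, timeout_s: int = 600) -> bool:
    """Run one notebook top to bottom with nbconvert; True if it exits cleanly."""
    cmd = [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute",
        f"--ExecutePreprocessor.timeout={timeout_s}",
        "--output-dir", "executed",   # hypothetical output location
        str(path),
    ]
    return subprocess.run(cmd, check=False).returncode == 0


def run_all(paths: Iterable[Path],
            runner: Callable[[Path], bool] = execute_notebook) -> Dict[str, bool]:
    """Start every notebook at once and report which ones succeeded."""
    paths = list(paths)
    if not paths:
        return {}
    # One worker per notebook so that all of them run simultaneously.
    with ThreadPoolExecutor(max_workers=len(paths)) as pool:
        results = list(pool.map(runner, paths))
    return {str(p): ok for p, ok in zip(paths, results)}


if __name__ == "__main__":
    # Usage: python harness.py <directory containing .ipynb files>
    notebooks = sorted(Path(sys.argv[1]).glob("*.ipynb")) if len(sys.argv) > 1 else []
    for name, ok in run_all(notebooks).items():
        print(f"{'PASS' if ok else 'FAIL'}  {name}")
```

Passing the runner as a parameter keeps the concurrency logic testable without Jupyter installed, and would let a later version swap in a runner that executes each notebook on its own hub server.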

Illinois Computes Research Notebooks

We should consider using the new Illinois Computes Research Notebooks service next year. Given the resource limits listed in its documentation (1 CPU core guaranteed, up to 4; 2 GB of system memory guaranteed, up to 8 GB; 100 GB of persistent file storage), the camp would need custom resource limits, as well as a way for students to authenticate without UIUC accounts.