Monthly HPC Café: Handling many small files and managing AI data sets (February 11, hybrid event)
The next HPC Café will take place on February 11 at 4:00 p.m. as a hybrid event. As always, there will be plenty of time to get in touch with your favorite HPC group. We invite you to come to NHR@FAU to enjoy coffee, cake, and computing.
The event starts at 4:00 p.m. with an open coffee chat; the presentation begins at 4:30 p.m.
Topic: Handling many small files and managing AI data sets
Speaker: Dr. Anna Kahler, NHR@FAU
Location: Seminar room 2.049 (RRZE, Martensstraße 1, 91058 Erlangen) and online via Zoom.
Access via Zoom: https://go-nhr.de/hpc-cafe
Abstract:
We invite you to join us for a discussion on data handling, including the possibility of NHR@FAU providing access to popular data sets. We will give an overview of the file systems available for data storage at NHR@FAU and cover data archive formats, the copying, archiving, compressing, and unpacking of data, and recommendations for the most effective programs for these tasks. We will also share best practices for efficient data storage and access in your Slurm scripts.
A few simple steps resolve many common data handling issues. This matters because NHR@FAU supports over 1,000 users, and inefficient data usage affects not only your own workflows but also those of your colleagues. Nevertheless, we continue to observe inefficient data handling practices, which is why we revisit this topic regularly.
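As a rough illustration of the archiving and unpacking mentioned in the abstract, the following Python sketch bundles a directory of many small files into a single tar archive and later unpacks it into a working directory. It is not taken from the talk. The function names pack_directory and unpack_archive are hypothetical, the choice of an uncompressed tar archive is an assumption, and the target directory merely stands in for whatever fast, job-local storage your cluster provides.

```python
import tarfile
from pathlib import Path


def pack_directory(src_dir, archive_path):
    """Bundle every regular file under src_dir into one uncompressed tar archive.

    Returns the number of files added. (Hypothetical helper for illustration.)
    """
    src = Path(src_dir)
    count = 0
    with tarfile.open(str(archive_path), "w") as tar:
        for path in sorted(src.rglob("*")):
            if path.is_file():
                # Store paths relative to src_dir so the archive is relocatable.
                tar.add(str(path), arcname=path.relative_to(src).as_posix())
                count += 1
    return count


def unpack_archive(archive_path, target_dir):
    """Extract the archive into target_dir and return that directory as a Path.

    target_dir is assumed to be on fast, job-local storage; this is an
    assumption of the sketch, not a statement about any specific system.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with tarfile.open(str(archive_path), "r") as tar:
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions support safe extraction filters.
            tar.extractall(str(target), filter="data")
        else:
            tar.extractall(str(target))
    return target
```

The idea is that a shared file system then handles one large file instead of thousands of small ones, and the many small reads happen on the unpacked copy. Whether this helps in your case depends on the file systems involved.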
Material from past events is available at: https://hpc.fau.de/teaching/hpc-cafe/