
I have to analyze over 2 million files in my job. What can I do?

Please go through the presentation we provide on Using File Systems Properly; there is also a video recording available.

If your application supports it, use a container format (e.g. HDF5) or a file-based database, so that many small files become one large file. Otherwise, pack your files into an archive (e.g. tar, optionally compressed) and work on them from node-local storage, which is accessible via $TMPDIR.
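The archive approach can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a site-provided tool: the function names `pack_directory` and `unpack_to_node_local` are hypothetical, the fallback to the system temporary directory when $TMPDIR is unset is an assumption, and the archive is assumed to come from a trusted source and to be stored outside the directory being packed.

```python
import os
import tarfile
import tempfile
from pathlib import Path


def pack_directory(src_dir: Path, archive_path: Path) -> int:
    """Pack every regular file under src_dir into one gzip-compressed tar archive.

    Returns the number of files packed. archive_path should lie outside
    src_dir so the archive does not try to include itself.
    """
    count = 0
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in sorted(src_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to src_dir so extraction is relocatable.
                tar.add(path, arcname=str(path.relative_to(src_dir)))
                count += 1
    return count


def unpack_to_node_local(archive_path: Path) -> Path:
    """Extract the archive into a fresh directory below $TMPDIR.

    Assumption: if $TMPDIR is unset, the system temporary directory is used.
    Only use this with archives you created yourself.
    """
    base = os.environ.get("TMPDIR", tempfile.gettempdir())
    target = Path(tempfile.mkdtemp(prefix="unpacked_", dir=base))
    with tarfile.open(archive_path, "r:*") as tar:
        tar.extractall(target)
    return target
```

In this sketch, `pack_directory` would be run once where the data currently lives, and `unpack_to_node_local` at the start of each job, so that the shared file system only ever sees a single large file while the analysis reads the small files from node-local storage.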