Glaciers Meet Neural Networks: Interview with AI Newcomer Nora Gourmelon
Nora Gourmelon is a Researcher and PhD student with the Pattern Recognition Lab at FAU. She received the “AI Newcomer Award 2023” in April this year. This Award is granted by the German Association of Computer Science (Gesellschaft für Informatik) to young researchers under 30 for innovative developments in the area of artificial intelligence. Since she used GPU resources at NHR@FAU for her research, we interviewed Nora to learn more about her work, what makes it so relevant, and the role of AI in it.
Nora, your research is about glaciers. We all know that shrinking glaciers are indicators of global warming, but what is it exactly that you are investigating?
I am interested in glaciers that end in an ocean, specifically their calving fronts. The calving front is where chunks of ice break off from the edge of the glacier. How these calving fronts evolve is important information for geoscientists.
What makes this so interesting? Isn’t it well known that glaciers are melting?
It’s not that simple. Glaciers shrink in different ways – there’s calving at the front, melting at the surface of the glacier and sub-marine melting. For example, if a calving front hits land, the glacier melts more slowly and only at the surface, which has an impact on the amount of water flowing into the sea.
The glacier melts more slowly over land because the sea is an almost infinite reservoir of heat, right?
Does this affect the global sea level?
Exactly. And current climate models don’t factor this in. That’s why we need current data about calving fronts.
So you travel around the world surveying glaciers? Sounds fun.
No, unfortunately not (laughing). We download grayscale radar satellite images of glaciers from NASA and ESA, and use them to train a neural network. Later, the network can then be employed to detect calving fronts in images it has not seen before.
Radar images? Can you explain what this is?
It’s called “Synthetic Aperture Radar” (SAR). Satellites in orbit constantly send radar signals to the earth’s surface and record what bounces back. As the satellite moves, the recorded signals are put together to compute an image.
Sounds complicated. Why not just take photos in the visible spectrum?
Radar is pretty independent of the weather and it can be used at night, these are the two major advantages.
What is the resolution of these images?
The image data I use shows between seven and twenty square meters per pixel, and the image size is a few megapixels. This is sufficient for my purposes. But there is a downside, too: What the glacier surface looks like in the SAR image can change, for example when the surface starts to melt or when sea ice forms in the water just before the calving front.
Isn’t this an issue? Can you separate glacier from “non-glacier” in this case at all?
Indeed, this is a tough nut for our algorithm, but humans also partly struggle to correctly delineate the front in these cases. But I hope that I can improve the algorithm such that the results of humans and the algorithm are closer together at least.
Let’s go into more detail about this algorithm. It ingests those satellite images and tries to identify the calving fronts automatically. How does it do that?
It’s basically a special form of neural network called a U-Net. We have tailored a U-Net so that it’s especially good at this particular problem of calving front delineation.
Did you use one of the popular AI frameworks?
Yes, I used PyTorch Lightning, this is a Python library that simplifies the use of PyTorch significantly.
In your recent paper , you describe the benchmark data set that you used, and it had 700 images. For neural networks that can identify kittens and people and objects and “stuff,” it’s known that they require thousands if not millions of images for training. Why are 700 enough in your case?
Because it’s a segmentation problem, not a classification problem. The network doesn’t try to find out whether there is a glacier (or a kitten) or not. Its task is to find the dividing line between glacier and ocean, and that’s a one-pixel wide path through the image. Put simply, it’s a classification problem for every single pixel of each image: Is it on the line or not? So 700 images are plenty.
What’s the difference between a U-Net and a standard classification network on a technical level?
A pure classification network has a number of outputs for the different objects it can recognize: dog, kitten, toaster. A U-Net ingests an image and spits out an image of the same size but augmented with – in our case – a line, the calving front. The hard part is that the line we’re looking for is only a small fraction of all the pixels in the image. So, I had to find a new way to quantify how “close” the solution of the network is to the ground truth.
“Ground truth,” that’s all the images annotated manually by human beings?
How many people were involved in that, and does it matter who does the labeling?
One person labeled the entire training set, but currently I am conducting a test to find out how much manual annotations vary across individuals and then come to a better ground truth eventually. I’d like to get twenty people for that, but we’re not quite there yet.
How do you make sure that the trained network can actually handle unknown images that were not in the training set?
My data set has images from seven glaciers; I split it into two parts: Five of seven glaciers for training, the remaining images of two glaciers for testing. This way I checked if the trained network works correctly on data it was not trained with.
And did it?
Indeed, there is one glacier that caused problems because it shows not only one but three calving fronts. Of course, it’s hard for any network to handle a situation it did not encounter in training, and in this case, it mostly identified only two out of three.
Was the handling of the image data a problem at all?
700 images of a few megapixels are not a problem, especially if you know how to handle them. Of course, I keep the images in a single archive file and unpack the bunch on a compute node only when needed.
Very good, that’s exactly the recommended procedure.
You published the full data set as part of your paper so that other researchers can build upon it, correct?
Yes. We did some preprocessing of the images and packaged the originals with the annotated ones. This way, everyone can refer to a well-defined data set, and comparisons of different AI methods become possible.
Let’s come to resources and compute power. For this particular project, you used about 1000 GPU hours on the A100 GPUs in the Alex cluster at NHR@FAU, but this included many trials and all the research to find the right configuration. How long does it actually take to train the final network once with a data set of the size that you used?
It takes between 48 and 72 hours on a single A100 GPU. The segmentation of the test data set using the trained network takes about 15-20 minutes on my RTX 2080 Ti, that’s between seven and ten seconds per image.
What’s the next big thing for you?
We have two new projects coming up. The first will look into the innards of glaciers using images of cross sections; the second is concerned with bird monitoring, where we will detect single birds in aerial photographs. For that, we will collaborate with the Federal Agency for Nature Conservation (Bundesamt für Naturschutz). And, of course, I’d like to wrap up my dissertation in the not-too-distant future.
We wish you all the best for that, and thanks a lot for this interview!
 Gourmelon, N., Seehaus, T., Braun, M., Maier, A., and Christlein, V.: Calving fronts and where to find them: a benchmark dataset and methodology for automatic glacier calving front extraction from synthetic aperture radar imagery, Earth Syst. Sci. Data, 14, 4287–4313, https://doi.org/10.5194/essd-14-4287-2022, 2022.