Distributed Deep Learning Pipeline for Seismo-volcanic data (Etna)
Solid Earth Sciences, Seismology
Research area
The project builds a distributed workflow that runs on High Performance Computing platforms to process complex seismic waveforms recorded in volcanic areas. The use case is a dataset of waveforms recorded in the Etna area (Sicily); the workflow was tested on the Leonardo platform, with a possible extension to Galileo100. The workflow first converts the waveforms into a uniform format and then applies a set of pre-trained Deep Learning models provided by the SeisBench toolbox (Woollam et al., 2022) to pick P and S phases on the waveforms. The predicted picks are then associated with the predicted earthquakes by the GaMMA association tool (Zhu et al., 2022).
Project goals
This project enables the analysis of complex seismic waveforms in volcanic areas, where the signals are influenced by multiple seismic sources mainly related to the presence of fluids and to brittle fracturing under stress within the volcanic structure. Conventional picking methods and phase associators can struggle with such complex waveforms. The ultimate goal is to find an automatic way to distinguish between the two main families of seismic signals, Volcano-Tectonic earthquakes and Long-Period events, using machine learning models and seismological techniques.
Computational approach
Seismological waveforms are recorded by multiple networks and occupy storage space on the order of terabytes per year, so processing them can be considered a big data problem that is demanding in both computation and time. Applying Deep Learning models in conjunction with the parallelisation strategies enabled by High Performance Computing architectures offers significant speed-ups over manual analysis by human operators. However, a major challenge is still the classification of seismic phases and earthquakes, which currently requires comparison tests against labelled ground-truth seismic datasets.
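One way to picture the parallelisation idea is a simple sharding step that splits the waveform files among independent workers. The sketch below is only an illustration under stated assumptions, not the project's actual implementation: the file naming scheme (`NET.STA.YYYY.DDD.mseed`), the network and station codes, and the function names `day_key` and `partition_by_day` are all hypothetical. Files are grouped by recording day so that all stations for a given day land on the same worker, which keeps the picks needed for association of that day's events together.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def day_key(path):
    """Extract (year, day-of-year) from a hypothetical file name
    of the form NET.STA.YYYY.DDD.mseed (naming scheme is an assumption)."""
    parts = PurePosixPath(path).name.split(".")
    return parts[2], parts[3]


def partition_by_day(paths, n_workers):
    """Split waveform files into n_workers shards, keeping all files
    of the same recording day in the same shard."""
    by_day = defaultdict(list)
    for p in paths:
        by_day[day_key(p)].append(p)

    shards = [[] for _ in range(n_workers)]
    # Round-robin over days in chronological order.
    for i, day in enumerate(sorted(by_day)):
        shards[i % n_workers].extend(sorted(by_day[day]))
    return shards


# Hypothetical file list: two stations, three days.
files = [
    "XX.STA1.2023.001.mseed",
    "XX.STA2.2023.001.mseed",
    "XX.STA1.2023.002.mseed",
    "XX.STA2.2023.002.mseed",
    "XX.STA1.2023.003.mseed",
]
shards = partition_by_day(files, n_workers=2)
for rank, shard in enumerate(shards):
    print(rank, shard)
```

In a real job each worker would process only its own shard, for example by selecting `shards[rank]` with a rank or task index supplied by the scheduler. Round-robin over days is the simplest choice; balancing shards by file size could be preferable when the number of active stations varies strongly from day to day.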
Andrea Carducci
Istituto Nazionale di Oceanografia e di Geofisica Sperimentale
I am an applied geologist and seismologist with a PhD in Earthquake and Environmental Hazards, and I work as a postdoctoral fellow at OGS in Trieste. In December 2024, I completed a Master's degree in High Performance Computing at the International Centre for Theoretical Physics in Trieste. My thesis focused on building a workflow for phase picking and association on CINECA's Leonardo platform. My current project applies deep learning models and unsupervised learning to the processing of seismological data.