Tutorial on seismic signal/noise classification; from linear to deep classifiers
This Jupyter notebook tutorial is meant as a general introduction to machine and deep learning. We use seismic time series data from (i) real earthquakes and (ii) nuisance signals to train a suite of supervised Keras classifiers to discriminate between the two signal classes. We start from linear classifiers and gradually increase their complexity to demonstrate to what extent deep convnet classifiers outperform shallower and linear ones. We also explore how to evaluate binary classifiers and how much data we actually need to train deep classifiers.
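To give a flavor of the starting point, here is a minimal sketch of a linear classifier (logistic regression trained with plain gradient descent) that separates "signal" from "noise" traces. Everything in it is an illustrative assumption: the data are synthetic stand-ins (a decaying sinusoid buried in Gaussian noise), not the seismic data shipped with this repository, and the function and variable names are hypothetical. The notebook itself builds its classifiers with Keras.

```python
import numpy as np

rng = np.random.RandomState(0)  # fixed seed so the run is reproducible

# Synthetic stand-in data (NOT the tutorial's seismic data set).
n_samples, n_points = 400, 64
t = np.arange(n_points)
template = np.exp(-t / 16.0) * np.sin(2 * np.pi * t / 8.0)  # toy "event" shape

y = np.arange(n_samples) % 2                 # alternating labels: 1 = event, 0 = noise
X = rng.normal(size=(n_samples, n_points))   # unit-variance background noise
X += 3.0 * y[:, None] * template             # add the event shape to class-1 traces

# Hold out the last 100 traces for testing.
X_train, y_train = X[:300], y[:300]
X_test, y_test = X[300:], y[300:]

# Center the features using statistics from the training set only.
mean = X_train.mean(axis=0)
X_train = X_train - mean
X_test = X_test - mean


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_logistic_regression(X, y, learning_rate=0.1, n_steps=300):
    """Fit weights w and bias b by gradient descent on the mean cross-entropy loss."""
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(n_steps):
        p = sigmoid(X @ w + b)      # predicted probability of class 1
        grad_w = X.T @ (p - y) / n  # gradient of the loss w.r.t. w
        grad_b = np.mean(p - y)     # gradient of the loss w.r.t. b
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


w, b = train_logistic_regression(X_train, y_train)

# Classify a trace as an event when its predicted probability exceeds 0.5.
y_pred = (sigmoid(X_test @ w + b) > 0.5).astype(int)
accuracy = np.mean(y_pred == y_test)
print("test accuracy:", accuracy)
```

A linear model works on this toy problem because every synthetic event has the same shape and onset time. Real seismic signals are far more variable, which is one reason to move on to deeper classifiers.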
No prior knowledge of seismology or machine learning is required; much of the tutorial builds on concepts from undergraduate-level applied mathematics (calculus, linear algebra, optimization). No GPUs or other special hardware are required; your laptop should work just fine. The repository contains training and testing data set files that together are ~100 MB in size, so downloading may take a minute or two.
I recommend you use the Anaconda Python distribution to set up a working environment with TensorFlow (I used version 1.5.0) and Keras (2.2.4). If you are using Unix and have installed conda, you can set everything up by typing the following command in the terminal:
$ conda create -c conda-forge -n seismoDL101 python=3.6 \
    jupyter numpy scipy obspy keras tensorflow scikit-learn \
    seaborn pandas h5py
Then activate the environment (type conda activate seismoDL101 in the terminal) and open the notebook (type jupyter notebook in the terminal), and you should be ready to ... explore machine and deep learning!
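If you want to confirm that the environment is complete before opening the notebook, a short check like the following may help. It is only a sketch: the function name check_environment is hypothetical, and the package list simply mirrors the conda command above (note that scikit-learn is imported under the name sklearn).

```python
import importlib

# Import names for the packages listed in the conda command above.
PACKAGES = [
    "numpy", "scipy", "obspy", "keras", "tensorflow",
    "sklearn", "seaborn", "pandas", "h5py",
]


def check_environment(packages=PACKAGES):
    """Return a dict mapping each package name to its version string,
    to "unknown" if it imports but exposes no __version__,
    or to None if it cannot be imported."""
    report = {}
    for name in packages:
        try:
            module = importlib.import_module(name)
        except Exception:
            # Any failure to import is treated as a missing or broken install.
            report[name] = None
        else:
            report[name] = getattr(module, "__version__", "unknown")
    return report
```

Running print(check_environment()) inside the activated environment lists the installed versions. Any entry that shows None points to a package that still needs to be installed or repaired.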
I hope you enjoy the tutorial! For comments and questions, please email [email protected]. Last update: April 29, 2019; v1.0