https://pycode-conference.org/
- October 14-16
- Gdansk, Poland
Sound is a rich source of information about the world around us with many applications within music and speech domains, as well as specific tasks in industry and science.
This talk will show you how to build practical models for sound classification, using Convolutional Neural Networks on audio spectrograms. Tricks for dealing with small amounts of data will also be covered, including transfer learning, audio embeddings and data augmentation.
A basic understanding of machine learning is recommended.
Jon started to program in Python in 2009. Since then he has worked as a Software Developer and Data Engineer within Embedded Systems and Software-as-a-Service projects. With a Bachelor in Electronics Engineering and a Master in Data Science, he is an expert on Machine Learning applied to the Internet of Things.
These days Jon is the CTO of Soundsensing, and also does freelance consulting on Machine Learning, audio processing and Internet of Things.
30 minutes. 25 minutes talk, 5 min questions.
Get in touch if you are interested in applying machine learning to sound! I love to discuss challenges, use cases and approaches.
Twitter. @jononor
People who can program in Python, but are not necessarily Machine Learning practitioners.
Needs to be a bit more approachable than EuroPython talk, which assumed ML practitioner experience...
Need to cut down content. 30 minutes instead of 45 minutes.
Goal:
Audience knows how to solve a basic audio classification problem
- How to design a system that solves this problem
- How to set it up with common ML framework
- Available tools, tips and tricks.
Need to
- Define the relevant ML terms.
- Ground it in a concrete example. Environmental Sound Classification?
- Give practical and actionable recommendations
Kill:
- Applications
- Details on Mel-filters, normalization
Things to skip?
- Hyperparameter tuning
- Loss functions
- Gradient descent
- Backprop
- Weak labeling
We want a system that, when hearing a sound, can categorize what it is.
How to do this? Using Machine Learning
Supervised Learning. Train with labels (targets). Validation set. Unseen during training. Used for checking generalization, hyperparameter tuning. Test set. Used for evaluating performance of final model.
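The train / validation / test split described above could be sketched roughly as follows. This is a minimal illustration using only the Python standard library. The function name `split_dataset`, the 70/15/15 proportions, the clip filenames and the labels are all assumptions made up for the example, not something prescribed by the talk.

```python
import random

def split_dataset(items, val_fraction=0.15, test_fraction=0.15, seed=0):
    """Shuffle items and split them into train, validation and test lists."""
    shuffled = list(items)
    # A fixed seed keeps the split reproducible between runs
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    n_val = round(len(shuffled) * val_fraction)
    test = shuffled[:n_test]                  # only for evaluating the final model
    val = shuffled[n_test:n_test + n_val]     # for checking generalization and tuning
    train = shuffled[n_test + n_val:]         # the only part the model is fitted on
    return train, val, test

# Hypothetical labeled clips as (filename, label) pairs
clips = [(f"clip_{i:03d}.wav", "dog_bark" if i % 2 else "siren") for i in range(100)]
train, val, test = split_dataset(clips)
print(len(train), len(val), len(test))
```

When several clips are cut from the same original recording, it is usually safer to split by recording rather than by clip, so that near-identical audio does not end up in both the training and the test set.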
Optimize a chosen metric. E.g.
Split audio stream into short time-frames. Say 1 second. Convert audio waveform to log Mel-spectrogram. Classify each such time-window.
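The windowing and log Mel-spectrogram steps above can be sketched with NumPy alone. This is an illustrative sketch, not the implementation used in the talk. The helper names, the 16 kHz sample rate, FFT size, hop length and number of mel bands are assumptions chosen for the example, and the synthetic sine tone stands in for a real recording. In practice an audio library would normally be used to compute these features.

```python
import numpy as np

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular filters spaced evenly on the mel scale, shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0, sr / 2, n_fft // 2 + 1)
    hz_points = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2))
    fb = np.zeros((n_mels, len(fft_freqs)))
    for m in range(n_mels):
        lo, center, hi = hz_points[m:m + 3]
        rising = (fft_freqs - lo) / (center - lo)
        falling = (hi - fft_freqs) / (hi - center)
        fb[m] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def log_mel_spectrogram(y, sr, n_fft=1024, hop=512, n_mels=64):
    """Return a log-scaled mel spectrogram in dB, shape (n_mels, n_frames)."""
    n_frames = 1 + (len(y) - n_fft) // hop
    window = np.hanning(n_fft)
    frames = np.stack([y[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2      # (n_frames, n_fft // 2 + 1)
    mel = mel_filterbank(sr, n_fft, n_mels) @ power.T     # (n_mels, n_frames)
    return 10.0 * np.log10(mel + 1e-10)                   # small offset avoids log(0)

def split_windows(y, sr, window_seconds=1.0):
    """Cut the waveform into non-overlapping windows, dropping any short remainder."""
    n = int(sr * window_seconds)
    return [y[i:i + n] for i in range(0, len(y) - n + 1, n)]

sr = 16000
t = np.arange(3 * sr) / sr
audio = np.sin(2 * np.pi * 440.0 * t)   # synthetic 3 second tone, a stand-in for real audio
windows = split_windows(audio, sr)
features = [log_mel_spectrogram(w, sr) for w in windows]
print(len(features), features[0].shape)
```

Each entry in `features` is a 2D array that can be treated like a single-channel image, which is what makes it a natural input for a Convolutional Neural Network that classifies one window at a time.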
- Test OpenL3 on Urbansound8k?