I am an astronomer at the Max Planck Institute for Astronomy in Heidelberg. My work aims at identifying and characterising star clusters and stellar structures in the Milky Way, by applying efficient data mining and machine learning methods to large datasets. This in turn allows us to map the current structure of the Milky Way in three dimensions, and to track its evolution through time.
I have long been active in spectroscopic surveys, notably writing data processing and analysis software for large batches of observational data for the Gaia-ESO Survey, preparing target catalogues for 4MOST, and contributing to the target selection for WEAVE.
The data collected by the ESA mission Gaia is published as several huge datasets, with the main table having about 2 billion rows and over 100 columns. With efficient methods, we can identify groups of stars with common properties and travelling together through the Galaxy. The projects I have led and participated in use a variety of clustering methods, including k-means clustering, DBSCAN, HDBSCAN, Gaussian Mixture Models, and dimensionality reduction techniques such as t-SNE, PCA, and UMAP. Gaia allowed us to discover hundreds of new clusters, some in remote regions of the Milky Way, but many of them in the Solar neighbourhood!
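As a rough illustration of how a density-based method such as DBSCAN can pick out stars travelling together, here is a minimal pure-Python sketch. It is not the code used in the projects mentioned above: the proper-motion values are synthetic, and the names `region_query`, `dbscan`, `eps` and `min_samples` as well as the chosen thresholds are assumptions made for this example only.

```python
import math
import random


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i] (including i)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_samples):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_samples:
            labels[i] = NOISE  # may later be re-labelled as a border point
            continue
        cluster_id += 1
        labels[i] = cluster_id
        queue = list(neighbours)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster_id
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_samples:
                queue.extend(j_neighbours)  # j is a core point: keep expanding
    return labels


# Synthetic proper motions (mas/yr): two tight co-moving groups plus a sparse field.
rng = random.Random(0)
group_a = [(rng.gauss(-3.0, 0.05), rng.gauss(1.5, 0.05)) for _ in range(40)]
group_b = [(rng.gauss(2.0, 0.05), rng.gauss(-4.0, 0.05)) for _ in range(40)]
field = [(rng.uniform(-10, 10), rng.uniform(-10, 10)) for _ in range(100)]
stars = group_a + group_b + field

labels = dbscan(stars, eps=0.3, min_samples=8)
n_clusters = len(set(labels) - {-1})
print(f"Found {n_clusters} co-moving groups, {labels.count(-1)} field stars")
```

This brute-force version compares every pair of stars, so it only suits small samples; on a catalogue the size of Gaia one would rely on an optimised library implementation and a spatial index instead.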
The age of a stellar cluster can be estimated by looking at the distribution of its stars in a colour-magnitude diagram. A complete modelling of this distribution can provide deep insight into the cluster's properties, but can also be extremely time-consuming, potentially requiring hours for a single cluster. In order to be able to quickly process thousands of clusters at once, I have trained a Neural Network to return cluster ages, and shown that the young clusters trace a fragmented spiral pattern in the Milky Way. In another paper (Castro-Ginard et al. 2020) we used a Convolutional Neural Network to automatically classify clusters and asterisms. In this study (Cavallo et al. 2024) we make use of optical and near-infrared photometry, and show that our Multi-Layer Perceptron is also able to estimate the overall chemical content of cluster stars.
Studying star clusters also allows us to take a detailed look at stellar evolution. This HR diagram is constructed from publicly available data, using clusters for which I have established the stellar members and the main parameters:
The code to produce this figure is available in this notebook.
Click here for an overview of my scientific work in astronomy: 🔭 TristanCantatGaudin.github.io and here for a full list of the studies I have authored and co-authored.
The rest of this page lists some of my professional and hobby projects:
I am currently the lead developer of GaiaUnlimited, a Python package developed by the GaiaUnlimited collaboration for querying and constructing selection functions for the Gaia survey.
Provides daily updates on the number of citations to the Gaia collaboration papers, using the ADS API. The graph is updated every day through GitHub Actions. It is also available as an interactive Plotly figure.
A template for a simple Python package. It actually does something: applies colour gradients to strings! It works in the terminal and in Jupyter notebooks.
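The idea of a colour gradient on a string can be sketched with 24-bit ANSI escape codes. This is a generic illustration and not the package's actual code or API: the function name `gradient_text` and its parameters are hypothetical, and the output only shows colours in environments that interpret true-colour ANSI sequences.

```python
def gradient_text(text, start_rgb, end_rgb):
    """Colour each character of text, interpolating linearly from start_rgb to end_rgb."""
    n = max(len(text) - 1, 1)  # avoid dividing by zero for one-character strings
    pieces = []
    for i, char in enumerate(text):
        t = i / n  # 0.0 at the first character, 1.0 at the last
        r, g, b = (round(s + (e - s) * t) for s, e in zip(start_rgb, end_rgb))
        # ESC[38;2;R;G;Bm sets the foreground colour to a 24-bit RGB value
        pieces.append(f"\033[38;2;{r};{g};{b}m{char}")
    return "".join(pieces) + "\033[0m"  # ESC[0m resets the colour at the end


print(gradient_text("Hello, Milky Way!", (255, 80, 0), (0, 120, 255)))
```

Interpolating in RGB is the simplest choice; a gradient through another colour space would give different intermediate hues.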
Streamlit app to display yearly shot volume and efficiency for NBA players. Deployed on Streamlit cloud. Collects data from Basketball-Reference.
This repository hosts Jupyter and Marimo notebooks, with data visualisation projects related to: astronomy, weather, geospatial data (from Flickr, OpenStreetMap, satellite imagery), mathematics and statistics, stock market, sports, deep learning...