Code and lecture materials for the "Data Science for Physicists" short course held at the 2025 Joint March and April Meetings

mxliu6/DataScienceForPhysicists2025

 
 

Short Course on Data Science for Physicists

Code and lecture materials for the "Data Science for Physicists" short course held at the 2025 APS Global Physics Summit.

This short course is co-sponsored by the Topical Group on Data Science (GDS), the Division of Computational Physics (DCOMP), the Division of Soft Matter (DSOFT), the Division of Particles and Fields (DPF), and the Division of Biological Physics (DBIO).

The schedule for the short course is available here.

Course Description

Data science is playing an ever-increasing role in physics. In this two-day tutorial, we will introduce data science as it applies to a variety of fields in physics.

The first day of the course is an introduction to the fields of data science and machine learning (ML) as they apply to physics data. We will then provide an introduction to machine learning, including both regression and classification algorithms. This session will explain why neural networks work and describe the practical steps needed to train a model, such as feature engineering, hyperparameter tuning, and validation. We will conclude the first day of the tutorial with an introduction to unsupervised learning techniques (including clustering and random forests), as well as a session that will introduce both neural networks (NNs) and convolutional networks (CNNs).

The second day of this course will provide sessions on advanced topics in data science and machine learning. The first three sessions will cover graph neural networks (GNNs) and large language models (LLMs), introducing the topics and then focusing on their applications to physics.

The final four sessions of the tutorial will cover a range of applications of both machine learning and data science. The session “Assessing Training Data: Material Data APIs” will cover accessing large, online databases of materials data to use as training data for machine learning algorithms. The session “Introduction to neural-network quantum states (NQS)” aims to provide a clear understanding of NQS and their broader applications in quantum many-body physics by introducing the theoretical and computational background necessary for constructing NQS, focusing on the quantum harmonic oscillator. The third session of the afternoon, “Using Data Science to Understand Complexity in Soft Matter Systems”, will discuss recent applications of data science and machine learning to understanding the complexity in soft matter systems. Finally, the session “Applications of Machine Learning to Biology” will focus on using AI to build “mechanistic foundation models” capable of physics simulations of the brain and the body of the fruit fly.

Topics

  • Data visualization and exploratory data analysis
  • Regression and classification models
  • Unsupervised machine learning
  • Neural networks
  • Convolutional neural networks
  • Graph neural networks
  • Large language models
  • Databases and APIs

Speakers

Languages

  • Jupyter Notebook 99.9%
  • Python 0.1%