class: center, middle, gray-background

CodeRefinery logo

The CodeRefinery project for training in research software engineering

Johan Hellsvik, PDC center for high performance computing, KTH Royal Institute of Technology, Sweden

Talk at LiU Open & Reproducible Research Group meeting – May 2024


Team and project: coderefinery.org

.left-column50[

What we are

  • A hub for FAIR research software practices
  • Since 2016, now phase 3 until 2025
  • Currently funded by NeIC
  • Training network
  • Community

What we do

  • We teach and co-organize
  • Share lessons, video recordings, manuals
  • All open source ]

.right-column50[ Pyramid image with Carpentries as base, in the middle CodeRefinery providing expert training, and on top: specialist training ]


.center[ 6 helpful steps for reproducible research: file organization, naming, documentation, version control, stabilizing computing environment, publishing research outputs ]

.cite[Heidi Seibold, CC-BY 4.0, https://twitter.com/HeidiBaya/status/1579385587865649153]

Similar projects: UNIVERSE-HPC, DIGITAL RESEARCH ACADEMY, INTERSECT, and probably many more ...


.left-column50[

  • Introduction to version control: Git and GitHub for own projects
  • Collaborative version control: Branching, pull/merge requests, forks, and collaboration.
  • Reproducible research: Reproducible dependencies, environments, and computational steps.
  • Social coding and open software: Software and data licensing and software citation.
  • How to document your research software
  • Reusable and reproducible Jupyter notebooks ]

.right-column50[

  • Automated testing: Motivation, test design, and tools.
  • Modular code development: Organizing projects as they grow from one screen-full to larger.

Tested in 9 online and 28 in-person workshops ]


Relation to research software engineering

.center[ From researcher to researcher who codes to CodeRefinery to Research Software Engineer ]


Lessons

We use Sphinx/sphinx-lesson to build our lessons from Markdown.

.center[ Screenshot of a lesson in Sphinx format, showing tabs for different programming languages ]


Another example: Git lesson

.center[ Screenshot of the Git lesson in Sphinx format offering 4 tabs for different paths ]

You can try our lesson template


How to participate as a learner

.center[ Learner participation modes ]


  • Sent out to workshop participants from 2022 and 2023
  • 129 answers

Plot estimating time saving


Plot about whether code is more reusable

Plot about whether collaboration is easier


Plot about whether colleagues have been introduced

How likely are you to recommend?


Collaboration across funding borders

Air traffic control tower

Streaming setup during Python for Scientific Computing

0.9 FTE (2 persons) + 10 persons in-kind + volunteers

logo: Aalto Scientific Computing

logo: CSC - IT Center for Science

logo: Center for Humanities Computing

logo: Danish e-Infrastructure Consortium

logo: EuroCC National Competence Center Sweden (ENCCS)

logo: National Academic Infrastructure for Super­computing in Sweden (NAISS)

logo: NRIS/Sigma2


Co-advertise and co-organize with us

TU Delft logo

The Netherlands eScience Center logo

VU Amsterdam logo


Connection to high-performance computing


What we have learned

About motivating/teaching

  • Teaching isn't a lecture anymore. It's more .emph[like a live TV production], which can be as interactive as people in a room.

  • .emph[Co-teaching] is a great way to onboard, get better quality, and reduce stress

  • .emph[Good enough practices] are better than perfect practices that are never applied

  • Instead of "good for others": ".emph[good for your future you] and, as a side effect, good for others"


What we have learned

About scaling

  • .emph["bring your own classroom"] seems to be a way to scale

  • .emph[Installation instructions and on-boarding] become more important

  • We don't "see" classrooms -> .emph[feedback mechanism] in Q&A doc

  • Make exercises longer to .emph[give classrooms the chance to interact]


Collaborative document: Markdown

  • Interactive, anonymous, parallel, async
  • New question every 1-2 minutes!
  • ASCII-graph feedback

Screenshot of exercise title and questions in collaborative notes

We publish Q&A for each workshop: Example


.left-column50[

Future: Community project

  • .emph[Communicate value] for volunteers and organizations

  • Research groups send their students to us instead of creating isolated material

  • .emph[More collaboration] with similar projects ("helper exchange program")

  • Governance is .emph[community-driven] ]

.right-column50[

Teaching format

  • Continue .emph[large-scale workshops]

  • Support .emph[local events]

  • More asynchronous content coupled with online events (".emph[flipped classroom] approach") ]


How you or your organization can participate

.center[ Graphic summarizing how organizations can participate: by advertising, by sending observers, by organizing local teams, or through in-kind support ]

  • Join our next workshop in autumn 2024; follow our newsletter to get involved
  • Tell your students and researchers about it
  • Send one or more exercise teams or join as observer
  • Use our material and give feedback

What is in it for you?

  • .emph[Joining is easier than organizing]: It is easier to bring 10% to an event than to organize the 100% yourself

  • .emph[Material exchange]: let's not reinvent the wheel

  • .emph[Train-the-trainer]: we can help you to get started

  • .emph[Community as test-bed]: let's try out new ideas together


We try to make it easier to join

.left-column60[

.right-column40[ Screenshot of a weekly team meeting summary posted on Mastodon ]


Research software engineering for computational materials science

  • How can skills taught and learned in CodeRefinery workshops be put to use in the field of computational materials science?

  • The National Academic Infrastructure for Supercomputing in Sweden (NAISS) serves users of high-performance computing resources at higher education institutions in Sweden.

  • The PDC center for high performance computing, KTH, is part of NAISS. PDC operates the Dardel HPE Cray EX supercomputer, equipped with AMD CPUs and GPUs.

  • Staff at PDC collaborate with staff at other NAISS centers and at the LUMI supercomputer. Expertise on a specific application program at one center can be shared between centers.


Computational materials science codes available on Dardel (selected)

  • The Relativistic Spin Polarized toolkit (RSPt), a code for electronic structure calculations based on the Full-Potential Linear Muffin-Tin Orbital (FP-LMTO) method.

  • The Quantum ESPRESSO integrated suite of open-source computer codes for electronic-structure calculations and materials modeling at the nanoscale.

  • CP2K, a program to perform atomistic and molecular simulations of solid state, liquid, molecular, and biological systems.


Building and testing programs

  • To install and maintain (new versions of) materials theory codes on Dardel, we mainly use the EasyBuild system.

  • A program that has been EasyBuilt and installed on Dardel can (often) be straightforwardly ported to a build for LUMI.

  • Vice versa, a build on LUMI can be ported for Dardel. The easyconfig build configuration for Elk on Dardel has been ported to LUMI.

  • Example: For the Elk electronic structure code, compare

  • To build Elk 9.2.12 on Dardel under CPE 23.03

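    # Load the PDC 23.03 software stack and the EasyBuild 4.8.2 user module
    ml PDC/23.03 easybuild-user/4.8.2
    # Build Elk from its easyconfig; --robot also resolves and builds missing dependencies
    eb elk-9.2.12-cpeGNU-23.03.eb --robot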
    

Reference page: Installing software using EasyBuild
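
The easyconfig file name in the build command above follows EasyBuild's naming convention, `<name>-<version>-<toolchain name>-<toolchain version>.eb`. As a minimal sketch, the shell function below splits such a name into its parts. `describe_easyconfig` is a hypothetical helper written for this illustration, not part of EasyBuild, and it assumes a file name without a version suffix and without hyphens in the software name or version.

```shell
# Hypothetical helper (not part of EasyBuild): split an easyconfig file name
#   <name>-<version>-<toolchain name>-<toolchain version>.eb
# into its parts. Assumes no version suffix and no hyphens in name or version.
# The function body runs in a subshell, so its variables do not leak.
describe_easyconfig() (
    base=${1%.eb}                    # drop the .eb extension
    name=${base%%-*}                 # text before the first hyphen
    rest=${base#*-}
    version=${rest%%-*}              # second field
    rest=${rest#*-}
    tc_name=${rest%%-*}              # third field: toolchain name
    tc_version=${rest#*-}            # remainder: toolchain version
    printf 'software=%s version=%s toolchain=%s/%s\n' \
        "$name" "$version" "$tc_name" "$tc_version"
)

describe_easyconfig elk-9.2.12-cpeGNU-23.03.eb
# prints: software=elk version=9.2.12 toolchain=cpeGNU/23.03
```

Read this way, the file name makes visible which part of a build is tied to the system: when porting between Dardel and LUMI, the toolchain part is what may need to change, while software name and version can stay the same.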


Development of teaching material for application programs


Porting programs to GPU architectures

  • Nvidia's GPUs and the CUDA framework have been the dominant paradigm for GPU computing for ~15 years.

  • For AMD GPUs, the corresponding framework is the Heterogeneous-computing Interface for Portability (HIP).

  • Many application programs have yet to be ported to GPU architectures

    • Hang on to existing code base, or start from scratch?
    • How to develop and maintain code for different backends (Nvidia, AMD, Intel, etc.)?
    • Porting projects require a team with expertise in the domain field, general research software engineering, and selected GPU coding paradigms.
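
To give a feeling for the "hang on to existing code base" option: many HIP runtime calls mirror their CUDA counterparts by name. The sketch below is a toy illustration only. It uses `sed` to rename a handful of CUDA runtime identifiers in a made-up snippet; `cuda_to_hip` is a hypothetical helper, and the variable names in the snippet are invented. Real ports would rely on dedicated tools such as AMD's HIPIFY, followed by testing and tuning by the porting team.

```shell
# Toy illustration only (hypothetical helper, not a porting tool):
# rename the CUDA runtime header and identifiers of the form cudaXxx
# to their HIP counterparts. Reads source text on stdin.
cuda_to_hip() {
    sed -e 's|cuda_runtime\.h|hip/hip_runtime.h|g' \
        -e 's/cuda\([A-Z]\)/hip\1/g'
}

# Made-up CUDA fragment, passed through the helper
cuda_to_hip <<'EOF'
#include <cuda_runtime.h>
cudaMalloc(&d_x, n * sizeof(double));
cudaMemcpy(d_x, x, n * sizeof(double), cudaMemcpyHostToDevice);
cudaFree(d_x);
EOF
```

The renamed calls (`hipMalloc`, `hipMemcpy`, `hipMemcpyHostToDevice`, `hipFree`) are the HIP equivalents of the CUDA originals. Name-level translation is only the easy part; performance tuning and maintaining several backends are where the questions above come in.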

class: center, middle, inverse

.center[ Nordic-RSE name with map of Nordics and Baltics ]

Nordic RSE Conference

May 30 - 31, at Aalto University campus in Otaniemi, Espoo, Finland

https://nordic-rse.org/events/2024-in-person-conference/


Thank you for your attention!

Credits and license

Text

  • All text: CodeRefinery project, CC-BY 4.0

Images

  • Slide 3: H. Seibold, "6 helpful steps for reproducible research", CC-BY 4.0
  • Slides 5, 8, 18: S. Wittke
  • Slide 12: ATC tower, P. R. Miller, CC-BY 2.0
  • Slide 12: Monitor setup, R. Darst
  • Slide 12: Logos, (c) respective organizations
  • All other images: CodeRefinery project, CC-BY 4.0