From 36eda543a1b026bb230cf5f16978e45a8e1d378e Mon Sep 17 00:00:00 2001
From: Ivelina Momcheva
Date: Mon, 3 Jun 2024 15:54:18 +0200
Subject: [PATCH] Added LLMs.

---
 faq/chapters/publishing/old-code.md | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/faq/chapters/publishing/old-code.md b/faq/chapters/publishing/old-code.md
index 1f7adfd..aa51c6a 100644
--- a/faq/chapters/publishing/old-code.md
+++ b/faq/chapters/publishing/old-code.md
@@ -26,6 +26,10 @@ before you put it in a public repository (see [How to publish software?](https:/
 
 At his point, it is also useful to preserve any input/output files if these were not provided by the original author. Can you install and run the code? Do you have the needed inputs? Can you generate some outputs? If you can, document all these steps in a simple readme file. Write down what you had to do to get the code running. Archive any inputs or outputs together with the code. This may already be too much effort. Timebox this work. A day of solid effort trying to do this is good. More than that and you are probably going down a rabbit hole.
 
+## Take Advantage of our Robot Overlords
+
+Large language models (LLMs) can be very helpful for understanding legacy code. You can paste a function or a snippet of code into an LLM and ask it to explain the code. Even more useful can be loading the codebase into an integrated development environment (IDE) like VSCode or PyCharm, where tools like GitHub Copilot are available as extensions. These tools can "see" the whole codebase, create docstrings and tests, and explain code. They can also be extremely helpful for the steps in the next section.
+
 If you are trying to replicate results from existing work, this may be where you stop. This could also be the endpoint if you are trying to run the code on new data. More likely than not, however, you will have to make some changes to the code to accommodate new data (e.g., new file formats, new calibrations, etc.). This leads us to...
 
 ## Now What?
@@ -58,6 +62,7 @@ DO NOT go line by line and transpose the code. Never. Ever.
 - Many legacy codes will have their implementations of common functions (e.g., a Gaussian). Use common libraries to replace these.
 - Just because the legacy code made some questionable decisions about structure, hard-coding paths and looping over matrixes, it doesn't mean you should make the same mistakes.
 - Talk to colleagues on how to re-implement complex functionality such as computationally intensive pieces, interactive graphics, etc.
+- You can also leverage LLMs to translate from one programming language to another (as in [this example](https://youtu.be/jXu_-edmMzc?si=6IdWK4YXA9ASelTu) of converting from IDL to Python).
 
 ## Publish your code.
 