From d9a1bb9ec842f772abd25c54a3667ce26805f167 Mon Sep 17 00:00:00 2001 From: Andrew Spriggs Date: Mon, 29 Jan 2024 17:00:07 +1100 Subject: [PATCH 1/2] Minor changes to chapters 4-7 ahead of Data School cohort 1 --- episodes/04-collaboration.md | 2 +- episodes/05-project_organization.md | 26 +++++++++++++++++++------- episodes/06-track_changes.md | 13 ++++--------- episodes/07-manuscripts.md | 14 +++----------- 4 files changed, 27 insertions(+), 28 deletions(-) diff --git a/episodes/04-collaboration.md b/episodes/04-collaboration.md index 4dd5aaf3..e9041d8c 100644 --- a/episodes/04-collaboration.md +++ b/episodes/04-collaboration.md @@ -198,7 +198,7 @@ tasks in various ways. ![](fig/ms-tasks-list-view.png){alt="An example of Teams Tasks list view"} - [Jira](https://jira.csiro.au/) is another tool supported and deployed in CSIRO. Developed by -Australian software company [Atlassian](https://www.atlassian.com/software/jira, it allows +Australian software company [Atlassian](https://www.atlassian.com/software/jira), it allows tracking of to-do tasks/issues and sub-tasks, lets you assign tasks to people, and lets you track and view tasks in the context of worflows, timelines, and "board" visualisations, such as the "Kanban board". Jira can directly integrate with both BitBucket diff --git a/episodes/05-project_organization.md b/episodes/05-project_organization.md index 6689ba80..44b6c54b 100644 --- a/episodes/05-project_organization.md +++ b/episodes/05-project_organization.md @@ -113,14 +113,14 @@ files that perform the core analysis of the research, such as data cleaning or statistical analyses. These files can be thought of as the "scientific guts" of the project. 
-The second type of file in `src` is controller or driver scripts -that contains all the analysis steps for the entire project +Another type of file that might go in `src` is controller/driver/workflow scripts +that contain all the analysis steps of a project from start to finish, with particular parameters and data input/output commands. A controller script for a simple project, for example, may read a raw data table, import and apply several cleanup and analysis functions from the other files in this directory, and create and save a numeric result. For a small project with one main -output, a single controller script should be placed in the main +output, a single controller script could be placed in the main `src` directory and distinguished clearly by a name such as "runall". The short example below is typical of scripts of this kind; note how it uses one variable, `TEMP_DIR`, to @@ -140,9 +140,21 @@ avoid repeating the name of a particular directory four times. rm -rf $(TEMP_DIR) ``` +::::::::::::::::::::::::::::::::::::::::: callout + +**Important note:** Don't place information specific to yourself or your own +computer/system in these types of files, especially if they are being Git-tracked. Use +relative paths instead of full paths where possible (e.g. input as `../data/` rather +than `/home/xyz123/project/data`). Don't include any passwords or keys. +If personal or system-specific information is required for your workflow, then make +use of locally set environment variables and/or git-ignored files and then document +how to set up these inputs again for anyone (or your future self) re-using your work. + +:::::::::::::::::::::::::::::::::::::::::::::::::: + ## Put compiled programs in the `bin` directory -`bin` contains +A directory named `bin` is usually used to contain executable programs compiled from code in the `src` directory. Projects that do not have any will not require `bin`. 
@@ -193,9 +205,9 @@ simple project might be organized following these recommendations: ``` . - |-- CITATION - |-- README - |-- LICENSE + |-- CITATION.cff + |-- README.md + |-- LICENSE.md |-- requirements.txt |-- data | -- birds_count_table.csv diff --git a/episodes/06-track_changes.md b/episodes/06-track_changes.md index 1e872cf3..665508ec 100644 --- a/episodes/06-track_changes.md +++ b/episodes/06-track_changes.md @@ -217,12 +217,7 @@ approach—the one we use in our own projects–don't just accelerate the manual process: they also automate some steps while enforcing others, and thereby require less self-discipline for more reliable results. -1. ***Use a version control - system***, to manage changes to a - project. - -Box 2 briefly explains how version control systems work. It's hard to -know what version control tool is most widely used in research today, +It's hard to know what version control tool is most widely used in research today, but the one that's most talked about is undoubtedly Git. This is largely because of GitHub, a popular hosting site that combines the technical infrastructure for collaboration via Git with a modern web interface. GitHub is free for public and open source projects @@ -231,11 +226,11 @@ GitLab is a well-regarded alternative that some prefer, because the GitLab platform itself is free and open source. Bitbucket provides free hosting for both Git and Mercurial repositories, but does not have nearly as -many scientific users. +many scientific users. CSIRO hosts its own instance of Bitbucket for employee use. ::::::::::::::::::::::::::::::::::::::::: callout -## Box 2: How Version Control Systems Work +## How Version Control Systems Work A version control system stores snapshots of a project's files in a repository. Users modify their working copy of the project, and then @@ -244,7 +239,7 @@ and/or share their work with colleagues. 
The version control system automatically records when the change was made and by whom along with the changes themselves. -Crucially, if several people have edited files simultaneously, the +Crucially for collaboration, if several people have edited files simultaneously, the version control system will detect the collision and require them to resolve any conflicts before recording the changes. Modern version control systems also allow repositories to be synchronized with each diff --git a/episodes/07-manuscripts.md b/episodes/07-manuscripts.md index d4d1afb7..3e396ed0 100644 --- a/episodes/07-manuscripts.md +++ b/episodes/07-manuscripts.md @@ -117,15 +117,9 @@ Our first alternative will already be familiar to many researchers: With the document online, everyone's changes are in one place, and hence don't need to be merged manually. -We realize that in many cases, even this solution is asking too much -from collaborators who see no reason to move forward from desktop GUI -tools. To satisfy them, the manuscript can be converted to a desktop -editor file format (e.g., Microsoft Word `.docx` or LibreOffice -`.odt`) after major changes, then downloaded and saved in the `doc` -folder. Unfortunately, this means merging some changes and suggestions -manually, as existing tools cannot always do this automatically when -switching from a desktop file format to text and back (although -[Pandoc](https://pandoc.org/) can go a long way). +This is easy under our current Microsoft Office organisational setup, +where Word documents (and others) may be converted to shared online +documents automatically when sharing through Outlook or Teams. ## Text-based Documents Under Version Control @@ -193,8 +187,6 @@ In groups, discuss: ## Getting started writing text-based version control -[Version Control with Git](https://swcarpentry.github.io/git-novice/) Carpentries lesson introduces text-based version control, that you could use for a collaborative manuscript. 
- [Manubot](https://manubot.org) is an open-source system for writing scholarly manuscripts via GitHub, with tutorials. From 56301124b9b9494b502b45466469066d916dd619 Mon Sep 17 00:00:00 2001 From: Andrew Spriggs Date: Mon, 29 Jan 2024 22:13:55 +1100 Subject: [PATCH 2/2] Added a page on the Agile methodology --- config.yaml | 1 + episodes/09-agile.md | 46 ++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 47 insertions(+) create mode 100644 episodes/09-agile.md diff --git a/config.yaml b/config.yaml index ac770084..f70a2307 100644 --- a/config.yaml +++ b/config.yaml @@ -53,6 +53,7 @@ episodes: - 06-track_changes.md - 07-manuscripts.md - 08-what_next.md +- 09-agile.md # Information for Learners learners: diff --git a/episodes/09-agile.md b/episodes/09-agile.md new file mode 100644 index 00000000..d8358130 --- /dev/null +++ b/episodes/09-agile.md @@ -0,0 +1,46 @@ +--- +title: 'Agile' +teaching: 60 +exercises: 0 +--- + +::::::::::::::::::::::::::::::::::::::: objectives + +- Learn some basic concepts of the 'Agile' methodology + +:::::::::::::::::::::::::::::::::::::::::::::::::: + +## What is 'Agile'? + +'Agile' is a project management methodology, particularly for software development, +built around a 4-point philosophical [manifesto](https://agilemanifesto.org/) +and a 12-point set of [principles](https://agilemanifesto.org/principles.html). + +Agile is typified by small teams that self-organise ('scrum') on how they will +address a backlog of requested work, in short cycles ('sprints'), by breaking +problems into small tasks, with frequent feedback and result delivery. It is a +highly iterative approach to planning that allows for high flexibility and less +forward planning.
A sprint may last 1-4 weeks, in which time an entire cycle of +planning, designing, implementing, testing and delivering takes place, with small +tasks hopefully addressed to completion, followed by a review and retrospective +that may or may not end up influencing the next sprint cycle. + +[Framework at a glance diagram](https://www.planview.com/resources/guide/agile-methodologies-a-beginners-guide/basics-benefits-agile-method/) + +[Contrast to waterfall model](https://www.guru99.com/agile-methodology-in-software-testing.html) + +[Roles and user stories](https://www.tutorialspoint.com/agile/agile_primer.htm) + +[Atlassian on scrums, Kanban and Jira visualisations](https://www.atlassian.com/agile/project-management) + + +:::::::::::::::::::::::::::::::::::::::: keypoints + +- The Agile approach is to break problems into smaller tasks and fully address them +in short, iterative work cycles (sprints), with each cycle ending in review and discussion +before planning the next cycle. +- Aspects of this approach may be useful in data science work. + +:::::::::::::::::::::::::::::::::::::::::::::::::: + +
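Note on the callout added to `episodes/05-project_organization.md` in patch 1/2: its advice to prefer relative paths and locally set environment variables could be illustrated along the following lines. This is only a minimal sketch, not part of the patch; the variable name `PROJECT_DATA_DIR` and the helper `resolve_data_dir` are hypothetical names chosen for illustration, and `../data` is the relative path used in the callout's own example.

```python
import os
from pathlib import Path


def resolve_data_dir(env_var="PROJECT_DATA_DIR", default="../data"):
    """Return the data directory without hard-coding a machine-specific path.

    If the (hypothetical) environment variable is set on this machine, use it;
    otherwise fall back to a path relative to the working directory.
    """
    return Path(os.environ.get(env_var, default))


if __name__ == "__main__":
    # Each collaborator can set PROJECT_DATA_DIR locally (for example in a
    # git-ignored file that their shell loads) without editing tracked code.
    print(f"Reading input data from: {resolve_data_dir()}")
```

A controller script that obtains its input location this way stays free of personal paths, and the README only needs to document which variable to set.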