- one folder per project = one contained unit
- definition of “contained unit” depends on your project, style, requirements
- make sure that this folder is your working directory in R/RStudio
- examples of folder structures to use [http://nicercode.github.io/blog/2013-04-05-projects]
- contains analysis script
- frequently used functions (potentially in a separate functions script file)
- if you open R/RStudio by double-clicking the script file in your file explorer, the working directory is automatically set to the script's location, and all file paths are then relative to that location.
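A minimal sketch of one possible project layout. The folder names (`data`, `R`, `output`, `figures`) are hypothetical and not prescribed by the notes above. The skeleton is built inside a temporary directory so the example can be run without touching real files:

```r
# Hypothetical project skeleton; folder names are illustrative assumptions
project_dir <- file.path(tempdir(), "my_project")
sub_dirs <- c("data", "R", "output", "figures")

# Create the project folder and its sub-folders (silent if they already exist)
for (d in sub_dirs) {
  dir.create(file.path(project_dir, d), recursive = TRUE, showWarnings = FALSE)
}

# Check where R is currently working; in a real project this should be the
# project (or script) folder so that relative paths resolve as expected
getwd()

# List the sub-folders that now make up the contained unit
created <- list.dirs(project_dir, full.names = FALSE, recursive = FALSE)
print(created)
```

Keeping everything under one top-level folder is what makes the later step of zipping and sharing the project straightforward.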
- data in .csv file
- sometimes also Excel file
- data organization [http://kbroman.org/dataorg/]
- read_csv("../data/") with tab completion inside the quotes to pick the data file to import (if the working directory is set to this folder)
- if dates are in YYYY-MM-DD format, read_csv will automatically import the column as a Date object
- data management [http://www.britishecologicalsociety.org/wp-content/uploads/Publ_Data-Management-Booklet.pdf]
- data creation
- data description = meta-data
- data processing (potentially in a separate script, in addition to the analysis file)
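As a sketch of the import step mentioned above, the following writes a small made-up .csv file and reads it back with `read_csv()` from the readr package. The file name, column names, and values are hypothetical, and a temporary folder stands in for the project's data folder:

```r
library(readr)

# Made-up data file in a temporary "data" folder, for illustration only
data_dir <- file.path(tempdir(), "data")
dir.create(data_dir, showWarnings = FALSE)
csv_path <- file.path(data_dir, "survey.csv")
writeLines(
  c("site,sampling_date,count",
    "A,2021-05-01,12",
    "B,2021-05-02,7"),
  csv_path
)

# In a real project the path would be relative to the working directory;
# read_csv() also prints a message describing the guessed column types
df_survey <- read_csv(csv_path)

# The YYYY-MM-DD column is parsed as a Date without any extra arguments
class(df_survey$sampling_date)
```

Dates stored in other formats are usually imported as text and would need an explicit column specification or a conversion step in the script.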
- report(s) to share with collaborators/advisors
- guidelines for archiving data and code [http://ecoevoevoeco.blogspot.com/2021/12/guidelines-for-archiving-data-and-code.html]
- important figures
- sharing with collaborators/advisors
- publication figures
- copy-paste from console
- model output
- p-values
- script file = where everything happens
- plain text => readable by anybody
- easy to save
- easy to repeat
- easy to document
- try to avoid going back to a spreadsheet to process the data
- one exception: if errors are detected during data processing
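To illustrate keeping data processing in the script rather than in the spreadsheet, here is a minimal sketch using a made-up data frame (column names and values are hypothetical). The raw import is left untouched and every processing step is recorded as a line of code, so it can be repeated and documented:

```r
# Made-up raw data standing in for an imported .csv file
df_raw <- data.frame(
  site   = c("A", "a", "B"),
  mass_g = c("12.1", "11.8", "n/a"),
  stringsAsFactors = FALSE
)

# Work on a copy so the raw import is never modified
df_clean <- df_raw

# Standardize the case of the site labels so "a" and "A" form one group
df_clean$site <- toupper(df_clean$site)

# Convert mass to numeric; the text entry "n/a" becomes NA
df_clean$mass_g <- suppressWarnings(as.numeric(df_clean$mass_g))

print(df_clean)
```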
- e.g. Google's R style guide [https://google.github.io/styleguide/Rguide.xml]; some points from that guide:
- applying a style guide automatically [https://www.tidyverse.org/blog/2017/12/styler-1.0.0/]
- spacing
- commenting [https://stackoverflow.blog/2021/12/23/best-practices-for-writing-code-comments/]
- Rule 1: Comments should not duplicate code, focus on the why, not the what
- Rule 2: Good comments do not excuse unclear code
- Coding explanations (#, often after the code, but not exclusively)
- Code organization, see examples below (Rule 4) (## XXXXX -----)
- Justification for a section of code, including links (Rules 6 and 7) ## XXX
- Dead-end analyses that did not work, or lines of inquiry that were not pursued; leave them in as a trace, to potentially solve the issue later or to avoid making the same mistake in the future (Rule 8) # (>_<)
- Solutions/results/interpretations (Rule 7) (#==> XXX)
- Reference to manuscript pieces, figures, results, tables, ... # (*_*)
- TODO items (Rule 9) #TODO
- names for data frames (df_name), for lists (ls_name), for vectors (vc_name) (Thanks Jacqueline May)
- attach: avoid using it
- use common sense and BE CONSISTENT
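A short sketch of how the naming prefixes and the advice to avoid `attach()` could look in practice. The data and all object names are made up for illustration, and `with()` is shown as one possible alternative to `attach()`, not as the only one:

```r
# Made-up example data; all object names below are hypothetical
df_plants <- data.frame(
  treatment = c("ctrl", "warm", "ctrl", "warm"),
  height_cm = c(10.2, 12.5, 9.8, 13.1)
)

# Prefixes signal the object type: df_ (data frame), vc_ (vector), ls_ (list)
vc_heights <- df_plants$height_cm
ls_summary <- list(n = length(vc_heights), mean = mean(vc_heights))

# with() gives access to the columns without attach(), so nothing is left
# on the search path that could mask other objects later in the session
vc_means <- with(df_plants, tapply(height_cm, treatment, mean))
print(vc_means)

#TODO check whether the height measurements need a unit conversion
```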
- 4 standard windows
- script file = all the necessary code
- should be able to run line-by-line to repeat whole analysis
- console = output
- try code here, but once you have a final solution
- copy-paste it to the script file
- or use the history window
- script file = all the necessary code
- tab completion => you can use descriptive (longer) file names
- soft-wrap R source files (preferences > Code editing) => no need to scroll left-right
# Heading 1 --------
## Subheading 1.1 ----------
### subsub heading 1.1.1 -------------
- any comment line with at least 4 trailing dashes (-), equal signs (=), or pound signs (#) creates a code section
- you can add more trailing dashes to help subdivide your code visually
- I start the headings with different numbers of # because this will indent the subheadings, and if you convert the script to some Markdown version, that will make the transition easier (Thank you Brent for that tip).
- you can fold the code to hide lines that you are not working on
- to navigate between code sections, use the “Jump To” menu available at the bottom of the editor
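A minimal sketch of a script divided into foldable code sections using comment lines with trailing dashes. Marking the heading levels with one, two, or three `#` is an assumption about the intended style, and the section titles and toy computation are hypothetical:

```r
# Data --------
## Create example data ----------
vc_counts <- c(4, 8, 15, 16, 23, 42)

# Analysis --------
## Summary statistics ----------
### Mean and range -------------
count_mean  <- mean(vc_counts)
count_range <- range(vc_counts)

# Output --------
print(count_mean)
print(count_range)
```

Opened in RStudio, each of these comment lines should appear as an entry in the section navigation menu and can be folded independently.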
- create zip file of your project folder
- send it to somebody else
- a naive but intelligent observer should be able to repeat the whole analysis
- understand every step of the way
The best introduction to data analysis with R. This free, online, continuously updated book makes all our jobs of guiding people through their first (or 6th) steps using R easier. Very highly recommended!
https://www.rstudio.com/resources/cheatsheets/
And you can sign up to receive updates to these sheets!
http://www.britishecologicalsociety.org/wp-content/uploads/2017/12/guide-to-reproducible-code.pdf