nelsonroque/contextlab_reproduciblescience_workshops

Introduction to Reproducible Science: A 3-day Summer Workshop

  • Author: Nelson Roque, PhD
  • [email protected]
  • Director of the Context Lab at University of Central Florida

Intentions of this Book and Web Course

  • To train the next generation of scientists to work with data, regardless of its type, format, or volume.
  • To make available a set of open-source materials for learning how to practice reproducible science using code-based techniques.
  • To house sample workflows and code snippets that support research and data science activities.

Background

A reproducibility crisis (Ioannidis, 2005; Open Science Collaboration, 2015) has emerged as a threat to the scientific enterprise. Over the last decade, I have pursued learning opportunities to become proficient in data wrangling and modeling of text, image, video, and eye-tracking data and, more recently, sensor data. I look forward to training the next generation of scientists in code-based methods they can apply in their own research.

  • Ioannidis, John P. A. 2005. “Why Most Published Research Findings Are False.” PLoS Medicine 2 (8): e124. doi:10.1371/journal.pmed.0020124.
  • Open Science Collaboration. 2015. “Estimating the Reproducibility of Psychological Science.” Science 349 (6251): aac4716–aac4716. doi:10.1126/science.aac4716.

Workshop Format

Before the Workshop

  1. Install R - Download R
  2. Install RStudio - Download RStudio
  3. Install packages for various analyses
```
# Install the packages used across the workshop (run once, in the R console)
install.packages(c('tidyverse', 'devtools', 'readr', 'tidytext', 'textdata',
                   'topicmodels', 'wordcloud', 'ggwordcloud'))
```
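
To confirm that the installation worked before the workshop, a check along the following lines may help. This is a minimal sketch using only base R; the variable names (`pkgs`, `installed`, `missing_pkgs`) are illustrative and not part of the workshop materials.

```r
# Packages named in the install step above
pkgs <- c("tidyverse", "devtools", "readr", "tidytext", "textdata",
          "topicmodels", "wordcloud", "ggwordcloud")

# requireNamespace() returns TRUE or FALSE without attaching the package
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# Names of any packages that still need install.packages()
missing_pkgs <- names(installed)[!installed]

if (length(missing_pkgs) > 0) {
  message("Still missing: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All workshop packages are installed.")
}
```

If any names are reported as missing, re-running `install.packages()` for just those packages is usually enough.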

Learning Objectives

  • Describe tools and techniques that support open and reproducible science.
  • List and describe the FAIR Principles (https://www.go-fair.org/fair-principles)
  • Develop a code-only pipeline to allow reproducibility of data prep and analyses.
  • Develop a long-term learning plan for practicing reproducible science tools and techniques.
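
As a rough illustration of what a code-only pipeline can look like, the sketch below reads a raw file, cleans it, summarizes it, and writes the result, with every step expressed in code. The data, column names (`participant`, `rt_ms`), and file paths are hypothetical and exist only for this example; it assumes the `dplyr` package (installed with `tidyverse`).

```r
library(dplyr)

# Hypothetical raw data, written to a temporary file to stand in for a real export
raw_path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(
    participant = c("p1", "p1", "p2", "p2", "p3"),
    rt_ms       = c(512, 498, NA, 640, 575)
  ),
  raw_path,
  row.names = FALSE
)

# Step 1: read the raw file (the raw file itself is never edited by hand)
raw <- read.csv(raw_path)

# Step 2: clean in code, so every exclusion is recorded in the script
clean <- raw %>%
  filter(!is.na(rt_ms))

# Step 3: summarize per participant
summary_tbl <- clean %>%
  group_by(participant) %>%
  summarise(mean_rt_ms = mean(rt_ms), n_trials = n(), .groups = "drop")

# Step 4: write the derived table to a separate output file
out_path <- tempfile(fileext = ".csv")
write.csv(summary_tbl, out_path, row.names = FALSE)

print(summary_tbl)
```

Because no step relies on manual edits, re-running the script on the same raw file should reproduce the same summary table.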

Workshop Schedule

  • Day 1 | July 6, 2022
    • What is Reproducible Science?
    • Reproducible & FAIR Data Workflows
    • Tools Supporting Reproducible Science
    • Overview of available tools
      • Skill 1: Using EndNote for Reference Management
      • Skill 2: Using Git (and GitHub) for code management and collaboration
    • Orientation to R, RStudio, R Markdown
      • Skill 3: R syntax primer
    • Data Science: Latest trends
    • Long-term Learning Recommendations
  • Day 2 | July 8, 2022
    • Data wrangling and visualization of Big Data
      • Skill 1: Data wrangling the Google Mobility dataset
    • Reproducible survey research
      • Qualtrics survey design tips
      • Skill 2: Data wrangling Qualtrics data
    • Working with JSON data
      • Skill 3: Cleaning and visualizing keystroke JSON data
  • Day 3 | July 11, 2022
    • Text mining
      • Skill 1: Word and bigram frequency analysis
      • Skill 2: Generating wordclouds
      • Skill 3: Sentiment analysis
    • Interacting with APIs and JSON data
      • Skill 4: Querying an API for results and aggregating data
    • Closing Discussion & Q/A
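
As a small preview of the Day 3 word and bigram frequency skill, the sketch below counts words and bigrams in two made-up sentences. It assumes the `dplyr` and `tidytext` packages from the install step; the example text and object names (`docs`, `word_counts`, `bigram_counts`) are hypothetical and not taken from the workshop datasets.

```r
library(dplyr)
library(tidytext)

# Two hypothetical documents, one row per document
docs <- tibble(
  doc_id = c(1, 2),
  text   = c("Reproducible science needs reproducible code.",
             "Open data and open code support open science.")
)

# Word frequency: one row per word, lowercased, punctuation stripped
word_counts <- docs %>%
  unnest_tokens(word, text) %>%
  count(word, sort = TRUE)

# Bigram frequency: one row per pair of adjacent words within a document
bigram_counts <- docs %>%
  unnest_tokens(bigram, text, token = "ngrams", n = 2) %>%
  count(bigram, sort = TRUE)

print(word_counts)
print(bigram_counts)
```

In a real analysis, common function words would typically be removed before counting, for example with `anti_join(stop_words, by = "word")`.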

Submit your questions

Do you have any questions about the workshop or related content? Submit your questions here

Resources

Books

Cheatsheets

Visualization

Stats

Blogs

Interactive Learning Tools
