- Thursday 28th November 2019; 12:30 - 17:00; eLearning 3 - School of Clinical Medicine, Addenbrookes site
Slot | Topics covered |
---|---|
12:30 - 13:30 | Introduction, Data Management Plans & Data formatting (MF) |
13:30 - 14:30 | OpenRefine practical (Live coding); DMP-Data formatting exercise (MF+JS+AP) |
14:30 - 14:45 | Break - Tea, coffee & cookies |
14:45 - 15:35 | File management (JS) |
15:35 - 16:20 | Data Sharing & Backup (AP) |
16:20 - 16:30 | Wrap-up & close |
It has been said that 80% of the time spent on data analysis goes into cleaning and preparing the data for
computer-based analysis.
Not only does this represent a significant time investment for the data analyst, but it is also often a hurdle for the
non-specialist trying to get to grips with analysing their own data after attending an R or Python course.
Despite the best intentions, a spreadsheet that is intuitive and easily understood by human eyes can
lead to disaster when processed computationally.
This workshop will go through the basic principles that we can all adopt in order to work with data more effectively and “think like a computer”. Moreover, we will discuss the best practices for data management and organisation so that our research is auditable and reproducible by ourselves, and others, in the future. We have updated this course to be centred on the concept of Data Management Plans (DMPs, described in more detail in the course), which cover the life-cycle of your project data. DMPs depend on good data practices, which are valuable even without a DMP.
- Do you know what a Data Management Plan is and what it covers?
- How much data would you lose if your laptop were stolen?
- Have you ever emailed your colleague a file named 'final_final_versionEDITED'?
- Have you ever struggled to import your spreadsheets into R?
As a researcher, you will encounter research data in many forms, ranging from measurements, numbers and images to documents and publications. Whether you create, receive or collect data, you will certainly need to organise it at some stage of your project. This workshop will provide an overview of some basic principles on how we can work with data more effectively. We will discuss the best practices for research data management and organisation so that our research is auditable and reproducible by ourselves, and others, in the future.
- What Research Funders expect
- Options for backing up your computer
- Ideas for naming and organising your files
- Strategies for exchanging files with collaborators
- Tips and tricks to make sure that your spreadsheets are readable by programming languages such as R
- Using tools such as OpenRefine for data cleaning
- Preparing high-throughput biological data for submission to a public repository
- Select an appropriate backup strategy for your data
- Organise your files in a more structured and consistent manner
- Avoid common pitfalls in spreadsheet manipulation
- Know what resources are available at the University of Cambridge for Research Data Management
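
As an illustration of what naming and organising files "in a more structured and consistent manner" could look like in practice, here is a minimal Python sketch. The convention it uses (ISO date, project, description, zero-padded version) and the helper name `make_filename` are assumptions chosen for this example, not a scheme prescribed by the course.

```python
import re
from datetime import date


def make_filename(project: str, description: str, when: date,
                  version: int, ext: str) -> str:
    """Build a name such as 2019-11-28_rnaseq_sample-sheet_v02.csv.

    Hypothetical convention: date_project_description_vNN.ext
    """
    def slug(text: str) -> str:
        # Lowercase, then replace each run of characters that are not
        # letters or digits with a single hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    # ISO dates (YYYY-MM-DD) sort chronologically when sorted as text,
    # and zero-padded versions keep v02 ahead of v10.
    return (f"{when.isoformat()}_{slug(project)}_{slug(description)}"
            f"_v{version:02d}.{ext.lstrip('.').lower()}")


print(make_filename("RNAseq", "Sample sheet", date(2019, 11, 28), 2, "csv"))
```

Names built this way avoid spaces and special characters, which tends to make them easier to handle from R or Python, and they replace suffixes like 'final_final_versionEDITED' with an explicit version number.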