forked from swcarpentry/r-novice-inflammation
-
Notifications
You must be signed in to change notification settings - Fork 0
/
06-best-practices-R.html
117 lines (117 loc) · 9.85 KB
/
06-best-practices-R.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta name="generator" content="pandoc">
<title>Software Carpentry: Programming with R</title>
<link rel="shortcut icon" type="image/x-icon" href="/favicon.ico" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<link rel="stylesheet" type="text/css" href="css/bootstrap/bootstrap.css" />
<link rel="stylesheet" type="text/css" href="css/bootstrap/bootstrap-theme.css" />
<link rel="stylesheet" type="text/css" href="css/swc.css" />
<link rel="alternate" type="application/rss+xml" title="Software Carpentry Blog" href="http://software-carpentry.org/feed.xml"/>
<meta charset="UTF-8" />
<!-- HTML5 shim, for IE6-8 support of HTML5 elements -->
<!--[if lt IE 9]>
<script src="http://html5shim.googlecode.com/svn/trunk/html5.js"></script>
<![endif]-->
</head>
<body class="lesson">
<div class="container card">
<div class="banner">
<a href="http://software-carpentry.org" title="Software Carpentry">
<img alt="Software Carpentry banner" src="img/software-carpentry-banner.png" />
</a>
</div>
<article>
<div class="row">
<div class="col-md-10 col-md-offset-1">
<a href="index.html"><h1 class="title">Programming with R</h1></a>
<h2 class="subtitle">Best practices for using R and designing programs</h2>
<section class="objectives panel panel-warning">
<div class="panel-heading">
<h2><span class="glyphicon glyphicon-certificate"></span>Learning Objectives</h2>
</div>
<div class="panel-body">
<p>Define some best practices when working with R</p>
</div>
</section>
<ol style="list-style-type: decimal">
<li>Start your code with a description of what it is:</li>
</ol>
<pre class="sourceCode r"><code class="sourceCode r"><span class="co">#This is code to replicate the analyses and figures from my 2014 Science paper.</span>
<span class="co">#Code developed by Sarah Supp, Tracy Teal, and Jon Borelli</span></code></pre>
<ol start="2" style="list-style-type: decimal">
<li>Run all of your import statments (<code>library</code>):</li>
</ol>
<pre class="sourceCode r"><code class="sourceCode r"><span class="kw">library</span>(ggplot2)
<span class="kw">library</span>(reshape)
<span class="kw">library</span>(vegan)</code></pre>
<ol start="3" style="list-style-type: decimal">
<li>Set your working directory before <code>source()</code>ing a script, or start <code>R</code> inside your project folder:</li>
</ol>
<p>One should exercise caution when using <code>setwd()</code>. Changing directories in your script can limit reproducibility:</p>
<ul>
<li><code>setwd()</code> will throw an error if the directory you’re trying to change to doesn’t exit, or the user doesn’t have the correct permissions to access it. This becomes a problem when sharing scripts between users who have organized their directories differently.</li>
<li>If/when your script terminates with an error, you might leave the user in a different directory to where they started, and if they call the script again this will cause further problems. If you must use <code>setwd()</code>, it is best to put it at the top of the script to avoid this problem.</li>
</ul>
<p>The following error message indicates that R has failed to set the working directory you specified:</p>
<pre><code>Error in setwd("~/path/to/working/directory") : cannot change working directory</code></pre>
<p>Consider using the convention that the user running the script should begin in the relevant directory on their machine and then use relative file paths (see below).</p>
<ol start="4" style="list-style-type: decimal">
<li><p>Use <code>#</code> or <code>#-</code> to set off sections of your code so you can easily scroll through it and find things.</p></li>
<li><p>If you have only one or a few functions, put them at the top of your code, so they are among the first things run. If you have written many functions, put them all in their own .R file, and <code>source</code> them. Source will define all of these functions so that you can use them as you need them. For the reasons listed above, try to avoid using <code>setwd()</code> (or other functions that have side-effects in the user’s workspace) in scripts you <code>source</code>.</p></li>
</ol>
<pre class="sourceCode r"><code class="sourceCode r"><span class="kw">source</span>(<span class="st">"my_genius_fxns.R"</span>)</code></pre>
<ol start="6" style="list-style-type: decimal">
<li><p>Use consistent style within your code.</p></li>
<li><p>Keep your code modular. If a single function or loop gets too long, consider breaking it into smaller pieces.</p></li>
<li><p>Don’t repeat yourself. Automate! If you are repeating the same piece of code on multiple objects or files, use a loop or a function to do the same thing. The more you repeat yourself, the more likely you are to make a mistake.</p></li>
<li><p>Manage all of your source files for a project in the same directory. Then use relative paths as necessary. For example, use</p></li>
</ol>
<pre class="sourceCode r"><code class="sourceCode r">dat <-<span class="st"> </span><span class="kw">read.csv</span>(<span class="dt">file =</span> <span class="st">"/files/dataset-2013-01.csv"</span>, <span class="dt">header =</span> <span class="ot">TRUE</span>)</code></pre>
<p>rather than:</p>
<pre class="sourceCode r"><code class="sourceCode r">dat <-<span class="st"> </span><span class="kw">read.csv</span>(<span class="dt">file =</span> <span class="st">"/Users/Karthik/Documents/sannic-project/files/dataset-2013-01.csv"</span>, <span class="dt">header =</span> <span class="ot">TRUE</span>)</code></pre>
<ol start="10" style="list-style-type: decimal">
<li>R can run into memory issues. It is a common problem to run out of memory after running R scripts for a long time. In order to inspect your R session objects, you can list the objects, search current packages and remove objects that are currently not in use. A good practice when running long lines of computationally intensive sequential code is to remove temporary objects after they have served their purpose. Sometimes R will not clean up unused memory for a while after you delete objects. You can force R to tidy up its memory by using <code>gc()</code>.</li>
</ol>
<pre class="sourceCode r"><code class="sourceCode r">interim_object <-<span class="st"> </span><span class="kw">data.frame</span>(<span class="kw">rep</span>(<span class="dv">1</span>:<span class="dv">100</span>,<span class="dv">10</span>),<span class="kw">rep</span>(<span class="dv">101</span>:<span class="dv">200</span>,<span class="dv">10</span>),<span class="kw">rep</span>(<span class="dv">201</span>:<span class="dv">300</span>,<span class="dv">10</span>)) <span class="co"># Sample dataset of 1000 rows</span>
<span class="kw">object.size</span>(interim_object) <span class="co"># Reports the memory size allocated to the object</span>
<span class="kw">rm</span>(interim_object) <span class="co"># Removes only the particular object</span>
<span class="kw">gc</span>() <span class="co"># Force R to release memory it is no longer using</span>
<span class="kw">ls</span>() <span class="co"># Lists all the objects in your current workspace</span>
<span class="kw">rm</span>(<span class="dt">list =</span> <span class="kw">ls</span>()) <span class="co"># If you want to delete all the objects in the workspace and start with a new slate</span></code></pre>
<ol start="11" style="list-style-type: decimal">
<li><p>Don’t save a session history (the default option in R, when it asks if you want an <code>RData</code> file). Instead, start in a clean environment so that older objects don’t contaminate your current environment. This can lead to unexpected results, especially if the code were to be run on someone else’s machine.</p></li>
<li><p>Where possible keep track of <code>sessionInfo()</code> somewhere in your project folder. Session information is invaluable since it captures all of the packages used in the current project. If a newer version of a project changes the way a function behaves, you can always go back and reinstall the version that worked (Note: At least on CRAN all older versions of packages are permanently archived).</p></li>
<li><p>Collaborate. Grab a buddy and practice “code review”. We do it for methods and papers, why not code? Our code is a major scientific product and the result of a lot of hard work!</p></li>
<li><p>Develop your code using version control and frequent updates!</p></li>
</ol>
<section class="challenge panel panel-success">
<div class="panel-heading">
<h2><span class="glyphicon glyphicon-pencil"></span>Discussion - Best practice</h2>
</div>
<div class="panel-body">
<ol style="list-style-type: decimal">
<li>What other suggestions do you have?</li>
<li>How could we restructure the code we worked on today, to make it easier to read? Discsuss with your neighbor.</li>
<li>Make two new R scripts called inflammation.R and inflammation_fxns.R</li>
<li>Copy and paste the code so that inflammation.R “does stuff” and inflammation_fxns.R holds all of your functions. <strong>Hint</strong>: you will need to add <code>source</code> code to one of the files.</li>
</ol>
</div>
</section>
</div>
</div>
</article>
<div class="footer">
<a class="label swc-blue-bg" href="http://software-carpentry.org">Software Carpentry</a>
<a class="label swc-blue-bg" href="https://github.com/swcarpentry/r-novice-inflammation">Source</a>
<a class="label swc-blue-bg" href="mailto:[email protected]">Contact</a>
<a class="label swc-blue-bg" href="LICENSE.html">License</a>
</div>
</div>
<!-- Javascript placed at the end of the document so the pages load faster -->
<script src="http://software-carpentry.org/v5/js/jquery-1.9.1.min.js"></script>
<script src="css/bootstrap/bootstrap-js/bootstrap.js"></script>
</body>
</html>