Commit

Built site for gh-pages
Quarto GHA Workflow Runner committed Mar 12, 2024
1 parent 30bf510 commit 05271fa
Showing 6 changed files with 232 additions and 232 deletions.
2 changes: 1 addition & 1 deletion .nojekyll
@@ -1 +1 @@
-5e654d6d
+d8525642
144 changes: 72 additions & 72 deletions 404.html

Large diffs are not rendered by default.

144 changes: 72 additions & 72 deletions index.html

Large diffs are not rendered by default.

22 changes: 11 additions & 11 deletions posts/2018-01-22-a-set-seed-ggplot2-adventure/index.html
@@ -417,29 +417,29 @@ <h1>Background</h1>
<h1>Why did this happen?</h1>
<p>The <code>set.seed()</code> function sets the starting number used to generate a sequence of random numbers – it ensures that you get the same result if you start with that same seed each time you run the same process. For example, if I use the <code>sample()</code> function immediately after setting a seed, I will always get the same sample.</p>
<div class="cell">
-<div class="sourceCode cell-code" id="cb1"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="fu">set.seed</span>(<span class="dv">1</span>)</span>
+<div class="sourceCode cell-code" id="cb1"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="fu">set.seed</span>(<span class="dv">7</span>)</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a><span class="fu">sample</span>(<span class="dv">3</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-stdout">
-<pre><code>[1] 1 2 3</code></pre>
+<pre><code>[1] 2 1 3</code></pre>
</div>
</div>
<div class="cell">
-<div class="sourceCode cell-code" id="cb3"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a><span class="fu">set.seed</span>(<span class="dv">1</span>)</span>
+<div class="sourceCode cell-code" id="cb3"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a><span class="fu">set.seed</span>(<span class="dv">7</span>)</span>
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a><span class="fu">sample</span>(<span class="dv">3</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-stdout">
-<pre><code>[1] 1 2 3</code></pre>
+<pre><code>[1] 2 1 3</code></pre>
</div>
</div>
<p>If I run <code>sample()</code> twice after setting a seed, however, I would not expect them to be the same. I’d expect the first result to match those above, and the second to be different.</p>
<div class="cell">
-<div class="sourceCode cell-code" id="cb5"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a><span class="fu">set.seed</span>(<span class="dv">1</span>)</span>
+<div class="sourceCode cell-code" id="cb5"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a><span class="fu">set.seed</span>(<span class="dv">7</span>)</span>
<span id="cb5-2"><a href="#cb5-2" aria-hidden="true" tabindex="-1"></a><span class="fu">sample</span>(<span class="dv">3</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-stdout">
-<pre><code>[1] 1 2 3</code></pre>
+<pre><code>[1] 2 1 3</code></pre>
</div>
<div class="sourceCode cell-code" id="cb7"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb7-1"><a href="#cb7-1" aria-hidden="true" tabindex="-1"></a><span class="fu">sample</span>(<span class="dv">3</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-stdout">
-<pre><code>[1] 1 2 3</code></pre>
+<pre><code>[1] 3 2 1</code></pre>
</div>
</div>
<p>The second is different because I have already performed one random process, so now my starting point prior to running the latter <code>sample()</code> function is no longer <code>1</code>.</p>
@@ -469,17 +469,17 @@ <h1>Why did this happen?</h1>
<div class="no-row-height column-margin column-container"><div class="">
<p>Note: If you are quite clever, you will notice that this is <em>not</em> an interactive document and therefore neither the first nor the second chunk should run <code>.onAttach()</code> – this is true! I’ve run them interactively and included the output for demonstration purposes 🙊.</p>
</div></div><div class="cell">
-<div class="sourceCode cell-code" id="cb10"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb10-1"><a href="#cb10-1" aria-hidden="true" tabindex="-1"></a><span class="fu">set.seed</span>(<span class="dv">1</span>)</span>
+<div class="sourceCode cell-code" id="cb10"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb10-1"><a href="#cb10-1" aria-hidden="true" tabindex="-1"></a><span class="fu">set.seed</span>(<span class="dv">7</span>)</span>
<span id="cb10-2"><a href="#cb10-2" aria-hidden="true" tabindex="-1"></a><span class="fu">library</span>(ggplot2)</span>
<span id="cb10-3"><a href="#cb10-3" aria-hidden="true" tabindex="-1"></a><span class="fu">sample</span>(<span class="dv">3</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</div>
-<pre><code>## [1] 2 3 1</code></pre>
+<pre><code>[1] 3 2 1</code></pre>
<div class="cell">
-<div class="sourceCode cell-code" id="cb11"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb11-1"><a href="#cb11-1" aria-hidden="true" tabindex="-1"></a><span class="fu">set.seed</span>(<span class="dv">1</span>)</span>
+<div class="sourceCode cell-code" id="cb11"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb11-1"><a href="#cb11-1" aria-hidden="true" tabindex="-1"></a><span class="fu">set.seed</span>(<span class="dv">7</span>)</span>
<span id="cb11-2"><a href="#cb11-2" aria-hidden="true" tabindex="-1"></a><span class="fu">library</span>(ggplot2)</span>
<span id="cb11-3"><a href="#cb11-3" aria-hidden="true" tabindex="-1"></a><span class="fu">sample</span>(<span class="dv">3</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-stdout">
-<pre><code>[1] 1 2 3</code></pre>
+<pre><code>[1] 2 1 3</code></pre>
</div>
</div>
<p>Notice the second chunk gave us the same result as above, but the first chunk was different. That is because the first chunk runs the <code>.onAttach()</code> function between when I set my seed and when I drew my sample.</p>
2 changes: 1 addition & 1 deletion search.json
@@ -1642,7 +1642,7 @@
"href": "posts/2018-01-22-a-set-seed-ggplot2-adventure/index.html",
"title": "A set.seed() + ggplot2 adventure",
"section": "",
"text": "Recently I tweeted a small piece of advice re: when to set a seed in your script:\n\ntip: always set.seed() AFTER loading ggplot2\n\nJenny pointed out that this may be blog post-worthy, so here we are!\n\nBackground\n\n\n\nA little bit about where this tweet came from: I was out to lunch with my friend Jonathan – over our scrumptious Thai food, he mentioned he was having some simulation trouble. In particular, some of his simulations were breaking, but when he tried to run the script locally, everything seemed fine. He was using the same seed in both instances, and in fact was running the exact same script – what could be causing this discrepancy! I recalled that ggplot2 would generate a random message about 10% the time, and asked whether he thought loading this package could somehow be affecting his simulation results. After some further investigation 🕵️‍♂️, it looked like indeed this may be the culprit!\n\n\nWhy did this happen?\nThe set.seed() function sets the starting number used to generate a sequence of random numbers – it ensures that you get the same result if you start with that same seed each time you run the same process. For example, if I use the sample() function immediately after setting a seed, I will always get the same sample.\n\nset.seed(1)\nsample(3)\n\n[1] 1 2 3\n\n\n\nset.seed(1)\nsample(3)\n\n[1] 1 2 3\n\n\nIf I run sample() twice after setting a seed, however, I would not expect them to be the same. 
I’d expect the first result to match those above, and the second to be different.\n\nset.seed(1)\nsample(3)\n\n[1] 1 2 3\n\nsample(3)\n\n[1] 1 2 3\n\n\nThe second is different because I have already performed one random process, so now my starting point prior to running the latter sample() function is no longer 1.\nThere is a small function in ggplot2 that runs when the library is loaded for the first time.\n\nNote: This is a little different than this code looks now, it has been updated since this discussion began!\n\n\n.onAttach &lt;- function(...) {\n if (!interactive() || stats::runif(1) &gt; 0.1) return()\n\n tips &lt;- c(\n \"RStudio Community is a great place to get help: https://community.rstudio.com/c/tidyverse.\",\n \"Find out what's changed in ggplot2 at https://github.com/tidyverse/ggplot2/releases.\",\n \"Use suppressPackageStartupMessages() to eliminate package startup messages.\",\n \"Need help? Try Stackoverflow: https://stackoverflow.com/tags/ggplot2.\",\n \"Need help getting started? Try the cookbook for R: http://www.cookbook-r.com/Graphs/\",\n \"Want to understand how all the pieces fit together? See the R for Data Science book: http://r4ds.had.co.nz/\"\n )\n\n tip &lt;- sample(tips, 1)\n packageStartupMessage(paste(strwrap(tip), collapse = \"\\n\"))\n}\n\nThis function has a sample() call, which will move the starting place of your random sequence of numbers. The main piece that caused Jonathan some 😫 is the !interactive() logic, which only runs the remainder of the code if the session is interactive. Another thing that can cause a bit of confusion is this .onAttach() function is only run the first time the library is loaded, so if I run what looks like the exact same code twice during the same session, I can get different results. For example,\n\n\nNote: If you are quite clever, you will notice that this is not an interactive document and therefore neither the first nor the second chunk should run .onAttach() – this is true! 
I’ve run them interactively and included the output for demonstration purposes 🙊.\n\nset.seed(1)\nlibrary(ggplot2)\nsample(3)\n\n## [1] 2 3 1\n\nset.seed(1)\nlibrary(ggplot2)\nsample(3)\n\n[1] 1 2 3\n\n\nNotice the second chunk gave us the same result as above, but the first chunk was different. That is because the first chunk runs the .onAttach() function between when I set my seed and when I drew my sample.\n\n\nReproducibility crisis?\n\n\n\nIt turns out this isn’t cause for concern re: reproducibility, thanks to Jim Hester’s new patch to ggplot2 phew! Prior to this update, non-interactive scripts will be reproducible (if always run non-interactively), however interactive scripts can cause some issues, as shown above. In general, I like to provide future Lucy with as much assistance as possible, so I will likely avoid setting seeds prior to loading packages on the off chance that it will make my debugging trickier in the future.\n\n\nWhat should I do?\n\nFor this particular issue, Jim Hester has already patched the development version of ggplot2 to preserve your seed 🌱 if you set it before loading ggplot2 (he did it with a slick function, withr::with_preserve_seed, love it!), so you can go download the development version and set your seed wherever you please.\n\nIn general, it seems somewhat prudent to set your seeds after loading packages 📦, as it can be tricky to know exactly what is going on under the hood. 
The wise R-Lady Steph Locke advised in a conversation on this topic to generally try to set seeds as close to the random component as possible to avoid any confusion – this seems like easy and good advice to follow 👯!\n\n\n\nWhat have we learned?\n\nI’ve learned a lot just from thinking through where to set different parts of my code and how that can affect things downstream\n\nWe now know a bit more about how seeds work!\n\nWe’ve learned about the withr::with_preserve_seed() function 🎉\n\nWe’ve seen the potential consequences of changing the global state in a package – Jenny recently added this as an issue to discuss in a future version of r-pkgs, which she eloquently summarizes as “don’t touch things that don’t belong to you and if you have to, you need to be super careful to wipe all your sticky fingerprints off everything”\n\nThe #rstats community is so helpful and responsive! A small debugging situation led to lots of helpful advice and a quick fix from Jim 👷"
"text": "Recently I tweeted a small piece of advice re: when to set a seed in your script:\n\ntip: always set.seed() AFTER loading ggplot2\n\nJenny pointed out that this may be blog post-worthy, so here we are!\n\nBackground\n\n\n\nA little bit about where this tweet came from: I was out to lunch with my friend Jonathan – over our scrumptious Thai food, he mentioned he was having some simulation trouble. In particular, some of his simulations were breaking, but when he tried to run the script locally, everything seemed fine. He was using the same seed in both instances, and in fact was running the exact same script – what could be causing this discrepancy! I recalled that ggplot2 would generate a random message about 10% the time, and asked whether he thought loading this package could somehow be affecting his simulation results. After some further investigation 🕵️‍♂️, it looked like indeed this may be the culprit!\n\n\nWhy did this happen?\nThe set.seed() function sets the starting number used to generate a sequence of random numbers – it ensures that you get the same result if you start with that same seed each time you run the same process. For example, if I use the sample() function immediately after setting a seed, I will always get the same sample.\n\nset.seed(7)\nsample(3)\n\n[1] 2 1 3\n\n\n\nset.seed(7)\nsample(3)\n\n[1] 2 1 3\n\n\nIf I run sample() twice after setting a seed, however, I would not expect them to be the same. 
I’d expect the first result to match those above, and the second to be different.\n\nset.seed(7)\nsample(3)\n\n[1] 2 1 3\n\nsample(3)\n\n[1] 3 2 1\n\n\nThe second is different because I have already performed one random process, so now my starting point prior to running the latter sample() function is no longer 1.\nThere is a small function in ggplot2 that runs when the library is loaded for the first time.\n\nNote: This is a little different than this code looks now, it has been updated since this discussion began!\n\n\n.onAttach &lt;- function(...) {\n if (!interactive() || stats::runif(1) &gt; 0.1) return()\n\n tips &lt;- c(\n \"RStudio Community is a great place to get help: https://community.rstudio.com/c/tidyverse.\",\n \"Find out what's changed in ggplot2 at https://github.com/tidyverse/ggplot2/releases.\",\n \"Use suppressPackageStartupMessages() to eliminate package startup messages.\",\n \"Need help? Try Stackoverflow: https://stackoverflow.com/tags/ggplot2.\",\n \"Need help getting started? Try the cookbook for R: http://www.cookbook-r.com/Graphs/\",\n \"Want to understand how all the pieces fit together? See the R for Data Science book: http://r4ds.had.co.nz/\"\n )\n\n tip &lt;- sample(tips, 1)\n packageStartupMessage(paste(strwrap(tip), collapse = \"\\n\"))\n}\n\nThis function has a sample() call, which will move the starting place of your random sequence of numbers. The main piece that caused Jonathan some 😫 is the !interactive() logic, which only runs the remainder of the code if the session is interactive. Another thing that can cause a bit of confusion is this .onAttach() function is only run the first time the library is loaded, so if I run what looks like the exact same code twice during the same session, I can get different results. For example,\n\n\nNote: If you are quite clever, you will notice that this is not an interactive document and therefore neither the first nor the second chunk should run .onAttach() – this is true! 
I’ve run them interactively and included the output for demonstration purposes 🙊.\n\nset.seed(7)\nlibrary(ggplot2)\nsample(3)\n\n[1] 3 2 1\n\nset.seed(7)\nlibrary(ggplot2)\nsample(3)\n\n[1] 2 1 3\n\n\nNotice the second chunk gave us the same result as above, but the first chunk was different. That is because the first chunk runs the .onAttach() function between when I set my seed and when I drew my sample.\n\n\nReproducibility crisis?\n\n\n\nIt turns out this isn’t cause for concern re: reproducibility, thanks to Jim Hester’s new patch to ggplot2 phew! Prior to this update, non-interactive scripts will be reproducible (if always run non-interactively), however interactive scripts can cause some issues, as shown above. In general, I like to provide future Lucy with as much assistance as possible, so I will likely avoid setting seeds prior to loading packages on the off chance that it will make my debugging trickier in the future.\n\n\nWhat should I do?\n\nFor this particular issue, Jim Hester has already patched the development version of ggplot2 to preserve your seed 🌱 if you set it before loading ggplot2 (he did it with a slick function, withr::with_preserve_seed, love it!), so you can go download the development version and set your seed wherever you please.\n\nIn general, it seems somewhat prudent to set your seeds after loading packages 📦, as it can be tricky to know exactly what is going on under the hood. 
The wise R-Lady Steph Locke advised in a conversation on this topic to generally try to set seeds as close to the random component as possible to avoid any confusion – this seems like easy and good advice to follow 👯!\n\n\n\nWhat have we learned?\n\nI’ve learned a lot just from thinking through where to set different parts of my code and how that can affect things downstream\n\nWe now know a bit more about how seeds work!\n\nWe’ve learned about the withr::with_preserve_seed() function 🎉\n\nWe’ve seen the potential consequences of changing the global state in a package – Jenny recently added this as an issue to discuss in a future version of r-pkgs, which she eloquently summarizes as “don’t touch things that don’t belong to you and if you have to, you need to be super careful to wipe all your sticky fingerprints off everything”\n\nThe #rstats community is so helpful and responsive! A small debugging situation led to lots of helpful advice and a quick fix from Jim 👷"
},
{
"objectID": "posts/2023-04-24-causal-quartets/index.html",