diff --git a/content/files/codespaces/codespaces_landing.png b/content/files/codespaces/codespaces_landing.png new file mode 100644 index 0000000..5090305 Binary files /dev/null and b/content/files/codespaces/codespaces_landing.png differ diff --git a/content/python_overview.ipynb b/content/python_overview.ipynb index 0bc4518..dd2e3aa 100644 --- a/content/python_overview.ipynb +++ b/content/python_overview.ipynb @@ -3,7 +3,9 @@ { "cell_type": "markdown", "id": "f1125d5b-1a63-4df3-8abd-5a2680c9892e", - "metadata": {}, + "metadata": { + "tags": [] + }, "source": [ "# Python Overview\n", "\n", @@ -13,6 +15,8 @@ " - [Python interactive interpreter](#python-interactive-interpreter)\n", " - [Python scripts](#python-scripts)\n", " - [Jupyter notebooks](#jupyter-notebooks)\n", + " - [Python in the Browser](#python-in-the-browser)\n", + " - [Coding in the Cloud](#coding-in-the-cloud)\n", "\n", "\n", "## Intro\n", @@ -118,7 +122,54 @@ "\n", "![jupyter example](files/jupyter_demo.png)\n", "\n", - "[Jupyter]: https://jupyter.org/" + "The traditional way to run Jupyter Notebooks is to install the Jupyter Lab software on your machine and start the program from the command line. \n", + "\n", + "It's important to note, however, that there are also hosted Jupyter environments, such as [Google Colab][] and [Kaggle Notebooks][], where third parties run the Jupyter Notebook or Jupyter Lab software for you. These can be very convenient, combining zero setup overhead with the ability to do real work, and in some cases including features such as real-time collaboration. However, these environments also have limitations (e.g. the amount of data you can process or analyze) as well as their own non-standard workflows.\n", + "\n", + "### Python in the Browser\n", + "\n", + "In the last few years, we've also seen the rise of a technology called [WebAssembly][], which, among other things, allows you to run more computationally heavy software that is not native web code (e.g. 
JavaScript) directly in your browser. This power extends to general-purpose programming languages such as Python and its Jupyter Lab environment, which traditionally have been run on our own machines, on virtual machines in the cloud, or hosted for us by third parties such as [Google Colab][].\n", + "\n", + "The ability to run Python and Jupyter directly in your browser means that you don't need to install the programming language or the Jupyter Lab software. Nor do you need to use a third party such as Google to host a Jupyter notebook for you.\n", + "\n", + "It's a super-convenient way to learn without having to slog through the process of setting up your own local installation. And in fact, many of the tutorials we'll use in this course -- including the one you're currently reading -- run on JupyterLite, a version of Jupyter Lab that runs entirely in the browser.\n", + "\n", + "However, there are drawbacks. JupyterLite installations are not intended for handling large quantities of data, and there are limitations and friction points when it comes to saving work and other day-to-day uses of Python. For example, the very common task of obtaining files from other websites (e.g. when scraping a government agency for data or documents) requires an idiosyncratic workflow.\n", + "\n", + "\n", + "## Coding in the Cloud\n", + "\n", + "![codespaces](files/codespaces/codespaces_landing.png)\n", + "\n", + "Perhaps the most exciting development in the last few years has been the rise of cloud coding environments such as GitHub Codespaces. These environments combine simple setup with the flexibility to support standard and customized workflows, without the idiosyncrasies common to platforms such as Google Colab and JupyterLite.\n", + "\n", + "They run on small virtual machines in the cloud, and they allow you to save work directly to a GitHub code repository and save the state of your virtual machine so you can pick up where you left off at your next coding session.\n", + "\n", + "And of course, there are a few caveats. 
Most importantly, these environments typically operate on a freemium model, where you get a certain amount of \"compute\" time for free. For example, at the time of writing, you can run a basic machine with 2GB of RAM on Codespaces for 60 hours before incurring charges of $0.18 per hour. You also get 15GB of storage for free, with each additional GB costing $0.07 per month.\n", + "\n", + "In this course, we'll make regular use of GitHub Codespaces for our assignments, since they offer a nice balance of standardized workflows and a reasonable free tier. \n", + "\n", + "Equally important -- you can trust that what you learn in this environment transfers readily to a \"local\" workflow on your machine using the same tools and environments.\n", + "\n", + "## So what Python environment should I use?\n", + "\n", + "In our opinion, there's a time and a place for each of these different coding contexts.\n", + "\n", + "JupyterLite -- i.e. Python in your browser -- is a great way to start ramping up immediately. It's so handy that the First Python Notebook is actually a JupyterLite instance that requires no installation of Python or related libraries for you to get started.\n", + "\n", + "But when you're working on projects, we prefer other options. A plain old code editor is handy for whipping up Python scripts or multi-step pipelines that need to run on a regular schedule on a virtual machine in the cloud. These types of machines typically have no graphical interface, and while you *can* run Jupyter Notebooks as scripts in a shell, it's far more common and convenient to use plain old Python scripts.\n", + "\n", + "For data analysis, we of course recommend Jupyter Notebooks/Lab, either running in your browser or using a third-party provider such as Google Colab. \n", + "\n", + "When starting out, it can be tempting to choose convenience (e.g. Google Colab) over learning the slightly harder but more standard way of doing things. 
In this course, we'll take the latter route, primarily because we want you to learn the standard workflows that most teams in the news industry use, and that many of the tutorials and blog posts out on the wider Internet assume. That said, we're very excited about Codespaces, which combines standard workflows with a zero-setup environment based entirely in the cloud. While it too has limitations in terms of pricing and resources, it's a convenient way to get up and running on real work, using standard practices.\n", + "\n", + "Last but not least, even the humble Python interpreter in your shell can be handy for quickly testing out code snippets and exploring a library, without the overhead of having to install and run a Jupyter Notebook.\n", + "\n", + "\n", + "[Jupyter]: https://jupyter.org/\n", + "[Google Colab]: https://research.google.com/colaboratory/\n", + "[Kaggle Notebooks]: https://www.kaggle.com/docs/notebooks\n", + "[WebAssembly]: https://webassembly.org/" ] }, { @@ -146,7 +197,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.10.12" + "version": "3.11.4" } }, "nbformat": 4,