diff --git a/tutorials/notebooks/Object-Oriented/Object-Oriented.ipynb b/tutorials/notebooks/Object-Oriented/Object-Oriented.ipynb new file mode 100644 index 000000000..f743f501a --- /dev/null +++ b/tutorials/notebooks/Object-Oriented/Object-Oriented.ipynb @@ -0,0 +1,593 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Intro to Object-Oriented Programming\n", + "\n", + "## Authors\n", + "Brett Morris\n", + "\n", + "## Learning Goals\n", + "* Learn how to design your own objects by creating a `LightCurve` object\n", + "* Create methods on an object\n", + "* Generate new instances of the object with a class method\n", + "* Learn best practices for object-oriented programming\n", + "\n", + "## Keywords\n", + "Intro, Python, Object-Oriented\n", + "\n", + "## Summary\n", + "In this tutorial, we will create an object which stores astronomical light curves, and add methods to the `LightCurve` object which perform common functions on the light curves." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Object-oriented programming\n", + "\n", + "You've heard that Python is _objected-oriented_, but haven't found a reason to use an object yourself. Here we'll develop an example where designing an object makes life easier. \n", + "\n", + "Let's say you've got a series of 1000 light curves, each of which contain: \n", + "* times\n", + "* fluxes\n", + "* uncertainties\n", + "* flags\n", + "\n", + "Instead of managing a bunch of arrays to handle these values, we could instead create an object to store light curves. \n", + "\n", + "### Defining a new object\n", + "\n", + "To create a new object, you use the `class` command, rather than the `def` command that you would use for functions,\n", + "\n", + "```python\n", + "class LightCurve(object):\n", + "```\n", + "\n", + "We've named the new object LightCurve - object names in python should be uppercase without underscores separating words (whereas functions are usually all lowercase and words are separated by underscores). The `object` in parentheses is the class that `LightCurve` inherits from. \n", + "\n", + "\n", + "### The `__init__` method\n", + "Now we will define how you call the `LightCurve` constructor (the call that creates new `LightCurve` objects). Let's say you want to be able to create a light curve like this...\n", + "```python\n", + "new_light_curve = LightCurve(times=times, fluxes=fluxes, \n", + " uncertainties=uncertainties, flags=flags)\n", + "```\n", + "All Python objects get initialized with a function called `__init__` defined within the class, like this: \n", + "```python\n", + "class LightCurve(object):\n", + " def __init__(self, times=None, fluxes=None, uncertainties=None, flags=None):\n", + "```\n", + "You define the `__init__` function like all other functions, except that the first argument is always called `self`. This `self` is the shorthand variable that you use to refer to the `LightCurve` object within the `__init__` method.\n", + "\n", + "### Attributes\n", + "Objects have _attributes_, which are like variables stored on an object. We'll want to store the values above into the `LightCurve` object, each with their own attribute, like this: \n", + "```python\n", + "class LightCurve(object):\n", + " def __init__(self, times=None, fluxes=None, uncertainties=None, flags=None, name=None):\n", + " self.times = times\n", + " self.fluxes = fluxes\n", + " self.uncertainties = uncertainties\n", + " self.flags = flags\n", + " self.name = name\n", + " \n", + "```\n", + "Each attribute is defined by setting `self. = `. All attributes should be defined within the `__init__` method. \n", + "\n", + "### Example\n", + "Let's now create an instance of the `LightCurve` object, and see how it works: " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "%matplotlib inline\n", + "import numpy as np\n", + "import matplotlib.pyplot as plt \n", + "\n", + "# Define the object: \n", + "\n", + "class LightCurve(object):\n", + " \n", + " # add the initializer method in an indented block:\n", + " def __init__(self, times=None, fluxes=None, uncertainties=None, flags=None, name=None):\n", + " \n", + " # Now define each attribute which the object will have:\n", + " self.times = times\n", + " self.fluxes = fluxes\n", + " self.uncertainties = uncertainties\n", + " self.flags = flags\n", + " self.name = name" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The above code defines a `LightCurve` object, and the initializer method on the `LightCurve` object, giving us everything we need to initialize an instance of the `LightCurve`, like so: " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Create some fake data:\n", + "times = np.linspace(0, 10, 100)\n", + "sigma = 0.1\n", + "fluxes = 1 + sigma * np.random.randn(len(times))\n", + "uncertainties = sigma * np.ones_like(fluxes)\n", + "flags = np.random.randint(0, 5, len(fluxes))\n", + "name = 'proxima Centauri'\n", + "\n", + "# Initialize a LightCurve object:\n", + "prox_cen = LightCurve(times=times, fluxes=fluxes, \n", + " uncertainties=uncertainties, flags=flags,\n", + " name=name)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can see what values are stored in each attribute like this: " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "prox_cen.times" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Creating methods\n", + "\n", + "So far this just looks like another way to store your data. It becomes more powerful when you write _methods_ for your object. Let's make a simple plotting method for the `LightCurve` object, which plots the light curve." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "class LightCurve(object):\n", + " def __init__(self, times=None, fluxes=None, uncertainties=None, flags=None, name=None):\n", + " self.times = times\n", + " self.fluxes = fluxes\n", + " self.uncertainties = uncertainties\n", + " self.flags = flags\n", + " self.name = name\n", + " \n", + " def plot(self, color=None):\n", + " \"\"\"Plot the light curve\"\"\"\n", + " plt.errorbar(self.times, self.fluxes, self.uncertainties, fmt='o', color=color)\n", + " plt.xlabel('Time')\n", + " plt.ylabel('Flux')\n", + " plt.title(self.name)\n", + "\n", + "# Initialize a LightCurve object:\n", + "prox_cen = LightCurve(times=times, fluxes=fluxes, \n", + " uncertainties=uncertainties, flags=flags,\n", + " name=name)\n", + "\n", + "prox_cen.plot('y')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Note that you can access the attributes of the object within methods by calling `self.`.\n", + "\n", + "### Class methods\n", + "\n", + "There's more than one way to initialize a light curve. Maybe your light curves come to you in a particular file type, and you want to be able to load those files directly into a light curve object. You could do that with a _class method_, like this:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "class LightCurve(object):\n", + " def __init__(self, times=None, fluxes=None, uncertainties=None, flags=None, name=None):\n", + " self.times = times\n", + " self.fluxes = fluxes\n", + " self.uncertainties = uncertainties\n", + " self.flags = flags\n", + " self.name = name\n", + " \n", + " def plot(self):\n", + " \"\"\"Plot the light curve\"\"\"\n", + " plt.errorbar(self.times, self.fluxes, self.uncertainties, fmt='o')\n", + " plt.xlabel('Time')\n", + " plt.ylabel('Flux')\n", + " plt.title(self.name)\n", + "\n", + " @classmethod\n", + " def from_txt(cls, path):\n", + " data = np.loadtxt(path)\n", + " \n", + " times = data[0, :]\n", + " fluxes = data[1, :]\n", + " uncertainties = data[2, :]\n", + " flags = data[3, :]\n", + " return cls(times=times, fluxes=fluxes, \n", + " uncertainties=uncertainties, flags=flags)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "To get sample light curves to load using the new class method, run the script `generate_lcs.py` from the command line, or with the cell below:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "%%writefile generate_lcs.py\n", + "\n", + "import os\n", + "import numpy as np\n", + "\n", + "directory_name = 'sample_lcs'\n", + "\n", + "if not os.path.exists(directory_name):\n", + " os.makedirs(directory_name)\n", + "\n", + "n_lightcurves = 10\n", + "\n", + "for i in range(n_lightcurves):\n", + " times = np.linspace(0, 10, 100)\n", + " sigma = np.random.rand()\n", + " fluxes = (10 + sigma * np.random.randn(len(times)) +\n", + " np.polyval([(0.5 - np.random.rand()), np.random.rand()], (times - 10*np.random.rand())))\n", + " uncertainties = sigma * np.ones_like(fluxes)\n", + " flags = np.random.randint(0, 5, len(fluxes))\n", + "\n", + " data = [times, fluxes, uncertainties, flags] \n", + " np.savetxt('{0}/lc_{1}.txt'.format(directory_name, i), data)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "%%bash\n", + "python generate_lcs.py" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can now load a light curve with the class method like this: " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "path = 'sample_lcs/lc_0.txt'\n", + "\n", + "lc0 = LightCurve.from_txt(path)\n", + "\n", + "lc0.plot()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now let's compute the mean flux of each target, taking advantage of the object we've created: " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from glob import glob\n", + "\n", + "# Here are the paths to the light curves:\n", + "lc_paths = glob('sample_lcs/*.txt')\n", + "\n", + "print(lc_paths)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# let's load all of the light curves with a generator: \n", + "lightcurves = [LightCurve.from_txt(path) for path in lc_paths]\n", + "\n", + "for lightcurve in lightcurves:\n", + " print(lightcurve.fluxes.mean())" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "And let's plot each light curve:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "for lightcurve in lightcurves:\n", + " lightcurve.plot()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### A more useful method\n", + "\n", + "Let's \"clean\" the light curves by normalizing out a polynomial trend from each one. We'll do this with a new method. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "class LightCurve(object):\n", + " def __init__(self, times=None, fluxes=None, uncertainties=None, flags=None, name=None):\n", + " self.times = times\n", + " self.fluxes = fluxes\n", + " self.uncertainties = uncertainties\n", + " self.flags = flags\n", + " self.name = name\n", + " \n", + " self.cleaned = False\n", + " \n", + " def plot(self):\n", + " \"\"\"Plot the light curve\"\"\"\n", + " plt.errorbar(self.times, self.fluxes, self.uncertainties, fmt='o')\n", + " plt.xlabel('Time')\n", + " plt.ylabel('Flux')\n", + " plt.title(self.name)\n", + "\n", + " @classmethod\n", + " def from_txt(cls, path):\n", + " data = np.loadtxt(path)\n", + " \n", + " times = data[0, :]\n", + " fluxes = data[1, :]\n", + " uncertainties = data[2, :]\n", + " flags = data[3, :]\n", + " return cls(times=times, fluxes=fluxes, \n", + " uncertainties=uncertainties, flags=flags)\n", + " \n", + " def clean(self, order=1):\n", + " # Fit a polynomial trend to the light curve: \n", + " poly_params = np.polyfit(self.times, self.fluxes, order)\n", + " best_fit_model = np.polyval(poly_params, self.times)\n", + " \n", + " # Now normalize each flux by the flux in the best-fit polynomial model\n", + " self.fluxes /= best_fit_model\n", + " self.uncertainties /= best_fit_model\n", + " \n", + " # Change the \"cleaned\" attribute to True: \n", + " self.cleaned = True" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's try it on our data:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from glob import glob\n", + "\n", + "# Here are the paths to the light curves:\n", + "lc_paths = glob('sample_lcs/*.txt')\n", + "\n", + "# let's load all of the light curves with a generator: \n", + "lightcurves = [LightCurve.from_txt(path) for path in lc_paths]\n", + "\n", + "for lightcurve in lightcurves:\n", + " lightcurve.clean()\n", + " \n", + " lightcurve.plot()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You can now check to see if a light curve has been cleaned with the cleaned attribute: " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "lightcurves[0].cleaned" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Documentation\n", + "\n", + "If you want to share your code with collaborators or with your future self, you should include documentation. We've neglected that above, so let's add in some _docstrings_!\n", + "\n", + "There's a top-level docstring for the object, then docstrings for each method on the class. The format of the docstrings below is called [numpydoc](https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "class LightCurve(object):\n", + " \"\"\"Container for astrophysical light curves\"\"\"\n", + " def __init__(self, times=None, fluxes=None, uncertainties=None, flags=None, name=None):\n", + " \"\"\"\n", + " Parameters\n", + " ----------\n", + " times : array-like\n", + " Time of each flux measurement\n", + " fluxes : array-like\n", + " Fluxes at each time\n", + " uncertainties : array-like\n", + " Uncertainties of each flux measurement\n", + " flags : array-like\n", + " Data quality flags for each flux\n", + " name : string\n", + " Name of the target\n", + " \"\"\"\n", + " self.times = times\n", + " self.fluxes = fluxes\n", + " self.uncertainties = uncertainties\n", + " self.flags = flags\n", + " self.name = name\n", + " \n", + " self.cleaned = False\n", + " \n", + " def plot(self):\n", + " \"\"\"\n", + " Plot the light curve.\n", + " \"\"\"\n", + " plt.errorbar(self.times, self.fluxes, self.uncertainties, fmt='o')\n", + " plt.xlabel('Time')\n", + " plt.ylabel('Flux')\n", + " plt.title(self.name)\n", + "\n", + " @classmethod\n", + " def from_txt(cls, path):\n", + " \"\"\"\n", + " Load a light curve from a raw text file.\n", + " \n", + " Parameters\n", + " ----------\n", + " path : str \n", + " Path to the light curve text file\n", + " \"\"\"\n", + " data = np.loadtxt(path)\n", + " \n", + " times = data[0, :]\n", + " fluxes = data[1, :]\n", + " uncertainties = data[2, :]\n", + " flags = data[3, :]\n", + " return cls(times=times, fluxes=fluxes, \n", + " uncertainties=uncertainties, flags=flags)\n", + " \n", + " def clean(self, order=1):\n", + " \"\"\"\n", + " Normalize the light curve by a polynomial.\n", + " \n", + " Parameters\n", + " ----------\n", + " order : int\n", + " Polynomial order\n", + " \"\"\"\n", + " # Fit a polynomial trend to the light curve: \n", + " poly_params = np.polyfit(self.times, self.fluxes, order)\n", + " best_fit_model = np.polyval(poly_params, self.times)\n", + " \n", + " # Now normalize each flux by the flux in the best-fit polynomial model\n", + " self.fluxes /= best_fit_model\n", + " self.uncertainties /= best_fit_model\n", + " \n", + " # Change the \"cleaned\" attribute to True: \n", + " self.cleaned = True\n", + " \n", + "# Initialize a LightCurve object:\n", + "prox_cen = LightCurve(times=times, fluxes=fluxes, \n", + " uncertainties=uncertainties, flags=flags,\n", + " name=name)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now you can see the documentation on the module within iPython or iPython Notebooks by typing\n", + "```\n", + "prox_cen?\n", + "```\n", + "...you can see the documentation for each method by typing\n", + "```\n", + "prox_cen.clean?\n", + "```\n", + "and you can see the source code for each method by typing:\n", + "```\n", + "prox_cen.clean??\n", + "```" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.3" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +}