From b493dad2222dfbcc3c5d2c096581e25068eb54c9 Mon Sep 17 00:00:00 2001
From: Chanchal Kumar Maji
 <31502077+ChanchalKumarMaji@users.noreply.github.com>
Date: Sun, 3 Feb 2019 22:39:37 +0530
Subject: [PATCH 1/5] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/README.md b/README.md
index a2a09d1..3a2ea3c 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,4 @@
-# Assignment 5
+# Assignment 5 
 This is the final Assignment of MLCC Study Jam, DSC Kolkata. In this Assignment, you're asked to solve the official MLCC Notebooks. These notebooks are a property of Google Inc.
 To get accepted for final evaluation, complete all the notebooks and commit them to the branches having your github ID.
 

From 0fa63fdd118d772247aef08af5c07fdc7d3427c2 Mon Sep 17 00:00:00 2001
From: Chanchal Kumar Maji
 <31502077+ChanchalKumarMaji@users.noreply.github.com>
Date: Sun, 3 Feb 2019 22:51:15 +0530
Subject: [PATCH 2/5] Created using Colaboratory

---
 intro_to_pandas.ipynb | 660 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 660 insertions(+)
 create mode 100644 intro_to_pandas.ipynb

diff --git a/intro_to_pandas.ipynb b/intro_to_pandas.ipynb
new file mode 100644
index 0000000..942ea63
--- /dev/null
+++ b/intro_to_pandas.ipynb
@@ -0,0 +1,660 @@
+{
+  "nbformat": 4,
+  "nbformat_minor": 0,
+  "metadata": {
+    "colab": {
+      "name": "intro_to_pandas.ipynb",
+      "version": "0.3.2",
+      "provenance": [],
+      "collapsed_sections": [
+        "JndnmDMp66FL",
+        "YHIWvc9Ms-Ll",
+        "TJffr5_Jwqvd"
+      ],
+      "include_colab_link": true
+    },
+    "kernelspec": {
+      "name": "python2",
+      "display_name": "Python 2"
+    }
+  },
+  "cells": [
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "view-in-github",
+        "colab_type": "text"
+      },
+      "source": [
+        "<a href=\"https://colab.research.google.com/github/ChanchalKumarMaji/Assignment-5/blob/ChanchalKumarMaji/intro_to_pandas.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "JndnmDMp66FL"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "#### Copyright 2017 Google LLC."
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "hMqWDc_m6rUC",
+        "cellView": "both",
+        "colab": {}
+      },
+      "cell_type": "code",
+      "source": [
+        "# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
+        "# you may not use this file except in compliance with the License.\n",
+        "# You may obtain a copy of the License at\n",
+        "#\n",
+        "# https://www.apache.org/licenses/LICENSE-2.0\n",
+        "#\n",
+        "# Unless required by applicable law or agreed to in writing, software\n",
+        "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
+        "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
+        "# See the License for the specific language governing permissions and\n",
+        "# limitations under the License."
+      ],
+      "execution_count": 0,
+      "outputs": []
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "rHLcriKWLRe4"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "# Intro to pandas"
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "QvJBqX8_Bctk"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "**Learning Objectives:**\n",
+        "  * Gain an introduction to the `DataFrame` and `Series` data structures of the *pandas* library\n",
+        "  * Access and manipulate data within a `DataFrame` and `Series`\n",
+        "  * Import CSV data into a *pandas* `DataFrame`\n",
+        "  * Reindex a `DataFrame` to shuffle data"
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "TIFJ83ZTBctl"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "[*pandas*](http://pandas.pydata.org/) is a column-oriented data analysis API. It's a great tool for handling and analyzing input data, and many ML frameworks support *pandas* data structures as inputs.\n",
+        "Although a comprehensive introduction to the *pandas* API would span many pages, the core concepts are fairly straightforward, and we'll present them below. For a more complete reference, the [*pandas* docs site](http://pandas.pydata.org/pandas-docs/stable/index.html) contains extensive documentation and many tutorials."
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "s_JOISVgmn9v"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "## Basic Concepts\n",
+        "\n",
+        "The following line imports the *pandas* API and prints the API version:"
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "aSRYu62xUi3g",
+        "colab": {}
+      },
+      "cell_type": "code",
+      "source": [
+        "from __future__ import print_function\n",
+        "\n",
+        "import pandas as pd\n",
+        "pd.__version__"
+      ],
+      "execution_count": 0,
+      "outputs": []
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "daQreKXIUslr"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "The primary data structures in *pandas* are implemented as two classes:\n",
+        "\n",
+        "  * **`DataFrame`**, which you can imagine as a relational data table, with rows and named columns.\n",
+        "  * **`Series`**, which is a single column. A `DataFrame` contains one or more `Series` and a name for each `Series`.\n",
+        "\n",
+        "The data frame is a commonly used abstraction for data manipulation. Similar implementations exist in [Spark](https://spark.apache.org/) and [R](https://www.r-project.org/about.html)."
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "fjnAk1xcU0yc"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "One way to create a `Series` is to construct a `Series` object. For example:"
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "DFZ42Uq7UFDj",
+        "colab": {}
+      },
+      "cell_type": "code",
+      "source": [
+        "pd.Series(['San Francisco', 'San Jose', 'Sacramento'])"
+      ],
+      "execution_count": 0,
+      "outputs": []
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "U5ouUp1cU6pC"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "`DataFrame` objects can be created by passing a `dict` mapping `string` column names to their respective `Series`. If the `Series` don't match in length, missing values are filled with special [NA/NaN](http://pandas.pydata.org/pandas-docs/stable/missing_data.html) values. Example:"
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "avgr6GfiUh8t",
+        "colab": {}
+      },
+      "cell_type": "code",
+      "source": [
+        "city_names = pd.Series(['San Francisco', 'San Jose', 'Sacramento'])\n",
+        "population = pd.Series([852469, 1015785, 485199])\n",
+        "\n",
+        "pd.DataFrame({ 'City name': city_names, 'Population': population })"
+      ],
+      "execution_count": 0,
+      "outputs": []
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "oa5wfZT7VHJl"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "But most of the time, you load an entire file into a `DataFrame`. The following example loads a file with California housing data. Run the following cell to load the data and display summary statistics:"
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "av6RYOraVG1V",
+        "colab": {}
+      },
+      "cell_type": "code",
+      "source": [
+        "california_housing_dataframe = pd.read_csv(\"https://download.mlcc.google.com/mledu-datasets/california_housing_train.csv\", sep=\",\")\n",
+        "california_housing_dataframe.describe()"
+      ],
+      "execution_count": 0,
+      "outputs": []
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "WrkBjfz5kEQu"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "The example above used `DataFrame.describe` to show interesting statistics about a `DataFrame`. Another useful function is `DataFrame.head`, which displays the first few records of a `DataFrame`:"
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "s3ND3bgOkB5k",
+        "colab": {}
+      },
+      "cell_type": "code",
+      "source": [
+        "california_housing_dataframe.head()"
+      ],
+      "execution_count": 0,
+      "outputs": []
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "w9-Es5Y6laGd"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "Another powerful feature of *pandas* is graphing. For example, `DataFrame.hist` lets you quickly study the distribution of values in a column:"
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "nqndFVXVlbPN",
+        "colab": {}
+      },
+      "cell_type": "code",
+      "source": [
+        "california_housing_dataframe.hist('housing_median_age')"
+      ],
+      "execution_count": 0,
+      "outputs": []
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "XtYZ7114n3b-"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "## Accessing Data\n",
+        "\n",
+        "You can access `DataFrame` data using familiar Python dict/list operations:"
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "_TFm7-looBFF",
+        "colab": {}
+      },
+      "cell_type": "code",
+      "source": [
+        "cities = pd.DataFrame({ 'City name': city_names, 'Population': population })\n",
+        "print(type(cities['City name']))\n",
+        "cities['City name']"
+      ],
+      "execution_count": 0,
+      "outputs": []
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "V5L6xacLoxyv",
+        "colab": {}
+      },
+      "cell_type": "code",
+      "source": [
+        "print(type(cities['City name'][1]))\n",
+        "cities['City name'][1]"
+      ],
+      "execution_count": 0,
+      "outputs": []
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "gcYX1tBPugZl",
+        "colab": {}
+      },
+      "cell_type": "code",
+      "source": [
+        "print(type(cities[0:2]))\n",
+        "cities[0:2]"
+      ],
+      "execution_count": 0,
+      "outputs": []
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "65g1ZdGVjXsQ"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "In addition, *pandas* provides an extremely rich API for advanced [indexing and selection](http://pandas.pydata.org/pandas-docs/stable/indexing.html) that is too extensive to be covered here."
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "RM1iaD-ka3Y1"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "## Manipulating Data\n",
+        "\n",
+        "You may apply Python's basic arithmetic operations to `Series`. For example:"
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "XWmyCFJ5bOv-",
+        "colab": {}
+      },
+      "cell_type": "code",
+      "source": [
+        "population / 1000."
+      ],
+      "execution_count": 0,
+      "outputs": []
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "TQzIVnbnmWGM"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "[NumPy](http://www.numpy.org/) is a popular toolkit for scientific computing. *pandas* `Series` can be used as arguments to most NumPy functions:"
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "ko6pLK6JmkYP",
+        "colab": {}
+      },
+      "cell_type": "code",
+      "source": [
+        "import numpy as np\n",
+        "\n",
+        "np.log(population)"
+      ],
+      "execution_count": 0,
+      "outputs": []
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "xmxFuQmurr6d"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "For more complex single-column transformations, you can use `Series.apply`. Like the Python [map function](https://docs.python.org/2/library/functions.html#map), \n",
+        "`Series.apply` accepts as an argument a [lambda function](https://docs.python.org/2/tutorial/controlflow.html#lambda-expressions), which is applied to each value.\n",
+        "\n",
+        "The example below creates a new `Series` that indicates whether `population` is over one million:"
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "Fc1DvPAbstjI",
+        "colab": {}
+      },
+      "cell_type": "code",
+      "source": [
+        "population.apply(lambda val: val > 1000000)"
+      ],
+      "execution_count": 0,
+      "outputs": []
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "ZeYYLoV9b9fB"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "\n",
+        "Modifying `DataFrames` is also straightforward. For example, the following code adds two `Series` to an existing `DataFrame`:"
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "0gCEX99Hb8LR",
+        "colab": {}
+      },
+      "cell_type": "code",
+      "source": [
+        "cities['Area square miles'] = pd.Series([46.87, 176.53, 97.92])\n",
+        "cities['Population density'] = cities['Population'] / cities['Area square miles']\n",
+        "cities"
+      ],
+      "execution_count": 0,
+      "outputs": []
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "6qh63m-ayb-c"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "## Exercise #1\n",
+        "\n",
+        "Modify the `cities` table by adding a new boolean column that is True if and only if *both* of the following are True:\n",
+        "\n",
+        "  * The city is named after a saint.\n",
+        "  * The city has an area greater than 50 square miles.\n",
+        "\n",
+        "**Note:** Boolean `Series` are combined using the bitwise, rather than the traditional boolean, operators. For example, when performing *logical and*, use `&` instead of `and`.\n",
+        "\n",
+        "**Hint:** \"San\" in Spanish means \"saint.\""
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "zCOn8ftSyddH",
+        "colab": {}
+      },
+      "cell_type": "code",
+      "source": [
+        "# Your code here"
+      ],
+      "execution_count": 0,
+      "outputs": []
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "YHIWvc9Ms-Ll"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "### Solution\n",
+        "\n",
+        "Click below for a solution."
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "T5OlrqtdtCIb",
+        "colab": {}
+      },
+      "cell_type": "code",
+      "source": [
+        "cities['Is wide and has saint name'] = (cities['Area square miles'] > 50) & cities['City name'].apply(lambda name: name.startswith('San'))\n",
+        "cities"
+      ],
+      "execution_count": 0,
+      "outputs": []
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "f-xAOJeMiXFB"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "## Indexes\n",
+        "Both `Series` and `DataFrame` objects also define an `index` property that assigns an identifier value to each `Series` item or `DataFrame` row. \n",
+        "\n",
+        "By default, at construction, *pandas* assigns index values that reflect the ordering of the source data. Once created, the index values are stable; that is, they do not change when data is reordered."
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "2684gsWNinq9",
+        "colab": {}
+      },
+      "cell_type": "code",
+      "source": [
+        "city_names.index"
+      ],
+      "execution_count": 0,
+      "outputs": []
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "F_qPe2TBjfWd",
+        "colab": {}
+      },
+      "cell_type": "code",
+      "source": [
+        "cities.index"
+      ],
+      "execution_count": 0,
+      "outputs": []
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "hp2oWY9Slo_h"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "Call `DataFrame.reindex` to manually reorder the rows. For example, the following has the same effect as sorting by city name:"
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "sN0zUzSAj-U1",
+        "colab": {}
+      },
+      "cell_type": "code",
+      "source": [
+        "cities.reindex([2, 0, 1])"
+      ],
+      "execution_count": 0,
+      "outputs": []
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "-GQFz8NZuS06"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "Reindexing is a great way to shuffle (randomize) a `DataFrame`. In the example below, we take the index, which is array-like, and pass it to NumPy's `random.permutation` function, which returns a shuffled copy of its values. Calling `reindex` with this shuffled array causes the `DataFrame` rows to be shuffled in the same way.\n",
+        "Try running the following cell multiple times!"
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "mF8GC0k8uYhz",
+        "colab": {}
+      },
+      "cell_type": "code",
+      "source": [
+        "cities.reindex(np.random.permutation(cities.index))"
+      ],
+      "execution_count": 0,
+      "outputs": []
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "fSso35fQmGKb"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "For more information, see the [Index documentation](http://pandas.pydata.org/pandas-docs/stable/indexing.html#index-objects)."
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "8UngIdVhz8C0"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "## Exercise #2\n",
+        "\n",
+        "The `reindex` method allows index values that are not in the original `DataFrame`'s index values. Try it and see what happens if you use such values! Why do you think this is allowed?"
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "PN55GrDX0jzO",
+        "colab": {}
+      },
+      "cell_type": "code",
+      "source": [
+        "# Your code here"
+      ],
+      "execution_count": 0,
+      "outputs": []
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "TJffr5_Jwqvd"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "### Solution\n",
+        "\n",
+        "Click below for the solution."
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "8oSvi2QWwuDH"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "If your `reindex` input array includes values not in the original `DataFrame` index values, `reindex` will add new rows for these \"missing\" indices and populate all corresponding columns with `NaN` values:"
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "yBdkucKCwy4x",
+        "colab": {}
+      },
+      "cell_type": "code",
+      "source": [
+        "cities.reindex([0, 4, 5, 2])"
+      ],
+      "execution_count": 0,
+      "outputs": []
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "2l82PhPbwz7g"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "This behavior is desirable because indexes are often strings pulled from the actual data (see the [*pandas* reindex\n",
+        "documentation](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.reindex.html) for an example\n",
+        "in which the index values are browser names).\n",
+        "\n",
+        "In this case, allowing \"missing\" indices makes it easy to reindex using an external list, as you don't have to worry about\n",
+        "sanitizing the input."
+      ]
+    }
+  ]
+}
\ No newline at end of file

From 50fe890274003fa469ef03d3c44a6f02556f253a Mon Sep 17 00:00:00 2001
From: Chanchal Kumar Maji
 <31502077+ChanchalKumarMaji@users.noreply.github.com>
Date: Sun, 3 Feb 2019 23:00:18 +0530
Subject: [PATCH 3/5] Created using Colaboratory


From f282d373f15c0d27ff7f8fdf341bba5bfe30782f Mon Sep 17 00:00:00 2001
From: Chanchal Kumar Maji
 <31502077+ChanchalKumarMaji@users.noreply.github.com>
Date: Sun, 3 Feb 2019 23:01:02 +0530
Subject: [PATCH 4/5] Delete intro_to_pandas.ipynb

---
 intro_to_pandas.ipynb | 660 ------------------------------------------
 1 file changed, 660 deletions(-)
 delete mode 100644 intro_to_pandas.ipynb

diff --git a/intro_to_pandas.ipynb b/intro_to_pandas.ipynb
deleted file mode 100644
index 942ea63..0000000
--- a/intro_to_pandas.ipynb
+++ /dev/null
@@ -1,660 +0,0 @@
-{
-  "nbformat": 4,
-  "nbformat_minor": 0,
-  "metadata": {
-    "colab": {
-      "name": "intro_to_pandas.ipynb",
-      "version": "0.3.2",
-      "provenance": [],
-      "collapsed_sections": [
-        "JndnmDMp66FL",
-        "YHIWvc9Ms-Ll",
-        "TJffr5_Jwqvd"
-      ],
-      "include_colab_link": true
-    },
-    "kernelspec": {
-      "name": "python2",
-      "display_name": "Python 2"
-    }
-  },
-  "cells": [
-    {
-      "cell_type": "markdown",
-      "metadata": {
-        "id": "view-in-github",
-        "colab_type": "text"
-      },
-      "source": [
-        "<a href=\"https://colab.research.google.com/github/ChanchalKumarMaji/Assignment-5/blob/ChanchalKumarMaji/intro_to_pandas.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
-      ]
-    },
-    {
-      "metadata": {
-        "colab_type": "text",
-        "id": "JndnmDMp66FL"
-      },
-      "cell_type": "markdown",
-      "source": [
-        "#### Copyright 2017 Google LLC."
-      ]
-    },
-    {
-      "metadata": {
-        "colab_type": "code",
-        "id": "hMqWDc_m6rUC",
-        "cellView": "both",
-        "colab": {}
-      },
-      "cell_type": "code",
-      "source": [
-        "# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
-        "# you may not use this file except in compliance with the License.\n",
-        "# You may obtain a copy of the License at\n",
-        "#\n",
-        "# https://www.apache.org/licenses/LICENSE-2.0\n",
-        "#\n",
-        "# Unless required by applicable law or agreed to in writing, software\n",
-        "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
-        "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
-        "# See the License for the specific language governing permissions and\n",
-        "# limitations under the License."
-      ],
-      "execution_count": 0,
-      "outputs": []
-    },
-    {
-      "metadata": {
-        "colab_type": "text",
-        "id": "rHLcriKWLRe4"
-      },
-      "cell_type": "markdown",
-      "source": [
-        "# Intro to pandas"
-      ]
-    },
-    {
-      "metadata": {
-        "colab_type": "text",
-        "id": "QvJBqX8_Bctk"
-      },
-      "cell_type": "markdown",
-      "source": [
-        "**Learning Objectives:**\n",
-        "  * Gain an introduction to the `DataFrame` and `Series` data structures of the *pandas* library\n",
-        "  * Access and manipulate data within a `DataFrame` and `Series`\n",
-        "  * Import CSV data into a *pandas* `DataFrame`\n",
-        "  * Reindex a `DataFrame` to shuffle data"
-      ]
-    },
-    {
-      "metadata": {
-        "colab_type": "text",
-        "id": "TIFJ83ZTBctl"
-      },
-      "cell_type": "markdown",
-      "source": [
-        "[*pandas*](http://pandas.pydata.org/) is a column-oriented data analysis API. It's a great tool for handling and analyzing input data, and many ML frameworks support *pandas* data structures as inputs.\n",
-        "Although a comprehensive introduction to the *pandas* API would span many pages, the core concepts are fairly straightforward, and we'll present them below. For a more complete reference, the [*pandas* docs site](http://pandas.pydata.org/pandas-docs/stable/index.html) contains extensive documentation and many tutorials."
-      ]
-    },
-    {
-      "metadata": {
-        "colab_type": "text",
-        "id": "s_JOISVgmn9v"
-      },
-      "cell_type": "markdown",
-      "source": [
-        "## Basic Concepts\n",
-        "\n",
-        "The following line imports the *pandas* API and prints the API version:"
-      ]
-    },
-    {
-      "metadata": {
-        "colab_type": "code",
-        "id": "aSRYu62xUi3g",
-        "colab": {}
-      },
-      "cell_type": "code",
-      "source": [
-        "from __future__ import print_function\n",
-        "\n",
-        "import pandas as pd\n",
-        "pd.__version__"
-      ],
-      "execution_count": 0,
-      "outputs": []
-    },
-    {
-      "metadata": {
-        "colab_type": "text",
-        "id": "daQreKXIUslr"
-      },
-      "cell_type": "markdown",
-      "source": [
-        "The primary data structures in *pandas* are implemented as two classes:\n",
-        "\n",
-        "  * **`DataFrame`**, which you can imagine as a relational data table, with rows and named columns.\n",
-        "  * **`Series`**, which is a single column. A `DataFrame` contains one or more `Series` and a name for each `Series`.\n",
-        "\n",
-        "The data frame is a commonly used abstraction for data manipulation. Similar implementations exist in [Spark](https://spark.apache.org/) and [R](https://www.r-project.org/about.html)."
-      ]
-    },
-    {
-      "metadata": {
-        "colab_type": "text",
-        "id": "fjnAk1xcU0yc"
-      },
-      "cell_type": "markdown",
-      "source": [
-        "One way to create a `Series` is to construct a `Series` object. For example:"
-      ]
-    },
-    {
-      "metadata": {
-        "colab_type": "code",
-        "id": "DFZ42Uq7UFDj",
-        "colab": {}
-      },
-      "cell_type": "code",
-      "source": [
-        "pd.Series(['San Francisco', 'San Jose', 'Sacramento'])"
-      ],
-      "execution_count": 0,
-      "outputs": []
-    },
-    {
-      "metadata": {
-        "colab_type": "text",
-        "id": "U5ouUp1cU6pC"
-      },
-      "cell_type": "markdown",
-      "source": [
-        "`DataFrame` objects can be created by passing a `dict` mapping `string` column names to their respective `Series`. If the `Series` don't match in length, missing values are filled with special [NA/NaN](http://pandas.pydata.org/pandas-docs/stable/missing_data.html) values. Example:"
-      ]
-    },
-    {
-      "metadata": {
-        "colab_type": "code",
-        "id": "avgr6GfiUh8t",
-        "colab": {}
-      },
-      "cell_type": "code",
-      "source": [
-        "city_names = pd.Series(['San Francisco', 'San Jose', 'Sacramento'])\n",
-        "population = pd.Series([852469, 1015785, 485199])\n",
-        "\n",
-        "pd.DataFrame({ 'City name': city_names, 'Population': population })"
-      ],
-      "execution_count": 0,
-      "outputs": []
-    },
-    {
-      "metadata": {
-        "colab_type": "text",
-        "id": "oa5wfZT7VHJl"
-      },
-      "cell_type": "markdown",
-      "source": [
-        "But most of the time, you load an entire file into a `DataFrame`. The following example loads a file with California housing data. Run the following cell to load the data and display summary statistics:"
-      ]
-    },
-    {
-      "metadata": {
-        "colab_type": "code",
-        "id": "av6RYOraVG1V",
-        "colab": {}
-      },
-      "cell_type": "code",
-      "source": [
-        "california_housing_dataframe = pd.read_csv(\"https://download.mlcc.google.com/mledu-datasets/california_housing_train.csv\", sep=\",\")\n",
-        "california_housing_dataframe.describe()"
-      ],
-      "execution_count": 0,
-      "outputs": []
-    },
-    {
-      "metadata": {
-        "colab_type": "text",
-        "id": "WrkBjfz5kEQu"
-      },
-      "cell_type": "markdown",
-      "source": [
-        "The example above used `DataFrame.describe` to show interesting statistics about a `DataFrame`. Another useful function is `DataFrame.head`, which displays the first few records of a `DataFrame`:"
-      ]
-    },
-    {
-      "metadata": {
-        "colab_type": "code",
-        "id": "s3ND3bgOkB5k",
-        "colab": {}
-      },
-      "cell_type": "code",
-      "source": [
-        "california_housing_dataframe.head()"
-      ],
-      "execution_count": 0,
-      "outputs": []
-    },
-    {
-      "metadata": {
-        "colab_type": "text",
-        "id": "w9-Es5Y6laGd"
-      },
-      "cell_type": "markdown",
-      "source": [
-        "Another powerful feature of *pandas* is graphing. For example, `DataFrame.hist` lets you quickly study the distribution of values in a column:"
-      ]
-    },
-    {
-      "metadata": {
-        "colab_type": "code",
-        "id": "nqndFVXVlbPN",
-        "colab": {}
-      },
-      "cell_type": "code",
-      "source": [
-        "california_housing_dataframe.hist('housing_median_age')"
-      ],
-      "execution_count": 0,
-      "outputs": []
-    },
-    {
-      "metadata": {
-        "colab_type": "text",
-        "id": "XtYZ7114n3b-"
-      },
-      "cell_type": "markdown",
-      "source": [
-        "## Accessing Data\n",
-        "\n",
-        "You can access `DataFrame` data using familiar Python dict/list operations:"
-      ]
-    },
-    {
-      "metadata": {
-        "colab_type": "code",
-        "id": "_TFm7-looBFF",
-        "colab": {}
-      },
-      "cell_type": "code",
-      "source": [
-        "cities = pd.DataFrame({ 'City name': city_names, 'Population': population })\n",
-        "print(type(cities['City name']))\n",
-        "cities['City name']"
-      ],
-      "execution_count": 0,
-      "outputs": []
-    },
-    {
-      "metadata": {
-        "colab_type": "code",
-        "id": "V5L6xacLoxyv",
-        "colab": {}
-      },
-      "cell_type": "code",
-      "source": [
-        "print(type(cities['City name'][1]))\n",
-        "cities['City name'][1]"
-      ],
-      "execution_count": 0,
-      "outputs": []
-    },
-    {
-      "metadata": {
-        "colab_type": "code",
-        "id": "gcYX1tBPugZl",
-        "colab": {}
-      },
-      "cell_type": "code",
-      "source": [
-        "print(type(cities[0:2]))\n",
-        "cities[0:2]"
-      ],
-      "execution_count": 0,
-      "outputs": []
-    },
-    {
-      "metadata": {
-        "colab_type": "text",
-        "id": "65g1ZdGVjXsQ"
-      },
-      "cell_type": "markdown",
-      "source": [
-        "In addition, *pandas* provides an extremely rich API for advanced [indexing and selection](http://pandas.pydata.org/pandas-docs/stable/indexing.html) that is too extensive to be covered here."
-      ]
-    },
-    {
-      "metadata": {
-        "colab_type": "text",
-        "id": "RM1iaD-ka3Y1"
-      },
-      "cell_type": "markdown",
-      "source": [
-        "## Manipulating Data\n",
-        "\n",
-        "You may apply Python's basic arithmetic operations to `Series`. For example:"
-      ]
-    },
-    {
-      "metadata": {
-        "colab_type": "code",
-        "id": "XWmyCFJ5bOv-",
-        "colab": {}
-      },
-      "cell_type": "code",
-      "source": [
-        "population / 1000."
-      ],
-      "execution_count": 0,
-      "outputs": []
-    },
-    {
-      "metadata": {
-        "colab_type": "text",
-        "id": "TQzIVnbnmWGM"
-      },
-      "cell_type": "markdown",
-      "source": [
-        "[NumPy](http://www.numpy.org/) is a popular toolkit for scientific computing. *pandas* `Series` can be used as arguments to most NumPy functions:"
-      ]
-    },
-    {
-      "metadata": {
-        "colab_type": "code",
-        "id": "ko6pLK6JmkYP",
-        "colab": {}
-      },
-      "cell_type": "code",
-      "source": [
-        "import numpy as np\n",
-        "\n",
-        "np.log(population)"
-      ],
-      "execution_count": 0,
-      "outputs": []
-    },
-    {
-      "metadata": {
-        "colab_type": "text",
-        "id": "xmxFuQmurr6d"
-      },
-      "cell_type": "markdown",
-      "source": [
-        "For more complex single-column transformations, you can use `Series.apply`. Like the Python [map function](https://docs.python.org/2/library/functions.html#map), \n",
-        "`Series.apply` accepts as an argument a [lambda function](https://docs.python.org/2/tutorial/controlflow.html#lambda-expressions), which is applied to each value.\n",
-        "\n",
-        "The example below creates a new `Series` that indicates whether `population` is over one million:"
-      ]
-    },
-    {
-      "metadata": {
-        "colab_type": "code",
-        "id": "Fc1DvPAbstjI",
-        "colab": {}
-      },
-      "cell_type": "code",
-      "source": [
-        "population.apply(lambda val: val > 1000000)"
-      ],
-      "execution_count": 0,
-      "outputs": []
-    },
-    {
-      "metadata": {
-        "colab_type": "text",
-        "id": "ZeYYLoV9b9fB"
-      },
-      "cell_type": "markdown",
-      "source": [
-        "\n",
-        "Modifying `DataFrames` is also straightforward. For example, the following code adds two `Series` to an existing `DataFrame`:"
-      ]
-    },
-    {
-      "metadata": {
-        "colab_type": "code",
-        "id": "0gCEX99Hb8LR",
-        "colab": {}
-      },
-      "cell_type": "code",
-      "source": [
-        "cities['Area square miles'] = pd.Series([46.87, 176.53, 97.92])\n",
-        "cities['Population density'] = cities['Population'] / cities['Area square miles']\n",
-        "cities"
-      ],
-      "execution_count": 0,
-      "outputs": []
-    },
-    {
-      "metadata": {
-        "colab_type": "text",
-        "id": "6qh63m-ayb-c"
-      },
-      "cell_type": "markdown",
-      "source": [
-        "## Exercise #1\n",
-        "\n",
-        "Modify the `cities` table by adding a new boolean column that is True if and only if *both* of the following are True:\n",
-        "\n",
-        "  * The city is named after a saint.\n",
-        "  * The city has an area greater than 50 square miles.\n",
-        "\n",
-        "**Note:** Boolean `Series` are combined using the bitwise, rather than the traditional boolean, operators. For example, when performing *logical and*, use `&` instead of `and`.\n",
-        "\n",
-        "**Hint:** \"San\" in Spanish means \"saint.\""
-      ]
-    },
-    {
-      "metadata": {
-        "colab_type": "code",
-        "id": "zCOn8ftSyddH",
-        "colab": {}
-      },
-      "cell_type": "code",
-      "source": [
-        "# Your code here"
-      ],
-      "execution_count": 0,
-      "outputs": []
-    },
-    {
-      "metadata": {
-        "colab_type": "text",
-        "id": "YHIWvc9Ms-Ll"
-      },
-      "cell_type": "markdown",
-      "source": [
-        "### Solution\n",
-        "\n",
-        "Click below for a solution."
-      ]
-    },
-    {
-      "metadata": {
-        "colab_type": "code",
-        "id": "T5OlrqtdtCIb",
-        "colab": {}
-      },
-      "cell_type": "code",
-      "source": [
-        "cities['Is wide and has saint name'] = (cities['Area square miles'] > 50) & cities['City name'].apply(lambda name: name.startswith('San'))\n",
-        "cities"
-      ],
-      "execution_count": 0,
-      "outputs": []
-    },
-    {
-      "metadata": {
-        "colab_type": "text",
-        "id": "f-xAOJeMiXFB"
-      },
-      "cell_type": "markdown",
-      "source": [
-        "## Indexes\n",
-        "Both `Series` and `DataFrame` objects also define an `index` property that assigns an identifier value to each `Series` item or `DataFrame` row. \n",
-        "\n",
-        "By default, at construction, *pandas* assigns index values that reflect the ordering of the source data. Once created, the index values are stable; that is, they do not change when data is reordered."
-      ]
-    },
-    {
-      "metadata": {
-        "colab_type": "code",
-        "id": "2684gsWNinq9",
-        "colab": {}
-      },
-      "cell_type": "code",
-      "source": [
-        "city_names.index"
-      ],
-      "execution_count": 0,
-      "outputs": []
-    },
-    {
-      "metadata": {
-        "colab_type": "code",
-        "id": "F_qPe2TBjfWd",
-        "colab": {}
-      },
-      "cell_type": "code",
-      "source": [
-        "cities.index"
-      ],
-      "execution_count": 0,
-      "outputs": []
-    },
-    {
-      "metadata": {
-        "colab_type": "text",
-        "id": "hp2oWY9Slo_h"
-      },
-      "cell_type": "markdown",
-      "source": [
-        "Call `DataFrame.reindex` to manually reorder the rows. For example, the following has the same effect as sorting by city name:"
-      ]
-    },
-    {
-      "metadata": {
-        "colab_type": "code",
-        "id": "sN0zUzSAj-U1",
-        "colab": {}
-      },
-      "cell_type": "code",
-      "source": [
-        "cities.reindex([2, 0, 1])"
-      ],
-      "execution_count": 0,
-      "outputs": []
-    },
-    {
-      "metadata": {
-        "colab_type": "text",
-        "id": "-GQFz8NZuS06"
-      },
-      "cell_type": "markdown",
-      "source": [
-        "Reindexing is a great way to shuffle (randomize) a `DataFrame`. In the example below, we take the index, which is array-like, and pass it to NumPy's `random.permutation` function, which shuffles its values in place. Calling `reindex` with this shuffled array causes the `DataFrame` rows to be shuffled in the same way.\n",
-        "Try running the following cell multiple times!"
-      ]
-    },
-    {
-      "metadata": {
-        "colab_type": "code",
-        "id": "mF8GC0k8uYhz",
-        "colab": {}
-      },
-      "cell_type": "code",
-      "source": [
-        "cities.reindex(np.random.permutation(cities.index))"
-      ],
-      "execution_count": 0,
-      "outputs": []
-    },
-    {
-      "metadata": {
-        "colab_type": "text",
-        "id": "fSso35fQmGKb"
-      },
-      "cell_type": "markdown",
-      "source": [
-        "For more information, see the [Index documentation](http://pandas.pydata.org/pandas-docs/stable/indexing.html#index-objects)."
-      ]
-    },
-    {
-      "metadata": {
-        "colab_type": "text",
-        "id": "8UngIdVhz8C0"
-      },
-      "cell_type": "markdown",
-      "source": [
-        "## Exercise #2\n",
-        "\n",
-        "The `reindex` method allows index values that are not in the original `DataFrame`'s index values. Try it and see what happens if you use such values! Why do you think this is allowed?"
-      ]
-    },
-    {
-      "metadata": {
-        "colab_type": "code",
-        "id": "PN55GrDX0jzO",
-        "colab": {}
-      },
-      "cell_type": "code",
-      "source": [
-        "# Your code here"
-      ],
-      "execution_count": 0,
-      "outputs": []
-    },
-    {
-      "metadata": {
-        "colab_type": "text",
-        "id": "TJffr5_Jwqvd"
-      },
-      "cell_type": "markdown",
-      "source": [
-        "### Solution\n",
-        "\n",
-        "Click below for the solution."
-      ]
-    },
-    {
-      "metadata": {
-        "colab_type": "text",
-        "id": "8oSvi2QWwuDH"
-      },
-      "cell_type": "markdown",
-      "source": [
-        "If your `reindex` input array includes values not in the original `DataFrame` index values, `reindex` will add new rows for these \"missing\" indices and populate all corresponding columns with `NaN` values:"
-      ]
-    },
-    {
-      "metadata": {
-        "colab_type": "code",
-        "id": "yBdkucKCwy4x",
-        "colab": {}
-      },
-      "cell_type": "code",
-      "source": [
-        "cities.reindex([0, 4, 5, 2])"
-      ],
-      "execution_count": 0,
-      "outputs": []
-    },
-    {
-      "metadata": {
-        "colab_type": "text",
-        "id": "2l82PhPbwz7g"
-      },
-      "cell_type": "markdown",
-      "source": [
-        "This behavior is desirable because indexes are often strings pulled from the actual data (see the [*pandas* reindex\n",
-        "documentation](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.reindex.html) for an example\n",
-        "in which the index values are browser names).\n",
-        "\n",
-        "In this case, allowing \"missing\" indices makes it easy to reindex using an external list, as you don't have to worry about\n",
-        "sanitizing the input."
-      ]
-    }
-  ]
-}
\ No newline at end of file

From 693b53a3e972c663557431f4c4ce4fa25f99042c Mon Sep 17 00:00:00 2001
From: Chanchal Kumar Maji
 <31502077+ChanchalKumarMaji@users.noreply.github.com>
Date: Mon, 4 Feb 2019 00:41:09 +0530
Subject: [PATCH 5/5] Created using Colaboratory

---
 intro_to_pandas.ipynb | 1870 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 1870 insertions(+)
 create mode 100644 intro_to_pandas.ipynb

diff --git a/intro_to_pandas.ipynb b/intro_to_pandas.ipynb
new file mode 100644
index 0000000..472d30f
--- /dev/null
+++ b/intro_to_pandas.ipynb
@@ -0,0 +1,1870 @@
+{
+  "nbformat": 4,
+  "nbformat_minor": 0,
+  "metadata": {
+    "colab": {
+      "name": "intro_to_pandas.ipynb",
+      "version": "0.3.2",
+      "provenance": [],
+      "collapsed_sections": [],
+      "include_colab_link": true
+    },
+    "kernelspec": {
+      "name": "python3",
+      "display_name": "Python 3"
+    },
+    "accelerator": "GPU"
+  },
+  "cells": [
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "view-in-github",
+        "colab_type": "text"
+      },
+      "source": [
+        "<a href=\"https://colab.research.google.com/github/ChanchalKumarMaji/Assignment-5/blob/ChanchalKumarMaji/intro_to_pandas.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "JndnmDMp66FL"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "#### Copyright 2017 Google LLC."
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "hMqWDc_m6rUC",
+        "cellView": "both",
+        "colab": {}
+      },
+      "cell_type": "code",
+      "source": [
+        "# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
+        "# you may not use this file except in compliance with the License.\n",
+        "# You may obtain a copy of the License at\n",
+        "#\n",
+        "# https://www.apache.org/licenses/LICENSE-2.0\n",
+        "#\n",
+        "# Unless required by applicable law or agreed to in writing, software\n",
+        "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
+        "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
+        "# See the License for the specific language governing permissions and\n",
+        "# limitations under the License."
+      ],
+      "execution_count": 0,
+      "outputs": []
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "rHLcriKWLRe4"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "# Intro to pandas"
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "QvJBqX8_Bctk"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "**Learning Objectives:**\n",
+        "  * Gain an introduction to the `DataFrame` and `Series` data structures of the *pandas* library\n",
+        "  * Access and manipulate data within a `DataFrame` and `Series`\n",
+        "  * Import CSV data into a *pandas* `DataFrame`\n",
+        "  * Reindex a `DataFrame` to shuffle data"
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "TIFJ83ZTBctl"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "[*pandas*](http://pandas.pydata.org/) is a column-oriented data analysis API. It's a great tool for handling and analyzing input data, and many ML frameworks support *pandas* data structures as inputs.\n",
+        "Although a comprehensive introduction to the *pandas* API would span many pages, the core concepts are fairly straightforward, and we'll present them below. For a more complete reference, the [*pandas* docs site](http://pandas.pydata.org/pandas-docs/stable/index.html) contains extensive documentation and many tutorials."
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "s_JOISVgmn9v"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "## Basic Concepts\n",
+        "\n",
+        "The following cell imports the *pandas* API and displays the API version:"
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "aSRYu62xUi3g",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 34
+        },
+        "outputId": "b69fa038-4215-44f9-c5dd-2acc82452a93"
+      },
+      "cell_type": "code",
+      "source": [
+        "from __future__ import print_function\n",
+        "\n",
+        "import pandas as pd\n",
+        "pd.__version__"
+      ],
+      "execution_count": 2,
+      "outputs": [
+        {
+          "output_type": "execute_result",
+          "data": {
+            "text/plain": [
+              "'0.22.0'"
+            ]
+          },
+          "metadata": {
+            "tags": []
+          },
+          "execution_count": 2
+        }
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "daQreKXIUslr"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "The primary data structures in *pandas* are implemented as two classes:\n",
+        "\n",
+        "  * **`DataFrame`**, which you can imagine as a relational data table, with rows and named columns.\n",
+        "  * **`Series`**, which is a single column. A `DataFrame` contains one or more `Series` and a name for each `Series`.\n",
+        "\n",
+        "The data frame is a commonly used abstraction for data manipulation. Similar implementations exist in [Spark](https://spark.apache.org/) and [R](https://www.r-project.org/about.html)."
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "fjnAk1xcU0yc"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "One way to create a `Series` is to construct a `Series` object. For example:"
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "DFZ42Uq7UFDj",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 86
+        },
+        "outputId": "8acbe90f-6151-48ef-9870-ce0a5cc32a9c"
+      },
+      "cell_type": "code",
+      "source": [
+        "pd.Series(['San Francisco', 'San Jose', 'Sacramento'])"
+      ],
+      "execution_count": 3,
+      "outputs": [
+        {
+          "output_type": "execute_result",
+          "data": {
+            "text/plain": [
+              "0    San Francisco\n",
+              "1         San Jose\n",
+              "2       Sacramento\n",
+              "dtype: object"
+            ]
+          },
+          "metadata": {
+            "tags": []
+          },
+          "execution_count": 3
+        }
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "U5ouUp1cU6pC"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "You can create a `DataFrame` by passing a `dict` that maps column names (strings) to their respective `Series`. If the `Series` differ in length, the missing values are filled with special [NA/NaN](http://pandas.pydata.org/pandas-docs/stable/missing_data.html) values. Example:"
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "avgr6GfiUh8t",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 143
+        },
+        "outputId": "53e13300-254c-4d87-c4ca-5e021d787a12"
+      },
+      "cell_type": "code",
+      "source": [
+        "city_names = pd.Series(['San Francisco', 'San Jose', 'Sacramento'])\n",
+        "population = pd.Series([852469, 1015785, 485199])\n",
+        "\n",
+        "pd.DataFrame({ 'City name': city_names, 'Population': population })"
+      ],
+      "execution_count": 4,
+      "outputs": [
+        {
+          "output_type": "execute_result",
+          "data": {
+            "text/html": [
+              "<div>\n",
+              "<style scoped>\n",
+              "    .dataframe tbody tr th:only-of-type {\n",
+              "        vertical-align: middle;\n",
+              "    }\n",
+              "\n",
+              "    .dataframe tbody tr th {\n",
+              "        vertical-align: top;\n",
+              "    }\n",
+              "\n",
+              "    .dataframe thead th {\n",
+              "        text-align: right;\n",
+              "    }\n",
+              "</style>\n",
+              "<table border=\"1\" class=\"dataframe\">\n",
+              "  <thead>\n",
+              "    <tr style=\"text-align: right;\">\n",
+              "      <th></th>\n",
+              "      <th>City name</th>\n",
+              "      <th>Population</th>\n",
+              "    </tr>\n",
+              "  </thead>\n",
+              "  <tbody>\n",
+              "    <tr>\n",
+              "      <th>0</th>\n",
+              "      <td>San Francisco</td>\n",
+              "      <td>852469</td>\n",
+              "    </tr>\n",
+              "    <tr>\n",
+              "      <th>1</th>\n",
+              "      <td>San Jose</td>\n",
+              "      <td>1015785</td>\n",
+              "    </tr>\n",
+              "    <tr>\n",
+              "      <th>2</th>\n",
+              "      <td>Sacramento</td>\n",
+              "      <td>485199</td>\n",
+              "    </tr>\n",
+              "  </tbody>\n",
+              "</table>\n",
+              "</div>"
+            ],
+            "text/plain": [
+              "       City name  Population\n",
+              "0  San Francisco      852469\n",
+              "1       San Jose     1015785\n",
+              "2     Sacramento      485199"
+            ]
+          },
+          "metadata": {
+            "tags": []
+          },
+          "execution_count": 4
+        }
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "oa5wfZT7VHJl"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "Most of the time, however, you load an entire file into a `DataFrame`. Run the following cell to load a CSV file of California housing data and display summary statistics for each column:"
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "av6RYOraVG1V",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 320
+        },
+        "outputId": "4b8ef629-7bcd-449f-b0d6-c13bce927f2e"
+      },
+      "cell_type": "code",
+      "source": [
+        "california_housing_dataframe = pd.read_csv(\"https://download.mlcc.google.com/mledu-datasets/california_housing_train.csv\", sep=\",\")\n",
+        "california_housing_dataframe.describe()"
+      ],
+      "execution_count": 5,
+      "outputs": [
+        {
+          "output_type": "execute_result",
+          "data": {
+            "text/html": [
+              "<div>\n",
+              "<style scoped>\n",
+              "    .dataframe tbody tr th:only-of-type {\n",
+              "        vertical-align: middle;\n",
+              "    }\n",
+              "\n",
+              "    .dataframe tbody tr th {\n",
+              "        vertical-align: top;\n",
+              "    }\n",
+              "\n",
+              "    .dataframe thead th {\n",
+              "        text-align: right;\n",
+              "    }\n",
+              "</style>\n",
+              "<table border=\"1\" class=\"dataframe\">\n",
+              "  <thead>\n",
+              "    <tr style=\"text-align: right;\">\n",
+              "      <th></th>\n",
+              "      <th>longitude</th>\n",
+              "      <th>latitude</th>\n",
+              "      <th>housing_median_age</th>\n",
+              "      <th>total_rooms</th>\n",
+              "      <th>total_bedrooms</th>\n",
+              "      <th>population</th>\n",
+              "      <th>households</th>\n",
+              "      <th>median_income</th>\n",
+              "      <th>median_house_value</th>\n",
+              "    </tr>\n",
+              "  </thead>\n",
+              "  <tbody>\n",
+              "    <tr>\n",
+              "      <th>count</th>\n",
+              "      <td>17000.000000</td>\n",
+              "      <td>17000.000000</td>\n",
+              "      <td>17000.000000</td>\n",
+              "      <td>17000.000000</td>\n",
+              "      <td>17000.000000</td>\n",
+              "      <td>17000.000000</td>\n",
+              "      <td>17000.000000</td>\n",
+              "      <td>17000.000000</td>\n",
+              "      <td>17000.000000</td>\n",
+              "    </tr>\n",
+              "    <tr>\n",
+              "      <th>mean</th>\n",
+              "      <td>-119.562108</td>\n",
+              "      <td>35.625225</td>\n",
+              "      <td>28.589353</td>\n",
+              "      <td>2643.664412</td>\n",
+              "      <td>539.410824</td>\n",
+              "      <td>1429.573941</td>\n",
+              "      <td>501.221941</td>\n",
+              "      <td>3.883578</td>\n",
+              "      <td>207300.912353</td>\n",
+              "    </tr>\n",
+              "    <tr>\n",
+              "      <th>std</th>\n",
+              "      <td>2.005166</td>\n",
+              "      <td>2.137340</td>\n",
+              "      <td>12.586937</td>\n",
+              "      <td>2179.947071</td>\n",
+              "      <td>421.499452</td>\n",
+              "      <td>1147.852959</td>\n",
+              "      <td>384.520841</td>\n",
+              "      <td>1.908157</td>\n",
+              "      <td>115983.764387</td>\n",
+              "    </tr>\n",
+              "    <tr>\n",
+              "      <th>min</th>\n",
+              "      <td>-124.350000</td>\n",
+              "      <td>32.540000</td>\n",
+              "      <td>1.000000</td>\n",
+              "      <td>2.000000</td>\n",
+              "      <td>1.000000</td>\n",
+              "      <td>3.000000</td>\n",
+              "      <td>1.000000</td>\n",
+              "      <td>0.499900</td>\n",
+              "      <td>14999.000000</td>\n",
+              "    </tr>\n",
+              "    <tr>\n",
+              "      <th>25%</th>\n",
+              "      <td>-121.790000</td>\n",
+              "      <td>33.930000</td>\n",
+              "      <td>18.000000</td>\n",
+              "      <td>1462.000000</td>\n",
+              "      <td>297.000000</td>\n",
+              "      <td>790.000000</td>\n",
+              "      <td>282.000000</td>\n",
+              "      <td>2.566375</td>\n",
+              "      <td>119400.000000</td>\n",
+              "    </tr>\n",
+              "    <tr>\n",
+              "      <th>50%</th>\n",
+              "      <td>-118.490000</td>\n",
+              "      <td>34.250000</td>\n",
+              "      <td>29.000000</td>\n",
+              "      <td>2127.000000</td>\n",
+              "      <td>434.000000</td>\n",
+              "      <td>1167.000000</td>\n",
+              "      <td>409.000000</td>\n",
+              "      <td>3.544600</td>\n",
+              "      <td>180400.000000</td>\n",
+              "    </tr>\n",
+              "    <tr>\n",
+              "      <th>75%</th>\n",
+              "      <td>-118.000000</td>\n",
+              "      <td>37.720000</td>\n",
+              "      <td>37.000000</td>\n",
+              "      <td>3151.250000</td>\n",
+              "      <td>648.250000</td>\n",
+              "      <td>1721.000000</td>\n",
+              "      <td>605.250000</td>\n",
+              "      <td>4.767000</td>\n",
+              "      <td>265000.000000</td>\n",
+              "    </tr>\n",
+              "    <tr>\n",
+              "      <th>max</th>\n",
+              "      <td>-114.310000</td>\n",
+              "      <td>41.950000</td>\n",
+              "      <td>52.000000</td>\n",
+              "      <td>37937.000000</td>\n",
+              "      <td>6445.000000</td>\n",
+              "      <td>35682.000000</td>\n",
+              "      <td>6082.000000</td>\n",
+              "      <td>15.000100</td>\n",
+              "      <td>500001.000000</td>\n",
+              "    </tr>\n",
+              "  </tbody>\n",
+              "</table>\n",
+              "</div>"
+            ],
+            "text/plain": [
+              "          longitude      latitude  housing_median_age   total_rooms  \\\n",
+              "count  17000.000000  17000.000000        17000.000000  17000.000000   \n",
+              "mean    -119.562108     35.625225           28.589353   2643.664412   \n",
+              "std        2.005166      2.137340           12.586937   2179.947071   \n",
+              "min     -124.350000     32.540000            1.000000      2.000000   \n",
+              "25%     -121.790000     33.930000           18.000000   1462.000000   \n",
+              "50%     -118.490000     34.250000           29.000000   2127.000000   \n",
+              "75%     -118.000000     37.720000           37.000000   3151.250000   \n",
+              "max     -114.310000     41.950000           52.000000  37937.000000   \n",
+              "\n",
+              "       total_bedrooms    population    households  median_income  \\\n",
+              "count    17000.000000  17000.000000  17000.000000   17000.000000   \n",
+              "mean       539.410824   1429.573941    501.221941       3.883578   \n",
+              "std        421.499452   1147.852959    384.520841       1.908157   \n",
+              "min          1.000000      3.000000      1.000000       0.499900   \n",
+              "25%        297.000000    790.000000    282.000000       2.566375   \n",
+              "50%        434.000000   1167.000000    409.000000       3.544600   \n",
+              "75%        648.250000   1721.000000    605.250000       4.767000   \n",
+              "max       6445.000000  35682.000000   6082.000000      15.000100   \n",
+              "\n",
+              "       median_house_value  \n",
+              "count        17000.000000  \n",
+              "mean        207300.912353  \n",
+              "std         115983.764387  \n",
+              "min          14999.000000  \n",
+              "25%         119400.000000  \n",
+              "50%         180400.000000  \n",
+              "75%         265000.000000  \n",
+              "max         500001.000000  "
+            ]
+          },
+          "metadata": {
+            "tags": []
+          },
+          "execution_count": 5
+        }
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "WrkBjfz5kEQu"
+      },
+      "cell_type": "markdown",
+      "source": [
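+        "The example above used `DataFrame.describe` to show summary statistics (count, mean, standard deviation, minimum, quartiles, and maximum) for each numeric column of a `DataFrame`. Another useful function is `DataFrame.head`, which displays the first few records (five by default) of a `DataFrame`:"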
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "s3ND3bgOkB5k",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 226
+        },
+        "outputId": "a0227da6-863c-453e-c83e-4e69b4657896"
+      },
+      "cell_type": "code",
+      "source": [
+        "california_housing_dataframe.head()"
+      ],
+      "execution_count": 6,
+      "outputs": [
+        {
+          "output_type": "execute_result",
+          "data": {
+            "text/html": [
+              "<div>\n",
+              "<style scoped>\n",
+              "    .dataframe tbody tr th:only-of-type {\n",
+              "        vertical-align: middle;\n",
+              "    }\n",
+              "\n",
+              "    .dataframe tbody tr th {\n",
+              "        vertical-align: top;\n",
+              "    }\n",
+              "\n",
+              "    .dataframe thead th {\n",
+              "        text-align: right;\n",
+              "    }\n",
+              "</style>\n",
+              "<table border=\"1\" class=\"dataframe\">\n",
+              "  <thead>\n",
+              "    <tr style=\"text-align: right;\">\n",
+              "      <th></th>\n",
+              "      <th>longitude</th>\n",
+              "      <th>latitude</th>\n",
+              "      <th>housing_median_age</th>\n",
+              "      <th>total_rooms</th>\n",
+              "      <th>total_bedrooms</th>\n",
+              "      <th>population</th>\n",
+              "      <th>households</th>\n",
+              "      <th>median_income</th>\n",
+              "      <th>median_house_value</th>\n",
+              "    </tr>\n",
+              "  </thead>\n",
+              "  <tbody>\n",
+              "    <tr>\n",
+              "      <th>0</th>\n",
+              "      <td>-114.31</td>\n",
+              "      <td>34.19</td>\n",
+              "      <td>15.0</td>\n",
+              "      <td>5612.0</td>\n",
+              "      <td>1283.0</td>\n",
+              "      <td>1015.0</td>\n",
+              "      <td>472.0</td>\n",
+              "      <td>1.4936</td>\n",
+              "      <td>66900.0</td>\n",
+              "    </tr>\n",
+              "    <tr>\n",
+              "      <th>1</th>\n",
+              "      <td>-114.47</td>\n",
+              "      <td>34.40</td>\n",
+              "      <td>19.0</td>\n",
+              "      <td>7650.0</td>\n",
+              "      <td>1901.0</td>\n",
+              "      <td>1129.0</td>\n",
+              "      <td>463.0</td>\n",
+              "      <td>1.8200</td>\n",
+              "      <td>80100.0</td>\n",
+              "    </tr>\n",
+              "    <tr>\n",
+              "      <th>2</th>\n",
+              "      <td>-114.56</td>\n",
+              "      <td>33.69</td>\n",
+              "      <td>17.0</td>\n",
+              "      <td>720.0</td>\n",
+              "      <td>174.0</td>\n",
+              "      <td>333.0</td>\n",
+              "      <td>117.0</td>\n",
+              "      <td>1.6509</td>\n",
+              "      <td>85700.0</td>\n",
+              "    </tr>\n",
+              "    <tr>\n",
+              "      <th>3</th>\n",
+              "      <td>-114.57</td>\n",
+              "      <td>33.64</td>\n",
+              "      <td>14.0</td>\n",
+              "      <td>1501.0</td>\n",
+              "      <td>337.0</td>\n",
+              "      <td>515.0</td>\n",
+              "      <td>226.0</td>\n",
+              "      <td>3.1917</td>\n",
+              "      <td>73400.0</td>\n",
+              "    </tr>\n",
+              "    <tr>\n",
+              "      <th>4</th>\n",
+              "      <td>-114.57</td>\n",
+              "      <td>33.57</td>\n",
+              "      <td>20.0</td>\n",
+              "      <td>1454.0</td>\n",
+              "      <td>326.0</td>\n",
+              "      <td>624.0</td>\n",
+              "      <td>262.0</td>\n",
+              "      <td>1.9250</td>\n",
+              "      <td>65500.0</td>\n",
+              "    </tr>\n",
+              "  </tbody>\n",
+              "</table>\n",
+              "</div>"
+            ],
+            "text/plain": [
+              "   longitude  latitude  housing_median_age  total_rooms  total_bedrooms  \\\n",
+              "0    -114.31     34.19                15.0       5612.0          1283.0   \n",
+              "1    -114.47     34.40                19.0       7650.0          1901.0   \n",
+              "2    -114.56     33.69                17.0        720.0           174.0   \n",
+              "3    -114.57     33.64                14.0       1501.0           337.0   \n",
+              "4    -114.57     33.57                20.0       1454.0           326.0   \n",
+              "\n",
+              "   population  households  median_income  median_house_value  \n",
+              "0      1015.0       472.0         1.4936             66900.0  \n",
+              "1      1129.0       463.0         1.8200             80100.0  \n",
+              "2       333.0       117.0         1.6509             85700.0  \n",
+              "3       515.0       226.0         3.1917             73400.0  \n",
+              "4       624.0       262.0         1.9250             65500.0  "
+            ]
+          },
+          "metadata": {
+            "tags": []
+          },
+          "execution_count": 6
+        }
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "w9-Es5Y6laGd"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "Another powerful feature of *pandas* is graphing. For example, `DataFrame.hist` lets you quickly study the distribution of values in a column:"
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "nqndFVXVlbPN",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 397
+        },
+        "outputId": "edaf5f93-5bce-44a6-c30f-4435f896ea23"
+      },
+      "cell_type": "code",
+      "source": [
+        "california_housing_dataframe.hist('housing_median_age')"
+      ],
+      "execution_count": 7,
+      "outputs": [
+        {
+          "output_type": "execute_result",
+          "data": {
+            "text/plain": [
+              "array([[<matplotlib.axes._subplots.AxesSubplot object at 0x7f76f977f6a0>]],\n",
+              "      dtype=object)"
+            ]
+          },
+          "metadata": {
+            "tags": []
+          },
+          "execution_count": 7
+        },
+        {
+          "output_type": "display_data",
+          "data": {
+            "image/png": "iVBORw0KGgoAAAANSUhEUgAAAeoAAAFZCAYAAABXM2zhAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4yLCBo\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvOIA7rQAAIABJREFUeJzt3X1UlHX+//HXMDAH0UEEGTfLarf0\naEmaa5l4U0Iokp7IVRPWdU3q6Iqtlql499WTlajRmmZZmunRU7GNtofcAjJxyyRanT0uuu0p2VOr\neTejKCqgSPP7o9Os/FRguP1Az8dfcTEz1+d6H+3pdQ1zYfF6vV4BAAAjBTT3AgAAwPURagAADEao\nAQAwGKEGAMBghBoAAIMRagAADEaogVo6cuSI7rjjjkbdxz//+U+lpKQ06j4a0h133KEjR47o448/\n1ty5c5t7OUCrZOFz1EDtHDlyREOHDtW//vWv5l6KMe644w7l5ubqpptuau6lAK0WZ9SAn5xOp0aO\nHKn7779f27dv1w8//KA//elPio+PV3x8vNLS0lRaWipJiomJ0d69e33P/enry5cva/78+Ro2bJji\n4uI0bdo0nT9/XgUFBYqLi5MkrV69Ws8++6xSU1MVGxur0aNH6+TJk5KkgwcPaujQoRo6dKheeeUV\njRw5UgUFBdWue/Xq1Vq0aJEmT56sgQMHatasWcrLy9OoUaM0cOBA5eXlSZIuXbqk5557TsOGDVNM\nTIzWrl3re42//e1viouL0/Dhw7V+/Xrf9m3btmnixImSJI/Ho5SUFMXHxysmJkZvvfVWleN/9913\nNXr0aA0cOFDp6ek1zrusrEwzZszwrWfZsmW+71U3hx07dmjkyJGKjY3VpEmTdPr06Rr3BZiIUAN+\n+OGHH1RRUaEPPvhAc+fO1cqVK/XRRx/p008/1bZt2/TXv/5VJSUl2rhxY7Wvs3v3bh05ckTZ2dnK\nzc3V7bffrn/84x9XPS47O1vz5s3Tjh07FBERoa1bt0qSFi5cqIkTJyo3N1ft2rXTt99+W6v179q1\nSy+88II++OADZWdn+9Y9ZcoUrVu3TpK0bt06HTp0SB988IG2b9+unJwc5eXlqbKyUvPnz9eiRYv0\n0UcfKSAgQJWVlVft47XXXtNNN92k7Oxsbdq0SRkZGTp27Jjv+3//+9+VmZmprVu3asuWLTp+/Hi1\na37nnXd04cIFZWdn6/3339e2bdt8//i53hwOHz6s2bNnKyMjQ5988on69eunxYsX12pGgGkINeAH\nr9erxMREST9e9j1+/Lh27dqlxMREhYSEyGq1atSoUfr888+rfZ3w8HAVFRXp448/9p0xDho06KrH\n9e3bVzfeeKMsFot69OihY8eOqby8XAcPHtSIESMkSb/97W9V23ew7r77bkVERKhDhw6KjIzU4MGD\nJUndunXzna3n5eUpOTlZNptNISEhevjhh5Wbm6tvv/1Wly5d0sCBAyVJjzzyyDX3sWDBAi1cuFCS\n1KVLF0VGRurIkSO+748cOVJWq1WdOnVSRERElYhfy6RJk/Tqq6/KYrGoffv26tq1q44cOVLtHD79\n9FPde++96tatmyRp3Lhx2rlz5zX/YQGYLrC5FwC0JFarVW3atJEkBQQE6IcfftDp06fVvn1732Pa\nt2+vU6dOVfs6d911lxYsWKDNmzdrzpw5iomJ0aJFi656nN1ur7LvyspKnT17VhaLRaGhoZKkoKAg\nRURE1Gr9bdu2rfJ6ISEhVY5Fks6dO6elS5fqpZdekvTjpfC77rpLZ8+eVbt27aoc57UUFhb6zqID\nAgLkdrt9ry2pymv8dEzV+fbbb5Wenq7//Oc/CggI0PHjxzVq1Khq53Du3Dnt3btX8fHxVfZ75syZ\nWs8KMAWhBuqpY8eOOnPmjO/
rM2fOqGPHjpKqBlCSzp496/vvn97TPnPmjObNm6c333xT0dHRNe6v\nXbt28nq9KisrU5s2bXT58uUGff/V4XBo0qRJGjJkSJXtRUVFOn/+vO/r6+1z1qxZ+v3vf6+kpCRZ\nLJZrXinwx7PPPqs777xTa9askdVq1bhx4yRVPweHw6Ho6GitWrWqXvsGTMClb6CeHnjgAWVlZams\nrEyXL1+W0+nU/fffL0mKjIzUv//9b0nShx9+qIsXL0qStm7dqjVr1kiSwsLC9Ktf/arW+2vbtq1u\nu+02ffTRR5KkzMxMWSyWBjue2NhYvffee6qsrJTX69Wrr76qTz/9VDfffLOsVqvvh7W2bdt2zf2e\nOnVKPXv2lMVi0fvvv6+ysjLfD9fVxalTp9SjRw9ZrVZ9/vnn+u6771RaWlrtHAYOHKi9e/fq8OHD\nkn782Ntzzz1X5zUAzYlQA/UUHx+vwYMHa9SoURoxYoR+8YtfaMKECZKkqVOnauPGjRoxYoSKiop0\n++23S/oxhj/9xPLw4cN16NAhPfbYY7Xe56JFi7R27Vo99NBDKi0tVadOnRos1snJyercubMeeugh\nxcfHq6ioSL/+9a8VFBSkJUuWaN68eRo+fLgsFovv0vmVpk+frtTUVI0cOVKlpaV69NFHtXDhQv33\nv/+t03r+8Ic/aNmyZRoxYoS+/PJLTZs2TatXr9a+ffuuOweHw6ElS5YoNTVVw4cP17PPPquEhIT6\njgZoFnyOGmihvF6vL8733XefNm7cqO7duzfzqpoec0Brxxk10AL98Y9/9H2cKj8/X16vV7feemvz\nLqoZMAf8HHBGDbRARUVFmjt3rs6ePaugoCDNmjVLN910k1JTU6/5+Ntuu833nrhpioqK6rzua83h\np58PAFoLQg0AgMG49A0AgMEINQAABjPyhidu9zm/Ht+hQ4iKi+v+Oc2fO+ZXd8yufphf3TG7+jFt\nfpGR9ut+r1WcUQcGWpt7CS0a86s7Zlc/zK/umF39tKT5tYpQAwDQWhFqAAAMRqgBADBYjT9MVlZW\nprS0NJ06dUoXL17U1KlT1b17d82ePVuVlZWKjIzUihUrZLPZlJWVpU2bNikgIEBjx47VmDFjVFFR\nobS0NB09elRWq1VLly5Vly5dmuLYAABo8Wo8o87Ly1PPnj21ZcsWrVy5Uunp6Vq1apWSk5P19ttv\n65ZbbpHT6VRpaanWrFmjjRs3avPmzdq0aZPOnDmj7du3KzQ0VO+8846mTJmijIyMpjguAABahRpD\nnZCQoCeeeEKSdOzYMXXq1EkFBQWKjY2VJA0ZMkT5+fnav3+/oqKiZLfbFRwcrD59+sjlcik/P19x\ncXGSpOjoaLlcrkY8HAAAWpdaf4563LhxOn78uNauXavHHntMNptNkhQRESG32y2Px6Pw8HDf48PD\nw6/aHhAQIIvFokuXLvmeDwAArq/WoX733Xf11VdfadasWbry9uDXu1W4v9uv1KFDiN+fcavuw+Ko\nGfOrO2ZXP8yv7phd/bSU+dUY6gMHDigiIkI33HCDevToocrKSrVt21bl5eUKDg7WiRMn5HA45HA4\n5PF4fM87efKkevfuLYfDIbfbre7du6uiokJer7fGs2l/7xYTGWn3+25m+B/mV3fMrn6YX90xu/ox\nbX71ujPZ3r17tWHDBkmSx+NRaWmpoqOjlZOTI0nKzc3VoEGD1KtXLxUWFqqkpEQXLlyQy+VS3759\nNWDAAGVnZ0v68QfT+vXr1xDHBADAz0KNZ9Tjxo3T/PnzlZycrPLycv3f//2fevbsqTlz5igzM1Od\nO3dWYmKigoKCNHPmTKWkpMhisSg1NVV2u10JCQnas2ePkpKSZLPZlJ6e3hTHBQBAq2Dk76P293KE\naZcwWhrmV3fMrn6YX90xu/oxbX7VXfo28rdnAcC1TErf2dxLqNGGtJjmXgJaGW4hCgCAwQg1A
AAG\nI9QAABiMUAMAYDBCDQCAwQg1AAAGI9QAABiMUAMAYDBCDQCAwQg1AAAGI9QAABiMUAMAYDBCDQCA\nwQg1AAAGI9QAABiMUAMAYDBCDQCAwQg1AAAGI9QAABiMUAMAYDBCDQCAwQg1AAAGI9QAABiMUAMA\nYDBCDQCAwQg1AAAGI9QAABiMUAMAYDBCDQCAwQg1AAAGI9QAABiMUAMAYDBCDQCAwQg1AAAGI9QA\nABiMUAMAYDBCDQCAwQg1AAAGC6zNg5YvX659+/bp8uXLmjx5snbu3KmDBw8qLCxMkpSSkqIHHnhA\nWVlZ2rRpkwICAjR27FiNGTNGFRUVSktL09GjR2W1WrV06VJ16dKlUQ8KAIDWosZQf/HFF/rmm2+U\nmZmp4uJiPfLII7rvvvv09NNPa8iQIb7HlZaWas2aNXI6nQoKCtLo0aMVFxenvLw8hYaGKiMjQ7t3\n71ZGRoZWrlzZqAcFAEBrUeOl73vuuUcvv/yyJCk0NFRlZWWqrKy86nH79+9XVFSU7Ha7goOD1adP\nH7lcLuXn5ysuLk6SFB0dLZfL1cCHAABA61VjqK1Wq0JCQiRJTqdTgwcPltVq1ZYtWzRhwgQ99dRT\nOn36tDwej8LDw33PCw8Pl9vtrrI9ICBAFotFly5daqTDAQCgdanVe9SStGPHDjmdTm3YsEEHDhxQ\nWFiYevTooTfeeEOvvPKK7r777iqP93q913yd622/UocOIQoMtNZ2aZKkyEi7X49HVcyv7phd/bS2\n+TXl8bS22TW1ljK/WoX6s88+09q1a7V+/XrZ7Xb179/f972YmBgtXrxYw4YNk8fj8W0/efKkevfu\nLYfDIbfbre7du6uiokJer1c2m63a/RUXl/p1EJGRdrnd5/x6Dv6H+dUds6uf1ji/pjqe1ji7pmTa\n/Kr7R0ONl77PnTun5cuX6/XXX/f9lPeTTz6pw4cPS5IKCgrUtWtX9erVS4WFhSopKdGFCxfkcrnU\nt29fDRgwQNnZ2ZKkvLw89evXryGOCQCAn4Uaz6g//PBDFRcXa8aMGb5to0aN0owZM9SmTRuFhIRo\n6dKlCg4O1syZM5WSkiKLxaLU1FTZ7XYlJCRoz549SkpKks1mU3p6eqMeEAAArYnFW5s3jZuYv5cj\nTLuE0dIwv7pjdvXj7/wmpe9sxNU0jA1pMU2yH/7s1Y9p86vXpW8AANB8CDUAAAYj1AAAGIxQAwBg\nMEINAIDBCDUAAAYj1AAAGIxQAwBgMEINAIDBCDUAAAYj1AAAGIxQAwBgMEINAIDBCDUAAAYj1AAA\nGIxQAwBgMEINAIDBCDUAAAYj1AAAGIxQAwBgMEINAIDBCDUAAAYLbO4FAA1lUvrO5l5CtTakxTT3\nEgC0QJxRAwBgMEINAIDBCDUAAAYj1AAAGIxQAwBgMEINAIDBCDUAAAYj1AAAGIxQAwBgMEINAIDB\nCDUAAAYj1AAAGIxQAwBgMEINAIDBCDUAAAbj91EDTcT035ct8TuzARNxRg0AgMFqdUa9fPly7du3\nT5cvX9bkyZMVFRWl2bNnq7KyUpGRkVqxYoVsNpuysrK0adMmBQQEaOzYsRozZowqKiqUlpamo0eP\nymq1aunSperSpUtjHxcAAK1CjaH+4osv9M033ygzM1PFxcV65JFH1L9/fyUnJ2v48OF66aWX5HQ6\nlZiYqDVr1sjpdCooKEijR49WXFyc8vLyFBoaqoyMDO3evVsZGRlauXJlUxwbAAAtXo2Xvu+55x69\n/PLLkqTQ0FCVlZWpoKBAsbGxkqQhQ4YoPz9f+/fvV1RUlOx2u4KDg9WnTx+5XC7l5+crLi5OkhQd\nHS2Xy9WIhwMAQOtS4xm11WpVSEiIJMnpdGrw4MHavXu3bDabJCkiIkJut1sej0fh4eG+54WHh1+1\nPSAgQBaLRZcuXfI9/1o6dAhRYKDVrwOJjLT79XhUxfwgN
c+fg9b2Z68pj6e1za6ptZT51fqnvnfs\n2CGn06kNGzZo6NChvu1er/eaj/d3+5WKi0truyxJPw7b7T7n13PwP8wPP2nqPwet8c9eUx1Pa5xd\nUzJtftX9o6FWP/X92Wefae3atVq3bp3sdrtCQkJUXl4uSTpx4oQcDoccDoc8Ho/vOSdPnvRtd7vd\nkqSKigp5vd5qz6YBAMD/1Bjqc+fOafny5Xr99dcVFhYm6cf3mnNyciRJubm5GjRokHr16qXCwkKV\nlJTowoULcrlc6tu3rwYMGKDs7GxJUl5envr169eIhwMAQOtS46XvDz/8UMXFxZoxY4ZvW3p6uhYs\nWKDMzEx17txZiYmJCgoK0syZM5WSkiKLxaLU1FTZ7XYlJCRoz549SkpKks1mU3p6eqMeEAAArUmN\noX700Uf16KOPXrX9rbfeumpbfHy84uPjq2z76bPTAADAf9xCFIBPS7jNKfBzwy1EAQAwGKEGAMBg\nhBoAAIMRagAADEaoAQAwGKEGAMBghBoAAIMRagAADEaoAQAwGHcmQ61wxyoAaB6cUQMAYDBCDQCA\nwQg1AAAGI9QAABiMUAMAYDBCDQCAwQg1AAAGI9QAABiMUAMAYDBCDQCAwQg1AAAGI9QAABiMUAMA\nYDBCDQCAwQg1AAAGI9QAABiMUAMAYDBCDQCAwQg1AAAGI9QAABgssLkXAADAlSal72zuJdRoQ1pM\nk+2LM2oAAAxGqAEAMBihBgDAYIQaAACDEWoAAAxGqAEAMBihBgDAYLX6HPXXX3+tqVOnauLEiRo/\nfrzS0tJ08OBBhYWFSZJSUlL0wAMPKCsrS5s2bVJAQIDGjh2rMWPGqKKiQmlpaTp69KisVquWLl2q\nLl26NOpBAUBz4TPAaGg1hrq0tFRLlixR//79q2x/+umnNWTIkCqPW7NmjZxOp4KCgjR69GjFxcUp\nLy9PoaGhysjI0O7du5WRkaGVK1c2/JEAANAK1Xjp22azad26dXI4HNU+bv/+/YqKipLdbldwcLD6\n9Okjl8ul/Px8xcXFSZKio6PlcrkaZuUAAPwM1BjqwMBABQcHX7V9y5YtmjBhgp566imdPn1aHo9H\n4eHhvu+Hh4fL7XZX2R4QECCLxaJLly414CEAANB61ele3w8//LDCwsLUo0cPvfHGG3rllVd09913\nV3mM1+u95nOvt/1KHTqEKDDQ6teaIiPtfj0eVTE/4OeDv+/115QzrFOor3y/OiYmRosXL9awYcPk\n8Xh820+ePKnevXvL4XDI7Xare/fuqqiokNfrlc1mq/b1i4tL/VpPZKRdbvc5/w4CPswP+Hnh73v9\nNfQMqwt/nT6e9eSTT+rw4cOSpIKCAnXt2lW9evVSYWGhSkpKdOHCBblcLvXt21cDBgxQdna2JCkv\nL0/9+vWryy4BAPhZqvGM+sCBA1q2bJm+//57BQYGKicnR+PHj9eMGTPUpk0bhYSEaOnSpQoODtbM\nmTOVkpIii8Wi1NRU2e12JSQkaM+ePUpKSpLNZlN6enpTHBcAAK1CjaHu2bOnNm/efNX2YcOGXbUt\nPj5e8fHxVbb99NlpAADgP+5MBgCAwQg1AAAGI9QAABiMUAMAYDBCDQCAwQg1AAAGI9QAABiMUAMA\nYDBCDQCAwQg1AAAGI9QAABiMUAMAYLA6/T5qAEDLNSl9Z3MvAX7gjBoAAIMRagAADEaoAQAwGKEG\nAMBghBoAAIMRagAADEaoAQAwGKEGAMBghBoAAIMRagAADEaoAQAwGKEGAMBghBoAAIMRagAADEao\nAQAwGKEGAMBghBoAAIMRagAADEaoAQAwGKEGAMBghBoAAIMRagAADEaoAQAwGKEGAMBghBoAAIMR\nagAADFarUH/99dd68MEHtWXLFknSsWPH9Lvf/U7JycmaPn26Ll26JEnKysrSb37zG40ZM0bvvfee\nJKmiokIzZ85UUlKSx
o8fr8OHDzfSoQAA0PrUGOrS0lItWbJE/fv3921btWqVkpOT9fbbb+uWW26R\n0+lUaWmp1qxZo40bN2rz5s3atGmTzpw5o+3btys0NFTvvPOOpkyZooyMjEY9IAAAWpMaQ22z2bRu\n3To5HA7ftoKCAsXGxkqShgwZovz8fO3fv19RUVGy2+0KDg5Wnz595HK5lJ+fr7i4OElSdHS0XC5X\nIx0KAACtT42hDgwMVHBwcJVtZWVlstlskqSIiAi53W55PB6Fh4f7HhMeHn7V9oCAAFksFt+lcgAA\nUL3A+r6A1+ttkO1X6tAhRIGBVr/WERlp9+vxqIr5AUDtNeX/M+sU6pCQEJWXlys4OFgnTpyQw+GQ\nw+GQx+PxPebkyZPq3bu3HA6H3G63unfvroqKCnm9Xt/Z+PUUF5f6tZ7ISLvc7nN1ORSI+QGAvxr6\n/5nVhb9OH8+Kjo5WTk6OJCk3N1eDBg1Sr169VFhYqJKSEl24cEEul0t9+/bVgAEDlJ2dLUnKy8tT\nv3796rJLAAB+lmo8oz5w4ICWLVum77//XoGBgcrJydGLL76otLQ0ZWZmqnPnzkpMTFRQUJBmzpyp\nlJQUWSwWpaamym63KyEhQXv27FFSUpJsNpvS09Ob4rgAAGgVLN7avGncxPy9pMCl2/qpzfwmpe9s\notUAgPk2pMU06Os1+KVvAADQNOr9U99oGJyxAgCuhTNqAAAMRqgBADAYoQYAwGCEGgAAgxFqAAAM\nRqgBADAYoQYAwGCEGgAAgxFqAAAMRqgBADAYoQYAwGCEGgAAgxFqAAAMRqgBADAYoQYAwGCEGgAA\ngxFqAAAMRqgBADAYoQYAwGCEGgAAgxFqAAAMRqgBADAYoQYAwGCEGgAAgxFqAAAMRqgBADAYoQYA\nwGCEGgAAgxFqAAAMRqgBADAYoQYAwGCEGgAAgxFqAAAMFtjcC2gKk9J3NvcSAACoE86oAQAwGKEG\nAMBghBoAAIMRagAADFanHyYrKCjQ9OnT1bVrV0lSt27d9Pjjj2v27NmqrKxUZGSkVqxYIZvNpqys\nLG3atEkBAQEaO3asxowZ06AHAABAa1bnn/q+9957tWrVKt/Xc+fOVXJysoYPH66XXnpJTqdTiYmJ\nWrNmjZxOp4KCgjR69GjFxcUpLCysQRYPAEBr12CXvgsKChQbGytJGjJkiPLz87V//35FRUXJbrcr\nODhYffr0kcvlaqhdAgDQ6tX5jPrQoUOaMmWKzp49q2nTpqmsrEw2m02SFBERIbfbLY/Ho/DwcN9z\nwsPD5Xa7a3ztDh1CFBho9Ws9kZF2/w4AAIA6asrm1CnUt956q6ZNm6bhw4fr8OHDmjBhgiorK33f\n93q913ze9bb//4qLS/1aT2SkXW73Ob+eAwBAXTV0c6oLf50ufXfq1EkJCQmyWCy6+eab1bFjR509\ne1bl5eWSpBMnTsjhcMjhcMjj8fied/LkSTkcjrrsEgCAn6U6hTorK0tvvvmmJMntduvUqVMaNWqU\ncnJyJEm5ubkaNGiQevXqpcLCQpWUlOjChQtyuVzq27dvw60eAIBWrk6XvmNiYvTMM8/ok08+UUVF\nhRYvXqwePXpozpw5yszMVOfOnZWYmKigoCDNnDlTKSkpslgsSk1Nld3Oe8kAANSWxVvbN46bkL/X\n/mt6j5pfygEAaEgb0mIa9PUa/D1qAADQNAg1AAAGI9QAABiMUAMAYDBCDQCAwQg1AAAGI9QAABiM\nUAMAYDBCDQCAwQg1AAAGI9QAABiMUAMAYDBCDQCAwQg1AAAGI9QAABiMUAMAYDBCDQCAwQg1AAAG\nI9QAABiMUAMAYDBCDQCAwQg1AAAGI9QAABiMUAMAYDBCDQCAwQg1AAAGI9QAABiMUAMAYDBCDQCA\nwQg1AAAGI9QAABiMUAMAYDBCDQCAwQg1AAAGI9QAABiMUAMAYDBCDQCAwQg1AAAGI9Q
AABgssCl2\n8sILL2j//v2yWCyaN2+e7rrrrqbYLQAALV6jh/rLL7/Ud999p8zMTBUVFWnevHnKzMxs7N0CANAq\nNPql7/z8fD344IOSpNtuu01nz57V+fPnG3u3AAC0Co0eao/How4dOvi+Dg8Pl9vtbuzdAgDQKjTJ\ne9RX8nq9NT4mMtLu9+tW95wPMh72+/UAADBBo59ROxwOeTwe39cnT55UZGRkY+8WAIBWodFDPWDA\nAOXk5EiSDh48KIfDoXbt2jX2bgEAaBUa/dJ3nz59dOedd2rcuHGyWCxatGhRY+8SAIBWw+KtzZvG\nAACgWXBnMgAADEaoAQAwWJN/PKuhcXtS/3399deaOnWqJk6cqPHjx+vYsWOaPXu2KisrFRkZqRUr\nVshmszX3Mo20fPly7du3T5cvX9bkyZMVFRXF7GqhrKxMaWlpOnXqlC5evKipU6eqe/fuzM5P5eXl\nGjFihKZOnar+/fszv1oqKCjQ9OnT1bVrV0lSt27d9Pjjj7eY+bXoM+orb0/6/PPP6/nnn2/uJRmv\ntLRUS5YsUf/+/X3bVq1apeTkZL399tu65ZZb5HQ6m3GF5vriiy/0zTffKDMzU+vXr9cLL7zA7Gop\nLy9PPXv21JYtW7Ry5Uqlp6czuzp47bXX1L59e0n8vfXXvffeq82bN2vz5s1auHBhi5pfiw41tyf1\nn81m07p16+RwOHzbCgoKFBsbK0kaMmSI8vPzm2t5Rrvnnnv08ssvS5JCQ0NVVlbG7GopISFBTzzx\nhCTp2LFj6tSpE7PzU1FRkQ4dOqQHHnhAEn9v66slza9Fh5rbk/ovMDBQwcHBVbaVlZX5LvlEREQw\nw+uwWq0KCQmRJDmdTg0ePJjZ+WncuHF65plnNG/ePGbnp2XLliktLc33NfPzz6FDhzRlyhQlJSXp\n888/b1Hza/HvUV+JT5rVHzOs2Y4dO+R0OrVhwwYNHTrUt53Z1ezdd9/VV199pVmzZlWZF7Or3l/+\n8hf17t1bXbp0ueb3mV/1br31Vk2bNk3Dhw/X4cOHNWHCBFVWVvq+b/r8WnSouT1pwwgJCVF5ebmC\ng4N14sSJKpfFUdVnn32mtWvXav369bLb7cyulg4cOKCIiAjdcMMN6tGjhyorK9W2bVtmV0u7du3S\n4cOHtWvXLh0/flw2m40/e37o1KmTEhISJEk333yzOnbsqMLCwhYzvxZ96ZvbkzaM6Oho3xxzc3M1\naNCgZl6Rmc6dO6fly5fr9ddfV1hYmCRmV1t79+7Vhg0bJP34llVpaSmz88PKlSu1detW/fnPf9aY\nMWM0depU5ueHrKwsvfnmm5Ikt9utU6dOadSoUS1mfi3+zmQvvvii9u7d67s9affu3Zt7SUY7cOCA\nli1bpu+//16BgYHq1KmTXnytKYqYAAAArElEQVTxRaWlpenixYvq3Lmzli5dqqCgoOZeqnEyMzO1\nevVq/fKXv/RtS09P14IFC5hdDcrLyzV//nwdO3ZM5eXlmjZtmnr27Kk5c+YwOz+tXr1aN954owYO\nHMj8aun8+fN65plnVFJSooqKCk2bNk09evRoMfNr8aEGAKA1a9GXvgEAaO0INQAABiPUAAAYjFAD\nAGAwQg0AgMEINQAABiPUAAAYjFADAGCw/wdkB5RjykY3PgAAAABJRU5ErkJggg==\n",
+            "text/plain": [
+              "<Figure size 576x396 with 1 Axes>"
+            ]
+          },
+          "metadata": {
+            "tags": []
+          }
+        }
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "XtYZ7114n3b-"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "## Accessing Data\n",
+        "\n",
+        "You can access `DataFrame` data using familiar Python dict/list operations:"
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "_TFm7-looBFF",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 104
+        },
+        "outputId": "57aede33-3c05-4f1c-8509-78060101057d"
+      },
+      "cell_type": "code",
+      "source": [
+        "cities = pd.DataFrame({ 'City name': city_names, 'Population': population })\n",
+        "print(type(cities['City name']))\n",
+        "cities['City name']"
+      ],
+      "execution_count": 8,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "text": [
+            "<class 'pandas.core.series.Series'>\n"
+          ],
+          "name": "stdout"
+        },
+        {
+          "output_type": "execute_result",
+          "data": {
+            "text/plain": [
+              "0    San Francisco\n",
+              "1         San Jose\n",
+              "2       Sacramento\n",
+              "Name: City name, dtype: object"
+            ]
+          },
+          "metadata": {
+            "tags": []
+          },
+          "execution_count": 8
+        }
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "V5L6xacLoxyv",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 52
+        },
+        "outputId": "c6c3f30a-067e-4fef-ca63-51aae3939f62"
+      },
+      "cell_type": "code",
+      "source": [
+        "print(type(cities['City name'][1]))\n",
+        "cities['City name'][1]"
+      ],
+      "execution_count": 9,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "text": [
+            "<class 'str'>\n"
+          ],
+          "name": "stdout"
+        },
+        {
+          "output_type": "execute_result",
+          "data": {
+            "text/plain": [
+              "'San Jose'"
+            ]
+          },
+          "metadata": {
+            "tags": []
+          },
+          "execution_count": 9
+        }
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "gcYX1tBPugZl",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 129
+        },
+        "outputId": "41ecc9c0-2935-4afe-a530-97b64431fc29"
+      },
+      "cell_type": "code",
+      "source": [
+        "print(type(cities[0:2]))\n",
+        "cities[0:2]"
+      ],
+      "execution_count": 10,
+      "outputs": [
+        {
+          "output_type": "stream",
+          "text": [
+            "<class 'pandas.core.frame.DataFrame'>\n"
+          ],
+          "name": "stdout"
+        },
+        {
+          "output_type": "execute_result",
+          "data": {
+            "text/html": [
+              "<div>\n",
+              "<style scoped>\n",
+              "    .dataframe tbody tr th:only-of-type {\n",
+              "        vertical-align: middle;\n",
+              "    }\n",
+              "\n",
+              "    .dataframe tbody tr th {\n",
+              "        vertical-align: top;\n",
+              "    }\n",
+              "\n",
+              "    .dataframe thead th {\n",
+              "        text-align: right;\n",
+              "    }\n",
+              "</style>\n",
+              "<table border=\"1\" class=\"dataframe\">\n",
+              "  <thead>\n",
+              "    <tr style=\"text-align: right;\">\n",
+              "      <th></th>\n",
+              "      <th>City name</th>\n",
+              "      <th>Population</th>\n",
+              "    </tr>\n",
+              "  </thead>\n",
+              "  <tbody>\n",
+              "    <tr>\n",
+              "      <th>0</th>\n",
+              "      <td>San Francisco</td>\n",
+              "      <td>852469</td>\n",
+              "    </tr>\n",
+              "    <tr>\n",
+              "      <th>1</th>\n",
+              "      <td>San Jose</td>\n",
+              "      <td>1015785</td>\n",
+              "    </tr>\n",
+              "  </tbody>\n",
+              "</table>\n",
+              "</div>"
+            ],
+            "text/plain": [
+              "       City name  Population\n",
+              "0  San Francisco      852469\n",
+              "1       San Jose     1015785"
+            ]
+          },
+          "metadata": {
+            "tags": []
+          },
+          "execution_count": 10
+        }
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "65g1ZdGVjXsQ"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "In addition, *pandas* provides an extremely rich API for advanced [indexing and selection](http://pandas.pydata.org/pandas-docs/stable/indexing.html) that is too extensive to be covered here."
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "RM1iaD-ka3Y1"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "## Manipulating Data\n",
+        "\n",
+        "You may apply Python's basic arithmetic operations to `Series`. For example:"
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "XWmyCFJ5bOv-",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 86
+        },
+        "outputId": "daaa5c1b-3eda-4b83-df5c-0c25a24f3b89"
+      },
+      "cell_type": "code",
+      "source": [
+        "population / 1000."
+      ],
+      "execution_count": 11,
+      "outputs": [
+        {
+          "output_type": "execute_result",
+          "data": {
+            "text/plain": [
+              "0     852.469\n",
+              "1    1015.785\n",
+              "2     485.199\n",
+              "dtype: float64"
+            ]
+          },
+          "metadata": {
+            "tags": []
+          },
+          "execution_count": 11
+        }
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "TQzIVnbnmWGM"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "[NumPy](http://www.numpy.org/) is a popular toolkit for scientific computing. *pandas* `Series` can be used as arguments to most NumPy functions:"
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "ko6pLK6JmkYP",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 86
+        },
+        "outputId": "f4e256ae-81a6-4f43-eed7-80a0cb371482"
+      },
+      "cell_type": "code",
+      "source": [
+        "import numpy as np\n",
+        "\n",
+        "np.log(population)"
+      ],
+      "execution_count": 12,
+      "outputs": [
+        {
+          "output_type": "execute_result",
+          "data": {
+            "text/plain": [
+              "0    13.655892\n",
+              "1    13.831172\n",
+              "2    13.092314\n",
+              "dtype: float64"
+            ]
+          },
+          "metadata": {
+            "tags": []
+          },
+          "execution_count": 12
+        }
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "xmxFuQmurr6d"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "For more complex single-column transformations, you can use `Series.apply`. Like the Python [map function](https://docs.python.org/2/library/functions.html#map),\n",
+        "`Series.apply` accepts a function as an argument, typically a [lambda function](https://docs.python.org/2/tutorial/controlflow.html#lambda-expressions), and applies it to each value.\n",
+        "\n",
+        "The example below creates a new `Series` that indicates whether `population` is over one million:"
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "Fc1DvPAbstjI",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 86
+        },
+        "outputId": "a53dfe27-2832-4927-acb4-c097eefc16a5"
+      },
+      "cell_type": "code",
+      "source": [
+        "population.apply(lambda val: val > 1000000)"
+      ],
+      "execution_count": 13,
+      "outputs": [
+        {
+          "output_type": "execute_result",
+          "data": {
+            "text/plain": [
+              "0    False\n",
+              "1     True\n",
+              "2    False\n",
+              "dtype: bool"
+            ]
+          },
+          "metadata": {
+            "tags": []
+          },
+          "execution_count": 13
+        }
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "ZeYYLoV9b9fB"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "\n",
+        "Modifying a `DataFrame` is also straightforward. For example, the following code adds two `Series` to an existing `DataFrame`:"
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "0gCEX99Hb8LR",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 143
+        },
+        "outputId": "b093cd85-f4e2-4c35-a936-8cce335552c3"
+      },
+      "cell_type": "code",
+      "source": [
+        "cities['Area square miles'] = pd.Series([46.87, 176.53, 97.92])\n",
+        "cities['Population density'] = cities['Population'] / cities['Area square miles']\n",
+        "cities"
+      ],
+      "execution_count": 14,
+      "outputs": [
+        {
+          "output_type": "execute_result",
+          "data": {
+            "text/html": [
+              "<div>\n",
+              "<style scoped>\n",
+              "    .dataframe tbody tr th:only-of-type {\n",
+              "        vertical-align: middle;\n",
+              "    }\n",
+              "\n",
+              "    .dataframe tbody tr th {\n",
+              "        vertical-align: top;\n",
+              "    }\n",
+              "\n",
+              "    .dataframe thead th {\n",
+              "        text-align: right;\n",
+              "    }\n",
+              "</style>\n",
+              "<table border=\"1\" class=\"dataframe\">\n",
+              "  <thead>\n",
+              "    <tr style=\"text-align: right;\">\n",
+              "      <th></th>\n",
+              "      <th>City name</th>\n",
+              "      <th>Population</th>\n",
+              "      <th>Area square miles</th>\n",
+              "      <th>Population density</th>\n",
+              "    </tr>\n",
+              "  </thead>\n",
+              "  <tbody>\n",
+              "    <tr>\n",
+              "      <th>0</th>\n",
+              "      <td>San Francisco</td>\n",
+              "      <td>852469</td>\n",
+              "      <td>46.87</td>\n",
+              "      <td>18187.945381</td>\n",
+              "    </tr>\n",
+              "    <tr>\n",
+              "      <th>1</th>\n",
+              "      <td>San Jose</td>\n",
+              "      <td>1015785</td>\n",
+              "      <td>176.53</td>\n",
+              "      <td>5754.177760</td>\n",
+              "    </tr>\n",
+              "    <tr>\n",
+              "      <th>2</th>\n",
+              "      <td>Sacramento</td>\n",
+              "      <td>485199</td>\n",
+              "      <td>97.92</td>\n",
+              "      <td>4955.055147</td>\n",
+              "    </tr>\n",
+              "  </tbody>\n",
+              "</table>\n",
+              "</div>"
+            ],
+            "text/plain": [
+              "       City name  Population  Area square miles  Population density\n",
+              "0  San Francisco      852469              46.87        18187.945381\n",
+              "1       San Jose     1015785             176.53         5754.177760\n",
+              "2     Sacramento      485199              97.92         4955.055147"
+            ]
+          },
+          "metadata": {
+            "tags": []
+          },
+          "execution_count": 14
+        }
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "6qh63m-ayb-c"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "## Exercise #1\n",
+        "\n",
+        "Modify the `cities` table by adding a new boolean column that is True if and only if *both* of the following are True:\n",
+        "\n",
+        "  * The city is named after a saint.\n",
+        "  * The city has an area greater than 50 square miles.\n",
+        "\n",
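+        "**Note:** Boolean `Series` are combined using the bitwise, rather than the traditional boolean, operators. For example, when performing *logical and*, use `&` instead of `and`. Because `&` binds more tightly than comparison operators such as `>`, wrap each comparison in parentheses.\n",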
+        "\n",
+        "**Hint:** \"San\" in Spanish means \"saint.\""
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "zCOn8ftSyddH",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 143
+        },
+        "outputId": "59689928-0c91-43d2-963f-6e5d7c16a527"
+      },
+      "cell_type": "code",
+      "source": [
+        "# Flag cities that span more than 50 square miles and whose name starts with 'San'\n",
+        "cities['Is wide and has saint name'] = (cities['Area square miles'] > 50) & cities['City name'].apply(lambda name: name.startswith('San'))\n",
+        "cities"
+      ],
+      "execution_count": 15,
+      "outputs": [
+        {
+          "output_type": "execute_result",
+          "data": {
+            "text/html": [
+              "<div>\n",
+              "<style scoped>\n",
+              "    .dataframe tbody tr th:only-of-type {\n",
+              "        vertical-align: middle;\n",
+              "    }\n",
+              "\n",
+              "    .dataframe tbody tr th {\n",
+              "        vertical-align: top;\n",
+              "    }\n",
+              "\n",
+              "    .dataframe thead th {\n",
+              "        text-align: right;\n",
+              "    }\n",
+              "</style>\n",
+              "<table border=\"1\" class=\"dataframe\">\n",
+              "  <thead>\n",
+              "    <tr style=\"text-align: right;\">\n",
+              "      <th></th>\n",
+              "      <th>City name</th>\n",
+              "      <th>Population</th>\n",
+              "      <th>Area square miles</th>\n",
+              "      <th>Population density</th>\n",
+              "      <th>Is wide and has saint name</th>\n",
+              "    </tr>\n",
+              "  </thead>\n",
+              "  <tbody>\n",
+              "    <tr>\n",
+              "      <th>0</th>\n",
+              "      <td>San Francisco</td>\n",
+              "      <td>852469</td>\n",
+              "      <td>46.87</td>\n",
+              "      <td>18187.945381</td>\n",
+              "      <td>False</td>\n",
+              "    </tr>\n",
+              "    <tr>\n",
+              "      <th>1</th>\n",
+              "      <td>San Jose</td>\n",
+              "      <td>1015785</td>\n",
+              "      <td>176.53</td>\n",
+              "      <td>5754.177760</td>\n",
+              "      <td>True</td>\n",
+              "    </tr>\n",
+              "    <tr>\n",
+              "      <th>2</th>\n",
+              "      <td>Sacramento</td>\n",
+              "      <td>485199</td>\n",
+              "      <td>97.92</td>\n",
+              "      <td>4955.055147</td>\n",
+              "      <td>False</td>\n",
+              "    </tr>\n",
+              "  </tbody>\n",
+              "</table>\n",
+              "</div>"
+            ],
+            "text/plain": [
+              "       City name  Population  Area square miles  Population density  \\\n",
+              "0  San Francisco      852469              46.87        18187.945381   \n",
+              "1       San Jose     1015785             176.53         5754.177760   \n",
+              "2     Sacramento      485199              97.92         4955.055147   \n",
+              "\n",
+              "   Is wide and has saint name  \n",
+              "0                       False  \n",
+              "1                        True  \n",
+              "2                       False  "
+            ]
+          },
+          "metadata": {
+            "tags": []
+          },
+          "execution_count": 15
+        }
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "YHIWvc9Ms-Ll"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "### Solution\n",
+        "\n",
+        "Click below for a solution."
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "T5OlrqtdtCIb",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 143
+        },
+        "outputId": "68302cf2-5454-4e78-c2a4-de3617cfd167"
+      },
+      "cell_type": "code",
+      "source": [
+        "cities['Is wide and has saint name'] = (cities['Area square miles'] > 50) & cities['City name'].apply(lambda name: name.startswith('San'))\n",
+        "cities"
+      ],
+      "execution_count": 16,
+      "outputs": [
+        {
+          "output_type": "execute_result",
+          "data": {
+            "text/html": [
+              "<div>\n",
+              "<style scoped>\n",
+              "    .dataframe tbody tr th:only-of-type {\n",
+              "        vertical-align: middle;\n",
+              "    }\n",
+              "\n",
+              "    .dataframe tbody tr th {\n",
+              "        vertical-align: top;\n",
+              "    }\n",
+              "\n",
+              "    .dataframe thead th {\n",
+              "        text-align: right;\n",
+              "    }\n",
+              "</style>\n",
+              "<table border=\"1\" class=\"dataframe\">\n",
+              "  <thead>\n",
+              "    <tr style=\"text-align: right;\">\n",
+              "      <th></th>\n",
+              "      <th>City name</th>\n",
+              "      <th>Population</th>\n",
+              "      <th>Area square miles</th>\n",
+              "      <th>Population density</th>\n",
+              "      <th>Is wide and has saint name</th>\n",
+              "    </tr>\n",
+              "  </thead>\n",
+              "  <tbody>\n",
+              "    <tr>\n",
+              "      <th>0</th>\n",
+              "      <td>San Francisco</td>\n",
+              "      <td>852469</td>\n",
+              "      <td>46.87</td>\n",
+              "      <td>18187.945381</td>\n",
+              "      <td>False</td>\n",
+              "    </tr>\n",
+              "    <tr>\n",
+              "      <th>1</th>\n",
+              "      <td>San Jose</td>\n",
+              "      <td>1015785</td>\n",
+              "      <td>176.53</td>\n",
+              "      <td>5754.177760</td>\n",
+              "      <td>True</td>\n",
+              "    </tr>\n",
+              "    <tr>\n",
+              "      <th>2</th>\n",
+              "      <td>Sacramento</td>\n",
+              "      <td>485199</td>\n",
+              "      <td>97.92</td>\n",
+              "      <td>4955.055147</td>\n",
+              "      <td>False</td>\n",
+              "    </tr>\n",
+              "  </tbody>\n",
+              "</table>\n",
+              "</div>"
+            ],
+            "text/plain": [
+              "       City name  Population  Area square miles  Population density  \\\n",
+              "0  San Francisco      852469              46.87        18187.945381   \n",
+              "1       San Jose     1015785             176.53         5754.177760   \n",
+              "2     Sacramento      485199              97.92         4955.055147   \n",
+              "\n",
+              "   Is wide and has saint name  \n",
+              "0                       False  \n",
+              "1                        True  \n",
+              "2                       False  "
+            ]
+          },
+          "metadata": {
+            "tags": []
+          },
+          "execution_count": 16
+        }
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "f-xAOJeMiXFB"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "## Indexes\n",
+        "Both `Series` and `DataFrame` objects also define an `index` property that assigns an identifier value to each `Series` item or `DataFrame` row. \n",
+        "\n",
+        "By default, at construction, *pandas* assigns index values that reflect the ordering of the source data. Once created, the index values are stable; that is, they do not change when data is reordered."
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "2684gsWNinq9",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 34
+        },
+        "outputId": "b11186b3-1c6a-4aa4-ebd7-00818a7394e7"
+      },
+      "cell_type": "code",
+      "source": [
+        "city_names.index"
+      ],
+      "execution_count": 17,
+      "outputs": [
+        {
+          "output_type": "execute_result",
+          "data": {
+            "text/plain": [
+              "RangeIndex(start=0, stop=3, step=1)"
+            ]
+          },
+          "metadata": {
+            "tags": []
+          },
+          "execution_count": 17
+        }
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "F_qPe2TBjfWd",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 34
+        },
+        "outputId": "be2d6e8d-d7ed-4259-e79b-7011b121c3ea"
+      },
+      "cell_type": "code",
+      "source": [
+        "cities.index"
+      ],
+      "execution_count": 18,
+      "outputs": [
+        {
+          "output_type": "execute_result",
+          "data": {
+            "text/plain": [
+              "RangeIndex(start=0, stop=3, step=1)"
+            ]
+          },
+          "metadata": {
+            "tags": []
+          },
+          "execution_count": 18
+        }
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "hp2oWY9Slo_h"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "Call `DataFrame.reindex` to manually reorder the rows. For example, the following has the same effect as sorting by city name:"
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "sN0zUzSAj-U1",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 143
+        },
+        "outputId": "a2f5d075-3435-4209-9e7b-c6bd3a1317b7"
+      },
+      "cell_type": "code",
+      "source": [
+        "cities.reindex([2, 0, 1])"
+      ],
+      "execution_count": 19,
+      "outputs": [
+        {
+          "output_type": "execute_result",
+          "data": {
+            "text/html": [
+              "<div>\n",
+              "<style scoped>\n",
+              "    .dataframe tbody tr th:only-of-type {\n",
+              "        vertical-align: middle;\n",
+              "    }\n",
+              "\n",
+              "    .dataframe tbody tr th {\n",
+              "        vertical-align: top;\n",
+              "    }\n",
+              "\n",
+              "    .dataframe thead th {\n",
+              "        text-align: right;\n",
+              "    }\n",
+              "</style>\n",
+              "<table border=\"1\" class=\"dataframe\">\n",
+              "  <thead>\n",
+              "    <tr style=\"text-align: right;\">\n",
+              "      <th></th>\n",
+              "      <th>City name</th>\n",
+              "      <th>Population</th>\n",
+              "      <th>Area square miles</th>\n",
+              "      <th>Population density</th>\n",
+              "      <th>Is wide and has saint name</th>\n",
+              "    </tr>\n",
+              "  </thead>\n",
+              "  <tbody>\n",
+              "    <tr>\n",
+              "      <th>2</th>\n",
+              "      <td>Sacramento</td>\n",
+              "      <td>485199</td>\n",
+              "      <td>97.92</td>\n",
+              "      <td>4955.055147</td>\n",
+              "      <td>False</td>\n",
+              "    </tr>\n",
+              "    <tr>\n",
+              "      <th>0</th>\n",
+              "      <td>San Francisco</td>\n",
+              "      <td>852469</td>\n",
+              "      <td>46.87</td>\n",
+              "      <td>18187.945381</td>\n",
+              "      <td>False</td>\n",
+              "    </tr>\n",
+              "    <tr>\n",
+              "      <th>1</th>\n",
+              "      <td>San Jose</td>\n",
+              "      <td>1015785</td>\n",
+              "      <td>176.53</td>\n",
+              "      <td>5754.177760</td>\n",
+              "      <td>True</td>\n",
+              "    </tr>\n",
+              "  </tbody>\n",
+              "</table>\n",
+              "</div>"
+            ],
+            "text/plain": [
+              "       City name  Population  Area square miles  Population density  \\\n",
+              "2     Sacramento      485199              97.92         4955.055147   \n",
+              "0  San Francisco      852469              46.87        18187.945381   \n",
+              "1       San Jose     1015785             176.53         5754.177760   \n",
+              "\n",
+              "   Is wide and has saint name  \n",
+              "2                       False  \n",
+              "0                       False  \n",
+              "1                        True  "
+            ]
+          },
+          "metadata": {
+            "tags": []
+          },
+          "execution_count": 19
+        }
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "-GQFz8NZuS06"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "Reindexing is a great way to shuffle (randomize) a `DataFrame`. In the example below, we take the index, which is array-like, and pass it to NumPy's `random.permutation` function, which returns a shuffled copy of its values and leaves the original index unchanged. Calling `reindex` with this shuffled array causes the `DataFrame` rows to be shuffled in the same way.\n",
+        "Try running the following cell multiple times!"
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "mF8GC0k8uYhz",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 143
+        },
+        "outputId": "d073772b-41d3-4260-d9d4-12520da345ec"
+      },
+      "cell_type": "code",
+      "source": [
+        "cities.reindex(np.random.permutation(cities.index))"
+      ],
+      "execution_count": 20,
+      "outputs": [
+        {
+          "output_type": "execute_result",
+          "data": {
+            "text/html": [
+              "<div>\n",
+              "<style scoped>\n",
+              "    .dataframe tbody tr th:only-of-type {\n",
+              "        vertical-align: middle;\n",
+              "    }\n",
+              "\n",
+              "    .dataframe tbody tr th {\n",
+              "        vertical-align: top;\n",
+              "    }\n",
+              "\n",
+              "    .dataframe thead th {\n",
+              "        text-align: right;\n",
+              "    }\n",
+              "</style>\n",
+              "<table border=\"1\" class=\"dataframe\">\n",
+              "  <thead>\n",
+              "    <tr style=\"text-align: right;\">\n",
+              "      <th></th>\n",
+              "      <th>City name</th>\n",
+              "      <th>Population</th>\n",
+              "      <th>Area square miles</th>\n",
+              "      <th>Population density</th>\n",
+              "      <th>Is wide and has saint name</th>\n",
+              "    </tr>\n",
+              "  </thead>\n",
+              "  <tbody>\n",
+              "    <tr>\n",
+              "      <th>0</th>\n",
+              "      <td>San Francisco</td>\n",
+              "      <td>852469</td>\n",
+              "      <td>46.87</td>\n",
+              "      <td>18187.945381</td>\n",
+              "      <td>False</td>\n",
+              "    </tr>\n",
+              "    <tr>\n",
+              "      <th>1</th>\n",
+              "      <td>San Jose</td>\n",
+              "      <td>1015785</td>\n",
+              "      <td>176.53</td>\n",
+              "      <td>5754.177760</td>\n",
+              "      <td>True</td>\n",
+              "    </tr>\n",
+              "    <tr>\n",
+              "      <th>2</th>\n",
+              "      <td>Sacramento</td>\n",
+              "      <td>485199</td>\n",
+              "      <td>97.92</td>\n",
+              "      <td>4955.055147</td>\n",
+              "      <td>False</td>\n",
+              "    </tr>\n",
+              "  </tbody>\n",
+              "</table>\n",
+              "</div>"
+            ],
+            "text/plain": [
+              "       City name  Population  Area square miles  Population density  \\\n",
+              "0  San Francisco      852469              46.87        18187.945381   \n",
+              "1       San Jose     1015785             176.53         5754.177760   \n",
+              "2     Sacramento      485199              97.92         4955.055147   \n",
+              "\n",
+              "   Is wide and has saint name  \n",
+              "0                       False  \n",
+              "1                        True  \n",
+              "2                       False  "
+            ]
+          },
+          "metadata": {
+            "tags": []
+          },
+          "execution_count": 20
+        }
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "fSso35fQmGKb"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "For more information, see the [Index documentation](http://pandas.pydata.org/pandas-docs/stable/indexing.html#index-objects)."
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "8UngIdVhz8C0"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "## Exercise #2\n",
+        "\n",
+        "The `reindex` method allows index values that are not in the original `DataFrame`'s index values. Try it and see what happens if you use such values! Why do you think this is allowed?"
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "PN55GrDX0jzO",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 175
+        },
+        "outputId": "6d125adc-a36f-4717-b964-8b204a63b250"
+      },
+      "cell_type": "code",
+      "source": [
+        "# Your code here\n",
+        "cities.reindex([0, 4, 5, 2])"
+      ],
+      "execution_count": 21,
+      "outputs": [
+        {
+          "output_type": "execute_result",
+          "data": {
+            "text/html": [
+              "<div>\n",
+              "<style scoped>\n",
+              "    .dataframe tbody tr th:only-of-type {\n",
+              "        vertical-align: middle;\n",
+              "    }\n",
+              "\n",
+              "    .dataframe tbody tr th {\n",
+              "        vertical-align: top;\n",
+              "    }\n",
+              "\n",
+              "    .dataframe thead th {\n",
+              "        text-align: right;\n",
+              "    }\n",
+              "</style>\n",
+              "<table border=\"1\" class=\"dataframe\">\n",
+              "  <thead>\n",
+              "    <tr style=\"text-align: right;\">\n",
+              "      <th></th>\n",
+              "      <th>City name</th>\n",
+              "      <th>Population</th>\n",
+              "      <th>Area square miles</th>\n",
+              "      <th>Population density</th>\n",
+              "      <th>Is wide and has saint name</th>\n",
+              "    </tr>\n",
+              "  </thead>\n",
+              "  <tbody>\n",
+              "    <tr>\n",
+              "      <th>0</th>\n",
+              "      <td>San Francisco</td>\n",
+              "      <td>852469.0</td>\n",
+              "      <td>46.87</td>\n",
+              "      <td>18187.945381</td>\n",
+              "      <td>False</td>\n",
+              "    </tr>\n",
+              "    <tr>\n",
+              "      <th>4</th>\n",
+              "      <td>NaN</td>\n",
+              "      <td>NaN</td>\n",
+              "      <td>NaN</td>\n",
+              "      <td>NaN</td>\n",
+              "      <td>NaN</td>\n",
+              "    </tr>\n",
+              "    <tr>\n",
+              "      <th>5</th>\n",
+              "      <td>NaN</td>\n",
+              "      <td>NaN</td>\n",
+              "      <td>NaN</td>\n",
+              "      <td>NaN</td>\n",
+              "      <td>NaN</td>\n",
+              "    </tr>\n",
+              "    <tr>\n",
+              "      <th>2</th>\n",
+              "      <td>Sacramento</td>\n",
+              "      <td>485199.0</td>\n",
+              "      <td>97.92</td>\n",
+              "      <td>4955.055147</td>\n",
+              "      <td>False</td>\n",
+              "    </tr>\n",
+              "  </tbody>\n",
+              "</table>\n",
+              "</div>"
+            ],
+            "text/plain": [
+              "       City name  Population  Area square miles  Population density  \\\n",
+              "0  San Francisco    852469.0              46.87        18187.945381   \n",
+              "4            NaN         NaN                NaN                 NaN   \n",
+              "5            NaN         NaN                NaN                 NaN   \n",
+              "2     Sacramento    485199.0              97.92         4955.055147   \n",
+              "\n",
+              "  Is wide and has saint name  \n",
+              "0                      False  \n",
+              "4                        NaN  \n",
+              "5                        NaN  \n",
+              "2                      False  "
+            ]
+          },
+          "metadata": {
+            "tags": []
+          },
+          "execution_count": 21
+        }
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "TJffr5_Jwqvd"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "### Solution\n",
+        "\n",
+        "Click below for the solution."
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "8oSvi2QWwuDH"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "If your `reindex` input array includes values not in the original `DataFrame` index values, `reindex` will add new rows for these \"missing\" indices and populate all corresponding columns with `NaN` values. Integer columns such as `Population` are converted to floating point so that they can hold `NaN`:"
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "code",
+        "id": "yBdkucKCwy4x",
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 175
+        },
+        "outputId": "6ef4d8ea-5e17-4387-c0f5-3f5ff8f28cb6"
+      },
+      "cell_type": "code",
+      "source": [
+        "cities.reindex([0, 4, 5, 2])"
+      ],
+      "execution_count": 22,
+      "outputs": [
+        {
+          "output_type": "execute_result",
+          "data": {
+            "text/html": [
+              "<div>\n",
+              "<style scoped>\n",
+              "    .dataframe tbody tr th:only-of-type {\n",
+              "        vertical-align: middle;\n",
+              "    }\n",
+              "\n",
+              "    .dataframe tbody tr th {\n",
+              "        vertical-align: top;\n",
+              "    }\n",
+              "\n",
+              "    .dataframe thead th {\n",
+              "        text-align: right;\n",
+              "    }\n",
+              "</style>\n",
+              "<table border=\"1\" class=\"dataframe\">\n",
+              "  <thead>\n",
+              "    <tr style=\"text-align: right;\">\n",
+              "      <th></th>\n",
+              "      <th>City name</th>\n",
+              "      <th>Population</th>\n",
+              "      <th>Area square miles</th>\n",
+              "      <th>Population density</th>\n",
+              "      <th>Is wide and has saint name</th>\n",
+              "    </tr>\n",
+              "  </thead>\n",
+              "  <tbody>\n",
+              "    <tr>\n",
+              "      <th>0</th>\n",
+              "      <td>San Francisco</td>\n",
+              "      <td>852469.0</td>\n",
+              "      <td>46.87</td>\n",
+              "      <td>18187.945381</td>\n",
+              "      <td>False</td>\n",
+              "    </tr>\n",
+              "    <tr>\n",
+              "      <th>4</th>\n",
+              "      <td>NaN</td>\n",
+              "      <td>NaN</td>\n",
+              "      <td>NaN</td>\n",
+              "      <td>NaN</td>\n",
+              "      <td>NaN</td>\n",
+              "    </tr>\n",
+              "    <tr>\n",
+              "      <th>5</th>\n",
+              "      <td>NaN</td>\n",
+              "      <td>NaN</td>\n",
+              "      <td>NaN</td>\n",
+              "      <td>NaN</td>\n",
+              "      <td>NaN</td>\n",
+              "    </tr>\n",
+              "    <tr>\n",
+              "      <th>2</th>\n",
+              "      <td>Sacramento</td>\n",
+              "      <td>485199.0</td>\n",
+              "      <td>97.92</td>\n",
+              "      <td>4955.055147</td>\n",
+              "      <td>False</td>\n",
+              "    </tr>\n",
+              "  </tbody>\n",
+              "</table>\n",
+              "</div>"
+            ],
+            "text/plain": [
+              "       City name  Population  Area square miles  Population density  \\\n",
+              "0  San Francisco    852469.0              46.87        18187.945381   \n",
+              "4            NaN         NaN                NaN                 NaN   \n",
+              "5            NaN         NaN                NaN                 NaN   \n",
+              "2     Sacramento    485199.0              97.92         4955.055147   \n",
+              "\n",
+              "  Is wide and has saint name  \n",
+              "0                      False  \n",
+              "4                        NaN  \n",
+              "5                        NaN  \n",
+              "2                      False  "
+            ]
+          },
+          "metadata": {
+            "tags": []
+          },
+          "execution_count": 22
+        }
+      ]
+    },
+    {
+      "metadata": {
+        "colab_type": "text",
+        "id": "2l82PhPbwz7g"
+      },
+      "cell_type": "markdown",
+      "source": [
+        "This behavior is desirable because indexes are often strings pulled from the actual data (see the [*pandas* reindex\n",
+        "documentation](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.reindex.html) for an example\n",
+        "in which the index values are browser names).\n",
+        "\n",
+        "In this case, allowing \"missing\" indices makes it easy to reindex using an external list, as you don't have to worry about\n",
+        "sanitizing the input."
+      ]
+    }
+  ]
+}
\ No newline at end of file

	longitude	latitude	housing_median_age	total_rooms	total_bedrooms	population	households	median_income	median_house_value
count	17000.000000	17000.000000	17000.000000	17000.000000	17000.000000	17000.000000	17000.000000	17000.000000	17000.000000
mean	-119.562108	35.625225	28.589353	2643.664412	539.410824	1429.573941	501.221941	3.883578	207300.912353
std	2.005166	2.137340	12.586937	2179.947071	421.499452	1147.852959	384.520841	1.908157	115983.764387
min	-124.350000	32.540000	1.000000	2.000000	1.000000	3.000000	1.000000	0.499900	14999.000000
25%	-121.790000	33.930000	18.000000	1462.000000	297.000000	790.000000	282.000000	2.566375	119400.000000
50%	-118.490000	34.250000	29.000000	2127.000000	434.000000	1167.000000	409.000000	3.544600	180400.000000
75%	-118.000000	37.720000	37.000000	3151.250000	648.250000	1721.000000	605.250000	4.767000	265000.000000
max	-114.310000	41.950000	52.000000	37937.000000	6445.000000	35682.000000	6082.000000	15.000100	500001.000000
	longitude	latitude	housing_median_age	total_rooms	total_bedrooms	population	households	median_income	median_house_value
0	-114.31	34.19	15.0	5612.0	1283.0	1015.0	472.0	1.4936	66900.0
1	-114.47	34.40	19.0	7650.0	1901.0	1129.0	463.0	1.8200	80100.0
2	-114.56	33.69	17.0	720.0	174.0	333.0	117.0	1.6509	85700.0
3	-114.57	33.64	14.0	1501.0	337.0	515.0	226.0	3.1917	73400.0
4	-114.57	33.57	20.0	1454.0	326.0	624.0	262.0	1.9250	65500.0
	City name	Population	Area square miles	Population density
0	San Francisco	852469	46.87	18187.945381
1	San Jose	1015785	176.53	5754.177760
2	Sacramento	485199	97.92	4955.055147
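The table above is the `cities` frame before the Exercise #1 column is added. As a minimal sketch, assuming the three rows shown are the complete data, the block below rebuilds that frame, derives the boolean column with the element-wise `&` operator, and then shuffles the rows through `reindex`. The variable names `shuffled_labels` and `shuffled` are illustrative and do not appear in the notebook.

```python
import numpy as np
import pandas as pd

# Rebuild the small table shown above (values copied from it).
cities = pd.DataFrame({
    "City name": ["San Francisco", "San Jose", "Sacramento"],
    "Population": [852469, 1015785, 485199],
    "Area square miles": [46.87, 176.53, 97.92],
})
cities["Population density"] = cities["Population"] / cities["Area square miles"]

# Boolean Series are combined element-wise with & (not the keyword `and`).
cities["Is wide and has saint name"] = (
    (cities["Area square miles"] > 50)
    & cities["City name"].str.startswith("San")
)

# np.random.permutation returns a shuffled copy of the labels;
# cities.index itself is left untouched.
shuffled_labels = np.random.permutation(cities.index)

# reindex returns a new frame whose rows follow the shuffled labels.
shuffled = cities.reindex(shuffled_labels)
print(shuffled)
```

Each row keeps its original label after the shuffle, so `shuffled.loc[1]` still refers to San Jose regardless of where that row ends up.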
	City name	Population	Area square miles	Population density	Is wide and has saint name
0	San Francisco	852469.0	46.87	18187.945381	False
4	NaN	NaN	NaN	NaN	NaN
5	NaN	NaN	NaN	NaN	NaN
2	Sacramento	485199.0	97.92	4955.055147	False
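The last table shows what `reindex` produces for labels that are absent from the original index. The sketch below reproduces that behavior on a reduced two-column version of `cities` (an assumption made to keep the example short) and shows one way to drop the empty rows afterwards. The names `expanded`, `missing_rows`, and `matched` are illustrative.

```python
import pandas as pd

# Reduced version of the cities frame; values copied from the tables above.
cities = pd.DataFrame({
    "City name": ["San Francisco", "San Jose", "Sacramento"],
    "Population": [852469, 1015785, 485199],
})

# Labels 4 and 5 do not exist, so reindex creates all-NaN rows for them.
expanded = cities.reindex([0, 4, 5, 2])
missing_rows = expanded["City name"].isna()

# The integer column is upcast to float so that it can hold NaN,
# which is why the table prints 852469.0 rather than 852469.
population_dtype = expanded["Population"].dtype

# Keep only the labels that matched a row in the original frame.
matched = expanded.dropna(how="all")
print(matched)
```

Dropping the all-`NaN` rows is one option; whether to keep them, drop them, or fill them depends on what the external label list is meant to represent.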