diff --git a/.ipynb_checkpoints/lab-hypothesis-testing-checkpoint.ipynb b/.ipynb_checkpoints/lab-hypothesis-testing-checkpoint.ipynb new file mode 100644 index 0000000..465f94a --- /dev/null +++ b/.ipynb_checkpoints/lab-hypothesis-testing-checkpoint.ipynb @@ -0,0 +1,3019 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Lab | Hypothesis Testing" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Objective**\n", + "\n", + "Welcome to the Hypothesis Testing Lab, where we embark on an enlightening journey through the realm of statistical decision-making! In this laboratory, we delve into various scenarios, applying the powerful tools of hypothesis testing to scrutinize and interpret data.\n", + "\n", + "From testing the mean of a single sample (One Sample T-Test), to investigating differences between independent groups (Two Sample T-Test), and exploring relationships within dependent samples (Paired Sample T-Test), our exploration knows no bounds. Furthermore, we'll venture into the realm of Analysis of Variance (ANOVA), unraveling the complexities of comparing means across multiple groups.\n", + "\n", + "So, grab your statistical tools, prepare your hypotheses, and let's embark on this fascinating journey of exploration and discovery in the world of hypothesis testing!" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Challenge 1**" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In this challenge, we will be working with pokemon data. 
The data can be found here:\n", + "\n", + "- https://raw.githubusercontent.com/data-bootcamp-v4/data/main/pokemon.csv" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [], + "source": [ + "#libraries\n", + "import pandas as pd\n", + "import scipy.stats as st\n", + "import numpy as np\n" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": { + "scrolled": true + }, + "outputs": [ + { + "data": { + "text/html": [ + "
" + ], + "text/plain": [ + " Name Type 1 Type 2 HP Attack Defense Sp. Atk \\\n", + "740 Skiddo Grass NaN 66 65 48 62 \n", + "741 Gogoat Grass NaN 123 100 62 97 \n", + "742 Pancham Fighting NaN 67 82 62 46 \n", + "743 Pangoro Fighting Dark 95 124 78 69 \n", + "744 Furfrou Normal NaN 75 80 60 65 \n", + "745 Espurr Psychic NaN 62 48 54 63 \n", + "746 Meowstic Male Psychic NaN 74 48 76 83 \n", + "747 Meowstic Female Psychic NaN 74 48 76 83 \n", + "748 Honedge Steel Ghost 45 80 100 35 \n", + "749 Doublade Steel Ghost 59 110 150 45 \n", + "750 Aegislash Blade Forme Steel Ghost 60 150 50 150 \n", + "751 Aegislash Shield Forme Steel Ghost 60 50 150 50 \n", + "752 Spritzee Fairy NaN 78 52 60 63 \n", + "753 Aromatisse Fairy NaN 101 72 72 99 \n", + "754 Swirlix Fairy NaN 62 48 66 59 \n", + "755 Slurpuff Fairy NaN 82 80 86 85 \n", + "756 Inkay Dark Psychic 53 54 53 37 \n", + "757 Malamar Dark Psychic 86 92 88 68 \n", + "758 Binacle Rock Water 42 52 67 39 \n", + "759 Barbaracle Rock Water 72 105 115 54 \n", + "760 Skrelp Poison Water 50 60 60 60 \n", + "761 Dragalge Poison Dragon 65 75 90 97 \n", + "762 Clauncher Water NaN 50 53 62 58 \n", + "763 Clawitzer Water NaN 71 73 88 120 \n", + "764 Helioptile Electric Normal 44 38 33 61 \n", + "765 Heliolisk Electric Normal 62 55 52 109 \n", + "766 Tyrunt Rock Dragon 58 89 77 45 \n", + "767 Tyrantrum Rock Dragon 82 121 119 69 \n", + "768 Amaura Rock Ice 77 59 50 67 \n", + "769 Aurorus Rock Ice 123 77 72 99 \n", + "770 Sylveon Fairy NaN 95 65 65 110 \n", + "771 Hawlucha Fighting Flying 78 92 75 74 \n", + "772 Dedenne Electric Fairy 67 58 57 81 \n", + "773 Carbink Rock Fairy 50 50 150 50 \n", + "774 Goomy Dragon NaN 45 50 35 55 \n", + "775 Sliggoo Dragon NaN 68 75 53 83 \n", + "776 Goodra Dragon NaN 90 100 70 110 \n", + "777 Klefki Steel Fairy 57 80 91 80 \n", + "778 Phantump Ghost Grass 43 70 48 50 \n", + "779 Trevenant Ghost Grass 85 110 76 65 \n", + "780 Pumpkaboo Average Size Ghost Grass 49 66 70 44 \n", + "781 Pumpkaboo Small Size 
Ghost Grass 44 66 70 44 \n", + "782 Pumpkaboo Large Size Ghost Grass 54 66 70 44 \n", + "783 Pumpkaboo Super Size Ghost Grass 59 66 70 44 \n", + "784 Gourgeist Average Size Ghost Grass 65 90 122 58 \n", + "785 Gourgeist Small Size Ghost Grass 55 85 122 58 \n", + "786 Gourgeist Large Size Ghost Grass 75 95 122 58 \n", + "787 Gourgeist Super Size Ghost Grass 85 100 122 58 \n", + "788 Bergmite Ice NaN 55 69 85 32 \n", + "789 Avalugg Ice NaN 95 117 184 44 \n", + "790 Noibat Flying Dragon 40 30 35 45 \n", + "791 Noivern Flying Dragon 85 70 80 97 \n", + "792 Xerneas Fairy NaN 126 131 95 131 \n", + "793 Yveltal Dark Flying 126 131 95 131 \n", + "794 Zygarde Half Forme Dragon Ground 108 100 121 81 \n", + "795 Diancie Rock Fairy 50 100 150 100 \n", + "796 Mega Diancie Rock Fairy 50 160 110 160 \n", + "797 Hoopa Confined Psychic Ghost 80 110 60 150 \n", + "798 Hoopa Unbound Psychic Dark 80 160 60 170 \n", + "799 Volcanion Fire Water 80 110 120 130 \n", + "\n", + " Sp. Def Speed Generation Legendary \n", + "740 57 52 6 False \n", + "741 81 68 6 False \n", + "742 48 43 6 False \n", + "743 71 58 6 False \n", + "744 90 102 6 False \n", + "745 60 68 6 False \n", + "746 81 104 6 False \n", + "747 81 104 6 False \n", + "748 37 28 6 False \n", + "749 49 35 6 False \n", + "750 50 60 6 False \n", + "751 150 60 6 False \n", + "752 65 23 6 False \n", + "753 89 29 6 False \n", + "754 57 49 6 False \n", + "755 75 72 6 False \n", + "756 46 45 6 False \n", + "757 75 73 6 False \n", + "758 56 50 6 False \n", + "759 86 68 6 False \n", + "760 60 30 6 False \n", + "761 123 44 6 False \n", + "762 63 44 6 False \n", + "763 89 59 6 False \n", + "764 43 70 6 False \n", + "765 94 109 6 False \n", + "766 45 48 6 False \n", + "767 59 71 6 False \n", + "768 63 46 6 False \n", + "769 92 58 6 False \n", + "770 130 60 6 False \n", + "771 63 118 6 False \n", + "772 67 101 6 False \n", + "773 150 50 6 False \n", + "774 75 40 6 False \n", + "775 113 60 6 False \n", + "776 150 80 6 False \n", + "777 87 75 6 
False \n", + "778 60 38 6 False \n", + "779 82 56 6 False \n", + "780 55 51 6 False \n", + "781 55 56 6 False \n", + "782 55 46 6 False \n", + "783 55 41 6 False \n", + "784 75 84 6 False \n", + "785 75 99 6 False \n", + "786 75 69 6 False \n", + "787 75 54 6 False \n", + "788 35 28 6 False \n", + "789 46 28 6 False \n", + "790 40 55 6 False \n", + "791 80 123 6 False \n", + "792 98 99 6 True \n", + "793 98 99 6 True \n", + "794 95 95 6 True \n", + "795 150 50 6 True \n", + "796 110 110 6 True \n", + "797 130 70 6 True \n", + "798 130 80 6 True \n", + "799 90 70 6 True " + ] + }, + "execution_count": 5, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "df = pd.read_csv(\"https://raw.githubusercontent.com/data-bootcamp-v4/data/main/pokemon.csv\")\n", + "df.tail(60)" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "0 Bulbasaur\n", + "1 Ivysaur\n", + "2 Venusaur\n", + "3 Mega Venusaur\n", + "4 Charmander\n", + " ... \n", + "795 Diancie\n", + "796 Mega Diancie\n", + "797 Hoopa Confined\n", + "798 Hoopa Unbound\n", + "799 Volcanion\n", + "Name: Name, Length: 800, dtype: object" + ] + }, + "execution_count": 6, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "pokemon_name = df[\"Name\"]\n", + "pokemon_name" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "Index(['Name', 'Type 1', 'Type 2', 'HP', 'Attack', 'Defense', 'Sp. Atk',\n", + " 'Sp. Def', 'Speed', 'Generation', 'Legendary'],\n", + " dtype='object')" + ] + }, + "execution_count": 7, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "df.columns" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "- We posit that Dragon-type Pokemon have, on average, higher HP than Grass-type Pokemon. Choose the proper test and, at the 5% significance level, comment on your findings."
+ ] + }, + { + "cell_type": "code", + "execution_count": 33, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "0 45\n", + "1 60\n", + "2 80\n", + "3 80\n", + "48 45\n", + " ..\n", + "783 59\n", + "784 65\n", + "785 55\n", + "786 75\n", + "787 85\n", + "Name: HP, Length: 95, dtype: int64" + ] + }, + "execution_count": 33, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# HP of every Pokemon whose primary or secondary type is Grass\n", + "pokemon_grass = df[(df[\"Type 1\"]==\"Grass\") | (df[\"Type 2\"]==\"Grass\")][\"HP\"]\n", + "pokemon_grass" + ] + }, + { + "cell_type": "code", + "execution_count": 34, + "metadata": { + "scrolled": true + }, + "outputs": [ + { + "data": { + "text/plain": [ + "7 78\n", + "159 41\n", + "160 61\n", + "161 91\n", + "196 90\n", + "249 75\n", + "275 70\n", + "360 50\n", + "361 80\n", + "365 75\n", + "366 75\n", + "406 45\n", + "407 65\n", + "408 95\n", + "409 95\n", + "417 80\n", + "418 80\n", + "419 80\n", + "420 80\n", + "425 105\n", + "426 105\n", + "491 58\n", + "492 68\n", + "493 108\n", + "494 108\n", + "540 100\n", + "541 90\n", + "544 150\n", + "545 150\n", + "671 46\n", + "672 66\n", + "673 76\n", + "682 77\n", + "694 52\n", + "695 72\n", + "696 92\n", + "706 100\n", + "707 100\n", + "710 125\n", + "711 125\n", + "712 125\n", + "761 65\n", + "766 58\n", + "767 82\n", + "774 45\n", + "775 68\n", + "776 90\n", + "790 40\n", + "791 85\n", + "794 108\n", + "Name: HP, dtype: int64" + ] + }, + "execution_count": 34, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# HP of every Pokemon whose primary or secondary type is Dragon\n", + "pokemon_dragon = df[(df[\"Type 1\"]==\"Dragon\") | (df[\"Type 2\"]==\"Dragon\")][\"HP\"] # keep only the HP values, not the filtered DataFrame\n", + "pokemon_dragon" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# 5% significance, 95% confidence level\n", + "# Long version: t statistic computed by hand, kept commented out.\n", + "# The shorter SciPy version is in the next cell.\n", + "\n", + "#sample mean\n", + "# mean = 
pokemon_dragon.mean()\n", + "\n", + "#standard deviation of sample\n", + "#s = pokemon_dragon.std(ddof=1)\n", + "\n", + "#sample size\n", + "#n = len(pokemon_dragon)\n", + "\n", + "#hypothesized population mean\n", + "#mu = ?\n", + "\n", + "#stat = (mean - mu)/(s/np.sqrt(n))\n", + "#stat" + ] + }, + { + "cell_type": "code", + "execution_count": 37, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "TtestResult(statistic=np.float64(4.097528915272702), pvalue=np.float64(0.00010181538122353851), df=np.float64(77.58086781513519))" + ] + }, + "execution_count": 37, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# python shorter version\n", + "# Welch two-sample t-test: do Dragon types have higher mean HP than Grass types?\n", + "# 5% significance level\n", + "# The call below is two-sided, matching the recorded output. For the one-sided question,\n", + "# pass alternative=\"greater\"; the statistic is positive, so that p-value is half the two-sided one.\n", + "\n", + "st.ttest_ind(pokemon_dragon, pokemon_grass, equal_var=False, alternative=\"two-sided\")\n" + ] + }, + { + "cell_type": "code", + "execution_count": 48, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "7 78\n", + "159 41\n", + "160 61\n", + "161 91\n", + "196 90\n", + "249 75\n", + "275 70\n", + "360 50\n", + "361 80\n", + "365 75\n", + "366 75\n", + "406 45\n", + "407 65\n", + "408 95\n", + "409 95\n", + "417 80\n", + "418 80\n", + "419 80\n", + "420 80\n", + "425 105\n", + "426 105\n", + "491 58\n", + "492 68\n", + "493 108\n", + "494 108\n", + "540 100\n", + "541 90\n", + "544 150\n", + "545 150\n", + "671 46\n", + "672 66\n", + "673 76\n", + "682 77\n", + "694 52\n", + "695 72\n", + "696 92\n", + "706 100\n", + "707 100\n", + "710 125\n", + "711 125\n", + "712 125\n", + "761 65\n", + "766 58\n", + "767 82\n", + "774 45\n", + "775 68\n", + "776 90\n", + "790 40\n", + "791 85\n", + "794 108\n", + "Name: HP, dtype: int64" + ] + }, + "execution_count": 48, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "pokemon_dragon" + ] + }, + { + "cell_type": "code", +
"execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# The two-sided p-value (about 0.0001) is below the 0.05 significance level and the t statistic is positive (about 4.10), so we reject the null hypothesis of equal means: the data support that Dragon-type Pokemon have higher average HP than Grass-type Pokemon" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "- We posit that Legendary Pokemon have different stats (HP, Attack, Defense, Sp. Atk, Sp. Def, Speed) compared with Non-Legendary Pokemon. Choose the proper test and, at the 5% significance level, comment on your findings.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 45, + "metadata": {}, + "outputs": [], + "source": [ + "# Legendary is a boolean column, so it can be used directly as a mask\n", + "pokemon_legendary = df[df[\"Legendary\"]]\n", + "pokemon_non_legendary = df[~df[\"Legendary\"]]" + ] + }, + { + "cell_type": "code", + "execution_count": 46, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
" + ], + "text/plain": [ + " Name Type 1 Type 2 HP Attack Defense Sp. Atk \\\n", + "156 Articuno Ice Flying 90 85 100 95 \n", + "157 Zapdos Electric Flying 90 90 85 125 \n", + "158 Moltres Fire Flying 90 100 90 125 \n", + "162 Mewtwo Psychic NaN 106 110 90 154 \n", + "163 Mega Mewtwo X Psychic Fighting 106 190 100 154 \n", + ".. ... ... ... ... ... ... ... \n", + "795 Diancie Rock Fairy 50 100 150 100 \n", + "796 Mega Diancie Rock Fairy 50 160 110 160 \n", + "797 Hoopa Confined Psychic Ghost 80 110 60 150 \n", + "798 Hoopa Unbound Psychic Dark 80 160 60 170 \n", + "799 Volcanion Fire Water 80 110 120 130 \n", + "\n", + " Sp. Def Speed Generation Legendary \n", + "156 125 85 1 True \n", + "157 90 100 1 True \n", + "158 85 90 1 True \n", + "162 90 130 1 True \n", + "163 100 130 1 True \n", + ".. ... ... ... ... \n", + "795 150 50 6 True \n", + "796 110 110 6 True \n", + "797 130 70 6 True \n", + "798 130 80 6 True \n", + "799 90 70 6 True \n", + "\n", + "[65 rows x 11 columns]" + ] + }, + "execution_count": 46, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "pokemon_legendary" + ] + }, + { + "cell_type": "code", + "execution_count": 47, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
" + ], + "text/plain": [ + " Name Type 1 Type 2 HP Attack Defense Sp. Atk \\\n", + "0 Bulbasaur Grass Poison 45 49 49 65 \n", + "1 Ivysaur Grass Poison 60 62 63 80 \n", + "2 Venusaur Grass Poison 80 82 83 100 \n", + "3 Mega Venusaur Grass Poison 80 100 123 122 \n", + "4 Charmander Fire NaN 39 52 43 60 \n", + ".. ... ... ... .. ... ... ... \n", + "787 Gourgeist Super Size Ghost Grass 85 100 122 58 \n", + "788 Bergmite Ice NaN 55 69 85 32 \n", + "789 Avalugg Ice NaN 95 117 184 44 \n", + "790 Noibat Flying Dragon 40 30 35 45 \n", + "791 Noivern Flying Dragon 85 70 80 97 \n", + "\n", + " Sp. Def Speed Generation Legendary \n", + "0 65 45 1 False \n", + "1 80 60 1 False \n", + "2 100 80 1 False \n", + "3 120 80 1 False \n", + "4 50 65 1 False \n", + ".. ... ... ... ... \n", + "787 75 54 6 False \n", + "788 35 28 6 False \n", + "789 46 28 6 False \n", + "790 40 55 6 False \n", + "791 80 123 6 False \n", + "\n", + "[735 rows x 11 columns]" + ] + }, + "execution_count": 47, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "pokemon_non_legendary" + ] + }, + { + "cell_type": "code", + "execution_count": 61, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "0 65\n", + "1 80\n", + "2 100\n", + "3 122\n", + "4 60\n", + " ... \n", + "787 58\n", + "788 32\n", + "789 44\n", + "790 45\n", + "791 97\n", + "Name: Sp. Atk, Length: 735, dtype: int64" + ] + }, + "execution_count": 61, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "pokemon_non_legendary[\"Sp. Atk\"]" + ] + }, + { + "cell_type": "code", + "execution_count": 44, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "Name object\n", + "Type 1 object\n", + "Type 2 object\n", + "HP int64\n", + "Attack int64\n", + "Defense int64\n", + "Sp. Atk int64\n", + "Sp. 
Def int64\n", + "Speed int64\n", + "Generation int64\n", + "Legendary bool\n", + "dtype: object" + ] + }, + "execution_count": 44, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "df.dtypes" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": 56, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "the pvalue of HP is 1.0026911708035284e-13\n", + "reject the null hypothesis\n", + "the pvalue of Attack is 2.520372449236646e-16\n", + "reject the null hypothesis\n", + "the pvalue of Defense is 4.8269984949193316e-11\n", + "reject the null hypothesis\n", + "the pvalue of Sp. Atk is 1.5514614112239812e-21\n", + "reject the null hypothesis\n", + "the pvalue of Sp. Def is 2.2949327864052826e-15\n", + "reject the null hypothesis\n", + "the pvalue of Speed is 1.049016311882451e-18\n", + "reject the null hypothesis\n" + ] + } + ], + "source": [ + "# Run a Welch two-sample t-test for each stat column:\n", + "\n", + "stats = [\"HP\", \"Attack\", \"Defense\", \"Sp. Atk\", \"Sp. Def\", \"Speed\"] # column names as strings; without the quotes these would be variable names\n", + "\n", + "# for vs while:\n", + "# use for when the iterations are known in advance, while when only the stopping condition is known\n", + "\n", + "for x in stats:\n", + "    stat, pvalue = st.ttest_ind(pokemon_legendary[x], pokemon_non_legendary[x], equal_var=False, alternative=\"two-sided\")\n", + "    print(f\"the pvalue of {x} is {pvalue}\") # the f-string substitutes the current value of each {variable}\n", + "    if pvalue < .05:\n", + "        print(\"reject the null hypothesis\")\n", + "    else:\n", + "        print(\"unable to reject the null hypothesis\")" + ] + }, + { + "cell_type": "code", + "execution_count": 51, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "156 90\n", + "157 90\n", + "158 90\n", + "162 106\n", + "163 106\n", + " ... 
\n", + "795 50\n", + "796 50\n", + "797 80\n", + "798 80\n", + "799 80\n", + "Name: HP, Length: 65, dtype: int64" + ] + }, + "execution_count": 51, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "pokemon_legendary[\"HP\"]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Challenge 2**" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In this challenge, we will be working with california-housing data. The data can be found here:\n", + "- https://raw.githubusercontent.com/data-bootcamp-v4/data/main/california_housing.csv" + ] + }, + { + "cell_type": "code", + "execution_count": 57, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
longitudelatitudehousing_median_agetotal_roomstotal_bedroomspopulationhouseholdsmedian_incomemedian_house_value
0-114.3134.1915.05612.01283.01015.0472.01.493666900.0
1-114.4734.4019.07650.01901.01129.0463.01.820080100.0
2-114.5633.6917.0720.0174.0333.0117.01.650985700.0
3-114.5733.6414.01501.0337.0515.0226.03.191773400.0
4-114.5733.5720.01454.0326.0624.0262.01.925065500.0
\n", + "
" + ], + "text/plain": [ + " longitude latitude housing_median_age total_rooms total_bedrooms \\\n", + "0 -114.31 34.19 15.0 5612.0 1283.0 \n", + "1 -114.47 34.40 19.0 7650.0 1901.0 \n", + "2 -114.56 33.69 17.0 720.0 174.0 \n", + "3 -114.57 33.64 14.0 1501.0 337.0 \n", + "4 -114.57 33.57 20.0 1454.0 326.0 \n", + "\n", + " population households median_income median_house_value \n", + "0 1015.0 472.0 1.4936 66900.0 \n", + "1 1129.0 463.0 1.8200 80100.0 \n", + "2 333.0 117.0 1.6509 85700.0 \n", + "3 515.0 226.0 3.1917 73400.0 \n", + "4 624.0 262.0 1.9250 65500.0 " + ] + }, + "execution_count": 57, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "df = pd.read_csv(\"https://raw.githubusercontent.com/data-bootcamp-v4/data/main/california_housing.csv\")\n", + "df.head()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**We posit that houses close to either a school or a hospital are more expensive.**\n", + "\n", + "- School coordinates (-118, 34)\n", + "- Hospital coordinates (-122, 37)\n", + "\n", + "We consider a house (neighborhood) to be close to a school or hospital if the distance is lower than 0.50.\n", + "\n", + "Hint:\n", + "- Write a function to calculate euclidean distance from each house (neighborhood) to the school and to the hospital.\n", + "- Divide your dataset into houses close and far from either a hospital or school.\n", + "- Choose the propper test and, with 5% significance, comment your findings.\n", + " " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# d = √((x₂ - x₁)² + (y₂ - y₁)²),\n" + ] + }, + { + "cell_type": "code", + "execution_count": 62, + "metadata": {}, + "outputs": [], + "source": [ + "df[\"distance_to_school\"] = ((df[\"longitude\"]- -118)**2 + (df[\"latitude\"] - 34)**2)**0.5" + ] + }, + { + "cell_type": "code", + "execution_count": 65, + "metadata": {}, + "outputs": [], + "source": [ + 
"df[\"distance_to_hospital\"] = ((df[\"longitude\"]- -122)**2 + (df[\"latitude\"] - 37)**2)**0.5" + ] + }, + { + "cell_type": "code", + "execution_count": 66, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
longitudelatitudehousing_median_agetotal_roomstotal_bedroomspopulationhouseholdsmedian_incomemedian_house_valuedistance_to_schooldistance_to_hospital
0-114.3134.1915.05612.01283.01015.0472.01.493666900.03.6948888.187319
1-114.4734.4019.07650.01901.01129.0463.01.820080100.03.5525917.966235
2-114.5633.6917.0720.0174.0333.0117.01.650985700.03.4539408.143077
3-114.5733.6414.01501.0337.0515.0226.03.191773400.03.4488408.154416
4-114.5733.5720.01454.0326.0624.0262.01.925065500.03.4568488.183508
....................................
16995-124.2640.5852.02217.0394.0907.0369.02.3571111400.09.0820704.233675
16996-124.2740.6936.02349.0528.01194.0465.02.517979000.09.1689154.332320
16997-124.3041.8417.02677.0531.01244.0456.03.0313103600.010.0576145.358694
16998-124.3041.8019.02672.0552.01298.0478.01.979785800.010.0264655.322593
16999-124.3540.5452.01820.0300.0806.0270.03.014794600.09.1155974.249012
\n", + "

17000 rows × 11 columns

\n", + "
" + ], + "text/plain": [ + " longitude latitude housing_median_age total_rooms total_bedrooms \\\n", + "0 -114.31 34.19 15.0 5612.0 1283.0 \n", + "1 -114.47 34.40 19.0 7650.0 1901.0 \n", + "2 -114.56 33.69 17.0 720.0 174.0 \n", + "3 -114.57 33.64 14.0 1501.0 337.0 \n", + "4 -114.57 33.57 20.0 1454.0 326.0 \n", + "... ... ... ... ... ... \n", + "16995 -124.26 40.58 52.0 2217.0 394.0 \n", + "16996 -124.27 40.69 36.0 2349.0 528.0 \n", + "16997 -124.30 41.84 17.0 2677.0 531.0 \n", + "16998 -124.30 41.80 19.0 2672.0 552.0 \n", + "16999 -124.35 40.54 52.0 1820.0 300.0 \n", + "\n", + " population households median_income median_house_value \\\n", + "0 1015.0 472.0 1.4936 66900.0 \n", + "1 1129.0 463.0 1.8200 80100.0 \n", + "2 333.0 117.0 1.6509 85700.0 \n", + "3 515.0 226.0 3.1917 73400.0 \n", + "4 624.0 262.0 1.9250 65500.0 \n", + "... ... ... ... ... \n", + "16995 907.0 369.0 2.3571 111400.0 \n", + "16996 1194.0 465.0 2.5179 79000.0 \n", + "16997 1244.0 456.0 3.0313 103600.0 \n", + "16998 1298.0 478.0 1.9797 85800.0 \n", + "16999 806.0 270.0 3.0147 94600.0 \n", + "\n", + " distance_to_school distance_to_hospital \n", + "0 3.694888 8.187319 \n", + "1 3.552591 7.966235 \n", + "2 3.453940 8.143077 \n", + "3 3.448840 8.154416 \n", + "4 3.456848 8.183508 \n", + "... ... ... \n", + "16995 9.082070 4.233675 \n", + "16996 9.168915 4.332320 \n", + "16997 10.057614 5.358694 \n", + "16998 10.026465 5.322593 \n", + "16999 9.115597 4.249012 \n", + "\n", + "[17000 rows x 11 columns]" + ] + }, + "execution_count": 66, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "df" + ] + }, + { + "cell_type": "code", + "execution_count": 67, + "metadata": {}, + "outputs": [], + "source": [ + "close_houses_df = df[(df[\"distance_to_school\"]<0.5) | (df[\"distance_to_hospital\"]<0.5)]" + ] + }, + { + "cell_type": "code", + "execution_count": 68, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
longitudelatitudehousing_median_agetotal_roomstotal_bedroomspopulationhouseholdsmedian_incomemedian_house_valuedistance_to_schooldistance_to_hospital
2366-117.5134.0036.03791.0746.02258.0672.03.2067124700.00.4900005.400009
2367-117.5133.9735.0352.062.0184.057.03.6691137500.00.4909185.416733
2368-117.5133.9512.09016.01486.04285.01457.04.9984169100.00.4925445.427946
2371-117.5233.9914.013562.02057.07600.02086.05.2759182900.00.4801045.397268
2372-117.5233.892.017978.03217.07305.02463.05.1695220800.00.4924435.453668
....................................
15090-122.2537.0820.01201.0282.0601.0234.02.5556177500.05.2487050.262488
15170-122.2637.3828.01103.0164.0415.0154.07.8633500001.05.4380140.460435
15253-122.2737.3237.02607.0534.01346.0507.05.3951277700.05.4088170.418688
15254-122.2737.2430.02762.0593.01581.0502.05.1002319400.05.3600840.361248
15686-122.3837.1852.01746.0315.0941.0220.03.3047286100.05.4126520.420476
\n", + "

6829 rows × 11 columns

\n", + "
" + ], + "text/plain": [ + " longitude latitude housing_median_age total_rooms total_bedrooms \\\n", + "2366 -117.51 34.00 36.0 3791.0 746.0 \n", + "2367 -117.51 33.97 35.0 352.0 62.0 \n", + "2368 -117.51 33.95 12.0 9016.0 1486.0 \n", + "2371 -117.52 33.99 14.0 13562.0 2057.0 \n", + "2372 -117.52 33.89 2.0 17978.0 3217.0 \n", + "... ... ... ... ... ... \n", + "15090 -122.25 37.08 20.0 1201.0 282.0 \n", + "15170 -122.26 37.38 28.0 1103.0 164.0 \n", + "15253 -122.27 37.32 37.0 2607.0 534.0 \n", + "15254 -122.27 37.24 30.0 2762.0 593.0 \n", + "15686 -122.38 37.18 52.0 1746.0 315.0 \n", + "\n", + " population households median_income median_house_value \\\n", + "2366 2258.0 672.0 3.2067 124700.0 \n", + "2367 184.0 57.0 3.6691 137500.0 \n", + "2368 4285.0 1457.0 4.9984 169100.0 \n", + "2371 7600.0 2086.0 5.2759 182900.0 \n", + "2372 7305.0 2463.0 5.1695 220800.0 \n", + "... ... ... ... ... \n", + "15090 601.0 234.0 2.5556 177500.0 \n", + "15170 415.0 154.0 7.8633 500001.0 \n", + "15253 1346.0 507.0 5.3951 277700.0 \n", + "15254 1581.0 502.0 5.1002 319400.0 \n", + "15686 941.0 220.0 3.3047 286100.0 \n", + "\n", + " distance_to_school distance_to_hospital \n", + "2366 0.490000 5.400009 \n", + "2367 0.490918 5.416733 \n", + "2368 0.492544 5.427946 \n", + "2371 0.480104 5.397268 \n", + "2372 0.492443 5.453668 \n", + "... ... ... 
\n", + "15090 5.248705 0.262488 \n", + "15170 5.438014 0.460435 \n", + "15253 5.408817 0.418688 \n", + "15254 5.360084 0.361248 \n", + "15686 5.412652 0.420476 \n", + "\n", + "[6829 rows x 11 columns]" + ] + }, + "execution_count": 68, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "close_houses_df" + ] + }, + { + "cell_type": "code", + "execution_count": 69, + "metadata": {}, + "outputs": [], + "source": [ + "far_houses_df = df[(df[\"distance_to_school\"]>=0.5) & (df[\"distance_to_hospital\"]>=0.5)]" + ] + }, + { + "cell_type": "code", + "execution_count": 70, + "metadata": { + "scrolled": true + }, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
longitudelatitudehousing_median_agetotal_roomstotal_bedroomspopulationhouseholdsmedian_incomemedian_house_valuedistance_to_schooldistance_to_hospital
0-114.3134.1915.05612.01283.01015.0472.01.493666900.03.6948888.187319
1-114.4734.4019.07650.01901.01129.0463.01.820080100.03.5525917.966235
2-114.5633.6917.0720.0174.0333.0117.01.650985700.03.4539408.143077
3-114.5733.6414.01501.0337.0515.0226.03.191773400.03.4488408.154416
4-114.5733.5720.01454.0326.0624.0262.01.925065500.03.4568488.183508
....................................
16995-124.2640.5852.02217.0394.0907.0369.02.3571111400.09.0820704.233675
16996-124.2740.6936.02349.0528.01194.0465.02.517979000.09.1689154.332320
16997-124.3041.8417.02677.0531.01244.0456.03.0313103600.010.0576145.358694
16998-124.3041.8019.02672.0552.01298.0478.01.979785800.010.0264655.322593
16999-124.3540.5452.01820.0300.0806.0270.03.014794600.09.1155974.249012
\n", + "

10171 rows × 11 columns

\n", + "
" + ], + "text/plain": [ + " longitude latitude housing_median_age total_rooms total_bedrooms \\\n", + "0 -114.31 34.19 15.0 5612.0 1283.0 \n", + "1 -114.47 34.40 19.0 7650.0 1901.0 \n", + "2 -114.56 33.69 17.0 720.0 174.0 \n", + "3 -114.57 33.64 14.0 1501.0 337.0 \n", + "4 -114.57 33.57 20.0 1454.0 326.0 \n", + "... ... ... ... ... ... \n", + "16995 -124.26 40.58 52.0 2217.0 394.0 \n", + "16996 -124.27 40.69 36.0 2349.0 528.0 \n", + "16997 -124.30 41.84 17.0 2677.0 531.0 \n", + "16998 -124.30 41.80 19.0 2672.0 552.0 \n", + "16999 -124.35 40.54 52.0 1820.0 300.0 \n", + "\n", + " population households median_income median_house_value \\\n", + "0 1015.0 472.0 1.4936 66900.0 \n", + "1 1129.0 463.0 1.8200 80100.0 \n", + "2 333.0 117.0 1.6509 85700.0 \n", + "3 515.0 226.0 3.1917 73400.0 \n", + "4 624.0 262.0 1.9250 65500.0 \n", + "... ... ... ... ... \n", + "16995 907.0 369.0 2.3571 111400.0 \n", + "16996 1194.0 465.0 2.5179 79000.0 \n", + "16997 1244.0 456.0 3.0313 103600.0 \n", + "16998 1298.0 478.0 1.9797 85800.0 \n", + "16999 806.0 270.0 3.0147 94600.0 \n", + "\n", + " distance_to_school distance_to_hospital \n", + "0 3.694888 8.187319 \n", + "1 3.552591 7.966235 \n", + "2 3.453940 8.143077 \n", + "3 3.448840 8.154416 \n", + "4 3.456848 8.183508 \n", + "... ... ... 
\n", + "16995 9.082070 4.233675 \n", + "16996 9.168915 4.332320 \n", + "16997 10.057614 5.358694 \n", + "16998 10.026465 5.322593 \n", + "16999 9.115597 4.249012 \n", + "\n", + "[10171 rows x 11 columns]" + ] + }, + "execution_count": 70, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "far_houses_df" + ] + }, + { + "cell_type": "code", + "execution_count": 72, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "TtestResult(statistic=np.float64(37.992330214201516), pvalue=np.float64(1.5032478884296307e-301), df=np.float64(14571.229910954282))" + ] + }, + "execution_count": 72, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "st.ttest_ind(close_houses_df[\"median_house_value\"], far_houses_df[\"median_house_value\"], equal_var=False, alternative = \"greater\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# alternative takes a string value, so \"greater\" must be quoted; it is not a variable\n", + "# reject the null hypothesis: the p-value (about 1.5e-301) is far below 0.05" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python [conda env:base] *", + "language": "python", + "name": "conda-base-py" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.13.5" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/lab-hypothesis-testing.ipynb b/lab-hypothesis-testing.ipynb index 0cc26d5..465f94a 100644 --- a/lab-hypothesis-testing.ipynb +++ b/lab-hypothesis-testing.ipynb @@ -38,21 +38,22 @@ }, { "cell_type": "code", - "execution_count": 1, + "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "#libraries\n", "import pandas as pd\n", "import scipy.stats as st\n", - "import numpy as np\n", - "\n" + "import numpy as np\n" ] }, { "cell_type": "code", - 
"execution_count": 3, - "metadata": {}, + "execution_count": 5, + "metadata": { + "scrolled": true + }, "outputs": [ { "data": { @@ -90,76 +91,2217 @@ " \n", " \n", " \n", - " 0\n", - " Bulbasaur\n", + " 740\n", + " Skiddo\n", " Grass\n", - " Poison\n", - " 45\n", - " 49\n", - " 49\n", - " 65\n", + " NaN\n", + " 66\n", " 65\n", - " 45\n", - " 1\n", + " 48\n", + " 62\n", + " 57\n", + " 52\n", + " 6\n", " False\n", " \n", " \n", - " 1\n", - " Ivysaur\n", + " 741\n", + " Gogoat\n", " Grass\n", - " Poison\n", + " NaN\n", + " 123\n", + " 100\n", + " 62\n", + " 97\n", + " 81\n", + " 68\n", + " 6\n", + " False\n", + " \n", + " \n", + " 742\n", + " Pancham\n", + " Fighting\n", + " NaN\n", + " 67\n", + " 82\n", + " 62\n", + " 46\n", + " 48\n", + " 43\n", + " 6\n", + " False\n", + " \n", + " \n", + " 743\n", + " Pangoro\n", + " Fighting\n", + " Dark\n", + " 95\n", + " 124\n", + " 78\n", + " 69\n", + " 71\n", + " 58\n", + " 6\n", + " False\n", + " \n", + " \n", + " 744\n", + " Furfrou\n", + " Normal\n", + " NaN\n", + " 75\n", + " 80\n", " 60\n", + " 65\n", + " 90\n", + " 102\n", + " 6\n", + " False\n", + " \n", + " \n", + " 745\n", + " Espurr\n", + " Psychic\n", + " NaN\n", " 62\n", + " 48\n", + " 54\n", " 63\n", - " 80\n", - " 80\n", " 60\n", - " 1\n", + " 68\n", + " 6\n", " False\n", " \n", " \n", - " 2\n", - " Venusaur\n", - " Grass\n", - " Poison\n", - " 80\n", - " 82\n", + " 746\n", + " Meowstic Male\n", + " Psychic\n", + " NaN\n", + " 74\n", + " 48\n", + " 76\n", " 83\n", - " 100\n", - " 100\n", - " 80\n", - " 1\n", + " 81\n", + " 104\n", + " 6\n", " False\n", " \n", " \n", - " 3\n", - " Mega Venusaur\n", - " Grass\n", - " Poison\n", + " 747\n", + " Meowstic Female\n", + " Psychic\n", + " NaN\n", + " 74\n", + " 48\n", + " 76\n", + " 83\n", + " 81\n", + " 104\n", + " 6\n", + " False\n", + " \n", + " \n", + " 748\n", + " Honedge\n", + " Steel\n", + " Ghost\n", + " 45\n", " 80\n", " 100\n", - " 123\n", - " 122\n", - " 120\n", - " 80\n", - " 1\n", + " 35\n", + " 37\n", + " 
28\n", + " 6\n", " False\n", " \n", " \n", - " 4\n", - " Charmander\n", - " Fire\n", + " 749\n", + " Doublade\n", + " Steel\n", + " Ghost\n", + " 59\n", + " 110\n", + " 150\n", + " 45\n", + " 49\n", + " 35\n", + " 6\n", + " False\n", + " \n", + " \n", + " 750\n", + " Aegislash Blade Forme\n", + " Steel\n", + " Ghost\n", + " 60\n", + " 150\n", + " 50\n", + " 150\n", + " 50\n", + " 60\n", + " 6\n", + " False\n", + " \n", + " \n", + " 751\n", + " Aegislash Shield Forme\n", + " Steel\n", + " Ghost\n", + " 60\n", + " 50\n", + " 150\n", + " 50\n", + " 150\n", + " 60\n", + " 6\n", + " False\n", + " \n", + " \n", + " 752\n", + " Spritzee\n", + " Fairy\n", " NaN\n", - " 39\n", + " 78\n", " 52\n", - " 43\n", " 60\n", + " 63\n", + " 65\n", + " 23\n", + " 6\n", + " False\n", + " \n", + " \n", + " 753\n", + " Aromatisse\n", + " Fairy\n", + " NaN\n", + " 101\n", + " 72\n", + " 72\n", + " 99\n", + " 89\n", + " 29\n", + " 6\n", + " False\n", + " \n", + " \n", + " 754\n", + " Swirlix\n", + " Fairy\n", + " NaN\n", + " 62\n", + " 48\n", + " 66\n", + " 59\n", + " 57\n", + " 49\n", + " 6\n", + " False\n", + " \n", + " \n", + " 755\n", + " Slurpuff\n", + " Fairy\n", + " NaN\n", + " 82\n", + " 80\n", + " 86\n", + " 85\n", + " 75\n", + " 72\n", + " 6\n", + " False\n", + " \n", + " \n", + " 756\n", + " Inkay\n", + " Dark\n", + " Psychic\n", + " 53\n", + " 54\n", + " 53\n", + " 37\n", + " 46\n", + " 45\n", + " 6\n", + " False\n", + " \n", + " \n", + " 757\n", + " Malamar\n", + " Dark\n", + " Psychic\n", + " 86\n", + " 92\n", + " 88\n", + " 68\n", + " 75\n", + " 73\n", + " 6\n", + " False\n", + " \n", + " \n", + " 758\n", + " Binacle\n", + " Rock\n", + " Water\n", + " 42\n", + " 52\n", + " 67\n", + " 39\n", + " 56\n", + " 50\n", + " 6\n", + " False\n", + " \n", + " \n", + " 759\n", + " Barbaracle\n", + " Rock\n", + " Water\n", + " 72\n", + " 105\n", + " 115\n", + " 54\n", + " 86\n", + " 68\n", + " 6\n", + " False\n", + " \n", + " \n", + " 760\n", + " Skrelp\n", + " Poison\n", + " Water\n", " 
50\n", + " 60\n", + " 60\n", + " 60\n", + " 60\n", + " 30\n", + " 6\n", + " False\n", + " \n", + " \n", + " 761\n", + " Dragalge\n", + " Poison\n", + " Dragon\n", " 65\n", - " 1\n", + " 75\n", + " 90\n", + " 97\n", + " 123\n", + " 44\n", + " 6\n", + " False\n", + " \n", + " \n", + " 762\n", + " Clauncher\n", + " Water\n", + " NaN\n", + " 50\n", + " 53\n", + " 62\n", + " 58\n", + " 63\n", + " 44\n", + " 6\n", + " False\n", + " \n", + " \n", + " 763\n", + " Clawitzer\n", + " Water\n", + " NaN\n", + " 71\n", + " 73\n", + " 88\n", + " 120\n", + " 89\n", + " 59\n", + " 6\n", + " False\n", + " \n", + " \n", + " 764\n", + " Helioptile\n", + " Electric\n", + " Normal\n", + " 44\n", + " 38\n", + " 33\n", + " 61\n", + " 43\n", + " 70\n", + " 6\n", + " False\n", + " \n", + " \n", + " 765\n", + " Heliolisk\n", + " Electric\n", + " Normal\n", + " 62\n", + " 55\n", + " 52\n", + " 109\n", + " 94\n", + " 109\n", + " 6\n", + " False\n", + " \n", + " \n", + " 766\n", + " Tyrunt\n", + " Rock\n", + " Dragon\n", + " 58\n", + " 89\n", + " 77\n", + " 45\n", + " 45\n", + " 48\n", + " 6\n", + " False\n", + " \n", + " \n", + " 767\n", + " Tyrantrum\n", + " Rock\n", + " Dragon\n", + " 82\n", + " 121\n", + " 119\n", + " 69\n", + " 59\n", + " 71\n", + " 6\n", + " False\n", + " \n", + " \n", + " 768\n", + " Amaura\n", + " Rock\n", + " Ice\n", + " 77\n", + " 59\n", + " 50\n", + " 67\n", + " 63\n", + " 46\n", + " 6\n", + " False\n", + " \n", + " \n", + " 769\n", + " Aurorus\n", + " Rock\n", + " Ice\n", + " 123\n", + " 77\n", + " 72\n", + " 99\n", + " 92\n", + " 58\n", + " 6\n", " False\n", " \n", " \n", + " 770\n", + " Sylveon\n", + " Fairy\n", + " NaN\n", + " 95\n", + " 65\n", + " 65\n", + " 110\n", + " 130\n", + " 60\n", + " 6\n", + " False\n", + " \n", + " \n", + " 771\n", + " Hawlucha\n", + " Fighting\n", + " Flying\n", + " 78\n", + " 92\n", + " 75\n", + " 74\n", + " 63\n", + " 118\n", + " 6\n", + " False\n", + " \n", + " \n", + " 772\n", + " Dedenne\n", + " Electric\n", + " Fairy\n", + " 
67\n", + " 58\n", + " 57\n", + " 81\n", + " 67\n", + " 101\n", + " 6\n", + " False\n", + " \n", + " \n", + " 773\n", + " Carbink\n", + " Rock\n", + " Fairy\n", + " 50\n", + " 50\n", + " 150\n", + " 50\n", + " 150\n", + " 50\n", + " 6\n", + " False\n", + " \n", + " \n", + " 774\n", + " Goomy\n", + " Dragon\n", + " NaN\n", + " 45\n", + " 50\n", + " 35\n", + " 55\n", + " 75\n", + " 40\n", + " 6\n", + " False\n", + " \n", + " \n", + " 775\n", + " Sliggoo\n", + " Dragon\n", + " NaN\n", + " 68\n", + " 75\n", + " 53\n", + " 83\n", + " 113\n", + " 60\n", + " 6\n", + " False\n", + " \n", + " \n", + " 776\n", + " Goodra\n", + " Dragon\n", + " NaN\n", + " 90\n", + " 100\n", + " 70\n", + " 110\n", + " 150\n", + " 80\n", + " 6\n", + " False\n", + " \n", + " \n", + " 777\n", + " Klefki\n", + " Steel\n", + " Fairy\n", + " 57\n", + " 80\n", + " 91\n", + " 80\n", + " 87\n", + " 75\n", + " 6\n", + " False\n", + " \n", + " \n", + " 778\n", + " Phantump\n", + " Ghost\n", + " Grass\n", + " 43\n", + " 70\n", + " 48\n", + " 50\n", + " 60\n", + " 38\n", + " 6\n", + " False\n", + " \n", + " \n", + " 779\n", + " Trevenant\n", + " Ghost\n", + " Grass\n", + " 85\n", + " 110\n", + " 76\n", + " 65\n", + " 82\n", + " 56\n", + " 6\n", + " False\n", + " \n", + " \n", + " 780\n", + " Pumpkaboo Average Size\n", + " Ghost\n", + " Grass\n", + " 49\n", + " 66\n", + " 70\n", + " 44\n", + " 55\n", + " 51\n", + " 6\n", + " False\n", + " \n", + " \n", + " 781\n", + " Pumpkaboo Small Size\n", + " Ghost\n", + " Grass\n", + " 44\n", + " 66\n", + " 70\n", + " 44\n", + " 55\n", + " 56\n", + " 6\n", + " False\n", + " \n", + " \n", + " 782\n", + " Pumpkaboo Large Size\n", + " Ghost\n", + " Grass\n", + " 54\n", + " 66\n", + " 70\n", + " 44\n", + " 55\n", + " 46\n", + " 6\n", + " False\n", + " \n", + " \n", + " 783\n", + " Pumpkaboo Super Size\n", + " Ghost\n", + " Grass\n", + " 59\n", + " 66\n", + " 70\n", + " 44\n", + " 55\n", + " 41\n", + " 6\n", + " False\n", + " \n", + " \n", + " 784\n", + " Gourgeist Average 
Size\n", + " Ghost\n", + " Grass\n", + " 65\n", + " 90\n", + " 122\n", + " 58\n", + " 75\n", + " 84\n", + " 6\n", + " False\n", + " \n", + " \n", + " 785\n", + " Gourgeist Small Size\n", + " Ghost\n", + " Grass\n", + " 55\n", + " 85\n", + " 122\n", + " 58\n", + " 75\n", + " 99\n", + " 6\n", + " False\n", + " \n", + " \n", + " 786\n", + " Gourgeist Large Size\n", + " Ghost\n", + " Grass\n", + " 75\n", + " 95\n", + " 122\n", + " 58\n", + " 75\n", + " 69\n", + " 6\n", + " False\n", + " \n", + " \n", + " 787\n", + " Gourgeist Super Size\n", + " Ghost\n", + " Grass\n", + " 85\n", + " 100\n", + " 122\n", + " 58\n", + " 75\n", + " 54\n", + " 6\n", + " False\n", + " \n", + " \n", + " 788\n", + " Bergmite\n", + " Ice\n", + " NaN\n", + " 55\n", + " 69\n", + " 85\n", + " 32\n", + " 35\n", + " 28\n", + " 6\n", + " False\n", + " \n", + " \n", + " 789\n", + " Avalugg\n", + " Ice\n", + " NaN\n", + " 95\n", + " 117\n", + " 184\n", + " 44\n", + " 46\n", + " 28\n", + " 6\n", + " False\n", + " \n", + " \n", + " 790\n", + " Noibat\n", + " Flying\n", + " Dragon\n", + " 40\n", + " 30\n", + " 35\n", + " 45\n", + " 40\n", + " 55\n", + " 6\n", + " False\n", + " \n", + " \n", + " 791\n", + " Noivern\n", + " Flying\n", + " Dragon\n", + " 85\n", + " 70\n", + " 80\n", + " 97\n", + " 80\n", + " 123\n", + " 6\n", + " False\n", + " \n", + " \n", + " 792\n", + " Xerneas\n", + " Fairy\n", + " NaN\n", + " 126\n", + " 131\n", + " 95\n", + " 131\n", + " 98\n", + " 99\n", + " 6\n", + " True\n", + " \n", + " \n", + " 793\n", + " Yveltal\n", + " Dark\n", + " Flying\n", + " 126\n", + " 131\n", + " 95\n", + " 131\n", + " 98\n", + " 99\n", + " 6\n", + " True\n", + " \n", + " \n", + " 794\n", + " Zygarde Half Forme\n", + " Dragon\n", + " Ground\n", + " 108\n", + " 100\n", + " 121\n", + " 81\n", + " 95\n", + " 95\n", + " 6\n", + " True\n", + " \n", + " \n", + " 795\n", + " Diancie\n", + " Rock\n", + " Fairy\n", + " 50\n", + " 100\n", + " 150\n", + " 100\n", + " 150\n", + " 50\n", + " 6\n", + " True\n", + " 
\n", + " \n", + " 796\n", + " Mega Diancie\n", + " Rock\n", + " Fairy\n", + " 50\n", + " 160\n", + " 110\n", + " 160\n", + " 110\n", + " 110\n", + " 6\n", + " True\n", + " \n", + " \n", + " 797\n", + " Hoopa Confined\n", + " Psychic\n", + " Ghost\n", + " 80\n", + " 110\n", + " 60\n", + " 150\n", + " 130\n", + " 70\n", + " 6\n", + " True\n", + " \n", + " \n", + " 798\n", + " Hoopa Unbound\n", + " Psychic\n", + " Dark\n", + " 80\n", + " 160\n", + " 60\n", + " 170\n", + " 130\n", + " 80\n", + " 6\n", + " True\n", + " \n", + " \n", + " 799\n", + " Volcanion\n", + " Fire\n", + " Water\n", + " 80\n", + " 110\n", + " 120\n", + " 130\n", + " 90\n", + " 70\n", + " 6\n", + " True\n", + " \n", + " \n", + "\n", + "" + ], + "text/plain": [ + " Name Type 1 Type 2 HP Attack Defense Sp. Atk \\\n", + "740 Skiddo Grass NaN 66 65 48 62 \n", + "741 Gogoat Grass NaN 123 100 62 97 \n", + "742 Pancham Fighting NaN 67 82 62 46 \n", + "743 Pangoro Fighting Dark 95 124 78 69 \n", + "744 Furfrou Normal NaN 75 80 60 65 \n", + "745 Espurr Psychic NaN 62 48 54 63 \n", + "746 Meowstic Male Psychic NaN 74 48 76 83 \n", + "747 Meowstic Female Psychic NaN 74 48 76 83 \n", + "748 Honedge Steel Ghost 45 80 100 35 \n", + "749 Doublade Steel Ghost 59 110 150 45 \n", + "750 Aegislash Blade Forme Steel Ghost 60 150 50 150 \n", + "751 Aegislash Shield Forme Steel Ghost 60 50 150 50 \n", + "752 Spritzee Fairy NaN 78 52 60 63 \n", + "753 Aromatisse Fairy NaN 101 72 72 99 \n", + "754 Swirlix Fairy NaN 62 48 66 59 \n", + "755 Slurpuff Fairy NaN 82 80 86 85 \n", + "756 Inkay Dark Psychic 53 54 53 37 \n", + "757 Malamar Dark Psychic 86 92 88 68 \n", + "758 Binacle Rock Water 42 52 67 39 \n", + "759 Barbaracle Rock Water 72 105 115 54 \n", + "760 Skrelp Poison Water 50 60 60 60 \n", + "761 Dragalge Poison Dragon 65 75 90 97 \n", + "762 Clauncher Water NaN 50 53 62 58 \n", + "763 Clawitzer Water NaN 71 73 88 120 \n", + "764 Helioptile Electric Normal 44 38 33 61 \n", + "765 Heliolisk Electric Normal 62 55 52 109 
\n", + "766 Tyrunt Rock Dragon 58 89 77 45 \n", + "767 Tyrantrum Rock Dragon 82 121 119 69 \n", + "768 Amaura Rock Ice 77 59 50 67 \n", + "769 Aurorus Rock Ice 123 77 72 99 \n", + "770 Sylveon Fairy NaN 95 65 65 110 \n", + "771 Hawlucha Fighting Flying 78 92 75 74 \n", + "772 Dedenne Electric Fairy 67 58 57 81 \n", + "773 Carbink Rock Fairy 50 50 150 50 \n", + "774 Goomy Dragon NaN 45 50 35 55 \n", + "775 Sliggoo Dragon NaN 68 75 53 83 \n", + "776 Goodra Dragon NaN 90 100 70 110 \n", + "777 Klefki Steel Fairy 57 80 91 80 \n", + "778 Phantump Ghost Grass 43 70 48 50 \n", + "779 Trevenant Ghost Grass 85 110 76 65 \n", + "780 Pumpkaboo Average Size Ghost Grass 49 66 70 44 \n", + "781 Pumpkaboo Small Size Ghost Grass 44 66 70 44 \n", + "782 Pumpkaboo Large Size Ghost Grass 54 66 70 44 \n", + "783 Pumpkaboo Super Size Ghost Grass 59 66 70 44 \n", + "784 Gourgeist Average Size Ghost Grass 65 90 122 58 \n", + "785 Gourgeist Small Size Ghost Grass 55 85 122 58 \n", + "786 Gourgeist Large Size Ghost Grass 75 95 122 58 \n", + "787 Gourgeist Super Size Ghost Grass 85 100 122 58 \n", + "788 Bergmite Ice NaN 55 69 85 32 \n", + "789 Avalugg Ice NaN 95 117 184 44 \n", + "790 Noibat Flying Dragon 40 30 35 45 \n", + "791 Noivern Flying Dragon 85 70 80 97 \n", + "792 Xerneas Fairy NaN 126 131 95 131 \n", + "793 Yveltal Dark Flying 126 131 95 131 \n", + "794 Zygarde Half Forme Dragon Ground 108 100 121 81 \n", + "795 Diancie Rock Fairy 50 100 150 100 \n", + "796 Mega Diancie Rock Fairy 50 160 110 160 \n", + "797 Hoopa Confined Psychic Ghost 80 110 60 150 \n", + "798 Hoopa Unbound Psychic Dark 80 160 60 170 \n", + "799 Volcanion Fire Water 80 110 120 130 \n", + "\n", + " Sp. 
Def Speed Generation Legendary \n", + "740 57 52 6 False \n", + "741 81 68 6 False \n", + "742 48 43 6 False \n", + "743 71 58 6 False \n", + "744 90 102 6 False \n", + "745 60 68 6 False \n", + "746 81 104 6 False \n", + "747 81 104 6 False \n", + "748 37 28 6 False \n", + "749 49 35 6 False \n", + "750 50 60 6 False \n", + "751 150 60 6 False \n", + "752 65 23 6 False \n", + "753 89 29 6 False \n", + "754 57 49 6 False \n", + "755 75 72 6 False \n", + "756 46 45 6 False \n", + "757 75 73 6 False \n", + "758 56 50 6 False \n", + "759 86 68 6 False \n", + "760 60 30 6 False \n", + "761 123 44 6 False \n", + "762 63 44 6 False \n", + "763 89 59 6 False \n", + "764 43 70 6 False \n", + "765 94 109 6 False \n", + "766 45 48 6 False \n", + "767 59 71 6 False \n", + "768 63 46 6 False \n", + "769 92 58 6 False \n", + "770 130 60 6 False \n", + "771 63 118 6 False \n", + "772 67 101 6 False \n", + "773 150 50 6 False \n", + "774 75 40 6 False \n", + "775 113 60 6 False \n", + "776 150 80 6 False \n", + "777 87 75 6 False \n", + "778 60 38 6 False \n", + "779 82 56 6 False \n", + "780 55 51 6 False \n", + "781 55 56 6 False \n", + "782 55 46 6 False \n", + "783 55 41 6 False \n", + "784 75 84 6 False \n", + "785 75 99 6 False \n", + "786 75 69 6 False \n", + "787 75 54 6 False \n", + "788 35 28 6 False \n", + "789 46 28 6 False \n", + "790 40 55 6 False \n", + "791 80 123 6 False \n", + "792 98 99 6 True \n", + "793 98 99 6 True \n", + "794 95 95 6 True \n", + "795 150 50 6 True \n", + "796 110 110 6 True \n", + "797 130 70 6 True \n", + "798 130 80 6 True \n", + "799 90 70 6 True " + ] + }, + "execution_count": 5, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "df = pd.read_csv(\"https://raw.githubusercontent.com/data-bootcamp-v4/data/main/pokemon.csv\")\n", + "df.tail(60)" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "0 Bulbasaur\n", + "1 Ivysaur\n", + "2 
Venusaur\n", + "3 Mega Venusaur\n", + "4 Charmander\n", + " ... \n", + "795 Diancie\n", + "796 Mega Diancie\n", + "797 Hoopa Confined\n", + "798 Hoopa Unbound\n", + "799 Volcanion\n", + "Name: Name, Length: 800, dtype: object" + ] + }, + "execution_count": 6, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "pokemon_name = df[\"Name\"]\n", + "pokemon_name" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "Index(['Name', 'Type 1', 'Type 2', 'HP', 'Attack', 'Defense', 'Sp. Atk',\n", + " 'Sp. Def', 'Speed', 'Generation', 'Legendary'],\n", + " dtype='object')" + ] + }, + "execution_count": 7, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "df.columns" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "- We posit that Dragon-type Pokemon have, on average, higher HP than Grass-type Pokemon. Choose the proper test and, at the 5% significance level, comment on your findings."
+ ] + }, + { + "cell_type": "code", + "execution_count": 33, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "0 45\n", + "1 60\n", + "2 80\n", + "3 80\n", + "48 45\n", + " ..\n", + "783 59\n", + "784 65\n", + "785 55\n", + "786 75\n", + "787 85\n", + "Name: HP, Length: 95, dtype: int64" + ] + }, + "execution_count": 33, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "pokemon_grass = df[(df[\"Type 1\"]==\"Grass\") | (df[\"Type 2\"]==\"Grass\")] [\"HP\"]\n", + "pokemon_grass" + ] + }, + { + "cell_type": "code", + "execution_count": 34, + "metadata": { + "scrolled": true + }, + "outputs": [ + { + "data": { + "text/plain": [ + "7 78\n", + "159 41\n", + "160 61\n", + "161 91\n", + "196 90\n", + "249 75\n", + "275 70\n", + "360 50\n", + "361 80\n", + "365 75\n", + "366 75\n", + "406 45\n", + "407 65\n", + "408 95\n", + "409 95\n", + "417 80\n", + "418 80\n", + "419 80\n", + "420 80\n", + "425 105\n", + "426 105\n", + "491 58\n", + "492 68\n", + "493 108\n", + "494 108\n", + "540 100\n", + "541 90\n", + "544 150\n", + "545 150\n", + "671 46\n", + "672 66\n", + "673 76\n", + "682 77\n", + "694 52\n", + "695 72\n", + "696 92\n", + "706 100\n", + "707 100\n", + "710 125\n", + "711 125\n", + "712 125\n", + "761 65\n", + "766 58\n", + "767 82\n", + "774 45\n", + "775 68\n", + "776 90\n", + "790 40\n", + "791 85\n", + "794 108\n", + "Name: HP, dtype: int64" + ] + }, + "execution_count": 34, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "#code here\n", + "pokemon_dragon = df[(df[\"Type 1\"]==\"Dragon\") | (df[\"Type 2\"]==\"Dragon\")][\"HP\"] # df not saved filtered df, just HP values\n", + "pokemon_dragon" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# 5% significance, 95% confidence level\n", + "# This is the long version.\n", + "# python shorter version below\n", + "\n", + "#sample mean\n", + "# mean = 
pokemon_dragon.mean()\n", + "\n", + "#standard deviation of sample\n", + "#s = pokemon_dragon.std(ddof=1)\n", + "\n", + "#sample size\n", + "#n = len(pokemon_dragon)\n", + "\n", + "#hypothesized population mean\n", + "#mu = ?\n", + "\n", + "#stat = (mean - mu)/(s/np.sqrt(n))\n", + "#stat" + ] + }, + { + "cell_type": "code", + "execution_count": 37, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "TtestResult(statistic=np.float64(4.097528915272702), pvalue=np.float64(0.00010181538122353851), df=np.float64(77.58086781513519))" + ] + }, + "execution_count": 37, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# python shorter version\n", + "# two sample T-test Dragon greater on average, more HP stats than Grass\n", + "# with 5% significance\n", + "# ttest_ind(sample1,sample2,alternative = \"greater\")\n", + "\n", + "st.ttest_ind(pokemon_dragon,pokemon_grass, equal_var=False, alternative = \"two-sided\")\n" + ] + }, + { + "cell_type": "code", + "execution_count": 48, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "7 78\n", + "159 41\n", + "160 61\n", + "161 91\n", + "196 90\n", + "249 75\n", + "275 70\n", + "360 50\n", + "361 80\n", + "365 75\n", + "366 75\n", + "406 45\n", + "407 65\n", + "408 95\n", + "409 95\n", + "417 80\n", + "418 80\n", + "419 80\n", + "420 80\n", + "425 105\n", + "426 105\n", + "491 58\n", + "492 68\n", + "493 108\n", + "494 108\n", + "540 100\n", + "541 90\n", + "544 150\n", + "545 150\n", + "671 46\n", + "672 66\n", + "673 76\n", + "682 77\n", + "694 52\n", + "695 72\n", + "696 92\n", + "706 100\n", + "707 100\n", + "710 125\n", + "711 125\n", + "712 125\n", + "761 65\n", + "766 58\n", + "767 82\n", + "774 45\n", + "775 68\n", + "776 90\n", + "790 40\n", + "791 85\n", + "794 108\n", + "Name: HP, dtype: int64" + ] + }, + "execution_count": 48, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "pokemon_dragon" + ] + }, + { + "cell_type": "code", + 
"execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Because p_value is lower than significance level, we reject the null hypothesis, this means that Dragon on average don't have more HP stats than Grass" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "- We posit that Legendary Pokemons have different stats (HP, Attack, Defense, Sp.Atk, Sp.Def, Speed) when comparing with Non-Legendary. Choose the propper test and, with 5% significance, comment your findings.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 45, + "metadata": {}, + "outputs": [], + "source": [ + "pokemon_legendary = df[(df[\"Legendary\"]==True)]\n", + "pokemon_non_legendary = df[(df[\"Legendary\"]==False)]" + ] + }, + { + "cell_type": "code", + "execution_count": 46, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
" + ], + "text/plain": [ + " Name Type 1 Type 2 HP Attack Defense Sp. Atk \\\n", + "156 Articuno Ice Flying 90 85 100 95 \n", + "157 Zapdos Electric Flying 90 90 85 125 \n", + "158 Moltres Fire Flying 90 100 90 125 \n", + "162 Mewtwo Psychic NaN 106 110 90 154 \n", + "163 Mega Mewtwo X Psychic Fighting 106 190 100 154 \n", + ".. ... ... ... ... ... ... ... \n", + "795 Diancie Rock Fairy 50 100 150 100 \n", + "796 Mega Diancie Rock Fairy 50 160 110 160 \n", + "797 Hoopa Confined Psychic Ghost 80 110 60 150 \n", + "798 Hoopa Unbound Psychic Dark 80 160 60 170 \n", + "799 Volcanion Fire Water 80 110 120 130 \n", + "\n", + " Sp. Def Speed Generation Legendary \n", + "156 125 85 1 True \n", + "157 90 100 1 True \n", + "158 85 90 1 True \n", + "162 90 130 1 True \n", + "163 100 130 1 True \n", + ".. ... ... ... ... \n", + "795 150 50 6 True \n", + "796 110 110 6 True \n", + "797 130 70 6 True \n", + "798 130 80 6 True \n", + "799 90 70 6 True \n", + "\n", + "[65 rows x 11 columns]" + ] + }, + "execution_count": 46, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "pokemon_legendary" + ] + }, + { + "cell_type": "code", + "execution_count": 47, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
" + ], + "text/plain": [ + " Name Type 1 Type 2 HP Attack Defense Sp. Atk \\\n", + "0 Bulbasaur Grass Poison 45 49 49 65 \n", + "1 Ivysaur Grass Poison 60 62 63 80 \n", + "2 Venusaur Grass Poison 80 82 83 100 \n", + "3 Mega Venusaur Grass Poison 80 100 123 122 \n", + "4 Charmander Fire NaN 39 52 43 60 \n", + ".. ... ... ... .. ... ... ... \n", + "787 Gourgeist Super Size Ghost Grass 85 100 122 58 \n", + "788 Bergmite Ice NaN 55 69 85 32 \n", + "789 Avalugg Ice NaN 95 117 184 44 \n", + "790 Noibat Flying Dragon 40 30 35 45 \n", + "791 Noivern Flying Dragon 85 70 80 97 \n", + "\n", + " Sp. Def Speed Generation Legendary \n", + "0 65 45 1 False \n", + "1 80 60 1 False \n", + "2 100 80 1 False \n", + "3 120 80 1 False \n", + "4 50 65 1 False \n", + ".. ... ... ... ... \n", + "787 75 54 6 False \n", + "788 35 28 6 False \n", + "789 46 28 6 False \n", + "790 40 55 6 False \n", + "791 80 123 6 False \n", + "\n", + "[735 rows x 11 columns]" + ] + }, + "execution_count": 47, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "pokemon_non_legendary" + ] + }, + { + "cell_type": "code", + "execution_count": 61, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "0 65\n", + "1 80\n", + "2 100\n", + "3 122\n", + "4 60\n", + " ... \n", + "787 58\n", + "788 32\n", + "789 44\n", + "790 45\n", + "791 97\n", + "Name: Sp. Atk, Length: 735, dtype: int64" + ] + }, + "execution_count": 61, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "pokemon_non_legendary[\"Sp. Atk\"]" + ] + }, + { + "cell_type": "code", + "execution_count": 44, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "Name object\n", + "Type 1 object\n", + "Type 2 object\n", + "HP int64\n", + "Attack int64\n", + "Defense int64\n", + "Sp. Atk int64\n", + "Sp. 
Def int64\n", + "Speed int64\n", + "Generation int64\n", + "Legendary bool\n", + "dtype: object" + ] + }, + "execution_count": 44, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "df.dtypes" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": 56, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "the pvalue of HP is 1.0026911708035284e-13\n", + "reject the null hypothesis\n", + "the pvalue of Attack is 2.520372449236646e-16\n", + "reject the null hypothesis\n", + "the pvalue of Defense is 4.8269984949193316e-11\n", + "reject the null hypothesis\n", + "the pvalue of Sp. Atk is 1.5514614112239812e-21\n", + "reject the null hypothesis\n", + "the pvalue of Sp. Def is 2.2949327864052826e-15\n", + "reject the null hypothesis\n", + "the pvalue of Speed is 1.049016311882451e-18\n", + "reject the null hypothesis\n" + ] + } + ], + "source": [ + "# Run one Welch two-sample t-test per stat, Legendary vs Non-Legendary:\n", + "\n", + "stats = [\"HP\", \"Attack\", \"Defense\", \"Sp. Atk\", \"Sp. Def\", \"Speed\"] # quoted items are strings (column names); unquoted names would be variables\n", + "\n", + "# for vs while:\n", + "# use for when the iterations are known, while when only the stopping condition is known\n", + "\n", + "for x in stats:\n", + "    stat, pvalue = st.ttest_ind(pokemon_legendary[x], pokemon_non_legendary[x], equal_var=False, alternative=\"two-sided\")\n", + "    print(f\"the pvalue of {x} is {pvalue}\") # inside an f-string, {} is replaced by the variable's current value\n", + "    if pvalue < .05:\n", + "        print(\"reject the null hypothesis\")\n", + "    else:\n", + "        print(\"unable to reject the null hypothesis\")" + ] + }, + { + "cell_type": "code", + "execution_count": 51, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "156 90\n", + "157 90\n", + "158 90\n", + "162 106\n", + "163 106\n", + " ... 
\n", + "795 50\n", + "796 50\n", + "797 80\n", + "798 80\n", + "799 80\n", + "Name: HP, Length: 65, dtype: int64" + ] + }, + "execution_count": 51, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "pokemon_legendary[\"HP\"]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Challenge 2**" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In this challenge, we will be working with california-housing data. The data can be found here:\n", + "- https://raw.githubusercontent.com/data-bootcamp-v4/data/main/california_housing.csv" + ] + }, + { + "cell_type": "code", + "execution_count": 57, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
" + ], + "text/plain": [ + " longitude latitude housing_median_age total_rooms total_bedrooms \\\n", + "0 -114.31 34.19 15.0 5612.0 1283.0 \n", + "1 -114.47 34.40 19.0 7650.0 1901.0 \n", + "2 -114.56 33.69 17.0 720.0 174.0 \n", + "3 -114.57 33.64 14.0 1501.0 337.0 \n", + "4 -114.57 33.57 20.0 1454.0 326.0 \n", + "\n", + " population households median_income median_house_value \n", + "0 1015.0 472.0 1.4936 66900.0 \n", + "1 1129.0 463.0 1.8200 80100.0 \n", + "2 333.0 117.0 1.6509 85700.0 \n", + "3 515.0 226.0 3.1917 73400.0 \n", + "4 624.0 262.0 1.9250 65500.0 " + ] + }, + "execution_count": 57, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "df = pd.read_csv(\"https://raw.githubusercontent.com/data-bootcamp-v4/data/main/california_housing.csv\")\n", + "df.head()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**We posit that houses close to either a school or a hospital are more expensive.**\n", + "\n", + "- School coordinates (-118, 34)\n", + "- Hospital coordinates (-122, 37)\n", + "\n", + "We consider a house (neighborhood) to be close to a school or hospital if the distance is lower than 0.50.\n", + "\n", + "Hint:\n", + "- Write a function to calculate euclidean distance from each house (neighborhood) to the school and to the hospital.\n", + "- Divide your dataset into houses close and far from either a hospital or school.\n", + "- Choose the propper test and, with 5% significance, comment your findings.\n", + " " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# d = √((x₂ - x₁)² + (y₂ - y₁)²),\n" + ] + }, + { + "cell_type": "code", + "execution_count": 62, + "metadata": {}, + "outputs": [], + "source": [ + "df[\"distance_to_school\"] = ((df[\"longitude\"]- -118)**2 + (df[\"latitude\"] - 34)**2)**0.5" + ] + }, + { + "cell_type": "code", + "execution_count": 65, + "metadata": {}, + "outputs": [], + "source": [ + 
"df[\"distance_to_hospital\"] = ((df[\"longitude\"]- -122)**2 + (df[\"latitude\"] - 37)**2)**0.5" + ] + }, + { + "cell_type": "code", + "execution_count": 66, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", " \n", " \n", " \n", @@ -174,171 +2316,407 @@ " \n", " \n", " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", " \n", " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", " \n", " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", " \n", " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", " \n", " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + 
" \n", + " \n", " \n", " \n", "
longitudelatitudehousing_median_agetotal_roomstotal_bedroomspopulationhouseholdsmedian_incomemedian_house_valuedistance_to_schooldistance_to_hospital
0-114.3134.1915.05612.01283.01015.0472.01.493666900.03.6948888.187319
1-114.4734.4019.07650.01901.01129.0463.01.820080100.03.5525917.966235
2-114.5633.6917.0720.0174.0333.0117.01.650985700.03.4539408.143077
3-114.5733.6414.01501.0337.0515.0226.03.191773400.03.4488408.154416
4-114.5733.5720.01454.0326.0624.0262.01.925065500.03.4568488.183508
............
795DiancieRockFairy50100150100150506True16995-124.2640.5852.02217.0394.0907.0369.02.3571111400.09.0820704.233675
796Mega DiancieRockFairy501601101601101106True16996-124.2740.6936.02349.0528.01194.0465.02.517979000.09.1689154.332320
797Hoopa ConfinedPsychicGhost8011060150130706True16997-124.3041.8417.02677.0531.01244.0456.03.0313103600.010.0576145.358694
798Hoopa UnboundPsychicDark8016060170130806True16998-124.3041.8019.02672.0552.01298.0478.01.979785800.010.0264655.322593
799VolcanionFireWater8011012013090706True16999-124.3540.5452.01820.0300.0806.0270.03.014794600.09.1155974.249012
\n", - "

800 rows × 11 columns

\n", + "

17000 rows × 11 columns

\n", "
" ], "text/plain": [ - " Name Type 1 Type 2 HP Attack Defense Sp. Atk Sp. Def \\\n", - "0 Bulbasaur Grass Poison 45 49 49 65 65 \n", - "1 Ivysaur Grass Poison 60 62 63 80 80 \n", - "2 Venusaur Grass Poison 80 82 83 100 100 \n", - "3 Mega Venusaur Grass Poison 80 100 123 122 120 \n", - "4 Charmander Fire NaN 39 52 43 60 50 \n", - ".. ... ... ... .. ... ... ... ... \n", - "795 Diancie Rock Fairy 50 100 150 100 150 \n", - "796 Mega Diancie Rock Fairy 50 160 110 160 110 \n", - "797 Hoopa Confined Psychic Ghost 80 110 60 150 130 \n", - "798 Hoopa Unbound Psychic Dark 80 160 60 170 130 \n", - "799 Volcanion Fire Water 80 110 120 130 90 \n", + " longitude latitude housing_median_age total_rooms total_bedrooms \\\n", + "0 -114.31 34.19 15.0 5612.0 1283.0 \n", + "1 -114.47 34.40 19.0 7650.0 1901.0 \n", + "2 -114.56 33.69 17.0 720.0 174.0 \n", + "3 -114.57 33.64 14.0 1501.0 337.0 \n", + "4 -114.57 33.57 20.0 1454.0 326.0 \n", + "... ... ... ... ... ... \n", + "16995 -124.26 40.58 52.0 2217.0 394.0 \n", + "16996 -124.27 40.69 36.0 2349.0 528.0 \n", + "16997 -124.30 41.84 17.0 2677.0 531.0 \n", + "16998 -124.30 41.80 19.0 2672.0 552.0 \n", + "16999 -124.35 40.54 52.0 1820.0 300.0 \n", + "\n", + " population households median_income median_house_value \\\n", + "0 1015.0 472.0 1.4936 66900.0 \n", + "1 1129.0 463.0 1.8200 80100.0 \n", + "2 333.0 117.0 1.6509 85700.0 \n", + "3 515.0 226.0 3.1917 73400.0 \n", + "4 624.0 262.0 1.9250 65500.0 \n", + "... ... ... ... ... \n", + "16995 907.0 369.0 2.3571 111400.0 \n", + "16996 1194.0 465.0 2.5179 79000.0 \n", + "16997 1244.0 456.0 3.0313 103600.0 \n", + "16998 1298.0 478.0 1.9797 85800.0 \n", + "16999 806.0 270.0 3.0147 94600.0 \n", "\n", - " Speed Generation Legendary \n", - "0 45 1 False \n", - "1 60 1 False \n", - "2 80 1 False \n", - "3 80 1 False \n", - "4 65 1 False \n", - ".. ... ... ... 
\n", - "795 50 6 True \n", - "796 110 6 True \n", - "797 70 6 True \n", - "798 80 6 True \n", - "799 70 6 True \n", + " distance_to_school distance_to_hospital \n", + "0 3.694888 8.187319 \n", + "1 3.552591 7.966235 \n", + "2 3.453940 8.143077 \n", + "3 3.448840 8.154416 \n", + "4 3.456848 8.183508 \n", + "... ... ... \n", + "16995 9.082070 4.233675 \n", + "16996 9.168915 4.332320 \n", + "16997 10.057614 5.358694 \n", + "16998 10.026465 5.322593 \n", + "16999 9.115597 4.249012 \n", "\n", - "[800 rows x 11 columns]" + "[17000 rows x 11 columns]" ] }, - "execution_count": 3, + "execution_count": 66, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "df = pd.read_csv(\"https://raw.githubusercontent.com/data-bootcamp-v4/data/main/pokemon.csv\")\n", "df" ] }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "- We posit that Pokemons of type Dragon have, on average, more HP stats than Grass. Choose the propper test and, with 5% significance, comment your findings." - ] - }, { "cell_type": "code", - "execution_count": 1, + "execution_count": 67, "metadata": {}, "outputs": [], "source": [ - "#code here" + "close_houses_df = df[(df[\"distance_to_school\"]<0.5) | (df[\"distance_to_hospital\"]<0.5)]" ] }, { - "cell_type": "markdown", + "cell_type": "code", + "execution_count": 68, "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
" + ], + "text/plain": [ + " longitude latitude housing_median_age total_rooms total_bedrooms \\\n", + "2366 -117.51 34.00 36.0 3791.0 746.0 \n", + "2367 -117.51 33.97 35.0 352.0 62.0 \n", + "2368 -117.51 33.95 12.0 9016.0 1486.0 \n", + "2371 -117.52 33.99 14.0 13562.0 2057.0 \n", + "2372 -117.52 33.89 2.0 17978.0 3217.0 \n", + "... ... ... ... ... ... \n", + "15090 -122.25 37.08 20.0 1201.0 282.0 \n", + "15170 -122.26 37.38 28.0 1103.0 164.0 \n", + "15253 -122.27 37.32 37.0 2607.0 534.0 \n", + "15254 -122.27 37.24 30.0 2762.0 593.0 \n", + "15686 -122.38 37.18 52.0 1746.0 315.0 \n", + "\n", + " population households median_income median_house_value \\\n", + "2366 2258.0 672.0 3.2067 124700.0 \n", + "2367 184.0 57.0 3.6691 137500.0 \n", + "2368 4285.0 1457.0 4.9984 169100.0 \n", + "2371 7600.0 2086.0 5.2759 182900.0 \n", + "2372 7305.0 2463.0 5.1695 220800.0 \n", + "... ... ... ... ... \n", + "15090 601.0 234.0 2.5556 177500.0 \n", + "15170 415.0 154.0 7.8633 500001.0 \n", + "15253 1346.0 507.0 5.3951 277700.0 \n", + "15254 1581.0 502.0 5.1002 319400.0 \n", + "15686 941.0 220.0 3.3047 286100.0 \n", + "\n", + " distance_to_school distance_to_hospital \n", + "2366 0.490000 5.400009 \n", + "2367 0.490918 5.416733 \n", + "2368 0.492544 5.427946 \n", + "2371 0.480104 5.397268 \n", + "2372 0.492443 5.453668 \n", + "... ... ... \n", + "15090 5.248705 0.262488 \n", + "15170 5.438014 0.460435 \n", + "15253 5.408817 0.418688 \n", + "15254 5.360084 0.361248 \n", + "15686 5.412652 0.420476 \n", + "\n", + "[6829 rows x 11 columns]" + ] + }, + "execution_count": 68, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ - "- We posit that Legendary Pokemons have different stats (HP, Attack, Defense, Sp.Atk, Sp.Def, Speed) when comparing with Non-Legendary. 
Choose the propper test and, with 5% significance, comment your findings.\n" + "close_houses_df" ] }, { "cell_type": "code", - "execution_count": 18, + "execution_count": 69, "metadata": {}, "outputs": [], "source": [ - "#code here" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "**Challenge 2**" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "In this challenge, we will be working with california-housing data. The data can be found here:\n", - "- https://raw.githubusercontent.com/data-bootcamp-v4/data/main/california_housing.csv" + "far_houses_df = df[(df[\"distance_to_school\"]>=0.5) & (df[\"distance_to_hospital\"]>=0.5)]" ] }, { "cell_type": "code", - "execution_count": 5, - "metadata": {}, + "execution_count": 70, + "metadata": { + "scrolled": true + }, "outputs": [ { "data": { @@ -370,6 +2748,8 @@ " households\n", " median_income\n", " median_house_value\n", + " distance_to_school\n", + " distance_to_hospital\n", " \n", " \n", " \n", @@ -384,6 +2764,8 @@ " 472.0\n", " 1.4936\n", " 66900.0\n", + " 3.694888\n", + " 8.187319\n", " \n", " \n", " 1\n", @@ -396,6 +2778,8 @@ " 463.0\n", " 1.8200\n", " 80100.0\n", + " 3.552591\n", + " 7.966235\n", " \n", " \n", " 2\n", @@ -408,6 +2792,8 @@ " 117.0\n", " 1.6509\n", " 85700.0\n", + " 3.453940\n", + " 8.143077\n", " \n", " \n", " 3\n", @@ -420,6 +2806,8 @@ " 226.0\n", " 3.1917\n", " 73400.0\n", + " 3.448840\n", + " 8.154416\n", " \n", " \n", " 4\n", @@ -432,53 +2820,168 @@ " 262.0\n", " 1.9250\n", " 65500.0\n", + " 3.456848\n", + " 8.183508\n", + " \n", + " \n", + " ...\n", + " ...\n", + " ...\n", + " ...\n", + " ...\n", + " ...\n", + " ...\n", + " ...\n", + " ...\n", + " ...\n", + " ...\n", + " ...\n", + " \n", + " \n", + " 16995\n", + " -124.26\n", + " 40.58\n", + " 52.0\n", + " 2217.0\n", + " 394.0\n", + " 907.0\n", + " 369.0\n", + " 2.3571\n", + " 111400.0\n", + " 9.082070\n", + " 4.233675\n", + " \n", + " \n", + " 16996\n", + " -124.27\n", + " 40.69\n", + " 
36.0\n", + " 2349.0\n", + " 528.0\n", + " 1194.0\n", + " 465.0\n", + " 2.5179\n", + " 79000.0\n", + " 9.168915\n", + " 4.332320\n", + " \n", + " \n", + " 16997\n", + " -124.30\n", + " 41.84\n", + " 17.0\n", + " 2677.0\n", + " 531.0\n", + " 1244.0\n", + " 456.0\n", + " 3.0313\n", + " 103600.0\n", + " 10.057614\n", + " 5.358694\n", + " \n", + " \n", + " 16998\n", + " -124.30\n", + " 41.80\n", + " 19.0\n", + " 2672.0\n", + " 552.0\n", + " 1298.0\n", + " 478.0\n", + " 1.9797\n", + " 85800.0\n", + " 10.026465\n", + " 5.322593\n", + " \n", + " \n", + " 16999\n", + " -124.35\n", + " 40.54\n", + " 52.0\n", + " 1820.0\n", + " 300.0\n", + " 806.0\n", + " 270.0\n", + " 3.0147\n", + " 94600.0\n", + " 9.115597\n", + " 4.249012\n", " \n", " \n", "\n", + "

10171 rows × 11 columns

\n", "" ], "text/plain": [ - " longitude latitude housing_median_age total_rooms total_bedrooms \\\n", - "0 -114.31 34.19 15.0 5612.0 1283.0 \n", - "1 -114.47 34.40 19.0 7650.0 1901.0 \n", - "2 -114.56 33.69 17.0 720.0 174.0 \n", - "3 -114.57 33.64 14.0 1501.0 337.0 \n", - "4 -114.57 33.57 20.0 1454.0 326.0 \n", + " longitude latitude housing_median_age total_rooms total_bedrooms \\\n", + "0 -114.31 34.19 15.0 5612.0 1283.0 \n", + "1 -114.47 34.40 19.0 7650.0 1901.0 \n", + "2 -114.56 33.69 17.0 720.0 174.0 \n", + "3 -114.57 33.64 14.0 1501.0 337.0 \n", + "4 -114.57 33.57 20.0 1454.0 326.0 \n", + "... ... ... ... ... ... \n", + "16995 -124.26 40.58 52.0 2217.0 394.0 \n", + "16996 -124.27 40.69 36.0 2349.0 528.0 \n", + "16997 -124.30 41.84 17.0 2677.0 531.0 \n", + "16998 -124.30 41.80 19.0 2672.0 552.0 \n", + "16999 -124.35 40.54 52.0 1820.0 300.0 \n", "\n", - " population households median_income median_house_value \n", - "0 1015.0 472.0 1.4936 66900.0 \n", - "1 1129.0 463.0 1.8200 80100.0 \n", - "2 333.0 117.0 1.6509 85700.0 \n", - "3 515.0 226.0 3.1917 73400.0 \n", - "4 624.0 262.0 1.9250 65500.0 " + " population households median_income median_house_value \\\n", + "0 1015.0 472.0 1.4936 66900.0 \n", + "1 1129.0 463.0 1.8200 80100.0 \n", + "2 333.0 117.0 1.6509 85700.0 \n", + "3 515.0 226.0 3.1917 73400.0 \n", + "4 624.0 262.0 1.9250 65500.0 \n", + "... ... ... ... ... \n", + "16995 907.0 369.0 2.3571 111400.0 \n", + "16996 1194.0 465.0 2.5179 79000.0 \n", + "16997 1244.0 456.0 3.0313 103600.0 \n", + "16998 1298.0 478.0 1.9797 85800.0 \n", + "16999 806.0 270.0 3.0147 94600.0 \n", + "\n", + " distance_to_school distance_to_hospital \n", + "0 3.694888 8.187319 \n", + "1 3.552591 7.966235 \n", + "2 3.453940 8.143077 \n", + "3 3.448840 8.154416 \n", + "4 3.456848 8.183508 \n", + "... ... ... 
\n", + "16995 9.082070 4.233675 \n", + "16996 9.168915 4.332320 \n", + "16997 10.057614 5.358694 \n", + "16998 10.026465 5.322593 \n", + "16999 9.115597 4.249012 \n", + "\n", + "[10171 rows x 11 columns]" ] }, - "execution_count": 5, + "execution_count": 70, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "df = pd.read_csv(\"https://raw.githubusercontent.com/data-bootcamp-v4/data/main/california_housing.csv\")\n", - "df.head()" + "far_houses_df" ] }, { - "cell_type": "markdown", + "cell_type": "code", + "execution_count": 72, "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "TtestResult(statistic=np.float64(37.992330214201516), pvalue=np.float64(1.5032478884296307e-301), df=np.float64(14571.229910954282))" + ] + }, + "execution_count": 72, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ - "**We posit that houses close to either a school or a hospital are more expensive.**\n", - "\n", - "- School coordinates (-118, 34)\n", - "- Hospital coordinates (-122, 37)\n", - "\n", - "We consider a house (neighborhood) to be close to a school or hospital if the distance is lower than 0.50.\n", - "\n", - "Hint:\n", - "- Write a function to calculate euclidean distance from each house (neighborhood) to the school and to the hospital.\n", - "- Divide your dataset into houses close and far from either a hospital or school.\n", - "- Choose the propper test and, with 5% significance, comment your findings.\n", - " " + "st.ttest_ind(close_houses_df[\"median_house_value\"], far_houses_df[\"median_house_value\"], equal_var=False, alternative = \"greater\")" ] }, { @@ -486,21 +2989,17 @@ "execution_count": null, "metadata": {}, "outputs": [], - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] + "source": [ + "# don't forget the quotation marks: the argument is a string value, not a variable\n", + "# the p-value is extremely low (far below 0.05), so we reject the null hypothesis" + ] } ], "metadata": { "kernelspec": 
{ - "display_name": "Python 3", + "display_name": "Python [conda env:base] *", "language": "python", - "name": "python3" + "name": "conda-base-py" }, "language_info": { "codemirror_mode": { @@ -512,9 +3011,9 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.10.9" + "version": "3.13.5" } }, "nbformat": 4, - "nbformat_minor": 2 + "nbformat_minor": 4 }
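
Review note: the Challenge 2 steps in this diff (distance columns, close/far split, one-sided Welch t-test) can be condensed into one small sketch. The helper names `euclidean_distance`, `split_close_far` and `close_is_pricier` are hypothetical and do not appear in the notebook. The `|` mask is an assumption based on the task wording ("close to either a school or a hospital"), since the cell that builds `close_houses_df` is not visible in this part of the diff. The coordinates and the 0.50 threshold come from the task statement.

```python
import numpy as np
import pandas as pd
import scipy.stats as st

# Reference points and threshold taken from the challenge text.
SCHOOL = (-118.0, 34.0)
HOSPITAL = (-122.0, 37.0)
THRESHOLD = 0.50


def euclidean_distance(lon, lat, point):
    """Euclidean distance from (lon, lat) to a fixed (lon, lat) point."""
    return np.sqrt((lon - point[0]) ** 2 + (lat - point[1]) ** 2)


def split_close_far(df):
    """Split rows into (close, far); close means near the school OR the hospital."""
    d_school = euclidean_distance(df["longitude"], df["latitude"], SCHOOL)
    d_hospital = euclidean_distance(df["longitude"], df["latitude"], HOSPITAL)
    # Assumption: "close" uses OR, so "far" is its complement (both distances >= 0.5).
    close_mask = (d_school < THRESHOLD) | (d_hospital < THRESHOLD)
    return df[close_mask], df[~close_mask]


def close_is_pricier(df, alpha=0.05):
    """One-sided Welch t-test: H1 is that close houses have a higher mean value."""
    close, far = split_close_far(df)
    result = st.ttest_ind(
        close["median_house_value"],
        far["median_house_value"],
        equal_var=False,        # Welch's test: no equal-variance assumption
        alternative="greater",  # a string value, hence the quotation marks
    )
    return result.pvalue, bool(result.pvalue < alpha)
```

Calling `close_is_pricier(df)` on the housing DataFrame returns the p-value and whether the null hypothesis is rejected at the 5% level. The Welch variant is kept because the two groups differ in size (6829 vs 10171 rows), and nothing in the diff suggests that their variances are equal.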