.Statistica_Python.ipynb.sage-jupyter2

{"backend_state":"running","connection_file":"/tmp/xdg-runtime-user/jupyter/kernel-f6bc64a4-52c8-4d64-9737-ca9da7d352a0.json","kernel":"python3","kernel_error":"","kernel_state":"idle","kernel_usage":{"cpu":0,"memory":0},"metadata":{"language_info":{"codemirror_mode":{"name":"ipython","version":3},"file_extension":".py","mimetype":"text/x-python","name":"python","nbconvert_exporter":"python","pygments_lexer":"ipython3","version":"3.10.12"}},"trust":true,"type":"settings"}
{"cell_type":"code","exec_count":0,"id":"13291f","input":"%%ai chatgpt\nHello world","pos":1,"type":"cell"}
{"cell_type":"code","exec_count":0,"id":"291f5c","input":"%%ai chatgpt\nshow me how to import SPSS datasets with seaborn\n","pos":8,"type":"cell"}
{"cell_type":"code","exec_count":0,"id":"2e1957","input":"%%ai chatgpt\nI need to put your output inside a Jupyter cell, so from now on you should convert your outputs to Markdown.","pos":11,"type":"cell"}
{"cell_type":"code","exec_count":0,"id":"71f8e1","input":"%pip install --upgrade pip","pos":22,"type":"cell"}
{"cell_type":"code","exec_count":0,"id":"76366e","input":"?del","pos":18,"type":"cell"}
{"cell_type":"code","exec_count":0,"id":"8e5a13","input":"\n%load_ext jupyter_ai_magics\n# NOTE: OPENAI_API_KEY is the credential key's name for the OpenAI provider;\n# replace 'YOUR_API_KEY_HERE' with your own key.\n%env OPENAI_API_KEY=YOUR_API_KEY_HERE\n","pos":0,"scrolled":true,"type":"cell"}
{"cell_type":"code","exec_count":0,"id":"963f5d","input":"%%ai chatgpt\nshow how to compute descriptive statistics with seaborn in Python","pos":7,"type":"cell"}
{"cell_type":"code","exec_count":0,"id":"ac6e78","input":"","pos":23,"type":"cell"}
{"cell_type":"code","exec_count":0,"id":"bba07c","input":"%%ai chatgpt\nshow how to compute descriptive statistics with pandas in Python","pos":9,"type":"cell"}
{"cell_type":"code","exec_count":0,"id":"fb1972","input":"","pos":24,"type":"cell"}
{"cell_type":"code","exec_count":10,"id":"498edc","input":"del df,data","pos":19,"type":"cell"}
{"cell_type":"code","exec_count":10,"id":"c6f942","input":"valori_mediani = df.median()\nprint(\"Median\")\nprint(valori_mediani)\nvalori_rank = df.rank()\nprint(\"Rank\")\nprint(valori_rank)","output":{"0":{"name":"stdout","output_type":"stream","text":"Median\nA    19.0\ndtype: float64\nRank\n     A\n0  1.0\n1  2.5\n2  2.5\n3  4.5\n4  4.5\n5  7.0\n6  7.0\n7  7.0\n8  9.0\n"}},"pos":13,"type":"cell"}
{"cell_type":"code","exec_count":11,"id":"35a974","input":"data = [2,12,12,19,19,19,20,20,20,25,26]\nmoda = mode(data)\nprint(moda)","output":{"0":{"name":"stdout","output_type":"stream","text":"19\n"}},"pos":20,"type":"cell"}
{"cell_type":"code","exec_count":15,"id":"1f13bc","input":"# Read a series of numbers and compute their mode, median, mean, and ranks\nfrom statistics import mean, median, mode\n\nfrom scipy.stats import rankdata\n\ninput_numeri = input(\"Enter a series of numbers separated by spaces: \")\ndata = [int(numero) for numero in input_numeri.split()]\n\nmoda = mode(data)  # most frequent value\nmediana = median(data)  # middle value of the sorted data\nmedia = mean(data)  # arithmetic mean\nranking = rankdata(data)  # tied values share their average rank\n\nprint(f\"Mode: {moda}\")\nprint(f\"Median: {mediana}\")\nprint(f\"Mean: {media}\")\nprint(f\"Ranking: {ranking}\")\n","metadata":{"cocalc":{"outputs":{"0":{"name":"input","opts":{"password":false,"prompt":"Enter a series of numbers separated by spaces: "},"output_type":"stream","value":"1 3 4 12 12 14 18 92 92"}}}},"output":{"0":{"name":"input","opts":{"password":false,"prompt":"Enter a series of numbers separated by spaces: "},"output_type":"stream","value":"1 3 4 12 12 14 18 92 92"},"1":{"name":"stdout","output_type":"stream","text":"Mode: 12\nMedian: 12\nMean: 27.555555555555557\nRanking: [1.  2.  3.  4.5 4.5 6.  7.  8.5 8.5]\n"}},"pos":21,"type":"cell"}
{"cell_type":"code","exec_count":3,"id":"f41a83","input":"%%ai chatgpt\nI need to get the same label function of the program SPSS in python \ncan you show me ","output":{"0":{"data":{"text/markdown":"Unfortunately, the SPSS software's `label` function does not have a direct equivalent in Python. `label` in SPSS is used to assign a descriptive name or label to a variable, value, or data. However, you can achieve similar functionality in Python using various libraries such as pandas or numpy.\n\nFor example, in pandas, you can use the `rename` function to assign labels to columns in a DataFrame. Here is an example:\n\n```python\nimport pandas as pd\n\n# Create a DataFrame\ndf = pd.DataFrame({'A': [1, 2, 3],\n                   'B': [4, 5, 6]})\n\n# Assign labels to columns\ndf.rename(columns={'A': 'Column A',\n                   'B': 'Column B'}, inplace=True)\n\n# Print the DataFrame\nprint(df)\n```\n\nOutput:\n```\n   Column A  Column B\n0         1         4\n1         2         5\n2         3         6\n```\n\nSimilarly, you can assign labels to values in a DataFrame using `replace` or in numpy array using indexing. The specific approach and library to use depends on your specific requirements and data structure.","text/plain":"<IPython.core.display.Markdown object>"},"exec_count":3,"metadata":{"text/markdown":{"jupyter_ai":{"model_id":"gpt-3.5-turbo","provider_id":"openai-chat"}}},"output_type":"execute_result"}},"pos":2,"type":"cell"}
{"cell_type":"code","exec_count":4,"id":"d3f08d","input":"import pandas as pd","pos":3,"type":"cell"}
{"cell_type":"code","exec_count":5,"id":"c8380c","input":"data = {'age': [25, 30, 35, 40],\n        'gender': ['Male', 'Female', 'Male', 'Female']}\ndf = pd.DataFrame(data)\n\n# Create a dictionary with variable labels\nvariable_labels = {'age': 'Age in years',\n                   'gender': 'Gender'}\n\n# Assign the variable labels to the DataFrame\ndf.rename(columns=variable_labels, inplace=True)","pos":4,"type":"cell"}
{"cell_type":"code","exec_count":6,"id":"2a8efa","input":"print(df)","output":{"0":{"name":"stdout","output_type":"stream","text":"   Age in years  Gender\n0            25    Male\n1            30  Female\n2            35    Male\n3            40  Female\n"}},"pos":5,"type":"cell"}
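{"cell_type":"code","exec_count":8,"id":"22b01a","input":"from statistics import mode\n\n# mode() needs a sequence of values; passing the column dict would iterate over its keys\ndata = [1, 2, 2, 3, 3, 3, 4, 4, 5]\nmoda = mode(data)\nprint(moda)","output":{"0":{"name":"stdout","output_type":"stream","text":"3\n"}},"pos":16,"type":"cell"}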
{"cell_type":"code","exec_count":8,"id":"8ce61d","input":"df.head()","output":{"0":{"data":{"text/html":"<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>Age in years</th>\n      <th>Gender</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>0</th>\n      <td>25</td>\n      <td>Male</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>30</td>\n      <td>Female</td>\n    </tr>\n    <tr>\n      <th>2</th>\n      <td>35</td>\n      <td>Male</td>\n    </tr>\n    <tr>\n      <th>3</th>\n      <td>40</td>\n      <td>Female</td>\n    </tr>\n  </tbody>\n</table>\n</div>","text/plain":"   Age in years  Gender\n0            25    Male\n1            30  Female\n2            35    Male\n3            40  Female"},"exec_count":8,"output_type":"execute_result"}},"pos":6,"type":"cell"}
{"cell_type":"code","exec_count":9,"id":"94a17c","input":"# Create a DataFrame\ndata = {'A': [2, 12, 12, 19, 19, 20, 20, 20, 25]}\ndf = pd.DataFrame(data)","pos":12,"type":"cell"}
{"cell_type":"code","exec_count":9,"id":"acc76a","input":"%who_ls","output":{"0":{"data":{"text/plain":"['data', 'df', 'matplotlib', 'moda', 'mode', 'os', 'pd', 'variable_labels']"},"exec_count":9,"output_type":"execute_result"}},"pos":17,"type":"cell"}
{"cell_type":"markdown","id":"3b45e4","input":"In statistics, the mode is the value (or values) that appears most frequently in a data set. How you compute it depends on the type of data:\n\n- For nominal data (categories with no specific order), the mode is simply the value that occurs most often.\n\n- For ordinal data (categories with a specific order), the mode is likewise the value with the highest frequency.\n\nIn Python, inside a Jupyter Notebook, you can use the `mode` function from the standard-library `statistics` module. For example:\n\n```python\nfrom statistics import mode\n\ndata = [1, 2, 2, 3, 3, 3, 4, 4, 5]\nmoda = mode(data)\nmoda\n```\n\nHere the mode of the data is 3. Make sure you have imported `mode` from `statistics` before calling it.","pos":15,"type":"cell"}
{"cell_type":"markdown","id":"509302","input":"> To find the median, we need to identify the score in the middle of this ranked list. We have nine scores, so the middle score is the fifth (it has four scores below it and four above). The median is therefore 19, the fifth score in the list.\n\nIn the previous example the median was easy to compute because the number of scores was odd. With an odd number of scores there is always a single score that is the median. That is not the case with an even number of scores. If we add the score 26 to the previous list, we now have an even number of scores (ten). In this situation the median lies between the two middle scores, that is, between the fifth and the sixth. The median is then the mean of the scores in fifth and sixth position:\n\n$(19 + 20) / 2 = 19.5$","pos":14,"type":"cell"}
{"cell_type":"markdown","id":"673684","input":"## Descriptive statistics\n### Mean\n\n> Mean = Sum of all values / Total number of values\n\n$$\\bar{x} = \\frac{\\sum_{i=1}^{n} x_i}{n}$$\n\nWhere:\n\n* $\\bar{x}$ is the mean\n* $x_i$ is each individual value in the dataset\n* $n$ is the total number of values in the dataset\n\nThe mean is a measure of central tendency.","pos":10,"type":"cell"}
{"id":0,"time":1699285126930,"type":"user"}
{"last_load":1699286741250,"type":"file"}