Main Libraries and Packages for Data Analytics and Data Science

Python is well suited to Data Analytics and Data Science because it offers a large number of free, open-source libraries.

NumPy

Numerical Python (NumPy) is a package for fast scientific computation that revolves around arrays. The main array type is the ndarray, which stores elements of a single (homogeneous) data type. These objects provide fast and efficient computations.

  • NumPy arrays have a fixed size at creation; changing the size creates a new array and deletes the original
  • Provides linear algebra operations, Fourier transforms, and random number generation
  • Provides a C API that allows C, C++, and Python extensions to access NumPy's data structures and computational facilities

In Data Analytics, NumPy arrays also act as a common format for passing data to other Python libraries.
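The points above can be sketched in a few lines. This is a minimal illustration rather than a complete tour; the array values and variable names are made up for the example.

```python
import numpy as np

# An ndarray holds elements of a single data type.
a = np.array([1, 2, 3, 4])

# Arithmetic is applied element by element, without a Python loop.
b = a * 2.5  # the result is a float array

# "Resizing" builds a new array; the original keeps its size.
c = np.append(a, 5)
print(a.shape, c.shape)  # (4,) (5,)

# Linear algebra: invert a small diagonal matrix.
m = np.array([[2.0, 0.0], [0.0, 4.0]])
m_inv = np.linalg.inv(m)

# Fourier transform of a constant signal.
spectrum = np.fft.fft(np.ones(4))

# Random number generation with a seeded generator for repeatability.
rng = np.random.default_rng(seed=0)
sample = rng.normal(size=3)
```

Because `np.append` returns a new array, code that grows an array inside a loop is usually better off collecting values in a list and converting once at the end.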

Pandas

Pandas is used for creating and manipulating data stored in a structured (tabular) form.
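As a rough sketch of what creating and manipulating tabular data looks like, the example below builds a small table and summarises it. The column names and values are invented for illustration.

```python
import pandas as pd

# Create a table (DataFrame) from a dictionary of columns.
df = pd.DataFrame({
    "city": ["Oslo", "Lima", "Oslo", "Lima"],
    "sales": [120, 80, 100, 60],
})

# Add a derived column.
df["sales_k"] = df["sales"] / 1000

# Filter rows with a boolean condition.
high = df[df["sales"] >= 100]

# Group rows and aggregate.
totals = df.groupby("city")["sales"].sum()
print(totals)
```

Here `totals` is a Series indexed by city, so `totals["Oslo"]` gives the summed sales for that city.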

SciPy

A collection of packages for scientific computing:

  • scipy.stats: Continuous and discrete probability distributions, descriptive statistics, and statistical tests
  • scipy.linalg: Linear algebra and matrix operations
  • scipy.integrate: Numerical integration routines and solvers for ordinary differential equations
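The three subpackages listed above can each be tried with a one-line call. The sketch below uses made-up data and a deliberately simple equation, so treat it as an orientation rather than a recommended workflow.

```python
import numpy as np
from scipy import integrate, linalg, stats

# scipy.stats: cumulative probability of a standard normal below 1.96,
# and a one-sample t-test against a hypothesised mean of 5.0.
p_below = stats.norm.cdf(1.96)
data = np.array([4.9, 5.1, 5.0, 5.2, 4.8])
t_result = stats.ttest_1samp(data, popmean=5.0)

# scipy.linalg: solve the linear system A x = b.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = linalg.solve(A, b)

# scipy.integrate: integrate t**2 from 0 to 1 (exact value is 1/3).
area, abs_err = integrate.quad(lambda t: t**2, 0.0, 1.0)

# scipy.integrate: solve dy/dt = -y with y(0) = 1 on the interval [0, 1].
sol = integrate.solve_ivp(lambda t, y: -y, (0.0, 1.0), [1.0])
y_end = sol.y[0, -1]  # close to exp(-1)
```

`integrate.quad` returns both the estimate and an error bound, which is why the result is unpacked into two names.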

Scikit-Learn

Used for machine learning in Python.
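To give a feel for the library, here is a minimal sketch of the usual fit-then-score pattern. The choice of the bundled iris dataset and of logistic regression is an assumption made for this example, not something the notes prescribe.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load a small example dataset bundled with scikit-learn.
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the rows for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit a classifier on the training rows.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Mean accuracy on the held-out rows.
accuracy = model.score(X_test, y_test)
print(f"accuracy: {accuracy:.2f}")
```

Most scikit-learn estimators share the same `fit` and `score` (or `predict`) methods, so another model can usually be swapped in by changing a single line.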

Matplotlib and Seaborn

Provide tools for data visualisation.
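A minimal sketch of a Matplotlib line plot is shown below. The plotted function and the labels are arbitrary choices for the example; Seaborn is built on top of Matplotlib, so its plots end up on the same kind of figure and axes objects.

```python
import matplotlib.pyplot as plt
import numpy as np

# Sample one period of a sine wave.
x = np.linspace(0.0, 2.0 * np.pi, 100)

# Create a figure with a single set of axes and draw the curve.
fig, ax = plt.subplots()
ax.plot(x, np.sin(x), label="sin(x)")

# Label the plot.
ax.set_xlabel("x")
ax.set_ylabel("sin(x)")
ax.set_title("Sine wave")
ax.legend()
```

Call `plt.show()` to open the figure in a window, or `fig.savefig("sine.png")` to write it to a file (the file name here is only an example).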