Python is well suited to data analytics and data science because it offers a large number of free, open-source libraries.
Numerical Python (NumPy) is a package for fast scientific computation that revolves around arrays. The main array type is the ndarray, which stores elements of a single (homogeneous) data type. These objects provide fast and efficient computations.
- NumPy arrays have a fixed size at creation; changing the size creates a new array and discards the original
- Provides linear algebra operations, Fourier transforms, and random number generation
- Provides a C API that allows C, C++, and Python extensions to access NumPy's data structures and computational facilities
In data analytics, it allows data to be passed to other Python libraries.
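The NumPy points above can be illustrated with a minimal sketch. The variable names and sample values are made up for illustration, and the calls shown are only one way to exercise each feature.

```python
import numpy as np

# An ndarray holds a single dtype: mixed int/float input is upcast to float64.
a = np.array([1, 2, 3.5])
dtype_name = a.dtype.name

# Arrays have a fixed size: np.append returns a new array
# rather than growing `a` in place.
b = np.append(a, 4.0)

# Linear algebra: solve the system M @ x = y.
M = np.array([[2.0, 0.0], [0.0, 4.0]])
y = np.array([2.0, 8.0])
x = np.linalg.solve(M, y)

# Fourier transform: a constant signal has all of its energy
# in the zero-frequency bin.
spectrum = np.fft.fft(np.ones(4))

# Random number generation, seeded so the draw is reproducible.
rng = np.random.default_rng(seed=0)
samples = rng.normal(size=5)
```

Here `a` keeps its three elements while `b` is a separate four-element array, which is the fixed-size behaviour described in the first bullet.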
Pandas is used for creating and manipulating data that is stored in a structured (tabular) form.
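As a rough sketch of what creating and manipulating tabular data can look like, the example below builds a small DataFrame and applies a few common operations. The column names and values are hypothetical, invented purely for illustration.

```python
import pandas as pd

# Hypothetical sample data, not taken from any real dataset.
df = pd.DataFrame(
    {
        "city": ["Leeds", "York", "Leeds", "York"],
        "sales": [10, 20, 30, 40],
    }
)

# Add a derived column computed from an existing one.
df["sales_doubled"] = df["sales"] * 2

# Keep only the rows that satisfy a condition.
large = df[df["sales"] > 15]

# Aggregate per group: total sales for each city.
totals = df.groupby("city")["sales"].sum()
```

`totals` is a Series indexed by city, so a single total can be read with `totals["Leeds"]`.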
SciPy is a collection of packages for scientific computing:
- scipy.stats: Continuous and discrete probability distributions, descriptive statistics, and statistical tests
- scipy.linalg: Linear algebra and matrix operations
- scipy.integrate: Numerical integration routines and solvers for ordinary differential equations
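The three subpackages listed above can be tried out with a short sketch like the following. The inputs are arbitrary illustrative values, and each call is just one representative function from its subpackage.

```python
import numpy as np
from scipy import integrate, linalg, stats

data = [1.0, 2.0, 3.0, 4.0]  # arbitrary illustrative sample

# scipy.stats: a continuous distribution, a descriptive summary,
# and a one-sample t-test against the sample's own mean.
p_below_zero = stats.norm.cdf(0.0)
summary = stats.describe(data)
t_stat, p_value = stats.ttest_1samp(data, popmean=2.5)

# scipy.linalg: determinant and inverse of a small matrix.
A = np.array([[1.0, 2.0], [3.0, 4.0]])
det_A = linalg.det(A)
inv_A = linalg.inv(A)

# scipy.integrate: definite integral of x**2 over [0, 1].
area, abs_err = integrate.quad(lambda x: x**2, 0.0, 1.0)

# scipy.integrate: solve dy/dt = -y with y(0) = 1 on [0, 1];
# the exact solution is exp(-t).
sol = integrate.solve_ivp(lambda t, y: -y, (0.0, 1.0), [1.0], rtol=1e-8, atol=1e-10)
y_end = sol.y[0, -1]
```

The tighter `rtol` and `atol` values passed to `solve_ivp` are a choice made for this sketch so the numerical result tracks the exact solution closely; the defaults are looser.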
Used for machine learning in Python
Provides the tools for data visualisation.