Developed as part of the ML SIG Summer Project.
The data is obtained from Quandl (restricted to the WIKI table), which requires an API key. The file get_data.py contains the necessary functions.
Usage:
python get_data.py [symbols]
For a list of symbols available for download, see WIKI-datasets-codes.csv.
- High-Low: The difference between the High and Low prices of a stock on a given day.
- PCT_change: The percent change in price over a 5-day shift.
- MDAV5: The rolling mean (simple moving average) over a 5-day window.
- EMA5: The exponential moving average over 5 days.
- MACD/MACD_SignalLine: Moving Average Convergence/Divergence oscillator, the difference between EMA26 and EMA12.
- Return Out: The Adj. Close price of the stock shifted by 1 day.
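The features above can be sketched in a few dependency-free functions. This is only an illustration under stated assumptions, not the project's own implementation: the function names (`rolling_mean`, `ema`, `pct_change`, `shift_back`, `build_features`) are hypothetical, the EMA uses the common smoothing factor 2 / (span + 1) seeded with the first value, MACD follows the conventional sign (EMA12 minus EMA26) with a 9-day signal line, and "Return Out" is assumed to mean the next day's Adj. Close.

```python
def rolling_mean(values, window):
    """Simple moving average; None until a full window is available."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window:i + 1]) / window)
    return out


def ema(values, span):
    """Exponential moving average, alpha = 2 / (span + 1), seeded with the first value."""
    alpha = 2.0 / (span + 1)
    out = []
    for v in values:
        out.append(v if not out else alpha * v + (1 - alpha) * out[-1])
    return out


def pct_change(values, periods):
    """Percent change relative to the value `periods` days earlier."""
    out = []
    for i, v in enumerate(values):
        if i < periods:
            out.append(None)
        else:
            prev = values[i - periods]
            out.append((v - prev) / prev * 100.0)
    return out


def shift_back(values, periods=1):
    """Align each day with the value `periods` days ahead (None at the tail)."""
    return values[periods:] + [None] * periods


def build_features(high, low, adj_close):
    """Assemble the feature columns described above from plain lists of prices."""
    # Assumption: conventional MACD sign (fast EMA minus slow EMA).
    macd = [f - s for f, s in zip(ema(adj_close, 12), ema(adj_close, 26))]
    return {
        "High-Low": [h - l for h, l in zip(high, low)],
        "PCT_change": pct_change(adj_close, 5),
        "MDAV5": rolling_mean(adj_close, 5),
        "EMA5": ema(adj_close, 5),
        "MACD": macd,
        "MACD_SignalLine": ema(macd, 9),  # assumption: 9-day signal line
        "Return Out": shift_back(adj_close, 1),
    }
```

Rows where a feature is None (the first few days, and the last day for "Return Out") would normally be dropped before training.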
- SVM (SVC)
- Linear Kernel
- Polynomial Kernel
- Radial Basis Functional Kernel
- Sigmoid Kernel
For reading, refer to this.
- Ensemblers
- Random Forest Classifier
For reading, refer to this.
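As a rough illustration of how the classifiers listed above could be compared, the sketch below assumes scikit-learn (suggested by the `SVC` name, but not confirmed here) and uses synthetic data as a stand-in for the engineered features and labels. The dictionary keys, hyperparameters, and the train/test split are illustrative choices, not the project's settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the feature matrix and up/down labels (assumption).
X, y = make_classification(n_samples=200, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# One estimator per model named in the list above.
models = {
    "svc_linear": SVC(kernel="linear"),
    "svc_poly": SVC(kernel="poly", degree=3),
    "svc_rbf": SVC(kernel="rbf"),
    "svc_sigmoid": SVC(kernel="sigmoid"),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Fit each model and record its accuracy on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)
```

With real, time-ordered stock data, a chronological split (for example `train_test_split(..., shuffle=False)`) is usually preferable to a random one, so that the model is not evaluated on days that precede its training data.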
The repository houses:
- 'datasets' folder, which is populated with stock data the first time the script is run. To repopulate the data:
python get_data.py [quandl_symbol]
- 'research-papers' folder - the papers referred to during the development of the model.
- 'environment.yml', 'requirements.txt' - See this
- 'WIKI-datasets-codes.csv' - A list of symbols whose data can be downloaded from Quandl.
The files environment.yml and requirements.txt make it easy to replicate the environment required for running the model.
- For Anaconda:
- To install Anaconda, refer to this
- The base directory contains 'environment.yml' file. To replicate the same environment:
conda env create -f environment.yml
- For python3 virtual environment:
- To install virtualenv, refer to this
pip install virtualenv
virtualenv --python=python3 ml-stock-prediction
- The base directory contains 'requirements.txt' file. To install the required packages:
pip install -r requirements.txt
Although the datasets folder already contains stock prices for some symbols, you can populate it with more:
python get_data.py [symbols]
You can run the model on a list of symbols supplied as command-line arguments:
python main.py [symbols]
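For illustration, a script that accepts symbols this way might parse them roughly as follows. This is a minimal sketch using the standard-library `argparse` module; the helper name `parse_symbols` and the order-preserving de-duplication are assumptions, not necessarily how main.py or get_data.py handle their arguments.

```python
import argparse


def parse_symbols(argv=None):
    """Return the symbols given on the command line, de-duplicated in order."""
    parser = argparse.ArgumentParser(
        description="Run on one or more Quandl WIKI symbols."
    )
    parser.add_argument(
        "symbols", nargs="+", help="one or more stock symbols, e.g. AAPL GOOGL"
    )
    args = parser.parse_args(argv)

    # Keep the first occurrence of each symbol so nothing is processed twice.
    unique = []
    for symbol in args.symbols:
        if symbol not in unique:
            unique.append(symbol)
    return unique


# Passing an explicit list mimics: python main.py AAPL GOOGL AAPL
symbols = parse_symbols(["AAPL", "GOOGL", "AAPL"])
```

Calling `parse_symbols()` with no argument reads from `sys.argv`, which is what a real entry point would do.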