- Interactive Jupyter demo with Binder
- Config restructured - docstrings explaining the variables are visible in IntelliSense help. Config value options and type validations applied (much stricter).
- Return type config change. Now always the same result - a class with the best model's result, all results, and results with history
- Early stopping in Conjugate gradient and LNU
- Analysis of optimized values in compare_models
- Printed tables render better, as long names are broken automatically.
- New find_optimal_input_for_models function that can compare various input forms for models.
- Annoying warnings filtered on import
- Type hints
- String embedding - one-hot or label encoding
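  A quick illustration of the two encodings using pandas (illustrative only, not the library's internal code):

  ```python
  import pandas as pd

  df = pd.DataFrame({"color": ["red", "green", "red"]})
  one_hot = pd.get_dummies(df["color"])              # one-hot encoding
  labels = df["color"].astype("category").cat.codes  # label encoding
  ```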
- Data preprocessing and logging moved from misc into their own projects, called mylogging and mydatapreprocessing, and imported. It's necessary to have the corresponding versions installed.
- Various transforms added as derived columns. For example: difference transform, rolling window and rolling std transformations, distance from the mean.
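  Such derived columns can be computed with pandas like this (a sketch, not the library's exact code):

  ```python
  import pandas as pd

  df = pd.DataFrame({"value": range(100)})
  df["difference"] = df["value"].diff()                       # difference transform
  df["rolling_mean"] = df["value"].rolling(window=10).mean()  # rolling window
  df["rolling_std"] = df["value"].rolling(window=10).std()    # rolling std
  df["distance_from_mean"] = df["value"] - df["value"].mean()
  ```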
- Fast Fourier transform added as new information.
- Short way of calling functions with a positional argument - predictit.main.predict(np.random.randn(1000)). Same for predict_multiple and compare_models.
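  For example:

  ```python
  import numpy as np
  import predictit

  # One positional argument is enough for a quick prediction.
  predictions = predictit.main.predict(np.random.randn(1000))
  ```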
- Terminology fail fixed (using the correct 'multistep' instead of 'batch')
- New custom models added (ridge regression and Levenberg-Marquardt)
- File input can contain multiple files in a list.
- Reformatted with Black
- Lazy imports (faster import)
- Plot module moved into mypythontools library and imported
- Functions from main imported in __init__, so you can call predictit.predict instead of predictit.main.predict
- Classifiers added to the sklearn model. It's also possible to pass an sklearn model as a parameter and to call all the models by string name
- Data discretization option (binning)
- CI moved from Travis CI (tests too computationally intensive) to own Python scripts in mypythontools (called from the utils folder)
- database.py moved to mydatapreprocessing
- Internal objects renamed with a leading _ so IDE IntelliSense is more readable
- New data input. Config API changed! No data_source anymore - set a path, URL, or Python data in Config.data. Check the configuration for some examples.
- New data input formats - dictionary, list, parquet, JSON, h5, or a URL that returns data in the request
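  A hedged sketch of the new input (the Config import path is an assumption; values are placeholders):

  ```python
  import numpy as np
  from predictit.configuration import Config  # import path is an assumption

  # Any of these forms can go into Config.data:
  Config.data = "my_data.csv"                             # local path
  Config.data = "https://example.com/data.csv"            # URL returning data
  Config.data = {"col_1": [1, 2, 3], "col_2": [4, 5, 6]}  # dictionary
  Config.data = np.random.randn(1000)                     # python data
  ```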
- Data preprocessing functions now possible on a DataFrame
- Data consolidation simplified - DataFrame only now
- NaN removal changed. First, columns with NaNs over a threshold are removed; then remaining NaNs are removed or replaced based on Config.
- data_load and data_consolidation arguments changed. A Config instance is not necessary now, so the functions can be used outside this library.
- Travis CI builds releases on PyPI as well as on GitHub.
- Config redefined as a class, enabling IntelliSense in the IDE.
- API change!!! Instead of config['value'] use Config.value. But Config.update({'variable': 'value'}) still works!!!
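  Side by side (the Config import path is an assumption; 'predicts' is just an example variable):

  ```python
  from predictit.configuration import Config  # import path is an assumption

  Config.predicts = 7             # new attribute-style access
  Config.update({"predicts": 7})  # dict-style update still works
  ```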
- Input creation defined in the visual test
- CSV locale options - separator and decimal for correct data reading.
- Tables remade with tabulate and returned as a DataFrame.
- Return type 'all' removed (replaced with dataframe).
- Config value optimization. Some config variables can be optimized over defined values. For each model the best option will be used.
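  A hedged sketch of the idea - the variable names below are assumptions, not confirmed API:

  ```python
  from predictit.configuration import Config  # import path is an assumption

  # Hypothetical names: try each value and keep the best one per model.
  Config.update({
      "optimization": True,
      "optimization_variable": "default_n_steps_in",  # hypothetical variable
      "optimization_values": [4, 8, 12],              # hypothetical values
  })
  ```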
- Plot results of all model versions (optimized)
- Option whether to evaluate the error criterion on preprocessed data or on original data
- Option to sort the detailed results table by error or by name
- compare_models redefined. Fairer for all types of models, but repeatit = 1, so more samples are necessary.
- Many new models from sklearn - Decision trees, bagging, Gradient boosting...
- Multiprocessing applied. Two options - 'pool' or 'process'
- Data smoothing - Savitzky-Golay filter in data preprocessing
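  Smoothing of this kind is available in SciPy; a minimal illustration (not necessarily the library's exact call):

  ```python
  import numpy as np
  from scipy.signal import savgol_filter

  data = np.random.randn(200).cumsum()
  smoothed = savgol_filter(data, window_length=11, polyorder=2)
  ```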
- Publish script to generate documentation and push to PyPI
- Data processed in the 'standard' shape [n_samples, n_features], i.e. the same shape as a DataFrame
- Config print_config function to print the config as help
- Updated config values validation. If a value is misspelled (not in the list), an error is raised.
- Detailed results in a table (not printed)
- Config option for how many results to plot and print
- Line time and memory profiling
- Validation mode in Config. Results are evaluated on data that was not in the training data - used in compare_models.
- Remove NaN values - an option to remove whole columns was added
- User-colored warnings in misc (not only traceback warnings) and colorized error raising
- Two analysis modes - original data and preprocessed data
- Simple GUI (just config and output)
- Added customizable config presets (fast, normal, optimize)
- Inputs are created in the new define_inputs module, called from main (just once), not in the models (for each model separately). It uses NumPy stride tricks instead of lists, as sketched below.
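  The stride-tricks idea, sketched with NumPy's public API (function name and shapes are illustrative):

  ```python
  import numpy as np

  def make_inputs(data: np.ndarray, n_steps_in: int, n_steps_out: int = 1):
      """Hypothetical sketch: build training windows without copying data."""
      windows = np.lib.stride_tricks.sliding_window_view(
          data, n_steps_in + n_steps_out
      )
      X = windows[:, :n_steps_in]               # model inputs
      y = windows[:, n_steps_in:]               # targets
      x_input = data[np.newaxis, -n_steps_in:]  # last window, used for prediction
      return X, y, x_input
  ```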
- Optimize loop, find-best-models loop, and predict loop joined into one main loop
- Models divided into train and predict functions - the repeat loop runs only over the predict function
- Models redefined to use an (X, y, x_input) tuple as input
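  A hypothetical custom model following this convention (signatures are illustrative, not the exact library API):

  ```python
  import numpy as np

  def train(X: np.ndarray, y: np.ndarray) -> np.ndarray:
      # Plain least squares as a stand-in for a real model.
      weights, *_ = np.linalg.lstsq(X, y, rcond=None)
      return weights

  def predict(x_input: np.ndarray, model: np.ndarray) -> np.ndarray:
      return x_input @ model
  ```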
- Config values put into a dictionary [! used differently now!]
- Basic data postprocessing - power transformation. Two options: 1) on train data (changes the error criterion), 2) only on the output
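  One way to realize the two options, using sklearn's PowerTransformer as an illustration (the library's own implementation may differ):

  ```python
  import numpy as np
  from sklearn.preprocessing import PowerTransformer

  data = np.abs(np.random.randn(100, 1)) + 1
  pt = PowerTransformer()

  # Option 1: transform the train data - the error criterion is then
  # evaluated in the transformed space.
  train_transformed = pt.fit_transform(data)

  # Option 2: model works on original data; only the final output is
  # transformed back, e.g. pt.inverse_transform(predictions).
  ```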
- TensorFlow model architecture configurable with arguments - layers and their parameters in a list.
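  A sketch of how such a layer list could look and be built into a Keras model (the list structure is an assumption):

  ```python
  from tensorflow.keras import Sequential
  from tensorflow.keras.layers import Dense, Dropout

  # Hypothetical config: layer names and their parameters in a list.
  layers = [("Dense", {"units": 32, "activation": "relu"}),
            ("Dropout", {"rate": 0.1}),
            ("Dense", {"units": 1})]

  layer_classes = {"Dense": Dense, "Dropout": Dropout}
  model = Sequential([layer_classes[name](**params) for name, params in layers])
  model.compile(optimizer="adam", loss="mse")
  ```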
- Choose which models to optimize (comment out entries in the config models_parameters_limits dictionary)
- Similar models generalized into one (e.g. AR, ARMA, ARIMA > statsmodels_autoregressive)
- Plotting and data preprocessing are done in their own modules, not in main
- New similarity-based model error criteria (not exact point-to-point comparison): 1) imported time warping, 2) own sliding-window error
- One more debug level - stop at first warning - add warning exceptions to config to hide external deprecation etc. warnings
- More user-friendly warnings (color syntax highlighted) + error separated from error location