Releases: jrudar/LANDMark

LANDMark Classifier v2.1.1

27 Apr 04:52
5237377
  • Removed support for Python <3.10
  • Updated minimum supported NumPy package
  • Minor bug fixes related to dependencies
  • 'terminal' is the default proximity method until 'path' is properly implemented

LANDMarkClassifier Version 2.1.0

12 Jul 20:42
99f5b05
  • The 'use_cascade' parameter is now active. It extends X using the results of the decision function at each node. These new features are then used alongside the original features during the training (and prediction) steps within each node in the tree.
  • Updated API
  • Updated tests
  • Added notebooks on simple and advanced usage
  • Added notebooks demonstrating the effects of some important parameters
  • Changed code in linear models so that 'y_min' is equal to 80% of the minimum count in the bootstrap-resampled data
  • 5-Fold Stratified Cross-Validation is now explicitly stated to be used for linear models
  • Use of a neural network model to split a node now requires that more than 'minority_sz_nnet' samples of the minority class are present in the bootstrap-resampled data.
  • 'minority_sz_lm' controls how many samples must be present to split using a linear model
  • Autodetection of sparse matrices and conversion to CSR format if sparsity >= 90%
  • New way to create the high-dimensional embedding used to calculate dissimilarities with the proximity() function: "path". With this option, a binary matrix containing all nodes visited by each sample is created, rather than just the terminal nodes. The original method is now "terminal".
  • Initial layer of neural network model uses 'mish' activation function and other small changes to the architecture of the network
  • Replaced the _get_node_ids() function in tree.py with _get_all_nodes(). This was done so that the new embedding approach can be implemented
  • Updated version (2.1.0) due to new non-breaking feature
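
The 'use_cascade' behaviour described in the list above can be illustrated with a small sketch. This is not LANDMark's implementation: the function names `node_decision_function` and `extend_with_cascade`, the linear form of the decision function, and the weights are all hypothetical and chosen only to show how decision-function outputs might be appended to X as extra feature columns.

```python
import numpy as np


def node_decision_function(X, w, b):
    """Hypothetical linear decision function for a single node (assumption)."""
    return X @ w + b


def extend_with_cascade(X, decision_values):
    """Append decision-function outputs to X as additional feature columns."""
    decision_values = np.asarray(decision_values, dtype=float)
    if decision_values.ndim == 1:
        # A binary decision function yields one value per sample.
        decision_values = decision_values.reshape(-1, 1)
    return np.hstack([X, decision_values])


X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
w = np.array([0.5, -0.25])  # illustrative weights, not learned
b = 0.1

scores = node_decision_function(X, w, b)
X_ext = extend_with_cascade(X, scores)  # original features plus one new column
```

The original columns are left untouched, so a model fitted on `X_ext` sees both the raw features and the node's decision values.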

What's Changed

  • Preservation of Proximity Information Within LANDMark Trees by @jrudar in #13
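
The difference between the "terminal" and "path" embeddings mentioned in the 2.1.0 notes can be sketched on a toy tree. The helper `build_embedding` and the node numbering are hypothetical; the sketch only assumes that each sample's route through a tree is available as a list of node ids ending at its terminal node.

```python
import numpy as np


def build_embedding(paths, n_nodes, method="terminal"):
    """Build a binary sample-by-node matrix from per-sample node paths.

    'terminal' marks only the last node of each path;
    'path' marks every node visited along the way.
    """
    if method not in ("terminal", "path"):
        raise ValueError("method must be 'terminal' or 'path'")
    emb = np.zeros((len(paths), n_nodes), dtype=np.uint8)
    for i, path in enumerate(paths):
        nodes = path if method == "path" else path[-1:]
        emb[i, nodes] = 1
    return emb


# Toy tree: node 0 is the root, 1 and 2 are its children,
# and nodes 3 and 4 are children of node 1.
paths = [[0, 1, 3], [0, 1, 4], [0, 2]]

terminal_emb = build_embedding(paths, n_nodes=5, method="terminal")
path_emb = build_embedding(paths, n_nodes=5, method="path")

# Number of nodes shared by samples 0 and 1 under each embedding.
shared_terminal = int(terminal_emb[0].astype(int) @ terminal_emb[1].astype(int))
shared_path = int(path_emb[0].astype(int) @ path_emb[1].astype(int))
```

In this toy case, samples 0 and 1 end in different terminal nodes, so the "terminal" embedding treats them as sharing nothing, while the "path" embedding records that they travelled through the same two internal nodes.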

Full Changelog: LANDMarkClassifier-v.2.0.7...LANDMarkClassifier-v.2.1.0

LANDMarkClassifier Version 2.0.7

06 Jun 05:40
4c386c1
  • Changed the "decision_function()" behavior for models that return probabilities. Probabilities greater than 0.5 now map to 1 and those less than 0.5 map to -1. This does not change the behavior of LANDMark, but it allows the code to be cleaned up considerably because probabilities no longer need to be handled separately. This affects the "get_split()", "_predict()", and "_proximity()" functions.
  • Simplified tree traversal in the "_predict()" and "_proximity()" functions. Nested if statements are no longer used.
  • Added "predict_proba()" function to "ETClassifier()" wrapper.
  • Preparing to introduce a new hyper-parameter, "use_cascade". This parameter appends the output of the decision function onto X (inspiration from https://www.tandfonline.com/doi/full/10.1080/15481603.2021.1965399). This parameter is not currently enabled.
  • Updated version to 2.0.7 to reflect these changes
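
The probability-to-sign mapping described in the first item above can be sketched as follows. The function name `probabilities_to_decisions` is hypothetical, and the treatment of a probability of exactly 0.5 is an assumption here (it maps to -1), since the release note does not specify it.

```python
import numpy as np


def probabilities_to_decisions(proba):
    """Map positive-class probabilities to {-1, 1} decision values.

    Values above 0.5 become 1, everything else becomes -1.
    Mapping exactly 0.5 to -1 is an assumption of this sketch.
    """
    proba = np.asarray(proba, dtype=float)
    return np.where(proba > 0.5, 1, -1)


decisions = probabilities_to_decisions([0.9, 0.2, 0.51])
```

With every model returning signed values, downstream code can branch on the sign alone instead of handling probabilities and margins separately.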

LANDMarkClassifier Version 2.0.6

01 Jun 13:41
dfcaa1a

July 2023 Update 2

  • Updated version to 2.0.6
  • LANDMarkClassifier()._check_params() now returns type List[np.ndarray, np.ndarray]
  • Removed TransformerMixin from imports in LANDMark.py
  • Removed tensorflow dependencies
  • Removed unused imports of 'gc' and 'pandas' from lm_linear_clfs.py
  • Each type of classification model now has its own module
  • Random selection of 'alpha' for RidgeClassifier when samples are fewer than 6
  • Neural network now uses PyTorch (AMP not yet enabled)
  • Test coverage improvement
  • Linear models no longer split nodes with few samples. An Extra Trees Classifier with a max_depth of 1 is used instead

LANDMarkClassifier Version 2.0.5

01 Jun 04:41
feee482

June 2023 Update

  • Improved code readability (e.g., 'if some_var == False' changed to 'if some_var is False')
  • Fixed formatting using black
  • Added a function to validate LANDMark parameters
  • Error raised if predict() is called on a model which has not been fit
  • Simplified section in Node() that handles stopping criteria
  • Updated Line 209 in 'lm_base_clfs.py': Sometimes the efficient LOOCV fails so a switch to 5-fold CV solves the issue
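
As a side note on the readability change in the first item above, the two comparison styles are not always interchangeable in Python. The helper names below are hypothetical and the snippet is only a sketch of standard language behaviour, not code from LANDMark.

```python
def is_disabled_eq(flag):
    """Equality check: also true for values that compare equal to False."""
    return flag == False  # noqa: E712 (kept deliberately for comparison)


def is_disabled_identity(flag):
    """Identity check: true only for the False singleton itself."""
    return flag is False


values = [False, 0, 0.0, "", None]
eq_results = [is_disabled_eq(v) for v in values]
is_results = [is_disabled_identity(v) for v in values]
```

For a parameter that is always a real `bool`, both forms behave the same. The identity form is stricter when a value such as `0` could be passed.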

LANDMark Classifier version 2.0.4

20 May 02:37
bc50c8c

LANDMark Classifier version 2.0.3

19 May 19:12
0b017f6

Fixed broken PyPI Release

LANDMark Classifier version 2.0.2

19 May 17:58
93106db
  • Enhanced ability to resample by accepting 'imbalanced-learn' approaches.
  • Added additional split criteria based on the gain ratio and Tsallis entropy (tsallis, gain, gain-ratio, tsallis-gain-ratio)
  • 'q' parameter has been exposed and is now available for hyper-parameter tuning (for Tsallis entropy)
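
For readers unfamiliar with the criteria named above, here is a minimal sketch using the textbook definitions of Tsallis entropy and a gain ratio. It is an assumption that LANDMark's criteria follow these exact forms; the function names and the way the split information is computed are illustrative only.

```python
import math


def tsallis_entropy(probs, q):
    """Tsallis entropy S_q = (1 - sum(p^q)) / (q - 1).

    As q approaches 1 this reduces to Shannon entropy (natural log).
    """
    if abs(q - 1.0) < 1e-12:
        return -sum(p * math.log(p) for p in probs if p > 0)
    return (1.0 - sum(p ** q for p in probs if p > 0)) / (q - 1.0)


def class_probs(labels):
    """Relative frequency of each class in a list of labels."""
    n = len(labels)
    return [labels.count(c) / n for c in sorted(set(labels))]


def gain_ratio(parent, left, right, q):
    """Entropy gain of a binary split divided by the entropy of the split sizes."""
    if not left or not right:
        return 0.0  # a split that leaves one side empty carries no gain
    n = len(parent)
    weights = [len(left) / n, len(right) / n]
    children = (weights[0] * tsallis_entropy(class_probs(left), q)
                + weights[1] * tsallis_entropy(class_probs(right), q))
    gain = tsallis_entropy(class_probs(parent), q) - children
    split_info = tsallis_entropy(weights, q)
    return gain / split_info if split_info > 0 else 0.0


parent = [0, 0, 1, 1]
ratio_q2 = gain_ratio(parent, [0, 0], [1, 1], q=2.0)
```

With q = 2 the Tsallis entropy equals the Gini impurity, which is one reason q is a natural candidate for hyper-parameter tuning.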

LANDMark Classifier version 2.0.1

17 May 23:04
182333a
  • Minor fix to README

LANDMark Classifier version 2.0.0

17 May 22:50
563474c
  • Removed the dependency on 'shap'. SHAP-based analysis can be done post hoc using a variety of methods, and dropping it improves LANDMark performance. May be re-introduced in the future.
  • Added type annotations and parameter checking for LANDMark input and hyper-parameters. Additional annotations will be added in later patches for subsequent modules.
  • More informative class names (eg: BaggingClassifier -> Ensemble)
  • Simplified the Ensemble() class
  • Updated README, API, CONTRIBUTIONS, ISSUES, BUG_REPORT files
  • Added tests
  • Considerable reduction in redundant code by combining all linear models into a single base classifier (LMClassifier)
  • Removed unused modules
  • Bumped version to version 2.0