Methods for numerical differentiation of noisy data in Python

luckystarufo/PyNumDiff

 
 

Python for Numerical Differentiation of noisy time series data

PyNumDiff: Python for Numerical Differentiation of noisy time series data.

Introduction

PyNumDiff is a Python package that implements several methods for computing numerical derivatives of noisy data, which can be a critical step in developing dynamic models or designing controllers. Four families of methods are implemented in this repository: smoothing followed by finite difference calculation, local approximation with linear models, Kalman-filtering-based methods, and total variation regularization methods. Most of these methods have multiple parameters to tune. We take a principled approach and propose a multi-objective optimization framework for choosing the parameters that minimize a loss function balancing the faithfulness and smoothness of the derivative estimate. For more details, refer to this paper.
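As a rough illustration of the first family (smoothing followed by finite differences), here is a minimal standard-library sketch. It is not PyNumDiff's implementation: the moving-average smoother and the helper names `moving_average`, `central_difference`, and `smooth_then_differentiate` are assumptions made for this example. It only shows why smoothing before differencing helps on noisy data.

```python
import math
import random


def moving_average(x, window):
    """Centered moving average; the window shrinks near the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(x)):
        lo = max(0, i - half)
        hi = min(len(x), i + half + 1)
        smoothed.append(sum(x[lo:hi]) / (hi - lo))
    return smoothed


def central_difference(x, dt):
    """Central differences in the interior, one-sided differences at the ends."""
    n = len(x)
    dxdt = [0.0] * n
    dxdt[0] = (x[1] - x[0]) / dt
    dxdt[-1] = (x[-1] - x[-2]) / dt
    for i in range(1, n - 1):
        dxdt[i] = (x[i + 1] - x[i - 1]) / (2.0 * dt)
    return dxdt


def smooth_then_differentiate(x, dt, window=11):
    """Hypothetical helper: smooth first, then differentiate the smoothed signal."""
    x_hat = moving_average(x, window)
    dxdt_hat = central_difference(x_hat, dt)
    return x_hat, dxdt_hat


def rmse(a, b):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)) / len(a))


# Noisy sine wave whose true derivative (cosine) is known.
random.seed(0)
dt = 0.01
t = [i * dt for i in range(400)]
x_noisy = [math.sin(ti) + random.gauss(0.0, 0.05) for ti in t]
dxdt_true = [math.cos(ti) for ti in t]

dxdt_raw = central_difference(x_noisy, dt)
_, dxdt_smooth = smooth_then_differentiate(x_noisy, dt)

print("RMSE, finite difference on raw data:     ", rmse(dxdt_raw, dxdt_true))
print("RMSE, finite difference on smoothed data:", rmse(dxdt_smooth, dxdt_true))
```

The window length plays the role of a tuning parameter here: a longer window suppresses more noise but also flattens real features of the signal, which is the trade-off the optimization framework is meant to manage.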

Structure

PyNumDiff/
  |- README.md
  |- pynumdiff/
     |- __init__.py
     |- __version__.py
     |- finite_difference/
     |- kalman_smooth/
     |- linear_model/
     |- smooth_finite_difference/
     |- total_variation_regularization/
     |- utils/
     |- optimize/
        |- __init__.py
        |- __optimize__.py
        |- finite_difference/
        |- kalman_smooth/
        |- linear_model/
        |- smooth_finite_difference/
        |- total_variation_regularization/
     |- tests/
  |- examples/
     |- 1_basic_tutorial.ipynb
     |- 2a_optimizing_parameters_with_dxdt_known.ipynb
     |- 2b_optimizing_parameters_with_dxdt_unknown.ipynb
  |- docs/
     |- Makefile
     |- make.bat
     |- build/
     |- source/
        |- _static
        |- _summaries
        |- conf.py
        |- index.rst
        |- ...
  |- setup.py
  |- .gitignore
  |- .travis.yml
  |- LICENSE
  |- requirements.txt

Getting Started

Prerequisite

PyNumDiff requires common packages such as numpy, scipy, and matplotlib, plus pytest (for unit tests) and pylint (for PEP 8 style checks). For the full list, see requirements.txt.

Some methods require additional packages:

  • Total Variation Regularization methods: cvxpy
  • Linear Model Chebychev: pychebfun

When using cvxpy, our default solver is MOSEK (highly recommended); you will need to download a free academic license from the MOSEK website. Alternatively, you can use one of the other solvers that cvxpy supports.

Installing

The code is compatible with Python 3.x. It can be installed using pip or directly from the source code.

Installing via pip

pip install pynumdiff

Installing from source

To install this package, run python ./setup.py install from inside this directory.

Usage

PyNumDiff uses Sphinx for code documentation; see the generated documentation for details on API usage.

Basic usage

  • Basic Usage: you provide the parameters
        x_hat, dxdt_hat = pynumdiff.sub_module.method(x, dt, params, options)     
  • Advanced usage: automated parameter selection through multi-objective optimization
        params, val = pynumdiff.optimize.sub_module.method(x, dt, params=None, 
                                                           tvgamma=tvgamma, # hyperparameter
                                                           dxdt_truth=None, # no ground truth data
                                                           options={})
        print('Optimal parameters: ', params)
        x_hat, dxdt_hat = pynumdiff.sub_module.method(x, dt, params, options={'smooth': True})
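When no ground-truth derivative is available (`dxdt_truth=None`), the optimizer needs a loss it can compute from the noisy data alone. The sketch below shows one plausible form of such a loss, assuming it combines a faithfulness term (how well the integrated derivative reproduces `x`) with `tvgamma` times a smoothness term (the total variation of the derivative). The function names, the normalization of the total variation, and the handling of the integration constant are assumptions made for this example; the exact definitions used by the library are given in the paper and the source code.

```python
import math


def integrate_trapezoid(dxdt, dt):
    """Cumulative trapezoidal integral of dxdt, starting from zero."""
    total = [0.0]
    for i in range(1, len(dxdt)):
        total.append(total[-1] + 0.5 * (dxdt[i] + dxdt[i - 1]) * dt)
    return total


def total_variation(v):
    """Mean absolute change between consecutive samples (assumed normalization)."""
    return sum(abs(v[i] - v[i - 1]) for i in range(1, len(v))) / (len(v) - 1)


def loss_without_truth(x, dxdt_hat, dt, tvgamma):
    """Hypothetical loss: reconstruction error plus tvgamma * total variation."""
    reconstruction = integrate_trapezoid(dxdt_hat, dt)
    # The integration constant is unknown; use the offset that minimizes
    # the squared error, which is the mean residual.
    offset = sum(xi - ri for xi, ri in zip(x, reconstruction)) / len(x)
    squared = [(xi - ri - offset) ** 2 for xi, ri in zip(x, reconstruction)]
    faithfulness = math.sqrt(sum(squared) / len(squared))
    return faithfulness + tvgamma * total_variation(dxdt_hat)


# Two derivative estimates that integrate to the same data x = [0, 1, 2, 3]:
x = [0.0, 1.0, 2.0, 3.0]
smooth_estimate = [1.0, 1.0, 1.0, 1.0]
wiggly_estimate = [0.0, 2.0, 0.0, 2.0]

print(loss_without_truth(x, smooth_estimate, dt=1.0, tvgamma=0.5))
print(loss_without_truth(x, wiggly_estimate, dt=1.0, tvgamma=0.5))
```

Both estimates reproduce the data equally well after integration, so only the smoothness term separates them. Raising `tvgamma` increases the penalty on the oscillating estimate, which is why larger values favor smoother derivatives.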

Notebook examples

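We will frequently update simple examples for demo purposes. The current notebooks are in the examples directory: 1_basic_tutorial.ipynb, 2a_optimizing_parameters_with_dxdt_known.ipynb, and 2b_optimizing_parameters_with_dxdt_unknown.ipynb.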

Important notes

  • Larger values of tvgamma produce smoother derivatives
  • The value of tvgamma is largely universal across methods, making it easy to compare method results
  • The optimization is slow. If you have a lot of data, run it on a subset. It is much faster with fast differentiation methods such as savgoldiff and butterdiff, and probably too slow for sliding methods such as sliding DMD and sliding LTI fit.
  • The following heuristic works well for choosing tvgamma, where cutoff_frequency is the highest frequency content of the signal in your data, and dt is the timestep: tvgamma=np.exp(-1.6*np.log(cutoff_frequency)-0.71*np.log(dt)-5.1)
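The heuristic in the last note can be wrapped in a small helper. This is a minimal sketch using only the standard library; the function name `tvgamma_heuristic` is hypothetical, and the units (cutoff frequency in Hz, timestep in seconds) are an assumption, since the note does not state them.

```python
import math


def tvgamma_heuristic(cutoff_frequency, dt):
    """Evaluate the README heuristic for tvgamma.

    cutoff_frequency: highest frequency content of the signal (assumed Hz).
    dt: timestep of the data (assumed seconds).
    """
    if cutoff_frequency <= 0 or dt <= 0:
        raise ValueError("cutoff_frequency and dt must be positive")
    return math.exp(-1.6 * math.log(cutoff_frequency)
                    - 0.71 * math.log(dt)
                    - 5.1)


# Example: a signal with content up to 2 Hz, sampled every 0.01 s.
tvgamma = tvgamma_heuristic(cutoff_frequency=2.0, dt=0.01)
print("tvgamma:", tvgamma)
```

The expression is equivalent to `exp(-5.1) * cutoff_frequency**-1.6 * dt**-0.71`, so for a fixed timestep a higher cutoff frequency gives a smaller tvgamma and therefore a less heavily smoothed derivative.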

Running the tests

We use Travis CI for continuous integration testing. You can check out the current status here.

To run tests locally, type:

> pytest pynumdiff

Citation

@ARTICLE{9241009,
  author={F. {van Breugel} and J. {Nathan Kutz} and B. W. {Brunton}},
  journal={IEEE Access},
  title={Numerical differentiation of noisy data: A unifying multi-objective optimization framework},
  year={2020},
  volume={},
  number={},
  pages={1-1},
  doi={10.1109/ACCESS.2020.3034077}}

License

This project is released under the MIT License. It is 100% open source; feel free to use the code however you like.