feat: extract trend from signal #8

Draft
wants to merge 24 commits into
base: main

Conversation

VebjornG
Contributor

@VebjornG VebjornG commented Oct 11, 2023

Trend extraction using the Hilbert-Huang Transform.

PROBLEM: Some of the tests fail. This could be because we're calculating the Hilbert spectrum and not the Hilbert marginal spectrum.

Description

This function extracts the trend of a signal using the Hilbert-Huang Transform (HHT). The HHT is well suited to this task because it handles non-stationary, non-linear time series and can recover the trend even when the signal is very noisy. The decomposition uses the PyEMD package, which does not provide a function for computing the Hilbert spectrum of the decomposed signal, so the Hilbert spectrum is calculated manually. The algorithm is defined in https://wwaw.researchgate.net/publication/261234992_Trend_extraction_based_on_Hilbert-Huang_transform and proceeds as follows:

  1. Decompose the signal into intrinsic mode functions (IMFs)
  2. Compute the Hilbert spectrum
  3. Compute the cross energy ratio of the Hilbert marginal spectra of consecutive IMFs
  4. Find significant IMFs based on step 3
  5. Calculate the trend as a sum of significant IMFs
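
A minimal sketch of steps 2 to 5, assuming the IMFs from step 1 are already available as rows of an array (for example from PyEMD) and that the signal is uniformly sampled. The helper names, the normalised spectral overlap used here as a stand-in for the paper's cross energy ratio, the 0.5 threshold, and the rule of walking up from the slowest IMF are all illustrative assumptions; they are not the code in this PR and may differ from the paper's exact definitions.

```python
import numpy as np


def analytic_signal(x):
    """Analytic signal via the FFT: keep DC, double positive frequencies, zero negatives."""
    n = len(x)
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1 : n // 2] = 2.0
    else:
        weights[1 : (n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * weights)


def marginal_spectrum(imf, fs, bin_edges):
    """Step 2: instantaneous amplitude summed over time, per instantaneous-frequency bin."""
    z = analytic_signal(imf)
    amplitude = np.abs(z)
    phase = np.unwrap(np.angle(z))
    inst_freq = np.diff(phase) * fs / (2.0 * np.pi)  # in Hz, one sample shorter
    hist, _ = np.histogram(inst_freq, bins=bin_edges, weights=amplitude[:-1])
    return hist


def spectral_overlap(h_a, h_b):
    """Step 3 (assumed form): normalised overlap of two marginal spectra, in [0, 1]."""
    denom = np.sqrt(np.sum(h_a**2) * np.sum(h_b**2))
    return float(np.sum(h_a * h_b) / denom) if denom > 0 else 0.0


def extract_trend(imfs, fs, threshold=0.5, n_bins=50):
    """Steps 4 and 5 (assumed rule): sum the slowest IMFs whose spectra keep overlapping."""
    imfs = np.asarray(imfs, dtype=float)
    bin_edges = np.linspace(0.0, fs / 2.0, n_bins + 1)
    spectra = [marginal_spectrum(imf, fs, bin_edges) for imf in imfs]
    first = len(imfs) - 1  # the slowest IMF is always treated as part of the trend
    while first > 0 and spectral_overlap(spectra[first - 1], spectra[first]) >= threshold:
        first -= 1
    return imfs[first:].sum(axis=0)


# Hand-made "IMFs", ordered fast to slow, standing in for an EMD result.
fs = 200.0
t = np.arange(2000) / fs
imfs = np.array([
    np.sin(2 * np.pi * 41.0 * t),
    0.5 * np.sin(2 * np.pi * 0.5 * t),
    2.0 * np.sin(2 * np.pi * 0.3 * t),
])
trend = extract_trend(imfs, fs)
```

In this toy case the two slow components fall in the same frequency bin and are summed into the trend, while the 41 Hz component is left out. Real IMFs have time-varying frequencies and edge effects, so the bin width and threshold would need tuning.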

Motivation and Context

It will be used to develop methods required to measure data quality.

How Has This Been Tested?

Generated a synthetic signal with nonuniform timestamps, Gaussian noise, and gaps, and compared it with the extracted trend.
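
A rough sketch of how such a test signal could be built with NumPy alone. The function name, the linear trend, the 0.5 Hz oscillation, the noise level, and the gap location are hypothetical choices for illustration, not the fixture used in this PR's tests.

```python
import numpy as np


def make_synthetic_signal(rng, n=1000, duration=100.0, noise_std=0.3, gap=(40.0, 50.0)):
    """Return (timestamps, values, true_trend) with irregular sampling, noise and one gap."""
    # Nonuniform timestamps: sorted uniform draws over the duration.
    t = np.sort(rng.uniform(0.0, duration, size=n))
    true_trend = 0.05 * t
    values = true_trend + np.sin(2 * np.pi * 0.5 * t) + rng.normal(0.0, noise_std, size=n)
    # Gap: drop every sample whose timestamp falls inside the gap interval.
    keep = (t < gap[0]) | (t > gap[1])
    return t[keep], values[keep], true_trend[keep]


def rmse(estimate, reference):
    """Root-mean-square error between an estimated trend and the reference trend."""
    return float(np.sqrt(np.mean((np.asarray(estimate) - np.asarray(reference)) ** 2)))


rng = np.random.default_rng(seed=0)
t, values, true_trend = make_synthetic_signal(rng)
```

An extracted trend could then be compared against `true_trend` with `rmse` and checked against a tolerance.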

Screenshots:

Figure_1
Figure_2

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • Refactor (non-breaking change which improves implementation)
  • Performance (non-breaking change which improves performance. Please add associated performance test and results)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Non-functional change (xml comments/documentation/etc)

Contributor Checklist:

  • My code follows the code style of this project.
  • I have added an example of my new feature and included it in the documentation.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.
  • My Pull Request name follows the naming convention fix: <description>, feat: <description>, etc.

Reviewer Checklist for Charts compliant functions:

  • The docstrings of the new function follow the contributing guidelines.
  • The new function is professionally documented
  • The new function and associated scripts are covered by one or more unit tests and code coverage did not decrease.
  • The new function is accompanied by an example and it is included in the Gallery of Charts.
  • The new function is reviewed in Chromatic. Access the storybook build results url and comment, approve or deny.
  • All function inputs, arguments, and outputs have a supported data type and have human readable names.
  • No code language is included in the description of the function or parameters (e.g. use "polynomial order" instead of "poly_order")


github-actions bot commented Oct 11, 2023

Unit Test Results

0 tests  ±0   0 ✔️ ±0   0s ⏱️ ±0s
0 suites ±0   0 💤 ±0 
0 files   ±0   0 ±0 

Results for commit eb5733b. ± Comparison against base commit 5610c1b.

♻️ This comment has been updated with latest results.

codecov bot commented Nov 21, 2023

Codecov Report

Merging #8 (1f082a2) into main (5943add) will decrease coverage by 21.19%.
The diff coverage is 13.75%.

Additional details and impacted files
@@             Coverage Diff             @@
##             main       #8       +/-   ##
===========================================
- Coverage   91.22%   70.03%   -21.19%     
===========================================
  Files         103      104        +1     
  Lines        3770     3958      +188     
  Branches      815      851       +36     
===========================================
- Hits         3439     2772      -667     
- Misses        207     1056      +849     
- Partials      124      130        +6     
Files Coverage Δ
indsl/filter/__init__.py 100.00% <100.00%> (ø)
indsl/filter/hilbert_huang_transform.py 12.83% <12.83%> (ø)

... and 39 files with indirect coverage changes

pyproject.toml (outdated, resolved)
from .simple_filters import status_flag_filter
from .wavelet_filter import wavelet_filter


TOOLBOX_NAME = "Filter"

- __all__ = ["wavelet_filter", "status_flag_filter"]
+ __all__ = ["hilbert_huang_transform", "wavelet_filter", "status_flag_filter"]

__cognite__ = ["wavelet_filter", "status_flag_filter"]
Contributor

@neringaalt neringaalt Nov 23, 2023

add it here as well



def generate_synthetic_signal():
    wave = sine_wave(
Contributor

Add back the tests that you removed; they test different types of signals.

Contributor

The algorithm should work with different signals


# DROP_SENTINAL is used to get the smallest (most negative)
# number that can be represented with a 32-bit signed integer.
DROP_SENTINAL = np.iinfo(np.int32).min
Contributor

Can you rename it?

Contributor

And use it directly in the code

    error_tolerance: float = 0.05,
    return_trend: bool = True,
) -> pd.Series:
    r"""Perform the Hilbert-Huang Transform (HHT) to find the trend of a signal.
Contributor

You can look at the description of the removed function to get an idea of how to write the documentation.
You don't need to explain what the different types of signals are.

2 participants