😋 Data processing utilities

Check the CHANGELOG file for an overview of the latest updates and new features! 😋

Project structure

Check the provided notebooks for an overview of the available features!

├── example_data        : data used for the demonstrations
├── loggers             : custom utilities for the `logging` module
│   ├── __init__.py         : defines useful utilities to control `logging`
│   ├── telegram_handler.py : custom logger using the telegram bot api
│   ├── time_logging.py     : custom timer features
│   └── tts_handler.py      : custom logger using the Text-To-Speech models
├── tests               : custom unit-testing for the different modules
│   ├── data               : test data files
│   ├── __reproduction     : expected output files for reproducibility tests
│   ├── test_custom_train_objects.py
│   ├── test_utils_audio.py
│   ├── test_utils_boxes.py
│   ├── test_utils_compile.py
│   ├── test_utils_distance.py
│   ├── test_utils_embeddings.py
│   ├── test_utils_files.py
│   ├── test_utils_image.py
│   ├── test_utils_keras.py
│   ├── test_utils_ops.py
│   ├── test_utils_sequence.py
│   ├── test_utils_stream.py
│   └── test_utils_text.py
├── utils
│   ├── audio                   : audio utilities
│   │   ├── audio_annotation.py     : annotation features for new TTS/STT dataset creation
│   │   ├── audio_io.py             : audio loading / writing
│   │   ├── audio_player.py         : audio playback functionality
│   │   ├── audio_processing.py     : audio normalization / processing
│   │   ├── audio_recorder.py       : audio recording functionality
│   │   ├── audio_stream.py         : audio streaming support
│   │   ├── mkv_utils.py            : processing for .mkv video format
│   │   ├── noisereducev1.py        : maintained version of the old `noisereduce` library
│   │   └── stft.py                 : implementations of various mel-spectrogram methods
│   ├── callbacks               : callback management system
│   │   ├── __init__.py
│   │   ├── callback.py             : base callback implementation
│   │   ├── displayer.py            : display-related callbacks
│   │   ├── file_saver.py           : file saving callbacks
│   │   └── function_callback.py    : function-based callbacks
│   ├── datasets                : dataset utilities
│   │   ├── audio_datasets          : audio dataset implementations
│   │   │   ├── common_voice.py         : Mozilla Common Voice dataset
│   │   │   ├── libri_speech.py         : LibriSpeech dataset
│   │   │   ├── processing.py           : audio dataset processing
│   │   │   ├── siwis.py                : SIWIS dataset
│   │   │   └── voxforge.py             : VoxForge dataset
│   │   ├── builder.py               : dataset building utilities
│   │   ├── loader.py                : dataset loading utilities
│   │   └── summary.py               : dataset summary tools
│   ├── image                   : image features
│   │   ├── bounding_box            : features for bounding box manipulation
│   │   │   ├── combination.py          : combines group of boxes
│   │   │   ├── converter.py            : box format conversion
│   │   │   ├── filters.py              : box filtering
│   │   │   ├── locality_aware_nms.py   : LA-NMS implementation
│   │   │   ├── metrics.py              : box metrics (IoU, etc.)
│   │   │   ├── non_max_suppression.py  : NMS implementation
│   │   │   ├── processing.py           : box processing
│   │   │   └── visualization.py        : box extraction / drawing
│   │   ├── custom_cameras.py       : custom camera implementations
│   │   ├── image_io.py             : image loading / writing
│   │   ├── image_normalization.py  : normalization schema
│   │   └── image_processing.py     : image processing utilities
│   ├── keras                   : keras and hardware acceleration utilities
│   │   ├── ops                     : operation interfaces for different backends
│   │   │   ├── builder.py              : operation builder
│   │   │   ├── core.py                 : core operations
│   │   │   ├── execution_contexts.py   : execution context management
│   │   │   ├── image.py                : image operations
│   │   │   ├── linalg.py               : linear algebra operations
│   │   │   ├── math.py                 : mathematical operations
│   │   │   ├── nn.py                   : neural network operations
│   │   │   ├── numpy.py                : numpy-compatible operations
│   │   │   └── random.py               : random operations
│   │   ├── runtimes                : model runtime implementations
│   │   │   ├── onnx_runtime.py         : ONNX runtime
│   │   │   ├── runtime.py              : base runtime class
│   │   │   ├── saved_model_runtime.py  : saved model runtime
│   │   │   ├── tensorrt_llm_runtime.py : TensorRT LLM runtime
│   │   │   └── tensorrt_runtime.py     : TensorRT runtime
│   │   ├── compile.py              : graph compilation features
│   │   └── gpu.py                  : GPU utilities
│   ├── text                    : text-related features
│   │   ├── abreviations
│   │   ├── parsers                 : document parsers (new implementation)
│   │   │   ├── combination.py      : box combination for parsing
│   │   │   ├── docx_parser.py      : DOCX document parser
│   │   │   ├── java_parser.py      : Java code parser
│   │   │   ├── md_parser.py        : Markdown parser
│   │   │   ├── parser.py           : base parser implementation
│   │   │   ├── pdf_parser.py       : PDF parser
│   │   │   ├── py_parser.py        : Python code parser
│   │   │   └── txt_parser.py       : text file parser
│   │   ├── cleaners.py             : text cleaning methods
│   │   ├── ctc_decoder.py          : CTC-decoding
│   │   ├── metrics.py              : text evaluation metrics
│   │   ├── numbers.py              : numbers cleaning methods
│   │   ├── sentencepiece_tokenizer.py : sentencepiece tokenizer interface
│   │   ├── text_processing.py      : text processing functions
│   │   ├── tokenizer.py            : tokenizer implementation
│   │   └── tokens_processing.py    : token-level processing
│   ├── threading               : threading utilities
│   │   ├── async_result.py        : asynchronous result handling
│   │   ├── priority_queue.py      : priority queue with order consistency
│   │   ├── process.py             : process management
│   │   └── stream.py              : data streaming implementation
│   ├── comparison_utils.py     : convenient comparison features for various data types
│   ├── distances.py            : distance and similarity metrics
│   ├── embeddings.py           : embeddings saving / loading
│   ├── file_utils.py           : data saving / loading
│   ├── generic_utils.py        : generic features 
│   ├── plot_utils.py           : plotting functions
│   ├── sequence_utils.py       : sequence manipulation
│   └── wrappers.py             : function wrappers and decorators
├── example_audio.ipynb
├── example_custom_operations.ipynb
├── example_generic.ipynb
├── example_image.ipynb
├── example_text.ipynb
├── LICENSE
├── Makefile
├── README.md
└── requirements.txt

The loggers module is independent of the utils module, which makes it easy to reuse or extract.

Installation and usage

See the installation guide for a step-by-step installation 😄

Here is a summary of the installation procedure, assuming you have a working Python environment:

1. Clone this repository: `git clone https://github.com/yui-mhcp/data_processing.git`
2. Go to the root of this repository: `cd data_processing`
3. Install the requirements: `pip install -r requirements.txt`
4. Open an example notebook and follow the instructions!

The utils/{audio / image / text} modules are not loaded by default, so you do not need to install the requirements of a submodule you do not plan to use. In that case, you can simply remove the submodule and run the pipreqs command to generate a new requirements.txt file!
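To illustrate what "not loaded by default" can mean in practice, here is a minimal sketch of deferred importing. It is not this repository's actual mechanism: the `LazyModule` class is a hypothetical helper written for this example, and the standard-library `json` module stands in for an optional submodule.

```python
import importlib


class LazyModule:
    """Defer importing `module_name` until one of its attributes is accessed.

    Hypothetical helper for illustration only, not part of this repository.
    """

    def __init__(self, module_name):
        self._module_name = module_name
        self._module = None

    @property
    def loaded(self):
        # True once the underlying module has actually been imported.
        return self._module is not None

    def __getattr__(self, attr):
        # Called only for names not found on the wrapper itself,
        # i.e. for anything that lives on the real module.
        if self._module is None:
            self._module = importlib.import_module(self._module_name)
        return getattr(self._module, attr)


# `json` stands in for an optional submodule with its own dependencies.
lazy_json = LazyModule("json")
print(lazy_json.loaded)           # False: nothing has been imported yet
print(lazy_json.dumps({"a": 1}))  # first attribute access triggers the import
print(lazy_json.loaded)           # True
```

With this pattern, a missing dependency only raises an `ImportError` when the corresponding submodule is first used, which is why unused submodules do not need their requirements installed.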

Important note: no backend (e.g., tensorflow, torch, ...) is installed by default, so make sure to install the one you need beforehand!
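Before opening a notebook, it can help to check which backends are importable in the current environment. The snippet below is a small sketch using only the standard library. The `available_backends` function is a hypothetical helper, not part of this repository, and `jax` is an assumed extra candidate that the note above does not name.

```python
import importlib.util


def available_backends(candidates=("tensorflow", "torch", "jax")):
    """Return a {package_name: bool} mapping of which packages can be found.

    `find_spec` only locates the package; it does not import it, so this
    check is cheap even for large frameworks.
    """
    return {
        name: importlib.util.find_spec(name) is not None
        for name in candidates
    }


# Print one line per candidate backend.
for name, found in available_backends().items():
    print(f"{name}: {'installed' if found else 'missing'}")
```

If every candidate is reported as missing, install at least one backend before running the examples.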

TO-DO list

Audio

Image

Add image loading / writing support
Add video loading / writing support
Add support for rotated bounding boxes
Implement a keras 3 Non-Maximal Suppression (NMS)
Implement the Locality-Aware NMS (LaNMS)

Text

Generic utilities

Contacts and licence

Contacts:

Mail: [email protected]
Discord: yui0732

This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0). See the LICENSE file for details.

This license allows you to use, modify, and distribute the code, as long as you include the original copyright and license notice in any copy of the software/source. Additionally, if you modify the code and distribute it, or run it on a server as a service, you must make your modified version available under the same license.

For more information about the AGPL-3.0 license, please visit the official website.

Notes and references

The text cleaning module (text.cleaners) is inspired by the NVIDIA tacotron2 repository. Their implementation of the Short-Time Fourier Transform (STFT) is also available in audio/stft.py, adapted to keras 3.
The embeddings provided in example_data/embeddings/embeddings_256_voxforge.csv were generated from samples of the VoxForge dataset, embedded with an AudioSiamese model (audio_siamese_256_mel_lstm).

Tutorials:

The Keras 3 API, which has been (partially) adapted in the keras_utils/ops module to enable the numpy backend and tf.data compatibility
The tf.function guide

Citation

If you find this project useful in your work, please add this citation to give it more visibility! 😋

@misc{yui-mhcp,
    author  = {yui},
    title   = {A Deep Learning projects centralization},
    year    = {2021},
    publisher   = {GitHub},
    howpublished    = {\url{https://github.com/yui-mhcp}}
}
