Deep neural network inference on energy-harvesting tiny devices has emerged as a solution for sustainable edge intelligence. However, compact models optimized for continuously-powered systems may become suboptimal when deployed on intermittently-powered systems. This paper presents the pruning criterion, pruning strategy, and prototype implementation of iPrune, the first framework that introduces intermittency into neural network pruning to produce compact models adaptable to intermittent systems. The pruned models are deployed and evaluated on a Texas Instruments device under various power strengths and TinyML applications. Compared to an energy-aware pruning framework, iPrune can speed up intermittent inference by 1.1 to 2 times while achieving comparable model accuracy.
Demo video: https://youtu.be/Dzg46_MO66w
Below is an explanation of the directories/files found in this repository.
pruning/datasets/
contains the datasets used for three models.pruning/models/
contains the model information for training.pruning/onnx_models/
contains the onnx models deployed on TI-MSP430FR5994.pruning/pruning_utils/
contains auxiliary functions used in the network pruning.pruning/main.py
is the entry file in the intermittent-aware neural network pruning.pruning/config.py
contains the model configuration and tile size used during runtime inference.pruning/prune.squeezenet.sh
,pruning/prune.har.sh
, andpruning/prune.kws_cnn_s.sh
are the scripts that run the intermittent-aware neural network pruning.inference-library/
contains the both inference runtime library designed for intermittently-powered systems and coutinuously-powered systems (currently supports convolution, sparse convolution, fully connected layers, sparse fully connected layers, max pooling, global average pooling, and batch normalization layers).
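Since the runtime supports only the layer types listed above, it can help to verify an exported ONNX model against that set before deployment. Below is a minimal sketch, assuming the third-party `onnx` package; the mapping from the supported layers to ONNX op types (and the idea that the sparse variants reuse `Conv`/`Gemm`) is an assumption, not part of this repository:

```python
# Sketch: check an ONNX model's op types against the layers the
# intermittent inference runtime supports.  The op-type mapping below is
# an assumption (sparse conv / sparse FC are taken to reuse Conv / Gemm).
SUPPORTED_OPS = {"Conv", "Gemm", "MaxPool", "GlobalAveragePool", "BatchNormalization"}

def unsupported_ops(op_types):
    """Return the op types that are not covered by the runtime."""
    return set(op_types) - SUPPORTED_OPS

def check_model(model_path):
    """Load an ONNX model and report any unsupported op types."""
    import onnx  # third-party dependency, only needed for this helper
    model = onnx.load(model_path)
    return unsupported_ops(node.op_type for node in model.graph.node)
```

A non-empty result flags layers that would either need support added to the runtime or need to be removed from the model before deployment.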
The following software is needed to build the intermittent-aware neural network pruning:
- Python 3.7
- Several deep learning Python libraries listed in `inference-library/requirements.txt`. These libraries can be installed with `python3.7 -m pip install -r requirements.txt`.
Here is the basic software and hardware needed to build/run the intermittent inference runtime library.
- Code Composer Studio >= 11.0
- MSP-EXP430FR5994 LaunchPad
- MSP DSP Library 1.30.00.02
- MSP430 driverlib 2.91.13.01
- Download/clone this repository
- Install the dependencies: `python3.7 -m pip install -r inference-library/requirements.txt`
- Run an intermittent-aware neural network pruning script, e.g., `bash prune.kws_cnn_s.sh`
- Download/clone this repository
- Convert the provided pre-trained models with the command `cd inference-library && python3.7 transform.py --target msp430 --hawaii (pruned_cifar10|pruned_har|pruned_kws_cnn) --method (intermittent|energy) --sparse`, which specifies the target platform, the inference engine, the model, and the pruning method to deploy.
- Download and extract the MSP DSP Library to `inference-library/TI-DSPLib`, then apply the patch with the command `cd TI-DSPLib/ && patch -Np1 -i ../TI-DSPLib.diff`
- Download and extract the MSP430 driverlib, then copy the `driverlib/MSP430FR5xx_6xx` folder into the `inference-library/msp430/` folder.
- Import the `inference-library/msp430/` folder as a project in Code Composer Studio.
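For reference, the bracketed option groups in the model-conversion command above expand into six concrete `transform.py` invocations. The following Python sketch enumerates them; it only prints the commands and does not execute anything:

```python
# Expand the documented model/method option groups of transform.py into
# the six concrete invocations described in the deployment steps above.
MODELS = ("pruned_cifar10", "pruned_har", "pruned_kws_cnn")
METHODS = ("intermittent", "energy")

def transform_commands():
    """Build every documented transform.py command line (printing only)."""
    return [
        f"python3.7 transform.py --target msp430 --hawaii {model} --method {method} --sparse"
        for model in MODELS
        for method in METHODS
    ]

for cmd in transform_commands():
    print(cmd)
```

Each printed line is a valid substitution of one model and one pruning method into the command shown in the conversion step.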