---
license: other
license_name: sla0044
license_link: >-
  https://github.com/STMicroelectronics/stm32ai-modelzoo/raw/refs/heads/main/hand_posture/LICENSE.md
---

# st_cnn2d_handposture model

## **Use case** : `Hand posture recognition`

# Model description

CNN2D_ST_HandPosture is a network topology designed by ST teams to solve basic hand posture recognition use cases based on data from ST multi-zone Time-of-Flight sensors. It is a convolutional neural network: the convolutional part extracts features from the data before feeding them to the fully-connected (Dense) layer. It uses the distance and signal-per-SPAD data of each 8x8 frame. It is a very light model with a very small footprint in terms of Flash, RAM, and computational requirements.

We recommend an input size of (8 x 8 x 2), but the network can support larger input sizes.

The only parameters required to build the model are the input shape and the number of outputs.

In this folder you will find multiple copies of the CNN2D_ST_HandPosture model pretrained on ST custom datasets. Please refer to the [stm32ai-modelzoo-services](https://github.com/STMicroelectronics/stm32ai-modelzoo-services) GitHub for more information.

## Network information (for 8 hand postures)

| Network Information     | Value           |
|:-----------------------:|:---------------:|
| Framework               | TensorFlow      |
| Params                  | 2,752           |

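As a sanity check on the parameter count above, the sketch below tallies the trainable parameters of one small layer stack that happens to total 2,752 for an (8 x 8 x 2) input and 8 classes. The layer layout (a single 3x3 convolution with 8 filters and no padding, 2x2 max pooling, a 32-unit Dense layer, and an 8-unit output layer) is an assumption made for illustration; this document does not list the actual layers, and the helper names are hypothetical.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters


def dense_params(in_features, units):
    """Weights plus one bias per unit."""
    return in_features * units + units


def count_params(height=8, width=8, channels=2, num_classes=8):
    # Assumed hyperparameters, not taken from this document.
    filters, kernel, pool, hidden = 8, 3, 2, 32

    conv = conv2d_params(kernel, kernel, channels, filters)

    # "Valid" convolution shrinks each side by kernel - 1, then pooling divides it.
    pooled_h = (height - kernel + 1) // pool
    pooled_w = (width - kernel + 1) // pool
    flat = pooled_h * pooled_w * filters

    return conv + dense_params(flat, hidden) + dense_params(hidden, num_classes)


print(count_params())  # 2752
```

Under this assumed layout, most of the parameters sit in the first Dense layer, so the count grows quickly if the input resolution is increased.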
## Network inputs / outputs

For a Time-of-Flight frame resolution of 8x8 and P classes:

| Input Shape | Description |
| :----:| :-----------: |
| (N, 8, 8, 2) | Batch of (8 x 8 x 2) matrices of Time-of-Flight values (distance, signal per SPAD) for an 8x8 frame, in FLOAT32 |

| Output Shape | Description |
| :----:| :-----------: |
| (N, P) | Batch of per-class confidences for P classes, in FLOAT32 |

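The shapes above can be illustrated with a short NumPy sketch that packs two per-zone maps into the (N, 8, 8, 2) input tensor and turns an (N, P) output into class labels. The helper names are hypothetical, the channel order (distance first, then signal per SPAD) is assumed from the order in the description, and any normalization applied before inference is not covered here. The class names are the eight listed in the Accuracy section.

```python
import numpy as np

CLASS_NAMES = ["None", "FlatHand", "Like", "Dislike",
               "Fist", "Love", "BreakTime", "CrossHands"]


def make_input(distance, signal_per_spad):
    """Stack two (N, 8, 8) maps into one (N, 8, 8, 2) FLOAT32 tensor."""
    distance = np.asarray(distance, dtype=np.float32)
    signal_per_spad = np.asarray(signal_per_spad, dtype=np.float32)
    if distance.shape != signal_per_spad.shape or distance.shape[1:] != (8, 8):
        raise ValueError("expected two arrays of shape (N, 8, 8)")
    # Assumed channel order: index 0 = distance, index 1 = signal per SPAD.
    return np.stack([distance, signal_per_spad], axis=-1)


def decode_output(scores, class_names=CLASS_NAMES):
    """Map an (N, P) confidence array to the most confident class per frame."""
    scores = np.asarray(scores, dtype=np.float32)
    return [class_names[i] for i in scores.argmax(axis=1)]


# Placeholder data standing in for real sensor frames and model outputs.
rng = np.random.default_rng(0)
batch = make_input(rng.uniform(100, 400, (4, 8, 8)),
                   rng.uniform(0, 1000, (4, 8, 8)))
print(batch.shape, batch.dtype)            # (4, 8, 8, 2) float32
print(decode_output(np.eye(8)[[1, 4]]))    # ['FlatHand', 'Fist']
```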
## Recommended platforms

| Platform | Supported | Recommended |
|:--------:|:---------:|:-----------:|
| STM32F4  | [x]       | [x]         |
| STM32L4  | [x]       | [x]         |
| STM32U5  | [x]       | []          |

# Performances

## Metrics

Measurements are made with the default STEdge AI Dev Cloud configuration, with the input / output allocated option enabled.

### Reference memory footprint based on ST_VL53LxCX_handposture_dataset (see Accuracy for details on dataset)

| Model | Format | Input Shape | Series | Activation RAM (KiB) | Runtime RAM (KiB) | Weights Flash (KiB) | Code Flash (KiB) | Total RAM (KiB) | Total Flash (KiB) | STEdge AI Core version |
|:--------------------------------------------------------------------------------------------------------------------------------------------------------:|:------:|:-----------:|:-------:|:--------------:|:-----------:|:-------------:|:----------:|:-----------:|:-----------:|:---------------------:|
| [st_cnn2d_handposture](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/hand_posture/st_cnn2d_handposture/ST_pretrainedmodel_custom_dataset/ST_VL53L8CX_handposture_dataset/st_cnn2d_handposture_8classes/st_cnn2d_handposture_8classes.keras) | FLOAT32 | 8 x 8 x 2 | STM32F4 | 1.63 | 0.28 | 10.75 | 6.16 | 1.91 | 16.19 | 3.0.0 |

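The Weights Flash figure can be cross-checked against the parameter count given under Network information. The sketch below assumes every one of the 2,752 parameters is stored as a 4-byte FLOAT32 value with no extra metadata; the function name is hypothetical.

```python
BYTES_PER_FLOAT32 = 4
NUM_PARAMS = 2752  # "Params" value from the Network information table


def weights_kib(num_params, bytes_per_param=BYTES_PER_FLOAT32):
    """Storage needed for the weights, in KiB (1 KiB = 1024 bytes)."""
    return num_params * bytes_per_param / 1024


print(weights_kib(NUM_PARAMS))  # 10.75
```

The result matches the 10.75 KiB reported in the Weights Flash column.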
### Reference inference time based on ST_VL53LxCX_handposture_dataset (see Accuracy for details on dataset)

| Model | Format | Resolution | Board | Frequency | Inference time (ms) | STEdge AI Core version |
|:-----------------:|:------:|:----------:|:----------------:|:-------------:|:-------------------:|:---------------------:|
| [st_cnn2d_handposture](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/hand_posture/st_cnn2d_handposture/ST_pretrainedmodel_custom_dataset/ST_VL53L8CX_handposture_dataset/st_cnn2d_handposture_8classes/st_cnn2d_handposture_8classes.keras) | FLOAT32 | 8 x 8 x 2 | STM32F401 | 84 MHz | 1.46 | 3.0.0 |

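To put the inference time in perspective, it can be converted into an approximate CPU cycle count and an upper bound on the inference rate. This is a rough sketch: it assumes the core runs at a constant 84 MHz for the whole inference and that the reported time covers the network only (sensor acquisition and pre-processing are not accounted for). The function names are hypothetical.

```python
def inference_cycles(time_ms, freq_mhz):
    """Approximate CPU cycles spent on one inference."""
    return time_ms * 1e-3 * freq_mhz * 1e6


def max_inferences_per_second(time_ms):
    """Upper bound on back-to-back inferences per second."""
    return 1000.0 / time_ms


print(round(inference_cycles(1.46, 84)))             # 122640
print(round(max_inferences_per_second(1.46), 1))     # 684.9
```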
### Accuracy with ST_VL53LxCX_handposture_dataset

Number of classes: 8 [None, FlatHand, Like, Dislike, Fist, Love, BreakTime, CrossHands]. Training dataset number of frames: 3,031. Test dataset number of frames: 1,146.

| Model | Dataset | Format | Resolution | Accuracy (%) |
|:--------------------------------------------------------------------------------------------------------------------------------------------------------:|:-------------------------------:|:---------:|:----------:|:------------:|
| [st_cnn2d_handposture](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/hand_posture/st_cnn2d_handposture/ST_pretrainedmodel_custom_dataset/ST_VL53L8CX_handposture_dataset/st_cnn2d_handposture_8classes/st_cnn2d_handposture_8classes.keras) | ST_VL53L8CX_handposture_dataset | FLOAT32 | 8 x 8 x 2 | 98.47 |
| [st_cnn2d_handposture](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/hand_posture/st_cnn2d_handposture/ST_pretrainedmodel_custom_dataset/ST_VL53L5CX_handposture_dataset/st_cnn2d_handposture_8classes/st_cnn2d_handposture_8classes.keras) | ST_VL53L5CX_handposture_dataset | FLOAT32 | 8 x 8 x 2 | 99.21 |

## Retraining and Integration in a simple example:

Please refer to the stm32ai-modelzoo-services GitHub [here](https://github.com/STMicroelectronics/stm32ai-modelzoo-services).