SAMO: Streaming Architecture Mapping Optimisation

Last update: Dec 10, 2022

Overview

SAMO: Streaming Architecture Mapping Optimiser

The SAMO framework provides a method of optimising the mapping of a Convolutional Neural Network model onto an FPGA platform for Streaming Architecture frameworks. Both a Simulated Annealing and Brute Force optimiser are implemented. We currently support the following frameworks:

Installation

You can install this package using:

python -m pip install samo

Usage

The general usage of the SAMO tool can be seen by running python -m samo --help.

Example platform configurations are given in the platforms directory and example CNN models can be generated by running python scripts/generate_networks.py.

FINN

In order to run the optimiser with the FINN toolflow, the first step is to download the following fork

git clone https://github.com/Yu-Zhewen/finn.git
cd finn
git checkout 4cc0b6fdae2f5c06f0b5bcc6fa45fba4d8b69111

As FINN requires docker, set SAMO_DIR to the path of SAMO in run_docker.sh, before entering the docker.

bash run_docker.sh

Within the docker, generate the FINN-ONNX through the following steps.

cd ../samo
cp models/${network}.onnx outputs/saved/finn/${network}.onnx
cp ../finn/notebooks/samo/config/${network}.json ../finn/notebooks/samo/config.json
jupyter nbconvert --to notebook --execute ../finn/notebooks/samo/pre_optimiser_steps.ipynb
mv ../finn/notebooks/samo/pre_optimiser_steps.nbconvert.ipynb outputs/saved/finn/${network}_pre_optimiser_steps.nbconvert.ipynb

To optimise the CNN model in the FINN-ONNX format, you need to do:

python -m samo --optimiser annealing --model outputs/saved/finn/${network}_pre_optimiser.onnx  \
    --backend finn --platform platforms/zedboard.json \
    --output-path outputs/saved/finn/${network}_post_optimiser.onnx

Finally, the following command is used to generate the hardware.

jupyter nbconvert --to notebook --execute ../finn/notebooks/samo/post_optimiser_steps.ipynb

HLS4ML

This tool can be used to generate optimised designs for the HLS4ML framework. SAMO tunes the reuse-factor for layers of the CNN model, and generates a Resource driven design.

To optimise a keras model for a given platform, run the following:

python -m samo --optimiser annealing --model models/model.keras \
    --backend hls4ml --platform platforms/zedboard.json \
    --output-path outputs/model_hls4ml.json

The previous command generates a configuration file (outputs/model_hls4ml.json), which can be used by the HLS4ML to generate hardware. To do this, you will need to use the HLS4ML API to convert this configuration file into a HLS project.

import hls4ml
from tensorflow import keras

# load the configuration
with open("outputs/model_hls4ml.json", "r") as f:
    config = json.load(f)

# load the platform
with open("platforms/zedboard.json", "r") as f:
    platform = json.load(f)

# load the keras model
model = keras.models.load_model("models/model.keras")

# create the hls model
hls_model = hls4ml.converters.convert_from_keras_model(model, hls_config=config,
        output_dir="outputs/hls4ml_prj",  io_type="io_stream", fpga_part=platform["part"])

# build the HLS project
hls_model.build(csim=True, cosim=True)

Feel free to post an issue if you have any questions or problems!

SAMO: Streaming Architecture Mapping Optimisation

Related tags

Overview

SAMO: Streaming Architecture Mapping Optimiser

Installation

Usage

FINN

HLS4ML

Owner

Alexander Montgomerie-Corcoran

Official implementation of "Not only Look, but also Listen: Learning Multimodal Violence Detection under Weak Supervision" ECCV2020

Code for CVPR2021 paper 'Where and What? Examining Interpretable Disentangled Representations'.

Accepted at ICCV-2021: Workshop on Computer Vision for Automated Medical Diagnosis (CVAMD)

Alpha-Zero - Telegram Group Manager Bot Written In Python Using Pyrogram

Unsupervised Learning of Probably Symmetric Deformable 3D Objects from Images in the Wild

A Python wrapper for Google Tesseract

Repository for self-supervised landmark discovery

A flexible ML framework built to simplify medical image reconstruction and analysis experimentation.

Vehicle Detection Using Deep Learning and YOLO Algorithm

Integrated physics-based and ligand-based modeling.

Can we do Customers Segmentation using PHP and Unsupervized Machine Learning ? Yes we can ! 🤡

Implementation of Self-supervised Graph-level Representation Learning with Local and Global Structure (ICML 2021).

Malware Analysis Neural Network project.

Vertical Federated Principal Component Analysis and Its Kernel Extension on Feature-wise Distributed Data based on Pytorch Framework

Deep deconfounded recommender (Deep-Deconf) for paper "Deep causal reasoning for recommendations"

End-to-End Dense Video Captioning with Parallel Decoding (ICCV 2021)

Repo for the ACMMM20 submission: "Personalized breath based biometric authentication with wearable multimodality".

Chainer implementation of recent GAN variants

Elastic weight consolidation technique for incremental learning.

CapsuleVOS: Semi-Supervised Video Object Segmentation Using Capsule Routing