Serving PyTorch 1.0 Models as a Web Server in C++

Last update: Jan 04, 2023

Serving PyTorch Models in C++

This repository contains various examples of performing inference with the PyTorch C++ API.
Run git clone https://github.com/Wizaron/pytorch-cpp-inference to clone this repository.

Environment

Dockerfiles can be found in the docker directory. There are two Dockerfiles: one for CPU and one for CUDA 10. To build a Docker image, go to the docker/cpu or docker/cuda10 directory and run docker build -t <docker-image-name> . (the trailing dot is the build context).
After building the image, create a container via docker run -v <directory-that-this-repository-resides>:<target-directory-in-docker-container> -p 8181:8181 -it <docker-image-name> (port 8181 will be used to serve the PyTorch C++ model).
Inside the container, go to the directory where this repository resides.
Download libtorch from the PyTorch website (CPU: https://download.pytorch.org/libtorch/cpu/libtorch-cxx11-abi-shared-with-deps-1.3.1%2Bcpu.zip - CUDA 10: https://download.pytorch.org/libtorch/cu101/libtorch-cxx11-abi-shared-with-deps-1.3.1.zip).
Unzip the downloaded archive with unzip. This creates a libtorch directory that contains the torch shared libraries and headers.
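
The download and unzip steps can also be scripted. The following is a minimal sketch, assuming Python 3 is available inside the container. The helper names libtorch_url and fetch_libtorch and the dest parameter are hypothetical and not part of this repository; the two URLs are the ones listed above.

```python
import urllib.request
import zipfile
from pathlib import Path

# URLs copied from the download step above (libtorch 1.3.1).
LIBTORCH_URLS = {
    "cpu": "https://download.pytorch.org/libtorch/cpu/libtorch-cxx11-abi-shared-with-deps-1.3.1%2Bcpu.zip",
    "cuda10": "https://download.pytorch.org/libtorch/cu101/libtorch-cxx11-abi-shared-with-deps-1.3.1.zip",
}


def libtorch_url(variant: str) -> str:
    """Return the download URL for the given variant ("cpu" or "cuda10")."""
    try:
        return LIBTORCH_URLS[variant]
    except KeyError:
        raise ValueError(
            f"unknown variant {variant!r}; expected one of {sorted(LIBTORCH_URLS)}"
        ) from None


def fetch_libtorch(variant: str, dest: str = ".") -> Path:
    """Download the libtorch archive and unzip it into dest (hypothetical helper)."""
    dest_dir = Path(dest)
    dest_dir.mkdir(parents=True, exist_ok=True)
    archive = dest_dir / "libtorch.zip"
    urllib.request.urlretrieve(libtorch_url(variant), str(archive))
    with zipfile.ZipFile(archive) as zf:
        # Per the step above, unzipping creates a libtorch directory.
        zf.extractall(dest_dir)
    return dest_dir / "libtorch"
```

Calling fetch_libtorch("cpu") from the repository root should leave a libtorch directory there, which is the same result as the manual download and unzip.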

Code Structure

The models directory stores PyTorch models.
The libtorch directory stores the C++ torch headers and shared libraries used to link the model against PyTorch.
The utils directory stores various utility functions for performing inference in C++.
The inference-cpp directory stores the code that performs inference.

Exporting PyTorch ScriptModule

To export a torch.jit.ScriptModule of ResNet18 for C++ inference, go to the models/resnet directory and run python3 resnet.py. The script downloads a ResNet18 model pretrained on ImageNet and creates models/resnet/resnet_model_cpu.pth and (optionally) models/resnet/resnet_model_gpu.pth, which are used for C++ inference.
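
For orientation, an export of this kind can be sketched as below. This is not the repository's resnet.py: it assumes the module is produced with torch.jit.trace on a 1x3x224x224 example input, and the function names output_name and export_resnet18 are hypothetical. It requires torch and torchvision to be installed when export_resnet18 is called.

```python
def output_name(device: str) -> str:
    """File name used in this README for the exported module ("cpu" or "gpu")."""
    if device not in ("cpu", "gpu"):
        raise ValueError("device must be 'cpu' or 'gpu'")
    return f"resnet_model_{device}.pth"


def export_resnet18(device: str = "cpu") -> str:
    """Trace a pretrained ResNet18 and save it as a TorchScript file."""
    # Imported here so output_name can be used without PyTorch installed.
    import torch
    import torchvision

    torch_device = torch.device("cuda" if device == "gpu" else "cpu")
    # pretrained=True matches torchvision releases contemporary with
    # libtorch 1.3.1; newer releases prefer the weights= argument.
    model = torchvision.models.resnet18(pretrained=True).to(torch_device).eval()
    # Assumption: one 224x224 RGB image is a suitable example input for tracing.
    example = torch.rand(1, 3, 224, 224, device=torch_device)
    with torch.no_grad():
        traced = torch.jit.trace(model, example)
    path = output_name(device)
    traced.save(path)  # written to the current working directory
    return path
```

Tracing on the target device is a deliberate choice in this sketch: a module traced on the GPU is saved with its parameters on the GPU, which is one plausible reason for keeping separate CPU and GPU files.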

Serving the C++ Model

We can either serve the model as a single executable or as a web server.

Single Executable

In order to build a single executable for inference:
1. Go to the inference-cpp/cnn-classification directory.
2. Run ./build.sh to build the executable, named predict.
3. Run the executable via ./predict <path-to-image> <path-to-exported-script-module> <path-to-labels-file> <gpu-flag{true/false}>.
4. Example: ./predict image.jpeg ../../models/resnet/resnet_model_cpu.pth ../../models/resnet/labels.txt false
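
When the executable needs to be called from a script, for example to classify several images in a row, the argument order above can be wrapped as follows. This is a minimal sketch, assuming it is run from the inference-cpp/cnn-classification directory after ./build.sh has succeeded; predict_command and run_predict are hypothetical helper names, and the output format of predict is not assumed.

```python
import subprocess
from typing import List


def predict_command(image: str, model: str, labels: str, use_gpu: bool = False) -> List[str]:
    """Build the argument list for ./predict in the order given above."""
    return ["./predict", image, model, labels, "true" if use_gpu else "false"]


def run_predict(image: str, model: str, labels: str, use_gpu: bool = False) -> str:
    """Run the executable and return whatever it prints to stdout."""
    result = subprocess.run(
        predict_command(image, model, labels, use_gpu),
        stdout=subprocess.PIPE,
        universal_newlines=True,  # decode stdout as text
        check=True,               # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```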

Web Server

In order to build a web server for production:
1. Go to the inference-cpp/cnn-classification/server directory.
2. Run ./build.sh to build the web server, named predict.
3. Run the binary via ./predict <path-to-exported-script-module> <path-to-labels-file> <gpu-flag{true/false}> (it will serve the model on http://localhost:8181/predict).
4. Example: ./predict ../../../models/resnet/resnet_model_cpu.pth ../../../models/resnet/labels.txt false
5. To make a request, open a new terminal tab and run python test_api.py (it will send a request to localhost:8181/predict).
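
A client for the endpoint can be sketched with the standard library alone. The payload format here is an assumption: this README does not describe what the server expects, so the sketch guesses a JSON body with a base64-encoded image under an "image" key. Check test_api.py for the actual format before relying on it. The names build_request and predict are hypothetical.

```python
import base64
import json
import urllib.request

API_URL = "http://localhost:8181/predict"


def build_request(image_bytes: bytes, url: str = API_URL) -> urllib.request.Request:
    """Build a POST request carrying the image as base64 inside a JSON body.

    Assumption: the server accepts {"image": "<base64>"}; the real payload
    format is defined by the server code and test_api.py.
    """
    payload = {"image": base64.b64encode(image_bytes).decode("ascii")}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(image_path: str, url: str = API_URL) -> str:
    """Send the image at image_path to the server and return the raw response text."""
    with open(image_path, "rb") as fp:
        request = build_request(fp.read(), url)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

With the server from step 3 running, predict("image.jpeg") would return the response body as text. The response is left undecoded because its structure is not specified here.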

Owner

Onur Kaplan
