Active learning for Mask R-CNN in Detectron2

Related tags

Deep Learningmaskal
Overview

MaskAL - Active learning for Mask R-CNN in Detectron2

maskAL_framework

Summary

MaskAL is an active learning framework that automatically selects the most-informative images for training Mask R-CNN. By using MaskAL, it is possible to reduce the number of image annotations, without negatively affecting the performance of Mask R-CNN. Generally speaking, MaskAL involves the following steps:

  1. Train Mask R-CNN on a small initial subset of a bigger dataset
  2. Use the trained Mask R-CNN algorithm to make predictions on the unlabelled images of the remaining dataset
  3. Select the most-informative images with a sampling algorithm
  4. Annotate the most-informative images, and then retrain Mask R-CNN on the most informative-images
  5. Repeat step 2-4 for a specified number of sampling iterations

The figure below shows the performance improvement of MaskAL on our dataset. By using MaskAL, the performance of Mask R-CNN improved more quickly and therefore 1400 annotations could be saved (see the black dashed line):

maskAL_graph

Installation

See INSTALL.md

Data preparation and training

Split the dataset in a training set, validation set and a test set. It is not required to annotate every image in the training set, because MaskAL will select the most-informative images automatically.

  1. From the training set, a smaller initial dataset is randomly sampled (the dataset size can be specified in the maskAL.yaml file). The images that do not have an annotation are placed in the annotate subfolder inside the image folder. You first need to annotate these images with LabelMe (json), V7-Darwin (json), Supervisely (json) or CVAT (xml) (when using CVAT, export the annotations to LabelMe 3.0 format). Refer to our annotation procedure: ANNOTATION.md
  2. Step 1 is repeated for the validation set and the test set (the file locations can be specified in the maskAL.yaml file).
  3. After the first training iteration of Mask R-CNN, the sampling algorithm selects the most-informative images (its size can be specified in the maskAL.yaml file).
  4. The most-informative images that don't have an annotation, are placed in the annotate subfolder. Annotate these images with LabelMe (json), V7-Darwin (json), Supervisely (json) or CVAT (xml) (when using CVAT, export the annotations to LabelMe 3.0 format).
  5. OPTIONAL: it is possible to use the trained Mask R-CNN model to auto-annotate the unlabelled images to further reduce annotation time. Activate auto_annotate in the maskAL.yaml file, and specify the export_format (currently supported formats: 'labelme', 'cvat', 'darwin', 'supervisely').
  6. Step 3-5 are repeated for several training iterations. The number of iterations (loops) can be specified in the maskAL.yaml file.

Please note that MaskAL does not work with the default COCO json-files of detectron2. These json-files contain all annotations that are completed before the training starts. Because MaskAL involves an iterative train and annotation procedure, the default COCO json-files lack the desired format.

How to use MaskAL

Open a terminal (Ctrl+Alt+T):

(base) [email protected]:~$ cd maskal
(base) [email protected]:~/maskal$ conda activate maskAL
(maskAL) [email protected]:~/maskal$ python maskAL.py --config maskAL.yaml

Change the following settings in the maskAL.yaml file:
Setting Description
weightsroot The file directory where the weight-files are stored
resultsroot The file directory where the result-files are stored
dataroot The root directory where all image-files are stored
use_initial_train_dir Set this to True when you want to start the active-learning from an initial training dataset. When False, the initial dataset of size initial_datasize is randomly sampled from the traindir
initial_train_dir When use_initial_train_dir is activated: the file directory where the initial training images and annotations are stored
traindir The file directory where the training images and annotations are stored
valdir The file directory where the validation images and annotations are stored
testdir The file directory where the test images and annotations are stored
network_config The Mask R-CNN configuration-file (.yaml) file (see the folder './configs')
pretrained_weights The pretrained weights to start the active-learning. Either specify the network_config (.yaml) or a custom weights-file (.pth or .pkl)
cuda_visible_devices The identifiers of the CUDA device(s) you want to use for training and sampling (in string format, for example: '0,1')
classes The names of the classes in the image annotations
learning_rate The learning-rate to train Mask R-CNN (default value: 0.01)
confidence_threshold Confidence-threshold for the image analysis with Mask R-CNN (default value: 0.5)
nms_threshold Non-maximum suppression threshold for the image analysis with Mask R-CNN (default value: 0.3)
initial_datasize The size of the initial dataset to start the active learning (when use_initial_train_dir is False)
pool_size The number of most-informative images that are selected from the traindir
loops The number of sampling iterations
auto_annotate Set this to True when you want to auto-annotate the unlabelled images
export_format When auto_annotate is activated: specify the export-format of the annotations (currently supported formats: 'labelme', 'cvat', 'darwin', 'supervisely')
supervisely_meta_json When supervisely auto_annotate is activated: specify the file location of the meta.json for supervisely export

Description of the other settings in the maskAL.yaml file: MISC_SETTINGS.md

Please refer to the folder active_learning/config for more setting-files.

Other software scripts

Use a trained Mask R-CNN algorithm to auto-annotate unlabelled images: auto_annotate.py

Argument Description
--img_dir The file directory where the unlabelled images are stored
--network_config Configuration of the backbone of the network
--classes The names of the classes of the annotated instances
--conf_thres Confidence threshold of the CNN to do the image analysis
--nms_thres Non-maximum suppression threshold of the CNN to do the image analysis
--weights_file Weight-file (.pth) of the trained CNN
--export_format Specifiy the export-format of the annotations (currently supported formats: 'labelme', 'cvat', 'darwin', 'supervisely')
--supervisely_meta_json The file location of the meta.json for supervisely export

Example syntax (auto_annotate.py):

python auto_annotate.py --img_dir datasets/train --network_config COCO-InstanceSegmentation/mask_rcnn_X_101_32x8d_FPN_3x.yaml --classes healthy damaged matured cateye headrot --conf_thres 0.5 --nms_thres 0.2 --weights_file weights/broccoli/model_final.pth --export_format supervisely --supervisely_meta_json datasets/meta.json

Troubleshooting

See TROUBLESHOOTING.md

Citation

See our research article for more information or cross-referencing:

@misc{blok2021active,
      title={Active learning with MaskAL reduces annotation effort for training Mask R-CNN}, 
      author={Pieter M. Blok and Gert Kootstra and Hakim Elchaoui Elghor and Boubacar Diallo and Frits K. van Evert and Eldert J. van Henten},
      year={2021},
      eprint={2112.06586},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url = {https://arxiv.org/abs/2112.06586},
}

License

Our software was forked from Detectron2 (https://github.com/facebookresearch/detectron2). As such, the software will be released under the Apache 2.0 license.

Acknowledgements

The uncertainty calculation methods were inspired by the research of Doug Morrison:
https://nikosuenderhauf.github.io/roboticvisionchallenges/assets/papers/CVPR19/rvc_4.pdf

Two software methods were inspired by the work of RovelMan:
https://github.com/RovelMan/active-learning-framework

MaskAL uses the Bayesian Active Learning (BaaL) software:
https://github.com/ElementAI/baal

Contact

MaskAL is developed and maintained by Pieter Blok.

Distributing Deep Learning Hyperparameter Tuning for 3D Medical Image Segmentation

DistMIS Distributing Deep Learning Hyperparameter Tuning for 3D Medical Image Segmentation. DistriMIS Distributing Deep Learning Hyperparameter Tuning

HiEST 2 Sep 09, 2022
A minimal TPU compatible Jax implementation of NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis

NeRF Minimal Jax implementation of NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis. Result of Tiny-NeRF RGB Depth

Soumik Rakshit 11 Jul 24, 2022
Automatically creates genre collections for your Plex media

Plex Auto Genres Plex Auto Genres is a simple script that will add genre collection tags to your media making it much easier to search for genre speci

Shane Israel 63 Dec 31, 2022
Code release for SLIP Self-supervision meets Language-Image Pre-training

SLIP: Self-supervision meets Language-Image Pre-training What you can find in this repo: Pre-trained models (with ViT-Small, Base, Large) and code to

Meta Research 621 Dec 31, 2022
PyTorch implementation of NeurIPS 2021 paper: "CoFiNet: Reliable Coarse-to-fine Correspondences for Robust Point Cloud Registration"

PyTorch implementation of NeurIPS 2021 paper: "CoFiNet: Reliable Coarse-to-fine Correspondences for Robust Point Cloud Registration"

76 Jan 03, 2023
Azua - build AI algorithms to aid efficient decision-making with minimum data requirements.

Project Azua 0. Overview Many modern AI algorithms are known to be data-hungry, whereas human decision-making is much more efficient. The human can re

Microsoft 197 Jan 06, 2023
Classification of EEG data using Deep Learning

Graduation-Project Classification of EEG data using Deep Learning Epilepsy is the most common neurological disease in the world. Epilepsy occurs as a

Osman Alpaydın 5 Jun 24, 2022
ESL: Event-based Structured Light

ESL: Event-based Structured Light Video (click on the image) This is the code for the 2021 3DV paper ESL: Event-based Structured Light by Manasi Mugli

Robotics and Perception Group 29 Oct 24, 2022
Official implementation of MSR-GCN (ICCV 2021 paper)

MSR-GCN Official implementation of MSR-GCN: Multi-Scale Residual Graph Convolution Networks for Human Motion Prediction (ICCV 2021 paper) [Paper] [Sup

LevonDang 42 Nov 07, 2022
Implementation of FitVid video prediction model in JAX/Flax.

FitVid Video Prediction Model Implementation of FitVid video prediction model in JAX/Flax. If you find this code useful, please cite it in your paper:

Google Research 62 Nov 25, 2022
Image Segmentation Animation using Quadtree concepts.

QuadTree Image Segmentation Animation using QuadTree concepts. Usage usage: quad.py [-h] [-fps FPS] [-i ITERATIONS] [-ws WRITESTART] [-b] [-img] [-s S

Alex Eidt 29 Dec 25, 2022
This is an official implementation for the WTW Dataset in "Parsing Table Structures in the Wild " on table detection and table structure recognition.

WTW-Dataset This is an official implementation for the WTW Dataset in "Parsing Table Structures in the Wild " on ICCV 2021. Here, you can download the

109 Dec 29, 2022
OCRA (Object-Centric Recurrent Attention) source code

OCRA (Object-Centric Recurrent Attention) source code Hossein Adeli and Seoyoung Ahn Please cite this article if you find this repository useful: For

Hossein Adeli 2 Jun 18, 2022
Tree Nested PyTorch Tensor Lib

DI-treetensor treetensor is a generalized tree-based tensor structure mainly developed by OpenDILab Contributors. Almost all the operation can be supp

OpenDILab 167 Dec 29, 2022
AutoDeeplab / auto-deeplab / AutoML for semantic segmentation, implemented in Pytorch

AutoML for Image Semantic Segmentation Currently this repo contains the only working open-source implementation of Auto-Deeplab which, by the way out-

AI Necromancer 299 Dec 17, 2022
code for Image Manipulation Detection by Multi-View Multi-Scale Supervision

MVSS-Net Code and models for ICCV 2021 paper: Image Manipulation Detection by Multi-View Multi-Scale Supervision Update 22.02.17, Pretrained model for

dong_chengbo 131 Dec 30, 2022
Attention-based Transformation from Latent Features to Point Clouds (AAAI 2022)

Attention-based Transformation from Latent Features to Point Clouds This repository contains a PyTorch implementation of the paper: Attention-based Tr

12 Nov 11, 2022
StarGAN - Official PyTorch Implementation (CVPR 2018)

StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation

Yunjey Choi 5.1k Dec 30, 2022
Process JSON files for neural recording sessions using Medtronic's BrainSense Percept PC neurostimulator

percept_processing This code processes JSON files for streamed neural data using Medtronic's Percept PC neurostimulator with BrainSense Technology for

Maria Olaru 3 Jun 06, 2022
Edge Restoration Quality Assessment

ERQA - Edge Restoration Quality Assessment ERQA - a full-reference quality metric designed to analyze how good image and video restoration methods (SR

MSU Video Group 27 Dec 17, 2022