Repository for DNN training, theory to practice, part of the Large Scale Machine Learning class at Mines Paritech

Overview

DNN Training, from theory to practice

This repository is complementary to the deep learning training lesson given to les Mines ParisTech on the 11th of March 2022 as part of the Large Scale Machine Learning class.

You can find here the slides of the class.

Requirements

To get started, clone it and prepare a new virtual env.

git clone https://github.com/adefossez/dnn_theo_practice
cd dnn_theo_practice
python3 -m venv env
source env/bin/activate
python3 -m pip install -r requirements.txt

Note: it can be safer to install PyTorch through a conda environment to make sure all proper versions of CUDA realted libraries are installed and used. We use pip here for simplicity.

Basic training pipeline

To get started, you can run

python -m basic.train

You can tweak some hyper parameters:

python -m basic.train --lr 0.1 --epochs 30 --model mobilenet_v2

This basic pipeline provides all the essential tools for training a neural network:

  • automatic experiment naming,
  • logging and metric dumping,
  • checkpointing with automatic resume.

Looking at basic/train.py you will see that 90% of the code is not deep learning but pure engineering. Some frameworks like PyTorch Lightning can save you some of this trouble, at the cost of losing control and understanding over what happens. In any case it is good to have an idea of how things work under the hood!

PyTorch-Lightning training pipeline

Insite the pl_hydra folder, I provide the same pipeline, but using PyTorch-Lightning along with Hydra, as an alternative to argparse. Have a look at pl_hydra/train.py to see the differences with the previous implementation.

python -m pl_hydra.train optim.lr=0.1 model=mobilenet_v2

Using existing frameworks:

At this point, it is a good time to introduce a few frameworks you might want to use for your projects.

Hydra

Hydra handles things like logging, configuration parsing (based on YAML files, which is a bit nicer than argparse, especially for large projects), and also has support for some grid search scheduling with a dedicated language. It also supports meta-optimizers like Nevergrad (see after).

Nevergrad

Nevergrad is a framework for gradient free optimization. It can be used to automatically tune your model or optimization hyper-parameters with smart random search.

PyTorch-Lightning

PyTorch Lightning takes care of logging, distributed training, checkpointing and many more boilerplate parts of a deep learning research project. It is powerful but also quite complex, and you will lose some control over the training pipeline.

Dora

Dora is an experiment management framework:

  • Grid searches are expressed as pure python.
  • Experiments have an automatic signature assigned based on its args.
  • Keeps in sync experiments defined in grid files, and those running on the cluster.
  • Basic terminal based reporting of job states, metrics etc.

Dora allows you to scale up to hundreds of experiments without losing your sanity.

Plotting and monitoring utilities

While it is always good to have basic metric reporting inside logs, it can be more conveniant to track experimental progress through a web browser. TensorBoard, initially developed for TensorFlow provide just that. A fully hosted alternative is Wandb. Finally, HiPlot is a lightweight package to easily make sense of the impact of hyperparameters on the metrics of interest.

Unix tools

It is a good idea to learn to master the standard Unix/Linux tools! For large scale machine learning, you will often have to run experiments on a remote cluster, with only SSH access. tmux is a must have, as well as knowing at least of one terminal based file editor (nano is the simplest, emacs or vim are more complex but quite powerful). Take some time to learn about tuning your bashrc, setting up aliases for often used commands etc.

You will probably need tools like grep, less, find or ack. I personnaly really enjoy fd, an alternative to find with some intuitive interface. Similarly ag is a nice way to quickly look through a codebase in the terminal. If you need to go through a lot of logs, you will enjoy ripgreg.

License

This code in this repository is released into the public domain. You can freely reuse any part of it and you don't even need to say where you found it! See the LICENSE for more information.

The slides are released under Creative Commons CC-BY-NC.

Owner
Alexandre Défossez
Alexandre Défossez
A webapp for taking fast notes, designed for business, school, and collaboration with groups.

JOTS Journal of the Session A webapp for taking fast notes, designed for business, school, and collaboration with groups.

Zebadiah S. Taylor 2 Jun 10, 2022
Chicks get hostloc points regularly

hostloc_getPoints 小鸡定时获取hostloc积分 github action大规模失效,mjj平均一人10鸡,以下可以部署到自己的小鸡上

59 Dec 28, 2022
An After Effects render queue for ShotGrid Toolkit.

AEQueue An After Effects render queue for ShotGrid Toolkit. Features Render multiple comps to locations defined by templates in your Toolkit config. C

Brand New School 5 Nov 20, 2022
The Official Jaseci Code Repository

Jaseci Release Notes Version 1.2.2 Updates Added new built-ins for nodes and edges (context, info, and details) Fixed dot output Added reset command t

136 Dec 20, 2022
Python library for creating and parsing HSReplay XML files

python-hsreplay A python module for HSReplay support. https://hearthsim.info/hsreplay/ Installation The library is available on PyPI. pip install hsre

HearthSim 45 Mar 28, 2022
Program Input Data Mahasiswa Oop

PROGRAM INPUT NILAI MAHASISWA MENGGUNAKAN OOP PENGERTIAN OOP object-oriented-programing/OOP adalah paradigma pemrograman berdasarkan konsep "objek", y

Maulana Reza Badrudin 1 Jan 05, 2022
Senior Comprehensive Project For Python

Senior Comprehensive Project Author: Grey Hutchinson My project, which I nicknamed “Murmur”, was to create a research tool that would use neural netwo

1 May 29, 2022
Usos Semester average helper

Usos Semester average helper Dzieki temu skryptowi mozesz sprawdzic srednia ocen na kazdy odbyty przez ciebie semestr PARAMETERS required: '--username

2 Jan 17, 2022
IOP Support for Python (Experimental)

TAGS Experimental IOP Framework for Python WARNING: Currently, this project has NO EXCEPTION HANDLING. USE AT YOUR OWN RISK! I. Introduction to Interf

1 Oct 22, 2021
A basic ticketing software.

Ticketer A basic ticketing software. Screenshots Program Launched Issuing Ticket Show your Ticket Entry Done Program Exited Code Features to implement

Samyak Jain 2 Feb 10, 2022
A topology optimization framework written in Taichi programming language, which is embedded in Python.

Taichi TopOpt (Under Active Development) Intro A topology optimization framework written in Taichi programming language, which is embedded in Python.

Li Zhehao 41 Nov 17, 2022
Slotscheck - Find mistakes in your slots definitions

🎰 Slotscheck Adding __slots__ to a class in Python is a great way to reduce mem

Arie Bovenberg 67 Dec 31, 2022
NFT-Image-Generator - Utility to generate a large collection of unique images

NFT-Image-Generator Utility for creating a generative art collection from suppli

Sem Moolenschot 60 Dec 15, 2022
Hacking and Learning consistently for 100 days straight af.

#100DaysOfHacking Hacking and Learning consistently for 100 days straight af. [yes, no breaks except mental-break ones, Obviously.] This Repo is one s

FENIL SHAH 17 Sep 09, 2022
Package pyVHR is a comprehensive framework for studying methods of pulse rate estimation relying on remote photoplethysmography (rPPG)

Package pyVHR (short for Python framework for Virtual Heart Rate) is a comprehensive framework for studying methods of pulse rate estimation relying on remote photoplethysmography (rPPG)

PHUSE Lab 261 Jan 03, 2023
A tool to quickly create codeforces contest directories with templates.

Codeforces Template Tool I created this tool to help me quickly set up codeforces contests/singular problems with templates. Tested for windows, shoul

1 Jun 02, 2022
RISE allows you to instantly turn your Jupyter Notebooks into a slideshow

RISE RISE allows you to instantly turn your Jupyter Notebooks into a slideshow. No out-of-band conversion is needed, switch from jupyter notebook to a

Damian Avila 3.4k Jan 04, 2023
Data 25 Star Wars Project With Python

Data 25 Star Wars Project Instructions The character data in your MongoDB database has been pulled from https://swapi.tech/. As well as 'people', the

1 Dec 24, 2021
Automates the fixing of problems reported by yamllint by parsing its output

yamlfixer yamlfixer automates the fixing of problems reported by yamllint by parsing its output. Usage This software automatically fixes some errors a

OPT Nouvelle Caledonie 26 Dec 26, 2022
Extend the maya channel box with searchability and colour

channel-box-plus will add search-ability over its attributes, and it will colour user defined attributes, making them easier to distinguish.

Robert Joosten 12 Jun 08, 2022