Software to help automate collecting crowdsourced annotations using Mechanical Turk.

Overview

Video Crowdsourcing

The goal of this project is to enable crowdsourced collection of annotations on video data. It was built to collect skill annotations on medium-length video snippets (1-2 minutes), but was designed with flexibility in mind so researchers can adapt the code to fit their needs.


How it Works

Videos from a YouTube playlist are used to programmatically build surveys, including a "qualification" survey to verify responses. These surveys are sent to Mechanical Turk to create HITs for crowd workers. Once the HITs are on Mechanical Turk, the software provides tools to manage payments to workers who do and do not pass the qualification questions. Finally, all responses from the workers can be collected in one place.
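As an illustration of the qualification idea, a worker's answers on the qualification videos could be compared against known reference answers. This is a minimal sketch, not the repository's actual logic: the function name passes_qualification, the exact-match rule, and the 0.8 agreement threshold are all assumptions made for the example.

```python
from typing import Mapping


def passes_qualification(
    worker_answers: Mapping[str, int],
    reference_answers: Mapping[str, int],
    min_agreement: float = 0.8,  # assumed threshold, not taken from the project
) -> bool:
    """Return True if the worker agrees with the reference on enough videos."""
    if not reference_answers:
        raise ValueError("reference_answers must not be empty")
    # Count qualification videos where the worker's rating matches the
    # reference exactly; unanswered videos count as disagreements.
    matches = sum(
        1
        for video_id, expected in reference_answers.items()
        if worker_answers.get(video_id) == expected
    )
    return matches / len(reference_answers) >= min_agreement


# Hypothetical video IDs and ratings, for illustration only.
reference = {"vid_a": 3, "vid_b": 1, "vid_c": 5}
print(passes_qualification({"vid_a": 3, "vid_b": 1, "vid_c": 5}, reference))
print(passes_qualification({"vid_a": 3, "vid_b": 2}, reference))
```

A stricter or more lenient rule (for example, allowing ratings within one point of the reference) would only change the comparison inside the generator expression.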


Instructions

1) Install requirements

You will need:

  • Access to a command line (terminal)
  • A copy of this repository
    • git clone https://github.com/mpeven/Video_Crowdsourcing.git
  • Python
    • Note: this can be done easily using Conda to install Python and required libraries
  • Installation of required libraries
    • If using conda: conda install -c conda-forge --file requirements.txt
    • If using pip: pip install -r requirements.txt

2) Run Command Line Interface (CLI)

The CLI can be run with python main.py and should guide you through the rest of the steps outlined below. Refer to this README if more details are needed.

3) Upload videos

  1. Upload videos to YouTube
    • Go to https://studio.youtube.com/ and click 'Create' to upload videos
    • Make sure videos are published and do not have 'Draft' status
    • IMPORTANT: Make sure videos are listed as Unlisted or Public (Private YouTube videos can't be seen in the survey)
  2. Create YouTube playlists for qualification videos and survey (un-annotated) videos
    • Once the videos are uploaded, create these two playlists and move each video into the correct playlist
  3. Put the titles of the YouTube playlists in the SURVEY section of the config file

4) Create surveys

  1. Get access to YouTube Data API
    • Instructions here: link
    • IMPORTANT: Make sure you set "Application type" as Desktop app when you are on the page "Create OAuth client ID"
    • Download the JSON file of the OAuth client secrets and remember the path for the next step
  2. Fill out the needed sections of the config file
    • YOUTUBE section: location of the OAuth client secrets JSON file
    • SURVEY section: number of videos per survey
  3. Create surveys using the option in the CLI
  4. Verify the survey is correct by opening the sample survey in a web browser
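The effect of the "number of videos per survey" setting can be sketched as follows. This is only an illustration under stated assumptions: the config file is assumed to be INI-style, the key names client_secrets_file and videos_per_survey are hypothetical, and chunk_videos is a made-up helper, not a function from this repository.

```python
import configparser
from typing import List

# Hypothetical config contents; the real file's key names may differ.
EXAMPLE_CONFIG = """
[YOUTUBE]
client_secrets_file = /path/to/client_secrets.json

[SURVEY]
videos_per_survey = 3
"""


def chunk_videos(video_ids: List[str], videos_per_survey: int) -> List[List[str]]:
    """Split a playlist's video IDs into consecutive groups, one per survey."""
    if videos_per_survey < 1:
        raise ValueError("videos_per_survey must be at least 1")
    return [
        video_ids[start:start + videos_per_survey]
        for start in range(0, len(video_ids), videos_per_survey)
    ]


config = configparser.ConfigParser()
config.read_string(EXAMPLE_CONFIG)
per_survey = config.getint("SURVEY", "videos_per_survey")

# Placeholder IDs standing in for the videos of the survey playlist.
surveys = chunk_videos(["v1", "v2", "v3", "v4", "v5", "v6", "v7"], per_survey)
print(surveys)  # [['v1', 'v2', 'v3'], ['v4', 'v5', 'v6'], ['v7']]
```

Note that the last survey may hold fewer videos when the playlist length is not a multiple of the setting.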

5) MTURK steps

  1. Create an AWS account
    • Instructions here: link
    • Put the access keys in the config file
  2. Create a Mechanical Turk Account
  3. Create a Mechanical Turk "Sandbox" Account for testing
  4. Upload sandbox-mode HITs using CLI
  5. Upload live HITs using CLI
  6. Periodically check on status and manage payments
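The difference between sandbox-mode and live HITs comes down to which Mechanical Turk requester endpoint the client talks to. The sketch below assumes the boto3 library is used to reach Mechanical Turk, which this README does not confirm; mturk_client_kwargs is a hypothetical helper and the keys shown are placeholders.

```python
# Requester endpoints published by AWS for Mechanical Turk.
SANDBOX_ENDPOINT = "https://mturk-requester-sandbox.us-east-1.amazonaws.com"
LIVE_ENDPOINT = "https://mturk-requester.us-east-1.amazonaws.com"


def mturk_client_kwargs(sandbox: bool, access_key: str, secret_key: str) -> dict:
    """Build keyword arguments for a Mechanical Turk client (hypothetical helper)."""
    return {
        "service_name": "mturk",
        "region_name": "us-east-1",
        # Sandbox HITs cost nothing and are only visible on the worker sandbox site.
        "endpoint_url": SANDBOX_ENDPOINT if sandbox else LIVE_ENDPOINT,
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
    }


# Placeholder credentials; real keys would come from the config file.
kwargs = mturk_client_kwargs(True, "EXAMPLE_ACCESS_KEY", "EXAMPLE_SECRET_KEY")
print(kwargs["endpoint_url"])

# With boto3 installed, a client could then be created with:
#   import boto3
#   client = boto3.client(**kwargs)
```

Keeping the endpoint choice behind a single flag makes it harder to publish paid HITs by accident while testing.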

Authors

  • Michael Peven (main contact - mpeven@jhu.edu)
  • Tingwen Guo

This work builds upon previous work done by Anand Malpani and Colin Lea


Acknowledgements

We would like to thank the following for support and funding:

  • Swaroop Vedula
  • Gregory Hager
  • Science of Learning Institute