Python programming language Test

Overview

Exercise

You are tasked with creating a data-processing app that pre-processes and enriches the data coming from crawlers, with the following requirements.

  • INPUT: csv-like data submitted by crawlers
  • OUTPUT: clean data saved into mongodb collections
  1. The app is an HTTP API server. Every year, crawlers will submit the data saved in a file, using the API endpoint designed by you.
  2. Examples of data the crawlers will submit every year: see data-2018.txt, data-2019.txt, data-2020.txt. You can't change the format of the data.
  3. As you can see, the data coming from crawlers is not 100% well-structured, the API should parse it correctly.
  4. a repeated submission with the data of the same year should perform an update on the existing yearly data.
  5. if there is any error in the submission or processing, the API should return a proper error message with proper HTTP response status
  6. For each university, enrich the data with a URL and a text description of it using Duckduckgo API e.g. https://api.duckduckgo.com/?q=harvard&format=json&pretty=1
  7. The app inserts or updates clean data in 2 mongodb tables/collections:
  • table 1 - the yearly data table
  • table 2 - the universities info
  1. table 1 contains data from every year, table 2 contains only the latest data.
  2. the data processing and transformations should be covered by tests.
  3. The solution should be in Python programming language, however you may use any 3rd party library you like.

Feel free to clarify the requirements further, if you have any doubts.

Bonus (Optional)

If you have indicated any DevOps skillsets in your resume, please create a Dockerfile, and using docker deploy the web app onto a free cloud platform, such as Heroku.

How the solution is assessed

The criteria are as follows (descending importance)

  1. Your code should perform the functionalities required
  2. Your code should be well-covered by tests
  3. Your code should be modular, readable and maintenable by other engineers.
  4. Your code should be robust, and can handle failure such as missing field, disconnection from DB or external server.
  5. Your code should be efficient and fast.
  6. Your code should be pretty.
Owner
Monirul Islam Khan
Database Engineer | DBA | Data Analyst | Data Scientist | Dev Ops
Monirul Islam Khan
Learn Python Regular Expressions step by step from beginner to advanced levels

Python re(gex)? Learn Python Regular Expressions step by step from beginner to advanced levels with hundreds of examples and exercises The book also i

Sundeep Agarwal 1.3k Dec 28, 2022
A place where the most basic, basic of python coding exists

python-basics A place where the most basic, basic of python coding exists As you can see, there are four folders and the best order to read is: appeti

Chuqin 2 Oct 05, 2022
A gamey, snakey esoteric programming language

Snak Snak is an esolang based on the classic snake game. Installation You will need python3. To use the visualizer, you will need the curses module. T

David Rutter 3 Oct 10, 2022
This repo created to complete the task HACKTOBER 2021, contribute now and get your special T-Shirt & Sticker. TO SUPPORT OWNER PLEASE PRESS STAR BUTTON

❤ THIS REPO WILL CLOSED IN 31 OCT 00:00 ❤ This repository will automatically assign the hacktoberfest and hacktoberfest-accepted labels to all submitt

Rajendra Rakha 307 Dec 27, 2022
scap is a tool for putting code in places and for other purposes

Scap is the deployment script used by Wikimedia Foundation to publish code and configuration on production web servers.

Wikimedia 7 Nov 02, 2022
A nonebot2 plugin, send news information in a picture form.

A nonebot2 plugin, send news information in a picture form.

幼稚园园长 7 Nov 18, 2022
Demo of patching a python context manager

patch-demo-20211203 demo of patching a python context manager poetry install poetry run python -m my_great_app to run the code poetry run pytest to te

Brad Smith 1 Feb 09, 2022
Voldemort's Python import helper

importmagician Voldemort's Python import helper pip install importmagician Import from uninstalled Python directories Say you have a directory (relat

Zhengyang Feng 4 Mar 09, 2022
Traits for Python3

Do you like Python, but think that multiple inheritance is a bit too flexible? Are you looking for a more constrained way to define interfaces and re-use code?

121 Nov 15, 2022
Creates infinite amount of guilded accounts in seconds.

Guilded Cookie Creator [fuck guilded i quit working on this, they patch like every fucking method after 2/3 days i release shit] Optimizations Asynchr

scripted 7 Feb 28, 2022
Twikoo自定义表情列表 | HexoPlusPlus自定义表情列表(其实基于OwO的项目都可以用的啦)

Twikoo-Magic 更新说明 2021/1/15 基于2021/1/14 Twikoo 更新1.1.0-beta,所有表情都将以缩写形式(如:[ text ]:)输出。1/14之前本仓库有部分表情text缺失及重复, 导致无法正常使用表情 1/14后的所有表情json列表已全部更新

noionion 90 Jan 05, 2023
A person does not exist image bot

A person does not exist image bot

Fayas Noushad 3 Dec 12, 2021
Streamlit Component, for a Chatbot UI

st-chat Streamlit Component, for a Chat-bot UI, example app authors - @yashppawar & @YashVardhan-AI Installation Install streamlit-chat with pip pip i

Yash AI 99 Jan 07, 2023
Very Simple 2 Message Spammer!

Very Simple 2 Message Spammer!

Syntax. 4 Dec 06, 2022
Use a real time weather API to apply wind to your mouse cursor.

wind-cursor Use a real time weather API to apply wind to your mouse cursor. Requirements PyAutoGUI pyowm Usage This program uses the OpenWeatherMap AP

Andreas Schmid 1 Feb 07, 2022
Advanced python code - For students in my advanced python class

advanced_python_code For students in my advanced python class Week Topic Recordi

Ariel Avshalom 3 May 27, 2022
This is a backport of the BaseExceptionGroup and ExceptionGroup classes from Python 3.11.

This is a backport of the BaseExceptionGroup and ExceptionGroup classes from Python 3.11. It contains the following: The exceptiongroup.BaseExceptionG

Alex Grönholm 19 Dec 15, 2022
Cylinder volume calculator features the calculations of the volume of a Right /oblique full cylinder

Cylinder-Volume-Calculator Cylinder volume calculator features the calculations of the volume of a Right /oblique full cylinder. Size : 10.5 mb compat

Abhijeet 4 Nov 07, 2022
Display your data in an attractive way in your notebook!

Bloxs Bloxs is a simple python package that helps you display information in an attractive way (formed in blocks). Perfect for building dashboards, re

MLJAR 192 Dec 28, 2022
Experiments with Tox plugin system

The project is an attempt to add to the tox some missing out of the box functionality. Basically it is just an extension for the tool that will be loa

Volodymyr Vitvitskyi 30 Nov 26, 2022