This is a practice on Airflow, which is building virtual env, installing Airflow and constructing data pipeline (DAGs)

Overview

airflow-test

This is a practice on Airflow, which is

First of all, build virtual env using virtualbox

This is optional but highly recommended

  • It is very important to make sure that development env is independent and has no conflict with other env setting

Next, make python virtual env on proper directory

python3 -m venv sandbox
source ~/sandbox/bin/activate

Let start to install Airflow and initial setting

Install Airflow

pip3 install apache-airflow==2.1.0 \ ## airflow version depends on when you install and constraint 
  --constraint 'git address' ## constraint is ???. It is optional. refer to https://gist.github.com/marclamberti

Airflow Inital Setting

airflow db init ## airflow metadata initializing
airflow users create -u admin -p admin -f jaeyoung -l jang -r Admin -e [email protected] ## create admin user

airflow webserver ## launch web server. after that we can access airflow web server through localhost:8080
airflow scheduler ## activate airfow scheduler to run DAGs

Construct data pipeline with Airflow

One example of data pipeline (one DAGs) - let follow below steps and make data pipeline

  • creating table -> is-api-available -> extending -> processing-use ->
  • what is it? when new user comes in, sense that and process and store user info to destination (?)
  1. create table
  • make python script user_processing.py in dags folder
  • let create table using sqlite operator in user_processing.py
from airflow.models import DAG
from airflow.providers.sqlite.operators.sqlite import SqliteOperator

from datetime import datetime

default_args = {
    'start_date': datetime(2020, 1, 1)
}

with DAG('user_processing', schedule_interval='@daily', default_args=default_args, catchup=False) as dag:
    # Define tasks/operators

    creating_talbe = SqliteOperator(
        task_id='creating_table',
        sqlite_conn_id='db_sqlite',
        sql='''
            CREATE TABLE users (
                firstname TEXT NOT NULL,
                lastname TEXT NOT NULL,
                country TEXT NOT NULL,
                username TEXT NOT NULL,
                password TEXT NOT NULL,
                email TEXT NOT NULL PRIMARY KEY
            );
            '''
    )
  • set sqlite connection on Airflow Web UI
pip install apache-airflow-providers-sqlite ## install airflow provider of sqlite. refer airflow doc. may already be installed from intial installing Airflow..
# next, make a new sqltie connection named 'db_sqlite' in Airflow
  • after that, do test airflow tasks!

airflow tasks test user_processing creating_table 2020-01-01

  • one more things to do just for check again
sqlite3 airflow.db ## connect to sqlite3 in airflow.db
sqlite>.tables; ## see the tables in airflow.dbb
sqlite>select * form users; ## test query
Owner
Jaeyoung
Jaeyoung
Simple package to make requests throughout Tor with circuit renewal.

AutoTor Table of Contents About the Project Contents Dependencies Getting Started Installation Coding Contributing About the Project Simple package to

Salvador Belenguer 6 Jan 01, 2023
A simple, light-weight and highly maintainable online judge system for secondary education

y³OJ a simple, light-weight and highly maintainable online judge system for secondary education 一个简单、轻量化、易于维护的、为中学信息技术学科课业教学设计的 Online Judge 系统。 Onlin

20 Oct 04, 2022
Audio-analytics for music-producers! Automate tedious tasks such as musical scale detection, BPM rate classification and audio file conversion.

Click here to be re-directed to the Beat Inspect Streamlit Web-App You are a music producer? Let's get in touch via LinkedIn Fundamental Analytics for

Stefan Rummer 11 Dec 27, 2022
School helper, helps you at your pyllabus's.

pyllabus, helps you at your syllabus's... WARNING: It won't run without config.py! You should add config.py yourself, it will include your APIKEY. e.g

Ahmet Efe AKYAZI 6 Aug 07, 2022
This repo holds custom callback plugin, so your Ansible could write everything in the PostgreSQL database.

English What is it? This is callback plugin that dumps most of the Ansible internal state to the external PostgreSQL database. What is this for? If yo

Sergey Pechenko 19 Oct 21, 2022
Tensorboard plugin 3d with python

tensorboard-plugin-3d Overview In this example, we render a run selector dropdown component. When the user selects a run, it shows a preview of all sc

KitwareMedical 26 Nov 14, 2022
Telegram bot to search quotes from brainyquote.com

Brainy Quote Bot @BrainQuoteBot A star ⭐ from you means a lot to us! Telegram bot to search quotes from brainyquote.com Usage Deploy to Heroku Tap on

21 Nov 24, 2022
Retrying is an Apache 2.0 licensed general-purpose retrying library, written in Python, to simplify the task of adding retry behavior to just about anything.

Retrying Retrying is an Apache 2.0 licensed general-purpose retrying library, written in Python, to simplify the task of adding retry behavior to just

Ray Holder 1.9k Dec 29, 2022
A Python program for calculating the 95%CI for GNSS-derived site velocities

GNSS_Vel_95%CI A Python program for calculating the 95%CI for GNSS-derived site velocities Function_GNSS_95CI.py is a Python function for calculating

<a href=[email protected]"> 4 Dec 16, 2022
Automated Changelog/release note generation

Quickly generate changelogs and release notes by analysing your git history. A tool written in python, but works on any language.

Documatic 95 Jan 03, 2023
🙌Kart of 210+ projects based on machine learning, deep learning, computer vision, natural language processing and all. Show your support by ✨ this repository.

ML-ProjectKart 📌 Repository This kart showcases the finest collection of all projects based on machine learning, deep learning, computer vision, natu

Prathima Kadari 203 Dec 28, 2022
CarolinaCon CTF Online

CarolinaCon Online CTF CTF challenges from CarolinaCon Online April 23 through April 25, 2021. All challenges from the CTF will eventually be here. Co

49th Security Division 6 May 04, 2022
Script that creates graphical representations of Julia an Mandelbrot sets.

Julia and Mandelbrot Picture Maker This simple functions create simple plots of the Julia and Mandelbrot sets. The Julia set require the important par

Juan Riera Gomez 1 Jan 10, 2022
Standard mutable string (character array) implementation for Python.

chararray A standard mutable character array implementation for Python.

Tushar Sadhwani 3 Dec 18, 2021
Virtual webcam that takes real webcam footage and replaces the background in order to have Virtual Backgrounds in MS Teams for Linux where the feature is unimplemented.

Background Remover The Need It's been good long while since Microsoft first released a Teams version for Linux and yet, one of Teams' coolest features

Dylan Turner 80 Dec 20, 2022
🤖️ Plugin for Sentry which allows sending notification via DingTalk robot.

Sentry DingTalk Sentry 集成钉钉机器人通知 Requirments sentry = 21.5.1 特性 发送异常通知到钉钉 支持钉钉机器人webhook设置关键字 配置环境变量 DINGTALK_WEBHOOK: Optional(string) DINGTALK_CUST

1 Nov 04, 2021
Incident Response Process and Playbooks | Goal: Playbooks to be Mapped to MITRE Attack Techniques

PURPOSE OF PROJECT That this project will be created by the SOC/Incident Response Community Develop a Catalog of Incident Response Playbook for every

Austin Songer 987 Jan 02, 2023
Web3 Solidity Connector

With this project, you can compile your sol files and create new transactions including creating contract and calling the state changer functions. You can integrate integrate your sol files with Pyth

Fethi Tekyaygil 3 Oct 09, 2022
Ballistic calculator for Airsoft

Ballistic-calculator-for-Airsoft 用于Airsoft的弹道计算器 This is a ballistic calculator for airsoft gun. To calculate your airsoft gun's ballistic, you should

3 Jan 20, 2022
A Desktop application for the signalum python library

Signalum Desktop A Desktop application on the Signalum Python Library/CLI Tool. The Signalum Desktop application is an attempt to develop a single too

BISOHNS 35 Feb 15, 2021