MidTerm Project for the Data Analysis FT Bootcamp, Adam Tycner and Florent ZAHOUI

Overview

MidTermProject_IH_DA_02_22

MidTerm Project for the Data Analysis FT Bootcamp, Adam Tycner and Florent ZAHOUI Hallo

 + Adam TYCNER, Florent ZAHOUI
 + Modelling project :the objective is to predict the selling price of houses based on the available features in the dataset
 + Additional tasks: Business Intelligence : Explore the characteristics of the data and visualise interesting insights 
 + Source : https://github.com/raafat-hantoush/IH_RH_DA_FT_JAN_2022/tree/main/Class_Materials/MidTerm_Project/Data
 + Number of rows and columns : (21597, 21)
 + Number of features : 20
 + Number of targets : 1 (price)
 + High-level Planning of the project :
     - Day 1 - EDA
     - Day 2 - Plotting / Visualization
     - Day 3 - Modelling ( Transformation, encoding)
     - Day 4 - Modelling ( Split, Model Accuracy) / Presentation ( Slides / Notebook / Tableau preparation) / Mock presentation
     - Day 5 - Presentation in front of the class
Owner
Florent Zahoui
Previously working as an account manager. Looking forward to developing my data analytics skills.
Florent Zahoui
Approximate Nearest Neighbor Search for Sparse Data in Python!

Approximate Nearest Neighbor Search for Sparse Data in Python! This library is well suited to finding nearest neighbors in sparse, high dimensional spaces (like text documents).

Meta Research 906 Jan 01, 2023
Python ELT Studio, an application for building ELT (and ETL) data flows.

The Python Extract, Load, Transform Studio is an application for performing ELT (and ETL) tasks. Under the hood the application consists of a two parts.

Schlerp 55 Nov 18, 2022
Senator Trades Monitor

Senator Trades Monitor This monitor will grab the most recent trades by senators and send them as a webhook to discord. Installation To use the monito

Yousaf Cheema 5 Jun 11, 2022
fds is a tool for Data Scientists made by DAGsHub to version control data and code at once.

Fast Data Science, AKA fds, is a CLI for Data Scientists to version control data and code at once, by conveniently wrapping git and dvc

DAGsHub 359 Dec 22, 2022
Datashredder is a simple data corruption engine written in python. You can corrupt anything text, images and video.

Datashredder is a simple data corruption engine written in python. You can corrupt anything text, images and video. You can chose the cha

2 Jul 22, 2022
Find exposed data in Azure with this public blob scanner

BlobHunter A tool for scanning Azure blob storage accounts for publicly opened blobs. BlobHunter is a part of "Hunting Azure Blobs Exposes Millions of

CyberArk 250 Jan 03, 2023
Port of dplyr and other related R packages in python, using pipda.

Unlike other similar packages in python that just mimic the piping syntax, datar follows the API designs from the original packages as much as possible, and is tested thoroughly with the cases from t

179 Dec 21, 2022
A forecasting system dedicated to smart city data

smart-city-predictions System prognostyczny dedykowany dla danych inteligentnych miast Praca inżynierska realizowana przez Michała Stawikowskiego and

Kevin Lai 1 Nov 08, 2021
Python scripts aim to use a Random Forest machine learning algorithm to predict the water affinity of Metal-Organic Frameworks

The following Python scripts aim to use a Random Forest machine learning algorithm to predict the water affinity of Metal-Organic Frameworks (MOFs). The training set is extracted from the Cambridge S

1 Jan 09, 2022
Elementary is an open-source data reliability framework for modern data teams. The first module of the framework is data lineage.

Data lineage made simple, reliable, and automated. Effortlessly track the flow of data, understand dependencies and analyze impact. Features Visualiza

898 Jan 09, 2023
Package for decomposing EMG signals into motor unit firings, as used in Formento et al 2021.

EMGDecomp Package for decomposing EMG signals into motor unit firings, created for Formento et al 2021. Based heavily on Negro et al, 2016. Supports G

13 Nov 01, 2022
Created covid data pipeline using PySpark and MySQL that collected data stream from API and do some processing and store it into MYSQL database.

Created covid data pipeline using PySpark and MySQL that collected data stream from API and do some processing and store it into MYSQL database.

2 Nov 20, 2021
Python tools for querying and manipulating BIDS datasets.

PyBIDS is a Python library to centralize interactions with datasets conforming BIDS (Brain Imaging Data Structure) format.

Brain Imaging Data Structure 180 Dec 18, 2022
Learn machine learning the fun way, with Oracle and RedBull Racing

Red Bull Racing Analytics Hands-On Labs Introduction Are you interested in learning machine learning (ML)? How about doing this in the context of the

Oracle DevRel 55 Oct 24, 2022
PySpark Structured Streaming ROS Kafka ApacheSpark Cassandra

PySpark-Structured-Streaming-ROS-Kafka-ApacheSpark-Cassandra The purpose of this project is to demonstrate a structured streaming pipeline with Apache

Zekeriyya Demirci 5 Nov 13, 2022
Evaluation of a Monocular Eye Tracking Set-Up

Evaluation of a Monocular Eye Tracking Set-Up As part of my master thesis, I implemented a new state-of-the-art model that is based on the work of Che

Pascal 19 Dec 17, 2022
Uses MIT/MEDSL, New York Times, and US Census datasources to analyze per-county COVID-19 deaths.

Covid County Executive summary Setup Install miniconda, then in the command line, run conda create -n covid-county conda activate covid-county conda i

Ahmed Fasih 1 Dec 22, 2021
Catalogue data - A Python Scripts to prepare catalogue data

catalogue_data Scripts to prepare catalogue data. Setup Clone this repo. Install

BigScience Workshop 3 Mar 03, 2022
Using Python to scrape some basic player information from www.premierleague.com and then use Pandas to analyse said data.

PremiershipPlayerAnalysis Using Python to scrape some basic player information from www.premierleague.com and then use Pandas to analyse said data. No

5 Sep 06, 2021
INFO-H515 - Big Data Scalable Analytics

INFO-H515 - Big Data Scalable Analytics Jacopo De Stefani, Giovanni Buroni, Théo Verhelst and Gianluca Bontempi - Machine Learning Group Exercise clas

Yann-Aël Le Borgne 58 Dec 11, 2022