Azaria Gebremichael Azariagmt

Azariagmt / setup_kafka.sh
Created January 9, 2022 11:55
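Set up Kafka
# Download the Kafka 3.0.0 (Scala 2.12) archive unless it is already present.
if [ ! -f kafka_2.12-3.0.0.tgz ]; then
    wget https://dlcdn.apache.org/kafka/3.0.0/kafka_2.12-3.0.0.tgz
fi
# Extract only if the target directory does not exist yet (-d tests for a directory).
if [ ! -d kafka_2.12-3.0.0 ]; then
    tar -xvf kafka_2.12-3.0.0.tgz
fi
# Move the extracted tree into place; writing to /usr/local typically requires root.
mv kafka_2.12-3.0.0 /usr/local/kafka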
Azariagmt / setup_zookeeper.sh
Created January 9, 2022 11:54
Set up ZooKeeper
# Download the ZooKeeper 3.6.3 binary archive unless it is already in the current directory.
if [ ! -f apache-zookeeper-3.6.3-bin.tar.gz ]; then
    wget https://dlcdn.apache.org/zookeeper/zookeeper-3.6.3/apache-zookeeper-3.6.3-bin.tar.gz
fi
tar -xzf apache-zookeeper-3.6.3-bin.tar.gz
# mv relocates a directory as-is; it takes no -r option.
mv apache-zookeeper-3.6.3-bin /usr/local/zookeeper
mkdir -p /var/lib/zookeeper
# Write the ZooKeeper configuration file.
cat > /usr/local/zookeeper/conf/zoo.cfg << EOF
tickTime=2000
dataDir=/var/lib/zookeeper
LOAD DATA LOCAL INFILE
'/usr/local/airflow/include/I80_stations.csv'
INTO TABLE I80_stations
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n';
Azariagmt / create_I80_stations.SQL
Created October 9, 2021 09:35
SQL script to create I80_stations table
CREATE TABLE IF NOT EXISTS I80_stations
(
    ID INT,
    Fwy INT,
    Dir CHAR(20),
    District INT,
    County INT,
    City FLOAT,
    State_PM FLOAT,
    Abs_PM FLOAT,
    Latitude FLOAT,
Azariagmt / constraints.py
Created September 1, 2021 04:51
causalnex model constraints class
from causalnex.structure.notears import from_pandas, from_pandas_lasso


class Constraints:
    """
    Helps apply manual interventions to a structural model
    """

    def __init__(self, structural_model: from_pandas_lasso = None):
        self.structural_model = structural_model
Azariagmt / first_causal_graph.py
Created September 1, 2021 04:41
Creates initial unmodified causal graph
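import pandas as pd

import preprocess_data  # assumption: local module that provides check_numeric (see preprocess.py)

data = pd.read_csv("../data/data.csv")
# Keep a subset of features plus the target column.
df = data[['perimeter_mean', 'concavity_mean',
           'radius_worst', 'perimeter_worst', 'area_worst',
           'diagnosis']]
print("DataFrame loaded")
df, non_numeric_cols = preprocess_data.check_numeric(df)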
Azariagmt / construct_lr_model.py
Created September 1, 2021 04:34
Takes in dataframe and returns a fitted Logistic regression classification model
import pandas as pd
from sklearn.linear_model import LogisticRegression


def construct_model(data: pd.DataFrame) -> LogisticRegression:
    """Constructs classification model

    Args:
        data (pd.DataFrame): DataFrame used to train and evaluate the classification model

    Returns:
        LogisticRegression: Logistic regression model to be used for classification
    """
    # Features are every column except the target label.
    X = data.drop('diagnosis', axis=1)
Azariagmt / draw_causal_graph.py
Created September 1, 2021 04:26
Function which takes in structural model and draws causal graph
from causalnex.structure.notears import from_pandas_lasso


def draw_graph(structural_model: from_pandas_lasso, path, prog="dot"):
    """Draws causal graph

    Args:
        structural_model (from_pandas_lasso): Structural model of causalnex
        prog (str, optional): Graphviz layout program used to draw the pygraphviz graph. Defaults to "dot".

    Returns:
        image (png): Causal graph image
    """
Azariagmt / construct_structural_model.py
Created September 1, 2021 04:18
Constructs causalnex structural model
import pandas as pd
from causalnex.plots import plot_structure, NODE_STYLE, EDGE_STYLE
from causalnex.structure import notears
from causalnex.structure.notears import from_pandas, from_pandas_lasso


def construct_structural_model(df: pd.DataFrame, notears=from_pandas_lasso, tabu_parent_nodes=None) -> notears:
    """Constructs structural model to be used to draw causal graph

    Args:
        df (pd.DataFrame): Preprocessed DataFrame from which the structural model is built
        notears (callable, optional): NOTEARS structure-learning function to apply. Defaults to from_pandas_lasso.
Azariagmt / preprocess.py
Created September 1, 2021 04:13
Preprocessing script for causalnex Medium article
import pandas as pd
from sklearn.preprocessing import LabelEncoder


def check_numeric(df: pd.DataFrame) -> list:
    """Checks for non-numeric columns

    Args:
        df (pd.DataFrame): DataFrame to be checked for non-numeric values

    Returns:
        struct_data (pd.DataFrame): Copied DataFrame