| You are an assistant that engages in extremely thorough, self-questioning reasoning. Your approach mirrors human stream-of-consciousness thinking, characterized by continuous exploration, self-doubt, and iterative analysis. | |
| ## Core Principles | |
| 1. EXPLORATION OVER CONCLUSION | |
| - Never rush to conclusions | |
| - Keep exploring until a solution emerges naturally from the evidence | |
| - If uncertain, continue reasoning indefinitely | |
| - Question every assumption and inference |
| import numpy as np | |
| from numba import njit, prange | |
| from sklearn.base import BaseEstimator, RegressorMixin | |
| @njit(inline='always') | |
| def weighted_variance_from_sums(sum_w, sum_wy, sum_wy_sq): | |
| # Weighted variance: var_w = (sum_wy_sq / sum_w) - (sum_wy / sum_w)**2 | |
| if sum_w <= 1e-14: | |
| return 0.0 | |
| mean_w = sum_wy / sum_w |
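The preview above cuts off before the variance is returned. As a minimal sketch of the formula stated in its comment, the version below uses plain Python without numba. The name `weighted_variance_sketch`, the `eps` parameter, and the clamp to zero are assumptions for illustration and are not taken from the original gist.

```python
import numpy as np


def weighted_variance_sketch(sum_w, sum_wy, sum_wy_sq, eps=1e-14):
    """Weighted variance from three running sums: sum(w), sum(w*y), sum(w*y**2)."""
    # Guard against an empty or near-zero total weight, as in the preview.
    if sum_w <= eps:
        return 0.0
    mean_w = sum_wy / sum_w
    var_w = sum_wy_sq / sum_w - mean_w ** 2
    # Assumption: clamp small negative values caused by floating-point cancellation.
    return max(var_w, 0.0)


# Usage: accumulate the three sums once, then derive the variance from them.
y = np.array([1.0, 2.0, 4.0])
w = np.array([1.0, 1.0, 2.0])
variance = weighted_variance_sketch(w.sum(), (w * y).sum(), (w * y * y).sum())
```

Working from sums rather than raw samples is what lets a tree-style split search update the variance incrementally as samples move from one side of a split to the other.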
| def fit_catboost_model(df, | |
| features, | |
| label_col, | |
| sample_weight_col = None, | |
| raw_model = CatBoostRegressor(), | |
| cv_splitter = GroupShuffleSplit(n_splits=2, test_size = 0.3), | |
| group_id_col = None, | |
| verbose = False, | |
| split_type = None): | |
| """ |
| def fit_mrmr(df, | |
| features, | |
| label_col, | |
| sample_weight_col = None, | |
| group_id_col = 'nba_id', | |
| max_features = 50, | |
| n_repeats = 1, | |
| n_splits = 5, | |
| n_jobs = 1, | |
| early_stopping_rounds = None): |
| def shorten_features_boruta(df, | |
| features, | |
| label_col, | |
| sample_weight_col = None, | |
| n_trials = 100, | |
| gpu = False, | |
| n_jobs = 8, | |
| classification_fl = False, | |
| sample_fl = True, | |
| base_model = None): |
| def shorten_features_catboost(df, | |
| features, | |
| label_col, | |
| n_estimators = 200, | |
| sample_weight_col = None, | |
| model = None, | |
| group_id = None, | |
| steps = 6, | |
| gpu = False, | |
| n_jobs = 8, |
| from scipy.stats import rankdata | |
| import numpy as np | |
| def weighted_chaterjee_correlation(x, y, sample_weight=None): | |
| """x and y are assumed to be 1-D arrays of the same length.""" | |
| n = x.size | |
| rk_x = rankdata(x, method="average") | |
| rk_y = rankdata(y, method="average") |
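The preview ends before the weighted statistic is computed, so the author's weighting scheme is not visible here. For orientation only, the sketch below shows the standard unweighted Chatterjee xi coefficient for data without ties. The function name `chatterjee_xi_unweighted` is hypothetical, and the `ordinal` ranking differs from the `average` ranking used in the preview.

```python
import numpy as np
from scipy.stats import rankdata


def chatterjee_xi_unweighted(x, y):
    """Unweighted Chatterjee xi, assuming no ties in x or y (illustrative sketch)."""
    x = np.asarray(x).ravel()
    y = np.asarray(y).ravel()
    n = x.size
    # Sort the pairs by x, then rank y in that order.
    order = np.argsort(x, kind="stable")
    r = rankdata(y[order], method="ordinal")
    # xi = 1 - 3 * sum |r_{i+1} - r_i| / (n^2 - 1)
    return 1.0 - 3.0 * np.abs(np.diff(r)).sum() / (n ** 2 - 1)


# Usage: a monotone relationship on n points gives 1 - 3 / (n + 1).
x_demo = np.arange(10)
xi_demo = chatterjee_xi_unweighted(x_demo, x_demo ** 2)
```

Unlike Pearson or Spearman correlation, xi is asymmetric: it measures how well y is a function of x, so swapping the arguments can change the result.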
| # requirements to pip install: | |
| # | |
| # numpy | |
| # pandas | |
| # scikit-learn | |
| # HEBO | |
| # | |
| # | |
| import numpy as np | |
| import pandas as pd |
| from threadpoolctl import threadpool_limits | |
| from sklearn.experimental import enable_hist_gradient_boosting | |
| from sklearn.ensemble import HistGradientBoostingRegressor | |
| class SkBoost(HistGradientBoostingRegressor): | |
| def __init__( | |
| self, | |
| loss="least_squares", | |
| learning_rate=0.1, | |
| max_iter=100, |
| def get_pbpstats(nba_game_id, stats_type = 'box'): | |
| # Convert the game ID | |
| nba_game_id = str(nba_game_id) | |
| if len(nba_game_id) < 10: | |
| nba_game_id = '00' + nba_game_id | |
| # Figure out the right URL |
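The ID conversion above prepends `'00'` whenever the string is shorter than 10 characters, which produces a 10-character ID only for 8-digit inputs. A slightly more general sketch, assuming the target is always a 10-character zero-padded string, uses `str.zfill`. The helper name `normalize_game_id` is hypothetical and not part of the original gist.

```python
def normalize_game_id(nba_game_id):
    """Return the game ID as a zero-padded, 10-character string (illustrative sketch)."""
    # zfill pads on the left with zeros and leaves strings of length >= 10 unchanged.
    return str(nba_game_id).zfill(10)


# Usage: integer and already-padded string inputs normalize to the same value.
from_int = normalize_game_id(21900001)
from_str = normalize_game_id("0021900001")
```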