sudoevans · December 15, 2023 08:39
diff --git a/machine-learning-notes.md b/machine-learning-notes.md
@@ -0,0 +1,83 @@
+## This is my summary of fundamental machine learning concepts.
+1. **Supervised Learning:**
+
+- Goal: To train a model to predict output based on labeled training data.
+- Algorithms:
+   - Linear Regression: Used for predicting continuous outcomes.
+   - Logistic Regression: Used for binary classification problems.
+   - Decision Trees: Learn a hierarchy of if-then splits on feature values to make predictions.
+   - Support Vector Machines (SVMs): Find the maximum-margin decision boundary that separates the classes.
+   - k-Nearest Neighbors (k-NN): Predicts from the labels of the k most similar training points.
+
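As an illustration of the k-NN entry above, here is a minimal sketch in plain Python. The function name `knn_predict`, the Euclidean distance choice, and the toy data are assumptions made for this example, not part of the notes.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbours and count their labels.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): two small clusters in 2-D.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (6.0, 6.0), (6.5, 7.0), (7.0, 6.5)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(points, labels, (1.2, 1.4)))
print(knn_predict(points, labels, (6.4, 6.2)))
```

There is no training step here: the "model" is the stored labeled data, which is why prediction cost grows with the size of the training set.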
+
+2. **Unsupervised Learning:**
+
+- Goal: To find structure or patterns in unlabeled data.
+- Algorithms:
+   - Clustering (e.g., k-means): Groups similar data points into clusters.
+   - Principal Component Analysis (PCA): Reduces data dimensionality by projecting onto the directions that preserve the most variance.
+   - Anomaly Detection: Identifies unusual data points that deviate significantly from the norm.
+
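Clustering can be sketched with k-means (Lloyd's algorithm), which is one common choice and is assumed here for illustration. The function name `kmeans`, the fixed starting centroids, and the toy points are hypothetical; real use would normally pick starting centroids at random and check for convergence.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm: alternate assignment and centroid-update steps.

    Assumes iterations >= 1. Returns the final centroids and the clusters
    from the last assignment step.
    """
    centroids = list(centroids)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(p, centroids[i]),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for i, cluster in enumerate(clusters):
            if cluster:  # an empty cluster keeps its previous centroid
                centroids[i] = tuple(sum(c) / len(cluster) for c in zip(*cluster))
    return centroids, clusters


# Toy data (hypothetical): two well-separated groups, no labels.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, centroids=[(0, 0), (10, 10)])
print(centroids)
print(clusters)
```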
+
+3. **Reinforcement Learning:**
+
+- Goal: To train an agent to make decisions in an environment to maximize rewards.
+- Algorithms:
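+   - Q-Learning: Off-policy method that updates each action-value (Q) estimate toward the observed reward plus the discounted value of the best action in the next state.
+   - SARSA (State-Action-Reward-State-Action): Similar to Q-Learning but on-policy: the update uses the action the agent actually takes in the next state rather than the best available one.
+   - Deep Q-Networks (DQN): Uses a neural network to approximate the Q-function when the state space is too large for a table.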
+
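The difference between the Q-Learning and SARSA entries above is easiest to see in their update rules: Q-Learning bootstraps from the best action in the next state, while SARSA bootstraps from the action actually taken next. The sketch below uses a tabular Q-function stored in a dictionary. The function names, the learning rate `alpha`, the discount `gamma`, and the state and action labels are all assumptions for illustration.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.9):
    """Off-policy update: bootstrap from the best action in next_state."""
    best_next = max(q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])


def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.1, gamma=0.9):
    """On-policy update: bootstrap from the action actually taken next."""
    target = reward + gamma * q[(next_state, next_action)]
    q[(state, action)] += alpha * (target - q[(state, action)])


# Hypothetical two-action example with hand-set values for state "s1".
actions = ["left", "right"]
q = defaultdict(float)
q[("s1", "left")] = 1.0
q[("s1", "right")] = 5.0

q_learning_update(q, "s0", "right", reward=1.0, next_state="s1", actions=actions)
q_value_off_policy = q[("s0", "right")]

q[("s0", "right")] = 0.0  # reset so the two updates start from the same value
sarsa_update(q, "s0", "right", reward=1.0, next_state="s1", next_action="left")
q_value_on_policy = q[("s0", "right")]

print(q_value_off_policy, q_value_on_policy)
```

With the same transition, the Q-Learning estimate ends up higher because it assumes the agent will pick `"right"` in `"s1"`, whereas SARSA accounts for the agent actually choosing `"left"`.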
+
+4. **Deep Learning:**
+
+- Goal: To train multi-layer neural networks that learn useful representations directly from large amounts of data.
+- Architectures:
+   - Convolutional Neural Networks (CNNs): Effective for image and speech recognition.
+   - Recurrent Neural Networks (RNNs): Used for sequential data like text and time series.
+   - Transformers: Attention-based architecture widely used for language processing and machine translation.
+
+
+5. **Evaluation Metrics:**
+
+- Accuracy: Measures the percentage of correct predictions.
+- Precision: Measures the proportion of true positives among all predicted positives.
+- Recall: Measures the proportion of true positives among all actual positives.
+- F1-score: The harmonic mean of precision and recall, combining both into a single metric.
+- Mean Squared Error (MSE): Measures the average squared difference between predicted and actual values.
+
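The metrics above can be computed directly from predictions and true labels. This is a minimal sketch for binary classification plus MSE. The function names, the convention of returning 0.0 when a denominator is zero, and the sample labels are assumptions for illustration.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    correct = sum(t == p for t, p in pairs)

    accuracy = correct / len(pairs)
    # Fall back to 0.0 when a denominator is zero (an assumed convention).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def mean_squared_error(y_true, y_pred):
    """Average squared difference between predicted and actual values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical labels and predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
mse = mean_squared_error([3.0, 5.0, 2.0], [2.5, 5.0, 4.0])
print(metrics)
print(mse)
```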
+
+6. **Bias and Variance:**
+
+- Bias: The systematic error introduced by a model due to assumptions or simplifications.
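+- Variance: The error that comes from a model's sensitivity to the particular training sample; a high-variance model changes substantially when trained on different data.
+- Bias-Variance Tradeoff: Reducing bias (more flexible models) typically increases variance and vice versa, so model complexity is chosen to minimize the total expected error.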
+
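For squared-error loss, the tradeoff can be written as a decomposition of the expected prediction error. The notation is an assumption for this sketch and does not appear in the notes: the data follow $y = f(x) + \varepsilon$ with $\mathbb{E}[\varepsilon] = 0$ and $\mathrm{Var}(\varepsilon) = \sigma^2$, $\hat{f}$ is the model fitted on a random training set, and the expectation is taken over training sets and noise.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

Only the first two terms depend on the model, which is why the tradeoff is framed in terms of bias and variance.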
+
+7. **Overfitting and Underfitting:**
+
+- Overfitting: When a model performs well on training data but poorly on unseen data.
+- Underfitting: When a model fails to capture the underlying patterns in the data and performs poorly even on the training data.
+
+
+8. **Regularization:**
+
+- Techniques to reduce overfitting by penalizing model complexity.
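+- L1 Regularization (Lasso): Penalizes the sum of the absolute values of the coefficients, which can shrink some coefficients to exactly zero.
+- L2 Regularization (Ridge): Penalizes the sum of the squared coefficients, which shrinks all coefficients toward zero.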
+
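The two penalties above differ only in how each coefficient is measured. A minimal sketch follows; the function names, the strength parameter `lam`, and the example weights are assumptions for illustration. In practice the penalty is added to the training loss before optimization.

```python
def l1_penalty(weights, lam):
    """Lasso-style penalty: lam times the sum of absolute coefficients."""
    return lam * sum(abs(w) for w in weights)


def l2_penalty(weights, lam):
    """Ridge-style penalty: lam times the sum of squared coefficients."""
    return lam * sum(w * w for w in weights)


def regularized_loss(data_loss, weights, lam, penalty):
    """Total objective: data-fit term plus the chosen complexity penalty."""
    return data_loss + penalty(weights, lam)


# Hypothetical coefficients and regularization strength.
weights = [0.5, -2.0, 0.0, 3.0]
lam = 0.1

print(l1_penalty(weights, lam))
print(l2_penalty(weights, lam))
print(regularized_loss(2.0, weights, lam, l2_penalty))
```

Note how the squared penalty is dominated by the largest coefficient, while the absolute-value penalty weighs all nonzero coefficients in proportion to their size.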
+
+9. **Feature Engineering:**
+
+- The process of transforming raw data into features that are more suitable for machine learning models.
+- Techniques:
+   - Feature Scaling: Normalizing features to have a consistent range.
+   - Feature Selection: Selecting the most informative features.
+   - Feature Extraction: Creating new features from original features.
+
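Feature scaling, the first technique listed above, is commonly done in one of two ways: min-max scaling to the range [0, 1], or standardization to zero mean and unit variance. The sketch below shows both using only the standard library. The function names, the handling of constant features, and the sample values are assumptions for illustration.

```python
import statistics


def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant feature: no spread to rescale (assumed convention)
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Shift and scale values to zero mean and unit (population) variance."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:  # constant feature (assumed convention)
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


# Hypothetical feature column, e.g. heights in centimetres.
heights = [150.0, 160.0, 170.0, 180.0, 190.0]
scaled = min_max_scale(heights)
z_scores = standardize(heights)
print(scaled)
print(z_scores)
```

To avoid leaking information, the minimum, maximum, mean, and standard deviation would normally be computed on the training set only and then reused for validation and test data.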
+
+10. **Model Selection and Validation:**
+
+- Techniques to select the best model and avoid overfitting:
+   - Cross-Validation: Trains and evaluates a model on several different train/validation splits (folds) of the data and averages the results.
+   - Train-Validation-Test Split: Divides the data into separate sets for training, validation, and testing.
+   - Hyperparameter Tuning: Optimizing the model's hyperparameters to improve performance.
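
Cross-validation comes down to generating the splits. The sketch below yields train and validation indices for k contiguous folds. The function name `k_fold_indices` is hypothetical, and the sketch assumes the data has already been shuffled; it does no shuffling or stratification itself.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k contiguous folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


folds = list(k_fold_indices(10, 3))
for train, validation in folds:
    print("train:", train, "validation:", validation)
```

Each sample appears in exactly one validation fold, so averaging a metric across the folds uses every data point for evaluation once.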