Supervised learning learns an accurate model that maps input to output label . To do this, we search over a hypothesis class to minimize the loss on the training set,

Note

We can use any function, simple or complex, to fit the data. However, the 🥪 No Free Lunch Theorem states that there’s no single algorithm that performs best on all data. All models require built-in biases, and it’s our job to choose a class that works for our data.

Moreover, we want the model to generalize well. Mathematically, our goal is to minimize risk

where is the distribution of our data.

Non-Parametric Models

Non-parametric models don’t optimize any weights.

🏠 K-Nearest Neighbors compares an input with the known data.
💭 Decision Trees split up the data on features that give the most information about the label.
🏯 Kernel Regression is a soft form of KNN that considers example similarity.

An 🎻 Ensemble combines non-parametric models to generate more accurate predictions.

Parametric Models

Parametric models, on the other hand, have a set of weights to optimize. Many optimize the their probability likelihood using 🪙 Maximum Likelihood Estimate or 🎲 Maximum A Posteriori, while others use more model-specific methods.

While there’s a huge diversity of solutions, parametric models can be categorized into the type of problem they solve: regression or classification.

Regression

Regression problems deal with predicting continuous .

🏦 Linear Regression fits a line to linear data.
🥢 Generalized Linear Model adapts this idea to work for non-linear data.
🔨 Principle Component Regression combines dimensionality reduction with linear regression.

Classification

Classification problems, on the other hand, work with discrete , or classes.

🦠 Logistic Regression is an extension of linear regression that outputs class probabilities.
🛩️ Support Vector Machines divide the data into two classes with a hyperplane.
👶 Naive Bayes classifies extremely large features sets by assuming that features are conditionally independent given the label.

Online Learning

There are some online variants of these algorithms that train via streaming, going through each datapoint one-by-one and updating the weights.

🗼 Least Mean Squares is an online version of linear regression.
👓 Perceptrons perform online updates for SVMs.

If there’s not enough labeled data, ✋ Active Learning strategies identify the most useful datapoints to label next.

Explorer

🎓 Supervised Learning

Non-Parametric Models

Parametric Models

Regression

Classification

Online Learning

Table of Contents

Backlinks

Graph View

Explorer

🎓 Supervised Learning

Non-Parametric Models §

Parametric Models §

Regression §

Classification §

Online Learning §

Table of Contents

Backlinks

Graph View

Non-Parametric Models

Parametric Models

Regression

Classification

Online Learning