Logistic Regression

Definition

A linear binary classifier that models $P(y = 1 \mid x)$ as the sigmoid of a linear combination of the inputs; it is also the building block of a single sigmoid neuron in a neural network.

Intuition

The sigmoid “squashes” the linear score into a probability in $(0, 1)$. Training maximizes the log-likelihood of the labels, which is equivalent to minimizing cross-entropy. Despite the name, logistic regression is a classifier, not a regressor.

Formal Description

Model:

$z = w^{\top} x + b, \quad \hat{y} = \sigma(z) = \frac{1}{1 + e^{-z}}$
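
A minimal sketch of the model for a single example, using only the standard library. The weights, bias, and input below are hypothetical values chosen for illustration, and the two-branch sigmoid is one common way to avoid overflow for large negative scores, not something the note prescribes.

```python
import math
from typing import Sequence


def sigmoid(z: float) -> float:
    """Logistic function 1 / (1 + exp(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z use the equivalent form exp(z) / (1 + exp(z)),
    # so that exp() is never called on a large positive number.
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Return y_hat = sigmoid(w^T x + b) for one example."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(z)


# Hypothetical parameters and input (illustration only).
w = [2.0, -1.0]
b = 0.5
x = [1.0, 3.0]

y_hat = predict_proba(w, b, x)  # z = 2 - 3 + 0.5 = -0.5, so y_hat is about 0.378
label = int(y_hat >= 0.5)       # thresholding at 0.5 gives class 0 here
```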

Loss per example (binary cross-entropy):

$\ell(\hat{y}, y) = -\left[ y \log \hat{y} + (1 - y) \log (1 - \hat{y}) \right]$
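
A short derivation of why maximizing the log-likelihood is the same as minimizing this loss. It assumes the labels are independent Bernoulli draws with $P(y = 1 \mid x) = \hat{y}$; the example index $i$ and count $m$ are notation introduced here for illustration.

$P(y \mid x) = \hat{y}^{y} (1 - \hat{y})^{1 - y}$

$\log P(y \mid x) = y \log \hat{y} + (1 - y) \log (1 - \hat{y}) = -\ell(\hat{y}, y)$

$\log \prod_{i=1}^{m} P(y^{(i)} \mid x^{(i)}) = -\sum_{i=1}^{m} \ell(\hat{y}^{(i)}, y^{(i)})$

The log-likelihood of the data is therefore the negative of the summed cross-entropy, so the parameters that maximize one minimize the other.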

Gradients:

$\frac{\partial \ell}{\partial w} = (\hat{y} - y)\, x, \quad \frac{\partial \ell}{\partial b} = \hat{y} - y$

The weight gradient has a clean form: residual (prediction minus label) × input. The bias gradient is the residual alone.
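
One way to sanity-check the residual × input form is to compare it with a central finite-difference estimate of the loss gradient. This is a sketch under assumed values: the parameters, the input, and the step size `eps` are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss(w, b, x, y):
    """Binary cross-entropy for one example."""
    y_hat = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return -(y * math.log(y_hat) + (1 - y) * math.log(1 - y_hat))


def gradients(w, b, x, y):
    """Analytic gradients: dw = (y_hat - y) * x, db = y_hat - y."""
    y_hat = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    residual = y_hat - y
    return [residual * xi for xi in x], residual


def numeric_grad_w(w, b, x, y, eps=1e-6):
    """Central finite-difference estimate of the gradient with respect to w."""
    grads = []
    for j in range(len(w)):
        w_plus, w_minus = list(w), list(w)
        w_plus[j] += eps
        w_minus[j] -= eps
        grads.append((loss(w_plus, b, x, y) - loss(w_minus, b, x, y)) / (2 * eps))
    return grads


# Hypothetical values (illustration only).
w, b, x, y = [0.3, -0.7], 0.1, [1.5, 2.0], 1

dw, db = gradients(w, b, x, y)
dw_num = numeric_grad_w(w, b, x, y)
max_diff = max(abs(a - n) for a, n in zip(dw, dw_num))
```

If the analytic form is right, `max_diff` should be tiny compared with the gradient entries themselves.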

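Vectorized batch form (the columns of $X$ are the $m$ examples; $Y$ and $A$ are row vectors of labels and predictions):

$Z = W^{\top} X + b, \quad A = \sigma(Z)$

$dW = \frac{1}{m} X (A - Y)^{\top}, \quad db = \frac{1}{m} (A - Y) \mathbf{1}$

Here $\mathbf{1}$ is the all-ones column vector of length $m$, so $db$ is the mean residual.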
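
The batch form maps almost line for line onto NumPy. The sketch below assumes NumPy is available, uses plain gradient descent with a hypothetical learning rate and step count, and trains on synthetic data generated here purely for illustration.

```python
import numpy as np


def sigmoid(Z):
    return 1.0 / (1.0 + np.exp(-Z))


def batch_gradients(W, b, X, Y):
    """X: (n, m) with examples as columns; Y: (1, m); W: (n, 1); b: scalar."""
    m = X.shape[1]
    A = sigmoid(W.T @ X + b)       # predictions, shape (1, m)
    dW = X @ (A - Y).T / m         # shape (n, 1)
    db = float(np.sum(A - Y) / m)  # mean residual
    return dW, db


def train(X, Y, lr=0.5, steps=2000):
    """Plain gradient descent starting from zero parameters."""
    W = np.zeros((X.shape[0], 1))
    b = 0.0
    for _ in range(steps):
        dW, db = batch_gradients(W, b, X, Y)
        W -= lr * dW
        b -= lr * db
    return W, b


# Synthetic, linearly separable toy data (assumed for illustration):
# the label is 1 when x1 + x2 > 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(2, 200))
Y = (X[0] + X[1] > 0).astype(float).reshape(1, -1)

W, b = train(X, Y)
predictions = sigmoid(W.T @ X + b) > 0.5
accuracy = float(np.mean(predictions == (Y == 1)))
```

Because `b` is a scalar, NumPy broadcasts it across all $m$ columns of `W.T @ X`, which matches the $+ b$ in the formula for $Z$.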

Applications

Baseline binary classifier for any task
Outputs can be read directly as probabilities, which are often reasonably well calibrated
Each neuron in a sigmoid-activation network
Output layer for binary classification problems

Trade-offs

Linear decision boundary — underfits non-linear problems
Assumes features are individually informative; strongly correlated features can make the fitted weights unstable
Sensitive to class imbalance (consider reweighting or focal loss)
Easily extended to multi-class via softmax regression
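
The reweighting suggested for class imbalance can be sketched as a weighted cross-entropy in which errors on positive examples count more. The function names, the `pos_weight` parameter, and the negatives-to-positives ratio used to set it are assumptions for illustration; that ratio is one common heuristic, not the only choice.

```python
import math


def weighted_bce(y_hat, y, pos_weight=1.0, eps=1e-12):
    """Cross-entropy with the positive-class term scaled by pos_weight."""
    # Clip the prediction so log() never receives exactly 0.
    y_hat = min(max(y_hat, eps), 1.0 - eps)
    return -(pos_weight * y * math.log(y_hat) + (1 - y) * math.log(1.0 - y_hat))


def balanced_pos_weight(labels):
    """Heuristic weight: number of negatives divided by number of positives."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return n_neg / n_pos


# Hypothetical imbalanced label set: 1 positive, 9 negatives.
labels = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
pw = balanced_pos_weight(labels)  # 9 negatives / 1 positive = 9.0

loss_pos = weighted_bce(0.5, 1, pos_weight=pw)  # 9 * log(2)
loss_neg = weighted_bce(0.5, 0, pos_weight=pw)  # log(2), unaffected by the weight
```

With this weighting, the single positive example contributes as much to the total loss as the nine negatives combined when all predictions are 0.5.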
