Hands On "AI Engineering"

180-Day AI and Machine Learning Course from Scratch

Day 128 — Activation Functions

The Decision Gates of Neural Networks

May 24, 2026

What We’re Building Today

  • Understand why activation functions exist and what happens without them

  • Explore four activation functions used throughout modern AI systems: ReLU, Sigmoid, Tanh, and Softmax

  • Implement each from scratch in NumPy, then validate against PyTorch

  • Connect activation function choice to real production model architectures
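As a preview of the implementation goal above, here is a minimal NumPy sketch of the four functions. It is one reasonable way to write them, not necessarily the lesson's own code. The function names and the `axis` parameter are assumptions made for illustration, and `tanh` simply delegates to `np.tanh` rather than spelling out the exponentials.

```python
import numpy as np

def relu(x):
    """ReLU: keep positive values, clamp negatives to zero."""
    return np.maximum(0.0, x)

def sigmoid(x):
    """Sigmoid: squash any real number into the range (0, 1).

    Textbook form; very negative inputs can trigger an overflow
    warning inside exp, although the result still rounds to 0.
    """
    x = np.asarray(x, dtype=float)
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    """Tanh: squash any real number into (-1, 1), centered at zero."""
    return np.tanh(x)

def softmax(x, axis=-1):
    """Softmax: turn a vector of scores into probabilities that sum to 1."""
    x = np.asarray(x, dtype=float)
    # Subtracting the max leaves the result unchanged but avoids overflow in exp.
    shifted = x - np.max(x, axis=axis, keepdims=True)
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=axis, keepdims=True)

z = np.array([-2.0, 0.0, 3.0])
print("relu   :", relu(z))
print("sigmoid:", sigmoid(z))
print("tanh   :", tanh(z))
print("softmax:", softmax(z))
```

To validate against PyTorch, one option is to convert the same input with `torch.tensor(z)` and compare each result to `torch.relu`, `torch.sigmoid`, `torch.tanh`, and `torch.softmax(..., dim=-1)` using `np.allclose`.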


Why This Matters

Yesterday you hit the wall: a single perceptron can only draw a straight decision boundary, so it cannot solve XOR. Most real-world problems (detecting cancer in an MRI, routing a self-driving car, ranking your Instagram feed) are non-linear by nature. Activation functions are the mechanism that turns a rigid linear machine into a universal function approximator. Without them, stacking 100 layers is mathematically equivalent to having one layer. Production networks at companies like Google, OpenAI, and Tesla are built on the four functions you’ll implement today or on close variants of them. This is not an academic warmup; this is the engine.


Core Concepts

1. The Linear Trap — Why Stacking Doesn’t Help Without Activation

Picture a standard neuron: it takes inputs, multiplies them by weights, adds a bias, and produces an output. That operation is entirely linear (strictly speaking, affine): y = Wx + b. Stack two such layers: y = W₂(W₁x + b₁) + b₂. Expand it: y = (W₂·W₁)x + (W₂·b₁ + b₂). This collapses back to y = W'x + b', with W' = W₂·W₁ and b' = W₂·b₁ + b₂, which is a single linear transformation. You could stack a thousand layers and they would still collapse into one. Activation functions break this collapse by applying a non-linear squeeze or gate between layers, so each added layer can genuinely increase what the network can represent.

Layer 1       Layer 2         Without σ(·)        With σ(·)
─────────     ─────────       ────────────        ──────────
W₁x + b₁  →  W₂(·) + b₂  =  W'x + b'           truly deep
  linear        linear          (one layer)        (non-linear)
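The collapse described above can be checked numerically. The sketch below uses a tiny 2 → 2 → 1 network with hand-picked weights (the specific numbers are hypothetical, chosen only so the arithmetic is easy to follow). It compares two stacked linear layers with the single merged layer W'x + b', then inserts a ReLU between the layers to show the merged form no longer reproduces the output.

```python
import numpy as np

# Hypothetical hand-picked weights for a tiny 2 -> 2 -> 1 network.
W1 = np.array([[2.0, -1.0],
               [-1.0, 1.0]])
b1 = np.array([0.5, -0.5])
W2 = np.array([[1.0, 2.0]])
b2 = np.array([0.25])
x = np.array([1.0, 3.0])

# Two stacked layers with no activation in between.
two_layers = W2 @ (W1 @ x + b1) + b2

# The merged single layer from the expansion: W' = W2·W1, b' = W2·b1 + b2.
W_prime = W2 @ W1
b_prime = W2 @ b1 + b2
one_layer = W_prime @ x + b_prime

# Same two layers, but with a ReLU applied to the hidden values.
with_relu = W2 @ np.maximum(0.0, W1 @ x + b1) + b2

print("two linear layers:", two_layers)  # [2.75]
print("merged one layer :", one_layer)   # [2.75]
print("with ReLU between:", with_relu)   # [3.25]
```

The first two values agree because the hidden pre-activations [-0.5, 1.5] pass straight through. With ReLU, the -0.5 is clamped to 0, so the output changes and the merged W', b' no longer describes the network.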

GitHub Link:

https://github.com/sysdr/aiml-p/tree/main/day128/lesson
