1. What Is Deep Learning?

  1. Short definition – Deep Learning (DL) is the branch of Machine Learning (ML) that learns patterns directly from data using neural networks built from many stacked layers of artificial neurons.
  2. Why “deep” – Each extra layer lets the network capture progressively higher‑level patterns, loosely analogous to the way the human brain is thought to process visual, auditory, and linguistic data in stages.
  3. From human intuition to computer code – A computer can’t “see a cat” until it’s trained on thousands of cat images, while a human can identify a cat instantly. DL teaches the computer this skill.
  4. Common applications

2. The Core Building Blocks of a Neural Network

| Building Block | What It Does | How It Works |
|---|---|---|
| Neurons | Basic decision units | Each neuron receives several weighted inputs, adds a bias, then applies an “activation function” that determines how strongly it fires. |
| Layers | Structural organization | • Input Layer – takes raw data (pixel values, word tokens, etc.). • Hidden Layers – one or more intermediate layers that learn features and patterns. • Output Layer – delivers the final prediction (a probability, a class label, a real number, etc.). |
| Weights | Connection strengths | Numbers that scale the influence of each input on the neuron’s output. They’re learned during training. |
| Bias | Offset term | A learned constant added to the weighted sum before the activation; it lets a neuron fire even if all its inputs are zero. |
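
To make the building blocks concrete, here is a minimal sketch of a single neuron in plain Python. The sigmoid activation and the example weights are assumptions chosen for illustration; the text above does not prescribe a particular activation or particular values.

```python
import math
from typing import Sequence


def sigmoid(z: float) -> float:
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Weighted sum of the inputs plus the bias, passed through the activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(z)


# Hypothetical values: with all-zero inputs the weights contribute nothing,
# so the output depends on the bias alone.
zero_inputs = [0.0, 0.0, 0.0]
weights = [0.5, -1.2, 0.8]
print(neuron(zero_inputs, weights, bias=0.0))  # 0.5
print(neuron(zero_inputs, weights, bias=2.0))  # roughly 0.88
```

The second call shows the role of the bias: even with every input at zero, a positive bias pushes the neuron's output well above 0.5.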

3. How a Network Learns – Forward & Backward Flow

3.1 Forward Pass (What the network “does” first)

  1. Send the data into the first layer – the raw image pixels or text tokens become inputs.
  2. Move through the hidden layers – each neuron multiplies each incoming signal by the corresponding weight, sums the results, adds its bias, and then applies an activation (e.g., a sigmoid, which squashes the raw sum into a value between 0 and 1).
  3. Reach the output layer – the final neuron(s) produce a clean, interpretable result (e.g., a probability between 0 and 1 for “is a cat?”).
  4. Evaluate the result – calculate a loss by comparing the prediction with the known correct answer. Common choices are “mean squared error” for regression and “cross‑entropy” for classification.
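
The four steps above can be sketched in a few lines of plain Python. This is an illustrative toy, not a real image classifier: the three "pixel" values, the layer sizes, and every weight and bias are made-up assumptions, and sigmoid is assumed as the activation throughout.

```python
import math
from typing import List, Sequence


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs: Sequence[float],
          weights: Sequence[Sequence[float]],
          biases: Sequence[float]) -> List[float]:
    """One fully connected layer; each row of `weights` belongs to one neuron."""
    return [sigmoid(sum(x * w for x, w in zip(inputs, row)) + b)
            for row, b in zip(weights, biases)]


def forward(pixels, hidden_w, hidden_b, out_w, out_b) -> float:
    """Steps 1-3: input -> hidden layer -> single output neuron."""
    hidden = dense(pixels, hidden_w, hidden_b)
    return dense(hidden, out_w, out_b)[0]


def binary_cross_entropy(p: float, y: float) -> float:
    """Classification loss for a predicted probability p and a label y in {0, 1}."""
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))


def squared_error(prediction: float, target: float) -> float:
    """Regression loss for one example (average it over examples to get MSE)."""
    return (prediction - target) ** 2


# Hypothetical toy values: 3 inputs, 2 hidden neurons, 1 output neuron.
pixels = [0.2, 0.9, 0.4]
hidden_w = [[0.5, -0.3, 0.8], [-0.6, 0.1, 0.4]]
hidden_b = [0.1, -0.2]
out_w = [[1.2, -0.7]]
out_b = [0.05]

p_cat = forward(pixels, hidden_w, hidden_b, out_w, out_b)
loss = binary_cross_entropy(p_cat, 1.0)  # step 4: the true label is "cat" (1)
print(f"P(cat) = {p_cat:.3f}, loss = {loss:.3f}")
```

With untrained weights like these the prediction is essentially arbitrary; the loss is the number that training will try to push down.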

3.2 Backward Pass (How the network “learns” from its mistakes)

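  1. Compute the gradient – use backpropagation (the chain rule applied layer by layer, from the output back toward the input) to work out how a tiny change to each weight and bias would change the loss. Think of the gradient as a slope that tells the network which way to adjust.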
  2. Adjust the weights – move each weight and bias a small step in the direction opposite to its slope. The size of the step is governed by the learning rate (a small number like 0.0001).
  3. Repeat for many batches and epochs – a “batch” is a small subset of the training set processed in one update, and an “epoch” means the entire training set has passed through the network once. Over many epochs the network’s predictions improve and the loss shrinks, ideally settling at a deep minimum in the loss landscape.
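
As a rough illustration of the loop described above, the sketch below trains a single sigmoid neuron with mini-batch gradient descent. Several things here are assumptions made to keep the example small: the dataset is a toy (the logical OR of two inputs), the loss is cross-entropy, and the learning rate of 0.5 is far larger than what a real deep network would typically use. For one sigmoid neuron with cross-entropy loss the gradient has a simple closed form, (prediction - label) * input, so no general backpropagation code is needed.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(x, w, b):
    return sigmoid(sum(xi * wi for xi, wi in zip(x, w)) + b)


def mean_loss(data, w, b):
    """Average cross-entropy over the whole dataset."""
    total = 0.0
    for x, y in data:
        p = predict(x, w, b)
        total += -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
    return total / len(data)


# Toy dataset (assumption): the logical OR of two inputs.
data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
        ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]

w, b = [0.0, 0.0], 0.0
learning_rate = 0.5   # large on purpose; fine for this tiny problem
batch_size = 2

initial_loss = mean_loss(data, w, b)
for epoch in range(500):                          # one epoch = one full pass
    for start in range(0, len(data), batch_size):
        batch = data[start:start + batch_size]
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for x, y in batch:                        # step 1: compute the gradient
            error = predict(x, w, b) - y
            for i, xi in enumerate(x):
                grad_w[i] += error * xi / len(batch)
            grad_b += error / len(batch)
        # step 2: move against the slope, scaled by the learning rate
        w = [wi - learning_rate * gi for wi, gi in zip(w, grad_w)]
        b -= learning_rate * grad_b
final_loss = mean_loss(data, w, b)

print(f"loss before: {initial_loss:.3f}, after: {final_loss:.3f}")
```

In a multi-layer network the only part that changes in principle is step 1, where the gradients for earlier layers are obtained by propagating the error backward through each layer.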

Quick tip – Too high a learning rate → huge, unstable jumps; too low → painfully slow progress.
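
The effect of the learning rate is easy to see on a deliberately simple, hypothetical loss, loss(w) = w², whose slope is 2w and whose minimum is at w = 0. The three rates below are assumptions picked to show the three regimes; useful values for a real network depend on the model and data.

```python
def descend(learning_rate, steps=20, start=1.0):
    """Run plain gradient descent on loss(w) = w**2 and return the final w."""
    w = start
    for _ in range(steps):
        gradient = 2.0 * w            # derivative of w**2
        w -= learning_rate * gradient
    return w


for lr in (1.1, 0.1, 0.0001):
    print(f"lr={lr}: w after 20 steps = {descend(lr):.4f}")
```

With a rate of 1.1 every step overshoots the minimum and lands farther away than it started, so w grows without bound; with 0.1 it shrinks steadily toward 0; with 0.0001 it has barely moved from its starting value after 20 steps.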