Introduction
- Recognition of Handwritten Digits
- Example: Recognizing a sloppily written “3” at 28×28 pixels.
- The brain’s effortless ability to recognize variations of the same digit.
- Challenge: Writing a program to recognize digits from pixel grids.
Series Preview
- Relevance of Machine Learning and Neural Networks
- Importance in the present and future.
- Goal: Explain neural networks as a piece of math, not just a buzzword.
- Structure of the Series
- Part 1: Structure of neural networks.
- Part 2: Learning process.
- Example: Recognizing handwritten digits.
- Resources for further learning and experimentation.
What are Neurons?
- Basic Definition
- Neurons hold a number between 0 and 1.
- Example: 784 neurons for a 28×28 pixel image.
- Activation: Represents grayscale value (0 for black, 1 for white).
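A minimal sketch of this input encoding, assuming a hypothetical 28×28 grayscale image stored with the usual 0–255 pixel values:

```python
import numpy as np

# Hypothetical 28x28 grayscale image, pixel values 0..255
# (0 = black, 255 = white, matching the convention above).
image = np.random.randint(0, 256, size=(28, 28), dtype=np.uint8)

# Flatten to 784 neurons and rescale each activation into [0, 1].
activations = image.reshape(784).astype(np.float64) / 255.0

print(activations.shape)  # (784,)
```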
Introducing Layers
- Layers in Neural Networks
- Input Layer: 784 neurons (one per pixel).
- Output Layer: 10 neurons (one per digit).
- Hidden Layers: Intermediate layers (e.g., two layers with 16 neurons each).
- Activation Flow
- Activations in one layer determine activations in the next.
- Analogous to biological neural networks.
Why Layers?
- Hierarchical Recognition
- Breaking down digits into subcomponents (e.g., loops, lines).
- Hope: Neurons in hidden layers correspond to subcomponents.
- Example: Recognizing edges, patterns, and combining them into digits.
- Generalization to Other Tasks
- Useful for image recognition, speech parsing, etc.
Edge Detection Example
- Neuron Activation for Edge Detection
- Weights assigned to connections between neurons.
- Weighted sum of activations determines neuron activation.
- Use of positive and negative weights to detect edges.
- Sigmoid Function
- Squishes the weighted sum into the range between 0 and 1.
- Bias: Shifts the weighted sum before the sigmoid, adjusting the threshold for meaningful activation.
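A sketch of the single-neuron computation just outlined, with made-up weights, activations, and bias chosen to illustrate the positive/negative weighting idea:

```python
import numpy as np

def sigmoid(x):
    # Squishes any real number into the range (0, 1).
    return 1.0 / (1.0 + np.exp(-x))

# Made-up activations of four neurons in the previous layer.
a = np.array([0.0, 0.9, 0.8, 0.1])

# Positive weights on the pixels we care about, negative on
# their surroundings -- the edge-detection idea from above.
w = np.array([-2.0, 3.0, 3.0, -2.0])

b = -1.0  # bias: raises the bar the weighted sum must clear

print(sigmoid(np.dot(w, a) + b))  # activation between 0 and 1
```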
Counting Weights and Biases
- Complexity of Connections
- Each neuron in a layer is connected to every neuron in the previous layer.
- Example: 784 × 16 weights and 16 biases for one hidden layer.
- Total weights and biases in the network: ~13,000.
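The ~13,000 figure is easy to verify; a quick check for the 784 → 16 → 16 → 10 architecture described above:

```python
layer_sizes = [784, 16, 16, 10]

# One weight per connection between consecutive layers.
weights = sum(n_in * n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))
# One bias per neuron outside the input layer.
biases = sum(layer_sizes[1:])

print(weights)           # 12960 = 784*16 + 16*16 + 16*10
print(biases)            # 42 = 16 + 16 + 10
print(weights + biases)  # 13002, the "~13,000" quoted above
```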
How Learning Relates
- Learning as Optimization
- Adjusting weights and biases to solve the problem.
- Thought experiment: Manually setting weights and biases.
- Importance of understanding weights and biases for troubleshooting.
Notation and Linear Algebra
- Compact Representation
- Activations as vectors.
- Weights as matrices.
- Matrix multiplication for weighted sums.
- Equation: a⁽¹⁾ = σ(W a⁽⁰⁾ + b)
- a⁽¹⁾: Activations of the second layer.
- W: Weight matrix.
- a⁽⁰⁾: Activations of the first layer.
- b: Bias vector.
- σ: Sigmoid function, applied componentwise.
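Putting the notation to work, a sketch of one full forward pass through the 784 → 16 → 16 → 10 network, with randomly initialized weights and biases standing in for learned values:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
layer_sizes = [784, 16, 16, 10]

# One weight matrix and one bias vector per pair of adjacent layers.
Ws = [rng.standard_normal((n_out, n_in))
      for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]
bs = [rng.standard_normal(n_out) for n_out in layer_sizes[1:]]

a = rng.random(784)  # stand-in input activations in [0, 1)
for W, b in zip(Ws, bs):
    a = sigmoid(W @ a + b)  # next activations = σ(W a + b)

print(a.shape)  # (10,) -- one output activation per digit 0..9
```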