Explorables

Small interactive essays. Drag a slider, hover a cell, and the idea reveals its mechanism instead of being summarised. A slow archive of mine.

  1. Image kernels, explained visually. 2026

    A 3×3 matrix is the engine behind sharpen, blur, emboss, and edge detection, and sliding a small kernel like it is the first move inside every convolutional neural network. Slid across an image, one pixel at a time.

  2. Attention, in seven tokens. 2026

    Self-attention as three lines of arithmetic on a six-word sentence. Click any token to make it the query and watch the dot products, softmax, and weighted value sum fall out.

  3. Gradient descent, by hand. 2026

    Four canonical 2D loss surfaces and four canonical optimizers. Drop a starting point, push the learning rate up until things get violent, and race SGD against Adam down the same valley.

  4. An image, as a sum of waves. 2026

    The 2D Fourier transform on a 32×32 portrait. Click a frequency cell to see the cosine wave it represents; sweep a low-pass radius and watch the picture lose, or recover, its details.

  5. An image is sixteen patches wide. 2020

    The Vision Transformer chops an image into a regular grid of patches, treats them like words, and runs a vanilla transformer over the sequence. Slide patch size; click a patch to see what it attends to.

  6. R-CNN. The slow ancestor. 2014

    Object detection, part 1 of 4. Two thousand region proposals, each cropped and pushed through a CNN: AlexNet at first, VGG16 later. Forty-seven seconds an image with the bigger net. The IoU widget is here too.

  7. Fast R-CNN. Share the features. 2015

    Object detection, part 2 of 4. Run the backbone once on the whole image; let proposals pull their features out of one feature map through RoI pooling. Forty-seven seconds becomes two.

  8. Faster R-CNN. Learn the proposals. 2015

    Object detection, part 3 of 4. Replace Selective Search with a small convolutional head, the RPN, that predicts offsets to a library of anchor boxes. Detection becomes a single network.

  9. YOLO. Skip the proposals. 2016

    Object detection, part 4 of 4. An S×S grid; each cell predicts boxes, confidences, and classes in one forward pass. Non-max suppression cleans up the duplicates.
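
The clean-up step in the detection series (IoU in part 1, non-max suppression in part 4) fits in a few lines. This is a minimal sketch, not the code behind the essays: it assumes boxes are (x1, y1, x2, y2) tuples, and the 0.5 threshold and the names `iou` and `nms` are illustrative choices.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Overlap is zero when the boxes do not intersect.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-max suppression; returns kept indices, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with a better one.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Two near-duplicate detections and one separate detection (made-up numbers).
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
```

With these made-up boxes the second detection overlaps the first by 81/119 and is dropped, so `kept` holds the indices of the first and third box.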
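
For the image kernels essay (item 1), sliding a 3×3 matrix across an image can be sketched in plain Python. Assumptions: a grayscale image stored as a list of rows, no padding (so the output shrinks by one pixel on each side), and no kernel flip, which is the cross-correlation convention CNN libraries use. The `apply_kernel` name and the sharpen weights are a common illustrative choice, not taken from the essay.

```python
def apply_kernel(image, kernel):
    """Slide a 3x3 kernel over a 2D image, one pixel at a time (no padding)."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(1, h - 1):
        row = []
        for x in range(1, w - 1):
            acc = 0.0
            # Weighted sum of the 3x3 neighbourhood centred on (y, x).
            for dy in range(3):
                for dx in range(3):
                    acc += kernel[dy][dx] * image[y + dy - 1][x + dx - 1]
            row.append(acc)
        out.append(row)
    return out


# A widely used sharpen kernel: boost the centre, subtract the 4 neighbours.
SHARPEN = [[0, -1, 0],
           [-1, 5, -1],
           [0, -1, 0]]

# A bright 2x2 square on a dark background.
image = [[0, 0, 0, 0],
         [0, 9, 9, 0],
         [0, 9, 9, 0],
         [0, 0, 0, 0]]

sharpened = apply_kernel(image, SHARPEN)
```

Each pixel of the bright square has two bright and two dark neighbours, so sharpening pushes it from 9 to 5·9 − 18 = 27.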
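
For the attention essay (item 2), the three lines of arithmetic are dot products, a softmax, and a weighted sum of values. The sketch below handles one query at a time. It assumes the common scaling by the square root of the vector dimension, which the blurb does not mention, and the tiny key and value vectors are invented for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Single-query attention: returns (weights, output vector)."""
    scale = math.sqrt(len(query))  # assumed scaled dot-product variant
    # 1. Dot product of the query with every key.
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    # 2. Softmax turns scores into weights that sum to 1.
    weights = softmax(scores)
    # 3. Output is the weighted sum of the value vectors.
    dim = len(values[0])
    output = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return weights, output


# Three toy tokens with 2-dimensional keys and values (made-up numbers).
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
weights, output = attend([1.0, 0.0], keys, values)
```

Choosing a different query vector plays the role of clicking a different token: the weights shift toward the keys that point the same way as the query.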