Investigations Mar 2026

Heavy Metals in Japanese Green Tea

Literature review · Risk modelling · Source‑backed numbers

I drink a lot of matcha and wanted to know if I should worry. Pulled twenty‑odd primary sources, normalized the units, and worked out which elements actually matter at realistic daily doses. The answer turns out to be more nuanced than either the alarmist or the dismissive crowd suggests.

  • Aluminium, manganese, lead, cadmium, fluoride: ranges and dose context
  • Brewing physics: which metals leach into the cup vs. stay in the leaf
  • Comparison against EFSA, WHO, and US reference values
Practical Tools Mar 2026

Extracting Sprites From a 1993 Game

Reverse engineering · C++ · Pillow · 480 sprite tiles

A pipeline that pulls every graphical asset out of Per.Oxyd's proprietary .DAT archive, slices the sprite sheets into a clean 32×32 grid, and emits separate 1‑bit transparency masks for use in a modern engine. A direct sequel to the de‑dithering CNN, and feedstock for the maze prototype below.

  • Wraps the open‑source Enigma C++ oxydlib parser
  • Auto‑sorts backgrounds, UI, and 480 individual tiles into folders
  • One‑shot Python orchestrator: download, compile, extract, clean up
Creative Works Mar 2026

A Hide‑and‑Seek Labyrinth Game

LÖVE · Lua · Box2D · Procedural mazes

A game I sketched in 2017 and finally started building: navigate a monster‑filled maze with mines, cameras, and keyhole peeks. Per‑level mazes are generated with chamber‑aware DFS and rounded subtractive corners; physics, audio, and key/lock layers are tuned by config. Uses the sprites extracted above.

  • Box2D physics with rounded corner collision geometry
  • Graph‑distance debug overlays for shortcut analysis (F3)
  • Detached audio that survives level reloads
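
For readers unfamiliar with DFS maze carving, here is a minimal sketch in Python rather than the game's Lua. It is a plain recursive-backtracker carve, not the chamber-aware variant described above, and it ignores rounded corners entirely. `carve_maze` and its parameters are hypothetical names chosen for illustration.

```python
import random

def carve_maze(width, height, seed=None):
    """Return a dict mapping each cell (x, y) to the set of cells it opens onto."""
    rng = random.Random(seed)
    passages = {(x, y): set() for x in range(width) for y in range(height)}
    visited = {(0, 0)}
    stack = [(0, 0)]  # explicit stack instead of recursion
    while stack:
        x, y = stack[-1]
        unvisited = [
            (nx, ny)
            for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1))
            if (nx, ny) in passages and (nx, ny) not in visited
        ]
        if not unvisited:
            stack.pop()  # dead end: backtrack
            continue
        nxt = rng.choice(unvisited)
        passages[(x, y)].add(nxt)  # knock down the wall in both directions
        passages[nxt].add((x, y))
        visited.add(nxt)
        stack.append(nxt)
    return passages
```

A plain DFS like this yields a perfect maze (exactly one path between any two cells), so loops, chambers, and shortcuts would have to be added in a later pass.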
Practical Tools Mar 2026

Finding Hotels Without Google Maps

Open data · GTFS routing · Multi‑API pipeline

A hotel‑search pipeline built entirely on open data: OpenStreetMap for locations, MTA’s GTFS feed for schedule‑aware subway routing, OSRM for walking distances. Found 316 hotels reachable within 20 minutes, no Google APIs required.

  • Dijkstra‑like GTFS propagation with real departure times
  • OSRM batch routing for walking distance calculations
  • Interactive scatterplot of price vs. value
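
The schedule-aware propagation can be pictured as Dijkstra over arrival times, where a trip is only usable if it departs after you reach the stop. The sketch below is a simplified illustration under assumed data shapes (`connections` and `walk_edges` are hypothetical dictionaries, not the GTFS tables themselves), with all times in seconds.

```python
import heapq

def earliest_arrivals(connections, walk_edges, source, start_time):
    """Earliest arrival time at every reachable stop.

    connections: {stop: [(depart, arrive, next_stop), ...]}  (assumed shape)
    walk_edges:  {stop: [(next_stop, walk_seconds), ...]}     (assumed shape)
    """
    best = {source: start_time}
    queue = [(start_time, source)]
    while queue:
        now, stop = heapq.heappop(queue)
        if now > best.get(stop, float("inf")):
            continue  # stale queue entry
        candidates = []
        for depart, arrive, nxt in connections.get(stop, []):
            if depart >= now:  # can only board trips that have not left yet
                candidates.append((arrive, nxt))
        for nxt, seconds in walk_edges.get(stop, []):
            candidates.append((now + seconds, nxt))
        for arrive, nxt in candidates:
            if arrive < best.get(nxt, float("inf")):
                best[nxt] = arrive
                heapq.heappush(queue, (arrive, nxt))
    return best
```

A stop counts as reachable within a budget when `best[stop] - start_time` is under the limit, for example 20 * 60 seconds.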
Read the write‑up →
AI Experiments Feb 2026

Training an LLM Brand Advisor

LoRA fine‑tuning · ONNX quantization · In‑browser inference

Fine‑tuned Gemma 3 1B into a brand‑loyal car advisor using LoRA, then shipped the quantized model to the browser via ONNX and WebGPU. Training to deployment in one pipeline.

  • LoRA rank 8 on attention matrices, trained on 50 examples
  • Response‑only loss masking for focused gradient updates
  • 5 GB → 1.3 GB via INT8 quantization, runs in‑browser
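
Response-only loss masking usually means the prompt tokens are hidden from the loss so gradients come only from the answer. A minimal sketch, assuming labels are a copy of the input IDs and that the loss function skips the value -100 (the default `ignore_index` of PyTorch's `CrossEntropyLoss`); `mask_prompt_labels` is a hypothetical helper and the token IDs in the usage note are made up.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch's CrossEntropyLoss by default

def mask_prompt_labels(input_ids, prompt_length):
    """Copy input_ids into labels, hiding the first prompt_length tokens from the loss."""
    labels = list(input_ids)
    for i in range(min(prompt_length, len(labels))):
        labels[i] = IGNORE_INDEX
    return labels
```

With a three-token prompt, `mask_prompt_labels([5, 6, 7, 8, 9], 3)` leaves only the last two tokens contributing to the loss.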
Read the write‑up →
AI Experiments Feb 2026

A Search Engine for My Own Life

Semantic search · HNSW · Fully offline

Semantic search over years of AI conversations, audio logs, diary entries, and Discord messages. Sentence‑transformers and HNSW for instant queries. No data leaves the machine.

  • HNSW approximate nearest‑neighbour search via hnswlib
  • Configurable chunking per document type
  • Incremental indexing: only new content gets processed
Read the write‑up →
Hardware & Tinkering Feb 2026

An E‑Paper Dashboard for the Hallway

Raspberry Pi · Ollama · 12.48″ e‑ink

An ambient display showing AI‑curated news, gold prices, and my calendar, powered by a Raspberry Pi running a unified scheduler that orchestrates RSS collection, embedding‑based filtering, and LLM curation. No backlight, no notifications.

  • News pipeline: RSS → BGE‑M3 embeddings → Ollama curation → HTML
  • Central scheduler with per‑task intervals and state persistence
  • Deployed via SSH/SFTP with systemd service management
Read the write‑up →
AI Experiments Feb 2026

A Podcast Transcription Pipeline

ASR · Speaker diarization · LLM post‑processing

A local pipeline that chains NeMo Parakeet for speech recognition, pyannote for speaker identification, and Ollama for cleanup. Produces speaker‑labelled transcripts from any audio file.

  • Chunked transcription with Levenshtein‑based overlap merging
  • Word‑level timestamps aligned to diarization segments
  • LLM post‑processing for speaker naming and correction
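
One simple way to align word-level timestamps with diarization segments is to give each word the speaker whose segment overlaps it most. The sketch below assumes plain tuples for both inputs; the shapes and the name `assign_speakers` are illustrative, not the actual output formats of NeMo or pyannote.

```python
def assign_speakers(words, segments):
    """Label each word with the speaker whose segment overlaps it the most.

    words:    [(text, start, end), ...] in seconds       (assumed shape)
    segments: [(speaker, start, end), ...] in seconds    (assumed shape)
    """
    labelled = []
    for text, w_start, w_end in words:
        best_speaker, best_overlap = None, 0.0
        for speaker, s_start, s_end in segments:
            overlap = min(w_end, s_end) - max(w_start, s_start)
            if overlap > best_overlap:
                best_speaker, best_overlap = speaker, overlap
        labelled.append((best_speaker, text))  # None if no segment overlaps
    return labelled
```

Words that fall into a gap between segments come back with `None`, which leaves room for a later pass to attach them to the nearest speaker.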
Read the write‑up →
AI Experiments Jan 2026

Building a Personal AI Newspaper

Embeddings · Vector calibration · LLM curation

An automated news‑curation pipeline that filters RSS feeds using embedding‑based similarity matching against personal interests, with careful calibration to handle AND‑combined topic constraints.

  • Solved single‑concept vs multi‑concept bias via centroid scoring
  • Contrastive lift gating to reject random‑looking matches
  • Final curation by Gemini, deployed to a Raspberry Pi
Read the write‑up →
AI Experiments Jan 2026

Rating Photos by Aesthetic Quality

AesCLIP · Offline scoring · CLI app

Scores photos 0–10 for aesthetic quality using AesCLIP, entirely offline. Processes a full photo archive incrementally, persists scores in SQLite, and includes a slideshow for browsing the best results.

  • AesCLIP (ACM MM 2023) with CLIP ViT‑B/16
  • Incremental processing: skips unchanged files via size + mtime
  • Production CLI with Typer, Rich output, fail‑fast errors
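
The size + mtime check can be sketched with the standard library alone. The table layout and the function names below are assumptions made for illustration; the real tool's schema may differ.

```python
import os
import sqlite3

def open_store(db_path):
    """Open (or create) the score database. The schema here is hypothetical."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS scores "
        "(path TEXT PRIMARY KEY, size INTEGER, mtime_ns INTEGER, score REAL)"
    )
    return conn

def files_needing_scores(conn, paths):
    """Return the paths whose size or mtime differs from the stored record."""
    pending = []
    for path in paths:
        stat = os.stat(path)
        row = conn.execute(
            "SELECT size, mtime_ns FROM scores WHERE path = ?", (path,)
        ).fetchone()
        if row != (stat.st_size, stat.st_mtime_ns):  # also true when row is None
            pending.append(path)
    return pending

def record_score(conn, path, score):
    """Store the score together with the file's current size and mtime."""
    stat = os.stat(path)
    conn.execute(
        "INSERT OR REPLACE INTO scores VALUES (?, ?, ?, ?)",
        (path, stat.st_size, stat.st_mtime_ns, score),
    )
    conn.commit()
```

Comparing size and mtime is much cheaper than hashing file contents, at the cost of missing edits that preserve both.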
Read the write‑up →
Practical Tools Jan 2026

Letting an LLM Name My Files

Document automation · Ollama · Metadata extraction

Batch‑renames academic papers and invoices by extracting metadata with a local LLM. Papers become “[year] [author] - [title]”, invoices become “[date] [institution] - [title]”.

  • Supports PDF, EPUB, XPS, CBZ, and FB2 via PyMuPDF
  • TOML‑configurable prompts for different document types
  • Duplicate‑aware naming with conflict resolution
Read the write‑up →
Creative Works Aug 2025

Painting With Noise Fields

Stable Diffusion · CLIP · Spatial control · Gradio

Explores what happens when noise in Stable Diffusion carries spatial meaning, with fixed concepts in some regions and free morphing in others. CLIP embeddings with spatial masks for per‑location control over generation.

  • Three noise dimensions: visual, concept (512‑dim), feature (768‑dim)
  • Spatial masks control where concepts are fixed vs. variable
  • Gradio interface for interactive mask painting
Read the write‑up →
Creative Works Jun 2025

Growing Mazes From Images

OpenCV · Generative art · SVG

Maze walls that follow the contours of an image. Gradient fields guide the maze topology so passages flow along edges and textures, producing mazes that look like they grew from the picture.

  • Scharr gradients → angle fields → Floyd‑Steinberg dithering → SVG
  • Square and triangle grid support with per‑cell orientation
  • Tunable loopiness, branchiness, and dead‑end rate
Read the write‑up →
Investigations Jun 2024

Mazes as Spectral Graphs

Spectral graph theory · Eigenvalues · Tensor factorization

Applying spectral methods to maze design. The Laplacian eigenvectors reveal a maze’s fundamental topology, and changing eigenvalues changes how a maze feels without touching individual walls.

  • Normalized Laplacian and spectral embedding for maze topology
  • Interactive eigenvalue control via ipywidgets sliders
  • Non‑negative PARAFAC tensor factorization for decomposition
Read the write‑up →
Practical Tools Oct 2023

Removing Music From Audiobooks

YAMNet · Audio classification · Segment removal

Uses Google’s YAMNet to classify audio segments as speech or music, then surgically removes background music from audiobooks. Also merges multi‑chapter M4A files with preserved metadata.

  • Frame‑by‑frame speech vs. music scoring via TensorFlow Hub
  • Minimum segment length filtering to avoid false positives
  • FFmpeg normalization pipeline for consistent output
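
Minimum segment length filtering can be illustrated on a list of per-frame music/speech decisions. This is a sketch under assumptions: the frames have already been thresholded into booleans, and `frame_seconds` and `min_seconds` are hypothetical parameters, not values taken from the tool.

```python
def music_segments(is_music, frame_seconds, min_seconds):
    """Return (start, end) times of music runs lasting at least min_seconds."""
    segments = []
    run_start = None
    # The trailing False is a sentinel that closes a run ending on the last frame.
    for i, flag in enumerate(list(is_music) + [False]):
        if flag and run_start is None:
            run_start = i
        elif not flag and run_start is not None:
            if (i - run_start) * frame_seconds >= min_seconds:
                segments.append((run_start * frame_seconds, i * frame_seconds))
            run_start = None
    return segments
```

Short runs are dropped rather than cut, so a single misclassified frame in the middle of narration does not remove speech.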
Read the write‑up →
AI Experiments Oct 2020

Finding Similar Books With Embeddings

Universal Sentence Encoder · Book similarity · NLP

Embeds entire ebooks into 512‑dimensional vector space using Universal Sentence Encoder. Similarity search, style matching, categorization, and duplicate detection across large libraries.

  • Samples 120 sentences from mid‑book to capture voice, not front matter
  • Median embedding per book for noise‑robust representation
  • Gaussian models for genre categorization
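
The median-embedding idea can be shown in a few lines: take the per-dimension median across all sampled sentence vectors, so a handful of odd sentences cannot drag the book's vector around. The function name is hypothetical and the vectors in the usage note are toy values, not real encoder output.

```python
from statistics import median

def median_embedding(sentence_vectors):
    """Per-dimension median across equal-length sentence vectors."""
    return [median(component) for component in zip(*sentence_vectors)]
```

For example, `median_embedding([[1.0, 0.0], [2.0, 10.0], [100.0, 1.0]])` gives `[2.0, 1.0]`, where a mean would have been pulled towards the outlier 100.0.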
Read the write‑up →
Investigations Mar 2019

Sequences That Look Random to Humans

Cognitive science · Algorithm design · Perception

True randomness feels wrong to humans; we see patterns in noise. This algorithm generates sequences that actively avoid the structures our brains detect: mirrors, runs, and repetitions.

  • Exponential penalties with distance‑based mitigation
  • Tuned per pattern type: repetition, mirror, monotonic run
  • Output rated as “more random” than actual random sequences
Read the write‑up →
AI Experiments Mar 2019

Teaching a Network to See Past Dithering

CNN · Retro game restoration · Custom training pipeline

Built a CNN to convert 4‑bit dithered textures from the retro game Per.Oxyd back to full 24‑bit colour. Custom training data from Floyd‑Steinberg dithering, with augmentation to handle checkerboard shadow artifacts.

  • ESPCN‑based architecture, 5 conv layers, trained on 2,500 photos
  • Checkerboard shadow augmentation to prevent false de‑dithering
  • 30 dB PSNR, 63 hours of documented iteration
Read the write‑up →
AI Experiments Feb 2019

Real‑Time Video Upscaling With ESPCN

Super‑resolution · PyTorch · Sub‑pixel convolution

A PyTorch implementation of the ESPCN super‑resolution network, upscaling low‑resolution video without the smeared look of bicubic interpolation. Image and video input with standard benchmark evaluation.

  • Sub‑pixel convolution moves upscaling to the end for efficiency
  • Solved YCbCr colour shift by switching to RGB training
  • ~31.5 dB PSNR on Set5, Set14, BSD100
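
The sub-pixel step itself is only a rearrangement: r × r low-resolution feature maps are interleaved into one map that is r times larger in each direction. The pure-Python sketch below follows the channel ordering used by PyTorch's `PixelShuffle`; it is meant to show the indexing, not to replace the tensor operation, and the function name is illustrative.

```python
def pixel_shuffle(channels, r):
    """Rearrange C*r*r feature maps of size H x W into C maps of size (H*r) x (W*r).

    channels: list of 2-D nested lists, all the same size.
    """
    height, width = len(channels[0]), len(channels[0][0])
    out_channels = len(channels) // (r * r)
    output = [
        [[0] * (width * r) for _ in range(height * r)]
        for _ in range(out_channels)
    ]
    for c in range(out_channels):
        for y in range(height * r):
            for x in range(width * r):
                # Each output pixel picks one of the r*r maps by its sub-pixel offset.
                source = c * r * r + (y % r) * r + (x % r)
                output[c][y][x] = channels[source][y // r][x // r]
    return output
```

Because every convolution before this step runs at low resolution, the network does far less work than one that upsamples first.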
Read the write‑up →
Investigations Nov 2014

Simulating Grapheme‑Colour Synesthesia

Color science · Cognitive perception · PDF generation

A simulation of what text looks like to a grapheme‑colour synesthete. Each letter has a fixed colour, but neighbouring letters blend and influence each other through Gaussian‑weighted fields, modelling the perceptual experience, not just the mapping.

  • Cassidy Curtis colour mapping with Gaussian neighbour influence
  • HSL polar mixing for perceptually uniform blending
  • Outputs coloured PDF via ReportLab
Read the write‑up →
Hardware & Tinkering Oct 2012

Reverse Engineering an E‑Bike Controller

PIC Assembly · Firmware analysis · Embedded systems

Disassembled and analyzed PIC16F946 firmware from a TranzX e‑bike motor controller. Traced motor PWM, battery monitoring, pedal‑assist logic, and display communication across multiple firmware versions.

  • Multiple firmware hex dumps analyzed (SS606, SU806, AT8U3)
  • Created instruction set and register reference tables
  • Hardware analysis with EAGLE schematics
Read the write‑up →

From the wider archive

The catalogue above is a curated slice. Below is a glimpse of the rest: roughly eighty hobby projects accumulated over fifteen years, sometimes finished, often deliberately unfinished, always written down somewhere.

  • 2026‑04 Inference Optimization AI Benchmarking Ollama against LMDeploy on a Qwen 3.5 122B AWQ build, comparing throughput, memory, and quantization tradeoffs across WSL and Windows hosts.
  • 2026‑03 Flight Route Schedule Tools A small CLI that infers weekly flight schedules between two airports from the AeroDataBox FIDS endpoint, designed to stay inside the free tier's 600 monthly API units.
  • 2026‑03 Semantic Library Search AI Embedding‑based search across a personal ebook library, tuned for thematic matches rather than keyword overlap.
  • 2026‑02 Waveshare e‑paper hacking Hardware Reverse‑engineering the timing and refresh sequences of a 12.48″ e‑paper panel to drive it directly from a Pi.
  • 2025‑09 Binaural Beat Generator Hardware A self‑contained binaural beats player: schematic, MCU connections, CAD enclosure, and the protracted design conversations that came before any of it was real.
  • 2025‑08 extractInformation AI LLM‑based parsing and condensation of long audio transcripts into structured notes.
  • 2025‑07 YouTube transcript subtitling AI Bulk subtitle generation across a downloaded YouTube collection using NVIDIA Parakeet ASR.
  • 2025‑05 SleepData Investigation An attempt to convert Apple Watch sleep data into human‑readable charts; abandoned when the high‑resolution data turned out to be locked on‑device.
  • 2024‑07 Drink Cooler Hardware A small Peltier‑based cooler experiment for keeping a single drink cold at the desk.
  • 2024‑03 Book Duplicate Search Tools A small Python utility that finds duplicate ebooks across a sprawling library by comparing normalized titles, authors, and content hashes.
  • 2024‑03 LAB booster Investigation An algorithm that amplifies tiny colour differences in photos by manipulating the A/B channels of LAB space.
  • 2024‑03 Book Summaries AI An LLM‑based summariser that uses recursive context‑nesting to compress books larger than the model’s window.
  • 2023‑10 Neural activation theory for autism Investigation Two charts laying out the idea that autism may emerge from increased perceptive specificity at the neural level.
  • 2023‑07 Dreame W10 Rooting Hardware Gaining root access to a robot mop to inspect what it’s sending home and run my own logging.
  • 2023‑01 USB‑C Power Filter Hardware A small inline filter board to clean noisy USB‑C power for sensitive analogue gear.
  • 2022‑11 Hard‑drive Debugging Hardware Tracing a flaky bus and replacing components on a misbehaving external drive, with proper logging this time.
  • 2022‑04 iPad mini jailbreak Hardware Recovering an old iPad mini and unlocking it for kiosk‑style use cases.
  • 2021‑11 Trackball Hardware A custom trackball build sized to my hand, with an eye to ergonomics and replaceable bearings.
  • 2021‑10 Solar Battery Hardware A small balcony solar setup feeding a Li‑ion bank, with proper monitoring of charge cycles.
  • 2020‑12 Game motivation clusters Investigation Rating my 100 favourite games on a 12‑axis scale and finding emergent clusters in 12‑dimensional space.
  • 2019‑04 Blue Ocean Event Prediction Investigation A back‑of‑envelope analysis predicting the most likely year for the first ice‑free Arctic summer.
  • 2019‑02 Fahrradakku bauen Hardware Building a custom 18650 battery pack for an e‑bike: cell matching, BMS, weather‑sealed enclosure.
  • 2018‑03 Ahnenforschung Investigation A genealogical project to find the roots of my surname; partial, still unfinished, and likely to stay that way for a while.
  • 2017‑04 Transfer Entropy in Markets Investigation Using transfer entropy to look for causal structure in stock‑market time series; a holdover from the energy‑risk years.
  • 2013‑12 30C3 Presentation Investigation A 30‑minute talk at the 30th Chaos Communication Congress on the state of DIY brain‑computer interfaces.
  • 2012‑08 PhD‑level MEG/MRI investigation Investigation How information flows in the brain during a particular kind of language processing: beamforming, conductivity models, transfer entropy.