Heavy Metals in Japanese Green Tea
Literature review · Risk modelling · Source‑backed numbers
I drink a lot of matcha and wanted to know if I should worry. Pulled
twenty‑odd primary sources, normalized the units, and worked out which
elements actually matter at realistic daily doses. The answer turns out to
be more nuanced than either the alarmist or the dismissive crowd suggests.
- Aluminium, manganese, lead, cadmium, fluoride: ranges and dose context
- Brewing physics: which metals leach into the cup vs. stay in the leaf
- Comparison against EFSA, WHO, and US reference values
Write‑up forthcoming.
Extracting Sprites From a 1993 Game
Reverse engineering · C++ · Pillow · 480 sprite tiles
A pipeline that pulls every graphical asset out of Per.Oxyd's
proprietary .DAT archive, slices the sprite sheets into a
clean 32×32 grid, and emits separate 1‑bit transparency masks for use in a
modern engine. A direct sequel to the de‑dithering CNN, and feedstock for
the maze prototype below.
- Wraps the open‑source Enigma C++ oxydlib parser
- Auto‑sorts backgrounds, UI, and 480 individual tiles into folders
- One‑shot Python orchestrator: download, compile, extract, clean up
Write‑up forthcoming.
A Hide‑and‑Seek Labyrinth Game
LÖVE · Lua · Box2D · Procedural mazes
A game I sketched in 2017 and finally started building: navigate a
monster‑filled maze using mines, cameras, and keyhole peeks. Per‑level mazes generated
with chamber‑aware DFS and rounded subtractive corners; physics, audio, and
key/lock layers tuned by config. Uses the sprites extracted above.
- Box2D physics with rounded corner collision geometry
- Graph‑distance debug overlays for shortcut analysis (F3)
- Detached audio that survives level reloads
Prototype, no public build yet.
Finding Hotels Without Google Maps
Open data · GTFS routing · Multi‑API pipeline
A hotel‑search pipeline built entirely on open data: OpenStreetMap for
locations, MTA’s GTFS feed for schedule‑aware subway routing, OSRM for
walking distances. Found 316 hotels reachable within 20 minutes, no
Google APIs required.
- Dijkstra‑like GTFS propagation with real departure times
- OSRM batch routing for walking distance calculations
- Interactive scatterplot of price vs. value
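
The schedule‑aware propagation in the first bullet can be sketched as an earliest‑arrival search: Dijkstra's algorithm in which relaxing an edge means catching a departure at or after the current time. This is a minimal sketch, assuming a toy timetable in place of a parsed GTFS feed. The stop names, the `CONNECTIONS` list, and `earliest_arrival` are hypothetical and not taken from the project.

```python
import heapq
from collections import defaultdict

# Hypothetical toy timetable: (from_stop, to_stop, depart_min, arrive_min).
# A real pipeline would derive these rows from the GTFS stop_times table.
CONNECTIONS = [
    ("A", "B", 0, 5),
    ("A", "B", 10, 15),
    ("B", "C", 6, 12),
    ("B", "C", 16, 22),
    ("A", "C", 3, 30),
]

def earliest_arrival(connections, source, start_time):
    """Return {stop: earliest arrival time} when leaving source at start_time."""
    by_stop = defaultdict(list)
    for frm, to, dep, arr in connections:
        by_stop[frm].append((dep, arr, to))

    best = {source: start_time}
    queue = [(start_time, source)]
    while queue:
        now, stop = heapq.heappop(queue)
        if now > best.get(stop, float("inf")):
            continue  # stale queue entry, a better time was already found
        for dep, arr, to in by_stop[stop]:
            # Only departures we can still catch are usable.
            if dep >= now and arr < best.get(to, float("inf")):
                best[to] = arr
                heapq.heappush(queue, (arr, to))
    return best

arrivals = earliest_arrival(CONNECTIONS, "A", 0)  # {'A': 0, 'B': 5, 'C': 12}
```

Leaving one minute later misses the first A to B train, so the same call with `start_time=1` reaches C at minute 22 instead of 12. That sensitivity is what a fixed travel‑time graph cannot capture.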
Read the write‑up →
Training an LLM Brand Advisor
LoRA fine‑tuning · ONNX quantization · In‑browser inference
Fine‑tuned Gemma 3 1B into a brand‑loyal car advisor using LoRA, then
shipped the quantized model to the browser via ONNX and WebGPU. Training
to deployment in one pipeline.
- LoRA rank 8 on attention matrices, trained on 50 examples
- Response‑only loss masking for focused gradient updates
- 5 GB → 1.3 GB via INT8 quantization, runs in‑browser
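
Response‑only loss masking is commonly done by giving every prompt token a label the loss function ignores, so gradients come only from the response. The sketch below assumes the convention of `-100` as the ignore value, which is the default `ignore_index` of PyTorch's `CrossEntropyLoss`. The token ids and the `mask_prompt_labels` helper are made up for illustration.

```python
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss

def mask_prompt_labels(prompt_ids, response_ids):
    """Build (input_ids, labels) so that only response tokens carry loss."""
    input_ids = list(prompt_ids) + list(response_ids)
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(response_ids)
    return input_ids, labels

# Hypothetical token ids, for illustration only.
input_ids, labels = mask_prompt_labels([101, 7, 42], [55, 56, 2])
```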
Read the write‑up →
A Search Engine for My Own Life
Semantic search · HNSW · Fully offline
Semantic search over years of AI conversations, audio logs, diary entries,
and Discord messages. Sentence‑transformers and HNSW for instant queries.
No data leaves the machine.
- HNSW approximate nearest‑neighbour search via hnswlib
- Configurable chunking per document type
- Incremental indexing: only new content gets processed
Read the write‑up →
An E‑Paper Dashboard for the Hallway
Raspberry Pi · Ollama · 12.48″ e‑ink
An ambient display showing AI‑curated news, gold prices, and my calendar,
powered by a Raspberry Pi running a unified scheduler that orchestrates
RSS collection, embedding‑based filtering, and LLM curation. No backlight,
no notifications.
- News pipeline: RSS → BGE‑M3 embeddings → Ollama curation → HTML
- Central scheduler with per‑task intervals and state persistence
- Deployed via SSH/SFTP with systemd service management
Read the write‑up →
A Podcast Transcription Pipeline
ASR · Speaker diarization · LLM post‑processing
A local pipeline that chains NeMo Parakeet for speech recognition,
pyannote for speaker identification, and Ollama for cleanup. Produces
speaker‑labelled transcripts from any audio file.
- Chunked transcription with Levenshtein‑based overlap merging
- Word‑level timestamps aligned to diarization segments
- LLM post‑processing for speaker naming and correction
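
One way to read "Levenshtein‑based overlap merging": adjacent chunks share a few seconds of audio, so the tail of one transcript roughly repeats the head of the next. The sketch below picks the overlap length with the lowest word‑level edit distance and drops the repeated words. The names `merge_chunks`, `max_overlap`, and `max_ratio` are illustrative assumptions; the project's actual merge rule is not described here.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (here: lists of words)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # substitute or match
            ))
        prev = curr
    return prev[-1]

def merge_chunks(left, right, max_overlap=20, max_ratio=0.5):
    """Join two word lists, dropping the part of `right` that repeats `left`."""
    best_k, best_ratio = 0, max_ratio
    for k in range(1, min(max_overlap, len(left), len(right)) + 1):
        ratio = levenshtein(left[-k:], right[:k]) / k
        # Among equally good candidates, the longer overlap wins.
        if ratio <= best_ratio:
            best_k, best_ratio = k, ratio
    return left + right[best_k:]

left = "the quick brown fox jumps".split()
right = "brown fox jumped over".split()
merged = merge_chunks(left, right)
```

Here the best alignment is three words long ("brown fox jumps" against "brown fox jumped"), so only "over" is appended. A short overlap of common words can match by accident, which is why a real pipeline would likely also use the timestamps.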
Read the write‑up →
Building a Personal AI Newspaper
Embeddings · Vector calibration · LLM curation
An automated news‑curation pipeline that filters RSS feeds using
embedding‑based similarity matching against personal interests, with
careful calibration to handle AND‑combined topic constraints.
- Solved single‑concept vs multi‑concept bias via centroid scoring
- Contrastive lift gating to reject random‑looking matches
- Final curation by Gemini, deployed to a Raspberry Pi
Read the write‑up →
Rating Photos by Aesthetic Quality
AesCLIP · Offline scoring · CLI app
Scores photos 0–10 for aesthetic quality using AesCLIP, entirely offline.
Processes a full photo archive incrementally, persists scores in SQLite,
and includes a slideshow for browsing the best results.
- AesCLIP (ACM MM 2023) with CLIP ViT‑B/16
- Incremental processing: skips unchanged files via size + mtime
- Production CLI with Typer, Rich output, fail‑fast errors
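
The size + mtime check can be sketched with `os.stat` alone. This is a minimal illustration: the `seen` dict stands in for the SQLite table, and `file_signature` and `needs_processing` are hypothetical names.

```python
import os
import tempfile

def file_signature(path):
    """Cheap change detector: (size in bytes, modification time in ns)."""
    st = os.stat(path)
    return (st.st_size, st.st_mtime_ns)

def needs_processing(path, seen):
    """True if `path` is new or its signature differs from the stored one.

    `seen` is a hypothetical {path: signature} dict standing in for SQLite.
    """
    sig = file_signature(path)
    if seen.get(path) == sig:
        return False
    seen[path] = sig
    return True

with tempfile.TemporaryDirectory() as tmp:
    photo = os.path.join(tmp, "photo.jpg")
    with open(photo, "wb") as fh:
        fh.write(b"abc")
    seen = {}
    first = needs_processing(photo, seen)   # new file
    second = needs_processing(photo, seen)  # unchanged since last look
    with open(photo, "ab") as fh:
        fh.write(b"d")                      # size changes
    third = needs_processing(photo, seen)
```

The sketch records the signature before any scoring happens. A real tool would more likely store it only after the file was scored successfully, so that a crash does not mark the file as done.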
Read the write‑up →
Letting an LLM Name My Files
Document automation · Ollama · Metadata extraction
Batch‑renames academic papers and invoices by extracting metadata with a
local LLM. Papers become “[year] [author] - [title]”, invoices
become “[date] [institution] - [title]”.
- Supports PDF, EPUB, XPS, CBZ, and FB2 via PyMuPDF
- TOML‑configurable prompts for different document types
- Duplicate‑aware naming with conflict resolution
Read the write‑up →
Painting With Noise Fields
Stable Diffusion · CLIP · Spatial control · Gradio
Explores what happens when noise in Stable Diffusion carries spatial
meaning, with fixed concepts in some regions and free morphing in others. CLIP
embeddings with spatial masks for per‑location control over generation.
- Three noise dimensions: visual, concept (512‑dim), feature (768‑dim)
- Spatial masks control where concepts are fixed vs. variable
- Gradio interface for interactive mask painting
Read the write‑up →
Growing Mazes From Images
OpenCV · Generative art · SVG
Maze walls that follow the contours of an image. Gradient fields guide
the maze topology so passages flow along edges and textures, producing
mazes that look like they grew from the picture.
- Scharr gradients → angle fields → Floyd‑Steinberg dithering → SVG
- Square and triangle grid support with per‑cell orientation
- Tunable loopiness, branchiness, and dead‑end rate
Read the write‑up →
Mazes as Spectral Graphs
Spectral graph theory · Eigenvalues · Tensor factorization
Applying spectral methods to maze design. The Laplacian eigenvectors
reveal a maze’s fundamental topology, and changing eigenvalues changes
how a maze feels without touching individual walls.
- Normalized Laplacian and spectral embedding for maze topology
- Interactive eigenvalue control via ipywidgets sliders
- Non‑negative PARAFAC tensor factorization for decomposition
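
For reference, the normalized Laplacian of a maze's cell graph is L = I − D^(−1/2) A D^(−1/2), where A is the adjacency matrix and D the diagonal degree matrix. The pure‑Python sketch below builds it for a hypothetical three‑cell corridor. The eigen‑decomposition step (for example with `numpy.linalg.eigh`) is left out to keep the example dependency‑free.

```python
import math

def normalized_laplacian(adj):
    """L = I - D^(-1/2) A D^(-1/2) for an undirected graph without self-loops."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    lap = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j and deg[i] > 0:
                lap[i][j] = 1.0
            elif adj[i][j] and deg[i] > 0 and deg[j] > 0:
                lap[i][j] = -adj[i][j] / math.sqrt(deg[i] * deg[j])
    return lap

# Hypothetical 3-cell corridor: cell 0 - cell 1 - cell 2.
CORRIDOR = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
L = normalized_laplacian(CORRIDOR)
```

A quick sanity check: the vector of square‑rooted degrees, here (1, √2, 1), is always mapped to zero by L, which corresponds to the eigenvalue 0 of a connected graph.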
Read the write‑up →
Removing Music From Audiobooks
YAMNet · Audio classification · Segment removal
Uses Google’s YAMNet to classify audio segments as speech or music,
then surgically removes background music from audiobooks. Also merges
multi‑chapter M4A files with preserved metadata.
- Frame‑by‑frame speech vs. music scoring via TensorFlow Hub
- Minimum segment length filtering to avoid false positives
- FFmpeg normalization pipeline for consistent output
Read the write‑up →
Finding Similar Books With Embeddings
Universal Sentence Encoder · Book similarity · NLP
Embeds entire ebooks into 512‑dimensional vector space using Universal
Sentence Encoder. Similarity search, style matching, categorization, and
duplicate detection across large libraries.
- Samples 120 sentences from mid‑book to capture voice, not front matter
- Median embedding per book for noise‑robust representation
- Gaussian models for genre categorization
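
The "median embedding" idea in a few lines: take the median of each dimension across the sampled sentence vectors, so a handful of odd sentences cannot drag the book vector the way they would with a mean. The vectors below are tiny made‑up stand‑ins for the 512‑dimensional encoder output.

```python
from statistics import median

def median_embedding(sentence_vectors):
    """Per-dimension median of equally sized sentence vectors."""
    return [median(dim) for dim in zip(*sentence_vectors)]

# Three hypothetical 3-dim sentence vectors; the last one is an outlier.
vectors = [
    [0.1, 0.2, 0.3],
    [0.2, 0.2, 0.4],
    [9.0, -5.0, 0.5],
]
book_vector = median_embedding(vectors)  # [0.2, 0.2, 0.4]
```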
Read the write‑up →
Sequences That Look Random to Humans
Cognitive science · Algorithm design · Perception
True randomness feels wrong to humans; we see patterns in noise. This
algorithm generates sequences that actively avoid the structures our
brains detect: mirrors, runs, and repetitions.
- Exponential penalties with distance‑based mitigation
- Tuned per pattern type: repetition, mirror, monotonic run
- Output rated as “more random” than actual random sequences
Read the write‑up →
Teaching a Network to See Past Dithering
CNN · Retro game restoration · Custom training pipeline
Built a CNN to convert 4‑bit dithered textures from the retro game
Per.Oxyd back to full 24‑bit colour. Custom training data from
Floyd‑Steinberg dithering, with augmentation to handle checkerboard
shadow artifacts.
- ESPCN‑based architecture, 5 conv layers, trained on 2,500 photos
- Checkerboard shadow augmentation to prevent false de‑dithering
- 30 dB PSNR, 63 hours of documented iteration
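
Training pairs for de‑dithering need a dithered copy of each clean photo. Below is a minimal single‑channel Floyd‑Steinberg sketch in pure Python. It is a simplification: it assumes greyscale values in [0, 1] and evenly spaced grey levels, whereas the project targets 4‑bit colour textures.

```python
def floyd_steinberg(gray, levels=2):
    """Dither a 2-D list of floats in [0, 1] down to `levels` grey values."""
    h, w = len(gray), len(gray[0])
    img = [row[:] for row in gray]  # work on a copy
    step = 1.0 / (levels - 1)
    for y in range(h):
        for x in range(w):
            old = img[y][x]
            # Snap to the nearest allowed level, clamped to [0, 1].
            new = min(1.0, max(0.0, round(old / step) * step))
            img[y][x] = new
            err = old - new
            # Standard Floyd-Steinberg weights: 7/16, 3/16, 5/16, 1/16.
            if x + 1 < w:
                img[y][x + 1] += err * 7 / 16
            if y + 1 < h:
                if x > 0:
                    img[y + 1][x - 1] += err * 3 / 16
                img[y + 1][x] += err * 5 / 16
                if x + 1 < w:
                    img[y + 1][x + 1] += err * 1 / 16
    return img

flat_grey = [[0.5] * 8 for _ in range(8)]
dithered = floyd_steinberg(flat_grey)  # every pixel becomes 0.0 or 1.0
```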
Read the write‑up →
Real‑Time Video Upscaling With ESPCN
Super‑resolution · PyTorch · Sub‑pixel convolution
A PyTorch implementation of the ESPCN super‑resolution network, upscaling
low‑resolution video without the smeared look of bicubic interpolation.
Image and video input with standard benchmark evaluation.
- Sub‑pixel convolution moves upscaling to the end for efficiency
- Solved YCbCr colour shift by switching to RGB training
- ~31.5 dB PSNR on Set5, Set14, BSD100
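
The sub‑pixel step itself is only a rearrangement: the last convolution outputs C·r² low‑resolution feature maps, and each group of r² maps is interleaved into one image that is r times larger. The pure‑Python sketch below follows the same index convention as `torch.nn.PixelShuffle`. The function name and the toy input are illustrative.

```python
def pixel_shuffle(channels, r):
    """Rearrange C*r*r maps of size HxW into C maps of size (H*r)x(W*r)."""
    c_out = len(channels) // (r * r)
    h, w = len(channels[0]), len(channels[0][0])
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for y in range(h * r):
            for x in range(w * r):
                # The sub-pixel offset (y % r, x % r) selects the source map.
                src = c * r * r + (y % r) * r + (x % r)
                out[c][y][x] = channels[src][y // r][x // r]
    return out

# Four 1x2 feature maps become one 2x4 image when r = 2.
maps = [[[0, 1]], [[10, 11]], [[20, 21]], [[30, 31]]]
image = pixel_shuffle(maps, 2)
# image == [[[0, 10, 1, 11], [20, 30, 21, 31]]]
```

Because all convolutions run at the low resolution and only this shuffle touches the full‑size output, the cost stays low enough for video.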
Read the write‑up →
Simulating Grapheme‑Colour Synesthesia
Colour science · Cognitive perception · PDF generation
A simulation of what text looks like to a grapheme‑colour synesthete. Each
letter has a fixed colour, but neighbouring letters blend and influence
each other through Gaussian‑weighted fields, modelling the perceptual
experience, not just the mapping.
- Cassidy Curtis colour mapping with Gaussian neighbour influence
- HSL polar mixing, so hues blend around the colour wheel
- Outputs coloured PDF via ReportLab
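
A sketch of how Gaussian neighbour influence and polar hue mixing might combine: each letter's hue is treated as a unit vector on the colour wheel, neighbours are weighted by a Gaussian of their distance in letters, and the blended hue is the angle of the weighted sum. The function name, the `sigma` parameter, and the example hues are assumptions for illustration, not the project's actual mapping.

```python
import math

def blend_hues(hues, index, sigma=1.0):
    """Hue in degrees at `index` after Gaussian-weighted mixing with neighbours."""
    x = y = 0.0
    for j, hue in enumerate(hues):
        weight = math.exp(-((j - index) ** 2) / (2 * sigma ** 2))
        # Treat each hue as a unit vector on the colour wheel.
        x += weight * math.cos(math.radians(hue))
        y += weight * math.sin(math.radians(hue))
    return math.degrees(math.atan2(y, x)) % 360

# A red-orange letter (10 degrees) between two magenta-red ones (350 degrees).
blended = blend_hues([350, 10, 350], index=1)
```

Mixing as vectors keeps the result near red (just under 360°). A plain arithmetic mean of 350, 10, and 350 would land in the blue range, on the opposite side of the wheel.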
Read the write‑up →
Reverse Engineering an E‑Bike Controller
PIC Assembly · Firmware analysis · Embedded systems
Disassembled and analyzed PIC16F946 firmware from a TranzX e‑bike motor
controller. Traced motor PWM, battery monitoring, pedal‑assist logic, and
display communication across multiple firmware versions.
- Multiple firmware hex dumps analyzed (SS606, SU806, AT8U3)
- Created instruction set and register reference tables
- Hardware analysis with EAGLE schematics
Read the write‑up →