Changelog
New features, improvements, and fixes for Voynich AI.
Data Quality & Enrichment PRD (Hard Gate)
new: 2,049-line spec establishing data foundation as HARD GATE for all compute. 9 phases: IVTFF re-parse with full metadata (quire, hand, paragraph markers), Stolfi slot tagging, folio mapping audit, Takahashi vs Landini cross-validation, standardized 31-token EVA tokenization, AI glyph cleanup, schema migration, 16 automated validation checks, 9 export formats.
Infrastructure & Compute PRD
new: 2,970-line spec for the compute layer. Modal as primary GPU provider (serverless, pay-per-second), RunPod as fallback. PostgreSQL-based job queue, VoynichJob Python base class, 6 Next.js API routes, Docker (CPU + GPU), cost optimization within the $2,000 cap, structured monitoring with budget hard caps.
Four Implementation PRDs for Agent Execution
new: 6,110 lines of detailed specs across 4 PRDs: Cryptanalytic Attacks v2.0 (8 attacks, 6 sprints, DB schema), AI/ML Visual Research (6 tracks: DINOv2, diffusion morphing, CLIP, cipher fingerprinting, synthetic pre-training, LLM patterns), Post-Attack Decipherment (3 conditional paths: Naibbe recovery, grammar extraction, decompression), Web App Attack Dashboard (7 pages, 10 chart components, 7 API routes). All with checkboxes, pseudocode, and success criteria for autonomous execution.
Research Page & Cryptanalytic Framework
new: Comprehensive literature review and original cryptanalytic framework. Constraint elimination table proving h2=1.84 eliminates simple substitution, polyalphabetic, and homophonic ciphers. Crib catalog with 5 plaintext sources (zodiac labels, paragraph formulas). 8-attack execution plan with success/failure criteria. Three surviving hypotheses with falsifiable predictions. Also included: 20 decipherment attempts, 14 AI prior art entries, 11 approaches, and a Datasets & Tools catalog.
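A minimal sketch of how a figure like h2 could be computed, assuming h2 here denotes the conditional entropy of a character given the preceding character, measured in bits. The function name and the choice to treat the input as one flat string are illustrative assumptions, not the framework's actual code; how spaces and line breaks are handled will change the result.

```python
import math
from collections import Counter


def conditional_entropy_h2(text: str) -> float:
    """Return H(next char | previous char) in bits for a flat string.

    Assumption: every character in `text` (including any separators the
    caller leaves in) counts as a symbol.
    """
    if len(text) < 2:
        return 0.0
    bigrams = Counter(zip(text, text[1:]))   # counts of (previous, next) pairs
    firsts = Counter(text[:-1])              # counts of each 'previous' character
    total = len(text) - 1                    # number of bigrams
    h = 0.0
    for (prev, _nxt), n in bigrams.items():
        p_pair = n / total                   # joint probability of the pair
        p_next_given_prev = n / firsts[prev]
        h -= p_pair * math.log2(p_next_given_prev)
    return h
```

A fully predictable sequence such as "abababab" yields 0 bits, since each character determines the next.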
Expert Transcription (IVTFF)
new: Imported 162,755 expert-transcribed characters from Takahashi IVTFF transcription across 225 folios. Page viewer now shows line-by-line EVA text with digraph-aware character frequency analysis.
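One way digraph-aware counting could work is a greedy longest-match tokenizer, sketched below. The unit list (`ch`, `sh`), the use of `.` as the word separator, and both function names are assumptions made for illustration; the actual analysis may use a different inventory.

```python
from collections import Counter

# Assumed multi-character EVA units that should count as one glyph.
EVA_UNITS = ("ch", "sh")


def tokenize_eva(word, units=EVA_UNITS):
    """Split an EVA word into glyph tokens, preferring the longest unit."""
    ordered = sorted(units, key=len, reverse=True)
    tokens = []
    i = 0
    while i < len(word):
        for unit in ordered:
            if word.startswith(unit, i):
                tokens.append(unit)
                i += len(unit)
                break
        else:
            # No multi-character unit starts here: emit a single character.
            tokens.append(word[i])
            i += 1
    return tokens


def glyph_frequencies(line, separator="."):
    """Count glyph tokens across all words of one transcription line."""
    counts = Counter()
    for word in line.split(separator):
        counts.update(tokenize_eva(word))
    return counts
```

With this sketch, `tokenize_eva("chedy")` gives `["ch", "e", "d", "y"]`, so `c` and `h` are not counted separately.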
Updated Pipeline Status
improved: Dashboard now reflects actual progress: glyph catalog, expert transcription, and literature review all complete. Statistical analysis phase active.
Glyph Catalog
new: Browse 17,672 extracted glyphs organized by EVA character. Paginated image grid with click-to-enlarge modal, frequency bars, and page/line/word location metadata.
Glyph Cropping Pipeline
new: Python pipeline crops individual glyph images from full-res manuscript pages, organizes by character, and uploads to R2. 17,672 images processed.
AI Glyph Extraction (M1)
new: Claude Vision API extracts individual glyphs from Yale scans with bounding boxes. Overlay toggle shows glyph locations on manuscript images.
Image Loading Fixes
fixed: Fixed placeholder lines overlaying images, added retry with jitter for R2 rate limiting, fixed trailing slash 404s.
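The retry-with-jitter idea can be sketched as exponential backoff with "full jitter", where each wait is a random fraction of a growing cap. This is an illustration in Python, not the app's actual code; `RateLimited`, `retry_with_jitter`, and all parameter defaults are hypothetical.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical error raised when storage answers with a rate-limit response."""


def retry_with_jitter(fetch, max_attempts=5, base_delay=0.5, max_delay=8.0,
                      sleep=time.sleep, rng=random.random):
    """Call fetch() until it succeeds or max_attempts is exhausted.

    The wait before retry k is rng() * min(max_delay, base_delay * 2**k),
    so concurrent clients spread out instead of retrying in lockstep.
    """
    for attempt in range(max_attempts):
        try:
            return fetch()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(rng() * cap)
```

Passing `sleep` and `rng` as parameters keeps the function easy to test without real waiting.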
Cloudflare R2 Storage
new: 176 Yale manuscript scans uploaded to R2. Image serving from R2 in production, filesystem fallback for local dev.
Research PRD & Data Pipeline
new: IVTFF parser, calibration baselines, reference corpora downloader, Phase 1 DB migration with 9 new tables and Bayesian prior seeding.
Complete UI Redesign
improved: Ported Lovable-generated design system to Next.js. Warm light theme with parchment background, terracotta accent. DM Sans, Inter, and JetBrains Mono typography. Collapsible sidebar with Lucide icons and mobile bottom nav.
High-Resolution Yale Scans
new: Replaced low-res PDF extractions (~550px) with 176 Yale Beinecke Library scans from Internet Archive (~3000x4000px JPG). 1.1GB of high-res manuscript imagery.
Full-Viewport Page Viewer
improved: Page viewer now fills the entire screen height. Split view with tabbed detail panel (Info, Analyses, Findings). Breadcrumb navigation and zoom controls.
Hypotheses Accordion
improved: Expandable hypothesis rows with chevrons, confidence bars, falsifiability criteria, and evidence sections.
Section Filter Pills
new: Page browser now has colored section filter pills (botanical, astronomical, biological, pharmaceutical, recipes, text) with search.
Dual Database Backend
new: PostgreSQL for production (via DATABASE_URL) and SQLite for local development. All queries are now async, with auto-migration on first connect.
Railway Deployment
new: Standalone output mode for Railway container deployment. Node 20+ enforced via engines field and nixpacks.toml.
Railway Build Fix
fixed: Nixpacks defaulted to Node 18, for which better-sqlite3 had no prebuilt binaries. Fixed by requiring Node 20+ and making SQLite optional.
PDF Page Extraction
new: Extract all pages from the Voynich Manuscript PDF at 300 DPI as individual PNGs. Page metadata stored in JSON. Python script with PyMuPDF.
Page Browser
new: Next.js web app with sidebar navigation, page grid with section filtering, individual page viewer with zoom controls (0.25x-4x), and prev/next navigation.
Hypothesis Tracker
new: 6 initial hypotheses seeded with descriptions and falsifiability criteria. Status tracking, confidence scores, and evidence recording.
Findings Log
new: Per-page observation recording with title, description, and finding type. Linked to hypotheses with a global findings view.
SQLite Database
new: Tables for pages (176 records), hypotheses (6 seeded), findings, analyses, and annotations. WAL mode enabled for concurrent access.
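For reference, WAL mode is switched on with a single pragma. The sketch below uses Python's standard `sqlite3` module purely as an illustration; the helper name `open_wal_db` is hypothetical and the app itself may enable WAL through a different driver.

```python
import sqlite3


def open_wal_db(path):
    """Open a file-backed SQLite database and request write-ahead logging.

    Returns the connection and the journal mode SQLite reports back.
    WAL only applies to file-backed databases, not in-memory ones.
    """
    conn = sqlite3.connect(path)
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    return conn, mode
```

WAL lets readers proceed while a single writer appends to the log, which is what makes it a good fit for a page viewer that reads far more often than it writes.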
REST API
new: Endpoints for pages (GET, PATCH), images (GET with path traversal protection), hypotheses (GET), and findings (GET, POST).
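Path traversal protection for an image endpoint usually comes down to resolving the requested path and checking that it still lies under the image root. The following is a sketch of that check in Python under that assumption, not the endpoint's actual implementation; `resolve_image_path` and its parameters are hypothetical names.

```python
from pathlib import Path


def resolve_image_path(image_root, requested):
    """Return the absolute path for `requested`, or raise if it escapes the root.

    Both paths are resolved first so that '..' segments, absolute paths,
    and symlinks cannot point outside `image_root`.
    """
    root = Path(image_root).resolve()
    candidate = (root / requested).resolve()
    try:
        candidate.relative_to(root)
    except ValueError:
        raise ValueError(f"path escapes image root: {requested!r}") from None
    return candidate
```

A request such as `../secret.txt` or an absolute path resolves outside the root and is rejected before any file is opened.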