Documentation

v0.7.0Last updated March 7, 2026See changelog

Getting Started

1

Extract Pages

Run the Python extraction script to convert the Voynich Manuscript PDF into individual high-res page images.

2

Initialize Database

Set up the SQLite database with page metadata, seed hypotheses, and prepare the analysis pipeline.

3

Start Exploring

Launch the web app to browse pages, record findings, track hypotheses, and run AI analyses.

Quick Start
cd processing && python extract_pages.py /path/to/manuscript.pdfpython init_db.pycd ../app && npm install && npm run dev

AI Pipeline

Visual Analysis

Claude vision API analyzes page imagery — illustrations, layouts, glyph patterns.

Glyph Extraction

Segment and catalog individual glyphs from manuscript pages for frequency analysis.

Pattern Recognition

Statistical analysis of glyph sequences, word boundaries, and section correlations.

Technical Reference

REST API

Endpoints for pages, images, hypotheses, findings, and analyses with field-level PATCH.

Database Schema

SQLite (dev) / PostgreSQL (prod) with tables for pages, hypotheses, findings, analyses, annotations.

CLI Tools

Python scripts for PDF extraction, database initialization, and batch processing.

Architecture

Python writes to SQLite + filesystem (data processing, extraction, analysis)
Next.js reads from SQLite + filesystem, writes only findings/annotations via API
Never store image blobs in SQLite — always as files

Need help or want to contribute?

This project is an experimental AI-powered analysis of the Voynich Manuscript (MS 408, Beinecke Rare Book & Manuscript Library, Yale University).