Brain-V Dashboard
An autonomous cognitive architecture attempting to decipher the Voynich Manuscript (Beinecke MS 408, c. 1404-1438) through statistical analysis, hypothesis generation, and iterative testing.
Corpus Profile
Three mutually reinforcing findings: _.oii encodes psychoactive plant content
Brain-V has a cluster of three hypotheses that lock together: (1) _.oii fires 4.84x more on plant folios than non-plant herbal folios (held-out 5-fold validated), (2) within plant folios, _.oii fires 8.14x more on psychoactive plants than non-psychoactive (pre-registered, confirmed), (3) no volvelle variant tested can reproduce the plant-folio enrichment. Together these three results constitute Brain-V's first defensible evidence that EVA encodes botanical content visible in the illustrations.
_.oii marks plant-identified folios
_.oii further enriched on psychoactive plants
No volvelle architecture reproduces the plant enrichment
Structural argument
Under the volvelle hypothesis, plant folios and non-plant herbal folios draw from the SAME section cartridge. Expected enrichment is ~1.0x by construction. Across 500 null runs spanning three architectures, max observed is 2.80x. Real is 5.04x. The finding is structurally incompatible with section-level volvelle mechanics, independent of rate calibration.
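The comparison of a real enrichment ratio against a set of null runs can be sketched as follows. This is a minimal illustration, not Brain-V's pipeline: the function names are hypothetical, and the null distribution here is a synthetic placeholder scattered around 1.0 rather than the 500 volvelle runs.

```python
import random

def enrichment(hits_a, tokens_a, hits_b, tokens_b):
    """Ratio of per-token firing rates: group A (e.g. plant folios) over group B."""
    return (hits_a / tokens_a) / (hits_b / tokens_b)

def empirical_p(observed, null_ratios):
    """One-sided empirical p-value: share of null runs at least as large as observed.

    The +1 terms keep the estimate from reporting an impossible p of exactly zero.
    """
    exceed = sum(1 for r in null_ratios if r >= observed)
    return (exceed + 1) / (len(null_ratios) + 1)

# Placeholder null distribution (NOT Brain-V's runs): 500 ratios centred near 1.0.
rng = random.Random(0)
null_ratios = [rng.lognormvariate(0.0, 0.25) for _ in range(500)]

p_value = empirical_p(5.04, null_ratios)
print(f"max null = {max(null_ratios):.2f}, empirical p = {p_value:.4f}")
```

With 500 null runs and none reaching the observed ratio, the smallest p-value this estimator can report is 1/501, so the resolution of the test is set by the number of null runs.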
Remaining escape hatches
- Non-volvelle mechanical process (Markov chain, HMM) - untested
- Sherwood's plant-ID labels correlated with some other folio-level text property - untested
- Counter-intuitive: folio-level cartridges produce LOWER enrichment than section-level because independence averages signal to 1.0x - the volvelle family is structurally ruled out
Strategic note
This is Brain-V's strongest positive finding. Unlike H-BV-VOWEL-01 (which fell to H-BV-VOLVELLE-01 trivially), H-BV-PLANT-01 survives three volvelle variants at p<0.01 each. Combined with H-BV-PSYCHOACTIVE-01 as a pre-registered sub-class confirmation, Brain-V has evidence that EVA encodes botanical content at a resolution deeper than mechanism signature alone.
Coverage as a decipherment metric is dead
Null model: 1,300 phonotactically plausible nonsense skeletons drawn from the corpus's own bigram-Markov distribution and size-matched to the Voynichese word-length distribution, evaluated over 20 independent trials.
| System | Coverage | Δ vs null floor |
|---|---|---|
| Brady Syriac (1,334, claimed) | 86.90% | +3.34pp |
| Null nonsense (1,300) | 83.56% | — |
| Schechter Latin (4,063) | 82.81% | -0.75pp |
| Hebrew medieval (1,300) | 57.90% | -25.66pp |
Implication. Nonsense skeletons that match corpus word-length and bigram distributions achieve 83.6% coverage with sigma=1pp over 20 trials. Any abjad-reducible lexicon of 1,000-4,000 entries will hit ~80% on the Voynich corpus by virtue of length and bigram structure alone. Brady sits 3.3pp above the null floor; Schechter sits below it. Only Hebrew is statistically distinct from random, and it is 25pp worse. Coverage-based 'decipherment' claims require a z-score against a corpus-derived null, not a random-alphabet null.
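A corpus-derived null of this kind, and the z-score computed against it, might look roughly like the sketch below. Everything here is an assumption for illustration: the helper names, the sentinel symbols, the toy word list, and the choice to sample whole words rather than consonant skeletons. The coverage function itself is not shown because the source does not define the matching rule.

```python
import random
from collections import defaultdict
from statistics import mean, stdev

START, END = "^", "$"  # sentinel symbols, assumed absent from the transliteration

def train_bigrams(words):
    """Map each glyph to the list of glyphs observed to follow it."""
    table = defaultdict(list)
    for w in words:
        chars = [START, *w, END]
        for a, b in zip(chars, chars[1:]):
            table[a].append(b)
    return table

def sample_word(table, rng, max_len=12):
    """Walk the bigram chain from START until END is drawn or max_len is reached."""
    out, cur = [], START
    while len(out) < max_len:
        cur = rng.choice(table[cur])
        if cur == END:
            break
        out.append(cur)
    return "".join(out)

def null_lexicon(words, size, seed=0, max_tries=100_000):
    """Collect up to `size` distinct nonsense words, length-matched to the corpus."""
    rng = random.Random(seed)
    table = train_bigrams(words)
    lengths = [len(w) for w in words]
    lexicon = set()
    target = rng.choice(lengths)  # length drawn from the corpus distribution
    for _ in range(max_tries):
        if len(lexicon) >= size:
            break
        candidate = sample_word(table, rng)
        if len(candidate) == target:
            lexicon.add(candidate)
            target = rng.choice(lengths)  # move on to the next target length
    return sorted(lexicon)

def coverage_z(observed, null_coverages):
    """z-score of an observed coverage against the null trials' mean and spread."""
    return (observed - mean(null_coverages)) / stdev(null_coverages)

# Toy word list for illustration only, not the EVA corpus.
toy_corpus = ["chedy", "chody", "okedy", "okeedy", "daiin", "qokeedy"]
print(null_lexicon(toy_corpus, size=5))
```

In use, each of the 20 trials would build one null lexicon with a different seed, score its coverage, and pass the resulting list to `coverage_z` alongside the claimed system's coverage.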
Vowel-section decode FAILS held-out validation
Four folios (f101r, f89r2 pharma; f78r, f82r bio) were removed entirely from training. Rules and the naive-Bayes classifier were re-derived on the remaining corpus. Every held-out token was predicted from its vowel pattern alone and scored against the ground-truth section label.
Target: pharmaceutical (prevalence 44.3%)
| method | prec | rec | F1 |
|---|---|---|---|
| always-predict | 0.443 | 1.000 | 0.614 |
| rule table | 0.827 | 0.146 | 0.248 |
| naive Bayes | 0.673 | 0.229 | 0.341 |
Signal does not survive: best F1 below always-predict baseline.
Target: biological (prevalence 55.7%)
| method | prec | rec | F1 |
|---|---|---|---|
| always-predict | 0.557 | 1.000 | 0.716 |
| rule table | 0.568 | 0.782 | 0.658 |
| naive Bayes | 0.655 | 0.600 | 0.626 |
Signal does not survive: best F1 below always-predict baseline.
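The baseline comparison behind both verdicts is plain arithmetic and can be checked with a short sketch. The function names are illustrative; the inputs are the precision, recall, and prevalence values from the tables above.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def always_predict_f1(prevalence):
    """F1 of labelling every token as the target: precision = prevalence, recall = 1."""
    return f1(prevalence, 1.0)

def survives(best_f1, prevalence):
    """A method carries signal only if it beats the always-predict baseline."""
    return best_f1 > always_predict_f1(prevalence)

# Pharmaceutical target: rule table and naive Bayes versus the baseline.
print(round(f1(0.827, 0.146), 3), round(f1(0.673, 0.229), 3))
print(round(always_predict_f1(0.443), 3))
print(survives(max(f1(0.827, 0.146), f1(0.673, 0.229)), 0.443))
```

Because always-predict has recall 1.0, its F1 rises with prevalence, so a high-precision but low-recall rule can lose to it even when the rule is usually right when it fires.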
Per-folio NB hit rate
Held-out pharmaceutical folios are misclassified as biological at a rate of ~75% or higher. Multi-class NB accuracy is 43.6% against a majority baseline of 55.7%.
Interpretation. The 100% agreement on _.eo -> pharmaceutical in training (8 skeletons) was an in-sample artefact. On held-out pharmaceutical folios, the classifier confidently predicts biological for ~75% of tokens. What survives: rule precision 0.827 on pharma when the rule fires — a real but sparse signal covering 40% of tokens. H-BV-VOWEL-CODE-01 demoted 0.75 -> 0.35. Aggregate chi-square coupling (H-BV-VOWEL-01) is unaffected; what failed is per-token prediction, not distribution-level coupling.
Why this is published. Brain-V publishes this negative result instead of suppressing it. The coverage-game critique (H-BV-NULL-01) still stands; the vowel layer still has aggregate signal; but the per-token decoding path is not the right frontier. Future work: sparse high-precision mappings, non-vowel structural features (glyph-role combos, line-position interactions), or image-label targets instead of section-metadata targets.
EVA vowels encode section-linked information in 79% of testable skeleton groups
For every consonant skeleton with >=3 vowel variants and >=100 tokens, a chi-square test is run on the distribution of variants across the 8 manuscript sections. The critical value at p<0.01 is taken from the test's degrees of freedom.
Case: skeleton 'kdy' — Brady's chedy vs chody case (§3.10). Same consonant skeleton, different vowel pattern.
| variant | total | biological | recipes | herbal | zodiac |
|---|---|---|---|---|---|
| chedy | 501 | 181 | 199 | 62 | 4 |
| chdy | 140 | 17 | 40 | 52 | 3 |
| okedy | 116 | 41 | 31 | 22 | 3 |
| okeedy | 108 | 36 | 47 | 15 | 4 |
| chody | 88 | 0 | 23 | 43 | 0 |
chi² = 262.2, df = 28, critical at p<0.01 = 50.9 → 5.15× over threshold
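A chi-square test of independence on a variant-by-section table can be sketched as below. The function is a generic textbook implementation, not Brain-V's code, and the demo uses only the four section columns shown above, so it will not reproduce the reported statistic, which is computed over all 8 sections. For 5 variants and 8 sections the degrees of freedom are (5-1) × (8-1) = 28, matching the reported df.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table.

    Assumes every row and column total is non-zero; an all-zero column would
    make the expected count zero and must be dropped beforehand.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Partial table: biological, recipes, herbal, zodiac columns only (4 of 8 sections).
partial = [
    [181, 199, 62, 4],  # chedy
    [17, 40, 52, 3],    # chdy
    [41, 31, 22, 3],    # okedy
    [36, 47, 15, 4],    # okeedy
    [0, 23, 43, 0],     # chody
]
stat, df = chi_square(partial)
print(f"chi2 = {stat:.1f} on df = {df} (4-section subset only)")
```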
Top-5 skeletons by vowel-section coupling strength
Implication. EVA vowels are not padding. Vowel choice within a fixed consonant frame correlates with section (herbal vs biological vs recipes vs zodiac) strongly enough to reject the independence null at p<0.01 in 55 of 70 testable skeleton groups. This is a language-independent structural property of Voynichese. Any future decipherment that treats EVA vowels as noise, or as free positional slots, is discarding information that the manuscript demonstrably encodes.
Three-Lexicon Comparison
2026-04-15. Three independent decipherment lexicons (Latin, Syriac, Hebrew) were run through Brain-V's honest pipeline against the full EVA corpus. All three fail the shuffle test on word-order syntax.
| Lexicon | Entries | Coverage | conn→content Δ | both-matched Δ |
|---|---|---|---|---|
| Schechter Latin/Occitan | 4,063 | 82.8% | — | +0.0000 |
| Brady Syriac (proxy, 71 terms) | 71 | 48.2% | -0.0098 | +0.0149 |
| Hebrew medieval medical | 1,300 | 57.9% | -0.0144 | +0.0281 |
Currier B > A across all three lexicons
Three independent methodologies, same direction. Currier A structurally resists lexical matching.
H-BRADY-02 confirmed: gallows are paragraph markers
Brain-V's first independent verification of an external Voynich structural claim.
Verdict: Three independent lexicons from three language families produce (a) varying but substantial coverage (48-83%), (b) zero connector->content word-order signal under shuffle test, (c) reproducible Currier B>A asymmetry (+3-8pp). Lexicon-based methods establish thematic clustering, not decipherment. Currier A appears structurally distinct from B in a way that resists lexical matching regardless of source language.
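One way a shuffle test on connector->content order could work is sketched below. This is an assumed reading of the test, not Brain-V's implementation: the adjacency rate, the function names, and the toy tokens are all hypothetical. The idea is that if word order carries syntax, the real rate of a content word following a connector should differ from the rate after the tokens are shuffled.

```python
import random

def conn_to_content_rate(tokens, connectors, content):
    """Share of connector tokens that are immediately followed by a content token."""
    hits = total = 0
    for current, following in zip(tokens, tokens[1:]):
        if current in connectors:
            total += 1
            hits += following in content
    return hits / total if total else 0.0

def shuffle_delta(tokens, connectors, content, n_shuffles=200, seed=0):
    """Real rate minus the mean rate over shuffled copies of the same tokens."""
    rng = random.Random(seed)
    real = conn_to_content_rate(tokens, connectors, content)
    pool = list(tokens)
    null_rates = []
    for _ in range(n_shuffles):
        rng.shuffle(pool)  # destroys word order, keeps word frequencies
        null_rates.append(conn_to_content_rate(pool, connectors, content))
    return real - sum(null_rates) / len(null_rates)

# Toy sequence with strict connector->content ordering (illustration only).
ordered = ["et", "herba"] * 50
print(shuffle_delta(ordered, {"et"}, {"herba"}))
```

A delta near zero, as in the table above, means the matched words sit in an order no more structured than a random permutation of the same words.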
Leading Hypotheses
The manuscript's herbal section uses a combination of substitution and transposition ciphers, which would explain the higher entropy levels compared to other sections.
The Voynich text encodes a natural language using a null-cipher or homophones, where multiple glyphs map to the same plaintext character, which would explain the high hapax ratio (70.1%) and lower glyph entropy (3.86 bits) relative to expected natural language entropy while preserving Zipf-like structure.
The Currier A/B split reflects two different scribal hands encoding the same underlying language with different but related cipher alphabets, such that glyph-level bigram transition matrices in A and B sections are structurally isomorphic under a permutation mapping.
The zodiac and astronomical sections use a systematically different word-order encoding than herbal and recipes sections, reflecting a positional transposition cipher layer applied on top of substitution, detectable as reduced local bigram predictability at section boundaries relative to within-section transitions.
The high hapax ratio (70.1%) is partially artifactual, caused by consistent scribal abbreviation or word-compounding conventions where morphological suffixes are concatenated inconsistently, such that word-final glyph sequences 'y', 'n', 'l', 'r' function as detachable morphological markers — splitting words on these terminals would reduce unique vocabulary by at least 25%.
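The last hypothesis names a concrete measurement, which might be sketched as follows. The splitting rule here (detach a single final glyph from words longer than one glyph, and count the detached markers as vocabulary items too) is an assumption; the hypothesis does not specify the exact procedure, and the word list is a toy example.

```python
def split_terminal(word, terminals):
    """Split off a word-final terminal glyph; return (stem, marker or None)."""
    if len(word) > 1 and word[-1] in terminals:
        return word[:-1], word[-1]
    return word, None

def vocab_reduction(words, terminals=frozenset("ynlr")):
    """Fractional drop in unique types after splitting words on terminal glyphs."""
    original = set(words)
    split_types = set()
    for word in original:
        stem, marker = split_terminal(word, terminals)
        split_types.add(stem)
        if marker is not None:
            split_types.add(marker)  # the detached marker is itself a type
    return 1.0 - len(split_types) / len(original)

# Toy vocabulary: six stems, each attested with all four terminals.
stems = ["ched", "chod", "oked", "okeed", "daii", "qoke"]
toy_words = [stem + t for stem in stems for t in "ynlr"]
print(f"{vocab_reduction(toy_words):.1%}")
```

The hypothesis would pass under this reading if the value computed on the real corpus is at least 0.25.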
Current Beliefs
How Brain-V Works
Perceive
Parse the EVA transliteration (Zandbergen ZL3b). Compute glyph frequencies, entropy, Zipf fit, positional constraints, Currier A/B statistics across all 226 folios.
Predict
Generate testable hypotheses about the manuscript's cipher, language, and structure. Each hypothesis specifies the exact statistical test that would confirm or deny it.
Score
Run each test against the corpus. Update confidence scores. Eliminate hypotheses that fail. Promote those that pass. Log everything on-chain via AgentProof.
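Two of the corpus statistics named in the Perceive step, glyph entropy and hapax ratio, can be sketched as below. This is a simplified illustration under stated assumptions: each EVA character is treated as one glyph (multi-character glyphs such as ch or sh would need a tokenizer), and the hapax ratio is taken as hapax types over all word types, which may differ from Brain-V's definition.

```python
import math
from collections import Counter

def glyph_entropy(words):
    """Shannon entropy in bits per glyph over all characters in the word list."""
    counts = Counter(ch for word in words for ch in word)
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def hapax_ratio(words):
    """Share of word types that occur exactly once in the token list."""
    counts = Counter(words)
    hapax = sum(1 for c in counts.values() if c == 1)
    return hapax / len(counts)

# Toy tokens for illustration only, not the ZL3b transliteration.
tokens = ["daiin", "chedy", "daiin", "okeedy", "chody"]
print(f"entropy = {glyph_entropy(tokens):.2f} bits, hapax ratio = {hapax_ratio(tokens):.2f}")
```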