Hypothesis Board

213 total hypotheses. 162 active, 24 eliminated, 7 parked.

Active (162)

H001 0.99

The manuscript's herbal section uses a combination of substitution and transposition ciphers, which would explain the higher entropy levels compared to other sections.

cipher active 2026-04-13

Last test: Section entropy std dev: 0.0593. Range: 3.7149 - 3.9016. Variable across sections.

H020 0.99

The Voynich text encodes a natural language using a null-cipher or homophones, where multiple glyphs map to the same plaintext character, which would explain the high hapax ratio (70.1%) and lower glyph entropy (3.86 bits) relative to expected natural language entropy while preserving Zipf-like structure.

cipher active 2026-04-13

Against: Hapax ratio of 70.1% is anomalously high even for homophonic cipher; natural language hapax rates rarely exceed 50%

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.
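The two statistics this entry relies on can be sketched as follows. This is a minimal illustration, not the board's own test code: it assumes each character of a transliterated word is one glyph, and it assumes the hapax ratio is measured over distinct word types (the board does not state which denominator it uses). The function names and the toy word list are hypothetical.

```python
import math
from collections import Counter

def glyph_entropy(words):
    """Shannon entropy in bits per glyph over all glyphs in the word list."""
    counts = Counter(g for w in words for g in w)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def hapax_ratio(words):
    """Share of distinct word types that occur exactly once (assumed definition)."""
    counts = Counter(words)
    hapax = sum(1 for c in counts.values() if c == 1)
    return hapax / len(counts)

# Toy EVA-like tokens for illustration only, not real corpus data
sample = ["daiin", "daiin", "chol", "shedy"]
print(round(glyph_entropy(sample), 4), round(hapax_ratio(sample), 4))
```

If the board instead divides hapax count by total tokens, only the denominator in hapax_ratio changes.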

H021 0.99

The Currier A/B split reflects two different scribal hands encoding the same underlying language with different but related cipher alphabets, such that glyph-level bigram transition matrices in A and B sections are structurally isomorphic under a permutation mapping.

cipher active 2026-04-13

Against: If isomorphic, combined vocabulary should show lower hapax ratio than either section alone; this needs verification

Last test: Language A: entropy=3.8321, 3513 unique/11022 total words. Language B: entropy=3.8611, 5089 unique/23766 total words. Entropy difference: 0.0290. Voca

H022 0.99

The zodiac and astronomical sections use a systematically different word-order encoding than herbal and recipes sections, reflecting a positional transposition cipher layer applied on top of substitution, detectable as reduced local bigram predictability at section boundaries relative to within-section transitions.

structural active 2026-04-13

Against: Lower entropy in zodiac/astronomical could reflect domain-specific vocabulary repetition (star names, month labels) rather than cipher difference

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.

H023 0.99

The high hapax ratio (70.1%) is partially artifactual, caused by consistent scribal abbreviation or word-compounding conventions where morphological suffixes are concatenated inconsistently, such that word-final glyph sequences 'y', 'n', 'l', 'r' function as detachable morphological markers; splitting words on these terminals would reduce unique vocabulary by at least 25%.

language active 2026-04-13

Against: Simple positional rules already noted (belief 0.116) may already account for terminal glyph skew

Last test: Observed Zipf exponent: 0.8946 (R²=0.9084). Natural language expected: ~1.0. Fit quality: moderate.
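The 25% prediction in this entry can be checked directly against a word list. The sketch below is a simplification under stated assumptions: it removes a single final glyph from the set y, n, l, r (rather than modelling real morphology), treats one character as one glyph, and uses hypothetical function names.

```python
TERMINALS = ("y", "n", "l", "r")  # terminal glyphs named in the hypothesis

def strip_terminal(word, terminals=TERMINALS):
    """Drop one final glyph if it is a listed terminal and a stem remains."""
    if len(word) > 1 and word.endswith(terminals):
        return word[:-1]
    return word

def vocab_reduction(words):
    """Fractional drop in unique vocabulary after terminal stripping."""
    before = len(set(words))
    after = len({strip_terminal(w) for w in words})
    return (before - after) / before

# Toy tokens for illustration only
print(vocab_reduction(["chol", "cho", "dain", "dai", "shed"]))
```

The hypothesis would be supported on this measure if vocab_reduction on the full corpus came out at 0.25 or higher.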

H025 0.99

The Currier A/B split reflects two distinct scribal hands encoding the same language using different glyph-frequency profiles, such that the top-10 word overlap between A and B sections is less than 40%, and the word-initial glyph distributions of A and B differ by Jensen-Shannon divergence > 0.15

cipher active 2026-04-13

Against: If the same cipher is used with different keys, vocabulary overlap could still be low without reflecting language difference

Last test: Language A: entropy=3.8321, 3513 unique/11022 total words. Language B: entropy=3.8611, 5089 unique/23766 total words. Entropy difference: 0.0290. Voca
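Both thresholds in this entry (top-10 overlap below 40%, Jensen-Shannon divergence above 0.15) are straightforward to compute once the A and B word lists are separated. The sketch below assumes base-2 logarithms, so the divergence lies in [0, 1]; the board does not state which base its 0.15 threshold refers to. Function names are hypothetical.

```python
import math
from collections import Counter

def initial_glyph_dist(words):
    """Relative frequency of each word-initial glyph."""
    counts = Counter(w[0] for w in words if w)
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()}

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits (0 = identical, 1 = disjoint support)."""
    keys = set(p) | set(q)
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a):
        # KL divergence of a from the mixture m, skipping zero-probability terms
        return sum(a[k] * math.log2(a[k] / m[k]) for k in a if a[k] > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)

def top_n_overlap(words_a, words_b, n=10):
    """Fraction of the n most frequent words shared by both lists."""
    top_a = {w for w, _ in Counter(words_a).most_common(n)}
    top_b = {w for w, _ in Counter(words_b).most_common(n)}
    return len(top_a & top_b) / n
```

A call such as js_divergence(initial_glyph_dist(words_a), initial_glyph_dist(words_b)) would then be compared against the 0.15 threshold.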

H027 0.99

The text-only section (entropy 3.9016 bits, 7 folios) represents the closest approximation to natural language and encodes a different register or genre than the herbal/zodiac sections, such that its bigram entropy is statistically distinguishable (p < 0.05) from all other sections and its type-token ratio is highest among all sections

structural active 2026-04-13

Against: Only 7 folios; small sample may inflate entropy estimate

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.

H029 0.99

The zodiac section (entropy 3.7149 bits, lowest of all sections) uses a label-oriented encoding in which most word tokens are proper nouns or short nominal forms, producing a word-length distribution significantly shorter than the herbal section (mean word length < 4.5 vs herbal > 5.0) and a lower bigram entropy reflecting repetitive naming conventions

structural active 2026-04-13

Against: Low entropy could reflect a different cipher key rather than a different register

Last test: Observed avg word length: 5.03. Expected: ~5.0. Difference: 0.03.

H033 0.99

The text-only section (entropy 3.9016 bits, highest of all sections) represents unenciphered or lightly enciphered text in the underlying language, and its word frequency distribution should most closely match expected distributions for medieval Latin or northern Italian compared to all other sections.

language active 2026-04-13

Against: Even 3.9016 bits remains below Latin/Italian reference values, so some encoding may still be present

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.

H040 0.99

The Currier A/B split corresponds to two different scribal hands applying the same underlying cipher to the same language but with different glyph allograph preferences, predictable via bigram transition matrix divergence between A and B corpora exceeding random variation.

cipher active 2026-04-13

Against: H004/H008 at 0.88 confidence argue genuine linguistic difference

Last test: Language A: entropy=3.8321, 3513 unique/11022 total words. Language B: entropy=3.8611, 5089 unique/23766 total words. Entropy difference: 0.0290. Voca

H043 0.99

The text-only section (entropy 3.9016 bits, highest of all sections) contains unenciphered or minimally enciphered natural language prose, distinguishable from other sections by significantly higher conditional entropy H(glyph | preceding glyph) that approaches values for unenciphered medieval Latin or Italian.

cipher active 2026-04-13

Against: Small sample (2,349 words / 7 folios) limits statistical power

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.
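The quantity this entry actually predicts is conditional entropy H(glyph | preceding glyph), which differs from the unigram glyph entropy reported in the last test. A minimal sketch, assuming one character per glyph and counting bigrams within words only (no transitions across word boundaries); the function name is hypothetical:

```python
import math
from collections import Counter

def conditional_entropy(words):
    """H(glyph | preceding glyph) in bits, using within-word bigrams only."""
    pair_counts = Counter()
    prev_counts = Counter()
    for w in words:
        for a, b in zip(w, w[1:]):
            pair_counts[(a, b)] += 1
            prev_counts[a] += 1
    total = sum(pair_counts.values())
    h = 0.0
    for (a, b), c in pair_counts.items():
        # p(a, b) * log2 p(b | a), accumulated with a minus sign
        h -= (c / total) * math.log2(c / prev_counts[a])
    return h

print(conditional_entropy(["ab", "ac"]))
```

Running it per section would show whether the text-only section stands apart on this measure, which the unigram figure alone cannot establish.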

H047 0.99

The Currier A/B scribal split represents two different scribes encoding the same underlying language using two distinct but related glyph substitution tables (i.e., a two-key polyalphabetic or two-alphabet substitution), such that the same plaintext phoneme maps to different glyphs in A vs. B. This is testable by checking whether word-length distributions and initial/final glyph frequencies in A and B are statistically compatible with a common underlying vocabulary after glyph remapping.

cipher active 2026-04-13

Against: A/B split could reflect two different dialects or languages rather than two cipher keys on one language

Last test: Language A: entropy=3.8321, 3513 unique/11022 total words. Language B: entropy=3.8611, 5089 unique/23766 total words. Entropy difference: 0.0290. Voca

H052 0.98

The zodiac section's anomalously low entropy (3.7149 bits, lowest of all sections) is caused by a label-encoding convention in which each label token is drawn from a closed vocabulary of fewer than 150 distinct words, and the type-token ratio for zodiac-section words is significantly lower than for any prose section (herbal, recipes, text-only).

structural active 2026-04-13

Against: Low entropy could also result from a polyalphabetic cipher that cycles with short period in label contexts

Last test: Observed avg word length: 5.03. Expected: ~5.0. Difference: 0.03.

H032 0.98

The Currier A/B dialect split reflects two scribal hands encoding the same underlying language using different homophonic substitution tables, where Currier A substitutes fewer glyphs per phoneme than Currier B, explaining B's larger word count (23,766 vs 11,022) via greater glyph variety per syllable.

cipher active 2026-04-13

Against: Glyph entropy (3.8627) is below what a dense homophonic cipher would typically produce (homophones increase entropy toward the theoretical maximum)

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.

H037 0.98

The zodiac and label-heavy sections (zodiac entropy 3.7149 bits, astronomical 3.7471 bits) use a homophonic substitution scheme in which common plaintext letters map to multiple glyphs, suppressing entropy relative to the herbal and text-only sections, while the text-only section (entropy 3.9016 bits) uses a simpler monoalphabetic or unenciphered encoding.

cipher active 2026-04-13

Against: Homophonic substitution should push entropy toward the maximum for the glyph set; lower entropy in label sections is more consistent with reduced vocabulary, not homophones

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.

H046 0.98

The zodiac section's anomalously low entropy (3.7149 bits) and label-heavy structure indicate it uses a homophonic cipher with a reduced symbol set (fewer active glyphs per label), while the text-only section's high entropy (3.9016 bits) reflects unenciphered or lightly enciphered natural language prose, meaning these two sections require different decipherment strategies.

cipher active 2026-04-13

Against: Section entropy differences are modest (~0.19 bits range), possibly explained by topic-specific vocabulary alone

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.

H003 0.98

The manuscript's text structure reflects a mix of prose and verse, with the herbal section being primarily composed of short, rhyming couplets.

structural active 2026-04-13

Against: Word length distribution is similar across all sections

Last test: Observed avg word length: 5.03. Expected: ~5.0. Difference: 0.03.

H036 0.98

The Currier A/B split encodes two different plaintext languages (e.g., Latin in sections assigned to A, Italian or an Italian dialect in sections assigned to B), detectable by different glyph-level entropy and different word-initial/final bigram profiles between A and B corpora.

language active 2026-04-13

Against: Overall glyph entropy 3.8627 sits below both Latin and Italian benchmarks, making a clean two-language split harder to reconcile without invoking encoding overhead

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.

H053 0.98

The text-only section's elevated entropy (3.9016 bits, highest of all sections) reflects unenciphered or lightly enciphered natural-language prose, and its word bigram entropy is significantly higher than that of the zodiac and astronomical sections, consistent with natural syntactic variation rather than label repetition.

language active 2026-04-13

Against: Higher entropy could reflect a more complex cipher layer rather than less encoding

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.

H004 0.98

The Currier A/B split reflects a genuine linguistic difference between two distinct languages or dialects.

language active 2026-04-13

Against: Glyph positional constraints do not differ significantly between the two languages

Last test: Language A: entropy=3.8321, 3513 unique/11022 total words. Language B: entropy=3.8611, 5089 unique/23766 total words. Entropy difference: 0.0290. Voca

H038 0.98

Word-initial glyph constraints (o, c, q, s, d) and word-final glyph constraints (y, n, l, r, o) are not phonological but structural tokens marking word boundaries or grammatical roles (e.g., prefixes and suffixes encoding case or tense), analogous to a morpheme-boundary cipher layered on top of a root encoding.

structural active 2026-04-13

Against: Natural languages also show strong positional constraints (e.g., English words rarely start with certain consonant clusters), so constraints alone do not distinguish cipher structure from phonology

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.

H049 0.98

The recipes section (25 folios, 11,611 words, entropy 3.8586 bits; second-highest entropy and largest word count) encodes a list-structured text in which recurring syntactic templates produce predictable word-position bigrams, such that positional word-order entropy (entropy of word_n given word_{n-1} within a folio) is significantly lower in recipes than in the herbal or text-only sections.

structural active 2026-04-13

Against: Without knowing the cipher, word-level positional entropy is confounded by vocabulary domain effects

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.

H051 0.98

The Currier A/B scribal split encodes the same underlying language but with two distinct homophonic substitution tables of different sizes: Currier A uses a smaller homophone set (lower token count, 11,022 words) with higher per-glyph entropy, while Currier B uses a larger homophone set producing lower per-glyph entropy, such that the weighted average glyph entropy of A-folios exceeds that of B-folios by at least 0.05 bits.

cipher active 2026-04-13

Against: Entropy difference between sections may reflect content type (label-heavy zodiac vs. prose recipes) rather than table size

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.

H014 0.97

The Currier A/B split reflects a genuine linguistic difference between two distinct languages or dialects.

language active 2026-04-13

Against: No clear evidence of language shift or dialectical variation

Last test: Language A: entropy=3.8321, 3513 unique/11022 total words. Language B: entropy=3.8611, 5089 unique/23766 total words. Entropy difference: 0.0290. Voca

H008 0.97

The Currier A/B split reflects a genuine linguistic difference between two distinct languages or dialects.

language active 2026-04-13

Against: Other sections do not show similar patterns

Last test: Language A: entropy=3.8321, 3513 unique/11022 total words. Language B: entropy=3.8611, 5089 unique/23766 total words. Entropy difference: 0.0290. Voca

H031 0.97

The zodiac section (entropy 3.7149 bits, lowest of all sections) uses a label-only encoding scheme where each word is a proper name or fixed label rather than running prose, producing a vocabulary distribution that deviates significantly from Zipf's law relative to other sections.

structural active 2026-04-13

Against: If purely labels, word diversity should be very low, but the section still contributes to the overall 8,261-word unique vocabulary

Last test: Observed Zipf exponent: 0.8946 (R²=0.9084). Natural language expected: ~1.0. Fit quality: moderate.
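The last test reports a corpus-wide Zipf exponent, whereas this entry predicts a section-level deviation, so the fit would need to be repeated per section. A minimal sketch of one common way to estimate the exponent, by ordinary least squares on the log-log rank-frequency curve; the board does not say which fitting method produced its 0.8946 figure, so this method and the function names are assumptions.

```python
import math
from collections import Counter

def zipf_fit(frequencies):
    """Least-squares fit of log(freq) = a - s*log(rank); returns (s, r_squared).

    Needs at least two ranks with differing frequencies.
    """
    freqs = sorted(frequencies, reverse=True)
    xs = [math.log(r) for r in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    syy = sum((y - my) ** 2 for y in ys)
    slope = sxy / sxx
    r_squared = (sxy * sxy) / (sxx * syy)
    return -slope, r_squared

def word_zipf(words):
    """Zipf exponent and R^2 for the word-frequency distribution of a word list."""
    return zipf_fit(Counter(words).values())
```

Least squares on log-log data weights the long tail of rare words heavily, so exponents from this method are best compared only with other exponents obtained the same way.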

H048 0.97

The dominant word-initial glyph constraint (o, c, q, s, d account for most word starts) reflects a cipher rule in which a fixed 'onset marker' glyph class must precede the root (analogous to a mandatory null prefix) rather than reflecting the initial phoneme distribution of the underlying language, which would produce a flatter initial-glyph distribution matching Latin or Italian onset frequencies.

cipher active 2026-04-13

Against: Some natural languages do have strong onset preferences; the underlying language could simply have restricted onset phonotactics

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.

H056 0.97

The Currier A/B split reflects two scribes encoding the same underlying language using two different cipher keys or glyph assignments, not two different languages or dialects, such that word-position glyph statistics are structurally identical across A and B but the specific glyph inventories differ.

cipher active 2026-04-13

Against: H004/H008 (confidence 0.808/0.742) favor genuine linguistic difference between A and B

Last test: Language A: entropy=3.8321, 3513 unique/11022 total words. Language B: entropy=3.8611, 5089 unique/23766 total words. Entropy difference: 0.0290. Voca

H011 0.97

The manuscript's text structure reflects a mix of prose and verse, with the herbal section being primarily composed of short, rhyming couplets.

structural active 2026-04-13

Against: No clear evidence of rhyme or meter in other sections

Last test: Observed avg word length: 5.03. Expected: ~5.0. Difference: 0.03.

H044 0.97

The dominance of word-initial glyphs o, c, q, s, d and word-final glyphs y, n, l, r, o reflects a Vigenere-style polyalphabetic cipher in which cipher-alphabet assignment is positionally determined within the word (position 1, position-final), producing artificial initial/final glyph constraints that do not reflect underlying language phonotactics.

cipher active 2026-04-13

Against: Vigenere ciphers typically produce entropy closer to the key-alphabet entropy; observed 3.86 bits is lower than expected for a strong polyalphabetic cipher

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.

H016 0.97

The Currier A/B split reflects a genuine linguistic difference between two distinct languages or dialects.

language active 2026-04-13

Last test: Language A: entropy=3.8321, 3513 unique/11022 total words. Language B: entropy=3.8611, 5089 unique/23766 total words. Entropy difference: 0.0290. Voca

H042 0.96

The zodiac section's anomalously low entropy (3.7149 bits, lowest of all sections) reflects a label-only encoding where Voynich words are one-to-one mappings to a closed set of astrological or calendrical terms (month names, zodiac signs, star names), rather than running prose, making it the most tractable section for frequency-matching decipherment.

language active 2026-04-13

Against: Label-only encoding would predict even lower type-token ratio than observed

Last test: Observed Zipf exponent: 0.8946 (R²=0.9084). Natural language expected: ~1.0. Fit quality: moderate.

H055 0.96

The Voynich text uses a homophonic substitution cipher on medieval Latin or Italian, where multiple distinct glyphs encode the same plaintext letter, which would artificially inflate the unique-word count and produce the observed hapax ratio of 70.1%.

cipher active 2026-04-13

Against: With only 25 unique glyphs, the homophone table would be very small, limiting cipher strength

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.

H058 0.96

The word-initial glyph constraints {o,c,q,s,d} and word-final glyph constraints {y,n,l,r,o} are artifacts of a cipher that maps plaintext syllable-onset and syllable-coda phonemes to distinct glyph classes, not morphological or grammatical constraints, meaning that these positional biases are cipher-structural rather than language-structural.

cipher active 2026-04-13

Against: Natural languages with alphabetic writing also show word-initial and word-final biases (e.g., Latin words rarely end in b, d, g)

Last test: Observed avg word length: 5.03. Expected: ~5.0. Difference: 0.03.

H064 0.96

The text-only section (7 folios, entropy 3.9016 bits, highest of all sections) represents unenciphered or minimally enciphered running text in a natural language, while all other sections apply an additional layer of encoding (transposition, null insertion, or homophone expansion) that suppresses entropy, meaning the text-only section should serve as the primary decipherment anchor.

cipher active 2026-04-13

Against: Text-only section is only 7 folios (2,349 words), too small for robust statistical conclusions

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.

H059 0.96

The text-only section's elevated entropy (3.9016 bits, highest of all sections) and the recipes section's second-highest entropy (3.8586 bits) reflect unenciphered or lightly enciphered running prose in the underlying natural language, while the diagrammatic sections (zodiac 3.7149, astronomical 3.7471, pharmaceutical 3.7772) reflect a more heavily enciphered or formulaic register, meaning entropy difference between sections correlates with cipher strength, not genre.

cipher active 2026-04-13

Against: Genre differences alone could explain entropy variation: prose naturally has higher per-glyph entropy than labels and formulaic lists

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.

H060 0.96

The Voynich text uses a homophonic substitution cipher on medieval Latin where 25 glyphs encode approximately 18-20 Latin phonemes, with 2-4 glyphs mapping to each high-frequency Latin phoneme (e, a, i, o, t, n), which would explain the observed entropy of 3.8627 bits: lower than plain Latin (~4.0) but higher than a monoalphabetic cipher.

cipher active 2026-04-13

Against: Hapax ratio of 70.1% is far higher than expected for Latin text under homophonic substitution (~15-25%)

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.

H063 0.96

The zodiac and astronomical sections (entropy 3.7149 and 3.7471 bits respectively, lowest of all sections) encode a label-and-numeral system where recurring short words function as positional labels (month names, star names, degree markers) rather than running prose, and these labels follow a fixed template grammar with ≤3 syntactic slots, explaining low entropy through high repetition of a small closed vocabulary.

structural active 2026-04-13

Against: If labels follow fixed templates, type-token ratio should be very low in zodiac, but overall hapax ratio remains high corpus-wide

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.

H076 0.96

The zodiac section's anomalously low glyph entropy (3.7149 bits vs. corpus mean 3.8627 bits) is fully explained by a restricted label vocabulary: the section predominantly contains short repeated labels for month names, zodiac figures, and positional markers rather than running text. This predicts that the zodiac section's type-token ratio and word length distribution differ significantly from prose sections.

structural active 2026-04-14

Against: Low entropy could also result from a different cipher key applied to the same underlying language rather than a restricted vocabulary

Last test: Observed avg word length: 5.03. Expected: ~5.0. Difference: 0.03.

H041 0.96

The extremely high hapax ratio (70.1%) is produced by a systematic suffix-stripping or abbreviation convention in which a small set of word-final glyph sequences (e.g., -aiin, -dy, -ol) are optionally dropped, making many hapax legomena morphological variants of ~30% of the vocabulary rather than distinct words.

structural active 2026-04-13

Against: High hapax ratio could reflect homophones or null cipher rather than morphology

Last test: Observed Zipf exponent: 0.8946 (R²=0.9084). Natural language expected: ~1.0. Fit quality: moderate.

H024 0.95

The text-only section (7 folios, entropy 3.9016 bits, highest of all sections) represents unenciphered or minimally-enciphered natural language, while all other sections apply an additional cipher layer, making text-only the most direct window into the underlying language and the optimal starting point for statistical language identification.

cipher active 2026-04-13

Against: 7 folios and 2,349 words is a small sample; entropy estimate has high variance

Last test: Section entropy std dev: 0.0593. Range: 3.7149 - 3.9016. Variable across sections.

H057 0.95

The zodiac section's anomalously low entropy (3.7149 bits, lowest of all sections) is caused by a label-repetition encoding scheme in which a small set of high-frequency label words are systematically repeated around circular diagrams, and the section's word-frequency distribution is therefore best fit by a truncated Zipf distribution rather than a full Zipf law.

structural active 2026-04-13

Against: Without section-level word-frequency breakdown it is not confirmed that top words are disproportionately concentrated in zodiac

Last test: Observed Zipf exponent: 0.8946 (R²=0.9084). Natural language expected: ~1.0. Fit quality: moderate.

H050 0.95

The extremely high hapax ratio (70.1%) is produced by a systematic null-suffix appended to a smaller core vocabulary: each base word receives one of a small set of suffixes (e.g., -y, -in, -ain, -dy, -edy) to produce surface tokens, meaning the effective vocabulary after stripping common suffixes would shrink to roughly 2,000-2,500 unique roots with a Zipf exponent closer to 1.0.

cipher active 2026-04-13

Against: If suffixes were purely null/decorative, word-internal glyph entropy should drop after stripping; not yet verified

Last test: Observed Zipf exponent: 0.8946 (R²=0.9084). Natural language expected: ~1.0. Fit quality: moderate.

H062 0.95

The Currier A/B split represents two different scribes encoding the same underlying Latin text using the same cipher but with different personal orthographic conventions: specifically, Scribe A preferentially uses one set of homophone variants while Scribe B uses a complementary set, causing the apparent 'dialect' difference to be a cipher-level artifact rather than a linguistic one.

cipher active 2026-04-13

Against: Belief H004 (0.837) and H008 (0.781) strongly support genuine linguistic difference between A and B

Last test: Language A: entropy=3.8321, 3513 unique/11022 total words. Language B: entropy=3.8611, 5089 unique/23766 total words. Entropy difference: 0.0290. Voca

H071 0.94

The Currier A/B split encodes two different plaintext languages (e.g., Latin in Currier A sections, early Italian or Occitan in Currier B sections), with each language using the same substitution cipher key but differing in phonological inventory, producing measurable differences in bigram transition probabilities between the two corpora.

language active 2026-04-14

Against: Same cipher key assumption is not established; A/B differences could reflect two different keys on one language

Last test: Language A: entropy=3.8321, 3513 unique/11022 total words. Language B: entropy=3.8611, 5089 unique/23766 total words. Entropy difference: 0.0290. Voca

H069 0.94

The text-only section (entropy 3.9016 bits, highest of all sections) represents unenciphered or minimally enciphered natural language prose, while lower-entropy sections use a homophonic expansion layer that artificially reduces entropy by replacing single plaintext glyphs with multiple ciphertext glyphs.

cipher active 2026-04-14

Against: If text-only were unenciphered, word boundaries and word forms should resemble known medieval languages more closely

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.

H066 0.94

The Currier A/B split encodes two different scribal dialects of the same underlying language (not two different languages), where Scribe A uses a slightly different homophonic key than Scribe B, producing measurable differences in glyph bigram entropy between the two sub-corpora rather than differences in underlying vocabulary.

cipher active 2026-04-14

Against: Current belief [0.65] also supports two different plaintext languages; the evidence is genuinely ambiguous

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.

H075 0.93

The Currier A and B sub-corpora encode the same underlying plaintext language using two different but structurally related substitution alphabets (i.e., same cipher family, different keys). This would produce similar Zipf exponents and bigram entropy profiles within each sub-corpus but systematically divergent glyph-level frequency distributions between them.

cipher active 2026-04-14

Against: The Currier A/B split could alternatively reflect two different source languages rather than two keys on the same language

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.

H078 0.92

The text-only section's elevated entropy (3.9016 bits, highest of all sections) combined with its small size (7 folios, 2,349 words) reflects less-compressed or less-enciphered natural prose, possibly a section where the scribe applied fewer homophones or abbreviations. This predicts that the text-only section's word frequency distribution follows a Zipf law more closely (R-squared closer to 1.0) than the cipher-heavy sections.

cipher active 2026-04-14

Against: The text-only section has only 2,349 words; small sample size reduces reliability of section-level Zipf fit estimates

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.

H-BV-NULL-01 0.92

Coverage >=80% on the Voynich corpus is achievable by 1,300 phonotactically plausible nonsense skeletons (20-trial mean 83.56%, sigma 1.03pp) and therefore cannot, alone, be evidence of decipherment. Any abjad-reducible lexicon of 1,000-4,000 entries matching corpus word-length and bigram distributions clears this floor by construction.

null active 2026-04-15

Against: Brady's character-permutation z=3.83 is a different null (alphabet-mapping, not key-selection) and survives this critique on its own terms

Last test: Publishable negative result. Challenges every coverage-based Voynich decipherment claim in the literature. Validated with 20 independent trials on 1,3

H-BV-Q-01 0.92

EVA 'q' is a categorical word-initial marker, appearing word-initial in 98.9% of its 5,416 corpus occurrences. This is the strongest positional constraint of any EVA glyph, supporting Brady's H-BRADY-03 (q maps to Syriac waw / wa- conjunction) as a structural claim independent of language identification.

structural active 2026-04-15

Against: Position constraint doesn't specify phonetic value

Last test: Very strong positional signal. Future decipherment attempts should treat 'q' as a structural prefix/marker, not as a free phonetic consonant.
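The positional figure quoted in this entry is a simple ratio: occurrences of the glyph in word-initial position divided by all its occurrences. A minimal sketch, assuming one character per glyph in an EVA-style transliteration; the function name and toy tokens are hypothetical.

```python
def initial_rate(words, glyph="q"):
    """Fraction of a glyph's occurrences that fall in word-initial position."""
    total = sum(w.count(glyph) for w in words)
    initial = sum(1 for w in words if w.startswith(glyph))
    return initial / total if total else 0.0

# Toy tokens for illustration only, not real corpus data
print(initial_rate(["qokeedy", "qol", "oqo"]))
```

Running the same function over every glyph in the inventory would let the "strongest positional constraint of any EVA glyph" claim be checked directly.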

H081 0.92

The Currier A and B sub-corpora encode the same underlying plaintext language using two different but systematically related cipher alphabets (a digraphic or keyed-variant cipher), such that glyph bigram transition probabilities in A and B are permutations of each other rather than independent distributions.

cipher active 2026-04-14

Against: Currier A and B show vocabulary differences that go beyond simple glyph substitution, with distinct word forms appearing preferentially in each sub-corpus

Last test: Language A: entropy=3.8321, 3513 unique/11022 total words. Language B: entropy=3.8611, 5089 unique/23766 total words. Entropy difference: 0.0290. Voca

H092 0.92

Currier A and Currier B use two structurally distinct cipher tables (not merely two scribal hands) encoding the same underlying language: the rank-order frequency distribution of glyphs in Currier A and Currier B should show high Spearman rank correlation (> 0.85) if encoding the same language, but the specific high-frequency glyphs in each sub-corpus should differ, indicating a key rotation between the two tables.

cipher active 2026-04-14

Against: Belief [0.288] supports two different plaintext languages as an alternative explanation for the A/B split

Last test: Language A: entropy=3.8321, 3513 unique/11022 total words. Language B: entropy=3.8611, 5089 unique/23766 total words. Entropy difference: 0.0290. Voca
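The Spearman criterion in this entry (rank correlation above 0.85) is not covered by the entropy comparison in the last test. A minimal standard-library sketch, assuming one character per glyph and ranking each glyph by its raw count in each sub-corpus, with tied counts sharing an average rank. Function names are hypothetical.

```python
from collections import Counter

def average_ranks(values):
    """Ranks 1..n in ascending order, ties sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(xs, ys):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def glyph_spearman(words_a, words_b):
    """Spearman rank correlation of glyph counts between two word lists."""
    ca = Counter(g for w in words_a for g in w)
    cb = Counter(g for w in words_b for g in w)
    glyphs = sorted(set(ca) | set(cb))
    fa = [ca[g] for g in glyphs]
    fb = [cb[g] for g in glyphs]
    return pearson(average_ranks(fa), average_ranks(fb))
```

Spearman correlation is the Pearson correlation of the ranks, which is why the sketch reuses pearson on the ranked counts.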

H115 0.92

The Currier A and B sub-corpora encode the same underlying language but use two structurally distinct homophonic cipher tables, each mapping plaintext letters to different glyph sets, which explains why the A/B vocabularies share Zipf-law structure but differ in high-frequency word forms.

cipher active 2026-04-15

Against: If the tables are truly distinct, cross-corpus bigram transitions should show near-zero overlap, but some overlap is observed

Last test: Language A: entropy=3.8321, 3513 unique/11022 total words. Language B: entropy=3.8611, 5089 unique/23766 total words. Entropy difference: 0.0290. Voca

H-BV-EXTERNAL-NULL-010.90

Preparation route (external topical application vs internal ingestion) does NOT explain the _.oii vowel-pattern fire rate on plant folios. External-classified folios (n=24) and internal-classified folios (n=54) have essentially identical mean _.oii rates (0.561% vs 0.557%, ratio 1.01x). One-tailed Welch's t-test p = 0.494, Cohen's d = 0.003, Mann-Whitney U p = 0.341. Bootstrap 95% CI on difference [-0.0046, +0.0048] straddles zero.

nullactive2026-04-15

Against: Classification is keyword-heuristic from Sherwood notes, not expert ethnobotanical labels

Last test: p = 0.494 (far above 0.05). d = 0.003. Hypothesis refuted cleanly.
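
The effect-size and confidence-interval figures above can be reproduced in outline with the sketch below. It assumes two plain lists of per-folio rates (external vs internal) and uses only the standard library, so it returns the Welch t statistic and degrees of freedom but not a p-value. The function names, the resample count, and the seed are illustrative choices.

```python
import random
from statistics import mean, variance


def welch_t(x, y):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    vx, vy = variance(x) / len(x), variance(y) / len(y)
    t = (mean(x) - mean(y)) / (vx + vy) ** 0.5
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


def cohens_d(x, y):
    """Cohen's d using the pooled sample standard deviation."""
    nx, ny = len(x), len(y)
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / pooled ** 0.5


def bootstrap_ci(x, y, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the difference in means (x minus y)."""
    rng = random.Random(seed)
    diffs = sorted(
        mean(rng.choices(x, k=len(x))) - mean(rng.choices(y, k=len(y)))
        for _ in range(n_boot)
    )
    lo = diffs[int(n_boot * alpha / 2)]
    hi = diffs[int(n_boot * (1 - alpha / 2)) - 1]
    return lo, hi
```

An interval that contains zero, together with a d near zero, matches the pattern reported for this null result.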

H0740.90

The extremely high hapax ratio (70.1%) is an artifact of systematic suffix variation: scribes appended variable word-final glyphs from the set {y,n,l,r,o} as inflectional or abbreviation markers, inflating apparent vocabulary size. Stripping these final glyphs should collapse unique word count toward natural-language hapax ratios (~40-50%).

cipheractive2026-04-14

Against: Natural inflected languages (Latin, Italian) typically show hapax ratios of 40-55%, so 70.1% requires strong additional explanation beyond inflection alone

Last test: Observed Zipf exponent: 0.8946 (R²=0.9084). Natural language expected: ~1.0. Fit quality: moderate.
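
The stem-stripping prediction can be checked with a sketch like the one below. It assumes the hapax ratio is the share of word types that occur exactly once (the board does not define it), that words are strings with one character per glyph, and that single-glyph words are left unstripped. The names are hypothetical.

```python
from collections import Counter

# Word-final glyph set named in the hypothesis.
SUFFIX_GLYPHS = set("ynlro")


def hapax_ratio(words):
    """Share of word types that occur exactly once (assumed definition)."""
    counts = Counter(words)
    hapaxes = sum(1 for c in counts.values() if c == 1)
    return hapaxes / len(counts)


def strip_final(word, suffixes=SUFFIX_GLYPHS):
    """Drop one word-final suffix glyph; single-glyph words stay as they are."""
    if len(word) > 1 and word[-1] in suffixes:
        return word[:-1]
    return word


def hapax_before_after(words):
    """Hapax ratio on surface forms and on suffix-stripped stems."""
    stems = [strip_final(w) for w in words]
    return hapax_ratio(words), hapax_ratio(stems)
```

The hypothesis predicts that the second value falls toward roughly 40-50% on the real corpus; a small drop would count against it.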

H0850.90

The Currier A and B sub-corpora use two distinct cipher alphabets encoding the same underlying Latin plaintext, such that glyph-frequency distributions are statistically different between A and B but word-length distributions and entropy levels are statistically indistinguishable.

cipheractive2026-04-14

Against: Belief [0.339] at moderate confidence that A/B encodes two different plaintext languages (e.g., Latin vs. Italian)

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.
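
For reference, a glyph entropy figure of this kind is a first-order Shannon entropy over glyph frequencies. The sketch below assumes one character per glyph in the transliteration, which is a simplification (multi-character units such as bench gallows would need tokenizing first). The function name is hypothetical.

```python
import math
from collections import Counter


def glyph_entropy(words):
    """First-order Shannon entropy, in bits per glyph, over all glyph tokens."""
    counts = Counter(g for w in words for g in w)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

Computing this separately for Currier A and Currier B gives the entropy half of the comparison proposed in this hypothesis.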

H0900.90

The 70.1% hapax ratio is substantially artifactual: stripping word-final glyphs drawn from the set {y, n, l, r, o} normalizes words to a stem form, reducing the unique vocabulary to roughly 30-40% of current count and the hapax ratio to below 45%, consistent with a systematic suffixation or abbreviation cipher.

cipheractive2026-04-14

Against: Some word-final variation may encode genuine semantic distinctions rather than inflectional endings

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.

H0150.90

The manuscript's text structure reflects a mix of prose and verse, with the herbal section being primarily composed of short, rhyming couplets.

structuralactive2026-04-13

Last test: Section entropy std dev: 0.0593. Range: 3.7149 - 3.9016. Variable across sections.

H0190.90

The text uses a combination of substitution and transposition ciphers in the herbal section.

cipheractive2026-04-13

Last test: Section entropy std dev: 0.0593. Range: 3.7149 - 3.9016. Variable across sections.

H0060.90

The manuscript's text structure reflects a mix of prose and verse, with the herbal section being primarily composed of short, rhyming couplets.

structuralactive2026-04-13

Against: Other sections do not show similar patterns

Last test: Section entropy std dev: 0.0593. Range: 3.7149 - 3.9016. Variable across sections.

H0950.90

Currier A and Currier B use two structurally distinct cipher alphabets encoding the same underlying Latin plaintext: the two sub-corpora should exhibit the same bigram entropy and the same Zipf exponent once glyph-level correspondences are remapped

cipheractive2026-04-14

Against: If Zipf exponents diverge significantly between A and B after remapping, the same-language assumption is undermined

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.

H0700.89

The extremely high hapax ratio (70.1%) is produced by a systematic suffix-agglutination cipher mechanism: a small closed vocabulary of roots (~500-800 types) is combined with a set of suffix glyphs drawn from the word-final constraint set {y, n, l, r, o}, generating surface forms that appear unique but share common roots.

cipheractive2026-04-14

Against: True suffix agglutination in natural languages typically produces lower hapax ratios than 70.1%

Last test: Observed Zipf exponent: 0.8946 (R²=0.9084). Natural language expected: ~1.0. Fit quality: moderate.
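
A Zipf exponent and R² like those in the last test are commonly obtained from a least-squares line on log rank versus log frequency. The sketch below assumes that method; the board does not state which fitting procedure it uses, and the function name is hypothetical.

```python
import math
from collections import Counter


def zipf_fit(words):
    """Least-squares fit of log(frequency) on log(rank).

    Returns (exponent, r_squared), where exponent is the negated slope.
    """
    freqs = sorted(Counter(words).values(), reverse=True)
    xs = [math.log(r) for r in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    syy = sum((y - my) ** 2 for y in ys)
    slope = sxy / sxx
    return -slope, sxy ** 2 / (sxx * syy)
```

Running it on surface forms and again on candidate roots would show whether collapsing suffixes moves the exponent, which is one way to probe the closed-root-vocabulary claim.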

H0930.88

The recipes section (25 folios, 11,611 words, entropy 3.8586 bits) encodes running prose rather than labels or lists, as evidenced by a word bigram conditional entropy significantly higher than label-heavy sections (zodiac: 3.7149, astronomical: 3.7471), and a type-token ratio and sentence-length distribution consistent with connected discourse rather than enumerated items.

structuralactive2026-04-14

Against: Pharmaceutical section has lower entropy (3.7772) and might represent the actual prose recipes while recipes section encodes something else

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.

H-BV-DIALECT-010.88

Voynich scribes A and B divide the manuscript by section — Hand A writes the herbal (95 folios), pharmaceutical (16), and a handful of recipes (2); Hand B writes the biological (19), cosmological (3), recipes (23), and some herbal (32). Only herbal and recipes have both hands present. In recipes — the only content-shared section — the two hands use completely disjoint vowel-pattern dialects: 0 of 8 Hand-A markers overlap with 8 Hand-B markers. Hand A's marker vocabulary across all sections is 'o'-heavy (o/e ratio 2.00); Hand B's is more balanced (o/e 1.36). Both hands use 'eo'-containing patterns for preparation-related content (Hand A pharma, Hand B recipes).

structuralactive2026-04-16

Against: Several sections have zero folios for one hand, preventing cross-hand comparison (pharma: 0 Hand B; biological: 0 Hand A; astronomical/zodiac: 0/0)

Last test: 5-fold CV naive-Bayes classifier: 74.2% accuracy across 6 classes vs 50% majority vs 16.7% chance (+24.2pp over majority). Perfect accuracy on (B,


H0830.88

The strong word-initial and word-final glyph positional constraints ({o,c,q,s,d} and {y,n,l,r,o}) reflect a syllabic or consonant-vowel template in the underlying language (e.g., CV or CVC structure), not a cipher artifact, and can be modeled as a Markov chain of order 2 that predicts glyph position within a word with accuracy > 80%.

languageactive2026-04-14

Against: Existing belief [0.268] (confidence 0.268) attributes the positional constraints to a Vigenere-type cipher rather than linguistic structure; cipher explanations are competitive

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.

H0970.87

The strong positional constraints on word-initial glyphs {o,c,q,s,d} and word-final glyphs {y,n,l,r,o} are artifacts of a structured nulls-and-affixes system rather than natural phonotactics: removing these positional glyphs as null markers should produce a residual corpus whose glyph entropy approaches Latin's ~4.0 bits

cipheractive2026-04-14

Against: Natural Semitic or agglutinative languages also exhibit strong positional constraints without nulls

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.

H0840.87

The Voynich text uses a homophonic substitution cipher on medieval Latin where 2-3 Voynich glyphs map to each Latin letter, which would reduce measured entropy below natural Latin (~4.0 bits) toward the observed 3.8627 bits while preserving Zipf-like word frequency distribution.

cipheractive2026-04-14

Against: 70.1% hapax ratio is extremely high for a homophonic cipher on Latin; Latin text would produce far fewer unique word-forms

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.

H1410.87

Currier A and Currier B sub-corpora have glyph unigram distributions that are more statistically divergent from each other than any random equal-sized split of the same combined corpus, confirming they encode under structurally distinct cipher tables rather than merely reflecting scribal style variation.

cipheractive2026-04-16

Against: If A and B encode the same language with the same cipher but different register or topic, glyph distributions could diverge due to content rather than cipher table differences
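
One way to run the comparison against random splits is a permutation test, sketched below. It is an illustration under stated assumptions: Jensen-Shannon divergence is chosen here as the divergence measure (the hypothesis does not name one), words are shuffled whole, and the random splits keep the original A and B sizes rather than being equal-sized. All names are hypothetical.

```python
import math
import random
from collections import Counter


def glyph_dist(words):
    """Relative glyph frequencies over a list of words."""
    counts = Counter(g for w in words for g in w)
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits (0 = identical, 1 = disjoint)."""
    js = 0.0
    for g in set(p) | set(q):
        pg, qg = p.get(g, 0.0), q.get(g, 0.0)
        m = (pg + qg) / 2
        if pg:
            js += 0.5 * pg * math.log2(pg / m)
        if qg:
            js += 0.5 * qg * math.log2(qg / m)
    return js


def split_permutation_test(words_a, words_b, n_perm=1000, seed=0):
    """Observed A/B divergence and the share of random splits reaching it."""
    rng = random.Random(seed)
    observed = js_divergence(glyph_dist(words_a), glyph_dist(words_b))
    pooled = list(words_a) + list(words_b)
    n_a = len(words_a)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        d = js_divergence(glyph_dist(pooled[:n_a]), glyph_dist(pooled[n_a:]))
        if d >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)
```

A small p-value only shows that A and B differ more than chance splits do. It does not separate a cipher-table difference from the topic or register difference raised in the Against note.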

H1320.87

The Currier A and Currier B sub-corpora use two structurally distinct cipher alphabets that both encode the same underlying Latin plaintext, with Currier B employing a larger effective alphabet (more glyph diversity per word position) to explain B's larger word count (23,766 vs 11,022) without a proportional increase in semantic content.

cipheractive2026-04-15

Against: Currier B's larger size may simply reflect more content being assigned to B sections, not a richer cipher alphabet

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.

H1160.86

The anomalously low entropy in the zodiac (3.7149 bits) and astronomical (3.7471 bits) sections relative to the text-only section (3.9016 bits) is produced by a label-encoding convention in which short positional labels (star names, month labels) are drawn from a restricted glyph sub-alphabet, reducing effective entropy rather than reflecting a different cipher.

structuralactive2026-04-15

Against: If labels are simply shorter words, average word length in zodiac should be significantly below corpus mean of 5.03; this needs verification

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.

H0540.86

Word-initial glyph constraints (o, c, q, s, d dominating) and word-final glyph constraints (y, n, l, r, o dominating) are not merely phonotactic but reflect a codebook structure in which word-initial glyphs encode semantic category (e.g., plant part, action, quantity) and word-final glyphs encode grammatical role (e.g., noun, verb, adjective), such that co-occurrence of specific initial+final glyph pairs is non-uniform and significantly exceeds chance across all sections.

cipheractive2026-04-13

Against: A simpler explanation is that initial/final constraints are cipher artifacts (e.g., nulls or padding) with no semantic content

Last test: Section entropy std dev: 0.0593. Range: 3.7149 - 3.9016. Variable across sections.
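
The non-uniformity claim about initial+final glyph pairs corresponds to a chi-square test of independence on the initial-by-final contingency table. The sketch below computes the statistic and degrees of freedom only (no p-value, since the standard library has no chi-square distribution). It assumes one character per glyph and skips single-glyph words; the function name is hypothetical.

```python
from collections import Counter


def initial_final_chi_square(words):
    """Pearson chi-square statistic for independence of first and last glyph."""
    pairs = Counter((w[0], w[-1]) for w in words if len(w) >= 2)
    n = sum(pairs.values())
    row, col = Counter(), Counter()
    for (first, last), c in pairs.items():
        row[first] += c
        col[last] += c
    chi2 = 0.0
    for first in row:
        for last in col:
            expected = row[first] * col[last] / n
            chi2 += (pairs[(first, last)] - expected) ** 2 / expected
    df = (len(row) - 1) * (len(col) - 1)
    return chi2, df
```

A large statistic shows dependence between initial and final glyphs but not that the dependence is semantic, which is the point of the Against note.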

H0820.85

The zodiac section's anomalously low entropy (3.7149 bits, lowest of all sections) reflects a label-registry encoding where each word is drawn from a small closed vocabulary of astrological terms (< 50 unique words), producing a highly non-uniform unigram distribution unlike the rest of the manuscript.

structuralactive2026-04-14

Against: If the zodiac section used only ~50 unique words, its type-token ratio would be ~0.039, far lower than the corpus average; this extreme would likely have been noted in prior analyses

Last test: Observed Zipf exponent: 0.8946 (R²=0.9084). Natural language expected: ~1.0. Fit quality: moderate.

H-BV-MG-010.85

EVA 'm' (1,055 tokens) and 'g' (127 tokens) are suffix/final-marker glyphs, extending the suffix class beyond {y,n,l,r} per H023. Final-position rates: m 93.6%, g 83.5%. EVA 'l' (previously assumed final-dominant) is actually balanced (53.6% final / 32.6% mid) and should be demoted from the suffix class.

structuralactive2026-04-15

Against: g's sample size (127) is modest; rate could shift with more data

Last test: Expands H023 suffix class to {y,n,r,m,g}; demotes l. Any stem-stripping pipeline should be updated.
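
Positional rates such as "m 93.6% final" can be tallied as in the sketch below. It assumes one character per glyph and counts a glyph in a single-glyph word as final; the board does not say how that edge case was handled, and the function name is hypothetical.

```python
from collections import Counter


def position_rates(words, glyph):
    """Share of a glyph's tokens in word-initial, mid, and word-final position."""
    tally = Counter()
    for w in words:
        for i, g in enumerate(w):
            if g != glyph:
                continue
            if i == len(w) - 1:
                tally["final"] += 1  # also covers single-glyph words
            elif i == 0:
                tally["initial"] += 1
            else:
                tally["mid"] += 1
    total = sum(tally.values())
    if total == 0:
        return {}
    return {pos: tally[pos] / total for pos in ("initial", "mid", "final")}
```

Applying it to each candidate glyph gives the table needed to decide membership in the suffix class, including the proposed demotion of 'l'.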

H-BV-SHUF-010.85

Word-order syntactic structure is absent from Voynichese. Across all tested lexicons (Schechter Latin, Brady Syriac proxy, Hebrew medieval, Brain-V v1), in-order decoded text scores equal to or LOWER than across-corpus-shuffled decoded text on the connector-to-content bigram metric. Both-matched-adjacency shows a small positive cluster effect (+0.003 to +0.028 pp) that is lexicon-size-monotonic and therefore a topical-clustering artefact, not grammatical signal.

nullactive2026-04-15

Against: Full Brady lexicon may change the picture once supplementary file is released

Last test: The fundamental barrier to Voynich decipherment. Coverage-based methods find words but not sentences. Future work needs order-independent signals (vow

H-BV-STRUCT-010.85

Non-vowel structural features (q-initial flag, suffix class, bench-gallows presence, skeleton length, line-position, plain-gallows line-initial) individually add at most +1.9pp to section-prediction accuracy and collectively add only +3.6pp over majority baseline. Combining them with vowel-pattern features slightly DEGRADES performance (38.9% vs 40.1% vowel-only). Per-class precision collapses to only herbal/recipes; the classifier never predicts the other 6 sections.

nullactive2026-04-15

Against: Q-initial marker is a near-categorical positional feature (98.9%) yet contributes -0.8pp — counterintuitive; may indicate feature is uninformative-for-section rather than uninformative-overall

Last test: Non-vowel-only classifier 34.1% vs majority 30.5% = +3.6pp gain. Degenerate output distribution (predicts only herbal/recipes).

H-BV-VOLVELLE-010.85

A simple 3-ring volvelle (6 prefixes x 26 roots x 8 suffixes) with per-section root-cartridge swap reproduces 4 of 7 Voynich statistical properties including vowel-section chi-square coupling (90% vs real 79%). It fails Zipf exponent (0.18 vs 0.65) and hapax ratio (0.24 vs 0.70) by large margins. The vowel-section coupling Brain-V treated as its strongest positive structural finding (H-BV-VOWEL-01) is NOT uniquely diagnostic of meaning: a content-free volvelle mechanism produces it trivially via section-specific cartridges.

nullactive2026-04-15

Against: Zipf exponent 0.18 vs real 0.65 - huge mismatch

Last test: 4/7 matches on aggregate stats. Volvelle reproduces vowel-section chi-square coupling (79% real vs 90% synth) which was Brain-V's strongest positive s

H0910.85

The zodiac section's anomalously low entropy (3.7149 bits, lowest of all sections) reflects a label-only register drawn from a vocabulary of fewer than 200 distinct word types, a much smaller effective vocabulary than other sections, producing a Zipf exponent significantly above the corpus-wide value of 0.8946.

structuralactive2026-04-14

Against: Low entropy could alternatively reflect a different cipher key applied to zodiac text rather than a different register

Last test: Observed Zipf exponent: 0.8946 (R²=0.9084). Natural language expected: ~1.0. Fit quality: moderate.

H1180.85

The recipes section (25 folios, 11,611 words, entropy 3.8586 bits; the largest section by word count) encodes a natural-language text with less cipher transformation than the herbal section, as evidenced by its higher entropy and larger word count approximating the statistical footprint of an unenciphered medieval recipe corpus.

languageactive2026-04-15

Against: Entropy of 3.8586 is still below the text-only section, so some cipher transformation may remain

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.

H0980.85

The text-only section (entropy 3.9016 bits, 7 folios) represents unenciphered or minimally enciphered running text in medieval Latin or Northern Italian, and its word-frequency distribution should match a Zipf exponent significantly closer to 1.0 than the manuscript-wide exponent of 0.8946

languageactive2026-04-14

Against: Small sample (2,349 words) limits statistical power for reliable Zipf fitting

Last test: Observed Zipf exponent: 0.8946 (R²=0.9084). Natural language expected: ~1.0. Fit quality: moderate.

H1210.85

Currier A and Currier B use two structurally distinct homophonic cipher tables mapping a single underlying Latin text, such that the bigram transition matrices of Currier A and B are statistically dissimilar (chi-squared p < 0.01) but both exhibit second-order entropy compatible with Latin (approximately 3.6-4.0 bits per glyph at order-2).

cipheractive2026-04-15

Against: If the two cipher tables encode different underlying languages rather than the same one, bigram structure would differ for linguistic rather than cipher reasons

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.

H0390.84

The recipes section (25 folios, 11,611 words, entropy 3.8586 bits) and the herbal section (129 folios, 10,872 words, entropy 3.8478 bits) share a common cipher and possibly the same underlying language, while the biological section (19 folios, 6,315 words, entropy 3.7977 bits) uses a distinct encoding, as evidenced by its lower entropy and different word-length distribution.

cipheractive2026-04-13

Against: Entropy differences across sections are small (range 0.186 bits), potentially within noise for section sizes of a few thousand words

Last test: Section entropy std dev: 0.0593. Range: 3.7149 - 3.9016. Variable across sections.

H0170.84

The text uses a combination of substitution and transposition ciphers in the biological section.

cipheractive2026-04-13

Last test: Section entropy std dev: 0.0593. Range: 3.7149 - 3.9016. Variable across sections.

H1280.83

Currier A and Currier B employ two structurally distinct cipher alphabets in which a core set of glyphs is shared but a subset of approximately 5-8 glyphs is exclusive or predominantly exclusive to each sub-corpus, producing measurable vocabulary intersection below 60% when controlling for word length.

cipheractive2026-04-15

Against: Two scribes encoding the same language with the same cipher but different handwriting habits could produce apparent glyph-frequency divergence without a structural cipher difference

Last test: Language A: entropy=3.8321, 3513 unique/11022 total words. Language B: entropy=3.8611, 5089 unique/23766 total words. Entropy difference: 0.0290. Voca

H0960.83

The zodiac section's anomalously low entropy (3.7149 bits, lowest of all sections) is produced by a label-only encoding where each token is drawn from a small closed vocabulary of fewer than 50 distinct labels, rather than running prose

structuralactive2026-04-14

Against: If the zodiac vocabulary size is not demonstrably smaller than other sections, the structural explanation loses force

Last test: Section entropy std dev: 0.0593. Range: 3.7149 - 3.9016. Variable across sections.

H1010.82

The zodiac section's anomalously low entropy (3.7149 bits, lowest of all sections) reflects a label-register vocabulary of fewer than 150 distinct content words, with a word-frequency distribution that fits a truncated power law rather than full Zipf, consistent with a glossary or label list rather than running prose.

structuralactive2026-04-14

Against: Low entropy could also result from heavy repetition of cipher padding or null glyphs rather than label structure

Last test: Observed Zipf exponent: 0.8946 (R²=0.9084). Natural language expected: ~1.0. Fit quality: moderate.

H1170.82

The 70.1% hapax ratio is substantially artifactual: a systematic suffix composed of word-final glyphs drawn from {y, n, l, r, o} acts as a grammatical or cipher suffix, and stripping these final glyphs from all word tokens reduces the effective vocabulary size and hapax ratio to levels consistent with natural language (~30-45% hapax).

cipheractive2026-04-15

Against: Prior attempt scored only 0.37 confidence, suggesting the reduction may not bring hapax to natural-language levels

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.

H1270.82

The text-only section (7 folios, entropy 3.9016 bits, highest of all sections) represents a functionally distinct register encoded with minimal or no transposition, i.e., a near-plaintext or lightly enciphered layer, while lower-entropy sections (zodiac 3.7149, astronomical 3.7471) use additional transposition steps that suppress entropy.

structuralactive2026-04-15

Against: Lower entropy in zodiac could reflect label brevity and small vocabulary rather than additional cipher steps

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.

H1370.81

The zodiac section's anomalously low entropy (3.7149 bits, lowest of all sections) is explained by a label-oriented encoding in which the same small set of ~15-20 root words are repeatedly inflected and reused as astronomical labels, such that the top 20 most frequent words in the zodiac section account for over 55% of all word tokens in that section

structuralactive2026-04-15

Against: Low entropy could also reflect a denser homophonic cipher masking fewer plaintext characters

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.
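
The 55% prediction is a direct token-coverage measurement. The sketch below assumes the zodiac section is available as a flat list of word tokens; the function name is hypothetical.

```python
from collections import Counter


def top_k_coverage(words, k=20):
    """Fraction of all word tokens accounted for by the k most frequent types."""
    counts = Counter(words)
    covered = sum(c for _, c in counts.most_common(k))
    return covered / len(words)
```

The hypothesis is supported on this point only if the value for the zodiac section exceeds 0.55; comparing it with the same figure for other sections would show whether the zodiac is actually unusual.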

H1220.81

The zodiac section's anomalously low entropy (3.7149 bits, lowest of all sections) reflects a label-encoding regime in which most tokens are drawn from a restricted lexicon of fewer than 80 unique word types, producing a word-frequency distribution with Zipf exponent significantly steeper (greater than 1.2) than the full corpus (0.8946), consistent with a closed enumeration rather than free prose.

structuralactive2026-04-15

Against: Low entropy could alternatively reflect a denser homophonic substitution rather than label restriction

Last test: Observed Zipf exponent: 0.8946 (R²=0.9084). Natural language expected: ~1.0. Fit quality: moderate.

H0120.81

The manuscript uses a combination of substitution and transposition ciphers in the biological section.

cipheractive2026-04-13

Against: No clear evidence of transposition patterns

Last test: Positional distinctness: 0.25. First-only glyphs: {'q'}. Last-only glyphs: {'r', 'm', 'n'}. Universal glyphs: {'l', 'o', 'a', 'd'}. Strong positional

H1030.81

The text-only section (7 folios, entropy 3.9016 bits, highest of all sections) encodes unenciphered or lightly enciphered text in a language with entropy naturally near 3.9 bits, consistent with medieval Italian vernacular (estimated entropy 3.85-3.95 bits) rather than Latin (estimated ~4.0 bits) or a heavily enciphered text.

languageactive2026-04-14

Against: Small sample size (7 folios, 2,349 words) makes entropy estimates unreliable

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.

H-BV-AB-010.80

Currier A is structurally distinct from Currier B at the lexical-accessibility level, resisting decipherment across three independent language hypotheses: Schechter Latin (B-A gap +8.21pp), Brady Syriac proxy (+3.92pp), Hebrew medieval (+3.07pp). B consistently fits lexicons better than A by 3-8 percentage points regardless of source language.

structuralactive2026-04-15

Against: Gaps could still reflect sample bias - bio/recipes B-dominant sections may be over-represented in pharma vocab sources

Last test: Brain-V's high-confidence A=B hypotheses made a testable prediction (equal lexical fit) and it failed on 3 independent tests. This hypothesis is the r

H-BV-PERUN-NULL-010.80

Plant illustration properties (use class: toxic/food/medicinal; geographic origin: Mediterranean vs non-Mediterranean; plant family) do NOT correlate with measurable text properties (mean word length, glyph entropy, q-initial rate, gallows-initial-line rate, top vowel-pattern frequency, _.oii fire rate) on the same folio. Across 24 tests (4 botanical axes x 6 text features, n=117 plant folios), ZERO reach p<=0.05. The Perun visual-key hypothesis is not supported at this resolution.

nullactive2026-04-15

Against: Toxic-plant _.oii rate (0.81%) is 2.1x medicinal (0.35%) and food (0.38%); untested formally due to small n=15

Last test: 24 tests across use class, origin, family, medicinal flag x 6 text features. Zero reach p<=0.05. Toxic _.oii elevation flagged but not formally tested

H0070.80

The text uses a combination of substitution and transposition ciphers in the biological section.

cipheractive2026-04-13

Against: Other sections do not show similar patterns

Last test: Positional distinctness: 0.25. First-only glyphs: {'q'}. Last-only glyphs: {'r', 'm', 'n'}. Universal glyphs: {'l', 'o', 'a', 'd'}. Strong positional

H0720.79

The zodiac section's anomalously low entropy (3.7149 bits) and label-heavy structure reflect a fixed-vocabulary encoding in which each label token is drawn from a closed set of fewer than 50 distinct label templates, with high within-label glyph redundancy serving as a positional marker rather than encoding phonemic content.

structuralactive2026-04-14

Against: If labels use a closed template set, type-token ratio for the zodiac section should be markedly lower than other sections; this is testable but not yet confirmed

Last test: Section entropy std dev: 0.0593. Range: 3.7149 - 3.9016. Variable across sections.

H0860.79

The zodiac and astronomical sections use a label-register encoding (short fixed-length labels drawn from a closed vocabulary) rather than running prose, which would produce anomalously low entropy (observed: zodiac 3.7149 bits, astronomical 3.7471 bits) and a high type-token ratio relative to sections with running text.

structuralactive2026-04-14

Against: Low entropy could also reflect heavier use of homophones in these sections rather than label structure

Last test: Section entropy std dev: 0.0593. Range: 3.7149 - 3.9016. Variable across sections.

H1310.79

The zodiac section's anomalously low entropy (3.7149 bits, lowest of all sections) results from a label-only encoding where each glyph token maps to one of a small closed set of calendar or ordinal terms (month names, numerals, star names), making the zodiac section structurally independent from the prose cipher used in herbal and recipes sections.

structuralactive2026-04-15

Against: Low entropy could also result from heavier use of the word-initial constraint glyphs {o,c,q,s,d} rather than a label vocabulary

Last test: Section entropy std dev: 0.0593. Range: 3.7149 - 3.9016. Variable across sections.

H1110.78

The Currier A and Currier B sub-corpora use two structurally distinct cipher tables encoding the same underlying Latin plaintext, evidenced by differing word-initial glyph frequency distributions between the two sub-corpora that are nonetheless both consistent with Latin word-onset phoneme distributions.

cipheractive2026-04-15

Against: Currier A/B split may reflect scribal hand differences rather than cipher-table differences

Last test: Language A: entropy=3.8321, 3513 unique/11022 total words. Language B: entropy=3.8611, 5089 unique/23766 total words. Entropy difference: 0.0290. Voca

H1230.77

The text-only section's elevated entropy (3.9016 bits, highest of all sections) reflects an underlying plaintext that has undergone minimal or no transposition (i.e., it is a pure substitution cipher or near-plaintext), whereas sections with lower entropy (zodiac 3.7149, biological 3.7977) involve additional transposition steps that reduce apparent glyph entropy by increasing local repetition.

cipheractive2026-04-15

Against: Lower entropy in other sections could reflect semantic content differences (e.g., zodiac labels are inherently less entropic) rather than transposition

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.

H1300.77

The Voynich text uses a homophonic substitution cipher where word-final glyphs {y, n, l, r, o} function as morphological suffixes encoding inflectional endings of a single underlying Latin text, causing artificial vocabulary inflation and explaining the 70.1% hapax ratio.

cipheractive2026-04-15

Against: Currier A/B split suggests two encoding schemes, complicating a single-cipher model

Last test: Observed Zipf exponent: 0.8946 (R²=0.9084). Natural language expected: ~1.0. Fit quality: moderate.

H1000.77

Currier A and Currier B employ two structurally distinct cipher alphabets encoding the same underlying Latin plaintext: the two sub-corpora should exhibit near-identical word-length distributions and Zipf exponents but divergent bigram transition matrices, consistent with two key tables applied to the same plaintext.

cipheractive2026-04-14

Against: Belief 0.208 supports two different plaintext languages across A/B

Last test: Observed Zipf exponent: 0.8946 (R²=0.9084). Natural language expected: ~1.0. Fit quality: moderate.

H1360.76

The Currier A and Currier B sub-corpora use two structurally distinct homophonic cipher tables encoding the same underlying Latin text: Currier A uses a wider homophone set per plaintext letter (lower per-glyph entropy ~3.75 bits) while Currier B uses a narrower set (higher per-glyph entropy ~3.90 bits), producing the observed entropy difference between the two dialects

cipheractive2026-04-15

Against: Underlying language may differ between scribes rather than cipher table alone

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.

H1100.76

The high hapax ratio (70.1%) is substantially artifactual: word-final glyphs drawn from the set {y, n, l, r, o} function as grammatical suffixes, and stripping them would reduce the effective vocabulary by at least 30%, collapsing hapax rate to below 50%.

cipheractive2026-04-15

Against: If suffixes were purely grammatical, we would expect cleaner paradigm clusters around common stems

Last test: Observed Zipf exponent: 0.8946 (R²=0.9084). Natural language expected: ~1.0. Fit quality: moderate.

H1330.76

The high word-initial glyph constraint ({o,c,q,s,d} dominating word starts) and word-final glyph constraint ({y,n,l,r,o} dominating word ends) are artifacts of a Vigenere-like polyalphabetic cipher where the key resets at word boundaries, causing the first and last characters of each word to be systematically drawn from whichever cipher-alphabet rows correspond to the key's initial and terminal positions.

cipheractive2026-04-15

Against: Glyph entropy 3.8627 bits is relatively high for a polyalphabetic cipher with only 25 symbols, unless the key is long

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.

H1380.76

The text-only section's elevated entropy (3.9016 bits, highest of all sections) reflects unenciphered or minimally enciphered natural language prose, and its glyph unigram distribution should show statistically significantly less deviation from the expected distribution of a known natural language (medieval Latin or Italian) than any other section

languageactive2026-04-15

Against: Higher entropy could reflect a more complex cipher rather than less encipherment

Last test: Section entropy std dev: 0.0593. Range: 3.7149 - 3.9016. Variable across sections.

H1200.75

The high hapax ratio (70.1%) is substantially produced by a systematic suffix morphology where word-final glyphs {y, n, l, r, o} function as inflectional suffixes on a smaller stem vocabulary, such that stripping final glyphs from the set {y, n, l, r, o} would reduce the unique word count by at least 35% while preserving a stem vocabulary with Zipf exponent closer to 1.0.

cipheractive2026-04-15

Against: If suffixes encode cipher elements rather than morphology, stripping them may destroy meaningful units

Last test: Observed Zipf exponent: 0.8946 (R²=0.9084). Natural language expected: ~1.0. Fit quality: moderate.

H1250.75

The Voynich glyphs encode a homophonic substitution cipher on medieval Latin, where the 25 glyphs map to a Latin alphabet expanded with homophones for high-frequency letters (e, a, i, t, s), reducing glyph entropy below the expected Latin ~4.0 bits to the observed 3.8627 bits.

cipheractive2026-04-15

Against: Hapax ratio 70.1% is far higher than expected for homophonic Latin, which should reduce not increase type diversity

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.

H1130.75

The text-only section (7 folios, entropy 3.9016 bits, highest of all sections) represents unenciphered or lightly enciphered natural language prose, while all other sections apply an additional transposition or homophonic layer on top of a base substitution cipher, producing their characteristically lower entropy.

cipheractive2026-04-15

Against: Alternative explanation: text-only simply contains more linguistically varied content (prose vs. formulaic plant descriptions) that naturally yields higher entropy without any cipher difference

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.

H1350.74

The high hapax ratio (70.1%) is substantially artifactual: word-final glyphs drawn from {y,n,l,r,o} function as inflectional suffixes, and stripping the final glyph of every word reduces unique word types by at least 30%, collapsing the vocabulary toward a size consistent with a 3,000–5,000-root natural language.

structuralactive2026-04-15

Against: If suffixes encode cipher structure rather than grammar, stripping them may not reduce vocabulary meaningfully

Last test: Observed Zipf exponent: 0.8946 (R²=0.9084). Natural language expected: ~1.0. Fit quality: moderate.

H-BRADY-020.74

Plain gallows characters EVA 't' and 'p' are non-phonetic paragraph/section markers, not consonants: 'p' shows 5.4x line-initial enrichment (70.9% vs 13.1% baseline), 't' shows 3.2x (42.3%), while bench gallows (cth/ckh/cph) appear in normal mid-word positions.

structuralactive2026-04-15

Against: No known paleographic analogue for mid-word phonetic vs. line-initial marker distinction within a single script

Last test: The dramatic line-initial enrichment of 'p' (5.4x baseline) and 't' (3.2x baseline) strongly indicates structural rather than phonetic function, parti
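
The enrichment factors in this card follow from dividing each glyph's line-initial rate by the baseline rate. A small worked check, assuming both glyphs are compared against the same 13.1% baseline (the card states the baseline once; how the underlying rates were counted is not specified here):

```python
def enrichment(rate, baseline):
    """Ratio of an observed rate to its baseline rate."""
    return rate / baseline

BASELINE = 0.131  # line-initial baseline quoted in the card

print(round(enrichment(0.709, BASELINE), 1))  # 'p' -> 5.4
print(round(enrichment(0.423, BASELINE), 1))  # 't' -> 3.2
```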

H0340.73

Glyph positional constraints (specific glyphs appearing predominantly word-initially vs word-finally) reflect a syllabic or consonant-vowel encoding structure rather than grammatical affixes, where initial glyphs encode consonant onsets and final glyphs encode vowel codas of a CV or CVC syllable scheme.

cipheractive2026-04-13

Against: Only 25 unique glyphs is low for a full syllabary (typical syllabaries have 40-100 symbols), though it could represent a partial or consonantal syllabary

Last test: Positional distinctness: 0.25. First-only glyphs: {'q'}. Last-only glyphs: {'r', 'm', 'n'}. Universal glyphs: {'l', 'o', 'a', 'd'}. Strong positional

H1340.73

The text-only section's elevated entropy (3.9016 bits, highest of all sections) reflects a prose register of the underlying natural language encoded with minimal or no transposition, while the herbal and biological sections apply additional transposition or null insertion on top of the base substitution cipher, reducing their effective entropy.

cipheractive2026-04-15

Against: Section entropy differences may reflect content domain (dense botanical lists vs. continuous prose) rather than cipher differences

Last test: Observed glyph entropy: 3.8627 bits. Expected for hypothesis: ~4.0 bits. Difference: 0.1373 bits.

H0880.72

The text-only section (7 folios, entropy 3.9016 bits, highest of all sections) represents either unenciphered text or a significantly weaker cipher than the illustrated sections, such that its inter-word mutual information is statistically higher than in the herbal or biological sections, betraying more plaintext syntactic structure.

cipheractive2026-04-14

Against: Higher entropy could simply reflect more varied vocabulary in a prose section rather than weaker encipherment

Last test: Section entropy std dev: 0.0593. Range: 3.7149 - 3.9016. Variable across sections.
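
The inter-word mutual information this hypothesis relies on could be estimated with a plug-in estimator over adjacent word pairs. The sketch below is one way to do that, assuming a section is a flat list of word tokens; the function name is hypothetical and this is not presented as the board's actual test:

```python
import math
from collections import Counter

def adjacent_word_mi(words):
    """Plug-in mutual information (bits) between each word and its successor."""
    pairs = list(zip(words, words[1:]))
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    # p(a,b) / (p(a) * p(b)) simplifies to c * n / (left[a] * right[b])
    return sum(
        (c / n) * math.log2(c * n / (left[a] * right[b]))
        for (a, b), c in joint.items()
    )

print(round(adjacent_word_mi(["a", "b"] * 4), 3))
```

The plug-in estimate is biased upward on small samples, so sections of different sizes are better compared against a shuffled-order baseline of the same tokens than against each other directly.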

H1400.72

The 70.1% hapax ratio is primarily artifactual: word-initial glyphs from {o,c,q,s,d} and word-final glyphs from {y,n,l,r,o} are morphological affixes, not root characters. Stripping one character from each end when it belongs to these sets will reduce the unique root vocabulary below 2,500 types and the hapax rate below 40%, consistent with inflectional morphology in a natural language.

structuralactive2026-04-16

Against: Stripping may over-segment: some word-initial o or c may be root-initial, not prefixes
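
The stripping test described in this card can be sketched as follows. Two details are assumptions rather than part of the hypothesis text: a word is never stripped down to nothing (at least one glyph is kept), and the hapax rate is taken as the share of unique types that occur exactly once.

```python
from collections import Counter

PREFIXES = set("ocqsd")  # word-initial set named in H140
SUFFIXES = set("ynlro")  # word-final set named in H140

def strip_affixes(word):
    """Remove at most one glyph from each end if it belongs to the affix sets."""
    if len(word) > 1 and word[0] in PREFIXES:
        word = word[1:]
    if len(word) > 1 and word[-1] in SUFFIXES:
        word = word[:-1]
    return word

def type_stats(words):
    """Return (number of unique types, share of types occurring exactly once)."""
    counts = Counter(words)
    hapax = sum(1 for c in counts.values() if c == 1)
    return len(counts), hapax / len(counts)

# Hypothetical toy tokens, not corpus data.
tokens = ["chedy", "chedl", "okeedy", "qokeedy"]
print(type_stats(tokens))
print(type_stats([strip_affixes(w) for w in tokens]))
```

On real data the prediction would be checked by comparing the second tuple against the thresholds in the card (fewer than 2,500 types, hapax rate below 40%). Note that the 'c' rule also splits EVA "ch", which is the over-segmentation risk the card raises.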

H1260.72

The anomalously high hapax ratio (70.1%) is substantially produced by a systematic word-final suffix drawn from the set {y, n, l, r, o} that is a cipher artifact (e.g., a null, padding, or word-delimiter glyph), such that stripping that final glyph from all words reduces the unique-word count by at least 30%.

cipheractive2026-04-15

Against: Stripping final glyphs from Latin or Italian words would also be linguistically valid (case/conjugation endings), so reduction might reflect grammar not artifact

Last test: Language A: entropy=3.8321, 3513 unique/11022 total words. Language B: entropy=3.8611, 5089 unique/23766 total words. Entropy difference: 0.0290. Voca

H-BRADY-120.70

Short skeletons (1-2 consonants) cover 42.9% of corpus tokens and are irreducibly ambiguous, meaning ANY decipherment hypothesis using consonant-skeleton methodology has ceiling ~57% on word-level accuracy without a vowel-disambiguation layer.

structuralactive2026-04-15

Against: Short-skeleton ambiguity could be partially resolved by contextual co-occurrence

H-BV-SPARSE-010.70

Three vowel-pattern rules survive honest held-out validation at 70%+ precision: '_.o._.o' -> herbal (82.4% precision on 34 held-out fires), '_._.eee' -> recipes (72.7% on 11 fires), '_.e.ai' -> recipes (70.6% on 17 fires). These three rules together cover 0.8% of held-out tokens at 77.4% aggregate precision. They are Brain-V's first honestly-validated decipherment fragments.

structuralactive2026-04-15

Against: Tiny coverage (0.8%) limits practical utility

Last test: 3/103 candidate rules survive >=0.70 precision + >=10 fires on stratified holdout. Aggregate precision 77.4% on 0.8% coverage.
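
The 77.4% aggregate can be reproduced from the three per-rule figures. The hit counts below are inferred by rounding precision times fires; they are not listed on the board itself.

```python
# (pattern, predicted section, held-out precision, held-out fires) from the card
rules = [
    ("_.o._.o", "herbal", 0.824, 34),
    ("_._.eee", "recipes", 0.727, 11),
    ("_.e.ai", "recipes", 0.706, 17),
]

hits = [round(p * n) for _, _, p, n in rules]  # inferred correct predictions
fires = sum(r[3] for r in rules)
aggregate = sum(hits) / fires

print(hits, fires, round(aggregate, 3))  # [28, 8, 12] 62 0.774
```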

H0890.66

The Voynich text uses a homophonic substitution cipher on medieval Latin or Italian, where 2-4 glyphs map to each high-frequency plaintext letter, reducing glyph entropy below what a simple substitution would produce (~4.0 bits) and inflating the hapax ratio by producing artificial variant spellings of the same underlying word.

cipheractive2026-04-14

Against: Only 25 unique glyphs is a small alphabet for a homophonic system, which typically requires 50+ symbols for adequate coverage of Latin

Last test: Positional distinctness: 0.25. First-only glyphs: {'q'}. Last-only glyphs: {'r', 'm', 'n'}. Universal glyphs: {'l', 'o', 'a', 'd'}. Strong positional

H0730.66

The dominant word-initial glyph constraints {o, c, q, s, d} are artifacts of a cipher mechanism that encodes plaintext vowels (a, e, i, o, u) as these five glyphs in word-initial position, with each cipher glyph mapping to a single plaintext vowel, producing a detectable co-occurrence pattern between word-initial cipher glyphs and word-final glyphs that mirrors vowel-consonant harmony in Latin or Italian.

cipheractive2026-04-14

Against: Latin and Italian words do not begin with vowels at the frequency implied by the dominance of these 5 initial glyphs

Last test: Positional distinctness: 0.25. First-only glyphs: {'q'}. Last-only glyphs: {'r', 'm', 'n'}. Universal glyphs: {'l', 'o', 'a', 'd'}. Strong positional

H1190.65

The glyph positional constraints (word-initial set {o,c,q,s,d}, word-final set {y,n,l,r,o}) are produced by a Vigenère-style polyalphabetic cipher where the key position within a word determines the allowable glyph set, such that position 1 maps to a restricted alphabet and the final position maps to a different restricted alphabet, compressing the effective entropy at word boundaries.

cipheractive2026-04-15

Against: Classical Vigenère on a 25-glyph alphabet with a short key would produce periodic index-of-coincidence peaks that have not been confirmed

Last test: Positional distinctness: 0.25. First-only glyphs: {'q'}. Last-only glyphs: {'r', 'm', 'n'}. Universal glyphs: {'l', 'o', 'a', 'd'}. Strong positional

H1420.65

Per-section Zipf exponents vary monotonically with section entropy: the text-only section (entropy 3.9016 bits) has a Zipf exponent closest to 1.0, the zodiac section (entropy 3.7149 bits) has the flattest Zipf exponent (furthest below 1.0), and all other sections fall in between, consistent with the text-only section being least transformed from natural language and label-heavy sections being most compressed or formulaic.

structuralactive2026-04-16

Against: Small section sizes (zodiac: 1,297 words; text-only: 2,349 words) make Zipf fits noisy and confidence intervals wide

H0770.64

The glyph positional constraints (word-initial {o,c,q,s,d}, word-final {y,n,l,r,o}) are not cipher artifacts but reflect systematic morphological prefixes and suffixes of the underlying plaintext language. Specifically, the initial constraint distribution should match the expected frequency of Latin or Italian morphological prefixes (ob-, con-, sub-, de-, ad-) better than a random cipher assignment would predict.

languageactive2026-04-14

Against: Many cipher systems (Vigenere, homophonic) independently impose positional glyph constraints unrelated to the underlying language morphology

Last test: Positional distinctness: 0.25. First-only glyphs: {'q'}. Last-only glyphs: {'r', 'm', 'n'}. Universal glyphs: {'l', 'o', 'a', 'd'}. Strong positional

H1290.63

The Zipf exponent of 0.8946 (below the natural-language baseline of ~1.0) is caused by systematic word-level homophones: multiple surface word forms encode the same plaintext word, inflating low-frequency tail counts. Collapsing homophones defined by shared word-initial bigram and word length would restore the Zipf exponent to >= 0.95.

cipheractive2026-04-15

Against: R-squared 0.9084 is still a reasonable Zipf fit; the deviation from 1.0 could reflect topic-restricted medieval vocabulary rather than cipher homophones

Last test: Observed Zipf exponent: 0.8946 (R²=0.9084). Natural language expected: ~1.0. Fit quality: moderate.
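
A Zipf exponent and its R² are commonly obtained from a least-squares line through log frequency against log rank. The sketch below takes that approach and adds the collapsing key this hypothesis proposes (word-initial bigram plus word length). Whether Brain-V fits the exponent this way is an assumption, and the function names are illustrative.

```python
import math
from collections import Counter

def zipf_fit(items):
    """Least-squares fit of log(freq) = a - s*log(rank); returns (s, r_squared)."""
    freqs = sorted(Counter(items).values(), reverse=True)
    xs = [math.log(r) for r in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return -sxy / sxx, (sxy * sxy) / (sxx * syy)

def collapse_key(word):
    """Homophone class proposed in H129: shared initial bigram and word length."""
    return (word[:2], len(word))

# Synthetic corpus with frequencies 60, 30, 20, 15, 12 (exactly 60 / rank).
corpus = [f"w{r}" for r in range(1, 6) for _ in range(60 // r)]
exponent, r2 = zipf_fit(corpus)
print(round(exponent, 4), round(r2, 4))
```

Applying `zipf_fit` to `[collapse_key(w) for w in words]` gives the post-collapse exponent that the hypothesis predicts to be at least 0.95. The fit needs at least two distinct types with differing frequencies.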

H1430.61

Words containing gallows glyphs (EVA t, p, k, f and bench variants cth, ckh, cph) occupy structurally distinct syntactic positions: they appear predominantly as the first content word after line-initial position and are followed by non-gallows words at a rate significantly higher than the corpus baseline, consistent with gallows functioning as topic-marking or noun-phrase-head indicators rather than phonemes.

structuralactive2026-04-16

Against: Bench gallows (cth/ckh/cph) appear at <0.5x line-initial rate, suggesting they do not share the paragraph-marker function and may be phonemic

H-BV-VOWEL-AGG-010.60

EVA vowel patterns are non-randomly distributed across manuscript sections at the aggregate level (chi-square significant at p<0.01 in 55/70 testable skeleton groups) but this distributional coupling does NOT translate to per-token section prediction on held-out folios. The narrower surviving result: sparse high-precision vowel-pattern rules exist (e.g. '_.eo' is pharma-modal with rule-precision 0.827 on held-out data) but fire on only ~40% of tokens, so F1 is below always-predict-majority baselines.

structuralactive2026-04-15

Against: Pharma F1 0.341 (best) vs blind 0.614 on held-out data

Last test: Aggregate chi-square unchanged (55/70 at p<0.01). Heldout precision 0.827 (pharma, rule fires) and 0.655 (bio, NB) both exceed prevalence baseline. F1

H1390.60

The Voynich script encodes plaintext using a systematic nulls-and-abbreviations scheme in which roughly 20–25% of all word tokens are null words (carrying no semantic content) inserted at predictable positional intervals, detectable by the fact that the most frequent short words (length 1–2 glyphs, e.g., 'y', 'ol', 'ar', 'or') appear at inter-word positions with a non-random distribution inconsistent with natural language function words.

cipheractive2026-04-15

Against: Short high-frequency words are consistent with natural language function words (articles, prepositions) in Latin or Italian

Last test: Positional distinctness: 0.25. First-only glyphs: {'q'}. Last-only glyphs: {'r', 'm', 'n'}. Universal glyphs: {'l', 'o', 'a', 'd'}. Strong positional

H1080.56

The word 'daiin' (799 occurrences, rank 1) decomposes as 'da' (give) + 'in' (in water), a pharmaceutical instruction appearing after plant-part descriptions. This morphological decomposition is systematic across the corpus.

cipheractive2026-04-14

Against: da+in decomposition is post-hoc — many decompositions possible

Last test: Positional distinctness: 0.25. First-only glyphs: {'q'}. Last-only glyphs: {'r', 'm', 'n'}. Universal glyphs: {'l', 'o', 'a', 'd'}. Strong positional

H-BV-PLANT-010.55

Within Hand A folios, vowel pattern '_.oii' fires at elevated rate on plant-identified folios. Headline corpus-wide enrichment (5.04x) is Hand-A-inflated: Hand B shows zero _.oii across both plant (n=25) and non-plant (n=7) herbal folios. Within-Hand-A plant-vs-non-plant enrichment is only 1.72x (n=88 plant vs 8 non-plant, within-hand t-test p=0.258, not significant at alpha=0.05). The 5-fold CV stability established in v2 applies to the Hand-A plant subset, not to plants generally. Quire and length confounds are ruled out; Currier hand is not.

structuralactive2026-04-15

Against: Baseline is thin (14 non-plant herbal folios) after plant-ID absorption

Last test: Currier hand confound: _.oii does not fire on any Hand B folio (n=25 plant + n=7 non-plant, all zero). Within-Hand-A, plant-vs-non-plant enrichment dr

H-BV-VOWEL-010.55

EVA vowel choice within a fixed consonant skeleton is section-linked. Across 70 skeleton groups with >=3 vowel variants and >=100 tokens each, 55 (78.6%) show section-distribution chi-square significant at p<0.01. Headline case 'kdy' (Brady's chedy/chody): chi2=262.17, df=28, critical 50.89, i.e. 5.15x over threshold.

structuralactive2026-04-15

Against: Section-coupling could reflect topical/domain vocabulary differences rather than a genuine vowel-encoding layer

Last test: Simple 3-ring volvelle with per-section cartridges reproduces 90% p<0.01 coupling (vs real 79%). The chi-square coupling is NOT uniquely diagnostic of
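
For one skeleton group, the section-coupling statistic is a Pearson chi-square over a contingency table. The sketch below assumes rows are vowel variants of the skeleton and columns are manuscript sections; that layout and the toy counts are illustrative assumptions, not the board's data.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    total = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_tot[i] * col_tot[j] / total  # expected count under independence
            stat += (obs - exp) ** 2 / exp
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Hypothetical 2x2 table: two vowel variants across two sections.
print(chi_square([[10, 20], [20, 10]]))

# The headline ratio in the card is the statistic over the quoted critical value.
print(round(262.17 / 50.89, 2))  # 5.15
```

Rows or columns that sum to zero would produce a zero expected count and need to be dropped before calling the function.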

H0180.54

The text is encoded using a simple substitution cipher on an unknown language.

languageactive2026-04-13

Last test: While Zipf exponent (0.8946) and glyph entropy (3.8627) show language-like statistical properties, the extremely high hapax ratio of 0.701 (typical la

H0090.53

The text is encoded using a simple substitution cipher on an unknown language.

cipheractive2026-04-13

Against: Other sections do not show similar patterns

Last test: Strong Zipfian distribution (R²=0.9084, exponent 0.8946≈1.0) is characteristic of natural language and persists through substitution ciphers, suppo

H1020.53

The glyph positional constraints (word-initial {o,c,q,s,d} and word-final {y,n,l,r,o}) are artifacts of a Vigenere-family polyalphabetic cipher in which word boundaries reset the key, causing systematic glyph-position biases that are not present in the underlying plaintext.

cipheractive2026-04-14

Against: Word-boundary key reset is an unusual cipher design for the period; simpler substitution ciphers are more historically plausible

Last test: Positional distinctness: 0.25. First-only glyphs: {'q'}. Last-only glyphs: {'r', 'm', 'n'}. Universal glyphs: {'l', 'o', 'a', 'd'}. Strong positional

H1240.52

The dominant word-initial glyph constraints {o, c, q, s, d} and word-final constraints {y, n, l, r, o} are artefacts of a Vigenere-like polyalphabetic cipher with a key length of 3-7 characters, such that glyph positional bias within words is a function of key-phase position rather than the underlying language's phonotactics, and the index of coincidence computed within each key-phase position would be elevated (above 0.065) relative to the overall corpus index of coincidence.

cipheractive2026-04-15

Against: Vigenere-type ciphers typically do not produce strong word-boundary-aligned positional constraints unless key resets at word boundaries, an unusual design

Last test: Positional distinctness: 0.25. First-only glyphs: {'q'}. Last-only glyphs: {'r', 'm', 'n'}. Universal glyphs: {'l', 'o', 'a', 'd'}. Strong positional
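
The key-phase prediction in this card can be checked by splitting the glyph stream into columns for each candidate key length and computing the index of coincidence per column. A minimal sketch, assuming the text is a single glyph string with word spaces removed (the function names are hypothetical):

```python
from collections import Counter

def index_of_coincidence(glyphs):
    """Probability that two glyphs drawn without replacement are identical."""
    n = len(glyphs)
    if n < 2:
        return 0.0
    counts = Counter(glyphs)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))

def phase_ics(text, key_length):
    """Index of coincidence of each key-phase column for a candidate key length."""
    return [index_of_coincidence(text[p::key_length]) for p in range(key_length)]

# Toy stream with an obvious period of 2.
stream = "abababab"
print(round(index_of_coincidence(stream), 4), phase_ics(stream, 2))
```

For the hypothesis, `phase_ics` would be run for key lengths 3 through 7 and each column compared against both the overall value and the 0.065 threshold. For the word-boundary-reset variants on this board, the phase would instead be the glyph's position within its word.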

H1140.48

The dominant word-initial glyph constraint (o, c, q, s, d accounting for most word starts) and word-final constraint (y, n, l, r, o) reflect a Vigenere-type polyalphabetic cipher in which the cipher alphabet at word boundaries is fixed, causing structural glyph repetition that is unrelated to plaintext phoneme distribution at those positions.

cipheractive2026-04-15

Against: Vigenere-type ciphers typically produce higher entropy than observed (3.86 bits); a word-boundary-reset Vigenere on Latin/Italian should approach 4.0 bits unless key is very short

Last test: Positional distinctness: 0.25. First-only glyphs: {'q'}. Last-only glyphs: {'r', 'm', 'n'}. Universal glyphs: {'l', 'o', 'a', 'd'}. Strong positional

H0130.45

The text uses a simple substitution cipher on an unknown language.

languageactive2026-04-13

Against: No clear evidence of an unknown language

Last test: Could not evaluate: Compute the frequency of word-final glyphs and compare to known languages

H-BRADY-100.45

78.7% of herbal folios have a unique skeleton in position B (word immediately following a gallows marker), with lower lexicon match rates (52.6% vs 65.8%) and higher hapax (76.6% vs 67.8%) — consistent with plant-name headings behind gallows.

structuralactive2026-04-15

Against: Position-B effects could also reflect scribal line-beginning conventions

H-BRADY-130.45

The random-permutation baseline of 71.5% coverage is anomalously high — random permutations of 10 consonants against a 1,334-entry lexicon should not match >70% of arbitrary token streams. This suggests the lexicon is over-permissive and the 15.4pp gap to true mapping is smaller than typical for a genuine signal.

nullactive2026-04-15

Against: High baseline could legitimately reflect dense Syriac lexicon and short-skeleton collisions
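
A random-permutation baseline of the kind discussed here could be computed by shuffling the consonant mapping's targets and re-measuring lexicon coverage. The sketch below is a generic illustration under stated assumptions: the mapping is a glyph-to-consonant dictionary, tokens are skeleton strings, and all names and the toy data are hypothetical rather than Brady's pipeline.

```python
import random

def coverage(tokens, mapping, lexicon):
    """Share of tokens whose mapped skeleton appears in the lexicon."""
    decoded = ("".join(mapping.get(g, g) for g in t) for t in tokens)
    return sum(d in lexicon for d in decoded) / len(tokens)

def permutation_baseline(tokens, mapping, lexicon, trials=1000, seed=0):
    """Mean coverage when the mapping's target consonants are randomly shuffled."""
    rng = random.Random(seed)
    keys, values = list(mapping), list(mapping.values())
    scores = []
    for _ in range(trials):
        shuffled = values[:]
        rng.shuffle(shuffled)
        scores.append(coverage(tokens, dict(zip(keys, shuffled)), lexicon))
    return sum(scores) / trials

# Toy example: a two-glyph mapping and a one-entry lexicon.
tokens = ["ab", "ba", "ab"]
mapping = {"a": "k", "b": "d"}
lexicon = {"kd"}
print(coverage(tokens, mapping, lexicon), permutation_baseline(tokens, mapping, lexicon))
```

Even in this toy case the shuffled mappings still match a sizeable share of tokens, which is the short-skeleton collision effect the card points to.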

H-BRADY-040.40

The encoding drops pharyngeal consonants (Syriac het and ayin have no EVA representation) — consistent with a non-Semitic (European) copyist who lacks these phonemes.

cipheractive2026-04-15

Against: Dropping phonemes creates systematic homograph ambiguity (e.g., 'kl' = kol/kuhla/kalla)

Last test: Zipfian distribution (0.8946) confirms natural language patterns, but pharyngeal consonant dropping requires plaintext knowledge to test directly; hig

H-BRADY-050.39

EVA vowel characters (a,o,e,i), currently stripped by the consonant-skeleton pipeline, encode real phonetic distinctions in Syriac vowels: EVA 'e'→ī, EVA 'o'→o/u, EVA 'a'→a/ā. Disambiguation layer resolves 7,007 tokens (20% of corpus) at >=70% confidence.

cipheractive2026-04-15

Against: Vowel-pattern assignment may be circular: is kuhla concentrated on pharma pages because of the encoding rule, or fit to that distribution?

Last test: Section entropy std dev: 0.0593. Range: 3.7149 - 3.9016. Variable across sections.

H1070.39

74.4% of Voynich plant illustrations match Mediterranean species at >=80% confidence: 100% Mediterranean flora, 0% New World. Plant families: Lamiaceae 33%, Asteraceae 22%, Solanaceae 21%, consistent with Dioscorides pharmaceutical tradition.

structuralactive2026-04-14

Against: Plant ID from stylised medieval illustrations is inherently subjective

Last test: Text linguistic statistics are orthogonal to plant illustration identification; the hypothesis requires independent botanical analysis of the actual i

H0050.38

The text is encoded using a simple substitution cipher on a language other than Latin or Italian.

cipheractive2026-04-13

Against: Word-final glyphs do not match common word endings in languages other than Latin or Italian

Last test: Strong Zipf exponent (0.8946) aligns with natural language, supporting the substitution cipher hypothesis; glyph entropy and word length are reasonabl

H-BRADY-070.35

The Syriac temporal adverb kaddīn (skeleton kdy) appears 1,326 times (3.8% of corpus) while the Jewish Babylonian Aramaic equivalent kəḏēn (skeleton kdn) appears 178 times — an 88:12 ratio supporting primarily Syriac text with minor JBA influence.

languageactive2026-04-15

Against: 3.8% is unusually high for a temporal adverb — may indicate non-linguistic function

H-BRADY-090.35

Herbal pages describe Galenic pharmaceutical properties (kyānā 'nature', ṭārā 'press/apply') rather than botanical catalogs — adding Löw botanical entries (42 terms) increased herbal coverage by only 0.1pp, and top decoded words match pharma page vocabulary.

structuralactive2026-04-15

Against: Top-word overlap could also reflect shared function-word vocabulary

H-BV-TOXIC-010.35

Toxic plant folios (15 specified Latin names, 14 with >=20 tokens) show elevated _.oii vowel-pattern fire rate: toxic 1.08% vs non-toxic 0.34% (3.15x ratio). One-tailed Welch's t-test p = 0.060, Cohen's d = 0.73, Hedges' g = 0.72. Bootstrap 95% CI on mean difference: [-0.00006, +0.01649]; the lower bound misses excluding zero by 0.00006 (0.006 percentage points). The pre-registered alpha=0.05 threshold is NOT met, but the effect size is medium-to-large and the direction matches prediction. Signal is bimodal: 5 of 14 toxic folios fire heavily (Paris quadrifolia 4.2%, Rhododendron 4.8%, Delphinium 1.9%, Euphorbia 1.7%, Cuscuta 1.6%, Nymphaea 0.9%), 9 fire at zero.

structuralactive2026-04-15

Against: p = 0.060 fails the pre-registered alpha = 0.05 threshold

Last test: p=0.060 one-tailed, d=0.73, 3.15x ratio. Direction correct, effect meaningful, but pre-registered threshold missed by 0.010. Bimodal: 5 toxic folios c
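
The effect-size statistics named in this card have standard definitions, sketched below with only the standard library. The inputs would be per-folio fire rates for the two groups; the toy numbers are placeholders rather than the board's data, and the p-value lookup is left out.

```python
import math
from statistics import mean, variance

def cohens_d(a, b):
    """Standardised mean difference using the pooled sample variance."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled)

def hedges_g(a, b):
    """Cohen's d multiplied by the small-sample correction 1 - 3 / (4N - 9)."""
    n = len(a) + len(b)
    return cohens_d(a, b) * (1 - 3 / (4 * n - 9))

def welch_t(a, b):
    """Welch's t statistic, which does not assume equal group variances."""
    return (mean(a) - mean(b)) / math.sqrt(variance(a) / len(a) + variance(b) / len(b))

# Placeholder groups, chosen only to make the arithmetic easy to follow.
toxic, non_toxic = [2, 3, 4], [1, 2, 3]
print(cohens_d(toxic, non_toxic), hedges_g(toxic, non_toxic), round(welch_t(toxic, non_toxic), 4))
```

Because the correction factor approaches 1 as the combined sample grows, a Hedges' g only slightly below Cohen's d is what one would expect for groups of a few dozen folios.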

H-BV-VOWEL-CODE-010.35

EVA vowel patterns encode section/domain content independent of the consonant skeleton. Specifically: pattern '_.eo' predicts pharmaceutical section modality at 100% across 8 unrelated skeletons; pattern '_.o.e' predicts biological at 100% across 3 skeletons; pattern '_.e' predicts biological at 58% across 12 skeletons (vs 30.5% random baseline). A naive-Bayes classifier using vowel-pattern features alone achieves 27.2% 5-fold CV accuracy vs 21.1% majority baseline (+6.1pp).

structuralactive2026-04-15

Against: Sample sizes are small for 100%-agreement patterns (n=3-8 skeletons)

Last test: Held-out: f101r+f89r2 (pharma), f78r+f82r (bio). Pharma F1 0.341 (NB) vs blind 0.614. Bio F1 0.658 (rule) vs blind 0.716. Multi-class 43.6% NB vs 55.7

H0650.33

The Voynich text uses a homophonic substitution cipher on medieval Latin where 2-3 Voynich glyphs map to each Latin letter, accounting for the observed entropy gap (3.8627 bits actual vs ~4.0 bits expected for Latin) and the anomalously high hapax ratio (70.1%) caused by variant glyph combinations encoding the same Latin word.

cipheractive2026-04-14

Against: Word entropy 10.4508 bits is very high for a simple homophonic cipher on Latin

Last test: Hypothesis fails mathematical constraint: 25 glyphs cannot support 2-3 variants per ~20 Latin letters (requires 40+ glyphs). Additionally, homophonic

H-BRADY-010.31

The Voynich Manuscript encodes an Aramaic pharmaceutical text in the Syriac tradition, using a consonant-skeleton abjad with stripped vowels, matching 86.9% of 35,259 filtered tokens against a 1,334-entry Syriac pharmaceutical lexicon.

languageactive2026-04-15

Against: The author's own confidence in the Syriac-specific identification is 40-50%

Last test: The 86.9% lexicon match is implausibly high for a historical cipher and contradicts the manuscript's centuries-long undecipherability; while corpus st

H-BRADY-060.31

The Voynich text uses the Sergian translation tradition (6th century Syriac), evidenced by: (a) tak-sa for dynamis (not haylā), (b) <=8 Greek loanwords (40 tokens total, all short transliterations), (c) native Syriac plant names dominate.

languageactive2026-04-15

Against: Absence of Greek is also consistent with text not being Syriac at all

Last test: Could not evaluate: Count Greek-transliteration-compatible skeletons in decoded output; compare against Sergian baseline

H-BRADY-080.30

Syriac āsyā ('physician') dominates JBA rappā 37:1, confirming primarily Syriac rather than JBA tradition.

languageactive2026-04-15

Against: Absolute count of rappā (1) is below statistical significance threshold

H-BRADY-140.30

The encoding reflects a 15th-century European scribe transcribing a Syriac pharmaceutical source: kaph/qoph merger (velar/uvular confusion), pharyngeal drop (het/ayin absent), gallows as paragraph markers — all consistent with a non-Semitic copyist using a purpose-built script.

structuralactive2026-04-15

Against: Copyist hypothesis is unfalsifiable in its general form

H-BRADY-030.27

EVA 'q' maps to Syriac waw, producing the conjunction wa- ('and') at 14.9% of tokens — a frequency consistent with attested Syriac prose.

cipheractive2026-04-15

Against: The word-initial frequency of q could match many high-frequency features; this is a single-datapoint match

Last test: Predicted 14.9% word-initial frequency for a single glyph is implausibly high given typical linguistic patterns; individual glyphs at word start rarel

H-BRADY-110.25

Nine Syriac pharmaceutical phrase-structure templates (TREATMENT, RECIPE_ACTION, etc.) match 109 decoded passages across all 225 folios, clustered on pharma and biological pages (f75v, f84r, f102r2, f102v2, f107r).

structuralactive2026-04-15

Against: 109 matches / 225 folios ≈ 0.48/folio — is this above chance?

H1090.20

Jaccard Index between Voynich vocabulary and proposed decipherment vocabulary is J~0.08, independently confirmed. Low but non-zero overlap indicates partial but real linguistic correspondence.

structuralactive2026-04-14

Against: J=0.08 is very low (92% non-overlap)

Last test: Jaccard Index of 0.08 is too low to support real linguistic correspondence; this overlap level falls within random expectation given the large vocabul
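
For reference, the Jaccard index compares two vocabularies as the size of their intersection over the size of their union. A minimal sketch with toy vocabularies (the word lists are illustrative only):

```python
def jaccard(a, b):
    """Size of the intersection divided by the size of the union of two vocabularies."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

voynich_vocab = {"daiin", "chedy", "ol"}
proposed_vocab = {"chedy", "ol", "aqua"}
print(jaccard(voynich_vocab, proposed_vocab))  # 0.5
```

Whether a value such as 0.08 is meaningful depends on the overlap expected by chance for vocabularies of the same sizes, which could be estimated by scoring randomly drawn word lists with the same function.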

H0020.17

The text uses a simple substitution cipher on medieval Latin, which would produce glyph entropy ~4.0 bits and word-final glyph distribution matching Latin word endings.

cipheractive2026-04-13

Against: Zipf exponent (0.8946) is lower than expected for a language with high morphological complexity like Latin

Last test: Glyph entropy of 3.8627 bits is notably lower than the predicted 4.0 bits for simple substitution on Latin, suggesting additional constraints or non-r

H1050.14

The Voynich script is a semi-syllabic system derived from Balkan scribal traditions, where complex glyphs are built from base characters plus modifiers (loops, triangles). The giant P glyph = N (known Balkan value), with loop = NO. The 4-shaped glyph = D, with triangle = DN, with loop = DNO. The g-shaped glyph = je/j/ja/ju. Combined glyphs produce Serbo-Croatian words (e.g., g + modified-P = jedan/jedno meaning 'one').

cipheractive2026-04-14

Against: Brain-V IC of 0.077 is above Serbo-Croatian expected range (~0.060-0.065)

Last test: Zipfian distribution and word length suggest plausibility, but 70.1% hapax ratio contradicts natural Serbo-Croatian structure and glyph mappings lack

H0260.14

The manuscript uses a null-derived word-construction rule (e.g., a gallows-glyph prefix system) such that a small set of ~5 prefix glyphs accounts for > 60% of all word tokens, producing the observed Zipf exponent of ~0.89 via combinatorial explosion of a limited stem vocabulary

cipheractive2026-04-13

Against: Zipf R-squared of 0.9084 is a moderate fit, not exceptional; some natural languages also show exponents near 0.89

Last test: The 70.1% hapax ratio and relatively flat Zipf exponent (0.8946, below typical 1.0+) indicate high vocabulary diversity and weak word-frequency concen

H0940.14

The 70.1% hapax ratio is substantially artifactual: stripping word-final glyphs from the set {y,n,l,r,o} reduces unique word types by at least 40%, revealing a core vocabulary consistent with a homophonic substitution cipher on medieval Latin or Italian

cipheractive2026-04-14

Against: If suffix stripping still leaves high hapax ratio, the hypothesis fails

Last test: Observed hapax ratio: 0.701. Expected: ~0.50. Voynich has unusually high hapax ratio (70.1%), suggesting either large vocabulary, verbose encoding, or

H0990.13

The high hapax ratio (70.1%) is substantially artifactual: word-final glyphs drawn from {y,n,l,r,o} function as inflectional suffixes, and stripping them reduces unique word types by at least 40%, collapsing the effective vocabulary toward the ~2,500-3,500 unique stems expected in a medieval herbal or recipe corpus.

cipheractive2026-04-14

Against: If suffixes encode cipher variation rather than morphology, stripping them destroys plaintext signal

Last test: Observed hapax ratio: 0.701. Expected: ~0.50. Voynich has unusually high hapax ratio (70.1%), suggesting either large vocabulary, verbose encoding, or

H1120.13

The zodiac section's anomalously low entropy (3.7149 bits, lowest of all sections) reflects a label-heavy encoding in which a large fraction of tokens are proper-name labels (star names, month names) encoded with a homophonic cipher, producing reduced glyph entropy relative to prose sections.

structuralactive2026-04-15

Against: Low entropy could alternatively reflect a simpler or less redundant cipher rather than label-specific encoding

Last test: Observed hapax ratio: 0.701. Expected: ~0.50. Voynich has unusually high hapax ratio (70.1%), suggesting either large vocabulary, verbose encoding, or

H1040.07

The Voynich Manuscript encodes a late medieval Moravian (Old Czech/Slavic) dialect using a constructed cipher alphabet derived from Latin, Glagolitic, and Cyrillic characters. Folio 14v (Acanthus mollis) yields medicinal instructions consistent with period phytotherapy when decoded as Moravian, and the final row contains the Gujarati/Gojri plant name Adulsa Vasa.

languageactive2026-04-14

Against: Brain-V's frequency analysis shows 0.98 correlation to Latin/Italian, not Slavic languages

Last test: While Zipfian properties (exponent 0.8946) indicate natural language-like structure, the hapax ratio of 0.701 is unusually high for authentic natural

H1060.06

The Voynich Manuscript is a trilingual pharmaceutical compendium encoding plant-based medicine in a mixed Latin/Greek/Arabic scribal system, with a 168-term dictionary covering ~80% of the text.

languageactive2026-04-14

Against: 80% coverage needs independent verification against EVA corpus

Last test: The high hapax ratio (0.701) contradicts the 168-term dictionary hypothesis; 80% coverage would require a much lower hapax ratio (~0.3-0.4) and fewer

Parked (7)

Unproven but not debunked. May be revisited with new evidence or methods.

FA0040.10

Attempted systematic cryptanalysis with the most sophisticated military techniques of the era. Friedman hypothesized the text was a constructed/synthetic universal language, not a cipher of a natural language.

cipherparked1945-01-01

Against: The team of expert military cryptanalysts (who broke PURPLE) could not crack it. No consistent substitution patterns, polyalphabetic structures, or transposition schemes were identified. Friedman's synthetic language hypothesis remains unproven but not fully disproven. Sealed NSA deposit (opened 1970) simply restated the hypothesis.

Last test: The team of expert military cryptanalysts (who broke PURPLE) could not crack it. No consistent substitution patterns, polyalphabetic structures, or tr

FA0050.10

Systematic analytical effort using Cold War-era cryptanalytic techniques. No specific decipherment claim — produced statistical analysis.

cipherparked1963-01-01

Against: Internal report acknowledged failure to produce any decipherment. Some statistical findings were useful (confirming word-level patterns, entropy measurements) but no cipher system was identified.

Last test: Internal report acknowledged failure to produce any decipherment. Some statistical findings were useful (confirming word-level patterns, entropy measu

FA0090.10

The text could encode a Malay or Southeast Asian language, based on word structure similarities to Austronesian languages with prefixing and suffixing morphology

languageparked1995-01-01

Against: Never developed into a full decipherment. Some structural parallels noted but insufficient for proof. Guy himself treated it as speculative.

Last test: Never developed into a full decipherment. Some structural parallels noted but insufficient for proof. Guy himself treated it as speculative.

FA0140.10

Using botanical anchoring (identifying plants, then mapping their names to glyph sequences), identified approximately 14 characters and two words. Proposed a natural language encoding.

cipherparked2014-01-01

Against: Proposed character values not confirmed by others. No one has extended his readings into a full decipherment. Plant identifications remain speculative. Bax died in 2017 before completing his work.

Last test: Proposed character values not confirmed by others. No one has extended his readings into a full decipherment. Plant identifications remain speculative

FA0170.10

The manuscript is written in phonetic Old Turkic, containing medical and botanical content

languageparked2018-01-01

Against: Not independently verified. Turkic language experts have not confirmed the readings. Proposed phonetic mappings are inconsistent across the manuscript. Not peer-reviewed.

Last test: Not independently verified. Turkic language experts have not confirmed the readings. Proposed phonetic mappings are inconsistent across the manuscript

FA0180.10

The manuscript was created in early 15th-century northern Italy, possibly by Antonio Averlino (Filarete), using a verbose/compressed cipher technique

cipherparked2006-01-01

Against: Historical analysis partially supported by radiocarbon dating (early 15th century confirmed). However, the proposed cipher mechanism has not been validated and no decipherment was produced.

Last test: Historical analysis partially supported by radiocarbon dating (early 15th century confirmed). However, the proposed cipher mechanism has not been vali

FA0200.10

Proposed decipherment using a cipher system called the Naibbe cipher, with supplementary materials published alongside a peer-reviewed paper

cipherparked2025-01-01

Against: Recent claim (2025). Peer-reviewed publication exists (DOI: 10.1080/01611194.2025.2566408). Status within research community not yet determined. Needs independent verification and community evaluation.

Last test: Recent claim (2025). Peer-reviewed publication exists (DOI: 10.1080/01611194.2025.2566408). Status within research community not yet determined. Needs

Eliminated (24)

These approaches have been tested and failed. Brain-V will not re-test them.

H0870.04

The extremely high hapax legomena ratio (70.1% of 8,261 unique word types) is generated by a systematic scribal abbreviation convention (specifically, word-final glyph truncation) rather than representing a genuinely large underlying vocabulary, such that reconstructing truncated forms by appending the statistically most probable word-final glyph to hapax tokens would reduce the hapax ratio to below 50% and increase Zipf fit R².

structuraleliminated2026-04-14

Against: Current Zipf R²=0.9084 is already moderate, suggesting real frequency structure; truncation artifacts typically degrade Zipf fit rather than preserve it

Last test: Observed hapax ratio: 0.701. Expected: ~0.50. Voynich has unusually high hapax ratio (70.1%), suggesting either large vocabulary, verbose encoding, or
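
The two statistics this entry is judged on can be sketched as follows. This is a minimal sketch under stated assumptions: the hapax ratio is taken per word type (matching "70.1% of 8,261 unique word types"), and the Zipf fit is an ordinary least-squares line through log(frequency) against log(rank). How Brain-V actually computes either value is not stated in the entry, and the function names are hypothetical.

```python
import math
from collections import Counter

def hapax_ratio(tokens):
    """Share of word types that occur exactly once."""
    counts = Counter(tokens)
    if not counts:
        return 0.0
    return sum(1 for c in counts.values() if c == 1) / len(counts)

def zipf_fit(tokens):
    """Least-squares fit of log(frequency) on log(rank).
    Returns (exponent, r_squared), where exponent is the negated slope."""
    freqs = sorted(Counter(tokens).values(), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two word types")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    # A flat frequency list has no variance to explain; treat it as a perfect fit.
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 1.0
    return -slope, r_squared
```

Running both functions before and after a proposed reconstruction of truncated forms would show whether the hapax ratio falls and R² rises, which is what the hypothesis predicts.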

H0800.04

The extreme hapax ratio (70.1%) is produced by a systematic word-boundary segmentation error in the transcription, where a small set of suffix or prefix tokens are being attached to base words inconsistently, artificially inflating vocabulary size by 30-45%.

structuraleliminated2026-04-14

Against: Transcription protocols (EVA, Currier) have been validated across multiple independent researchers, reducing likelihood of systematic segmentation error

Last test: Observed hapax ratio: 0.701. Expected: ~0.50. Voynich has unusually high hapax ratio (70.1%), suggesting either large vocabulary, verbose encoding, or

H0790.05

The Voynich text uses a homophonic substitution cipher on medieval Latin where 2-3 Voynich glyphs map to each Latin letter, which would reduce observed glyph entropy below true Latin entropy (~4.0 bits) by approximately log2(avg_homophones) bits while preserving Zipf scaling.

ciphereliminated2026-04-14

Against: Glyph entropy gap (0.14 bits) is far smaller than expected for a 2-3 homophone scheme, which by the hypothesis's own log2(avg_homophones) term should reduce entropy by ~1.0-1.58 bits

Last test: Observed glyph entropy (3.8627 bits) far exceeds the ~2.68 bits predicted by homophonic substitution (4.0 - log2(2.5)), contradicting the hypothesis d
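
The arithmetic behind this test can be reproduced with a short sketch. It uses only the figures quoted in the entry (4.0 bits for Latin, 3.8627 bits observed) and the hypothesis's own formula, plaintext entropy minus log2(avg_homophones). The formula belongs to the hypothesis and is not a general statement about homophonic ciphers; the function names are hypothetical.

```python
import math
from collections import Counter

def shannon_entropy(symbols):
    """Unigram Shannon entropy in bits per symbol."""
    counts = Counter(symbols)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def predicted_entropy_h079(plaintext_bits, avg_homophones):
    """Entropy predicted by H079: plaintext entropy minus log2(avg_homophones)."""
    return plaintext_bits - math.log2(avg_homophones)

LATIN_BITS = 4.0      # figure quoted in the hypothesis
OBSERVED = 3.8627     # figure quoted in the last test

for k in (2, 2.5, 3):
    predicted = predicted_entropy_h079(LATIN_BITS, k)
    print(f"{k} homophones: predicted {predicted:.2f} bits, "
          f"observed exceeds it by {OBSERVED - predicted:.2f} bits")
```

For every value between 2 and 3 homophones the predicted entropy sits at least 0.86 bits below the observed figure, which is why the entry was eliminated.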

H0680.04

The extremely high hapax ratio (70.1%) is partially produced by a systematic scribal abbreviation convention where a base word form appears fully in first use and is abbreviated (via suffix truncation or initial-letter substitution) in subsequent uses within the same folio or page, making most 'unique' word types artifactual variants of a smaller true vocabulary.

structuraleliminated2026-04-14

Against: Scribal abbreviation in Latin typically produces recognizable ligatures or suspension marks; Voynich text shows no obvious such markers

Last test: Observed hapax ratio: 0.701. Expected: ~0.50. Voynich has unusually high hapax ratio (70.1%), suggesting either large vocabulary, verbose encoding, or

H0670.04

The zodiac and astronomical sections use a label-encoding scheme where words are proper nouns or fixed labels (star names, month names, zodiac terms) rather than running prose, which explains their anomalously low entropy (3.7149 and 3.7471 bits) as a reduced effective vocabulary with high repetition of a small label lexicon.

structuraleliminated2026-04-14

Against: If labels, the type-token ratio and average word length should be distinctly lower than in herbal/recipes sections; this needs verification

Last test: Observed hapax ratio: 0.701. Expected: ~0.50. Voynich has unusually high hapax ratio (70.1%), suggesting either large vocabulary, verbose encoding, or
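
The verification named in the Against note can be sketched as a per-section comparison. This is a minimal sketch, assuming the sections are available as a mapping from section name to token list; the function name and the toy tokens are hypothetical. Type-token ratio falls as a sample grows, so sections should be compared on equal-sized samples.

```python
def label_stats(sections):
    """Map each section name to (type_token_ratio, mean_word_length)."""
    stats = {}
    for name, tokens in sections.items():
        if not tokens:
            stats[name] = (0.0, 0.0)
            continue
        ttr = len(set(tokens)) / len(tokens)
        mean_len = sum(len(t) for t in tokens) / len(tokens)
        stats[name] = (ttr, mean_len)
    return stats

# Toy tokens for illustration only (not real section data).
toy = {
    "zodiac": ["otol", "otol", "okal", "otol"],
    "herbal": ["daiin", "chol", "shedy", "qokeedy"],
}
for name, (ttr, mean_len) in label_stats(toy).items():
    print(f"{name}: TTR={ttr:.2f}, mean word length={mean_len:.2f}")
```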

H0610.05

The high hapax ratio (70.1%) is produced by a systematic scribal abbreviation scheme in which word-final suffixes are consistently dropped or truncated, such that what appear as unique words are abbreviated forms of a smaller set of ~2,400 base words, consistent with the observed vocabulary size being roughly 3.4x larger than expected for a natural language corpus of 38,053 words.

structuraleliminated2026-04-13

Against: If hapax words are abbreviations, expanding them should reduce vocabulary dramatically, but no consistent expansion rule has been identified

Last test: Observed hapax ratio: 0.701. Expected: ~0.50. Voynich has unusually high hapax ratio (70.1%), suggesting either large vocabulary, verbose encoding, or

H0450.05

The high hapax ratio (70.1%) arises from a systematic abbreviation or suffix-stripping convention rather than from a large vocabulary, such that each Voynich 'word' represents a root plus a dropped inflectional ending, testable by checking whether hapax forms cluster around specific glyph-final patterns that would correspond to stripped suffixes.

ciphereliminated2026-04-13

Against: If suffixes were stripped uniformly, word entropy (10.45 bits) would likely be lower than observed

Last test: Observed hapax ratio: 0.701. Expected: ~0.50. Voynich has unusually high hapax ratio (70.1%), suggesting either large vocabulary, verbose encoding, or
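
The clustering test proposed in this hypothesis can be sketched as a comparison of word-final glyph distributions. This is a minimal sketch, assuming each character of the transcription stands for one glyph (a simplification of EVA) and that the comparison is made over word types rather than tokens; the function name is hypothetical.

```python
from collections import Counter

def final_glyph_shares(tokens):
    """Return (hapax_shares, other_shares): for hapax word types and for
    all other word types, the share of types ending in each glyph."""
    counts = Counter(t for t in tokens if t)
    hapax_final = Counter(w[-1] for w, c in counts.items() if c == 1)
    other_final = Counter(w[-1] for w, c in counts.items() if c > 1)

    def shares(counter):
        total = sum(counter.values())
        return {g: n / total for g, n in counter.items()} if total else {}

    return shares(hapax_final), shares(other_final)
```

If suffix stripping were at work, the hapax distribution would be expected to concentrate on a few final glyphs far more than the distribution for repeated words does.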

H0350.04

The high hapax ratio (70.1%) is largely an artifact of a systematic abbreviation or truncation cipher, where scribes consistently dropped word-final syllables or morphemes, causing each abbreviated form to appear unique even though the underlying vocabulary is much smaller.

ciphereliminated2026-04-13

Against: Zipf exponent 0.8946 with R-squared 0.9084 indicates moderate but not strong natural-language Zipf fit, which genuine abbreviation should push closer to 1.0

Last test: Observed hapax ratio: 0.701. Expected: ~0.50. Voynich has unusually high hapax ratio (70.1%), suggesting either large vocabulary, verbose encoding, or

H0300.05

The high hapax ratio (70.1%) is primarily an artifact of systematic scribal abbreviation, where common root words are truncated with consistent suffixes, making truncated forms appear unique when they share stems with high-frequency words.

structuraleliminated2026-04-13

Against: If abbreviation caused hapax inflation, stemming should recover a lower effective vocabulary; no such reduction has been confirmed

Last test: Observed hapax ratio: 0.701. Expected: ~0.50. Voynich has unusually high hapax ratio (70.1%), suggesting either large vocabulary, verbose encoding, or

H0280.05

The high hapax rate (70.1%) is produced by a systematic nulls-insertion or verbose encoding scheme: specifically, hapax words are morphologically related to non-hapax words by addition or substitution of a single terminal glyph, such that > 50% of hapax words have a Levenshtein distance of 1 from a high-frequency word

ciphereliminated2026-04-13

Against: High hapax rate could also reflect a highly inflected language (e.g., Arabic, Hebrew, or a Slavic language) without any cipher

Last test: Observed hapax ratio: 0.701. Expected: ~0.50. Voynich has unusually high hapax ratio (70.1%), suggesting either large vocabulary, verbose encoding, or
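
The > 50% prediction in this hypothesis can be checked directly with a sketch like the one below. It makes two assumptions the entry does not specify: "high-frequency" is taken to mean at least `min_freq` occurrences (a hypothetical threshold), and any single edit counts, not only one at the terminal glyph, so the result is an upper bound on the share the hypothesis describes.

```python
from collections import Counter

def within_one_edit(a, b):
    """True if a and b differ by exactly one insertion, deletion, or substitution."""
    if a == b or abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make a the shorter (or equal-length) word
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1
    if len(a) == len(b):
        return a[i + 1:] == b[i + 1:]   # one substitution at position i
    return a[i:] == b[i + 1:]           # one extra glyph in b at position i

def hapax_near_frequent_share(tokens, min_freq=10):
    """Share of hapax word types within one edit of a high-frequency word type."""
    counts = Counter(tokens)
    hapax = [w for w, c in counts.items() if c == 1]
    frequent = [w for w, c in counts.items() if c >= min_freq]
    if not hapax:
        return 0.0
    near = sum(1 for h in hapax if any(within_one_edit(h, f) for f in frequent))
    return near / len(hapax)
```

A share well above 0.5 would not confirm the cipher reading on its own, since the Against note points out that a highly inflected natural language could produce the same pattern.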

H0100.05

The manuscript's text structure reflects a single, coherent narrative.

structuraleliminated2026-04-13

Against: Other sections do not show similar patterns

Last test: While Zipf exponent (0.89) suggests narrative-like structure, the extremely high hapax ratio (0.701) contradicts coherence; sustained narratives show

FA0190.00

The text encodes Hebrew using a substitution system

ciphereliminated2003-01-01

Against: Proposed substitution produced incoherent text. Could not be applied consistently across the manuscript. Word structure of Voynichese does not match Hebrew morphology.

Last test: Proposed substitution produced incoherent text. Could not be applied consistently across the manuscript. Word structure of Voynichese does not match H

FA0160.00

The manuscript is written in 'proto-Romance' — a proposed now-extinct spoken precursor to modern Romance languages. No cipher involved. Content is a reference compendium for a Dominican nun about women's health.

languageeliminated2019-01-01

Against: Overwhelmingly rejected by medieval linguists and Romance philologists. 'Proto-Romance' as described never existed — historical linguistics documents the Romance language family tree with no such common late-medieval spoken form. Translations are linguistically incoherent, mixing vocabulary and grammar from different languages across centuries. Used circular reasoning. The journal Romance Studies distanced itself from the article. Lisa Fagin Davis and numerous scholars publicly debunked it.

Last test: Overwhelmingly rejected by medieval linguists and Romance philologists. 'Proto-Romance' as described never existed — historical linguistics documents

FA0150.00

The text was generated by self-citation: the scribe copied and modified words from elsewhere in the manuscript while writing, producing structured but meaningless text

nulleliminated2014-01-01

Against: Explains some statistical features but struggles to account for all long-range correlations found by Montemurro & Zanette (2013). Cannot explain why the author would invest years creating elaborately illustrated meaningless text.

Last test: Explains some statistical features but struggles to account for all long-range correlations found by Montemurro & Zanette (2013). Cannot explain why t

FA0130.00

The plants depict New World (Mesoamerican) species and the text is written in Nahuatl (Aztec language). Claimed to identify 37 plants as species from Mexico.

languageeliminated2013-01-01

Against: Botanical experts disputed most plant identifications. Radiocarbon dating and provenance chain place the manuscript in Europe. No credible Nahuatl text readings produced. Would require a 15th-century European manuscript to encode a pre-Columbian American language with no historical support.

Last test: Botanical experts disputed most plant identifications. Radiocarbon dating and provenance chain place the manuscript in Europe. No credible Nahuatl tex

FA0120.00

The manuscript was written by young Leonardo da Vinci using a simple substitution cipher with mirror writing, encoding Italian text

ciphereliminated2008-01-01

Against: Radiocarbon dating (1404-1438) predates Leonardo's birth (1452). Decipherments were fragmentary and inconsistent. Paleographic analysis does not support Leonardo's hand.

Last test: Radiocarbon dating (1404-1438) predates Leonardo's birth (1452). Decipherments were fragmentary and inconsistent. Paleographic analysis does not suppo

FA0110.00

Statistical analysis showing the text was generated by a stochastic random process, supporting the hoax hypothesis

nulleliminated2007-01-01

Against: Methodology contested. Subsequent work by Amancio et al. (2013) and Montemurro & Zanette (2013) found long-range statistical patterns consistent with natural language, contradicting pure randomness.

Last test: Methodology contested. Subsequent work by Amancio et al. (2013) and Montemurro & Zanette (2013) found long-range statistical patterns consistent with

FA0100.00

The manuscript is a hoax. Text with Voynich-like statistical properties could be generated using a Cardan grille over a table of syllables. Proposed Edward Kelley as the hoaxer.

nulleliminated2004-01-01

Against: Showed text COULD be generated mechanically but not that it WAS. Generated text does not match all known statistical properties. Montemurro & Zanette (2013) found long-range correlations inconsistent with simple grille generation. The method can generate text resembling many things.

Last test: Showed text COULD be generated mechanically but not that it WAS. Generated text does not match all known statistical properties. Montemurro & Zanette

FA0080.00

The manuscript was a Cathar liturgical manual for the Endura ritual of assisted suicide, written in a creole of Flemish, Old French, and Old High German

languageeliminated1987-01-01

Against: Mixed vocabulary from three languages across centuries. Historical experts found no correspondence with known Cathar practices. The Endura was a fast to death, not the elaborate ritual described. Botanical/astronomical sections make no sense as liturgy. Manuscript post-dates Cathar destruction by over a century.

Last test: Mixed vocabulary from three languages across centuries. Historical experts found no correspondence with known Cathar practices. The Endura was a fast

FA0070.00

The manuscript was written in Ukrainian without vowels, containing letters about the fall of a Ukrainian kingdom

languageeliminated1978-01-01

Against: Translations were vague and inconsistent. Linguistic experts found no credible connection to Ukrainian. The vowel-removal mapping was arbitrary. Resulting readings were incoherent with no historical corroboration.

Last test: Translations were vague and inconsistent. Linguistic experts found no credible connection to Ukrainian. The vowel-removal mapping was arbitrary. Resul

FA0060.00

The manuscript used multiple simple substitution ciphers (different keys on different pages) encoding Latin and/or early Italian. Claimed to have partially decoded plant names.

ciphereliminated1975-01-01

Against: Proposed substitution tables were inconsistent even within pages he claimed to have solved. Different researchers applying his own keys got different results. D'Imperio and others showed his keys did not produce consistent results across the manuscript.

Last test: Proposed substitution tables were inconsistent even within pages he claimed to have solved. Different researchers applying his own keys got different

FA0030.00

The manuscript was authored by Anthony Ascham (16th-century English physician) using a double arithmetic progression polyalphabetic cipher encoding English text

ciphereliminated1945-01-01

Against: Method never published in full, making it unverifiable. Could not demonstrate consistent results. Radiocarbon dating (1404-1438) predates claimed author Ascham (1515-1568) by a century.

Last test: Method never published in full, making it unverifiable. Could not demonstrate consistent results. Radiocarbon dating (1404-1438) predates claimed auth

FA0020.00

The manuscript uses a simple substitution cipher encoding abbreviated medieval Latin

ciphereliminated1943-01-01

Against: Decipherments produced largely unintelligible text. Other researchers could not reproduce meaningful readings using his key. Statistical properties of Voynich text do not match abbreviated Latin.

Last test: Decipherments produced largely unintelligible text. Other researchers could not reproduce meaningful readings using his key. Statistical properties of

FA0010.00

The manuscript was written by Roger Bacon using a microscopic shorthand cipher embedded in pen strokes, encoding descriptions of cells under a microscope and the Andromeda nebula

ciphereliminated1921-01-01

Against: John Matthews Manly (1931) proved the 'microscopic shorthand' was ink deterioration, not intentional markings. The anagrammatic method was so unconstrained that any desired plaintext could be extracted from any ciphertext — no unique solution exists.

Last test: John Matthews Manly (1931) proved the 'microscopic shorthand' was ink deterioration, not intentional markings. The anagrammatic method was so unconstr