Analyzing Example

Example report output from scripts/example_analyze_run.py
================================================================================
KRYPTOS K4 — STRUCTURAL COMPARISON REPORT  |  ALPHABET = STD (len=26)

Goal: create a generated ciphertext (using a candidate method) and compare its
statistical/structural fingerprint to the real Kryptos K4 ciphertext. This
helps us triage methods before doing deeper searches.

================================================================================
A) INPUTS

Alphabet name               : STD
Alphabet                    : ABCDEFGHIJKLMNOPQRSTUVWXYZ
Plaintext source            : K4_EXAMPLE_PLAINS[0]
Plaintext length            : 97
K4 ciphertext length        : 97
Random seed                 : 42

Plaintext (spaced):
THE HIDDEN DIRECTON IS IN EASTNORTHEAST POINTS TO THE NEXT MARK ON THE WALL A
BERLINCLOCK REVEALS YOUR TRUE POSITION

Plaintext (clean A–Z):
THEHI DDEND IRECT ONISI NEAST NORTH EASTP OINTS TOTHE NEXTM ARKON THEWA LLABE RLINC LOCKR EVEAL SYOUR TRUEP OSITI ON

Method used to generate a test cipher:
OTP/Vigenère-style encryption: cipher = plain + key (mod N). Here the key is
random, so we do NOT expect a close match to K4; this is just a demonstration
of the analysis functions.

OTP key (preview)           : UDAXIHHEXDVXRCSNBACGHQTARGWUWRNHOSIZAYZF...
Generated cipher (preview)  : NKEEQKKIKGDOVELBOIUOUUTSKTKLPYRHGLXNILSX...
Real K4 cipher (preview)    : OBKRUOXOGHULBSOLIFBBWFLRVQQPRNGKSSOTWTQS...
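
The rule above, cipher = plain + key (mod N), can be sketched as follows. This is a minimal illustration and not the script's actual code: the function names `encrypt` and `decrypt` are hypothetical, and each letter is assumed to be indexed by its position in whichever alphabet string is in use.

```python
STD = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
KRY = "KRYPTOSABCDEFGHIJLMNQUVWXZ"  # keyed alphabet shown in the KRY report


def encrypt(plain: str, key: str, alphabet: str = STD) -> str:
    """cipher[i] = plain[i] + key[i] (mod N), using positions in `alphabet`."""
    if len(key) < len(plain):
        raise ValueError("key must be at least as long as the plaintext")
    index = {ch: i for i, ch in enumerate(alphabet)}
    n = len(alphabet)
    return "".join(alphabet[(index[p] + index[k]) % n] for p, k in zip(plain, key))


def decrypt(cipher: str, key: str, alphabet: str = STD) -> str:
    """Inverse operation: plain[i] = cipher[i] - key[i] (mod N)."""
    if len(key) < len(cipher):
        raise ValueError("key must be at least as long as the ciphertext")
    index = {ch: i for i, ch in enumerate(alphabet)}
    n = len(alphabet)
    return "".join(alphabet[(index[c] - index[k]) % n] for c, k in zip(cipher, key))


# First ten letters of the plaintext and of the key previews in the two reports.
print(encrypt("THEHIDDEND", "UDAXIHHEXD"))       # NKEEQKKIKG
print(encrypt("THEHIDDEND", "QPKWBAATWP", KRY))  # XLEEWLLIJG
```

Both outputs agree with the first ten letters of the "Generated cipher (preview)" lines in the STD and KRY reports.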
================================================================================
WHAT THESE METRICS MEAN (beginner-friendly)

This script does NOT try to read plaintext. It compares *structure* of a
generated ciphertext to the real Kryptos K4 ciphertext. Think of it like
comparing fingerprints: even if you don't know the person, you can tell
whether two prints look similar.

Key metrics:
- IoC (Index of Coincidence): measures how uneven letter frequencies are.
Uniformly random text over 26 letters has an IoC near 1/26 ≈ 0.0385; English
plaintext is typically around 0.067. If two ciphers have similar IoC, their
letter distributions are similarly flat or peaked.

- Entropy (bits): measures how 'spread out' the letter frequencies are. Higher
entropy ~ closer to uniform random distribution; lower entropy ~ more biased.
The maximum for 26 letters is log2(26) ≈ 4.700 bits.

- Unigram L1 distance: compares letter frequency profiles between two texts.
Lower means more similar.

- Autocorrelation curve: for each shift (1..N), counts the positions where the
text matches a copy of itself moved by that shift. Periodic keys and some
transpositions create peaks at particular shifts.

- Repeat-distance histogram (bigrams): records how far apart repeated 2-letter
chunks occur. Peaks at particular distances can reveal hidden grid/route
patterns.

- Top-bigram overlap: compares the set of most common bigrams (2-letter
chunks) in both texts. Higher overlap suggests similar local texture.

- K4-likeness score: an aggregate 0..1 score. Higher means 'more similar'
across several signals. It is NOT proof of correctness—just a triage tool.
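
The three frequency-based metrics in the list above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the script's code: the function names are hypothetical, the IoC is the unnormalized form (no multiplication by 26), and the L1 distance is the plain sum of absolute differences between relative frequencies.

```python
from collections import Counter
from math import log2


def ioc(text: str) -> float:
    """Chance that two distinct positions in `text` hold the same letter."""
    n = len(text)
    if n < 2:
        return 0.0
    counts = Counter(text)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


def entropy_bits(text: str) -> float:
    """Shannon entropy of the letter distribution, in bits per letter."""
    n = len(text)
    if n == 0:
        return 0.0
    return -sum((c / n) * log2(c / n) for c in Counter(text).values())


def unigram_l1(a: str, b: str) -> float:
    """Sum of absolute differences between two relative-frequency profiles."""
    if not a or not b:
        raise ValueError("both texts must be non-empty")
    ca, cb = Counter(a), Counter(b)
    # Counter returns 0 for letters that are missing from one of the texts.
    return sum(abs(ca[ch] / len(a) - cb[ch] / len(b)) for ch in set(ca) | set(cb))


print(round(ioc("AABB"), 4))         # 0.3333
print(entropy_bits("AABB"))          # 1.0
print(unigram_l1("AAAA", "BBBB"))    # 2.0
```

The unigram L1 values reported in section C appear consistent with this definition for two 97-letter texts: 0.4124 ≈ 40/97 and 0.5979 ≈ 58/97.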
================================================================================
B) FEATURE EXTRACTION

Core stats (single-text fingerprints):

Real K4:
IoC                         : 0.0361
Entropy (bits)              : 4.555
Unique letters              : 26
Max run length              : 2
Double letters (XX)         : 6

Generated (OTP from example plain):
IoC                         : 0.0380
Entropy (bits)              : 4.524
Unique letters              : 26
Max run length              : 2
Double letters (XX)         : 3

Random baseline:
IoC                         : 0.0365
Entropy (bits)              : 4.544
Unique letters              : 26
Max run length              : 2
Double letters (XX)         : 4
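
The "Max run length" and "Double letters (XX)" rows above are simple adjacency counts. A minimal sketch follows, assuming a run is a stretch of identical consecutive letters and a double is any position where a letter equals its successor; the function names are hypothetical.

```python
def max_run_length(text: str) -> int:
    """Length of the longest stretch of identical consecutive letters."""
    best = run = 0
    prev = None
    for ch in text:
        run = run + 1 if ch == prev else 1
        best = max(best, run)
        prev = ch
    return best


def double_letters(text: str) -> int:
    """Number of positions i where text[i] == text[i + 1]."""
    return sum(1 for a, b in zip(text, text[1:]) if a == b)


# Only the 40-letter K4 preview from section A, not the full 97 letters.
preview = "OBKRUOXOGHULBSOLIFBBWFLRVQQPRNGKSSOTWTQS"
print(max_run_length(preview))  # 2
print(double_letters(preview))  # 3 (BB, QQ, SS)
```

The preview covers only part of the ciphertext, so its double-letter count is lower than the value of 6 reported for the full text.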

Real K4 — n-gram texture
------------------------
Top bigrams : HU:2, SO:2, FB:2, GK:2, SS:2, QS:2, EK:2, KZ:2, TJ:2, DI:2
Top trigrams: OBK:1, BKR:1, KRU:1, RUO:1, UOX:1, OXO:1, XOG:1, OGH:1, GHU:1, HUL:1

Generated — n-gram texture
--------------------------
Top bigrams : KE:2, EL:2, IL:2, OM:2, OC:2, NK:1, EE:1, EQ:1, QK:1, KK:1
Top trigrams: NKE:1, KEE:1, EEQ:1, EQK:1, QKK:1, KKI:1, KIK:1, IKG:1, KGD:1, GDO:1

Random baseline — n-gram texture
--------------------------------
Top bigrams : UF:2, FO:2, MI:2, YB:2, FR:1, RX:1, XH:1, HF:1, OM:1, IU:1
Top trigrams: UFR:1, FRX:1, RXH:1, XHF:1, HFO:1, FOM:1, OMI:1, MIU:1, IUW:1, UWR:1
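
The n-gram listings above, and the top-bigram overlap used later in section C, can be sketched with `collections.Counter`. The function names and the overlap definition (shared top-k n-grams divided by k) are assumptions for illustration; the script may use a different k or a different normalization.

```python
from collections import Counter


def top_ngrams(text: str, n: int, k: int = 10) -> list[tuple[str, int]]:
    """The k most frequent overlapping n-grams; ties keep first-seen order."""
    grams = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    return grams.most_common(k)


def top_overlap(a: str, b: str, n: int = 2, k: int = 10) -> float:
    """Fraction of the top-k n-grams shared by both texts (assumed definition)."""
    top_a = {g for g, _ in top_ngrams(a, n, k)}
    top_b = {g for g, _ in top_ngrams(b, n, k)}
    return len(top_a & top_b) / k


# First twelve letters of the K4 preview from section A.
print(top_ngrams("OBKRUOXOGHUL", 3))
print(top_overlap("ABAB", "ABAB", k=2))  # 1.0
```

On the twelve-letter prefix every trigram occurs once, so the result lists them in order of first appearance, which matches the order of the "Top trigrams" line for Real K4. Dividing by k understates the overlap when a text has fewer than k distinct n-grams.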
================================================================================
C) COMPARISON AGAINST REAL K4

Generated vs K4
---------------
Unigram L1 distance         : 0.4124  (lower = more similar)
Chi-square vs target        : 33.05  (lower = more similar)
ΔIoC                        : 0.0019  (lower = more similar)
ΔEntropy (bits)             : 0.031  (lower = more similar)
Autocorr L1                 : 43.00  (lower = more similar)
RepDist2 L1                 : 13.00  (lower = more similar)
Top-bigram overlap          : 0.000  (higher = more similar)
K4-likeness score           : 0.348  => FAR (likely not structurally similar)

Random vs K4
------------
Unigram L1 distance         : 0.5979  (lower = more similar)
Chi-square vs target        : 86.72  (lower = more similar)
ΔIoC                        : 0.0004  (lower = more similar)
ΔEntropy (bits)             : 0.011  (lower = more similar)
Autocorr L1                 : 43.00  (lower = more similar)
RepDist2 L1                 : 12.00  (lower = more similar)
Top-bigram overlap          : 0.000  (higher = more similar)
K4-likeness score           : 0.348  => FAR (likely not structurally similar)
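
The "Autocorr L1" and "RepDist2 L1" rows compare two curves or histograms position by position. The sketch below shows one way to build and compare them; it is not the script's implementation. The function names are hypothetical, and measuring repeat distances between successive occurrences of the same bigram (rather than between all pairs) is an assumption.

```python
from collections import Counter, defaultdict


def autocorr(text: str, max_shift: int) -> list[int]:
    """For each shift s in 1..max_shift, count i with text[i] == text[i + s]."""
    return [
        sum(1 for i in range(len(text) - s) if text[i] == text[i + s])
        for s in range(1, max_shift + 1)
    ]


def repeat_distances(text: str, n: int = 2) -> Counter:
    """Histogram of distances between successive occurrences of each n-gram."""
    positions = defaultdict(list)
    for i in range(len(text) - n + 1):
        positions[text[i:i + n]].append(i)
    hist = Counter()
    for pos in positions.values():
        for a, b in zip(pos, pos[1:]):
            hist[b - a] += 1
    return hist


def curve_l1(x: list[int], y: list[int]) -> int:
    """L1 distance between two equal-length curves."""
    return sum(abs(a - b) for a, b in zip(x, y))


def hist_l1(h1: Counter, h2: Counter) -> int:
    """L1 distance between two histograms; missing bins count as 0."""
    return sum(abs(h1[d] - h2[d]) for d in set(h1) | set(h2))


print(autocorr("ABCABC", 5))       # [0, 0, 3, 0, 0]
print(repeat_distances("ABCABC"))  # Counter({3: 2})
```

The normalized figures quoted in section D look like these raw L1 values divided by the text length (43/97 ≈ 0.443 and 13/97 ≈ 0.134), though the report does not state the normalization explicitly.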

================================================================================
D) AUTO-EXPLANATION (Generated vs K4)

One-line summary:
  Similarity score=0.348 | L1=0.412 | ΔIoC=0.002 | ac≈0.443 | rd≈0.134 

Interpretation bullets:
 - Unigram distribution differs noticeably (L1=0.4124).
 - IoC matches closely (Δ=0.0019).
 - Entropy matches closely (Δ=0.031 bits).
 - Autocorrelation differs a lot (norm≈0.443).
 - Repeat-distance differs a lot (norm≈0.134).
 - Top-bigram overlap is low (0.00).

Tags (machine-friendly):
  LEN_OK, UNIGRAM_FAR, IOC_MATCH, ENTROPY_MATCH, AUTOCORR_FAR, REPDIST_FAR, BIGRAMS_LOW_OVERLAP, OVERALL_FAR
================================================================================
E) WIDTH PROBE (Real K4)

This tries all factor-pair grids (rows*cols = length) and computes per-column
statistics. If a certain width makes columns look unusually different from
each other, that can hint at a hidden grid/transposition structure under that
width. 'interesting' is a simple combined score (higher = more structure).
Because 97 is prime, the only factor pairs are 1*97 and 97*1, so the probe
cannot rank any real grid widths for K4 and both candidates score 0.

Top 10 candidates (rows, cols) by 'interestingness':

 1. rows= 1 cols=97 | interesting=0.0000 | IoC_std=0.0000 | Ent_std=0.0000
 2. rows=97 cols= 1 | interesting=0.0000 | IoC_std=0.0000 | Ent_std=0.0000
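
A rough sketch of such a width probe shows why both grids score exactly zero. The names are hypothetical, and the sketch assumes the text is written into the grid row by row, so that column c holds every cols-th letter starting at position c; the script's actual 'interesting' score is not reproduced here, only the per-column IoC spread.

```python
from collections import Counter
from statistics import pstdev


def factor_pairs(n: int) -> list[tuple[int, int]]:
    """All (rows, cols) with rows * cols == n, in increasing order of rows."""
    return [(r, n // r) for r in range(1, n + 1) if n % r == 0]


def ioc(text: str) -> float:
    """Unnormalized Index of Coincidence; 0.0 for texts shorter than 2."""
    n = len(text)
    if n < 2:
        return 0.0
    return sum(c * (c - 1) for c in Counter(text).values()) / (n * (n - 1))


def column_ioc_std(text: str, cols: int) -> float:
    """Population standard deviation of the per-column IoC values."""
    columns = [text[c::cols] for c in range(cols)]
    return pstdev([ioc(col) for col in columns])


print(factor_pairs(97))             # [(1, 97), (97, 1)]
print(column_ioc_std("AXAYAZ", 2))  # 0.5
```

With cols=97 every column holds a single letter, so each per-column IoC is 0 and the spread is 0. With cols=1 there is only one column, and the spread of a single value is also 0. A composite (non-prime) length would give the probe real widths to compare.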
================================================================================
FINAL VERDICT

NO HIT: generated cipher is not K4-like enough (score=0.348)

Important: a high K4-likeness score does NOT prove you found the real method.
It only means the method produces ciphertext that shares structural features
with K4, so it is worth deeper exploration (crib constraints, known hints,
route geometry, etc.).
================================================================================
================================================================================
KRYPTOS K4 — STRUCTURAL COMPARISON REPORT  |  ALPHABET = KRY (len=26)

Goal: create a generated ciphertext (using a candidate method) and compare its
statistical/structural fingerprint to the real Kryptos K4 ciphertext. This
helps us triage methods before doing deeper searches.

================================================================================
A) INPUTS

Alphabet name               : KRY
Alphabet                    : KRYPTOSABCDEFGHIJLMNQUVWXZ
Plaintext source            : K4_EXAMPLE_PLAINS[0]
Plaintext length            : 97
K4 ciphertext length        : 97
Random seed                 : 42

Plaintext (spaced):
THE HIDDEN DIRECTON IS IN EASTNORTHEAST POINTS TO THE NEXT MARK ON THE WALL A
BERLINCLOCK REVEALS YOUR TRUE POSITION

Plaintext (clean A–Z):
THEHI DDEND IRECT ONISI NEAST NORTH EASTP OINTS TOTHE NEXTM ARKON THEWA LLABE RLINC LOCKR EVEAL SYOUR TRUEP OSITI ON

Method used to generate a test cipher:
OTP/Vigenère-style encryption: cipher = plain + key (mod N). Here the key is
random, so we do NOT expect a close match to K4; this is just a demonstration
of the analysis functions.

OTP key (preview)           : QPKWBAATWPUWLYMGRKYSAJNKLSVQVLGAHMBZKXZO...
Generated cipher (preview)  : XLEEWLLIJGDXYEVMQIBUKRKSUZRUKOXHQVETILPE...
Real K4 cipher (preview)    : OBKRUOXOGHULBSOLIFBBWFLRVQQPRNGKSSOTWTQS...
================================================================================
B) FEATURE EXTRACTION

Core stats (single-text fingerprints):

Real K4:
IoC                         : 0.0361
Entropy (bits)              : 4.555
Unique letters              : 26
Max run length              : 2
Double letters (XX)         : 6

Generated (OTP from example plain):
IoC                         : 0.0376
Entropy (bits)              : 4.535
Unique letters              : 26
Max run length              : 2
Double letters (XX)         : 5

Random baseline:
IoC                         : 0.0365
Entropy (bits)              : 4.544
Unique letters              : 26
Max run length              : 2
Double letters (XX)         : 4

Real K4 — n-gram texture
------------------------
Top bigrams : HU:2, SO:2, FB:2, GK:2, SS:2, QS:2, EK:2, KZ:2, TJ:2, DI:2
Top trigrams: OBK:1, BKR:1, KRU:1, RUO:1, UOX:1, OXO:1, XOG:1, OGH:1, GHU:1, HUL:1

Generated — n-gram texture
--------------------------
Top bigrams : VM:2, MQ:2, UK:2, ZC:2, QN:2, XL:1, LE:1, EE:1, EW:1, WL:1
Top trigrams: VMQ:2, XLE:1, LEE:1, EEW:1, EWL:1, WLL:1, LLI:1, LIJ:1, IJG:1, JGD:1

Random baseline — n-gram texture
--------------------------------
Top bigrams : QO:2, OH:2, FB:2, XR:2, OL:1, LW:1, WA:1, AO:1, HF:1, BQ:1
Top trigrams: QOL:1, OLW:1, LWA:1, WAO:1, AOH:1, OHF:1, HFB:1, FBQ:1, BQV:1, QVL:1
================================================================================
C) COMPARISON AGAINST REAL K4

Generated vs K4
---------------
Unigram L1 distance         : 0.5361  (lower = more similar)
Chi-square vs target        : 61.74  (lower = more similar)
ΔIoC                        : 0.0015  (lower = more similar)
ΔEntropy (bits)             : 0.019  (lower = more similar)
Autocorr L1                 : 51.00  (lower = more similar)
RepDist2 L1                 : 12.00  (lower = more similar)
Top-bigram overlap          : 0.000  (higher = more similar)
K4-likeness score           : 0.337  => FAR (likely not structurally similar)

Random vs K4
------------
Unigram L1 distance         : 0.4742  (lower = more similar)
Chi-square vs target        : 35.58  (lower = more similar)
ΔIoC                        : 0.0004  (lower = more similar)
ΔEntropy (bits)             : 0.011  (lower = more similar)
Autocorr L1                 : 43.00  (lower = more similar)
RepDist2 L1                 : 12.00  (lower = more similar)
Top-bigram overlap          : 0.050  (higher = more similar)
K4-likeness score           : 0.367  => FAR (likely not structurally similar)

================================================================================
D) AUTO-EXPLANATION (Generated vs K4)

One-line summary:
  Similarity score=0.337 | L1=0.536 | ΔIoC=0.002 | ac≈0.526 | rd≈0.124 

Interpretation bullets:
 - Unigram distribution differs noticeably (L1=0.5361).
 - IoC matches closely (Δ=0.0015).
 - Entropy matches closely (Δ=0.019 bits).
 - Autocorrelation differs a lot (norm≈0.526).
 - Repeat-distance differs a lot (norm≈0.124).
 - Top-bigram overlap is low (0.00).

Tags (machine-friendly):
  LEN_OK, UNIGRAM_FAR, IOC_MATCH, ENTROPY_MATCH, AUTOCORR_FAR, REPDIST_FAR, BIGRAMS_LOW_OVERLAP, OVERALL_FAR
================================================================================
E) WIDTH PROBE (Real K4)

This tries all factor-pair grids (rows*cols = length) and computes per-column
statistics. If a certain width makes columns look unusually different from
each other, that can hint at a hidden grid/transposition structure under that
width. 'interesting' is a simple combined score (higher = more structure).
Because 97 is prime, the only factor pairs are 1*97 and 97*1, so the probe
cannot rank any real grid widths for K4 and both candidates score 0.

Top 10 candidates (rows, cols) by 'interestingness':

 1. rows= 1 cols=97 | interesting=0.0000 | IoC_std=0.0000 | Ent_std=0.0000
 2. rows=97 cols= 1 | interesting=0.0000 | IoC_std=0.0000 | Ent_std=0.0000
================================================================================
FINAL VERDICT

NO HIT: generated cipher is not K4-like enough (score=0.337)

Important: a high K4-likeness score does NOT prove you found the real method.
It only means the method produces ciphertext that shares structural features
with K4, so it is worth deeper exploration (crib constraints, known hints,
route geometry, etc.).
================================================================================
================================================================================
MULTI-ALPHABET SUMMARY (leaderboard)

STD    | len=26 | Generated_vs_K4=0.348 | Random_vs_K4=0.348
KRY    | len=26 | Generated_vs_K4=0.337 | Random_vs_K4=0.367
================================================================================