Example report output from scripts/example_analyze_run.py
================================================================================
KRYPTOS K4 — STRUCTURAL COMPARISON REPORT | ALPHABET = STD (len=26)
Goal: create a generated ciphertext (using a candidate method) and compare its
statistical/structural fingerprint to the real Kryptos K4 ciphertext.
This helps us triage methods before doing deeper searches.
================================================================================

A) INPUTS
Alphabet name         : STD
Alphabet              : ABCDEFGHIJKLMNOPQRSTUVWXYZ
Plaintext source      : K4_EXAMPLE_PLAINS[0]
Plaintext length      : 97
K4 ciphertext length  : 97
Random seed           : 42

Plaintext (spaced):
  THE HIDDEN DIRECTON IS IN EASTNORTHEAST POINTS TO THE NEXT MARK ON THE WALL
  A BERLINCLOCK REVEALS YOUR TRUE POSITION

Plaintext (clean A–Z):
  THEHI DDEND IRECT ONISI NEAST NORTH EASTP OINTS TOTHE NEXTM ARKON THEWA
  LLABE RLINC LOCKR EVEAL SYOUR TRUEP OSITI ON

Method used to generate a test cipher:
  OTP/Vigenère-style encryption: cipher = plain + key (mod N).
  Here the key is random, so we do NOT expect a close match to K4;
  this is just a demonstration of the analysis functions.

OTP key (preview)          : UDAXIHHEXDVXRCSNBACGHQTARGWUWRNHOSIZAYZF...
Generated cipher (preview) : NKEEQKKIKGDOVELBOIUOUUTSKTKLPYRHGLXNILSX...
Real K4 cipher (preview)   : OBKRUOXOGHULBSOLIFBBWFLRVQQPRNGKSSOTWTQS...

================================================================================
WHAT THESE METRICS MEAN (beginner-friendly)

This script does NOT try to read plaintext.
It compares *structure* of a generated ciphertext to the real Kryptos K4
ciphertext. Think of it like comparing fingerprints: even if you don't know
the person, you can tell whether two prints look similar.

Key metrics:
- IoC (Index of Coincidence): measures how uneven letter frequencies are.
  Random text has a certain typical IoC; English-like text tends to differ.
  If two ciphers have similar IoC, they may have similar letter distribution
  behavior.
- Entropy (bits): measures how 'spread out' the letter frequencies are.
  Higher entropy ~ closer to uniform random distribution; lower entropy ~ more
  biased.
- Unigram L1 distance: compares letter frequency profiles between two texts.
  Lower is better (more similar).
- Autocorrelation curve: for each shift (1..N), counts how many letters match
  when you shift the text. Some transpositions / periodic processes create
  bumps at certain shifts.
- Repeat-distance histogram (bigrams): looks at how often repeated 2-letter
  chunks occur at certain distances. This can reveal hidden grid/route
  patterns.
- Top-bigram overlap: compares the set of most common bigrams (2-letter chunks)
  in both texts. Higher overlap suggests similar local texture.
- K4-likeness score: an aggregate 0..1 score. Higher means 'more similar'
  across several signals. It is NOT proof of correctness—just a triage tool.

================================================================================
B) FEATURE EXTRACTION

Core stats (single-text fingerprints):

Real K4:
  IoC                 : 0.0361
  Entropy (bits)      : 4.555
  Unique letters      : 26
  Max run length      : 2
  Double letters (XX) : 6

Generated (OTP from example plain):
  IoC                 : 0.0380
  Entropy (bits)      : 4.524
  Unique letters      : 26
  Max run length      : 2
  Double letters (XX) : 3

Random baseline:
  IoC                 : 0.0365
  Entropy (bits)      : 4.544
  Unique letters      : 26
  Max run length      : 2
  Double letters (XX) : 4

Real K4 — n-gram texture
------------------------
Top bigrams : HU:2, SO:2, FB:2, GK:2, SS:2, QS:2, EK:2, KZ:2, TJ:2, DI:2
Top trigrams: OBK:1, BKR:1, KRU:1, RUO:1, UOX:1, OXO:1, XOG:1, OGH:1, GHU:1, HUL:1

Generated — n-gram texture
--------------------------
Top bigrams : KE:2, EL:2, IL:2, OM:2, OC:2, NK:1, EE:1, EQ:1, QK:1, KK:1
Top trigrams: NKE:1, KEE:1, EEQ:1, EQK:1, QKK:1, KKI:1, KIK:1, IKG:1, KGD:1, GDO:1

Random baseline — n-gram texture
--------------------------------
Top bigrams : UF:2, FO:2, MI:2, YB:2, FR:1, RX:1, XH:1, HF:1, OM:1, IU:1
Top trigrams: UFR:1, FRX:1, RXH:1, XHF:1, HFO:1, FOM:1, OMI:1, MIU:1, IUW:1, UWR:1

================================================================================
C) COMPARISON AGAINST REAL K4

Generated vs K4
---------------
Unigram L1 distance  : 0.4124 (lower = more similar)
Chi-square vs target : 33.05 (lower = more similar)
ΔIoC                 : 0.0019 (lower = more similar)
ΔEntropy (bits)      : 0.031 (lower = more similar)
Autocorr L1          : 43.00
RepDist2 L1          : 13.00
Top-bigram overlap   : 0.000 (higher = more similar)
K4-likeness score    : 0.348 => FAR (likely not structurally similar)

Random vs K4
------------
Unigram L1 distance  : 0.5979 (lower = more similar)
Chi-square vs target : 86.72 (lower = more similar)
ΔIoC                 : 0.0004 (lower = more similar)
ΔEntropy (bits)      : 0.011 (lower = more similar)
Autocorr L1          : 43.00
RepDist2 L1          : 12.00
Top-bigram overlap   : 0.000 (higher = more similar)
K4-likeness score    : 0.348 => FAR (likely not structurally similar)

================================================================================
D) AUTO-EXPLANATION (Generated vs K4)

One-line summary:
  Similarity score=0.348 | L1=0.412 | ΔIoC=0.002 | ac≈0.443 | rd≈0.134

Interpretation bullets:
- Unigram distribution differs noticeably (L1=0.4124).
- IoC matches closely (Δ=0.0019).
- Entropy matches closely (Δ=0.031 bits).
- Autocorrelation differs a lot (norm≈0.443).
- Repeat-distance differs a lot (norm≈0.134).
- Top-bigram overlap is low (0.00).

Tags (machine-friendly):
  LEN_OK, UNIGRAM_FAR, IOC_MATCH, ENTROPY_MATCH, AUTOCORR_FAR, REPDIST_FAR,
  BIGRAMS_LOW_OVERLAP, OVERALL_FAR

================================================================================
E) WIDTH PROBE (Real K4)

This tries all factor-pair grids (rows*cols = length) and computes per-column
statistics. If a certain width makes columns look unusually different from
each other, that can hint at a hidden grid/transposition structure under that
width. 'interesting' is a simple combined score (higher = more structure).

Top 10 candidate (rows, cols) by 'interestingness':
 1. rows= 1 cols=97 | interesting=0.0000 | IoC_std=0.0000 | Ent_std=0.0000
 2. rows=97 cols= 1 | interesting=0.0000 | IoC_std=0.0000 | Ent_std=0.0000

================================================================================
FINAL VERDICT

NO HIT: generated cipher is not K4-like enough (score=0.348)

Important: a high K4-likeness score does NOT prove you found the real method.
It only means the method produces ciphertext that shares structural features
with K4, so it is worth deeper exploration (crib constraints, known hints,
route geometry, etc.).
================================================================================

================================================================================
KRYPTOS K4 — STRUCTURAL COMPARISON REPORT | ALPHABET = KRY (len=26)
Goal: create a generated ciphertext (using a candidate method) and compare its
statistical/structural fingerprint to the real Kryptos K4 ciphertext.
This helps us triage methods before doing deeper searches.
================================================================================

A) INPUTS
Alphabet name         : KRY
Alphabet              : KRYPTOSABCDEFGHIJLMNQUVWXZ
Plaintext source      : K4_EXAMPLE_PLAINS[0]
Plaintext length      : 97
K4 ciphertext length  : 97
Random seed           : 42

Plaintext (spaced):
  THE HIDDEN DIRECTON IS IN EASTNORTHEAST POINTS TO THE NEXT MARK ON THE WALL
  A BERLINCLOCK REVEALS YOUR TRUE POSITION

Plaintext (clean A–Z):
  THEHI DDEND IRECT ONISI NEAST NORTH EASTP OINTS TOTHE NEXTM ARKON THEWA
  LLABE RLINC LOCKR EVEAL SYOUR TRUEP OSITI ON

Method used to generate a test cipher:
  OTP/Vigenère-style encryption: cipher = plain + key (mod N).
  Here the key is random, so we do NOT expect a close match to K4;
  this is just a demonstration of the analysis functions.

OTP key (preview)          : QPKWBAATWPUWLYMGRKYSAJNKLSVQVLGAHMBZKXZO...
Generated cipher (preview) : XLEEWLLIJGDXYEVMQIBUKRKSUZRUKOXHQVETILPE...
Real K4 cipher (preview)   : OBKRUOXOGHULBSOLIFBBWFLRVQQPRNGKSSOTWTQS...
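
Note (illustrative sketch, not part of the script output): the rule
"cipher = plain + key (mod N)" from the INPUTS sections can be sketched as
below, assuming each letter is numbered by its position in the chosen alphabet.
The function name add_encrypt is hypothetical and may not match anything in
scripts/example_analyze_run.py. The two calls reuse the first five letters of
the plaintext and of the key previews shown in the STD and KRY reports.

```python
STD = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
KRY = "KRYPTOSABCDEFGHIJLMNQUVWXZ"


def add_encrypt(plain: str, key: str, alphabet: str) -> str:
    """Return cipher[i] = plain[i] + key[i] (mod N).

    Assumption: letters are numbered by their index in `alphabet`,
    and N is the alphabet length.
    """
    n = len(alphabet)
    index = {ch: i for i, ch in enumerate(alphabet)}
    # zip() stops at the shorter input, so a key preview can be used as-is.
    return "".join(
        alphabet[(index[p] + index[k]) % n] for p, k in zip(plain, key)
    )


# First five plaintext letters with the first five key letters of each report.
print(add_encrypt("THEHI", "UDAXI", STD))  # NKEEQ
print(add_encrypt("THEHI", "QPKWB", KRY))  # XLEEW
```

Both results agree with the first five letters of the corresponding
"Generated cipher (preview)" lines, which suggests the alphabet only changes
how letters are numbered, not the addition rule itself.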
================================================================================
WHAT THESE METRICS MEAN (beginner-friendly)

This script does NOT try to read plaintext.
It compares *structure* of a generated ciphertext to the real Kryptos K4
ciphertext. Think of it like comparing fingerprints: even if you don't know
the person, you can tell whether two prints look similar.

Key metrics:
- IoC (Index of Coincidence): measures how uneven letter frequencies are.
  Random text has a certain typical IoC; English-like text tends to differ.
  If two ciphers have similar IoC, they may have similar letter distribution
  behavior.
- Entropy (bits): measures how 'spread out' the letter frequencies are.
  Higher entropy ~ closer to uniform random distribution; lower entropy ~ more
  biased.
- Unigram L1 distance: compares letter frequency profiles between two texts.
  Lower is better (more similar).
- Autocorrelation curve: for each shift (1..N), counts how many letters match
  when you shift the text. Some transpositions / periodic processes create
  bumps at certain shifts.
- Repeat-distance histogram (bigrams): looks at how often repeated 2-letter
  chunks occur at certain distances. This can reveal hidden grid/route
  patterns.
- Top-bigram overlap: compares the set of most common bigrams (2-letter chunks)
  in both texts. Higher overlap suggests similar local texture.
- K4-likeness score: an aggregate 0..1 score. Higher means 'more similar'
  across several signals. It is NOT proof of correctness—just a triage tool.
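
Note (illustrative sketch, not part of the script output): the metrics listed
above can be computed with textbook definitions such as the ones below. The
function names are hypothetical, and the script's own formulas are not shown
in this report, so treat this as one plausible reading rather than the actual
implementation. The reported unigram L1 values are at least consistent with
the definition used here: for two 97-letter texts every result is a multiple
of 1/97, and 0.4124 is approximately 40/97.

```python
import math
from collections import Counter


def ioc(text: str) -> float:
    """Index of Coincidence: chance that two distinct positions hold the same letter."""
    n = len(text)
    if n < 2:
        return 0.0
    return sum(c * (c - 1) for c in Counter(text).values()) / (n * (n - 1))


def entropy_bits(text: str) -> float:
    """Shannon entropy of the letter distribution, in bits per letter."""
    n = len(text)
    return -sum((c / n) * math.log2(c / n) for c in Counter(text).values())


def unigram_l1(a: str, b: str, alphabet: str) -> float:
    """Sum over the alphabet of |relative frequency in a - relative frequency in b|."""
    ca, cb = Counter(a), Counter(b)
    return sum(abs(ca[ch] / len(a) - cb[ch] / len(b)) for ch in alphabet)


def autocorr_matches(text: str, shift: int) -> int:
    """Number of positions i where text[i] equals text[i + shift]."""
    return sum(1 for i in range(len(text) - shift) if text[i] == text[i + shift])


if __name__ == "__main__":
    # 40-letter previews copied from the STD report; the full texts are not shown there.
    generated = "NKEEQKKIKGDOVELBOIUOUUTSKTKLPYRHGLXNILSX"
    real_k4 = "OBKRUOXOGHULBSOLIFBBWFLRVQQPRNGKSSOTWTQS"
    alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    print("IoC      :", round(ioc(generated), 4), round(ioc(real_k4), 4))
    print("Entropy  :", round(entropy_bits(generated), 3), round(entropy_bits(real_k4), 3))
    print("L1       :", round(unigram_l1(generated, real_k4, alphabet), 4))
    print("Autocorr :", [autocorr_matches(real_k4, s) for s in range(1, 6)])
```

Because the previews are only 40 letters long, the printed numbers will not
match the 97-letter statistics in section B; the point is only to show how
each quantity is formed.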
================================================================================
B) FEATURE EXTRACTION

Core stats (single-text fingerprints):

Real K4:
  IoC                 : 0.0361
  Entropy (bits)      : 4.555
  Unique letters      : 26
  Max run length      : 2
  Double letters (XX) : 6

Generated (OTP from example plain):
  IoC                 : 0.0376
  Entropy (bits)      : 4.535
  Unique letters      : 26
  Max run length      : 2
  Double letters (XX) : 5

Random baseline:
  IoC                 : 0.0365
  Entropy (bits)      : 4.544
  Unique letters      : 26
  Max run length      : 2
  Double letters (XX) : 4

Real K4 — n-gram texture
------------------------
Top bigrams : HU:2, SO:2, FB:2, GK:2, SS:2, QS:2, EK:2, KZ:2, TJ:2, DI:2
Top trigrams: OBK:1, BKR:1, KRU:1, RUO:1, UOX:1, OXO:1, XOG:1, OGH:1, GHU:1, HUL:1

Generated — n-gram texture
--------------------------
Top bigrams : VM:2, MQ:2, UK:2, ZC:2, QN:2, XL:1, LE:1, EE:1, EW:1, WL:1
Top trigrams: VMQ:2, XLE:1, LEE:1, EEW:1, EWL:1, WLL:1, LLI:1, LIJ:1, IJG:1, JGD:1

Random baseline — n-gram texture
--------------------------------
Top bigrams : QO:2, OH:2, FB:2, XR:2, OL:1, LW:1, WA:1, AO:1, HF:1, BQ:1
Top trigrams: QOL:1, OLW:1, LWA:1, WAO:1, AOH:1, OHF:1, HFB:1, FBQ:1, BQV:1, QVL:1

================================================================================
C) COMPARISON AGAINST REAL K4

Generated vs K4
---------------
Unigram L1 distance  : 0.5361 (lower = more similar)
Chi-square vs target : 61.74 (lower = more similar)
ΔIoC                 : 0.0015 (lower = more similar)
ΔEntropy (bits)      : 0.019 (lower = more similar)
Autocorr L1          : 51.00
RepDist2 L1          : 12.00
Top-bigram overlap   : 0.000 (higher = more similar)
K4-likeness score    : 0.337 => FAR (likely not structurally similar)

Random vs K4
------------
Unigram L1 distance  : 0.4742 (lower = more similar)
Chi-square vs target : 35.58 (lower = more similar)
ΔIoC                 : 0.0004 (lower = more similar)
ΔEntropy (bits)      : 0.011 (lower = more similar)
Autocorr L1          : 43.00
RepDist2 L1          : 12.00
Top-bigram overlap   : 0.050 (higher = more similar)
K4-likeness score    : 0.367 => FAR (likely not structurally similar)
================================================================================
D) AUTO-EXPLANATION (Generated vs K4)

One-line summary:
  Similarity score=0.337 | L1=0.536 | ΔIoC=0.002 | ac≈0.526 | rd≈0.124

Interpretation bullets:
- Unigram distribution differs noticeably (L1=0.5361).
- IoC matches closely (Δ=0.0015).
- Entropy matches closely (Δ=0.019 bits).
- Autocorrelation differs a lot (norm≈0.526).
- Repeat-distance differs a lot (norm≈0.124).
- Top-bigram overlap is low (0.00).

Tags (machine-friendly):
  LEN_OK, UNIGRAM_FAR, IOC_MATCH, ENTROPY_MATCH, AUTOCORR_FAR, REPDIST_FAR,
  BIGRAMS_LOW_OVERLAP, OVERALL_FAR

================================================================================
E) WIDTH PROBE (Real K4)

This tries all factor-pair grids (rows*cols = length) and computes per-column
statistics. If a certain width makes columns look unusually different from
each other, that can hint at a hidden grid/transposition structure under that
width. 'interesting' is a simple combined score (higher = more structure).

Top 10 candidate (rows, cols) by 'interestingness':
 1. rows= 1 cols=97 | interesting=0.0000 | IoC_std=0.0000 | Ent_std=0.0000
 2. rows=97 cols= 1 | interesting=0.0000 | IoC_std=0.0000 | Ent_std=0.0000

================================================================================
FINAL VERDICT

NO HIT: generated cipher is not K4-like enough (score=0.337)

Important: a high K4-likeness score does NOT prove you found the real method.
It only means the method produces ciphertext that shares structural features
with K4, so it is worth deeper exploration (crib constraints, known hints,
route geometry, etc.).
================================================================================

================================================================================
MULTI-ALPHABET SUMMARY (leaderboard)
STD | len=26 | Generated_vs_K4=0.348 | Random_vs_K4=0.348
KRY | len=26 | Generated_vs_K4=0.337 | Random_vs_K4=0.367
================================================================================
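
Note (illustrative sketch, not part of the script output): the WIDTH PROBE
sections list only two grids because 97 is prime, so the only factor pairs
are 1x97 and 97x1. The sketch below enumerates factor-pair grids and splits a
text into columns; the helper names factor_pairs and columns are hypothetical,
and the per-column scoring the script applies afterwards is not reproduced
here.

```python
def factor_pairs(n: int) -> list[tuple[int, int]]:
    """All (rows, cols) pairs with rows * cols == n, in order of increasing rows."""
    return [(rows, n // rows) for rows in range(1, n + 1) if n % rows == 0]


def columns(text: str, cols: int) -> list[str]:
    """Read `text` row by row into a grid of width `cols` and return its columns."""
    return [text[c::cols] for c in range(cols)]


print(factor_pairs(97))      # [(1, 97), (97, 1)]
print(factor_pairs(12))      # [(1, 12), (2, 6), (3, 4), (4, 3), (6, 2), (12, 1)]
print(columns("ABCDEF", 3))  # ['AD', 'BE', 'CF']
```

With a single column there is nothing to compare across columns, and with 97
columns every column holds exactly one letter, which is consistent with the
all-zero IoC_std and Ent_std values in the report. A probe like this only
becomes informative for composite lengths, for example after padding or
trimming the text, which would be a separate experiment.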