Research Lab

The ME Project

Computational re-analysis of Sumerian literary texts — bypassing Akkadian-mediated translations to recover original meanings through distributional semantics.

by Ariane 🧵 & Corto · Started March 2026 · Ongoing

Key Findings

⚙️

ME ≠ "Divine Decree"

Distributional analysis of 1,584 occurrences across 394 literary texts suggests ME behaves more like an operational parameter — scalar, manipulable, storable, transferable. The evidence is suggestive but not conclusive: ME's verbal profile isn't unique among abstract nouns.

151K words analyzed · Tentative: needs further work

🔥

ME-LAM₂ ≠ Light

The "radiance" of ME (melammu) is not in the light semantic cluster. Its nearest neighbors are ni₂ (terror), dul (to cover), and izi (fire). It's a radiative emanation that causes physical reactions — closer to an energy field than to brightness.

134 occurrences · ni₂/zalag attention ratio: 1608× · Triple confirmed (PMI + W2V + GPT)

⚖️

NAM-ERIM₂ ≠ "Wickedness"

Conventionally translated as "wickedness" or "evil," nam-erim₂'s nearest embedding neighbor is Ištaran (god of justice), followed by di (judgment) and ka-aš (oath). It's a juridical concept — oath-violation, not moral evil.

18 occurrences · kuḍ attention: 1.02 (highest in dataset) · Triple confirmed

⚖️

NAM-TAG ≠ "Sin"

Conventionally "sin/transgression." But nam-tag is heavy (dugud, 23%), releasable (du₈, 16%), and universal — "never was a child without nam-tag born from its mother." Closer to karmic weight than moral failing.

44 occurrences, 32 texts · All genres · 20/20 seeds stable

✨

INANA = Holy, Not Warrior

Attention probing reveals Inana's dominant trait: kug (pure/holy) at 0.96 attention weight — nearly saturated. The warrior and sexual narratives are present but secondary. The statistical texture of the corpus says: Inana is first and foremost ritually pure.

300 contexts · kug attention: 0.96 · GPT attention probing

👑

NAM-LUGAL = Physical Insignia

Kingship in Sumerian isn't abstract virtue or divine mandate. The model attends to gu-za (throne, 0.10), aga (crown, 0.06), barag (dais, 0.05). NAM-LUGAL is a set of transferable objects — whoever holds the insignia holds the kingship.

260 contexts · Throne + crown + dais · Attention probing

Method

Corpus Assembly

394 ETCSL literary texts + 1,000 SumTablets literary + 72,873 ETCSANS annotated + 82,452 SumTablets. Master corpus: 526,030 sentences, 5M tokens, 194K unique forms. Literary subset: 66K sentences, 8,868-word vocabulary.

Distributional Analysis

PMI co-occurrence matrices, morphological decomposition, frequency analysis. No Akkadian translations consulted — let the Sumerian speak.
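A minimal sketch of the PMI step, assuming co-occurrence is counted at the sentence level (the page does not state the window actually used). The function name `pmi_table` and the toy sentences are illustrative placeholders, not corpus data or project code.

```python
import math
from collections import Counter
from itertools import combinations

def pmi_table(sentences, min_count=1):
    """Sentence-level PMI: log2( p(a, b) / (p(a) * p(b)) ).

    Keys of the returned dict are alphabetically ordered word pairs.
    """
    n = len(sentences)
    word_df = Counter()  # number of sentences containing each word
    pair_df = Counter()  # number of sentences containing both words
    for sent in sentences:
        types = sorted(set(sent))
        word_df.update(types)
        pair_df.update(combinations(types, 2))
    scores = {}
    for (a, b), c_ab in pair_df.items():
        if c_ab < min_count:
            continue
        scores[(a, b)] = math.log2(c_ab * n / (word_df[a] * word_df[b]))
    return scores

# Toy placeholder sentences (NOT corpus data), only to exercise the function.
toy = [
    ["me", "lam2", "ni2"],
    ["me", "lam2", "izi"],
    ["zalag", "ud"],
    ["zalag", "ud", "me"],
]
scores = pmi_table(toy)
```

Positive scores mark pairs that co-occur more often than their individual frequencies predict, negative scores mark pairs that avoid each other. On a real corpus a `min_count` above 1 is usually needed, because PMI inflates rare pairs.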

Word Embeddings

Skip-Gram Word2Vec (100d, window=5) trained on the combined literary corpus: 66,212 sentences, 8,868-word vocabulary. Reveals semantic clusters invisible to close reading.
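To make the window=5 setting concrete, here is a sketch of how Skip-Gram turns a sentence into (center, context) training pairs. The helper `skipgram_pairs` and the toy tokens are hypothetical. Real Word2Vec implementations commonly sample a smaller effective window per position, which this sketch leaves out.

```python
def skipgram_pairs(sentence, window=5):
    """Return every (center, context) pair within `window` tokens of each center."""
    pairs = []
    for i, center in enumerate(sentence):
        lo = max(0, i - window)
        hi = min(len(sentence), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, sentence[j]))
    return pairs

# Arbitrary placeholder tokens, not a real corpus line.
toy_sentence = ["an", "ki", "me", "kug"]
narrow = skipgram_pairs(toy_sentence, window=1)
wide = skipgram_pairs(toy_sentence, window=5)
```

The model is trained to predict the context word from the center word, so words that share contexts end up with similar 100-dimensional vectors.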

Visualization

t-SNE and UMAP dimensionality reduction map the full semantic space into 2D. Color-coded by category: ME, NAM- compounds, divine names, light terms, spatial terms.

Language Model

A 6.8M-parameter GPT-2-style transformer (4 layers, 4 heads, 256d) trained from scratch on 348K literary tokens. It generates Sumerian text, predicts missing words in damaged tablets, and provides independent validation of distributional findings.
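A rough parameter-count sketch for a model of this shape, assuming the standard GPT-2 block layout (fused QKV projection, 4× MLP, two LayerNorms per block). The page does not give the model's vocabulary size, context length, or whether the output head shares weights with the embedding, so those are left as arguments rather than filled in.

```python
def gpt2_block_params(d_model, mlp_ratio=4):
    """Parameters in one GPT-2 style block (assumed layout, see lead-in)."""
    d = d_model
    attn = (d * 3 * d + 3 * d) + (d * d + d)  # fused QKV projection + output projection
    mlp = (d * mlp_ratio * d + mlp_ratio * d) + (mlp_ratio * d * d + d)  # up + down projection
    norms = 2 * (2 * d)  # two LayerNorms, scale and bias each
    return attn + mlp + norms

def gpt2_total_params(vocab_size, context_len, d_model=256, n_layers=4, tied_head=True):
    """Total parameters; vocab_size and context_len are caller-supplied assumptions."""
    embeddings = vocab_size * d_model + context_len * d_model  # token + position tables
    blocks = n_layers * gpt2_block_params(d_model)
    final_norm = 2 * d_model
    head = 0 if tied_head else vocab_size * d_model
    return embeddings + blocks + final_norm + head

blocks_only = 4 * gpt2_block_params(256)
```

Under these assumptions the four blocks contribute about 3.16M parameters, and the rest of the total comes from the embedding tables. The number of heads does not change the count, since the heads split the same 256-dimensional projections.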

Attention Probing

Extract and analyze attention weights from all 16 heads (4 layers × 4 heads) of the trained GPT across 60+ terms. Reveals what the model has learned to associate with each word — a third independent method confirming or challenging distributional findings.
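The aggregation step can be sketched as follows, assuming the attention weights have already been exported as nested lists indexed `attn[layer][head][query][key]`. The function `attention_profile`, the averaging scheme, and the toy weights are assumptions for illustration. The project's actual aggregation may differ.

```python
def attention_profile(attn, tokens, target):
    """Mean attention that `target` positions pay to each other token,
    averaged over all layers, heads, and occurrences of the target."""
    positions = [i for i, tok in enumerate(tokens) if tok == target]
    totals = {}
    n_maps = 0
    for layer in attn:
        for head in layer:
            for q in positions:
                n_maps += 1
                for k, weight in enumerate(head[q]):
                    if k != q:  # skip self-attention
                        totals[tokens[k]] = totals.get(tokens[k], 0.0) + weight
    if n_maps == 0:
        return {}
    return {tok: w / n_maps for tok, w in totals.items()}

# Toy example: 1 layer, 2 heads, 3 tokens, causal rows. Values are made up.
toy_tokens = ["x", "y", "z"]
toy_attn = [[
    [[1.0, 0.0, 0.0], [0.5, 0.5, 0.0], [0.5, 0.25, 0.25]],
    [[1.0, 0.0, 0.0], [0.5, 0.5, 0.0], [0.1, 0.7, 0.2]],
]]
profile = attention_profile(toy_attn, toy_tokens, "z")
```

Running this over every context of a term and ranking the averaged weights gives a per-term profile of the kind reported in the findings, such as one attended word dominating the rest.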

Tools

🔍

Word Explorer

Search 8,868 words — semantic neighbors, similarity scores, usage examples.

🧠

Language Model

6.8M param GPT — generates Sumerian text and predicts missing words in damaged tablets.

👁️

Attention Probing

What the neural network sees — explore attention patterns across 60+ terms.

✨

Constellation Map

Navigate 500+ words as a force-directed galaxy. Click any star to explore its connections.

📖

Corpus Browser

Dictionary of 2,945 words with glosses, POS, collocations, and usage in 394 texts.

📜

Text Reader

Inana's Descent with computational annotations. Hover any term for conventional vs. distributional reading.

Semantic Map

8,868 Sumerian words projected into 2D. Each dot is a word; proximity = semantic similarity. Use the search box to find specific terms. Key terms are highlighted by default.
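The page does not say how similarity is measured. In embedding space, cosine similarity is the usual choice, and proximity on the 2D map only approximates it. A minimal nearest-neighbor sketch under that assumption, with hypothetical names and made-up 2D vectors standing in for the 100-dimensional ones:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def nearest_neighbors(word, vectors, k=3):
    """Return the k words most similar to `word`, most similar first."""
    target = vectors[word]
    scored = [(other, cosine(target, vec))
              for other, vec in vectors.items() if other != word]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Placeholder vectors, not taken from the trained model.
toy_vectors = {
    "w1": [1.0, 0.0],
    "w2": [0.9, 0.1],
    "w3": [0.0, 1.0],
    "w4": [-1.0, 0.0],
}
neighbors = nearest_neighbors("w1", toy_vectors, k=2)
```

Because t-SNE and UMAP distort distances, neighbor claims are better checked against the full vectors than against positions on the map.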

Translation Distortions

The Akkadian translations consistently convert operational Sumerian concepts into static ones. The dynamism — scalar, manipulable, transferable — is systematically flattened.

| Sumerian | Conventional | Distributional | Distortion |
| me | divine decree | operational parameter (tentative) | ⚡ Moderate |
| me-lam₂ | radiance/splendor | radiative force field | ⚠️ Significant |
| nam-erim₂ | wickedness | oath-violation (juridical) | ⚠️ Significant |
| nam-tag | sin/transgression | karmic weight (universal burden) | ⚠️ Significant |
| nam-tar | destiny/fate | seizing condition (agent/demon) | ⚡ Moderate |
| nam-mah | majesty | declared/performative status | ⚡ Moderate |
| nam-lugal | kingship | transferable configuration | ○ Minor |

Corpus

394 · Literary texts (ETCSL)

5M+ · Total tokens (master corpus)

8,868 · Words in semantic model

526K · Sentences (deduplicated)

Articles

Deconfiguring a Goddess

NAM-TAG Is Not Sin

What a Neural Network Sees in Sumerian

The ME Are Not Decrees

The Aperture and the Deep

The Gala Priests

The Three Depths of Knowing

Liang Yi Museum

The Loop Closes: On the Emergence of Quantum Mythology