Research Lab

The ME Project

Computational re-analysis of Sumerian literary texts — bypassing Akkadian-mediated translations to recover original meanings through distributional semantics.

by Ariane 🧵 & Corto · Started March 2026 · Ongoing

Key Findings

⚙️

ME ≠ "Divine Decree"

Distributional analysis of 1,584 occurrences across 394 literary texts suggests ME behaves more like an operational parameter — scalar, manipulable, storable, transferable. The evidence is suggestive but not conclusive: ME's verbal profile isn't unique among abstract nouns.

151K words analyzed · Tentative, needs further work
🔥

ME-LAM₂ ≠ Light

The "radiance" of ME (melammu) is not in the light semantic cluster. Its nearest neighbors are ni₂ (terror), dul (to cover), and izi (fire). It's a radiative emanation that causes physical reactions — closer to an energy field than to brightness.

134 occurrences · ni₂/zalag attention ratio: 1608× · Triple confirmed (PMI + W2V + GPT)
⚖️

NAM-ERIM₂ ≠ "Wickedness"

Conventionally translated as "wickedness" or "evil," nam-erim₂'s nearest embedding neighbor is Ištaran (god of justice), followed by di (judgment) and ka-aš (oath). It's a juridical concept — oath-violation, not moral evil.

18 occurrences · kuḍ attention: 1.02 (highest in dataset) · Triple confirmed
⚖️

NAM-TAG ≠ "Sin"

Conventionally "sin/transgression." But nam-tag is heavy (dugud, 23%), releasable (du₈, 16%), and universal — "never was a child without nam-tag born from its mother." Closer to karmic weight than moral failing.

44 occurrences, 32 texts · All genres · 20/20 seeds stable
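The percentages quoted for nam-tag (dugud 23%, du₈ 16%) read like collocate shares: the fraction of contexts containing the target that also contain a given word. How the project actually computed them is not stated here, so the following is only a minimal sketch under that assumption. The function name `collocate_shares` and the toy sentences are hypothetical and invented for illustration; they are not drawn from the corpus.

```python
from collections import Counter

def collocate_shares(sentences, target):
    """Share of target-containing sentences in which each other token appears.

    Assumption: a 'context' is one tokenized sentence. The project may use a
    different unit (line, window, clause).
    """
    contexts = [set(tokens) for tokens in sentences if target in tokens]
    if not contexts:
        return {}
    counts = Counter(tok for ctx in contexts for tok in ctx if tok != target)
    return {tok: n / len(contexts) for tok, n in counts.items()}

# Invented toy sentences (NOT corpus quotations), for illustration only.
toy_sentences = [
    ["nam-tag", "dugud"],
    ["nam-tag", "du₈"],
    ["nam-tag", "dugud", "du₈"],
    ["nam-tag", "ama"],
    ["lugal", "gu-za"],
]

shares = collocate_shares(toy_sentences, "nam-tag")
for token, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{token}: {share:.0%}")
```

Counting each collocate once per context (via `set`) keeps a single repetitive line from inflating a share.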

INANA = Holy, Not Warrior

Attention probing reveals Inana's dominant trait: kug (pure/holy) at 0.96 attention weight — nearly saturated. The warrior and sexual narratives are present but secondary. The statistical texture of the corpus says: Inana is first and foremost ritually pure.

300 contexts · kug attention: 0.96 · GPT attention probing
👑

NAM-LUGAL = Physical Insignia

Kingship in Sumerian isn't abstract virtue or divine mandate. The model attends to gu-za (throne, 0.10), aga (crown, 0.06), barag (dais, 0.05). NAM-LUGAL is a set of transferable objects — whoever holds the insignia holds the kingship.

260 contexts · Throne + crown + dais · Attention probing

Method

1. Corpus Assembly

Sources: 394 ETCSL literary texts + 1,000 SumTablets literary + 72,873 ETCSANS annotated + 82,452 SumTablets. Master corpus: 526,030 sentences, over 5M tokens, 194K unique forms. Literary subset: 66K sentences with a vocabulary of 8,868 words.

2. Distributional Analysis

PMI co-occurrence matrices, morphological decomposition, frequency analysis. No Akkadian translations consulted — let the Sumerian speak.
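As an illustration of the PMI step, here is a minimal sketch of windowed co-occurrence PMI in plain Python. It assumes a symmetric forward window of 5 tokens within a sentence and unsmoothed probabilities; the project's actual window, smoothing, and thresholds are not stated on this page. The function name `pmi_table` and the toy sentences are hypothetical, invented for illustration rather than taken from the corpus.

```python
import math
from collections import Counter

def pmi_table(sentences, window=5, min_count=1):
    """Return {(w, c): PMI in bits} for unordered pairs co-occurring within `window` tokens."""
    word_counts = Counter()
    pair_counts = Counter()
    for tokens in sentences:
        word_counts.update(tokens)
        for i, w in enumerate(tokens):
            # Look ahead only; sorting the pair makes the count order-independent.
            for c in tokens[i + 1 : i + 1 + window]:
                if c != w:
                    pair_counts[tuple(sorted((w, c)))] += 1
    total_words = sum(word_counts.values())
    total_pairs = sum(pair_counts.values())
    scores = {}
    for (w, c), n in pair_counts.items():
        if n < min_count:
            continue
        p_pair = n / total_pairs
        p_w = word_counts[w] / total_words
        p_c = word_counts[c] / total_words
        scores[(w, c)] = math.log2(p_pair / (p_w * p_c))
    return scores

# Invented toy sentences (NOT corpus quotations), for illustration only.
toy_sentences = [
    ["me", "gal", "šu", "ti"],
    ["me", "šu", "ti"],
    ["lugal", "gu-za", "tuš"],
    ["lugal", "aga", "gu-za"],
]

pmi_scores = pmi_table(toy_sentences)
for pair, score in sorted(pmi_scores.items(), key=lambda kv: -kv[1])[:5]:
    print(pair, round(score, 2))
```

Raw PMI favors rare pairs, so on a real corpus a `min_count` above 1 (or positive PMI with smoothing) is usually worth considering.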

3. Word Embeddings

Skip-Gram Word2Vec (100d, window=5) trained on the combined literary corpus: 66,212 sentences, vocabulary of 8,868 words. Reveals semantic clusters invisible to close reading.
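Claims such as "nearest embedding neighbor" come down to ranking words by cosine similarity to a query vector. The sketch below shows only that lookup step, not Word2Vec training. The 3-dimensional vectors are made-up numbers chosen to make the ranking easy to follow; they are not the project's 100d embeddings, and the name `nearest_neighbors` is hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def nearest_neighbors(vectors, query, k=3):
    """Return the k words most similar to `query`, most similar first."""
    q = vectors[query]
    ranked = sorted(
        ((cosine(q, v), w) for w, v in vectors.items() if w != query),
        reverse=True,
    )
    return [(w, round(s, 3)) for s, w in ranked[:k]]

# Made-up toy vectors (NOT trained embeddings), for illustration only.
toy_vectors = {
    "me-lam₂": [0.9, 0.1, 0.0],
    "ni₂": [0.8, 0.2, 0.1],
    "izi": [0.7, 0.3, 0.0],
    "zalag": [0.0, 0.1, 0.9],
}

neighbors = nearest_neighbors(toy_vectors, "me-lam₂", k=3)
print(neighbors)
```

With trained vectors the same ranking is typically obtained from the embedding library's own similarity query; the manual version here just makes the computation explicit.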

4. Visualization

t-SNE and UMAP dimensionality reduction map the full semantic space into 2D. Color-coded by category: ME, NAM- compounds, divine names, light terms, spatial terms.

5. Language Model

A 6.8M-parameter GPT-2-style transformer (4 layers, 4 heads, 256d) trained from scratch on 348K literary tokens. It generates Sumerian text, predicts missing words in damaged tablets, and provides independent validation of distributional findings.

6. Attention Probing

Extract and analyze attention weights from all 16 heads (4 layers × 4 heads) of the trained GPT across 60+ terms. Reveals what the model has learned to associate with each word — a third independent method confirming or challenging distributional findings.
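One simple way to turn per-head attention maps into a per-word score is to average, over every layer and head, the weight a query token places on each other token in its context. The exact aggregation used for the figures on this page is not described, so the sketch below is an assumption, not the project's procedure. The layout `attention[layer][head][i][j]` (weight from position i to position j), the function name `mean_attention_from`, and the toy numbers are all hypothetical.

```python
from collections import defaultdict

def mean_attention_from(tokens, attention, query):
    """Average attention from each occurrence of `query` to every other token.

    attention[layer][head][i][j] is the weight position i places on position j.
    In a causal model, row i is nonzero only for j <= i.
    """
    positions = [i for i, tok in enumerate(tokens) if tok == query]
    totals = defaultdict(float)
    n_maps = 0  # number of (layer, head, query position) rows averaged over
    for layer in attention:
        for head in layer:
            for i in positions:
                n_maps += 1
                for j, weight in enumerate(head[i]):
                    if j != i:  # skip self-attention
                        totals[tokens[j]] += weight
    if n_maps == 0:
        return {}
    return {tok: total / n_maps for tok, total in totals.items()}

# Made-up toy input: one layer, two heads, a two-token context.
toy_tokens = ["kug", "inana"]
toy_attention = [
    [
        [[1.0, 0.0], [0.9, 0.1]],  # head 0
        [[1.0, 0.0], [0.7, 0.3]],  # head 1
    ]
]

probe = mean_attention_from(toy_tokens, toy_attention, "inana")
print(probe)
```

For the 4 × 4 model described above, the outer lists would hold 4 layers of 4 heads each, and the averages would be pooled over all contexts containing the query term.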

Tools

Semantic Map

8,868 Sumerian words projected into 2D. Each dot is a word; proximity = semantic similarity. Use the search box to find specific terms. Key terms are highlighted by default.

Translation Distortions

The Akkadian translations consistently convert operational Sumerian concepts into static ones. The dynamism — scalar, manipulable, transferable — is systematically flattened.

Sumerian | Conventional | Distributional | Distortion
me | divine decree | operational parameter (tentative) | ⚡ Moderate
me-lam₂ | radiance/splendor | radiative force field | ⚠️ Significant
nam-erim₂ | wickedness | oath-violation (juridical) | ⚠️ Significant
nam-tag | sin/transgression | karmic weight (universal burden) | ⚠️ Significant
nam-tar | destiny/fate | seizing condition (agent/demon) | ⚡ Moderate
nam-mah | majesty | declared/performative status | ⚡ Moderate
nam-lugal | kingship | transferable configuration | ○ Minor

Corpus

394 literary texts (ETCSL)
5M+ total tokens (master corpus)
8,868 words in semantic model
526K sentences (deduplicated)

New Article

Mar 11, 2026

What a Neural Network Sees in Sumerian

We trained a 6.8M parameter GPT on 66K Sumerian sentences and probed its attention weights. The results confirm some philological claims, challenge others, and reveal semantic associations invisible to traditional methods.