A Game of Sounds

Once upon a time, I noticed a similarity between the words for to buy in Russian, German, Dutch, and Norwegian: купить (kupît’1), kaufen, kopen, and kjøpe.

While it’s not the most difficult set of sound changes to trace, I realized when I set out to write about it that it would pay to start a bit simpler. So, I decided to pick a bit of an easier case.

Family Terms & Numbers in IE Languages: A Brief Aside

Each of these four languages is Indo-European. As such, in spite of their superficial differences, they share a number of structural and lexical similarities. Words relating to members of the family and to numbers tend to show this most obviously0.

This is, of course, only a tendency, but it holds true for the term I’d like to take a look at today: The word for mother. This is мать (mat’), in Russian; Mutter, in German; moeder, in Dutch; and mor in Norwegian.

Spotting the similarities and accounting for the differences between these words is simple enough. But, I figured it would be a good place to start exploring sound changes before looking at kaufen etc. and ultimately moving into PIE reconstruction, for which a fairly sophisticated intuition for this kind of thing is ideal.

Shared Consonants in Dutch & German

Let’s start with the easy one: The relationship between Mutter and moeder.

These two words share the initial m-, which happens to be an unpalatalized bilabial nasal consonant in all four cases, as well as their other consonants: A final -r, and an alveolar plosive in the center of the word, -d- in moeder and -tt- in Mutter.

In Dutch, this last sound is voiced; in German, it is not (it is an unvoiced alveolar plosive), meaning that there is no, or minimal, vibration of the vocal cords during enunciation.3

The difference between voiced and unvoiced consonants is a fairly minor one from a cross-linguistic standpoint. In any case, we can say that the Dutch and German words share the consonant pattern m-d*-r, where m- is an unpalatalized bilabial nasal consonant; d*- is an alveolar plosive (voiced or unvoiced); and -r denotes a rhotic consonant (sometimes called a tremulant), the specific nature of which depends on the dialect under consideration4.

In this case, we’re even lucky enough to have a similar vocalic pattern. The first vowel (-oe- in Dutch and -u- in German) sounds very similar in both languages, as does the schwa sound in the final syllable of both words.

Taking Stock of Norwegian: Where’s the D?

Norwegian has mor. Again, we note the initial m-, and final rhotic sound, but find that the medial alveolar plosive sound is missing.

Well, it’s . . . Hiding, to be more accurate. It shows up again in the word’s plural forms — mødre, meaning mothers, and mødrene, meaning the mothers5. So, we see the same m-d*-r pattern in the plural stem mødr-.

What happened in the singular? My hypothesis here was that Old Norse had a soft sound with a similar place of articulation as –d*-. If that consonant were followed by an unstressed vowel, it would be reasonable to expect weakening or deletion over time6.

A quick check of the etymology supports this hypothesis: mor derives from Old Norse móðir, with -ð- representing a dental fricative (a th- sound) followed by an unstressed -i-.

The plural of móðir is mǿðr. I suspect the reason that the rhotic sound persists in mødre is because it follows –ð directly, and both sounds have a similar place of articulation; reducing –ð offers little in terms of economy of speech.

To conclude, we can see that Norwegian does indeed preserve m-d*-r. And where it doesn’t, we can at least produce a plausible hypothesis as to why.

Stems Once Again: Uncovering the Sounds of мать

By now, there’s little point lingering on details. Let’s run through this one quickly.

мать has the characteristic initial m-, and a final -t– sound, which is here palatalized. The stem of the word — the form to which case endings are appended — is матер-. The transliteration is mater-, which, obviously, features the characteristic pattern of m-d*-r.

And voilà — strong evidence for a relationship between these words that suggests a common source.
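For the programmatically inclined, here is a minimal sketch of the comparison above. It assumes romanized stems and a deliberately crude sound class for d* (d, t, and the Old Norse ð). The names DENTALS, VOWELS, and has_mdr are my own inventions for illustration, not any standard linguistic tooling.

```python
import re
import unicodedata

# Assumed sound classes for this sketch only: d* covers the voiced and
# unvoiced plosives plus the Old Norse dental fricative.
DENTALS = "dt\u00f0"                      # d, t, ð
VOWELS = "aeiou\u00f8\u00f3\u01ff"        # a e i o u ø ó ǿ

# m, optional vowels, one or more d*-class consonants, optional vowels, r
PATTERN = re.compile(rf"^m[{VOWELS}]*[{DENTALS}]+[{VOWELS}]*r")

def has_mdr(stem: str) -> bool:
    """Return True if the romanized stem begins with the m-d*-r skeleton."""
    normalized = unicodedata.normalize("NFC", stem.lower())
    return PATTERN.match(normalized) is not None

stems = {
    "Dutch": "moeder",
    "German": "Mutter",
    "Norwegian (plural stem)": "mødr",
    "Old Norse": "móðir",
    "Russian (stem, transliterated)": "mater",
    "Norwegian (singular)": "mor",
}

for language, stem in stems.items():
    print(f"{language:32} {stem:8} {has_mdr(stem)}")
```

Every stem here matches except the Norwegian singular mor, which is exactly the form whose medial consonant went missing. A regular expression is far too blunt for real comparative work, but it does make the shared skeleton easy to see.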

This is, by no means, an exhaustive analysis, and I’ve left out a lot for the sake of brevity.

But, again, the whole point of selecting this relatively easy case is in the interest of making this all a little more intuitive.

0. Consider unus, duo, tres, in Latin; odin, dva, tri, in Russian; hen, duo, tria, in Attic Greek; en, to, tre, in Norwegian; ein, zwei, drei, in German; etc.

1. It’s one of those differences that’s slight, until you happen to be lucky enough to be a native speaker of the language or spend enough time holed up in your basement trying and failing to approximate it.

2. The letter m- can also be a syllabic consonant, as in spasm or in the suffix -ism. IPA indicates this with the letter m and an understroke, [m̩]. Alternatively, it can be a palatalized bilabial nasal consonant, as in

3. The other difference is that the -tt- in German is geminated, or doubled in length. In the context of Dutch vs. German, this is minor enough, but gemination in the context of PIE is an interesting and complex subject.

4. There’s quite a bit of variability here. Standard Dutch and some dialects of German use an alveolar tap or alveolar trill (the distinction between Spanish pero, meaning but, and perro, meaning dog), and many dialects of German use a voiced uvular fricative. I’m sure there are quite a few variants I don’t know about, though.

5. More properly, mødrene is the definite plural, and mødre is the indefinite plural. Mødrene can’t typically stand alone, so the mothers has to be de mødrene.

6. I’m sure there’s a name for this process, but I haven’t found it yet. I’ll write about this sort of collapse later, and update this when I do.

Words, Words, Words

I heard recently that the French word for shirt (chemise) is etymologically related to the Arabic word with the same meaning, قميص (qamiṣ).

Now, word origins probably aren’t at the top of most people’s lists of exciting knowledge. But, they’re close to the top of mine, and exotic ones like this trump just about everything else.

But then, I thought for a second. Chemise in French; camisa in Spanish; camicia in Italian. Probably . . . Not an Arabic thing.

The Latin Source of Chemise

The French chemise has cognate terms in Spanish (camisa) and Italian (camicia), and just about every source I can find agrees that all (probably) derive from Late Latin camisia, meaning shirt (nightgown, in Classical usage).

The derivation of Spanish camisa from camisia speaks for itself: The only phonetic difference to account for is the transformation of the final diphthong –ia to –a, which hardly requires justification. Say it five times fast and see if you don’t drop the i yourself.

Italian camicia is almost identical orthographically, but with soft -c in the final syllable (like cherry) rather than the universally hard –c of Classical Latin (like castle). As in many languages, Italian has soft –c before front vowels (e and i, most significantly) while maintaining hard –c before back vowels, due — I believe — to details of place of articulation.

Of this etymological triptych, the French chemise is the most distant from the Latin camisia. In the interest of brevity, I won’t linger on the transformations that have taken place to produce the former, but do note that similar patterns exist in other French words of established Latin origin: Château from Castellum, for instance1.

I think it hard to argue that the Latin isn’t the immediate source of these three words. But, as always, I’m open to rebuttals.

Arab(ic) Roads Into Europe(an Languages)

My initial hypothesis in favor of the Arabic source of chemise was that Andalusi Arabic had somehow corrupted (presumably classical) qamiṣ, possibly by feminizing the word to produce qamiṣah (قميصة), for whatever reason.

Given the drastic differences we see between modern Arabic dialects these days, this isn’t that far-fetched, and it would explain the near-identical form of camisa that occurs in Spanish.

But, I concede that it is a bit further out than accepting camisia as the ultimate source of chemise. Still, it’s good to get as close to proving things as possible, so I tried a historical approach to eliminating Arabic as a candidate for the word’s origin.

Interestingly, both Old Portuguese and Old French had the same forms of the word that the modern incarnations of those languages maintain — camisa in the former, as in Spanish, and chemise in the latter.

Unfortunately, neither language was recorded until well after the Muslim conquest in Spain, so even if I had been able to find the reputable sources for first usage that I’d at first set out for, I wouldn’t have been able to rule out Arabic as a potential source on the basis of date of first appearance alone.

Also unfortunate is the fact that the forms of Vulgar Latin, being Vulgar, aren’t exactly well-recorded, what with the Dark Ages being so . . . Well, dark2. Sadly enough, the historical record doesn’t do much good here.

But geography helps a bit. Chemise/camisa/camicia are clearly related terms. If Arabic were the source of camisa, French chemise and Italian camicia would probably derive from the Arabic only indirectly, through Spanish.

Given the sheer distance of French- and Italian-speaking communities from most of Muslim Spain, linguistic factors notwithstanding, such diffusion seems unlikely enough to justify rejecting an Arabic source in favor of a derivation from Latin.

This is the view espoused by a number of well-established dictionaries. The Collins English dictionary suggests a Celtic root. Wiktionary suggests Transalpine Gaulish, with an ultimately Germanic origin. The Online Etymology Dictionary and American Heritage entries agree, as do I. Neither Random House nor the Oxford Dictionaries site goes any further than Latin.

Case closed?


Stepping Back Further: Ugaritic Roots

A distinguishing feature of the Semitic languages is their root system. For brevity’s sake, I’ll mention only the essentials.

Semitic languages, such as Arabic, Hebrew, and Ugaritic3, derive many of their words by superimposing vowel patterns on a set of consonants, whose order is fixed.

This is easier to understand with examples. In Arabic, the root consonants mlk are associated with ownership, rule (in the sense of reign), etc. In the Classical language4, malik- means king; muluk- means kings; malaki means royal; milk- means property; mulk- means kingship; mamlakat- means kingdom; and the list goes on.
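To make the mechanics concrete, here is a small sketch of root-and-pattern derivation. The template notation is my own assumption for illustration: the digits 1, 2, and 3 stand for the first, second, and third root consonants, and everything else is copied through literally. It ignores vowel length and every other subtlety of the real morphology.

```python
def apply_pattern(root: str, pattern: str) -> str:
    """Fill the digits 1-3 in `pattern` with the consonants of `root`."""
    if len(root) != 3:
        raise ValueError("this sketch only handles three-consonant roots")
    return "".join(root[int(ch) - 1] if ch in "123" else ch for ch in pattern)

# Hypothetical templates covering the stems listed above, with their glosses.
PATTERNS = {
    "1a2i3": "king",
    "1u2u3": "kings",
    "1a2a3i": "royal",
    "1i23": "property",
    "1u23": "kingship",
    "ma12a3at": "kingdom",
}

for pattern, gloss in PATTERNS.items():
    print(f"{apply_pattern('mlk', pattern):10} {gloss}")
```

The consonants never change order; only the material poured around them does. Note that the leading m in the last template is a literal prefix rather than a root consonant.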

A crucial point is that many Semitic languages have many of these roots in common. In Hebrew, mlk is also associated with ownership and rule; melek is the word for king. In both Classical Arabic and Hebrew, the root rbb is associated with lordship; in both languages5, rabbi means my master.

The Arabic qamiṣ has, as its root, qmṣ, associated with the concept of covering, or enveloping. In Hebrew, one finds the closely related root q-m-ts (קמץ). This is associated with a clenched hand, or something enclosed within it.

I’ve seen it suggested that Ugaritic has a related root, qmṣ, meaning ‘garment’, that supplied Late Greek with the word καμισίων (kamisíôn), with the same meaning. Let’s assume that’s the case, for the sake of argument.

Ugaritic was spoken in modern-day Syria, where both the Greeks and Romans had a prominent presence6. This is significant, as it provides us with several centuries of time in which Ugaritic could have influenced Greek-speaking populations. I don’t know how likely it is that the root supplied the Hellenes with kamisíôn, but, I suppose it’s possible.

I found the word in this collection of fragments of Clemens of Alexandria’s work (alternate link). His traditional dates are 150-215 C.E., well before the Muslim conquest of 711.

I haven’t yet pinned down a first attestation for camisia, but — assuming this roughly 2nd century occurrence of kamisíôn is near to its first occurrence — it would be compelling if it followed these dates.

If so, French very possibly could have gotten chemise not from Arabic, but from one of its Semitic cousins.

Some linguists point to the PIE (Proto-Indo-European) root *kam- as the ultimate source of camisia, and propose that Arabic got qamiṣ from the European languages.

I find this unlikely, due to the relative conservatism of Semitic languages. Hebrew, Akkadian, Assyrian, and Ugaritic have isomorphic roots with related meaning; it would be strange for Arabic not to have a native word built off the same pattern.

Further, the fact that kamision and camisia share the same root consonants7 as the Semitic terms is compelling.
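As a rough sketch of what sharing root consonants means here, the snippet below reduces each word to a consonant skeleton. The equivalence classes are an assumption I am making purely for illustration: k, c, and q are lumped together as K, and s and ṣ as S. Whether those sounds can really be equated across these languages is precisely the open question, so treat the output as a restatement of the hypothesis rather than evidence for it. The `ending` parameter is likewise hypothetical and only serves to set aside the Greek -ion.

```python
import unicodedata

# Assumed equivalence classes, for illustration only.
CLASSES = {
    "k": "K", "c": "K", "q": "K",
    "m": "M",
    "s": "S", "\u1e63": "S",   # s and ṣ
    "n": "N",
}

def skeleton(word: str, ending: str = "") -> str:
    """Return the consonant classes of `word`, ignoring a trailing `ending`."""
    word = unicodedata.normalize("NFC", word.lower())
    if ending and word.endswith(ending):
        word = word[: -len(ending)]
    return "-".join(CLASSES[ch] for ch in word if ch in CLASSES)

print(skeleton("camisia"))                  # Latin
print(skeleton("kamision", ending="ion"))   # Greek, ending set aside
print(skeleton("kamision"))                 # Greek, ending included
print(skeleton("qamiṣ"))                    # Arabic
```

With the Greek ending set aside, all three words reduce to K-M-S; with it included, kamision picks up a trailing N that the other two lack.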

Words that are purported to be related to camisia via *kam– include hemedi (Old High German) and hemeþe (Old English).

I’m entirely unfamiliar with Celtic languages. As such, the possibility of the transformation (h-t*-) –> (c, s) is not something I can really comment on.

But, at first blush, it seems less likely than the Ugaritic –> Greek –> Latin –> Romance flow — but that might just be the part of me in love with the exotic talking.

None of this is to be interpreted as rigorous scholarship — just thoughts on the issue. I’ll update this in the future as I gain more familiarity with Latin, Greek, and the Ancient Semitic languages. With luck, I’ll happen upon a solution.


1. The ^ symbol, called the circumflex accent (l’accent circonflexe) in French orthography or a caret in general typography, indicates that the vowel it marks was once followed by an s. This usage makes sense when you consider the meaning of the word caret in the first place: It is a form of the Latin verb carere, meaning to lack.

2. I use the term a bit tongue-in-cheek:

“The stereotype of the Middle Ages as “the Dark Ages” fostered by Renaissance humanists and Enlightenment philosophes has, of course, long since been abandoned by scholars.” ~ Ralph Ralco

“Historians and archaeologists have never liked the label Dark Ages . . . ” ~ Christopher Snyder

3. This is obviously not an exhaustive list, but these are the members of the Semitic language family with which I’m most familiar. For more information on the Semitic root system, see this document from the American Heritage Dictionary and this page from the Ancient Hebrew Research Center.

4. By Classical Arabic, I am referring specifically to the language of the Qur’an, as presented in W.M. Thackston’s An Introduction to Koranic and Classical Arabic. The book is available for viewing in PDF format at this link. The examples regarding the root mlk can be found on page 22. The hyphen at the end indicates that these are ‘stems’, to which inflectional endings denoting possessive, nominative, genitive, or accusative function are added.

Many of these words are by no means restricted to Classical usage, however. In Egypt, the phrase for heads or tails is malik wala kitaabah, meaning king or writing. Probably, this has to do with the fact that the first ancient coins of Egypt featured a face on one side and writing on the other.

5. Cf. the familiar Hebrew word rabbi with the second ‘ayat (verse) from the Opener of the Qur’an: Alhamdullilahi r-rabbi l-alameen, loosely translated as All praise due to Allah, Lord of all the worlds.

6. Greek is still spoken in some parts of Lebanon and Syria.

7. Two points here. First: While kamision and camisia share a similar vocalic pattern, it would be premature to conclude that the former shares that pattern with its Semitic source (if, of course, it has one). Ugaritic, like most Semitic languages, did not record its vowels. As it’s no longer spoken, it’s probably impossible to go further than the consonantal root.

Second: kamision appears to have kmsn, but in fact, the final -on is a characteristic ending of neuter nouns. I’m not entirely up on my Greek, but if the word does indeed have a Semitic source, it’s probable that -on (or -ion) was added simply to facilitate inflection. As such, the presence of the final -n might not serve to discredit the Ugaritic hypothesis.