Casual Studies in Linguistics: The Semitic Languages

I created this page as a way of recording my own learning. The disclaimer is that it's written by a novice, not an expert. I found that there's a lot of information out there, but it's not very visual, and it can be difficult to see the wood for the trees. I'm particularly interested in three things: commonalities (and of course differences too) between the Semitic languages, commonalities between the Indo-European languages, and the historical development of alphabets.

The Semitic languages include: Biblical Hebrew, Aramaic, Classical Arabic (the language of the Qur'an), Sabaean, Modern Israeli Hebrew, Modern Standard Arabic, Ge'ez and Amharic (two Ethiopian languages) and Maltese.

Consonant Equivalences

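Alphabetical Order | Transcription | Arabic | Hebrew & Judeo-Aramaic | Syriac Aramaic | Soft (letter) | Soft (sound)
1 | ʼ | ء | א | ܐ | |
2 | b | ب | ב | ܒ | ב | v
3 | j, gˇ | ج | ג | ܓ | ג | gh
4 | d | د | ד | ܕ | ד | dh
5 | h | ه | ה | ܗ | |
6 | vv | و | ו | ܘ | |
7 | z | ز | ז | ܙ | |
25 | dh | ذ | | | |
8 | ch | ح | ח | ܚ | |
24 | kh | خ | ח | ܚ | |
9 | t | ط | ט | ܛ | |
10 | y | ي | י | ܝ | |
11 | k | ك | כ | ܟ | כ | kh
12 | l | ل | ל | ܠ | |
13 | m | م | מ | ܡ | |
14 | n | ن | נ | ܢ | |
16 | ʻ | ع | ע | ܥ | |
28 | gh | غ | | | |
17 | p | ف | פ | ܦ | פ | f (ف)
18 | ts | ص | צ | ܨ | |
26 | d | ض | | | |
27 | dh, tz | ظ | | | |
19 | q, k | ق | ק | ܩ | |
20 | r | ر | ר | ܪ | |
15 | s | | ס | ܣ | |
21a | sh | س | ש, שׁ | ܫ | |
23 | th | ث | | | |
21 | sh | ش | שׂ | | |
22 | t | ت | ת | ܬ | ת | s, th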
The 29 Proto-Semitic Consonants.

The Semitic languages as a whole have 29 original consonant sounds. Actually, there are far more sounds than that. Every language, every dialect and every time period has evolved its own collection of sounds. But each sound can be placed into one of 29 categories. Shown above are some English letters that'll serve as labels for the 29 consonants. For now, don't worry about what they sound like. ch certainly isn't pronounced anything like the "Ch" sound in "Church", for example. Linguists think there was an original Proto-Semitic language (or at least a common melting pot) which then split off into many separate languages.

Here are the Arabic letters. The Arabic alphabet wasn't developed until quite late, the 3rd Century CE, but it distinguishes 28 out of the 29 sounds with separate letters, so it's useful to have the Arabic alphabet in mind at all times. Actually, hamza (ء) isn't considered an independent letter, but nonetheless it is a symbol that adds the ʼ sound onto other letters such as و and ي.

And here is the Hebrew alphabet, which contains only 22 letters. However, even when the alphabet was first used, the Hebrew language already included 25 different consonant sounds, with ח standing for both ch and kh, and ש standing for both of the sounds labelled sh (21 and 21a). Following a single consonant through the table from one script to another can help us find cognates: instances where an Arabic word and a Hebrew word share the same historical origin. Sometimes the meanings of cognate words diverge over time until they mean very different things, and sometimes they still mean roughly the same thing (although sometimes with subtle, culturally specific connotations added).

Of course, the Hebrew language and the Hebrew alphabet have an incredibly rich history, literature and discourse. The purpose here is not to reduce Hebrew to a fictional subcategory of Arabic, merely to provide learning aids for people interested in learning both.

Some Hebrew letters have boxes that span more than one consonant in the table.
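The idea of following a consonant from Arabic to Hebrew can be sketched in a few lines of Python. This is only a rough learning aid, not a linguistic tool. The names ARABIC_TO_HEBREW and hebrew_skeleton are made up for this example, the mapping copies only the pairs that have an explicit Hebrew entry in the consonant table, and it assumes that vowel marks and sofit (final) forms are ignored and that a dotless ש stands in for both sh entries.

```python
# Arabic letter -> Hebrew letter from the same entry of the consonant table.
# Letters are written as code points so that nothing depends on text direction.
# Assumption: only pairs with an explicit Hebrew entry in the table are listed.
ARABIC_TO_HEBREW = {
    "\u0621": "\u05d0",  # hamza -> alef
    "\u0628": "\u05d1",  # ba    -> bet
    "\u062c": "\u05d2",  # jim   -> gimel
    "\u062f": "\u05d3",  # dal   -> dalet
    "\u0647": "\u05d4",  # ha    -> he
    "\u0648": "\u05d5",  # waw   -> vav
    "\u0632": "\u05d6",  # zay   -> zayin
    "\u062d": "\u05d7",  # hah   -> het (ch)
    "\u062e": "\u05d7",  # khah  -> het (kh)
    "\u0637": "\u05d8",  # tah   -> tet
    "\u064a": "\u05d9",  # ya    -> yod
    "\u0643": "\u05db",  # kaf   -> kaf
    "\u0644": "\u05dc",  # lam   -> lamed
    "\u0645": "\u05de",  # mim   -> mem
    "\u0646": "\u05e0",  # nun   -> nun
    "\u0639": "\u05e2",  # ain   -> ayin
    "\u0641": "\u05e4",  # fa    -> pe
    "\u0635": "\u05e6",  # sad   -> tsadi
    "\u0642": "\u05e7",  # qaf   -> qof
    "\u0631": "\u05e8",  # ra    -> resh
    "\u0634": "\u05e9",  # shin  -> shin/sin letter (entry 21)
    "\u0633": "\u05e9",  # sin   -> shin/sin letter (entry 21a)
    "\u062a": "\u05ea",  # ta    -> tav
}


def hebrew_skeleton(arabic_consonants):
    """Return the Hebrew letters a cognate would be expected to use.

    A '?' marks any Arabic letter that has no Hebrew entry in the mapping.
    """
    return "".join(ARABIC_TO_HEBREW.get(ch, "?") for ch in arabic_consonants)
```

For example, passing the three consonants of shams (sh, m, s) gives the three consonants of shemesh, the pair of words mentioned in the notes further down. A real cognate search would still need a dictionary, because sharing a consonant skeleton does not by itself prove a shared origin.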

Pronunciation

Simple, Pronounced as in English

Where a box holds two Hebrew letters separated by a slash, the second is the sofit form used at the ends of words.

Arabic | Hebrew | English
ب | בּ | b
ج | גּ | j, g
د | דּ | d
ه | ה | h
ة | ת, ה | h, t
و | ו | w, v
ز | ז | z
ي | י | y
ك | כּ | k
ل | ל | l
م | מ / ם | m
ن | נ / ן | n
 | פּ | p
ر | ר | r
ش | שׁ | sh
س | ס | s
س | שׂ | s
ت | תּ | t
Simple consonants, pronounced the same as their English equivalents.
  1. The j sound is the Standard Arabic pronunciation that's used for Qur'an recitation. In parts of Egypt, Sudan, Yemen and Oman a g or gy (gyuh) pronunciation is spoken. In the Levant and parts of North Africa a hard z sound is used (as in genre). The Hebrew גּ is mostly pronounced as g, an exception being some Yemenite Jewish groups where it's pronounced as j. According to some sources ج was originally pronounced closer to g in Muhammad's time (peace be upon him). Whenever I use g in transcriptions I am referring to the hard g sound such as in get, rather than the soft g such as in gem, which is always transcribed as j.
  2. The h is voiceless, without the vocal cords vibrating, as in hot, rather than being voiced, as in behind.
  3. When a noun is in a grammatical form called the construct state (which is indicated by a diacritical mark called a tanwin) then ة is pronounced as t, otherwise it's pronounced as h. It's also pronounced as h if you pause for breath or stop speaking after it. Hebrew also has the concept of construct state but a word in the construct state has its final letter ה changed to a ת, whereas in Arabic the base letter remains as ه but has the two dots borrowed from the ت added to it, becoming ة.
  4. The w sound is used in Arabic and in Hebrew spoken by Yemenite and Iraqi Jews, and when preceded by an a the aw sound is spoken by some Italian Jews. Other Hebrew accents use a v sound instead.
  5. There are a variety of different ways of producing an r sound and any of them is accepted. The most ancient form is probably a rolled r. In Arabic it is necessary to have two distinct pronunciations for ر, one heavy and one light; see Heavy and Light Letters.
  6. Where an Arabic word and a Hebrew word share a common origin the s and sh sounds are often swapped, for example shemesh, שֶׁמֶשׁ, and shams, شَمْس. This table shows letters aligned according to pronunciation, whereas the Consonant Equivalences table shows their historical alignment.
  7. In early Biblical times שׂ was pronounced differently from ס (probably like the ll sound in Welsh). But by the time Ezra–Nehemiah was written there is evidence that it had already come to represent the same sound as ס. It's not normal practice to attempt to reproduce the distinction in speech today, although it is still made in writing.

Sounds Rarely Heard in English

Arabic | Hebrew | Transcription | IPA | Description
ء | א | ʼ | ʔ | Sound between the two words Uh-Oh! After an i and before a t, as in bit. Some accents that drop a t, as in butter (though this shouldn't be confused with flapping). Automatic in words beginning with a vowel.
خ | כ | kh | x | ch in loch (Scottish English). (Most English people mispronounce this as k.)
خ | ח | kh | χ | ch in Bach (in English, and for some German speakers).
ع | ע | ʻ, aa | ʕ | Croaky throat sound.
Unusual sounds for English speakers.
  1. In some Hebrew accents ע‎ is pronounced as [ʔ], which is the same sound as for א, or alternatively as [ŋ], which is the sound that forms the "n" part of the -ing sequence in English, which is different to the n sound in other places such as in the word net. In fact, the sound of the letter ع (and ע in the places where there is a sound equivalence) sounds kind of "strange" and distinctly Non-European. In some accents both ע and א are completely silent.

Learning to hear the sound of א/ء can be tricky. An ear tuned to listening to English speech doesn't recognize this sound as a separate letter. Fortunately, every word in English that we write and think of as starting with a vowel actually begins with the sound of א/ء when we say it, followed by whatever the vowel is of course. So you can imagine א as being a silent letter when it's at the beginning of words and you'll still pronounce them correctly with the א, just the same as if you got your head in a spin trying to think of the consonant and the vowel that follows as separate things!

The letter خ has two permissible pronunciations that are similar but distinct. I've used the International Phonetic Alphabet (IPA) to differentiate them. The IPA is a collection of symbols which can represent the sounds of any language. The [χ] pronunciation is made by lifting the back of the tongue and breathing as if you're trying to remove phlegm. The [x] sound is like the pronunciation of غ (to be discussed under Heavy and Light Letters) except that it's unvoiced. You can learn about voiced and unvoiced sounds by placing your fingers on your voice box and pronouncing the letters s and z. The z sound is voiced and you should be able to feel your vocal cords vibrate, whereas s is an unvoiced sound, without vibration. The [x] pronunciation has evolved into the [χ] pronunciation that is spoken by most modern speakers, although the [χ] pronunciation may have existed even before the period when the [x] pronunciation was widespread.

From the Septuagint, scholars have argued that some words containing ח were at that time pronounced as [χ] but others were pronounced as [ħ]. This distinction was forgotten and present day Hebrew speakers consistently use one of [ħ], [χ] or [x] for all words (it varies between different Jewish traditions). The כ may be heard spoken as either [x] or [χ] (traditionally [x]).

The Septuagint is a Greek translation of the Bible, translated in stages between the 3rd Century BCE and 132 BCE for the library of Alexandria in Egypt, a place where there was a strong Jewish community at the time. The hypothesis about the different pronunciations of ח was made by looking at which Greek letters were chosen to represent the sounds of the names of people and places.

Begadkefat (Hard and Soft Letters)

In Hebrew there are six letters which can have two pronunciations each. The correct pronunciation is indicated by the presence or absence of a hardening dagesh (dot). The usual practice is to distinguish at least three hard and soft pairs. The other three pairs are less commonly distinguished, with some speakers pronouncing the soft versions as though they were the hard ones. However, even the ב/בּ (v/b) distinction is not universally made (and in some accents the ב may be an intermediate sound, [β] in IPA). The dagesh is part of the pronunciation rules of the language. For example, if a word which usually begins with a letter with a hardening dagesh comes after a word that ends in a vowel then, in some circumstances, it will lose its hard sound and adopt the soft pronunciation.

On the Arabic side of things, there are ten fully independent letters in this section. The sounds will still have hard or soft qualities but understanding those features is not required to understand Arabic. There is, for example, no special relationship between ك and خ. The pairs د and ذ and ت and ث do look visually similar however. This is because both Hebrew and Arabic use adaptations of two alphabets which were originally both used to write the Aramaic language, which, like Hebrew, has the soft-hard distinction.

Soft Arabic | Soft Transcription | Example | Soft Hebrew | Hard Hebrew | Hard Transcription | Hard Arabic
 | v | vet | ב | בּ | b | ب
خ | kh | loch (Scottish pronunciation) | כ / ך | כּ / ךּ | k | ك
ف | f | fed | פ / ף | פּ / ףּ | p |
Hard and soft pronunciations, common distinctions.

Soft Arabic | Soft Transcription | Example | Soft Hebrew | Hard Hebrew | Hard Transcription | Hard Arabic
غ | gh | see Heavy and Light Letters | ג | גּ | gˇ, j | ج
ذ | dh | the | ד | דּ | d | د
ث | th, s | three | ת | תּ | t | ت
More hard and soft pronunciations. Presence or absence of a distinct soft pronunciation varies by Hebrew accent but is always required in Arabic.
  1. Occurs only rarely (once in the entire Bible). Unlike other letters that have final (sofit) forms which must be used consistently at the ends of words it's acceptable to use פּ even at the end of a word.
  2. Some Ashkenazi Jews use an alternative pronunciation like an English s.

Heavy and Light Letters

Some sounds can be modified to create both heavy and light versions. There are seven pairs of heavy and light letters and two additional Arabic letters that are sometimes heavy and at other times light without changing the letter's appearance.

There are four different methods which can be used to make a letter heavy. It helps to make the right sound if you visualize where it's coming from. To some extent the techniques are interchangeable. The main objective is to create some contrast between the heavy and the light letters. Nonetheless, in the interests of precision, the "proper" techniques are listed in the table of heavy and light letter pairs. A secondary aspect to making a good heavy or light pronunciation is to make the lips spread wide (like a smile) for the light letters and rounded (like a w) for the heavy letters.

Velarization
Uses the top part of the back of the mouth. Both English and Arabic have l with and without velarization. If you say the words lips and full then you may notice two different sounds and different feelings in the mouth. However, some heavy accents outside of England only have the heavy l, and some speakers use a different technique involving the teeth to produce heavy l (i.e. not velarization in some accents). The sounds for g, k and ng also come from the velar region of the mouth.
Pharyngealization
Performed by tensing up the lower part of the back of the mouth.
Uvularization
The uvula is located part way between the velar and pharyngeal parts.
Glottalization
Performed by pinching in a muscle located inside the voice box, in the neck.
Technique | Heavy Arabic | Heavy Transcription | Heavy Hebrew | Light Hebrew | Light Transcription | Light Arabic
Uvularized or Pharyngealized | ص | ts, tz | צ / ץ | ס, שׂ | s | س
Uvularized or Pharyngealized | ض | d | | דּ | d | د
Uvularized or Pharyngealized | ظ | dh, z | | ד | dh | ذ
Uvularized or Pharyngealized | ط | t | ט | תּ | t | ت
Uvular | ق | k, q | ק (3) | כּ | k | ك
Uvular [χ] or Velar [x] | خ | kh | כ | ה | h | ه
Uvular or Velar | غ | gh | ג | ע | ʻ, aa | ع
Heavy and light letter pairs.
  1. צ is alternatively pronounced as a t rapidly followed by and blended together with an s (both light) in some Hebrew accents, a method of articulation called an affricate consonant. The same ts sound occurs in English words ending in ts, such as cats, and in the word pizza. Some experts think the original Proto-Semitic sound was both heavy and affricate. English words frequently make use of two more affricate consonants: ch is a combination of t and sh; and j is a combination of d and a hard z. (The hard z is found in words like genre, vision, equation and seizure.)
  2. ظ is a heavy version of ذ, the th sound in the. It is not supposed to be a heavy version of ز (z) in Standard Arabic (though it can be this way in Levantine Arabic) even though the most common English alphabet transcription of ظ is as z. The transcription is misleading so far as Standard Arabic is concerned.
  3. In some accents these are pronounced as light sounds, exactly the same as the corresponding light letters.
  4. Pronounced as g in some dialects (same as גּ), which is also a velar sound but unrelated to ע‎, ع and غ.
Technique | Arabic | Transcription | When is it Heavy? | When is it Light?
Velarized | ل | l | Heavy case: ٱللّٰه. Except: ٱللّٰه ◌ِ |
 | ر | r | | Light cases: (1) رِ (2) رْ◌ِ (3) رْ◌ْ◌ِ. The 2nd and 3rd cases don't apply if the رْ is followed by one of the heavy letters in the table of heavy and light letter pairs.
Arabic letters that are sometimes heavy and sometimes light.
Articulation Point | Arabic | Transcription / English | Hebrew
Velar | | g | גּ
Velar | ك | k | כּ
Velar | | ng |
Pharyngeal | ع | ʻ, aa | ע
Pharyngeal | ح | ch (IPA: [ħ]) | ח
Glottal | ء | ʼ | א
Other velar, pharyngeal and glottal sounds (not described as heavy by Muslim scholars).
  1. ח/ح/[ħ] is also an unvoiced version of ע/ع. See also Sounds Rarely Heard in English for alternative pronunciations and transcriptions of ח.

Silent Letters

Letters | When is it silent?
ٱ | After a vowel. (Otherwise i).
ٱلْ | Before a sun letter at the beginning of a word. (Otherwise Al-)
أُ, أَ, إِ | Always silent. The silent ا acts as a "support" for the hamza, which has a glottal stop sound. One of the 3 vowel signs is also included.
Mater lectionis letters (see below). |

There are cases when the letters ي ,و ,ا and ה ,י ,ו ,א combine with a mark above, below or to the side of the letter before (or the same letter) to signify a vowel sound rather than their usual glottal stop (or silent), w (or v), y and h sounds. This is called mater lectionis (Western term) or madd (Arabic).

However, sometimes when these letters are technically silent because of mater lectionis our brains may still interpret them as their usual consonantal sounds, a y sound for example. The vowels being constructed in these cases are things like ah and ey, and the letters h and y act in a similar way in English as they do in Hebrew. Nonetheless, the y in prey is noticeably different from the y in yam. The word blah is another example. There are other combinations which are not used for this purpose in English however. For example, English doesn't use uw to represent oo.

ا is the only letter in the Arabic alphabet that is never a consonant on its own: it is only ever used either to indicate a vowel or as a silent support for a hamza.
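The sun letter rule from the silent letters table can be checked mechanically. The sketch below is a rough illustration only. The list of 14 sun letters is the standard one and is not given on this page, the function name article_lam_is_silent is made up, and the function simply assumes that any word beginning with alif and lam carries the definite article, which is not always true.

```python
import unicodedata

# The 14 sun letters, by code point (standard list, not taken from this page):
# ta, tha, dal, dhal, ra, zay, sin, shin, sad, dad, tah, zah, lam, nun.
SUN_LETTERS = set(
    "\u062a\u062b\u062f\u0630\u0631\u0632\u0633\u0634"
    "\u0635\u0636\u0637\u0638\u0644\u0646"
)
ALIF_FORMS = ("\u0627", "\u0671")  # plain alif and alif with wasla
LAM = "\u0644"


def article_lam_is_silent(word):
    """Return True if the word starts with alif + lam and the lam is silent.

    Assumption: a leading alif + lam is always the definite article.
    Vowel signs and other combining marks are ignored.
    """
    letters = "".join(ch for ch in word if not unicodedata.combining(ch))
    if len(letters) < 3:
        return False
    if letters[0] not in ALIF_FORMS or letters[1] != LAM:
        return False
    return letters[2] in SUN_LETTERS
```

With this sketch the word for "the sun" (article followed by shin) reports a silent lam, while the word for "the moon" (article followed by qaf) does not.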

Doubled Letters

When an Arabic letter has a shaddah (small ω) written above it, for example سّ, or a Hebrew letter has a dagesh (dot) in its centre, for example שּׂ, then the letter may be doubled. The shaddah always indicates doubling. In Hebrew the situation is a bit more complicated because, as previously mentioned under Begadkefat (Hard and Soft Letters), a dot also indicates a hardening dagesh when it's placed inside one of the begadkefat letters (תּ ,פּ , כּ ,דּ ,גּ, בּ). As a third function, a dot can also indicate something called a mappīq.

When reading:

  1. When a begadkefat letter has a dagesh it's always pronounced with its hard pronunciation (b, g, d, k, p, t).
  2. When a begadkefat letter with a dagesh comes after a full vowel (not a vocal sh'vā or chatāf vowel) then it's pronounced both hardened and doubled, except when it's the last letter of a word in which case it's not doubled, for example אַתְּ.
  3. When a begadkefat letter with a dagesh follows straight after another consonant sound then it's pronounced hard but not doubled.
  4. When a letter that's not a begadkefat letter nor א , ה or ר has a dot then it's a doubling dagesh.
  5. When a ה or א has a mappīq dot (אּ , הּ) then it's pronounced as a consonant even though it's in a position within the word where ה or א would normally be interpreted as a mater lectionis (silent) letter serving to lengthen the preceding vowel.
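
The five reading rules above amount to a small decision procedure, which can be sketched in Python. This is a simplified illustration under stated assumptions, not a full account of Hebrew pointing. The function name read_dot and its two flags are made up for this example, the caller is assumed to know already whether the letter follows a full vowel and whether it ends the word, and a dot inside ר is left unclassified because the rules above don't cover it.

```python
# Hebrew letters by code point, without any points attached.
BEGADKEFAT = set("\u05d1\u05d2\u05d3\u05db\u05e4\u05ea")  # bet gimel dalet kaf pe tav
FINAL_FORMS = set("\u05da\u05e3")                         # final kaf, final pe
MAPPIQ_LETTERS = set("\u05d4\u05d0")                      # he, alef
RESH = "\u05e8"


def read_dot(letter, after_full_vowel=False, word_final=False):
    """Classify a dot written inside `letter`, following reading rules 1-5.

    Returns a dict with the kind of dot, whether the letter takes its hard
    pronunciation, and whether it is doubled.
    """
    if letter in FINAL_FORMS:
        word_final = True  # sofit forms only occur at the end of a word
    if letter in BEGADKEFAT or letter in FINAL_FORMS:
        # Rule 1: always hard. Rules 2 and 3: doubled only after a full
        # vowel, and never on the last letter of a word.
        doubled = after_full_vowel and not word_final
        return {"kind": "dagesh", "hard": True, "doubled": doubled}
    if letter in MAPPIQ_LETTERS:
        # Rule 5: the dot is a mappiq, so the letter is a sounded consonant.
        return {"kind": "mappiq", "hard": False, "doubled": False}
    if letter == RESH:
        # Not covered by the rules above; left unclassified in this sketch.
        return {"kind": "unclassified", "hard": False, "doubled": False}
    # Rule 4: any other letter with a dot is doubled.
    return {"kind": "dagesh", "hard": False, "doubled": True}
```

For letters outside the begadkefat group the "hard" field is simply False, since they have no hard and soft pair to choose between.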

When writing:

  1. When applying a prefix to a word that begins with a begadkefat letter the dagesh will usually be dropped (e.g. בֵּיתֶךָ becomes בְּבֵיתֶךָ) and the letter takes its softer pronunciation (maximally v, gh, dh, kh, f and th or s, although most accents don't distinguish between gh and g, nor between dh and d).
  2. A begadkefat letter with a dagesh after a vowel occurs:
    1. When applying a grammatical prefix where the grammar rule specifically calls for adding a doubling dagesh, for example the הַ־ definite article prefix ("the").
    2. When it's part of the basic make-up of the word itself, for example שַׁבָּת (shabbāt).

    In both cases the dagesh will be both hardening and doubling.

  3. A begadkefat letter which comes after a consonant without a vowel usually has a hardening dagesh.
  4. When a word normally begins with a begadkefat letter but the previous word ends with a vowel then the begadkefat letter may or may not end up having a dagesh, depending on whether the cantillation (rhythm) is conjunctive (continuing) or disjunctive (with separation). Cantillation may be indicated using a system of cantillation marks although these are often not written. A maqqāf (hyphen) also indicates conjunction, in which case there is no dagesh.
  5. The letters ה ,ח ,ע and א never have a dagesh, and ר almost never has one.
  6. A word cannot end with a doubling dagesh. This results in some words where a dagesh only appears once a suffix has been added, for example when עַם ("nation", "people") becomes עַמִּים in its plural form.
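
Writing rule 1 can be sketched as a small string operation in Python. This is a minimal sketch that covers only the usual case of that one rule. The function names soften_initial and add_prefix are made up for this example, and prefixes that call for a doubling dagesh (rule 2) and the cantillation cases (rule 4) are deliberately left out.

```python
import unicodedata

DAGESH = "\u05bc"
BEGADKEFAT = set("\u05d1\u05d2\u05d3\u05db\u05e4\u05ea")  # bet gimel dalet kaf pe tav


def soften_initial(word):
    """Drop the dagesh from the first letter if it is a begadkefat letter."""
    if not word or word[0] not in BEGADKEFAT:
        return word
    # Collect the points attached to the first letter. The dagesh may come
    # before or after the vowel sign, depending on how the text was typed.
    end = 1
    while end < len(word) and unicodedata.combining(word[end]):
        end += 1
    marks = word[1:end].replace(DAGESH, "")
    return word[0] + marks + word[end:]


def add_prefix(prefix, word):
    """Attach a prefix that does not call for a doubling dagesh."""
    return prefix + soften_initial(word)
```

Applied to the example in rule 1, the prefix is attached and the dot disappears from the first letter of the original word, while its vowel sign and the rest of the word are left untouched.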

Ordered According to the English Alphabet

Each cell in the "Transcription" column corresponds to a unique sound and the Arabic and Hebrew letters have been fitted around them. The transcriptions d, g and t can represent more than one sound and which one is pronounced varies by Hebrew accent. The leftmost column provides a rough grouping based on the English alphabet (e.g. "The H-like sounds") but it isn't specific enough to provide adequate transcriptions.

Arabic Transcription Hebrew
Alef
(Hamza)
ء ʼ א
ʻayn ع ʻ,
aa
ע‎
Ghayn غ gh g ג‎
B ب b בּ
D
DH
د d דּ
d ד
ذ dh
ض d
ظ dh,
z
F ف f פ
ף
G g g גּ
g ג
H ه h ה
ح ch,
h
ח‎
خ kh
kh כ
ך
J ج j
K ك k כּ
ךּ
L ل l ל
M م m מ‎
ם
N ن n נ‎
ן
P p פּ
ףּ
Q ق q,
k
ק
R ر r ר
S
SH
س s ס
s,
ś
שׂ
ش sh,
שׁ
ص ts,
tz
צ
ץ
T
TH
ت t תּ
t ת
ث th
ط t ט
V v ב
ו‎
W و w
Y ي y י
Z ز z ז‎

Representing Foreign Words

English | Cognate with | Persian
ch | | چ
g | ج j | گ
p | f | پ
v | ب b | و (Iranian Persian), ڤ (Kurdish & Tatar)
Hard z / French j | ج j | ژ

In Arabic, the two letter sequence تش is used to write ch in words borrowed from other languages.

English /
Arabic Transcription
Arabic Cognate With Geresh System Judeo-Arabic Yiddish Without Dagesh With Raphe
ch צ׳ טש
j ج גּ g ג׳ ג דזש
w و ו v / w ו׳ ו
Hard z /
French j
ج‎ גּ g ז׳ ג זש
dh ذ ז z ד׳ דׄ ד דֿ
kh خ ח h / kh ח׳ כׄ כ
ח
כ כֿ
gh غ ע‎ ʻ ר׳
ע׳
ג‬ׄ ג‬ ג‬ֿ
s ص צ ts / s ס׳ צ
d ض צׄ
dh / z ظ טׄ
th ث שׂ s ת׳ תׄ
ת֒
ת תֿ
  1. Hard‑z / French‑j is an alternative pronunciation of ج in Levantine and Maghrebi countries.

Further Notes on Judeo-Arabic

Transcription | Arabic | Hebrew | Judeo-Arabic
b | ب | בּ | ב
k | ك | כּ | כ
sh | ش | שׁ | ש
s | س | ס, שׂ | ס
aa, a | ىٰ | הָ | א (Early), י (Classical), ה (Late/Modern)
Some more Judeo-Arabic reuse of Hebrew characters. The remaining Arabic characters are mapped to Hebrew ones trivially.

References

  1. Benjamin Hary. Adaptations of Hebrew Script. In: Peter T. Daniels and William Bright (Eds.). The World's Writing Systems. Oxford University Press. 1996.
  2. Geoffrey Khan. Notes on the grammar of a late Judaeo-Arabic text. In: S. Shaked et al. (Eds.). Jerusalem Studies in Arabic and Islam. Hebrew University of Jerusalem. 1992.
  3. Avraham Ben-Rahamiël Qanaï. (2008). unicode font for judeo-arabic. [online] Google Groups, Jewish Languages. [Accessed 31 Jan. 2018].
  4. Thomas Schorreel. An edition and translation of T-S Ar. 54.63 with a grammatical analysis of the text (Masters Thesis). Universiteit Gent. 2011.

Obsolete English Letters

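Name | Upper Case | Lower Case | Sounds (IPA) | Modern Spelling (Lower Case)
Long S | S | ſ | s | s
Long S | | ſs | s | ss
Double U | VV | vv, uu | w | w
Wynn (Ƿynn) | Ƿ | ƿ | w | w
Thorn (Þorn) | Þ | þ | θ (normally), ð (between voiced sounds) | th
Eth | Ð | ð | θ (normally), ð (between voiced sounds) | th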
  1. The regular s was used at the end of words. Long-s also developed into the integral sign and the usage of / to represent the word "shilling" or "shillings". In German, a "Sharp S", ß, represents a similar phenomenon.
  2. Doubled v was used at the beginnings of words and doubled u was used in the middle and at the ends of words. Originally, u and v were stylistic variations of the same letter.
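
The long s convention in note 1 is simple enough to sketch in Python. This is only a rough approximation of the rule as stated there (regular s at the end of a word, long s elsewhere). Real printers followed further exceptions that are not covered here, and the function name to_long_s is made up for this example.

```python
import re


def to_long_s(text):
    """Replace lowercase s with long s, except at the end of a word.

    An s counts as word-final when it is not followed by a lowercase
    letter a-z; capital S is left alone.
    """
    return re.sub(r"s(?=[a-z])", "\u017f", text)
```

A doubled s in the middle of a word stays long throughout, and at the end of a word it comes out as a long s followed by a regular s, which matches the ſs row in the table.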