Tesяfkǝm: A Constructed Language (S11)

Phonotactics


Alphabet

Tesяfkǝm can be written in two alternative scripts. The one used in this document, the 'romanisation', is based mainly on the Latin alphabet, with additions from the Greek, Cyrillic, Armenian, and Runic alphabets. (The DejaVu fonts display the romanisation reasonably well.) The native script is a conscript that has yet to be developed.

The alphabet of the standard romanisation uses the following letters:

A Я B D Δ E Ω F G Չ H I Ա Ь K L Λ Վ M N Ŋ Ն O Ǝ P Q R S T Þ U Ю V Ъ X Z
a я b d δ e ω f g չ h ı ա ь k l λ վ m n ŋ ն o ǝ p q r s t þ u ю v ъ x z

Due to the sandhi and vowel harmony, it is not wise to use this as the primary sort order in lexicons. Lexicons therefore use the following order, in which letters separated only by spaces are treated as equal. The primary sort order distinguishes only a few categories of letters (the bold letters are the labels/headlines used in a lexicon for the whole group):

a я open vowels
< b f h mb p v labials
< d δ nd s t þ z alveolars
< e ω ı ա o ǝ u ю non-open vowels
< k g չ ŋg նչ r q x velars/uvulars
< l λ վ liquids
< m n ŋ ն nasals

ь and ъ are ignored in primary sort order. When words are equal according to this sort order, secondary sort order is used as given in the space separated sublists (it is equal to the order given in the first list above).
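
The two-level ordering can be sketched as a sort key. This is only an illustration under assumptions: the names `GROUPS`, `tokenize` and `sort_key` are made up here, <mb>, <nd>, <ŋg> and <նչ> are treated as single collation units, the secondary rank is the position inside each sublist, and input is assumed to be lower case.

```python
# Primary groups in the order given above (hypothetical data layout).
GROUPS = [
    ["a", "я"],                                   # open vowels
    ["b", "f", "h", "mb", "p", "v"],              # labials
    ["d", "δ", "nd", "s", "t", "þ", "z"],         # alveolars
    ["e", "ω", "ı", "ա", "o", "ǝ", "u", "ю"],     # non-open vowels
    ["k", "g", "չ", "ŋg", "նչ", "r", "q", "x"],   # velars/uvulars (ŋg read as a digraph)
    ["l", "λ", "վ"],                              # liquids
    ["m", "n", "ŋ", "ն"],                         # nasals
]
IGNORED = {"ь", "ъ"}  # skipped by the primary order
UNITS = {u: (gi, ui) for gi, g in enumerate(GROUPS) for ui, u in enumerate(g)}
UNKNOWN = (len(GROUPS), 0)  # assumption: unknown letters sort last


def tokenize(word):
    """Split a word into collation units, preferring digraphs."""
    units, i = [], 0
    while i < len(word):
        two = word[i:i + 2]
        if len(two) == 2 and two in UNITS:
            units.append(two)
            i += 2
        else:
            units.append(word[i])
            i += 1
    return units


def sort_key(word):
    """Return (primary, secondary, word); the raw word is only a final tie-break."""
    ranks = [UNITS.get(u, UNKNOWN) for u in tokenize(word) if u not in IGNORED]
    primary = tuple(group for group, _ in ranks)
    secondary = tuple(pos for _, pos in ranks)
    return (primary, secondary, word)
```

With this key, `sorted(words, key=sort_key)` puts <pa> and <ba> in the same primary position and separates them only by the secondary order.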


Vowels

There is no phonemic length.

         front                  back
         unrounded  rounded     unrounded  rounded
close    ı [i]      ю [y]       ա [ɯ]      u [u]
mid      e [e]      ǝ [ø]       ω [ɤ]      o [o]
open     я [æ]                  a [ɑ]

Diphthongs

Before a consonant, <aվ> is pronounced as [a͡u].

Maybe we'll get <uo> and <юǝ> from labialisation sandhi.

Consonants

             labial       dental       alveolar     palatal   velar        uvular       glottal
             vd.   vl.    vd.   vl.    vd.   vl.              vd.   vl.    vd.   vl.    vl.
stop         b     p                   d     t                g     k      չ     q
             [b]   [pʰ]                [d]   [tʰ]             [ɡ]   [kʰ]   [ɢ]   [qʰ]
fricative    v     f      δ     þ      z     s                             r     x      h
             [v]   [f]    [ð]   [θ]    [z]   [s]                           [ʁ]   [χ]    [h]
nasal        m                         n                      ŋ            ն
             [m]                       [n]                    [ŋ]          [ɴ]
liquid                                 l     λ                վ
                                       [l]   [ɬ]              [lˠ, ʟ]
approximant                                         ь         ъ, վ
                                                    [j]       [w]

The voiced stops are phonetically realised as prenasalised voiced stops if they are isolated. If preceded by another voiced consonant, the prenasalisation is lost. This is reflected in the orthography where /taba/ (< <ta> + <pa>) is pronounced [tamba] and written <tamba> but /talba/ (= <ta> + <l> + <pa>) is pronounced [talba] and written <talba>. This leads to switches between <mb> and <p>, <nd> and <t>, <ŋg> (which is phonemically /ɡ/, but phonetically [ŋɡ]) and <k>, <նչ> (/ɢ/ and [ɴɢ]) and <q>.
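
The spelling rule for voiced stops can be illustrated with a short sketch. The function name and the set of voiced consonants are assumptions made for this example (the set is read off the consonant table), and a word-initial voiced stop is treated as 'isolated'.

```python
# Prenasalised spellings of the phonemic voiced stops (from the text above).
PRENASALISED = {"b": "mb", "d": "nd", "g": "ŋg", "չ": "նչ"}
# Assumption: the voiced consonants, as listed in the consonant table.
VOICED_CONSONANTS = set("bdgչvδzrmnŋնlվ")


def spell_voiced_stops(phonemic):
    """Write voiced stops as prenasalised unless a voiced consonant precedes them."""
    out = []
    prev = ""
    for seg in phonemic:
        if seg in PRENASALISED and prev not in VOICED_CONSONANTS:
            out.append(PRENASALISED[seg])
        else:
            out.append(seg)
        prev = seg
    return "".join(out)


print(spell_voiced_stops("taba"))   # tamba
print(spell_voiced_stops("talba"))  # talba
```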

/l/ is devoiced after voiceless consonants and fuses with adjacent <h> and is then realised as [ɬ]. A standalone phoneme /ɬ/ also exists and is written <lh> in ASCII or 8859-* or <λ> in Unicode. In Unicode, the velar allophone is also distinguished and written <վ>. In ASCII, it is not distinguished from <l>. It is triggered before non-alveolar voiced (phonemic) consonants (e.g. <վb>, <վg>) and before uvulars (e.g. <վq>).

<վ> is pronounced [u] after /a/ to form the diphthong [a͡u]. Otherwise, it is either [lˠ] or [ʟ] depending on dialect.

<m> and <ŋ> are the same phoneme: <m> is used before rounded vowels, <ŋ> before unrounded vowels.

<f> and <h> are the same phoneme: <f> is used before rounded vowels, <h> before unrounded vowels.

<k> and <q> are closely related and often mutate into one another. The precise rules are given in tables, e.g. those for the sandhi rules.

Many dialects have [x] and [ɣ] as allophones of /χ/ and /ʁ/. The constraints that trigger these phones are then the same as for [k] vs. [q].


Syllables

Syllables are C(C)V(C).


Vowel Harmony

Tesяfkǝm has three different vowel harmonies, mostly working left-to-right, i.e., in forward direction. For the language's few prefixes, the harmony rules work right-to-left, i.e., backwards.

H1 Harmony: front/back

Verbs are subject to this harmony. Vowels in such stems are marked with an acute accent when shown in isolation, i.e., before the harmony is applied.

Shown As   Realisation
           front      back
<í>        <ı> /i/    <ա> /ɯ/
<ú>        <ю> /y/    <u> /u/
<é>        <e> /e/    <ω> /ɤ/
<ó>        <ǝ> /ø/    <o> /o/
<á>        <я> /æ/    <a> /ɑ/
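
As a small illustration, the H1 table can be applied mechanically to an isolated stem. The function name `apply_h1` and the example stems are hypothetical; how the front/back value is chosen for a given word is not covered here and is simply passed in as a flag.

```python
import unicodedata

# Isolation form -> (front realisation, back realisation), from the H1 table.
H1 = {
    "í": ("ı", "ա"),
    "ú": ("ю", "u"),
    "é": ("e", "ω"),
    "ó": ("ǝ", "o"),
    "á": ("я", "a"),
}


def apply_h1(stem, front):
    """Replace every acute-marked vowel by its front or back realisation."""
    # Normalise so that 'e' + combining acute is treated like precomposed 'é'.
    stem = unicodedata.normalize("NFC", stem)
    column = 0 if front else 1
    return "".join(H1[ch][column] if ch in H1 else ch for ch in stem)


print(apply_h1("kápú", front=True))   # kяpю
print(apply_h1("kápú", front=False))  # kapu
```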

H2 Harmony: close-open

Clitics are subject to this harmony. Vowels in such stems are written with capital letters.

Affixes are subject to both H1 and H2. Vowels in affix stems are written with capital letters and acute accents.

Due to this harmony, clitic and affix stems only use two principal vowel heights.

Shown As   Realisation
           close/neutral   open
Clitic
<I>        <ı>             <e>
<Ï>        <ա>             <ω>
<Ü>        <ю>             <ǝ>
<U>        <u>             <o>
Affix
<Í>        <í>             <é>
<Ú>        <ú>             <ó>

<a> and <я> do not participate in this harmony; they are neutral. For convenience, they will still be written <A> and <Ä> in clitics and <Á> in affixes, to indicate the morpheme type. When only <a> and <я> precede a vowel in that word, the neutral column applies in the above table.

H3 Harmony: rounded/unrounded

This is an implicit harmony obeyed in all native stems and morphemes.

Further, this harmony rule is obeyed across compounding and stem-stem derivation boundaries (also see l-sandhi).

<a> and <я> do not participate in this harmony.

Inside stems and morphemes, H1 and H2 also apply, so the number of secondary vowels is quite limited: vowels in a stem are either the same or <a>/<я> (selected by H1). In isolation, these two vowel groups are written <ý> and <á>, respectively, just as if they were only part of harmony H1.

unrounded rounded
<ı> <ю>
<ա> <u>
<e> <ǝ>
<ω> <o>

Sandhi

C+C Sandhi

Between Words

Used between words. Note that at the end of words, <t> ↦ <s>, instead of *<t> ↦ <þ>. The $-column shows the effect at the end of a sentence.

        +p     +t     +k       $
(V)+    - p    - t    - k      -
p+      f p    f t    f k      f
t+      s p    s t    s k      s
k+      x p    x t    x q/k    x

Velar vs. Uvular

<q> is used before <a>,<o>,<ω>, otherwise <k> is used.

In dialects that have [x], all <x> except in <xq> are pronounced [x].
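
The between-words rules are regular enough to be written as a tiny function. This is a sketch under assumptions: `join_words` is a made-up name, words are given in their underlying form with final <p>, <t> or <k>, and the <q>/<k> choice is only made in the k+k cell, as in the table.

```python
# Word-final stop -> fricative, as in the first letter of each table cell.
FINAL_FRICATIVE = {"p": "f", "t": "s", "k": "x"}
UVULAR_VOWELS = set("aoω")  # <q> is used before these vowels


def join_words(first, second=None):
    """Apply between-words sandhi; with second=None, apply the $ column."""
    last = first[-1]
    if last in FINAL_FRICATIVE:
        first = first[:-1] + FINAL_FRICATIVE[last]
    if second is None:
        return first
    if last == "k" and second[:1] == "k":
        next_vowel = second[1:2]
        stop = "q" if next_vowel and next_vowel in UVULAR_VOWELS else "k"
        second = stop + second[1:]
    return first + " " + second


print(join_words("kap", "pa"))  # kaf pa
print(join_words("tak", "ka"))  # tax qa
print(join_words("kat"))        # kas
```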

Word-Internally

Used when adding affixes, clitics, verbs. The +<s> and +<h> sandhi occur with some construct state stems and with affixes.

+p +t +k +s +h
(V)+ mb nd նչ/ŋg s ь/ъ
p+ hp px ps h/f
t+ tf ht tx ns þ
k+ kf hk ks x

Labialisation

<f> is used before <u>, <ю>, <o> and <ǝ>, otherwise <h> is used.

Epenthetic Glide

<ъ> is used before <u>, <ю>, <o> and <ǝ>, otherwise <ь> is used.

Velar vs. Uvular

<նչ> is used before <a>,<o>,<ω>, otherwise <ŋg> is used.

In dialects that have [x], all <x> not followed by <a>,<o>,<ω>,<q> are pronounced [x].
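
The three word-internal context rules (labialisation, epenthetic glide, velar vs. uvular) all select one of two spellings from the following vowel. A minimal sketch, with hypothetical function names:

```python
ROUNDED = set("uюoǝ")        # vowels that trigger <f> and <ъ>
UVULAR_TRIGGER = set("aoω")  # vowels that trigger <նչ>


def labial_fricative(next_vowel):
    """<f> before rounded vowels, otherwise <h>."""
    return "f" if next_vowel in ROUNDED else "h"


def epenthetic_glide(next_vowel):
    """<ъ> before rounded vowels, otherwise <ь>."""
    return "ъ" if next_vowel in ROUNDED else "ь"


def prenasalised_dorsal(next_vowel):
    """<նչ> before <a>, <o>, <ω>, otherwise <ŋg>."""
    return "նչ" if next_vowel in UVULAR_TRIGGER else "ŋg"


print(labial_fricative("u"), epenthetic_glide("ı"), prenasalised_dorsal("a"))
```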

C+l+C Sandhi

Used when deriving/compounding verb+verb, noun+noun, verb+noun, or noun+verb. Again, +<s> and +<h> occur in derivations/compounds with some construct states. The result of such a derivation is always a construct state itself. Infixed -<l>- often voices a cluster and then drops.

+l+     +p     +t     +k        +s       +h
(V)+    վb     ld     վg/վչ     z        λ
p+             vd     vg/vչ     zb       v
t+      δb            δg/δչ     zn       δ
k+      rb     rd               zg/zչ    r

Velar vs. Uvular

<չ> is used before <a>,<o>,<ω>, otherwise <g> is used.

In dialects that have [ɣ], all <r> not followed by <a>,<o>,<ω>,<q> are pronounced [ɣ].


Stems

Stems have the structure CV(CCV)*(C).

Initial and final consonants of normal stems must be voiceless stops. By the sandhi rules, these may mutate into other consonants.

In addition to the voiceless stops allowed in normal stems, the initial consonant of construct state and suffix stems may be <s> or <h>.

The following 21 clusters are allowed inside stems:

<l>, <վm>, <ln>, <վr>, <վv>, <lz>, <վδ>, <m>, <mf>, <ml>, <mv>, <n>, <ŋr>, <ŋx>, <nz>, <nδ>, <nþ>, <xλ>, <fλ>, <sλ>, <þl>.

Due to the restricting vowel harmonies, the first vowel may be one of <a>,<e>,<ı>,<o>,<u> for noun and verb stems and <a>,<ı>,<u> for affix and clitic stems, while all the other vowels are either identical to the main vowel or are <a>. See the harmony tables for information about regular vowel changes.
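
The stem shape described in this section can be checked with a regular expression. This is a sketch only: it covers underlying noun and verb stems (before harmony and sandhi are applied), and the names `CLUSTERS`, `STEM_RE` and `is_normal_stem`, as well as the test stems, are invented for the example.

```python
import re

# The 21 medial clusters listed above.
CLUSTERS = [
    "l", "վm", "ln", "վr", "վv", "lz", "վδ", "m", "mf", "ml", "mv",
    "n", "ŋr", "ŋx", "nz", "nδ", "nþ", "xλ", "fλ", "sλ", "þl",
]
VOWELS = "aeıou"

# CV(CCV)*(C) with voiceless stops at both edges.
STEM_RE = re.compile(
    r"([ptk])([%s])((?:(?:%s)[%s])*)([ptk]?)" % (VOWELS, "|".join(CLUSTERS), VOWELS)
)


def is_normal_stem(stem):
    """True if the underlying stem has a legal shape and legal secondary vowels."""
    m = STEM_RE.fullmatch(stem)
    if m is None:
        return False
    main_vowel = m.group(2)
    later_vowels = [ch for ch in m.group(3) if ch in VOWELS]
    # Later vowels must repeat the main vowel or be <a>.
    return all(v in (main_vowel, "a") for v in later_vowels)


print(is_normal_stem("kap"), is_normal_stem("tulna"), is_normal_stem("tulnet"))
```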

The number of monosyllabic substantive stems is, therefore, 3*5*4=60: three initials (p, t, k), five vowels (a, e, ı, o, u), and four finals (p, t, k, and nothing). Of these 60, there are 15 open syllables.

The number of disyllabic substantives is the number of monosyllabics, multiplied by the number of medial clusters, times 2 (there are only two choices for the vowel), so we have 2520. Of these, 630 end in an open syllable.
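
The arithmetic can be double-checked by enumerating the monosyllabic stems and multiplying out the disyllabic ones. The variable names are arbitrary; the factor 2 for the second vowel is taken directly from the text.

```python
from itertools import product

INITIALS = "ptk"
VOWELS = "aeıou"
FINALS = ["p", "t", "k", ""]  # "" stands for 'no final consonant'
N_CLUSTERS = 21               # medial clusters allowed inside stems
SECOND_VOWEL_CHOICES = 2      # same as the main vowel, or <a>

# Enumerate all monosyllabic stems CV(C).
mono = [i + v + f for i, v, f in product(INITIALS, VOWELS, FINALS)]
mono_open = [s for s in mono if s[-1] in VOWELS]

n_mono = len(mono)            # 3 * 5 * 4
n_mono_open = len(mono_open)  # 3 * 5
n_di = n_mono * N_CLUSTERS * SECOND_VOWEL_CHOICES
n_di_open = n_mono_open * N_CLUSTERS * SECOND_VOWEL_CHOICES

print(n_mono, n_mono_open, n_di, n_di_open)  # 60 15 2520 630
```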

The language will probably be oligosynthetic, so the number of mono- and disyllabic words will clearly be enough. Stem boundaries are clearly marked for their function by the sandhi rules, so no additional morphemes are used to mark boundaries between words. This makes the morphology well-suited for oligosynthesis. The monosyllabic stems will be used for very frequent morphemes.


Stress

Words have a stress pattern, although stress is not phonemic. Stress can be used to support understanding in noisy environments. Words are stressed on each odd syllable (1st, 3rd, 5th, ...), except for the last syllable, which is always unstressed.

kap + paqu + tu ↦ kah  pa  qu  þu
                  1    2   3   last
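
The stress rule is easy to state as code. In this sketch the function names are made up, syllabification is assumed to be done already, and the IPA stress mark is used only for display.

```python
def stress_pattern(n_syllables):
    """True for stressed syllables: odd positions, but never the last one."""
    return [i % 2 == 1 and i != n_syllables for i in range(1, n_syllables + 1)]


def mark_stress(syllables):
    """Prefix each stressed syllable with the IPA primary stress mark."""
    flags = stress_pattern(len(syllables))
    return " ".join("ˈ" + s if stressed else s for s, stressed in zip(syllables, flags))


print(mark_stress(["kah", "pa", "qu", "þu"]))  # ˈkah pa ˈqu þu
```

Note that a word with an odd number of syllables ends in two unstressed syllables, and a monosyllable carries no stress at all under this rule.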

Tone

The language has no phonemic tone.


Prosody

continue

Foreign Words

Tesяfkǝm has a predefined set of letters set aside for writing foreign words, particularly names. Foreign words are written 'more or less' phonetically (not 'more or less' phonemically). This means phonetic writing without being overly precise (whatever that means). In normal Tesяfkǝm texts, using these letters and phonetic style is always preferred over other romanisation styles.

bilab. lab. dent. alv. postalv. retr. alv.-pal. pal. vel. uvul. phar. glot.
vd. vl. vd. vl. vd. vl. vd. vl. vd. vl. vd. vl. vd. vl. vd. vl. vd. vl. vd. vl. vd. vl. vl.
stop b p d t g k չ q '
fricative vh fh v f δ þ z s ž š ż ź ś rc xc r x h
nasal m mh n nh ṅh nьh ñ ñh ŋ ŋh ն նh

In order to embed foreign names into Tesяfkǝm morphology without destroying the self-segregation, all foreign names are prefixed with the stem <pan>- or <pяn>-, followed by a vowel V0 that is not <a> or <я>. No native stem contains a sequence of such vowels. A foreign word is terminated with the suffix V0+<t> (thus repeating the same vowel), to which affixes, clitics, verbs and construct states may be attached as usual. Adjacent vowels at the edges of a foreign word are avoided by the insertion of epenthetic glides (which are written as part of the markers, i.e., <pяnıь> or <pяnıъ>).

V0 is chosen to not be contained in the foreign word to ensure clean marking of start and end of the 'foreign word quotation'. The vowels are tried in the following order: <ı>, <ա>, <u>, <ю>, <e>, <ω>, <o>, <ǝ>. If the foreign word starts with a rounded vowel without any preceding consonant or with a labialised or labial consonant, the rounded vowels are preferred and the priority of vowels becomes <u>, <ю>, <ı>, <ա>, <o>, <ǝ>, <e>, <ω>. For front vowels, <pяn>- is used, for back vowels, <pan>- is used. In the very unlikely case that the foreign word contains all the above eight vowels possible for V0, some vowels in the foreign word must be adjusted to resolve the conflict.
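
The selection of V0 and of the prefix can be sketched as follows. Several things are assumptions of this example rather than part of the rules: the names `choose_marker` and `prefers_rounded`, the list of labial first letters, detecting a labialised consonant by a following <ъ>, and treating accented letters such as <è> as different from the plain vowels when checking whether the word contains V0.

```python
DEFAULT_ORDER = ["ı", "ա", "u", "ю", "e", "ω", "o", "ǝ"]
ROUNDED_ORDER = ["u", "ю", "ı", "ա", "o", "ǝ", "e", "ω"]
FRONT_VOWELS = set("ıюeǝ")    # take <pяn>-; the back vowels take <pan>-
ROUNDED_VOWELS = set("uюoǝ")
LABIAL_INITIALS = set("bpmvf")  # assumption: first letters counted as labial


def prefers_rounded(word):
    """Rounded vowel, labial consonant, or labialised consonant at the start."""
    first = word[:1]
    return first in ROUNDED_VOWELS or first in LABIAL_INITIALS or word[1:2] == "ъ"


def choose_marker(word):
    """Return (prefix, V0) for a transcribed foreign word, or None if all vowels clash."""
    order = ROUNDED_ORDER if prefers_rounded(word) else DEFAULT_ORDER
    for v0 in order:
        if v0 not in word:
            prefix = "pяn" if v0 in FRONT_VOWELS else "pan"
            return prefix, v0
    return None  # the word would have to be adjusted by hand


print(choose_marker("xṙoonıŋên"))  # ('pan', 'ա')
```

Under these assumptions the sketch gives the same prefix and V0 as the transcriptions of 'Venlo', 'Madrid', 'Groningen' and 'Tagliatelle' used as examples in this document.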

Apart from these restrictions, foreign words need not obey any phonology constraints of Tesяfkǝm. It must be expected, of course, that many native speakers of Tesяfkǝm will fail to pronounce foreign names correctly (and typically just ignore various diacritics).

Foreign names are not capitalised in Tesяfkǝm romanisation.

More Rules

The fine art of Tesяfkǝm typesetting tries to obey the romanisation rules closely. Here are some more rules.

  • Long phones are represented by doubled graphemes.
  • Aspirated/breathy voiced stops are marked with a suffixed <h> for source languages that distinguish them from unaspirated voiced stops.
  • Unaspirated voiceless stops are marked with a suffixed <y> for source languages that distinguish them from aspirated voiceless stops.
  • Approximants and lax fricatives are represented as fricatives marked with a suffixed <y>: 'Venlo' = <panu fyènlooъus>
  • Trills are represented as fricatives with a dot below: 'Madrid' = <panu mâdẓıdus>
  • If a word contains a phoneme /r/ that has several different pronunciations in the source language depending on dialect and if no main or standard dialect can be identified and/or if this fact is important to be marked, then the grapheme <ṙ> is used: 'Groningen' = <panա xṙoonıŋênաs>
  • Glottalised and ejective consonants are marked with a suffixed <'>.
  • Labialised consonants are marked with a suffixed <ъ>.
  • Palatalised or palatal consonants are marked with a suffixed <ь>: 'Tagliatelle' = <pяnı tâlьâteleьıs>
  • Velarised or uvularised or velar or uvular consonants not available as a single grapheme are marked with a suffixed <c>.
  • Velar and uvular fricatives are only cleanly distinguished if the source language does that. Otherwise, <x> and <r> are used just like in Tesяfkǝm itself.
  • Lowered vowels are marked with graves: [ɛ, ɔ] = <è>, <ò>.
  • Central vowels are marked with circumflexes: [ɨ, ʉ, ɜ, ɵ, a] = <î>, <û>, <ê>, <ô>, <â>.
  • <ê> is also used to represent a schwa [ə].
  • Other phones are represented by approximations.
October 28th, 2007