Chapter 2
SOUNDS AND WORD-FORMS

2.1: First Impressions 64
2.2: The Design Problem 65
2.3: The Size of the System 66
2.4: The Six Vowels 68
2.5: The Five Regular Vowels /a e i o u/ 68
2.6: The Vowel /y/ 70
2.7: The Seventeen Consonants 73
2.8: Stress 78
2.9: Pause 80
2.10: Intonation 83
2.11: Three Kinds of Words 84
2.12: Structure Words 85
2.13: Names 88
2.14: Predicate Words 91
2.15: The Varieties of Predicates 94
2.16: Djifoa ("Affix") Shapes 97
2.17: Predicate Joints 98
2.18: Resolving Words 100
2.19: Summary 103

2.1 First Impressions

When you first hear Loglan spoken it will probably remind you of Spanish or Italian. It has the same short list of curtly spoken vowels, and the same rhythmic alternation between consonants and vowels dominates the speech-flow. Spoken slowly Loglan has the same strongly marked pattern of stresses and pauses that characterizes these Romance languages. Yet at the same time, and like them, it is also capable of being spoken with great speed, the effect then being of long staccato bursts of evenly stressed syllables.

Like these languages, too, we expect that the sound system of Loglan will be very easily mastered. No new sounds are required of a speaker of English, for example (with the exception of the allophone of h in syllable-final position, added in the 2013 reimplementation), and its word-forms are extremely simple and regular. Some of the rarer consonant combinations will seem odd to an English-speaking learner but none will be difficult to either hear or pronounce. The biggest difficulty for the English-speaker will be in learning to read; for some of the letters of the English alphabet are used differently in Loglan. But since spelling in Loglan is strictly phonemic--each instance of what we perceive to be the "same sound" will always be represented by the same single letter--even this difficulty is more apparent than real.

It should also be noted that the resemblance to Romance languages can be overstated. Loglan permits some hard consonant clusters, such as that in likcke, "week", which might be hard for speakers of Romance languages, though not usually for English speakers.

In short, and on first acquaintance, spoken Loglan will have a natural, familiar air. Appearances are deceiving, however. Loglan is in fact a very strange language; even its sound-system has properties which no natural language has. One of these is that no matter how swiftly spoken, any string of Loglan sounds will automatically resolve into a unique string of separate words. Moreover, it will have this property even for a newcomer who does not know these words. Some natural languages exhibit this useful property in some degree but none possess it perfectly. It will be the central theme of this chapter to show how this peculiar advantage for the learner is built into the sound-system of Loglan.

2.2 The Design Problem

The scientific plan for Loglan required that it incorporate a maximum amount of phonetic recognizability into its basic words.¹ This meant devising a sound system in which to build those words that would provide good phonetic approximations of the major distinctions, at least, that occur in the most widely spoken natural languages. At the same time it was hoped that a phonetic system that could do this would remain modest in size, regular in pattern, and present no insuperable pronunciation difficulties to speakers of any widely-spoken language. The system that resulted was a compromise; for it contains some sounds that are definitely not universal, and some sound clusters that will be difficult to pronounce for native speakers of some languages. But the inclusion of these sounds and sound patterns has added notably to the regularity of the language, or to the over-all recognizability of its word-stock, or to both.

The languages selected in 1955 to measure recognizability in Loglan were the eight then most widely spoken languages in the world: English, the Beijing dialect of Chinese, Hindi, Russian, Spanish, French, Japanese and German. At the time, native and secondary speakers of these eight "source" languages comprised nearly 80% of the population of the earth. Today, they constitute a good deal more. Our source population was limited to speakers of just these eight because the trial inclusion of the ninth language, Arabic, made no significant improvement in the average recognition scores of Loglan words and did complicate the word-building process. So no languages beyond the first eight were included.

In contrast, the lengthening of the originally very short list of Loglan speech sounds did make some difference in recognition. The inclusion in 1958 of the widely used consonantal sounds c and j--c is the English [sh]-sound ([ʃ] in the International Phonetic Alphabet) and j is the French 'j' of 'Jacques' and English 's' of 'vision' ([ʒ] in IPA) and spelled [zh] in the phonetic guides--increased the recognition scores of the new words made with them from an average probability of about .35 of being recognized to one of about .50, a very substantial improvement indeed. The sounds [sh] and [zh] occur in one form or another in all the source languages. They had been excluded from the earliest Loglan phoneme set because neither is uniformly represented by a single letter in the European languages. In particular, the letters 'c' and 'j' do not always spell these sounds. Representing them as c and j has therefore diminished readability just as using them has increased the recognizability of spoken words.

Another improvement in recognition scores--this time at no cost in readability--was made by popular demand in 1977 when h was added to the language. This time the increase in recognition scores was smaller: an average of about .05 probability points was added to the scores of words remade with h. The sound of h is not universal in the source languages, however, being missing from Spanish² and all but missing from French. Curiously enough the letter 'h' does occur in these languages, but is silent in Spanish and often pronounced with a "glottal stop" (a very short silence) in French. So either in sound or appearance, h was already familiar to the source population and so could be added at relatively small cost.

In 1982 the letter y was added. Four years later the last three letters of the English alphabet, namely q, w and x, were added to the Loglan set, but this has been reconsidered: it is clear that these letters were being purged from the vocabulary in the 1990's and they have been formally banished in the reimplementation since 2013.^R1

So Loglan sounds are, in the end, not quite universal. The sound of [h] does not occur in either French or Spanish. Even the original list included v and l; and there is no [v]-sound in Chinese, and neither [v] nor [l] occurs in Japanese. But the [r:l] contrast is absolutely necessary to distinguish a host of Indo-European word-roots, especially the Graeco-Latin ones used in science; and [v], while not important in itself, is the voiced equivalent of [f], which is widespread. In fact the chief reason for the inclusion of v was to preserve the symmetry of the phonological system, a feature that is now thought by linguists to be minimally conserved even under conditions of swift phonological change. Thus if Loglan is to have a stable phonological structure, it is obviously better to start with a symmetrical phonology than an asymmetrical, and hence potentially unstable, one.

2.3 The Size of the System

The phonological size of a language can best be measured by the number of its phonemes, for these are the elements which must be separately mastered before it can be spoken or understood. A phoneme is a class of sounds in a given language all members of which are perceived by the speakers of that language as instances of "the same sound." We shall see that what constitutes the psychological sameness of two sounds for the speakers of a language depends more on how these sounds are distributed in the utterances of that language than on their acoustic similarity. But however the boundaries of a phoneme are determined, it is important to remember that a phoneme is not one sound, or even an acoustic asymptote toward which speech tends, but a group of sounds which may include several which seem acoustically quite different to the foreign ear.

Loglan has 27 phonemes: 6 vowels, 17 consonants, 3 varieties of stress, and one juncture or pause. This contrasts well with the size of English, which by a similar count has 45 phonemes, and it falls at the low end of the human range, which lies between about 15 phonemes per language and about 85. Only Hawaiian is smaller than Loglan among the languages whose names you are likely to know.
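The tally above can be checked with a few lines of Python. This is only an illustrative sketch: the variable and function names are hypothetical, and the consonant letters are the seventeen listed in Section 2.7.

```python
# Illustrative tally of the Loglan phoneme inventory described in the text.
# All names here are hypothetical; only the counts come from the chapter.
VOWELS = list("aeiouy")                                   # 6 vowels
CONSONANTS = "b c d f g h j k l m n p r s t v z".split()  # 17 consonants
STRESS_LEVELS = ["zero", "light", "heavy"]                # 3 varieties of stress
JUNCTURES = ["pause"]                                     # 1 juncture or pause


def phoneme_count():
    """Return the total number of phonemes across the four classes."""
    return len(VOWELS) + len(CONSONANTS) + len(STRESS_LEVELS) + len(JUNCTURES)


print(phoneme_count())  # prints 27
```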

Now in itself there is no virtue in smallness. The fewer the sound classes in a language, the longer its words must be; and there is certainly no virtue in long words. But it is also true that the fewer the phonemes of a language are, the larger the average phonetic distance between their sound values can be, and this last property, it turns out, does contribute greatly to intelligibility. Loglan is likely to be spoken in many quite different dialects. The mutual intelligibility of any two of those dialects will depend greatly on the readiness with which the phoneme boundaries of one dialect can be lined up with, or tuned in on, those of the other. In a language with numerous phonemic boundaries, such as French or English, this mutual realignment between dialects is difficult even for native speakers to make. The phonetic wavelengths are, so to speak, already too closely packed. This is probably the reason why dialects in phonemically large languages tend to be more difficult to understand than in phonemically smaller languages like Spanish or Italian. In a language like Loglan, which is almost as small as it can be, dialectal confusion can be expected to be at a minimum. There is, for example, only one class of /a/-sounds in Loglan. Therefore, it scarcely matters whether the letter 'a' is pronounced in Loglan as a Spaniard would pronounce it, or as a Frenchman or a Bostonian would. For there is no other class of Loglan sounds into which any of these national varieties of /a/ is likely to be sorted.

Throughout this book we will assume that it is the American English dialect of Loglan we are describing. This may not turn out to be the "standard dialect" of Loglan, if there ever is one. As we have already noted, Loglan has closer phonological affinities with vowel-rich languages like Spanish than with consonant-rich ones like English. (By a "vowel-rich" language I mean one with a high incidence of vowels in the speech stream.) Still, English, Russian, French, German, Hindi, Japanese, Chinese, Arabic and Swahili students of Loglan are invited to pronounce any of its phonemes in any way which is "natural" in their languages. Eventually there may be a standard Loglan r. But at the moment the differences between the French uvular /r/, the Spanish trilled one, the English vocalic /r/ and the Chinese fricative one, are simply not phonemic in Loglan. They are, in short, differences that make no difference in Loglan.

In the next few sections we shall describe the four classes of Loglan phonemes--vowels, consonants, pauses and stress--and then turn to the few simple rules that govern their occurrence in the speech-flow.³

2.4 The Six Vowels

The six Loglan vowels can be further divided into two groups: the five main vowels or regular vowels, a e i o u pronounced [ah eh ee oh oo], which may occur in any word of the language, and the sixth vowel, y pronounced [uh], which may occur only between the terms of a complex word--in this, it is rather like a hyphen--or, with more freedom, in names. We will devote Section 2.6 to the uses of this special vowel.

2.5 The Five Regular Vowels /a e i o u/

The five main vowels, which we call the regular vowels, are the five most commonly heard human vowels. They are heard in all our source languages although with some variation from language to language. They are a [ah] as in 'father' (in the French and Spanish dialects it will probably be the higher-pitched [aa] of 'la' and 'casa'), e [eh] as in 'met', i [ee] as in 'machine', o [oh] as in 'note' and u [oo] as in 'lute'. They are all short, "pure" vowels with no trace of the diphthongalization that is so common in English. Loglan No, for example, sounds a little angry...it is so curt. In this, it is exactly like a Spanish or Italian 'No!' if you have heard one recently. It has none of the prolonged, softened, almost disyllabic effect of English 'No-ooo!' [NOH-oo]. The other Loglan vowels are similarly shortened. Note that i and u have values that are rare for these letters in English. But if you speak other European languages you will probably recognize that [ee] for 'i' and [oo] for 'u' are in fact the most common pronunciations of these letters on that continent. Note also that [ah eh ee oh oo] are in fact very widely spaced sounds. Replacement of one of these sounds with a variant found in another major tongue is not going to cause confusion: one must caution the English or Russian speaker, however, against replacing a final a (or other unstressed vowel) with the short 'a' of 'sofa' ([ə] in IPA), something they are naturally likely to do, as this is the sound of Loglan y.⁴

The sounds of four of the main vowels vary systematically with the phonemic contexts in which they occur. When i precedes a vowel, as it does in the word ia (meaning 'yes'), it has the value of English 'y' as in 'yes'. This gives ia a sound that we could write comfortably in English as 'yah'; and [yah] in fact is how we will represent the word ia in the pronunciation guides in this book. Similarly, when u precedes a vowel, as it does in ui ('gladly'), it has the value of English 'w', and that makes this word sound like [wee]. (The gleeful note is not accidental.) Similarly, uo is 'woe' [woh], io is 'yo' [yoh], and iu is pronounced like English 'you' to which our guide will be [yoo]. In general [y] and [w] will guide our pronunciation of i and u whenever they occur before vowels.

These consonant-like uses of i and u are regarded as instances of vowel phonemes in Loglan, and not as separate consonants, because they are distributed complementarily with the more usual, vocalic values of these letters. Thus, the [y]-value of i never occurs before consonants and the [ee]-value of i never occurs before vowels. So from the point of view of the Loglan-speaker--whatever his linguistic background--they may be safely regarded as instances of the same sound. That they will actually come to be so regarded by your own ear is a prediction we can confidently make on the basis of our knowledge of similar effects in the natural languages.

A third vowel that experiences contextual variation is e. Before vowels e has the curt, tense sound of [eigh] in 'eight' ([e] in IPA), which is the same sound the letter 'a' has in English 'late', 'mate' and 'sate' spoken briskly. We represent this value of e by [eigh] in the guides because 'eigh' is the only spelling of this sound in English that never means anything else. The sound [eigh] is always the value of e when it occurs before vowels...as it does, for example, in meo [MEIGH-oh], a Loglan word that is pronounced exactly like English 'Mayo'. Any e-initial vowel-pair in Loglan has the sound [eigh] in first position: [eigh-ah] and [eigh-oo] are the pronunciations of ea and eu. The [eigh] value of e never occurs before consonants.

Writing in 2022, we add a warning here. The choice of [eigh] as the representation for the alternative pronunciation of e is unfortunate, as for many if not all English speakers this stands for a diphthong. It is absolutely essential that e be a pure vowel in all contexts, not a diphthong ei. This is not driven by eccentric views of ours: in Notebook 3 it is clearly explained that (for example) lea and leia are distinct possible cmapua. So e cannot have the usual diphthong pronunciation of English long a, which is for this English speaker the pronunciation of most if not all occurrences of English [eigh]. The references to Romance e, a pure sound, are more appropriate.

A more common value of e is the relaxed sound it has before consonants, which is the sound it has in English 'let', 'met' and 'set' ([ɛ] in IPA). This second value of e is represented in the guides by either [eh] or [e], the former being a better guide to its sound when the e is final in a Loglan syllable, e.g., [meh], while the latter is unequivocal when the e is between consonants, e.g., as in [met]. Thus the guides to the pronunciation of Loglan le me se will be [leh meh seh], but of let met set, which contain the same relaxed e-sound, the guides [let met set] will be unmistakable. Perhaps the most frequent pronunciation error made by both English- and Romance-speakers learning Loglan is to use the tense value [eigh], saying *[leigh meigh seigh] for le me se. (The asterisk '*' indicates that this is an incorrect pronunciation.) If your native language disposes you to make this error, please take the time now to notice that these three Loglan words--le me se--have the same sounds as English 'let' 'met' 'set' with the final 't's dropped off. In fact, the best way to learn to say Loglan le (which means 'the') correctly is to start to say 'let' and then surprise yourself by stopping with 'le-'. In short, le is an unfinished 'let'.

Sound-pairs, such as the [e/eh eigh] values of Loglan e, the [ee y] values of Loglan i, and the [oo w] values of its u, which are found in complementary distribution in some languages, are called allophones of some phoneme in that language. Among Loglan vowel phonemes there is one other important allophone-pair. The phoneme o has the value [oh] (IPA [o]) except before i or r. In just these two contexts o has the value of [aw] in English 'law' (IPA [ɔ]). The same variation occurs in English. For example, notice how the value of English 'o' shifts from [oh] to [aw] in going from 'note' to 'nor', and how the same shift occurs phonemically in going from 'mower' to 'more' ([MOH-rr] to [mawr]) in English. Also note that the sounds of English 'noise' are in many dialects more closely approximated by the phrase 'gnaw ease' than by 'no ease' ([NAW-eez] and [NOH-eez]); for in the i-context, too, this same shift in the value of 'o' occurs in most dialects of English. Thus in Loglan as in most languages, if o is followed by r or i, a shift from [oh] to [aw] occurs.
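The contextual rules for the regular vowels described in this section can be summarized as a small lookup. The following Python sketch is only an illustration: the function names are hypothetical, "before a vowel" is assumed here to mean before one of the five regular vowels, and the returned strings are simply the guide spellings used in this chapter (including [eigh], which stands for a pure vowel, not a diphthong).

```python
REGULAR_VOWELS = set("aeiou")


def vowel_value(vowel, following=""):
    """Return the guide spelling of a regular vowel, given the next letter.

    `following` is the letter right after the vowel, or "" if there is none.
    Only the allophone rules stated in this section are encoded.
    """
    before_vowel = following in REGULAR_VOWELS
    if vowel == "i":
        return "y" if before_vowel else "ee"
    if vowel == "u":
        return "w" if before_vowel else "oo"
    if vowel == "e":
        return "eigh" if before_vowel else "eh"
    if vowel == "o":
        return "aw" if following in ("i", "r") else "oh"
    if vowel == "a":
        return "ah"
    raise ValueError(f"not a regular vowel: {vowel!r}")


def vowel_string_guide(vowels):
    """Spell out a string made only of regular vowels, e.g. 'ia' or 'ui'."""
    parts = []
    for k, v in enumerate(vowels):
        nxt = vowels[k + 1] if k + 1 < len(vowels) else ""
        parts.append(vowel_value(v, nxt))
    return "".join(parts)


for word in ["ia", "ui", "uo", "io", "iu"]:
    print(word, "->", vowel_string_guide(word))
```

Run on the little words cited above, the sketch reproduces the guides [yah], [wee], [woh], [yoh] and [yoo].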

2.6 The Vowel /y/

Y is not a consonant in Loglan but a vowel. Its sound is represented in the guides by [uh] which is the value of English 'a' in 'sofa' and 'above', of 'o' in 'of' and 'above', of 'u' in 'up' and 'under', and of 'e' in unstressed 'the'. The sound [uh] is a short, grunted vowel that is seldom stressed and is very common in the Germanic and Slavic languages but rare in Romance tongues. Thus in the special character strings used to explain Loglan pronunciation in this book, English 'love' would be written [luhv], 'above' would be written [uh-BUHV], and 'sofa' would be written [SOH-fuh]. See the page before the Foreword (p.14) for the whole list of English spellings used as guides. An alternative value for Loglan y which is safe for use and might often be useful is the value of "oo" in English "look" (not the value of "oo" which JCB uses to represent the sound of Loglan u).

As mentioned briefly in Section 2.4, y's most important function will be to serve as a kind of spoken hyphen. For example, the Loglan word for 'eye-doctor' is [MEK-uh-kyoo] or mekykiu. The first part, mek-, is derived from the word for 'eye', which is [MENG-kee] spelled menki, and the second, -kiu, from the word for 'doctor', which is [KEESH-moo] spelled kicmu. But if mek- and -kiu were spoken without the separating hyphen [uh], that fact would be lost; for the word would come out [MEK-yoo]. This could only be understood as the two-word phrase me kiu and not as a word at all. Inserting -y- not only holds mek- and -kiu together but also ensures that both k's will be heard. When y is playing this hyphenating role in complex words it is never stressed. In fact, it is a "non-syllable": one that is not even counted in fixing stress.

Y also occurs as an ordinary sound in names. For example, [huhnt] Hynt is the Loglan transcription of the English surname 'Hunt', and [SUHM-trr] Symtrr transcribes the surname 'Sumter'. (The sound [rr] in this second name is, as we shall see later, a "vocalic r".) In its name-spelling role y may occasionally be stressed. Y does not occur in borrowings.

So the two uses of y are as a spoken hyphen and to give accurate respellings of some borrowed names.

2.7 The Seventeen Consonants

The consonants of Loglan are represented by the letters

b c d f g h j k l m n p r s t v z.

These are all the English consonant letters except 'q', 'w', 'x' and 'y'. We have just seen that 'y' is a vowel in Loglan; the other three missing letters do not occur in the language. Of the seventeen consonant letters that do have uses in regular Loglan words, all but two have the same pronunciation in Loglan that they have most commonly in English. Thus s as in 'bus', but never as in 'busy'; g as in 'gate', but never as in 'gem'; k as in 'kit', but never silent as in 'knot'; and so on. The two exceptional letters are c and j. These we have already learned have the values of English [sh] and the French [zh] of 'j', this last also being the rather rare American English sound of 'z' in 'azure' or of '-ge' in some pronunciations of 'garage'.⁷ In the guides we will use [sh] and [zh] to cue the pronunciation of these two oddly-spelled sounds.

Of the two letters c and j, c is the only real stumbling-block in reading Loglan. Do you not agree that seeing jo as a new kind of 'Joe' (with the soft, French sound in 'azure') will be easier for your eye to do than learning to see 'shoe' in Loglan cu? Or 'sheep' in cip? Though seeing the German's 'schnapps' in cnaps might be a little easier. Still, while the letters c and j will be troublesome for your eye, neither sound will give your ear or tongue the least trouble. Say [zho shoo sheep zheep shnahps] and you have pronounced the Loglan letter-strings jo cu cip jip cnaps, an odd mixture of French and German sounds.

You may wonder how it happened that the letters c and j acquired these odd uses. You may recall that they stand for two sounds that were added to Loglan fairly early in the design work and that they made a very important contribution to the over-all recognizability of spoken Loglan. The reason I was reluctant to include these two sounds in the original phoneme list was not because they were not very widely used--because in fact they are--but because [sh], at least, is not spelled with the same single Latin letter in any group of languages that I knew of. (It is spelled with two different single letters in two languages, however: in Hungarian, where it is spelled by the letter 's', and in Portuguese, where it is one of the many values of the letter 'x'.) The sound [zh], in contrast, is spelled with the same single letter in at least four Latin alphabets: those of French, Turkish, Roumanian, and Portuguese, in all of which [zh] is spelled with 'j'. (Both [sh] and [zh] have single letters in Russian, which uses the Cyrillic alphabet. But the Latin alphabet is by far the most widely used alphabet on the planet and it was obvious it must be the Loglan alphabet as well.) Yet despite their irregular natural spelling, [sh] and [zh] not only proved to have great practical importance for building recognizability into Loglan, but the phonological symmetry of the language would in fact be weaker without them.

When it was clear that the sounds [sh] and [zh] had somehow to be accommodated in the Loglan version of the Latin alphabet, the letter j was an obvious choice for [zh], not only because [zh] is always represented by that letter in French and three other languages, but because it is also a component, on Loglan phonemic analysis at least, of the English sound of 'j' in 'judge'. This very common English segment is actually composed of two sounds: [d] + [zh]. (This can be heard by constructing the word 'badger' from 'bad' followed by 'azure' stripped of its initial vowel: thus 'bad' + '-zure' = 'badger'.) This means that the sound 'j' usually has in English will be represented by a consonant pair in Loglan, namely the pair dj.

The letter c is a less obvious candidate for the [sh]-sound; but it is an inevitable choice even so. For one thing, the work 'c' usually does in other Latin alphabets is done in Loglan by k and s. For another, the letter c is not entirely unacquainted with the [sh]-sound in other languages. It appears in the German trigraph 'sch', which is the most familiar spelling of [sh] in German ('schuh', 'schwein', 'schiff', and so on); and it is the characteristic element in the French digraph 'ch' which is the invariant spelling of the [sh]-sound in French ('chez', 'chef', 'chien', and so on). In yet another language, Italian, a 'c'-containing digraph, 'ci-', is always used to write the sound we write in English with 'ch'...for example, in 'ciao', where it has exactly the same sound as English 'chow'. But Italian 'ci-' and English 'ch' also conceal a composite sound: in this case one composed of [t] + [sh], or Loglan tc. Thus the word we would write in English as 'cheap' may be rewritten in Loglan phonemes as tcip. (This can be heard by building the sounds of 'cheap' from 'fat sheep' by leaving off the 'fa-'. Thus 'fat' + 'sheep' - 'fa-' = 'cheap'. Evidently 'cheap' could be written 'tsheep'.) So tc and dj are in fact very similarly constituted consonant groups in Loglan, and they have symmetrical roles in the language. We will sometimes represent tc by [ch] and sometimes by [tsh] in the guides. Similarly, dj may be represented by [dzh] or sometimes simply by [j]. In each case, these are the same sounds.
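The decomposition of English 'ch' and 'j' into the Loglan pairs tc and dj can be made concrete with a tiny letter-to-guide mapping. This is a minimal sketch with hypothetical names; it covers consonant letters only, and it assumes every consonant other than c and j is its own guide, as the text describes.

```python
# Only c and j depart from their usual English values in the guides.
CONSONANT_GUIDE = {"c": "sh", "j": "zh"}


def consonant_guides(letters):
    """Map each Loglan consonant letter to its pronunciation-guide spelling."""
    return [CONSONANT_GUIDE.get(ch, ch) for ch in letters]


print(consonant_guides("tc"))  # the sounds heard in English 'ch'
print(consonant_guides("dj"))  # the sounds heard in English 'j'
```

The two results, t + sh and d + zh, are the composite sounds the guides abbreviate as [ch] (or [tsh]) and [j] (or [dzh]).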

Before leaving the consonant sounds, we must mention that there is one complementarily distributed pair of them in Loglan. (A "complementarily distributed" pair of sounds are such that if one ever appears in any context, the other never appears in that context; and vice versa.) These are the two allophones of the phoneme n. The first allophone is the usual English value of [n] as in 'new'. But this sound never occurs before k or g in either English or Loglan. The other is the [ng]-sound of 'sing', which never occurs elsewhere in Loglan. So the [n] of 'new' and the [ng] of 'sing' ([ŋ] in IPA) are in fact complementarily distributed sounds in Loglan. This same allophonic shift in the value of /n/ also occurs in English. Notice that the 'n' of English 'bank' does not have its usual value (i.e., the word is not 'bann-ickk') but the value of 'ng' in 'bang'. (Thus 'bang kit' and 'bank it', if spoken pauselessly, are indistinguishable in English.) But [n] and [ng] are not complementarily distributed sounds in English; for notice such contrasting English pairs as 'sin'/'sing', 'thin'/'thing', and so on. So in other contexts than before /g/ and /k/ the [n:ng] contrast is in fact phonemic in English. Not so in Loglan. The [ng]-sound does not occur anywhere else in Loglan except as a g/k-preceding allophone of the phoneme n. It may also be the pronunciation of n before h when the latter takes the phonetic value of the final ch in Scottish English "loch".
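Because the choice between the two allophones of n depends only on the next letter, it can be written as a one-line test. The sketch below uses a hypothetical function name and makes two simplifying assumptions: it ignores the optional [ng] before h, and it does not treat the doubled, syllabic nn of names specially.

```python
def n_values(word):
    """Return the guide value of each n in a Loglan word, in order.

    An n is [ng] when the next letter is k or g, and [n] otherwise.
    """
    values = []
    for k, ch in enumerate(word):
        if ch != "n":
            continue
        following = word[k + 1:k + 2]  # "" when n is the last letter
        values.append("ng" if following in ("k", "g") else "n")
    return values


print(n_values("menki"))  # n before k
print(n_values("mrenu"))  # n before a vowel
```

The first word, menki ('eye'), comes out with [ng], matching its guide [MENG-kee]; the second, mrenu, keeps the plain [n].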

The fact that there is no [ng] phoneme in Loglan will put loglanists with names like Young in a quandary. They will be obliged to choose among three not very handsome alternatives: [yuhn], which is spelled Iyn; [yuhngk], which is spelled Iynk (and rhymes with 'junk'); and a word spelled Iyng but pronounced [yuhngg], in which the second [g] is the hard 'g' of 'get', and rhymes with what 'bungle' would sound like if the '-le' sound at the end were left off...a common pronunciation of 'Young', in fact, in New York City. The last alternative looks best because it would be spelled Iyng. But unfortunately it cannot correctly be pronounced [yuhng]. On the other hand, German [YUHNG-krr] spelled 'Junker' can be exactly reproduced as Iynkr; for in the German word the [ng] allophone of n is called for by the following k. There is still more on transcribing natural names into Loglan in Section 2.13.

Finally, the seventeen consonants may be classified in a way that will be useful when we come to consider the pronounceability of adjacent pairs of them in the unbuffered dialects of the language. Crucial for this question is whether the consonant has a vocalic allophone or not, that is, whether it can be sustained in full musical voice like a vowel. If it does, it is called a continuant. There are four of these continuants in Loglan: the two nasals, m and n, and the two liquids, l and r. Because any of these four consonants can be given a vowel-like quality, any of them may be pronounceably paired with any other consonant. Also, they are very useful in respelling foreign names taken from languages, like English, which happen to exploit this vowel-like potential of their continuants. For example, the sound spelled 'er' in the American pronunciation of 'Robert' is a vocalic /r/; so the Loglan transcription of American 'Robert' is Rabrrt. The first r in this word is a consonant and it will be represented in the guides as [r]. But the second, doubled, rr is really a vowel. To show the difference between the two values of Loglan r in the pronunciation guides, we will write the vocalic r as [rr]...rather as one might record a dog's growl in a children's story as 'Grrr!' Thus the pronunciation of the Loglan word Rabrrt is [RAH-brrt]. There is an ordinary r in first place and a "growly rr" in second.

Each of the four continuants has two such allophones in Loglan, one of which might be called the consonantal, and the other the vocalic version of its sound. The consonantal version of a continuant will always be represented by the letter itself in the pronunciation guides, thus as [m n r l]; the vocalic version by the doubled letter: [mm nn rr ll]. This explains why the English names 'Earl', 'Myrtle' and 'Burton' will seem to have no vowels at all when they are rewritten in Loglan: Rrll (I think Rrl is a viable alternative --MRH), Mrrtll and Brrtnn. When we write out the pronunciation guides for these words, we get [RR-ll], [MRR-tll] and [BRR-tnn]; in both the spellings and the guides we see from the doubled letters that they do have "vowels", namely the continuants used as vowels. We have imposed the same rule in Loglan orthography as well as in the pronunciation guides that "vocalic" consonants (which are better referred to as "syllabic consonants") are doubled, as part of the reimplementation of the language from 2013 on.⁸
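Since the syllabic continuants are written doubled, they can be picked out of a name mechanically. The following is a rough sketch with hypothetical names; it assumes that any doubled m, n, r or l in a name is a syllabic consonant, which is what the doubling convention described here suggests but is not a complete account of Loglan orthography.

```python
import re

# A doubled continuant letter marks the vocalic (syllabic) allophone.
SYLLABIC = re.compile(r"mm|nn|rr|ll")


def syllabic_continuants(name):
    """List the doubled (syllabic) continuants in a Loglan name, in order."""
    return SYLLABIC.findall(name.lower())


for name in ["Rabrrt", "Rrll", "Mrrtll", "Brrtnn"]:
    print(name, syllabic_continuants(name))
```

For Rabrrt the sketch finds only the second, "growly" rr; for Mrrtll and Brrtnn it finds the two "vowels" that the spellings otherwise seem to lack.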

The thirteen remaining consonants have no vocalic values. Twelve of them may be further divided into the voiced series, b v d z g j, and the unvoiced series, p f t s k c. Voice is the quality a consonant acquires when its production is accompanied by the vibration of the vocal cords. (The presence or absence of voice can be sensed by placing the tips of your fingers on your larynx, or Adam's apple, as you speak. For example, you may notice that the vibration due to voice disappears when you whisper any normally voiced sound.) All vowels and all continuants are, of course, voiced; but only half of the non-continuants are voiced. The other non-continuants are, in fact, whispered (i.e., voiceless) versions of the voiced series, as you can easily find out by trying to whisper the series b v d z g j. What you will actually produce by whispering is the unvoiced series p f t s k c. Thus each voiced non-vocalic consonant has an unvoiced equivalent, and vice versa. The sound b is the voiced equivalent of unvoiced p, v of unvoiced f, d of unvoiced t, and so on. This principle will be useful when we come to consider the pronounceability of certain consonant groups in constructing words.
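The one-to-one pairing of the voiced and unvoiced series is easy to tabulate. This is a small illustrative sketch; the names are hypothetical, and the pairing simply follows the order in which the two series are listed above.

```python
VOICED = "bvdzgj"
UNVOICED = "pftskc"

# Pair the two series position by position: b-p, v-f, d-t, z-s, g-k, j-c.
TO_UNVOICED = dict(zip(VOICED, UNVOICED))
TO_VOICED = dict(zip(UNVOICED, VOICED))


def devoice(consonant):
    """Return the unvoiced ("whispered") equivalent of a voiced consonant."""
    return TO_UNVOICED[consonant]


def voice(consonant):
    """Return the voiced equivalent of an unvoiced consonant."""
    return TO_VOICED[consonant]


print([devoice(c) for c in VOICED])
```

Whispering the voiced series in this sense yields p f t s k c, as the text predicts.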

The aspirate h is the seventeenth consonant and has an allophone as well (the only Loglan allophone which is not a sound in most dialects of English) since the 2013 reimplementation. Since h is a burst of voiceless breath, it takes the acoustic shape of the following vowel. When h is not followed by a vowel, the aspirated pronunciation is not possible, and it is instead pronounced as ch in Scottish English "loch" (the hard pronunciation of German "ch"); it is optionally permitted to have this pronunciation in other contexts.^R11

2.8 Stress

Stress is an increase in the length, pitch or loudness of a syllable relative to other syllables in the same word or utterance. Three levels of stress are phonemic in Loglan: zero, light, and heavy (or emphatic) stress. In general, light stress is used to accent the syllables of words and heavy stress to emphasize some words in some sentences. Zero or no stress usually characterizes strings of monosyllabic words in Loglan and provides the monotonic background against which the tunes of light and heavy stress are played.

If we look at the stress-privileges of the syllables comprising Loglan words, we find that they fall into three exclusive classes: (i) those that are always-stressed, whether lightly or heavily; (ii) those that are never-stressed; and (iii) those that are sometimes-stressed (and sometimes unstressed). The always-stressed syllables are the penultimate, or next to the last, syllables of a certain form-class of Loglan words called "predicates", which we will define in Section 2.14. The never-stressed syllables are all the other syllables of predicate words; and the sometimes-stressed syllables are all the syllables of all the non-predicate words in the language.¹⁰ We shall make use of these stress-classes again when we discuss how the various types of words are resolved in speech.

In the pronunciation guides to spoken Loglan we will write unstressed syllables in lower case letters (e.g., [leh-toh] for le to = 'the two'), lightly stressed syllables in capital letters (e.g., [leh-toh-MREH-noo] for le to mrenu = 'the two men'), and emphatically stressed syllables in bold-face capitals; e.g., [leh-toh-ZHOON-tee-MREH-noo] for le to junti mrenu = 'the two young men'; and, as in English, we will underline the emphasized word in text. Later on we will replace these phonetic guides with phonemic transcriptions in which the same stress conventions will apply. For example, /leto/, /letoMREnu/ and /letoJUNtiMREnu/ transcribe the same spoken strings phonemically, that is, as a string of phonemes. In these more compact transcriptions of Loglan speech we will pay less attention to the sounds of individual phonemes and more to the contours of stress and pause that shape the phoneme stream. But obviously you will need to know more about how the phonemes themselves are produced in their various settings before these phonemic transcriptions will be of much use to you. So for a while longer we will attempt to convey the audible rhythms of Loglan speech through these longer, but phonetically more informative, pronunciation guides.

A word about pauses...or rather, about pauselessness. In normally rapid speech in all languages, there are no pauses between words. Pauses, when they do occur, are usually grammatically or morphologically significant. So when the normal pronunciation of a Loglan utterance can be expected to be pauseless, we shall omit the spaces between words in both kinds of transcription, as in fact we have done in all the transcriptions given above. Thus as normally spoken 'The two young men' and Le to junti mrenu are pauseless "blurts" of sound in both languages. This fact is better conveyed about Loglan by /letoJUNtiMREnu/ than by [leh-toh-ZHOON-tee-MREH-noo], with its more languid marking of syllable joints, just as 'ThetwoyoungMEN' would bring the pauselessness of normal English to the reader's attention much more dramatically than 'The-two-young-MEN' would. But the present advantage of a clearly syllabified pronunciation guide is that it shows the English-speaker exactly how each Loglan syllable is to be produced. You will find this kind of phonetic detail useful for a while longer. If you use this information well, you will soon be able to listen to "good Loglan" coming from your own lips. This will speed up your learning immeasurably.

When a pause does occur in Loglan speech we will represent it by a period (full-stop) '.' in both kinds of transcription. Thus [leh-TOH . ZHOON-tee-MREH-noo] and /leTO . JUNtiMREnu/ both show that, when the number word in Le to junti mrenu is stressed, a pause must follow it. The same stress-pause pattern often occurs after emphasized words in English: 'The two (pause) young men'. While variable in English, this is quite a general rule in Loglan. No stressed syllable may be allowed to precede a predicate unless there is an intervening pause. There is more on this in Section 2.12.

The phrase Le to junti mrenu happens to contain instances of all three stress-classes of Loglan syllables. Thus Le and to are sometimes-stressed monosyllables. Indeed, as we have just seen, to may either be emphasized or left completely unstressed. The first syllables jun- and mre- of junti and mrenu are, in contrast, instances of the always-stressed penultimate syllables of predicate words. And the final syllables -ti and -nu of these same two disyllables are instances of never-stressed syllables. Thus the emphatic pronunciation of junti is not */JUNTI/ (the leading asterisk indicates that the sequence so-marked is impermissible) but /JUNti/.

As part of the reimplementation of the language starting in 2013, we installed new ways of representing stress in Loglan orthography. The hyphen - is reserved to represent a syllable break. The apostrophe ' may replace a syllable break or appear in final position to indicate light stress on the preceding syllable; the asterisk * may appear in the same positions to indicate heavy stress on the preceding syllable (initial appearances of * continue to mark incorrect utterances). In the 2013 orthography, we use the comma to represent pauses: we have ensured that an explicit comma can always be inserted where a pause is allowed or required. It should further be noted that the parser now understands syllable breaks, stresses and pauses. The parser does NOT support stress by capitalization. We do not replace JCB's language about his transcription techniques above because it is not our intention to replace all of his transcriptions, though we may add some in our own style. Le to junti mrenu in spaceless style would be le-to-jun'ti-mre'nu or just letojun'timre'nu; the other syllable breaks are optional but the stress must be marked for the reader to understand where the word boundaries are. The form with stressed to would be written leto', jun'timre'nu.

The advantage of our form of transcription is that the current parser understands it (spacelessness and all) as well as the ordinary Loglan orthography.
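The 2013 stress notation described above can be sketched in a few lines of code. The function below is an illustration under our own assumptions (the name mark_stress and its parameters are hypothetical, and it handles only light stress and syllable breaks, not pauses or the heavy-stress asterisk); it is not the parser's own procedure.

```python
def mark_stress(syllables, stressed, show_breaks=False):
    """Join syllables into spaceless text in the 2013 style.

    syllables:   list of syllable strings, e.g. ["le", "to", "jun", "ti"]
    stressed:    set of indexes of lightly stressed syllables
    show_breaks: if True, write the optional hyphen at every unstressed
                 syllable break; if False, leave those breaks unmarked
    """
    pieces = []
    for index, syllable in enumerate(syllables):
        pieces.append(syllable)
        is_last = index == len(syllables) - 1
        if index in stressed:
            # The apostrophe replaces the syllable break after a stressed syllable.
            pieces.append("'")
        elif show_breaks and not is_last:
            pieces.append("-")
    return "".join(pieces)


# Le to junti mrenu: the penultimate syllables jun and mre carry the stress.
phrase = ["le", "to", "jun", "ti", "mre", "nu"]
print(mark_stress(phrase, {2, 4}))                    # letojun'timre'nu
print(mark_stress(phrase, {2, 4}, show_breaks=True))  # le-to-jun'ti-mre'nu
```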

2.9 Pause

There is only one pause phoneme^R2 in Loglan although it has many allophones. As mentioned in the previous section, pauses are represented by [.], [ . ] or /./ in the guides and transcriptions. They are also sometimes marked in text by commas (,) or periods (.), and sometimes not marked at all in written Loglan. For example, pauses are required before all connectives and after all names; and pauses in these two contexts are marked by commas. Thus, the connective [eh] ('and') in [leh-MREH-noo . eh-leh-BOHT-shee], a phrase which means 'The man (pause) and the boy', is not only preceded by an obligatory pause, but that pause is marked by a comma in writing: thus le mrenu, e le botci. Similarly, all names are separated from their sequelae by pauses; and except in the middle of a serial name (Djan Pol Djonz), such post-nominal pauses are also regularly marked by commas. Thus, we say [lah-JAHN . MREH-noo] for 'John (pause) is a man' in speech; and we write this sentence as La Djan, mrenu in text. We also say [lah-jahn . pohl . JOHNZ . MREH-noo]; but we write La Djan Pol Djonz, mrenu, celebrating only the last of those pauses in text. Thus both connectives and names are rather special words in the Loglan utterance, and the flow of speech is always broken--even if briefly--before the former and after the latter.

Another, even longer pause normally occurs between the utterances in an extended speech. Thus [YAH . ee-mee-KEESH-moo] really consists of two utterances. The little word [YAH], spelled Ia, which means 'Yes' in the sense of 'Yes, that's true' or 'Yes, I agree with you', is the first of the two utterances. (Though a single word, Ia is an utterance because it sends a potentially complete message. That is, the speaker could have stopped with Ia, but chooses not to.) The second utterance is [mee-KEESH-moo], spelled Mi kicmu, which means 'I'm a doctor'. These two independent remarks are joined by the "utterance connective" [ee]--which is of course spelled I--and this important word is sometimes translated into loglanized English as 'And' with a capital 'A'. Of course the word I, like all connectives, is required to be preceded by a pause. But the I-connective signals an even greater break in the flow of ideas than connectives between words and clauses do. So I (and its numerous kin) may be and often is preceded by a period or full stop (.) in text (but may also be preceded by a mere comma). Thus we make this little speech [YAH . ee-mee-KEESH-moo], in which the pause is just another Loglan pause. But we write Ia. I mi kicmu with a period (.), rather than a comma before the I. This textual mark has the same grammatical significance as the full stop between sentences in English has; that is to say, between any pairs of them, things should parse. In literal translation--by which I mean word-for-word, or word-for-phrase, translation--Ia. I mi kicmu would come out 'Yes. And I am-a-doctor.' Notice that it takes an English phrase to render that one word kicmu. We will learn why in the next chapter.

Some allophones of the Loglan pause phoneme do not normally appear as anything but the standard interverbal space in writing (though it is always legal to express a medial pause using an explicit comma, after the 2013 reimplementation). We have already mentioned that the pauses that occur between the parts of a serial name do not have to be marked with commas in text; for example, 'John Jones' is pronounced [jahn . JOHNZ] but written Djan Djonz. If a serial name were spoken without a pause, the listener would hear it as a single word and write it as one: Djandjo'nz. So the interverbal spaces in serial names are, in a sense, acknowledgement enough that there are pauses there. The pauses required before vowel-initial words that are not connectives are similarly invisible in text. Thus, the space between the otherwise adjacent /a/s in the phrase la Ailin' ('the-one-named Eileen') is marked by the briefest of pauses in speech: [lah.igh-LEEN]. But such pauses are naturally left unmarked in text: la Ailin'. The short pause in the phrase le igllu [leh.EEG-ll-oo] ('the igloo') is similarly invisible.^R3
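Two of the obligatory pauses met so far, the one after a name and the one before a vowel-initial word, can be located from spelling alone, since names are the only words that end in consonants. The sketch below is a rough illustration under our own assumptions: the function name is hypothetical, it works on written words rather than on sound, and it ignores every other context that calls for a pause (consonant-initial connectives and stress-induced pauses, for instance).

```python
CONSONANTS = set("bcdfghjklmnprstvz")   # the seventeen Loglan consonants
VOWELS = set("aeiou")                   # the five regular vowels


def obligatory_pauses(words):
    """Return a spaceless transcription with ' . ' at two kinds of pause.

    A pause is written (1) after any word ending in a consonant, which we
    take to be a name, and (2) before any word beginning with a vowel.
    Trailing stress marks (' or *) are ignored when looking at a word's
    last letter.
    """
    result = words[0]
    for previous, word in zip(words, words[1:]):
        last_letter = previous.rstrip("'*")[-1].lower()
        after_name = last_letter in CONSONANTS
        before_vowel = word[0].lower() in VOWELS
        result += (" . " if after_name or before_vowel else "") + word
    return result


print(obligatory_pauses(["le", "mrenu", "e", "le", "botci"]))
print(obligatory_pauses(["la", "Djan", "Pol", "Djonz", "mrenu"]))
print(obligatory_pauses(["la", "Ailin'"]))
```

The three calls reproduce the pause patterns of the examples above: a pause before e, a pause after each part of the serial name, and the brief pause before Ailin'.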

All the pauses that we have discussed so far have been obligatory. They are always present in these contexts in faultless speech. There are some other contexts that call for obligatory pauses which we will encounter later. But let us now consider briefly the uses of optional pauses.

If you think about it for a moment, you will realize that the joint between a pair of words in any language may be occupied by a pause but that no joint within a word should ever be. In fact, this is pretty close to being a satisfactory definition of what a word is: a word is any segment of speech that can be separated from other such objects by pauses, and within which pauses never occur correctly. Pauses may of course occur inside a word by accident or hesitation; but it is never correct to put one there. That is to say, it would not be correct to tell someone '"Never(pause)theless" is a word in English.'

So let us agree that a speaker may pause at any word-juncture, as linguists call these "pausable" joints in the speech stream. It is the Loglan writing convention to represent any such intended use of an optional pause by a comma. What such commas mean is that the writer intends the reader who is reading aloud to pause at at least these places...possibly more. By this convention, we help preserve the isomorphism of the two forms of the language...including differences in individual phrasing styles. The main use of optional pauses (and the commas that go with them) is, of course, as phrasing pauses: those breaks in the flow of speech or text which allow the speaker or writer to gather up the threads of what he or she has just been saying, and to make plans to say more. Judiciously placed, such pauses also allow the listener/reader to knit up what he or she has just heard or read; and this of course readies him or her to hear more. So pauses are a reasonably important part of the listening/speaking interaction, just as commas are in the reading/writing act.

Note that since pauses may be used at any juncture, they may be used at every juncture in a given utterance. Thus nothing prevents us from saying [leh . MREH-noo . eh . leh . BOHT-shee] to help a novice hear the words in what we are saying. But if we do speak that way, or intend what we write to be read aloud in that fashion, then convention requires that we mark all the pauses in such didactic sentences with commas when we write them down. Thus Le, mrenu, e, le, botci is the textual equivalent of the above utterance. What is "not allowed"--that is, what would be counted as an error--is pausing inside a word; for example, *[leh-MREH . noo-eh-leh-BOHT . shee]. This is of course exactly the kind of helpless hesitation that a newcomer to the language might easily fall into...especially one who hadn't acquired much confidence in his or her control of the vocabulary. Everyone except machines will be tolerant of such errors, of course. Indeed, we human listeners hardly hear them. It is an interesting fact about human listening that we correct other people's slips of speech so swiftly and automatically that we are often unaware that errors have occurred. Still, the point is that pausing inside a word can occasionally be genuinely misinforming to one's human listeners, and will probably always be so to even the most amiable of our machines.

It must be noted however that the Loglan name mechanism creates contexts (in the reimplementation after 2013) in which explicit pauses at verbal junctures after the name article la or other "name markers" can affect how a sentence is parsed. Further, some classes of long structure words admit internal pauses in controlled ways in the 2013 implementation. Of course, JCB's definition of word could be preserved without exception if we concluded that the name markers are affixes and not quite independent words, and that the long structure words in question are in fact streams of short words; these are both reasonable positions to take.

2.10 Intonation

Most natural languages have more pause phonemes than Loglan's one. English, for example, has four pause phonemes, or "junctures" as they are often called. The reason Loglan can get by with just one is that, in natural languages, the many varieties of pauses are combined with the sentence-long rise and fall of musical pitch called intonation to produce composite effects that serve the same classifying function for the spoken tongue as punctuation marks do for the written form. They identify the varieties of utterances: questions, declarations, imperatives, and so on. But, as we have already seen in the case of hyphens, commas, and full-stops, most Loglan punctuation marks are actually "spoken aloud"...even if the "speaking" is a bit of silence. Thus the sound y is often a spoken hyphen, and the word I is like a spoken period and suggests the use of a period in text. But there are also spoken question-marks, spoken parentheses, spoken quotation marks, and so on, in this language. This not only makes possible a very substantial isomorphism between the written and spoken forms of Loglan, it also permits the pause structure of the language to be very simple.

We note in 2017 that in the reimplementation starting in 2013 there is no longer any situation where a punctuation mark represents a phonetic feature other than pause or stress: words are not replaceable by punctuation marks in our current view. We are open to revisiting this issue for some marks of punctuation.

A related simplification of Loglan phonology is that intonation, or the rise and fall of musical pitch that accompanies most human sentences, is not phonemic in Loglan. Thus, whether a speaker accompanies a question with falling pitch, or with rising pitch, or with no pitch-change at all, is a matter of structural indifference in Loglan. All differences between sentence-types in Loglan--between its questions, imperatives, declarations, and the like--are either marked with special grammatical patterns or by special punctuation-like words. For example, there are many question-asking words in Loglan, but the one that turns any statement into the kind of question that takes yes or no for an answer is the little word [eighee] spelled ei. Thus if [dah-MREH-noo] or Da mrenu means 'He is a man', as it does, then [EIGHEE-dah-MREH-noo] or Ei da mrenu means 'Is he a man?' ('Eh, he's a man?'). But this question can be accompanied by any intonation contour whatever in Loglan, including a perfectly level (i.e., monotonic) one.

Allowing tonal matters to vary freely over the full linguistic range of the native languages of its speakers may have the interesting consequence of making Loglan not only easier to learn--for intonation patterns are usually among the last features of a second language to be mastered by adults--but also remarkably expressive. For here is an entire dimension of language structure which may have been freed for non-structural purposes...for example, expressiveness. On the other hand, we may find that intonation is a biologically necessary part of the human speech performance.¹¹ In that case a structure of intonation will grow up redundantly in Loglan whether we design it or not. Either result would be scientifically interesting.

Having completed our list of the sounds and sound contours that are, and are not, phonemic in Loglan we may now turn to the various forms of Loglan words.

2.11 Three Kinds of Words

A major distinction between the words of any language may be drawn between those relatively few, short but frequently used words that convey the grammatical structure of a sentence--words like 'the', 'of', 'is' and the various affixes of English like '-ing', '-ed', '-es', and so on--and the relatively numerous, but longer and less frequently used words that convey its particular referential content: words like 'cat', 'run', 'John', and 'democracy' in English. Let us call the first kind structure words, and the second content words. Occasionally we will call the simple structure words little words because all of them are; in Loglan all structure words, simple or complex, are called cmapua, "small words".

Content words always refer to something outside the sentence ('That's a cat'); structure words seldom do. But this extralinguistic reference may be made in two quite different ways. It may be made by naming a unique person, place or thing--many capitalized words like 'John', 'France', or 'Democracy' do this work in English ('That's John')--or by predicating some property of it--which means ascribing to it some feature--that may, in principle, be shared by many things: 'cat', 'run', 'blue' and 'democracy' with a small 'd' ('That's a cat'). Let us call the first type of content words names, and the second, predicates.

Every content word is either a name or a predicate. We are using the word 'predicate' to refer to the second and largest category of Loglan content words, and not more detailed grammatical labels like 'noun', 'verb', 'adjective', 'adverb', and so on, because one of the most surprising things about Loglan grammar is that no sharp distinction can be drawn in it between these several ways of ascribing properties to things. The word 'predicate' suggests their common grammatical role, even in English. It happens also to be the word favored by logicians to describe the general class of property-ascribing words.

Summing up, we have three main classes of words in Loglan: structure words, names, and predicates. We are about to see that each has an exclusive set of permissible word-forms. These have been devised in such a way that any word may be classified by the listener from its shape alone.

2.12 Structure Words

Structure words are relatively few in number but among the most frequently used words in any language. They also tend to be among its shortest. Thus, the shortest words in English--'a', 'an', 'of', 'to', 'if', 'so', and so on--are all structure words. This is also true in Loglan. Words composed of single vowels like e [eh] (V-form words), of vowel pairs like ia [yah] (VV-form words), of a consonant followed by a single vowel like le [leh] (CV-form words), and of a consonant followed by a pair of vowels like sui [swee] (CVV-form words) are all structure words in Loglan. All such monosyllabic structure words are called little words (cmapua: but the Loglan term also includes compound structure words). They have the following linguistic formula:

(C){V/vv}(V)

Elements contained in parentheses are optional, i.e., may occur one or zero times. Thus e ('and'), ia ('yes'), le ('the') and sui ('also') represent the four usual permissible forms of little words. The symbol vv represents a monosyllabic pair of vowels as described below, and a word of the most complex shape allowed would be guoa (a proposed additional closer for abstract predicates and descriptions).
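The four usual shapes can be recognized mechanically from spelling. The following sketch is offered only as an illustration: the function name little_word_shape is our own, and the pattern deliberately covers just the V, VV, CV and CVV shapes, not the rarer shape of a word like guoa.

```python
import re

CONSONANTS = "bcdfghjklmnprstvz"   # the seventeen consonants
VOWELS = "aeiou"                   # the five regular vowels (y is excluded)

# An optional consonant, one vowel, then an optional second vowel.
_SHAPE = re.compile(rf"^([{CONSONANTS}])?([{VOWELS}])([{VOWELS}])?$")


def little_word_shape(word):
    """Return 'V', 'VV', 'CV' or 'CVV' for a little word, else None."""
    match = _SHAPE.match(word.lower())
    if match is None:
        return None
    shape = "C" if match.group(1) else ""
    shape += "V"
    shape += "V" if match.group(3) else ""
    return shape


for word in ["e", "ia", "le", "sui", "mrenu"]:
    print(word, little_word_shape(word))
```

The first four words come out as V, VV, CV and CVV; mrenu, being a predicate, matches none of the shapes and yields None.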

Compound structure words are formed by combining two or more little words in some order. Thus leva ('that') is made from le + va ('the' + 'there') and anoi ('if') is made from a + noi ('or' + 'not'). Compound structure words have a fairly elaborate formula. They come in two flavors, one of the shape

VV [VV]

a sequence of VV words. These are attitudinals (modifiers expressing emotional attitude toward the utterance in which they occur). The other more general shape is

{V/C{V/vv}(V)} [C{V/vv}(V)]

where the symbol vv represents a monosyllabic pair of vowels, as described below. The group in braces represents one of the indicated shapes, followed by the bracketed construction, representing one or more occurrences of the shapes indicated. A word beginning with a single V will be a logical or utterance connective.^R4

Here the square brackets mean 'one or more instances of (whatever is enclosed)'. In general, compound little words are the less frequently used structure words of a language. For example, 'nevertheless' and 'howsoever' are relatively infrequently used structure words in English; and not surprisingly they are compounds of simpler English words. Loglan also has a few such polysyllabic monsters. For example, [pah-sheh-NOY-nah] or pacenoina means literally 'before-and-not-now' and translates the claim of English 'no longer' quite precisely. Similarly, [soo-TAWR-ree] or sutori means literally 'at-least-two-th'. It is derived from su = 'at least', to = 'two', and -ri, the general ordinal suffix (hence '-th'); and so is an elegant rendering of that awkward phrase 'second and subsequent' that we need so frequently in talking about Loglan. For brevity compound little words are sometimes called simply compounds. Thus there are two kinds of structure words: little words and the compounds made from them.

There are few phonological restrictions on the formation and use of structure words in Loglan. For example they may be stressed or unstressed as the speaker chooses. Moreover, every possible pairing of the 17 consonants with the 5 main vowels, a e i o u, is permitted in the CV- and CVV-form words. Also, every possible combination of the main vowels with each other is permitted in VV- and CVV-form words. But the 25 vowel-pairs so generated fall into three distinct classes on the basis of how they are pronounced.

The Four Monosyllables: ai ao ei oi are always monosyllabic. These are the four natural diphthongs that occur monosyllabically in most languages. In Loglan ai is invariably pronounced [igh] as in 'high', ao [ow] as in 'how', ei [ey] as in 'Hey!' and oi [oy] as in 'ahoy'. The [ey] of ei is sometimes written [eighee] in the guides to reveal its two component vowels more clearly. Thus ei starts with [eigh] and ends with [ee]; and there is a smooth transition or "glide" between them. All the monosyllabic vowel-pairs are glides in that sense: they start out being one vowel and end up another. Notice that three of the invariable monosyllables end with i. The fourth, ao [ow], is special. The way to remember the un-English spelling of [ow] is to think of Chairman Mao. In the word-formulas, a monosyllabic vowel-pair will be represented by 'vv'.

The Eleven Disyllables: aa ae au ea ee eo eu oa oe oo ou are always pronounced as two syllables. Either the first or second vowel may be stressed, but so long as they are unlike, neither need be. Either the first or second syllable of a doubled vowel (aa ee oo) must, however, be stressed. Thus aa is [AH-ah] or [ah-AH], but never *[ah-ah]. The pronunciations of the unlike pairs are given here with level stress, which is perhaps their most common stress contour: ae au [ah-eh ah-oo], ea eo eu [eigh-ah eigh-oh eigh-oo], oa oe ou [oh-ah oh-eh oh-oo]. A brief glide may occur between the two vocalic syllables, or a glottal stop^R5. Some of these vowel pairs--especially oo and ee--look like English monosyllables but are not ([OH-oh] and [EIGH-eh]). Fortunately for English-trained eyes, these last two pairs are rare. Notice that it is always the prevocalic allophone of e--the [eigh] of 'late' 'freight' and 'sate'--that is called for when e is in first position in any of these words, while it is the primary allophone [eh] that is always called for in second position. In the formulas, a disyllabic vowel-pair will be represented by 'VV'.

The Ten Optional Disyllables: ia ie ii io iu ua ue ui uo uu are normally and preferably pronounced as monosyllables but may occasionally be spoken as disyllables. When one of these optionals is difficult for some speaker to produce as a monosyllable, he or she may opt to spread its sounds out over two syllables. This is especially tempting for the learner when the vowel-pair comes after any of the vocalic consonants m n l r. Thus while [mwee] [nwee] [lwee] and [rwee] are all possible pronunciations of mui nui lui and rui--these are all very brisk sounds and have rather a French air--[MOO-ee] [NOO-ee] [LOO-ee] and [ROO-ee] are easier for the newcomer to Loglan to produce and also permitted. Note that all the optional disyllables commence with either i [y] or u [w]. The i-initial series ia ie ii io iu is pronounced monosyllabically as [yah yeh yee yoh yoo] and disyllabically as [EE-ah EE-eh EE-ee EE-oh EE-oo]. The u-series ua ue ui uo uu is pronounced monosyllabically as [wah weh wee woh woo] and disyllabically as [OO-ah OO-eh OO-ee OO-oh OO-oo]. In the word-formulas the optionals are represented by either 'vv' or 'VV' depending on how they are actually being pronounced.
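The three classes just described account for all 25 pairings of the five main vowels, and the bookkeeping is easy to check by enumeration. The sketch below is illustrative only; the names classify_pair and the class labels are ours.

```python
from itertools import product

VOWELS = "aeiou"
ALWAYS_MONOSYLLABIC = {"ai", "ao", "ei", "oi"}


def classify_pair(pair):
    """Place a two-vowel pair in one of the three pronunciation classes."""
    if pair in ALWAYS_MONOSYLLABIC:
        return "monosyllable"
    if pair[0] in "iu":
        # Every optional disyllable begins with i [y] or u [w].
        return "optional disyllable"
    return "disyllable"


# All 25 ordered pairings of a e i o u.
pairs = ["".join(p) for p in product(VOWELS, repeat=2)]

by_class = {}
for pair in pairs:
    by_class.setdefault(classify_pair(pair), []).append(pair)

for name, members in by_class.items():
    print(name, len(members), " ".join(members))
```

Run as written, this sorts the pairs into groups of four, eleven and ten, matching the three lists above.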

In rapid speech the stress in compound little words is usually level; that is, there is no distinctively stressed syllable. On the infrequent occasions when there is one, it is usually penultimate, the second from the last syllable. A compound which contains any VV segment is entirely made up of VV segments. Any order of V-, CV- or CVV-form segments is permissible in a compound except that V-form segments may only be initial or follow a Cvv-form (monosyllabic) three-letter segment. Thus [AH-tigh] Atai = a + tai, [ah-TEIGH-oh] Ateo = a + teo, and [TIGH-ah] Taia = tai + a are all permissible compounds; and because they are all acronyms, that is, quasi-predicates, they are all penultimately stressed. But *Teoa [teigh-OH-ah] is not permissible. If it were, it could be heard either as the phrase te oa or as a compound derived from teo + a; and that would be ambiguous. The limited distribution of V-form segments in compounds prevents that ambiguity from arising; and [teigh-OH-ah] in fact resolves as the phrase te oa.

A second rule--one that we have already seen at work--is that any V-initial word, whether it is a structure word or not, must be preceded by a pause...usually, quite brief. Thus in both [lah.igh-LEEN] La Ailin = '(The one named) Eileen' and [leh-MREH-noo . ah-noy-leh-BOHT-shee] Le mrenu, anoi le botci = 'The man (pause) if the boy', the pauses in the two Loglan utterances are both obligatory. Only the second must be marked with a comma, however.

A third rule--also mentioned previously--is that if any emphatically stressed syllable immediately precedes a predicate, the two words must be separated by a pause. Thus [leh-VAH . MREH-noo] Leva mrenu = 'That (pause) man' and [leh-VAH-teh-MREH-noo] Leva te mrenu = 'Those three men'; but not *[leh-VAH-MREH-noo]. (Again, the '*' indicates an impermissible form.) This rule prevents a terminally accented structure word from becoming part of the following predicate word.^R14

2.13 Names

We must note initially that there is a class of acronymic names like DaiNaizA, DNA, which are grammatically names but morphologically structure words. We will not be discussing these words here.^R6

In all languages spoken by peoples with frequent contact with other peoples, proper names are phonologically irregular. Thus neither 'Constantinople' nor 'Robert' was originally an English word. This will be emphatically true in Loglan. Loglan is a culturally neutral language. Its job is to reproduce the products of a great diversity of human cultures as faithfully as possible, including their proper names. So nearly all Loglan names are linguistic borrowings from the natural languages most closely associated with the things named. Thus 'France' is [frahns] spelled Frans in Loglan and 'England' is [EENG-gluhnd] spelled Inglynd; but 'Germany' is [DOYTSH-lahnt] and spelled Doitclant. For we are obliged to follow the phonetic habits of the Germans, not the English, in giving the country of the Germans its Loglan name. Doitclant illustrates still another point. The German word is 'Deutschland'. But since Loglan spelling is phonemic, and that of few natural languages is, we must follow the pronunciation of the natural word rather than its spelling when the two diverge.¹² And [DOYTSH-lahnt] is in fact the way a German would pronounce this German word.

There are, of course, certain "universal" objects, or at least non-local ones, on which no language has a special claim. These, like the Sun and the Moon, the days of the week, and the months of the year, are usually named in Loglan by using simple constructions based on widely shared roots, e.g., [sohl] Sol and [loon] Lun. The CVC-form turns out to be an attractive formula for these constructed name-words, and its use has introduced a modicum of regularity into the otherwise riotous phonology of Loglan names.¹³

Yet even the most imitative Loglan names are regular in one way. You may have noticed that all Loglan names end in consonants and that no other Loglan words do. This is no accident. That final consonant serves to distinguish Loglan names in the speech-flow. The convention is that if the natural name does not end in a consonant, the Loglan version is provided with a final s. Thus the Romans' name for Rome is 'Roma', so the Loglan word is [ROHM-ahs] Romas. (Not Rom, by the way; for this blunt English monosyllable would offend the Roman ear far more than the addition of the sibilant s to their graceful disyllable.) The Italian word for Italy is 'Italia', so the Loglan word is [ee-TAHL-yahs] Italias. Mary's English name in Loglan phonetic transcription is [MEH-ree]; in Loglan phonemes this is /MEri/; so her Loglan name is [MEHR-ees] Meris. And so on. Sometimes a happy accident occurs. The French pronunciation of 'Paris' is [paa-REE]. (Recall that the [aa] I've used in this guide is the French and Spanish 'a' of 'la'. It is more tense and higher-pitched than the Germanic [ah] of 'father' and 'Vater'. [aa] and [ah], then, are dialectal variants of Loglan a and you may use either one. But if one can, one prefers to pronounce the capital of France in a French way.) Phonemically, [paa-REE] is /paRI/. Adding final s to it produces Paris again, but the word may now be stressed in the French way: [paa-REES]. Since stressing a word on its final syllable is not a standard Loglan move--standard stress in all types of words is penultimate in Loglan--the non-penultimately stressed vowel must be marked in the written form so that other loglanists will pronounce it as the maker intended. We'll follow the Spanish custom of marking unexpected stresses. But rather than use an accent mark we'll use the typographically simpler apostrophe after the abnormally stressed syllable. So the final rewriting of French 'Paris' in Loglan is Paris'.¹⁴ In copying the stress, at least, and perhaps the French [aa] as well--not to mention the uvular Parisian 'r' which we loglanists would also regard as an acceptable variation of r in such contexts--we acknowledge the prior phonological claim of the source language.
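For readers who like to see a convention stated mechanically, the final-s rule and the final-stress mark can be sketched in a few lines of Python. This is only an illustration: the function name to_name_form and its final_stress flag are our own inventions, the input is assumed to be already spelled in Loglan phonemes, and only the case of stress on the last syllable is handled.

```python
VOWELS = set("aeiouy")  # assumption: y is counted as a vowel letter here


def to_name_form(phonemes: str, final_stress: bool = False) -> str:
    """Turn a phonemic spelling into a consonant-final Loglan name.

    A final 's' is added only when the spelling ends in a vowel.
    If final_stress is True, an apostrophe is written after the last
    syllable, i.e. at the end of the word (hypothetical helper flag).
    """
    if not phonemes:
        raise ValueError("empty name")
    name = phonemes + "s" if phonemes[-1].lower() in VOWELS else phonemes
    return name + "'" if final_stress else name


if __name__ == "__main__":
    for source in ("Roma", "Italia", "Meri", "Djim"):
        print(source, "->", to_name_form(source))
    print("Pari ->", to_name_form("Pari", final_stress=True))
```

Finding the best phonemic approximation of the natural name remains a matter of judgment, so that step is deliberately left to the person calling the function.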

Just as abnormal stress can be preserved in names, so can abnormal syllabification. Take the name 'Lois'. In English the word is distinctly two syllables: [LOH-ihs]. (The [ih] in the guide stands for the non-Loglan sound of 'i' in 'this', 'miss' and 'Jim'.) But if, as before, we decide that Loglan i gives the best approximation to non-Loglan [ih] and write Lois, we come face to face with the rule that oi is one of the "invariable monosyllables", and so must be pronounced [oy]. Whence unmarked Lois will be pronounced [loyss], and will rhyme, unhappily, with 'Joyce'. To avoid this fairly large distortion of a natural name we use another diacritical mark, this time a hyphen. Now when we write Lo-is, the hyphen will mark a syllable break. So to any Loglandical reader, the correct pronunciation of Lo-is will be [LOH-ees] as desired. Note that if we had accepted e [eh] as the best approximation of English [ih], as many English-speakers are inclined to do, then Loes would automatically be pronounced disyllabically as [LOH-ess] and would need no mark. Oe is one of the invariable disyllables; so no hyphen would be required. In general, hyphens are used as sparingly as possible. When there are no hyphens in a stream of three or more vowels in a name (or borrowing), the default convention is to parse the string of vowels into syllables according to a precedence scheme: reading from the left, first read a mandatory monosyllable if possible (subject to the rule that a mandatory monosyllable ending in i cannot be followed by another copy of its final vowel; this doesn't prevent any vowel sequences, but it forces them to be grouped differently, as in aii, which must be grouped a-ii); otherwise read a single vowel followed by a mandatory monosyllable, if possible; otherwise read an optional monosyllable, if possible; otherwise read a single vowel syllable (which must be possible); repeat as many times as necessary until the whole stream of vowels is parsed. A departure from this scheme must be signalled by a hyphen or stress mark.

Thus unmarked Uaos syllabifies as /U,aos/ and is pronounced [OO-owss] (the second alternative, a single vowel followed by a mandatory monosyllable, is taken). If the pronunciation [WAH-ohss] had been intended, one would have spelled the name Ua-os.¹⁵
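The precedence scheme just described is mechanical enough to be written out as a short program. The Python sketch below is one reading of it, not a definitive implementation. It assumes that the mandatory monosyllables are ai, ei, oi and ao and that the optional monosyllables are the pairs beginning with i or u (these sets come from the vowel sections of this chapter, not from the present section), and the function name split_vowels is ours.

```python
VOWELS = set("aeiou")
MANDATORY = {"ai", "ei", "oi", "ao"}  # assumed set of mandatory monosyllables
OPTIONAL_INITIALS = {"i", "u"}        # assumed: i- and u-initial pairs are optional


def _mandatory_at(v: str, p: int) -> bool:
    """True if a mandatory monosyllable may be read at position p."""
    pair = v[p:p + 2]
    if pair not in MANDATORY:
        return False
    # A mandatory monosyllable ending in i may not be followed by another i.
    return not (pair[1] == "i" and v[p + 2:p + 3] == "i")


def _optional_at(v: str, p: int) -> bool:
    """True if an optional monosyllable may be read at position p."""
    pair = v[p:p + 2]
    return len(pair) == 2 and pair[0] in OPTIONAL_INITIALS


def split_vowels(vowels: str) -> list[str]:
    """Parse an unhyphenated vowel string into syllables, left to right."""
    v = vowels.lower()
    if not v or any(ch not in VOWELS for ch in v):
        raise ValueError("expected a non-empty string of regular vowels")
    out, p = [], 0
    while p < len(v):
        if _mandatory_at(v, p):            # 1. mandatory monosyllable
            out.append(v[p:p + 2])
            p += 2
        elif _mandatory_at(v, p + 1):      # 2. single vowel + mandatory monosyllable
            out.extend([v[p], v[p + 1:p + 3]])
            p += 3
        elif _optional_at(v, p):           # 3. optional monosyllable
            out.append(v[p:p + 2])
            p += 2
        else:                              # 4. single vowel
            out.append(v[p])
            p += 1
    return out


if __name__ == "__main__":
    for s in ("uao", "aii", "oi", "oe"):
        print(s, "->", "-".join(split_vowels(s)))
```

Run on the vowels of Uaos the sketch yields u-ao, and on aii it yields a-ii, in agreement with the examples in the text. Whether the i-restriction should also apply inside the second alternative is not settled by the description above; the sketch assumes that it does.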

Note that after all attempts at good approximation have been made, the resulting name-word is a Loglan word composed of Loglan phonemes, and so must be pronounced in a Loglan way. Thus the closest Loglan approximation of the English word 'Jim' is probably Djim. But, since there is no [ih] in Loglan this word must be pronounced [jeem], not [jihm]. Alternatively, Jim might choose [jem] Djem for his Loglan name. In either case, there will be some distortion. Distortion is, of course, quite natural. In fact whenever a word from one language is taken into another its sounds are likely to be distorted in some way.

Any name that is not final in a sentence must be followed by a pause. This requirement, like the final consonant which it thereby isolates, helps names to be heard as such in the flow of speech. This pause must be written as an explicit comma unless it is followed by another name, a mark of terminal punctuation such as a period, or the little word ci.

Apart from adding final s when necessary, marking non-penultimate stresses and abnormal syllable breaks with apostrophes and hyphens respectively, and finding good approximations in the Loglan phoneme set to the phonemes of the natural name, we require that a Loglan name resolve into syllables^R7 (this requirement was imposed during the 2013 reimplementation).

This means that any sequence of Loglan syllables, however long and in whatever order, if followed by a consonant followed by a pause, is a permissible Loglan name. Thus both Rrl ('Earl') and Ibnn Saud [EE-bnn . SAH-ood] are permissible Loglan names but Babi ('Bobby') isn't. [BAH-bee], in fact, will be heard by any loglanist as the pair of little words ba bi ('something is...').

We have already noted that any syllable of a name word may be stressed or unstressed in any way that reflects the conventions of the language of origin. Therefore names, like structure words, may be thought of as composed of sometimes-stressed syllables.

2.14 Predicate Words

Predicate words form the bulk of the vocabulary of any language. About 90% of most dictionaries is composed of them. They range in English from short, frequently used words, like 'egg', 'run' and 'boy', to very long, seldom-used (and usually short-lived) technical predicates like 'antidisestablishmentarianism'. In Loglan, too, predicate words vary in length from short words like igllu [EEG-ll-oo]^R13 to long technical borrowings like trifenilmethani [tree-feh-neel-met-HAH-nee] ('triphenylmethane'), the only requirement being that they have all the properties of a Loglan predicate. There are five of these:-

1. Predicates must be vowel-final. They share this property with structure words. It distinguishes them absolutely from names.

2. Predicates must contain at least one pair of adjacent unlike consonants, a CC. This distinguishes them from structure words. The pair of consonants may be in the same or different syllables (and in the latter case separated by a hyphen or stress mark). The pair of consonants may also be separated by a y-hyphen, as in mekykiu. One of the consonants may be part of a doubled continuant pronounced syllabically ("vocalically") as in igllu.

3. Predicates must have at least two syllables. Thus glu and drei (pronounced [gloo] and [drey]) meet the first two requirements but not the third. So they may not be predicates. If monosyllables were allowed to be predicates, they would steal any stressed syllable that preceded them, and grow into other predicates. Thus [gloo] would steal [EE] and try to become [EE-gloo] iglu (which is ill-formed for technical reasons); [drey] would steal [SHEE] and become [SHEE-drey] cidrei.

4. Predicates must be penultimately stressed. This, as we have seen, is the Loglan standard for polysyllabic words. All unmarked names and many compound structure words also follow this standard. But in predicates, penultimate stress is invariable. (In determining penultimate stress, syllables whose "vowel" is y or a continuant are ignored: such syllables are never stressed in predicates, and never final, and one such unstressed syllable may intervene between the stressed syllable and the final syllable.)

5. Each predicate must be uniquely resolvable as a single word.

What the last property means is that anyone who wants to add a new predicate to Loglan must first make certain that it does not break up into smaller words. Thus, just as *Babi won't do for a Loglan name because it is not C-final and breaks up as ba bi, so *neutroni [neigh-oo-TROH-nee] won't do as the Loglan predicate for 'neutron' because it breaks up as neu troni. Also, the builder must make certain that the new word is not capable of attaching itself to other words in its neighborhood. For example, *proa [PROH-ah] won't do as the word for 'proa' because it will steal any unstressed CV-word that happens to precede it. Thus te *proa [tep-ROH-ah] will be heard as teproa, which is a predicate, alright, but not the one intended.¹⁶
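Three of the five properties are simple enough to check mechanically, and a rough Python sketch may make them concrete. It covers only property 1 (vowel-final), property 2 (a consonant pair, or two consonants joined by a y-hyphen) and the stress rule of property 4. Syllable counting and unique resolution are left out because they need the full syllable analysis. The function names are ours, the seventeen consonants are assumed to be b c d f g h j k l m n p r s t v z, and stressed_index expects a word that someone has already divided into syllables.

```python
VOWELS = set("aeiou")
CONSONANTS = set("bcdfghjklmnprstvz")      # assumed inventory of 17 consonants
CONTINUANTS = ("mm", "nn", "ll", "rr")


def is_vowel_final(word: str) -> bool:
    """Property 1: the word ends in a regular vowel."""
    return word[-1].lower() in VOWELS


def has_consonant_pair(word: str) -> bool:
    """Property 2: adjacent unlike consonants, or two consonants around a y-hyphen."""
    w = word.lower()
    for a, b in zip(w, w[1:]):
        if a in CONSONANTS and b in CONSONANTS and a != b:
            return True
    for i in range(len(w) - 2):
        if w[i] in CONSONANTS and w[i + 1] == "y" and w[i + 2] in CONSONANTS:
            return True
    return False


def stressed_index(syllables: list[str]) -> int:
    """Property 4: index of the stressed syllable in a pre-syllabified predicate.

    Syllables whose "vowel" is y or a doubled continuant are skipped
    when counting back from the end of the word.
    """
    def is_weak(s: str) -> bool:
        return "y" in s or any(c in s for c in CONTINUANTS)

    strong = [i for i, s in enumerate(syllables) if not is_weak(s.lower())]
    if len(strong) < 2:
        raise ValueError("a predicate needs at least two stressable syllables")
    return strong[-2]


if __name__ == "__main__":
    print(has_consonant_pair("mrenu"), has_consonant_pair("babi"))
    print(stressed_index(["me", "ky", "kiu"]))   # stress falls on 'me'
```

A word that passes these checks is not thereby a predicate; it has only cleared the easy hurdles. Property 5 in particular can be settled only by the resolution procedure taken up in Section 2.18.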

How to make Loglan words that will resolve as you intend them to is a matter that belongs properly to Chapter 6. There we will consider word-making and all its joys and hazards. But in this chapter we are concerned only with correctly-built words, in particular, with how well-made ones may be separated from one another in the speech stream.

One kind of information we'll need to do this is whether a particular consonant-pair may be initial in a word or not. The rules say that at least one CC must reside in every predicate. It is easy to detect CC's in the speech stream, but as an English-speaker you will find it practically impossible to tell whether the one you're hearing can be initial in its predicate or not...not, that is, until you've become familiar with the Loglan set of permissible initials. The complete set of permissibly initial consonant pairs is given in Table 2.1.^R12

Table 2.1 The 38 Permissible Initial Consonant Pairs


         -b  -c  -j  -k  -l  -m  -n  -p  -r  -s  -t  -v  -z


  b-                     bl              br
  c-                 ck  cl  cm  cn  cp  cr      ct
  d-             dj                      dr              dz
  f-                     fl              fr
  g-                     gl              gr
  j-                         jm
  k-                     kl              kr
  m-                                     mr
  p-                     pl              pr
  s-                 sk  sl  sm  sn  sp  sr      st  sv
  t-         tc                          tr  ts
  v-                     vl              vr
  z-     zb              zl                          zv

Any CC that is not in this set cannot be initial in a Loglan word. We'll call the complement of the tabled set the impermissible initials.

Sixteen of these pairs do not occur initially in any English word. Yet none of them is particularly difficult for an English-speaker to produce. For example, the c-initial series ck cl cm cn cp cr ct, which, with added [uh], become [shkuh shluh shmuh shnuh shpuh shruh shtuh], do not occur in English; but all occur in German and are surprisingly easy for us English-speakers to pronounce. Having found that out, why not try the other foreign-looking pairs? Try dz in [DZOH-soh] dzoso ('soap'), jm in [ZHMEE-teh] jmite ('meet'), mr in [MREH-noo] mrenu ('man') which you already know, sr in [SREE-teh] srite ('write'), ts in [TSEHR-oh] tsero ('error'), vl in [VLAH-koh] vlako ('lake'), vr in [VRAH-noh] vrano ('liver'), zv in [ZVOH-toh] zvoto ('out/outside of') and zb in [ZBOO-mah] zbuma ('explode'). You may have been surprised to learn that, while these words may look odd to you, they are not at all odd for your English-trained ears and tongue to hear or say...despite their unmistakably Slavic ring.

You may well ask how Loglan acquired such a formidable set of permissible initials...far larger, for example, than the English set. Certainly this is strange for a vowel-rich language, as Loglan very largely is. The reason is simple. In order to build as much cross-cultural recognizability into Loglan primitive predicates as possible, the phonological features of many quite different language groups had to be combined in deriving its primitive words. For there are Slavic, Germanic, Oriental, Indic and Romance languages in the source set. For example, vl is common in Russian; and the pair dz is quite common in Chinese. The Loglan word for 'go' is largely made from English 'go' and Chinese 'dzou', and came out [GOHD-zee] godzi. So it is the very internationality of Loglan's word-sources that has led to its very wide range of permissible consonant-pairs.

Table 2.2 The Impermissible Medial Consonant Pairs

Any pair of consonants C₁C₂ is permissible in the middle of a word unless they are one of the following:

                                              E.g.
  C₁ = C₂                                     *kk
  C₁ is h                                     *ht
  C₁ is the unvoiced variant of C₂            *pb
  C₁ is in /p t k f/ and C₂ is in /j z/       *pj
  Both are in the set /c s j z/               *cs
  They are bj or sb                           *bj, *sb

Consonant-pairs which occur in the middle of words ("permissible medials") are much less restricted, of course. They are defined by means of their complement set, the impermissible medials, in Table 2.2. We will not need to use this information until we consider matters of word-building in Chapter 6. But the permissible and impermissible initials are critically involved in word resolution, which is the topic we will take up in Section 2.18.
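Because Table 2.2 defines the permissible medials by exclusion, it translates naturally into a single test. The Python sketch below is one way of writing it. The function name is ours, and the voiced/unvoiced pairing used for the third rule (p/b, t/d, k/g, f/v, s/z, c/j) is an assumption based on ordinary phonetics rather than something stated in the table.

```python
CONSONANTS = set("bcdfghjklmnprstvz")   # assumed inventory of 17 consonants
# Assumed pairing of each unvoiced consonant with its voiced companion.
VOICED_COMPANION = {"p": "b", "t": "d", "k": "g", "f": "v", "s": "z", "c": "j"}
SIBILANTS = set("csjz")


def is_impermissible_medial(pair: str) -> bool:
    """Apply the rules of Table 2.2 to a two-consonant string."""
    p = pair.lower()
    if len(p) != 2 or any(ch not in CONSONANTS for ch in p):
        raise ValueError("expected exactly two consonants")
    c1, c2 = p
    return (
        c1 == c2                                   # *kk
        or c1 == "h"                               # *ht
        or VOICED_COMPANION.get(c1) == c2          # *pb
        or (c1 in "ptkf" and c2 in "jz")           # *pj
        or (c1 in SIBILANTS and c2 in SIBILANTS)   # *cs
        or p in ("bj", "sb")
    )


if __name__ == "__main__":
    for pair in ("kk", "ht", "pb", "pj", "cs", "bj", "sb", "mn", "nk"):
        verdict = "impermissible" if is_impermissible_medial(pair) else "permissible"
        print(pair, verdict)
```

The medial pairs of fumna and menki, mn and nk, come out permissible, as they must.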

The general formula for predicates is a trifle complicated; but the interested reader will find it (in a probably outdated version, 2017) in the chapter notes.¹⁷

We continue with a description of the Loglan syllable^R8. This was not adequately defined in any source prior to the 2013 reimplementation of the language, though syllables are clearly essential to the recipe for predicates. Part of the reason for this, we believe, is that the units making up cmapua on the one hand and compound predicates on the other, while certainly related to syllables, are not themselves syllables. The full analysis of syllables is only active in the case of borrowed predicates and, since the 2013 reconstruction, names.

The Loglan syllable is divided into three parts. It starts with an optional group of consonants (a vowel initial word may lack this component). An initial group of consonants is either a single consonant, a permissible initial pair of consonants, or a triple of consonants in which each adjacent pair is a permissible initial pair.

The second component of the Loglan syllable is a mandatory vowel segment. This consists either of a single vowel V, an optional or mandatory monosyllable vowel pair vv, or a continuant, one of mm, nn, ll, rr. A continuant will not occur adjacent to another occurrence of the same consonant (on either side), regardless of intervening hyphens or stress marks or whether the other occurrence is in the same syllable.

The third component of the Loglan syllable is an optional final consonant or pair of final consonants. Neither of these final consonants can be initial in a forbidden medial pair or triple (see Table 2.3) of consonants (regardless of how hyphens or stress marks may be placed in it, and regardless of whether the medial pair or triple continues into the next syllable). A final consonant may not be immediately followed by a regular vowel (a group of consonants prior to a regular syllable will contribute at least one consonant to the following syllable). A pair of final consonants cannot consist of a non-continuant followed by a continuant (such a pair would seem to be forced to be pronounced as an additional syllable; this is the only phonetic rule in the 2013 reimplementation which is not directly supported from the sources).

None of these adjacency rules apply across an actual pause in speech (the break between a word and the next word).

Names must resolve into syllables of the shapes indicated, with no additional restrictions other than that names are consonant final and followed by pauses. Predicates must resolve into such syllables, with additional constraints.
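The first of the three components, the initial consonant group, depends only on Table 2.1 and so is easy to sketch in Python. The set below is transcribed from that table; the function name is_initial_group and the treatment of the empty string as "no initial group" are our own choices. The vowel segment and the final consonants are not modelled here.

```python
CONSONANTS = set("bcdfghjklmnprstvz")   # assumed inventory of 17 consonants

# The permissible initial pairs, transcribed from Table 2.1.
INITIAL_PAIRS = frozenset(
    """bl br ck cl cm cn cp cr ct dj dr dz fl fr gl gr jm kl kr mr
       pl pr sk sl sm sn sp sr st sv tc tr ts vl vr zb zl zv""".split()
)


def is_initial_group(cluster: str) -> bool:
    """True if cluster may begin a syllable.

    Allowed: nothing at all, a single consonant, a permissible initial
    pair, or a triple in which each adjacent pair is a permissible initial.
    """
    c = cluster.lower()
    if any(ch not in CONSONANTS for ch in c):
        return False
    if len(c) <= 1:
        return True
    if len(c) > 3:
        return False
    return all(c[i:i + 2] in INITIAL_PAIRS for i in range(len(c) - 1))


if __name__ == "__main__":
    print(len(INITIAL_PAIRS))                 # size of the transcribed table
    for cluster in ("mr", "km", "str", "stl"):
        print(cluster, is_initial_group(cluster))
```

Note that a triple such as str qualifies only because both st and tr are in the table; stl fails because tl is not.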

2.15 The Varieties of Predicates

There is one variety of Loglan predicates we can dispose of immediately. These are the numerical or logical words which are used grammatically as predicates but are morphologically indistinguishable from structure words. These are the mathematical predicates, the ordinals and cardinals like English 'first' and 'dyad' which are [NEHR-ee] neri and [TAWR-ah] tora in Loglan and the identity predicates of which the prototype is bi [bee]. Bi is the 'is' of identity, as in [lah-JAHN . bee-let-see-TOH-ah] La Djan, bi le tsitoa = 'John is the thief'...a clear case of a predicate masquerading as a little word. All words like bi, neri and tora are semantic predicates even though morphologically they are structure words. We will take up the construction of such little word predicates in Chapter 6 and their uses in the appropriate places of the grammar chapters. But we will not be concerned with them any longer in this one.

Of morphologically recognizable predicates--that is, of the words that have the properties described in the previous section--there are three distinct types: primitives, complexes and borrowings. Let us consider these important divisions of the predicate vocabulary one at a time.

Primitives are the fundamental building blocks of predicate meaning in any language. They are the 'dog' and 'cat' and 'girl' and 'boy' words of any language, and are never derived from anything else in it. In Loglan, such words are always five letters long and they come in just two forms: the CCV'CV form of [MREH-noo] mrenu 'is a man' and the CV'CCV form of [FOOM-nah] fumna 'is a woman'. Primitive predicates often have reduced or combining forms that we'll call djifoa (formerly, affixes). These are usually shortened versions of the primitives themselves that appear as parts of longer words. Short djifoa (there are longer ones) are three-letter forms like [mreh] spelled mre from mrenu and [foom] and [fwah], spelled fum and fua, from fumna. The longer words constructed from these djifoa are the complex predicates.

Complexes are predicates that are composed entirely of combining forms called djifoa (the older, universally deprecated term for these is "affixes").¹⁸ Thus [MRESH-lee] or mrecli means 'manly' or 'man-like' and is composed of two djifoa: mre- from mrenu and -cli which comes from [SHLEE-kah] clika. Clika is also a primitive predicate and means 'is similar to' or 'is like' something else in some respect. Thus, standing behind each complex there is a defining metaphor, in this case mrenu clika or 'man-like'. In a similar fashion, the word for 'womanly' is derived from the phrase fumna clika or 'woman-like'. The word is [FWAHSH-lee] or fuacli because the preferred djifoa of fumna in this position is fua. We have already seen how 'eye-doctor' can be made from two djifoa plus a separating vocalic hyphen: [MEH-kuh-kyoo] or mekykiu. In that complex, mek was derived from [MENG-kee] menki, which is the primitive for 'eye', and kiu from [KEESH-moo] kicmu, which is the primitive for 'doctor' or 'physician'. The metaphor behind the word for 'thief' in Loglan is "criminal-taker". The word for 'crime' or 'criminal' is [TSEE-meh] tsime; its djifoa is [tsee] tsi. The word for 'take' is [TOHK-nah] tokna; its djifoa is [TOH-ah] toa. Thus you will not be surprised to learn that the word for 'thief' is [tsee-TOH-ah] tsitoa, a word you have already seen.

Borrowings are predicates that imitate words of similar meanings in other languages. If they are Loglan borrowings, they must have all the properties of a Loglan predicate--see Section 2.14--but not be either primitive or complex. Formally, they are whatever is left over in the domain of predicates once the primitives and complexes have been accounted for. As you might imagine, that embraces a huge variety of shapes and sizes. The flexibility of Loglan borrowings is a deliberate design feature of the language. It allows good imitations of words from an extremely wide variety of source languages. For example, protoni is an excellent imitation of English 'proton'; it is an even better one of Italian 'protoni'; and it meets all the other requirements of a Loglan borrowing.

Obviously a Loglan borrowing must never imitate a complex by resolving into djifoa. For if it did, it would be treated by the "resolver"--your computer or some other unforgiving auditor--as that complex. That is to say, you couldn't borrow a word shaped like mekykiu and expect it to be heard by your fellow loglanists as anything but mek + y + kiu. But there is a sense in which borrowings may be shaped like primitives. For if a potential borrowing does look like a primitive--that is, if the natural word is already of either mrenu- or fumna-form, as the Swahili word 'simba' and the Aleut word 'parka' both are--it may be taken into the language anyway, but as something that is morphologically, at any rate, a primitive predicate. Thus, there are many borrowed words in Loglan that are morphologically not borrowings at all but primitives. These are labelled S-Prims in the dictionary when they are borrowed from science (e.g., [SHLAWR-roh] cloro for 'chlorine', [FLOOR-roh] fluro for 'fluorine'), I-Prims when they are local words that have recently become international (e.g., [FOOT-boh] futbo for '(a player of) international football' and [TEL-foh] telfo for 'telephone'), and N-Prims when they are still "native" to, or characteristic of, some local people or place, such as [PAHR-kah] parka for 'parka' and [SEEM-bah] simba for 'lion'. Collectively, these borrowed primitives are called single-source primitives to distinguish them from the composite primitives that are derived from multiple sources. The latter tend to be mosaics of natural fragments--mrenu and fumna are examples--while the former tend to be Loglan variants of an already widely-traveled single word (futbo).

Borrowings satisfy certain phonetic and orthographic restrictions which are driven by these considerations. A borrowing may not contain y. A borrowing may not contain a disyllable of one of the shapes aa, ee, oo which imposes the requirement that one of the syllables be stressed. Any occurrence of ii, uu in a borrowing is required to be monosyllabic. A borrowing will not contain an initial syllable whose vowel segment is a continuant, nor will it contain two successive syllables whose vowel segments are continuants. A borrowing satisfies a set of restrictions on where hyphens or stress markers may be written which prevent an illegal complex predicate from being converted to a legal borrowing by putting a syllable break in an odd place.^R9

Let us consider briefly how borrowings are made. The full story will not be told until Chapter 6. But we need enough information now about borrowings to recognize them in the speech-flow. The loglanist's aim in making a borrowing is, first, to satisfy him- or herself, and then The Loglan Institute, that the new concept should be made as a borrowing and not as a complex; and second, once the borrowing strategy has been decided upon, to make the best possible imitation of the source word, or family of source words, given the resources and limitations of Loglan morphology. Thus [proh-TOHN-nee] protoni and [et-HEEL-lee] ethili are both excellent borrowings because they are unmistakable members of the international sets of scientific words to which English 'proton' and 'ethyl' belong. On the other hand, [aht-HOHM-mee] athomi is not so good a member of the international family to which 'atom' belongs, although clearly it is a member of it. It is, for example, very similar to the Italian plural, which is 'atomi'. But athomi is the best we can do given the requirement that each Loglan predicate must contain at least one consonant-pair. None of the natural words for 'atom' do. But without that inserted h, [ah-TOHM-mee], for example, would "fall apart" as the phrase a to mi; that is, it would appear to the listener to be that phrase. Thus the phoneme h is conventionally introduced into such borrowings to prevent them from falling apart.

At the moment the international vocabulary of science and technology is being freely incorporated into Loglan by making scientific primitives and borrowings. Local food, tool, clothing and music words are also being freely borrowed, especially if their local names, like 'kayak' and 'atyl-atyl'--yielding Loglan [kah-YAHK-hoo] spelled kaiakhu and [aht-LAHT-loo] spelled atlatlu--have already been appropriated by international scholarship. But the current policy of The Institute is to recommend that writers and translators working in areas other than science make the new words they require as complexes. Such policies will of course be subject to change as the language and its uses unfold. Institute policy is discussed as an aspect of word-making in Chapter 6.

2.16 Djifoa ("Affix") Shapes

Before we leave the topic of predicates we need to say a word about the djifoa out of which complex predicates like mekykiu and tsitoa are made. An understanding of the range of djifoa sizes and shapes is crucial for discovering whether a given predicate is a complex or not; and that, in turn, is crucial for recognizing a borrowing. For borrowings, the reader will recall, are just those predicates which are not of primitive shape and which do not resolve as complexes.

You may have noticed that there were three djifoa-shapes involved in the examples of complex predicates given above: the CCV-shape of mre and cli, the CVC-shape of fum and mek, and the CVV-shape of fua and kiu. This is the complete set of "short", i.e., three-letter, djifoa forms. But four- and five- letter djifoa are also derivable from any primitive. For example, the word for 'is a science of' is [SEN-see] sensi; and any primitive may itself be the final djifoa in a complex, as in [tahr-SEN-see] tarsensi, which is the word for 'astronomy'. In this complex the three-letter djifoa tar comes from [TAHR-shee] tarci 'star'. In addition, the final vowel of any primitive may be replaced by y to produce a hyphenated four-letter form to be used in non-final positions. For example, [mreh-nuh-SHLEE-kah] mrenyclika is another and plainer form of mrecli 'man-like' in case a writer or a teacher should require its length or transparency. Furthermore, irregular djifoa may be derived from any borrowing by simply appending y to it^R10. For example, the word [ah-oo-strr-ah-loh-peet-HE-kwee] austrralopithekui (notice the doubled continuant) is borrowed from the Linnaean genus name 'Australopithecus' and is a paleontological term meaning '(is an) australopithecine'. Suppose someone wanted to make an even finer-grained scientific complex from the idea '(is) australopithecine in form'. In scientific English the word 'australopithecoid' conveys this meaning. 'Form' is [FAWR-mah] forma in Loglan, and it has a CVV djifoa, [FO-ah] foa, which we are free to use in this position. So [ah-oo-strr-ah-loh-peet-he-kwee-uh-FOH-ah] austrralopithekuiyfoa is the desired word and says it all. This new word consists of exactly two djifoa, one very long one derived from austrralopithekui by appending a y (giving austrralopithekuiy-), and one very short one derived from forma by dropping both its medial consonants (giving -foa). 
The doubled continuant /rr/ in austrralopithekui does the same kind of work as inserted /h/ does: it is preventing the /au/ from falling off.
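The shapes discussed in this section are patterns of consonant and vowel letters, and a small Python sketch can compute them. It looks at letter shapes only. It does not check that a CC is a permissible initial, nor that a djifoa has actually been assigned to any primitive, and all the function names are hypothetical.

```python
VOWELS = set("aeiou")
CONSONANTS = set("bcdfghjklmnprstvz")   # assumed inventory of 17 consonants

SHORT_DJIFOA_SHAPES = {"CCV", "CVC", "CVV"}
PRIMITIVE_SHAPES = {"CCVCV", "CVCCV"}


def shape(word: str) -> str:
    """Map each letter to C or V; the hyphen vowel y is kept as 'y'."""
    out = []
    for ch in word.lower():
        if ch in VOWELS:
            out.append("V")
        elif ch in CONSONANTS:
            out.append("C")
        elif ch == "y":
            out.append("y")
        else:
            raise ValueError(f"not a Loglan letter: {ch!r}")
    return "".join(out)


def is_short_djifoa_shape(form: str) -> bool:
    return shape(form) in SHORT_DJIFOA_SHAPES


def is_primitive_shape(word: str) -> bool:
    return shape(word) in PRIMITIVE_SHAPES


def hyphenated_form(primitive: str) -> str:
    """Four-letter non-final form: the final vowel is replaced by y."""
    if not is_primitive_shape(primitive):
        raise ValueError("expected a five-letter primitive")
    return primitive[:-1] + "y"


if __name__ == "__main__":
    for form in ("mre", "fum", "fua"):
        print(form, shape(form))
    print(hyphenated_form("mrenu") + "clika")   # the longer form of mrecli
```

The irregular djifoa of a borrowing is simpler still: as described above, it is the borrowing itself with y appended.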

This is not all of the morphology of predicates, but it is enough to take us a good way into this book. The primitive predicates of Loglan are found in Appendices B and C; their djifoa may be looked up in Appendix D; a short list of scientific borrowings will be found as Appendix E; and a sample of complex predicates may be examined in Appendix F.

2.17 Predicate Joints

We must now consider what can happen at the joints of a complex predicate. Some types of joints between djifoa are disallowed. For example, if a word-maker is planning to join a CVC-shaped djifoa to either a CVV-shaped djifoa (like fua) or a fumna-form primitive, but the C/C joint between them is not a permissible medial pair--the impermissible ones are shown in Table 2.2--then that joint must be either hyphenated or avoided. It is this consideration that puts the y in mekykiu. It would also prevent us from making a word like *hap+balma ("happy-ball"), because the p/b joint, too, is disallowed by Table 2.2. (Even in conditions of low noise, such pairings of an unvoiced sound followed by its voiced companion tend to be unintelligible. The pair reduces to its voiced member, in this case b; and what will be heard is ha balma.¹⁹)

Another type of joint is made when a CVC-form djifoa is joined to a CCV-form djifoa or to a CCV'CV-form primitive. In either case this forms a C/CC-type joint such as those tabled in Table 2.3. All such joints must be carefully checked against the proscribed forms found in that table, for a considerable number of them--nineteen, in fact--have been found to be unintelligible.

Table 2.3 Unintelligibility at the C/CC Joint

The following combinations are unintelligible and should be hyphenated or avoided:

 c/dz  c/vl                           n/dj  n/dz
 d/cm  d/ct  d/ts                     p/dz
 g/ts  g/zb                           s/vl
 j/dj  j/tc  j/ts  j/vr               t/vl
 k/dz                                 v/ts
 m/zb

It is important to note that these consonant triples are forbidden in all Loglan words, not only in complex predicates, and they remain forbidden if an internal syllable break is placed differently. As noted above, a final consonant in a Loglan syllable is forbidden to be initial in such a consonant triple (including one which runs into the following syllable), which prevents them from occurring in any Loglan word at all.

For example, suppose one was making the word for 'understand' in the sense of understanding the meaning of a sign. Suppose one was basing the construction on the metaphor "sign-know", which in Loglan is [SAHN-pah-JAHN-noh] spelled sanpa djano. An attractive pair of djifoa from these two words is san+dja. But there is one problem. Even in conditions of virtually no noise the n/dj joint promptly reduces to dj in the ear of the listener, and therefore it is one of those proscribed by Table 2.3. What would be heard if the n/dj joint were used is [SAH-jah] sadja. Sadja is a legitimately shaped word; but it is not the one intended.

The correct move in this case is to use the saa djifoa of sanpa, and make the complex as [sah-AHD-jah] saadja. We shall consider such problems under word-making in Chapter 6. But the point here is that, as a consequence of using Table 2.3 to check their C/CC-joints, word-makers will sometimes hyphenate these otherwise proscribed joints in their creations. Thus a word like [SAHN-nuh-jah] spelled sanydja could turn up in the language.
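Checking a C/CC joint against Table 2.3 is a lookup, as the following Python sketch shows. The set is transcribed from the table; the function name and its two-argument form are our own. The function only reports a proscribed joint. Choosing between hyphenation and a different djifoa is left to the word-maker.

```python
# The proscribed C/CC joints of Table 2.3, written without the slash.
PROSCRIBED_TRIPLES = frozenset(
    """cdz cvl ndj ndz dcm dct dts pdz gts gzb svl
       jdj jtc jts jvr tvl kdz vts mzb""".split()
)


def joint_is_proscribed(left: str, right: str) -> bool:
    """True if the last letter of left plus the first two letters of right
    form one of the unintelligible triples of Table 2.3."""
    if not left or len(right) < 2:
        raise ValueError("need a non-empty left part and at least two letters on the right")
    return (left[-1] + right[:2]).lower() in PROSCRIBED_TRIPLES


if __name__ == "__main__":
    print(len(PROSCRIBED_TRIPLES))              # size of the transcribed table
    print(joint_is_proscribed("san", "dja"))    # the n/dj joint of the example
```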

Another thing that can happen at the joints of a complex is consonantal hyphenation. This happens when someone has built a complex from two CVV-form djifoa like fua and saa. The preliminary result, ?fuasaa, would have the form CVV+CVV, and so would be a word without a pair of adjacent consonants; and such a word could not be a predicate.²⁰ To turn such constructions into predicates, a consonantal hyphen-like infix must be used. We use the pair of continuants /r n/ for this purpose. The sound /r/ is the primary allomorph of this hyphenating morpheme; it is used whenever the following consonant is not another /r/. The sound /n/ is its secondary and used only when the following consonant is /r/. An example of a word that might be made with this /r n/ hyphen is [BOUGH-rr-mough] spelled baormao. It is composed of bao + r + mao, and bao and mao come from the metaphor bakso madzo, which means 'box-maker'. And baormao now has the consonant-pair that the resolver needs to recognize it as a predicate word.
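The choice between the two allomorphs of the consonantal hyphen can be stated as a one-line rule. The Python sketch below applies it to a pair of CVV-form djifoa. The function name join_cvv is ours, and the third example uses rao, a made-up form included only to exercise the /n/ case.

```python
def join_cvv(first: str, second: str) -> str:
    """Join two CVV-form djifoa with the /r n/ hyphen.

    /r/ is used unless the following consonant is itself r,
    in which case /n/ is used instead.
    """
    if not first or not second:
        raise ValueError("both djifoa must be non-empty")
    hyphen = "n" if second[0].lower() == "r" else "r"
    return first + hyphen + second


if __name__ == "__main__":
    print(join_cvv("bao", "mao"))   # the 'box-maker' example from the text
    print(join_cvv("fua", "saa"))
    print(join_cvv("bao", "rao"))   # 'rao' is hypothetical
```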

A final subtle point is that an initial CVC djifoa which is followed by a consonant which would make a permissible initial pair must be y-hyphenated, unless the rest of the word is a single three-letter djifoa. This is a rule added in the 1990's which replaced an opposite approach taken earlier. It was originally the case that paslinkui (to give the classic example) was a legal complex, and so the borrowing slinkui was forbidden so that pa- would not fall off the front of paslinkui. This was called the "slinkui test". It is now the case that slinkui is a legal borrowing and that the complex must be pasylinkui. CCVV borrowings such as proa are forbidden, however, so teproa is a legal form for a complex.

2.18 Resolving Words

We commenced this chapter with the observation that the word-forms of Loglan are so regular that the boundaries between them can be quickly and easily sensed by a newcomer even if he hears no pauses. We suppose that this feature of the language, which no natural language shares but all approximate, will not only contribute to its usefulness as a laboratory instrument, and perhaps also to its effectiveness as an interface between humans and their machines, but will also make it remarkably easy for adults to learn. Children, note, do not suffer so keenly from the word-boundary problem. Even in learning second languages they tend to learn words one at a time and in known sentence-frames. E.g., 'C'est la plume', 'C'est le chat', and so on. But to adult second-language learners, the way the speech stream of their new language either does or does not sort itself out into words is a crucially important fact about it. Let us now explore the process of determining word-boundaries in Loglan informally.²¹

Suppose you hear the pauseless utterance:

[yoh-dah-pah-KAHM-lah]

Your first job is to translate what you hear into a stream of Loglan phonemes, a task that we predict you will soon be performing swiftly and automatically. Thinking back over what you heard--assuming you pronounced this string of sounds according to the guide--let's assume you are able to translate these sounds into the following phoneme string:

/iodapaKAMla/

Let us suppose further that you have never heard any of these words before. Yet you probably sense that the utterance is composed of three little words (io da pa) followed by a predicate (kamla).

You're right; but how did you know this? Well, the /ioda/-part of the unstressed initial sequence /iodapa/ can be nothing but a pair of little words, or a compound structure word, or two syllables of a longer structure word, for no predicate can begin that way. And if /iodapa/ were part of a name, there would be a final consonant somewhere followed by a pause, and there isn't. As for /KAMla/ we sense intuitively that it is a primitive predicate, that it can't be anything else. For example, if we thought that /KAMla/ were only part of some predicate and that /da/ were its head, making the trial word ?dakamla, we would sense immediately that the /da/ would fall off. Since we can't make anything stick to /KAMla/, kamla must be the word. (Here and in the sequel we will mark trial words with a prefixed '?' whenever the reader cannot yet be expected to know whether they are good Loglan words or not.)

So much is correct, informally. But what if you had heard a pause, for example, after /IOD/?

*/IOD.apaKAMla/

Then you would know that Iod was a name-word; and the rest of the string would resolve just as uniquely as the structure word a followed by pa followed by kamla. And why is not /KAM/ a name and /la/ a structure word? Because again the hypothetical name is not followed by a pause. Io da pa kamla, by the way, means 'Probably X came'. *Iod, a pa kamla, in contrast, is not grammatical and is for that reason starred. But it can be literally translated anyway, and into equally ungrammatical English: *'Yode; and/or came'.
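The name test used twice above (a name ends in a consonant and is followed by a pause) is mechanical enough to sketch. This is a rough illustration, not the resolver: the function name is ours, pauses are assumed to be written as periods as in the phoneme strings above, and the sketch reports only where a name ends, not where it begins.

```python
# Illustrative sketch only; name_endings is a hypothetical helper.
CONSONANTS = set("bcdfghjklmnprstvz")


def name_endings(utterance):
    """Return the pause-delimited chunks that end in a consonant.

    The end of the utterance is treated like a pause, so a
    consonant-final last chunk is reported as well.
    """
    chunks = [chunk for chunk in utterance.split(".") if chunk]
    return [chunk for chunk in chunks if chunk[-1].lower() in CONSONANTS]


print(name_endings("iodapaKAMla"))   # no consonant before a pause
print(name_endings("IOD.apaKAMla"))  # the pause after /IOD/ signals a name
```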

But what about finding word boundaries between polysyllabic predicates? Here is an utterance with one or more long words:

[tah-brah-GIGH-grah-tahr-SEN-see]

Let us suppose you can hear the phonemes correctly, and so hear this:

/tabraGAIgratarSENsi/

First, we note that there are no names. If there were one, there would be a consonant followed by a pause; and there isn't. So we have only structure words and predicates to disentangle. Second, we recall that no structure word can contain a consonant pair and that all predicates do contain at least one pair. We note that we have four CC's in this string (/br gr rs ns/) and that the first one is /br/. Like any consonant-pair /br/ must be part of some predicate or some name; and there are no names. So /br/ is part of some predicate, and because it is a permissible initial (see Table 2.1) it may be the start of one. In fact, since it is the first CC in the utterance, if /br/ is not the start of the first predicate, it must be very near its start; for only a CV-shaped segment may precede it in the predicate. (See Note 17 again.) Third, we recall that every predicate has exactly one stressed syllable, and that that syllable is always the penultimate one. Well, the first stressed syllable in this utterance is /GAI/. Since /GAI/ is later than /br/, it must be the penultimate syllable of the same predicate of which /br/ is, or is near, the start. If /GAI/ is the penultimate syllable of some predicate, then the /gra/ that follows it is the ultimate syllable of that same predicate. So we have found at least one of the word boundaries we seek, namely, the one between /tabraGAIgra/ and /tarSENsi/. Let's write that word-boundary with the usual interverbal space in this partial resolution:

/tabraGAIgra tarSENsi/
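The two observations used so far, locating the consonant-pairs and cutting after the syllable that follows each stressed one, can be sketched as follows. This is a simplified illustration with assumptions of our own: stressed syllables are written in capitals as in the phoneme strings above, the utterance is pauseless and contains no names, and the function names are hypothetical.

```python
# Illustrative sketch only; not the resolver algorithm.
CONSONANTS = set("bcdfghjklmnprstvz")
REGULAR_VOWELS = set("aeiou")  # y is deliberately left out


def consonant_pairs(phonemes):
    """List every pair of adjacent consonants, left to right."""
    s = phonemes.lower()
    return [
        s[i:i + 2]
        for i in range(len(s) - 1)
        if s[i] in CONSONANTS and s[i + 1] in CONSONANTS
    ]


def split_after_stress(phonemes):
    """Cut the stream after the syllable that follows each stressed one."""
    pieces, start, i, n = [], 0, 0, len(phonemes)
    while i < n:
        if not phonemes[i].isupper():
            i += 1
            continue
        while i < n and phonemes[i].isupper():
            i += 1  # the stressed (penultimate) syllable
        while i < n and phonemes[i].lower() not in REGULAR_VOWELS:
            i += 1  # onset of the final syllable (any y is skipped too)
        while i < n and phonemes[i] in REGULAR_VOWELS:
            i += 1  # unstressed vowel or vowel-pair that ends the predicate
        pieces.append(phonemes[start:i])
        start = i
    if start < n:
        pieces.append(phonemes[start:])
    return pieces


print(consonant_pairs("tabraGAIgratarSENsi"))
print(split_after_stress("tabraGAIgratarSENsi"))
```

Each piece returned still has to be examined for little words at its front, which is the step the text turns to next.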

We now notice that /tarSENsi/ is pretty obviously a single predicate. It consists of two CVC-shaped syllables and a final CV-shaped one, and the stressed syllable is where it belongs: penultimate in the word. In fact /tarSENsi/ can be nothing but the predicate tarsensi. We already know that it is made of two djifoa: tar from tarci [TAHR-shee] which means 'star', and the full five-letter primitive form sensi [SEN-see] which means 'science'. So it is a word that means 'star-science' or 'astronomy'. It is obviously a complex and not a borrowing or a primitive.

But what about the sequence to the left of our word-boundary, /tabraGAIgra/? /GAIgra/ could be a predicate because it, too, resolves as a complex; but so could /braGAIgra/. Each is a string of djifoa-shaped triplets. But /GAIgra/ does not contain the other piece of the predicate we have already located, namely the consonant-pair /br/. We know that /br/ must be inside the predicate whose stressed syllable is /GAI/. There is only one stressed syllable in this sequence, and that is /GAI/. So of the two possibilities, the predicate can only be /braGAIgra/.

From an entirely different perspective we can see that /tabraGAIgra/ can't be a predicate. It is not a string of djifoa-shaped triplets and it has no hyphens; so it is not an irregular complex, that is, one made with an irregular djifoa. It is certainly not a primitive. So if it is a predicate at all, it must be a borrowing. But if it is a borrowing, its first syllable /ta/ will "fall off". That is to say, it can resolve as the string Ta bragaigra with Ta as a separate word; and if it can, it will. Evidently *tabragaigra does not have Property 5 of Section 2.16. It does not resolve uniquely as a single word. Since *tabragaigra is not a word, and bragaigra can be, we now have only that one possibility left to consider. So bragaigra must be the predicate we have been looking for. We have found the last word-boundary in this pauseless utterance, the one between Ta and bragaigra. We have resolved the utterance as a three-word string:

Ta bragaigra tarsensi
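The "string of djifoa-shaped triplets" test that separated bragaigra from *tabragaigra can be sketched in a few lines. The sketch rests on assumptions we should state plainly: it takes the three-letter djifoa shapes to be CVC, CCV and CVV, allows a full five-letter primitive (CVCCV or CCVCV) only in final position, and ignores hyphens and the tables of permissible consonant-pairs. The names are ours.

```python
# Illustrative sketch only; shapes are checked, not dictionary membership.
CONSONANTS = set("bcdfghjklmnprstvz")
VOWELS = set("aeiou")
TRIPLET_SHAPES = {"CVC", "CCV", "CVV"}   # assumed three-letter djifoa shapes
PRIMITIVE_SHAPES = {"CVCCV", "CCVCV"}    # fumna-form and mrenu-form


def shape(segment):
    """Map letters to C, V, or ? (for y and anything else)."""
    return "".join(
        "C" if ch in CONSONANTS else "V" if ch in VOWELS else "?"
        for ch in segment
    )


def all_triplets(segment):
    """True if the segment is a non-empty run of djifoa-shaped triplets."""
    return (
        len(segment) > 0
        and len(segment) % 3 == 0
        and all(
            shape(segment[i:i + 3]) in TRIPLET_SHAPES
            for i in range(0, len(segment), 3)
        )
    )


def is_triplet_string(word):
    """True if the word could be an unhyphenated complex by shape alone."""
    if len(word) >= 6 and all_triplets(word):
        return True
    # Otherwise allow triplets followed by one full five-letter primitive.
    return (
        len(word) >= 8
        and shape(word[-5:]) in PRIMITIVE_SHAPES
        and all_triplets(word[:-5])
    )


for trial in ("gaigra", "bragaigra", "tabragaigra", "tarsensi"):
    print(trial, is_triplet_string(trial))
```

Both gaigra and bragaigra pass this test, which is why the text needs the further argument about the consonant-pair /br/ to choose between them.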

But what does it mean? Ta is obviously a structure word. In fact, it is the demonstrative pronoun 'That'. Ta is followed by two predicates. We already know that the second one means 'astronomy'. The first one is composed of three djifoa bra + gai + gra. If we looked up bra in Appendix D, we would find it is derived from [BRAH-nah] brana 'born/born to'; gai is derived from [GAHR-nee] garni '(to) govern/rule' or '(be a) governor/ruler'; and gra is short for [GRAH-dah] grada 'great'. The defining metaphor of this complex is evidently brana garni grada or 'born-ruler-great'. Well; what is a "born ruler"? An hereditary monarch, a king or queen. What is a kingly kind of greatness? Well; possibly majesty or magnificence. Let's guess.

If you guessed that the sentence may be translated as 'That's magnificent astronomy' you would be right.²²

I do not mean to imply by these few examples that learners listening to the flow of Loglan speech will actually go through these lucubrations to find these boundaries out. What I do mean is that it has been one of my research hypotheses in building Loglan that the presence of such deducible regularities in the morphological structure of utterances, as well as in the structure of complex words, will lead to rapid and largely unconscious inferences on the part of listeners--in this case, to inferences about the identity of elements in the speech-flow and about the components of the words themselves--and that this in turn, will enhance learning. This hypothesis remains to be tested by controlled experiment. But the descriptive evidence about the way these matters work in the natural languages, together with the kinds of errors learners do and don't make when listening to spoken Loglan, already gives it a certain plausibility. What we have done, as we will many times do again, is taken a natural tendency of the genus Language and pushed it outward toward some formal limit. What we have yet to discover is whether such deductively discoverable regularities in the speech-flow have any functional bearing on the listener's role in speech.

2.19 Summary

We have learned that Loglan has a phonemic alphabet of 23 letters and that its letters are those of the English alphabet minus q, w, x:

a b c d e f g h i j k l m n o p r s t u v y z

All except six of these letters are pronounced as they usually are in English. The exceptional six are c, which is sounded like [sh]; i, which is sounded like [ee] or [y] depending on context; j, which is sounded like the 'z' in American 'azure' or the French 'j' of 'Jacques' and is written [zh] in the guides; u, which is sounded like [oo] or [w] depending on context; y, which is the slack, short sound of 'a' in 'sofa' and is written [uh] in the guides; and h in syllable-final position, which is pronounced like the "ch" in Scottish English "loch".

y [uh] has a special role in the language: besides being used to spell some borrowed names, it plays the role of a hyphen in complex predicates like mekykiu.

There is some contextual variation. The letter e sounds like [eigh], i like English [y], and u like [w] before vowels; the letter o sounds like [aw] before r and i; and the letter n sounds like [ng] before g and k. We have already commented on the pronunciation of final h. The consonants m n l r have vowel-like values that are spelled [mm nn ll rr] in the guides, but these are used only in names and in some borrowings.

Names unless final are always followed by pauses. Emphatic syllables before predicates are always followed by pauses. Vowel-initial words are always preceded by pauses.

Any syllable of a name or structure word is free to be stressed or not stressed; but if its stress is not penultimate, then the abnormally stressed syllable is marked with a following apostrophe (Paris'). The penultimate syllable of every predicate is always stressed in some degree, but syllables without regular vowels don't count in determining stress; hence /MEKykiu/, not /meKYkiu/. No other syllable of a predicate word is ever stressed.
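The stress rule for predicates can be sketched as follows. The sketch assumes the word has already been divided into syllables (syllabification itself is not attempted here), and the function name and the capital-letter marking of stress are our own conventions for illustration.

```python
# Illustrative sketch only; mark_stress is a hypothetical helper.
REGULAR_VOWELS = set("aeiou")


def mark_stress(syllables):
    """Capitalize the penultimate syllable, skipping syllables with no regular vowel."""
    countable = [
        index
        for index, syllable in enumerate(syllables)
        if any(ch in REGULAR_VOWELS for ch in syllable)
    ]
    if len(countable) < 2:
        raise ValueError("a predicate needs at least two countable syllables")
    stressed = countable[-2]  # penultimate among the countable syllables
    return "".join(
        syllable.upper() if index == stressed else syllable
        for index, syllable in enumerate(syllables)
    )


print(mark_stress(["mek", "y", "kiu"]))    # the hyphen syllable is skipped
print(mark_stress(["bra", "gai", "gra"]))
```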

Predicates are of three types: primitives, which are either of mrenu- or fumna-form; complexes, which, like bragaigra and tarsensi, resolve into unique strings of djifoa; and borrowings, which, like igllu, protoni and athomi, are neither primitive nor complex.

These rules, together with certain restrictions on consonant-pairing and the rules for pronouncing vowel-groups, are sufficient to guarantee that any grammatical string of Loglan sounds, no matter how rapidly spoken, can be uniquely resolved into words.²³

Notes

A brief note on the notes (2017): we will avoid rewriting any notes that JCB wrote. We may add comments to his notes, and we will add additional notes of our own.

1 The exact means by which this was done is given in Chapters 3 and 4, of Brown (1969a). An approximate account is given in Chapter 6 of the present volume.

2 The sound of [h] does not occur in standard Spanish. For example, it is not even listed as a variant of [x] by Wise (1957) in his chapter on "Spanish, Including Mexican". There is, to be sure, a variety of "Californian", or "Northwestern Border Mexican", in which IPA [x] (Loglan [kh]) has disappeared altogether, and has apparently everywhere been replaced by English [h]. In that dialect, I was myself astonished to hear, Spanish 'Jorge' has become [HAWR-heigh], with no trace of the "rough breath" of [x] remaining.

3 A more complete account of Loglan phonology and morphology is planned for in Loglan 6: Formal Structures.

4 JCB originally said here "Note also that [ah eh ee oh oo] are in fact very widely spaced sounds. If a Frenchman pronounces the vowels of matma (Loglan for 'mother') as the French [aa] of 'la' ([ script 'a' ] in IPA), or if an Englishman uses the short 'a' of 'sofa' ([ inverted 'e' ] in IPA) for the final, unstressed vowel of this same word--as both are very likely to do--it hardly matters. For neither of these "free variations" of Loglan a will turn matma into metma, mitma, motma or mutma or into matme, matmi, matmo or matmu; and these are the only confusions with the vowels of matma that can occur in these positions in Loglan" and commented "The addition of schwa ([uh]) as the sound of Loglan hyphen in 1986 weakened this argument but did not quite destroy it. Considerations of stress are still capable of distinguishing an Englishman's matma matma from his hyphenated matmymatma. Thus [MAHT-muh-MAHT-muh] differs from [maht-muh-MAHT-muh] in one clear way. But certainly the argument that this distinction will regularly be drawn in this dialect is now weaker. The only safe thing for English- and German-speaking loglanists to do, in fact, if they want to be certain of being understood by their computers now that [uh] is with us, is to speak Loglan like a Spaniard; for the Spaniard's [MAAT-maa-MAAT-maa] will differ from his [maat-muh-MAAT-maa] in two particulars rather than one." But his comment is too mild: it is important for the English or Russian speaker not to degrade the final unstressed vowels of Loglan to a schwa sound.

5 The sound schwa ([uh]) spelled by the letter y, was originally introduced in Loglan as a consonant-buffer in 1982 (see Brown 1982b). At the time the morph r with three allomorphs r n l was being tried out as the hyphen in complex words. B. Walsh (1983) and R.A. McIvor (1983) both proposed that y replace r as the hyphen as y would require no allomorphs. R.J. LeChevalier (1985) strengthened the case for hyphen y. In 1986 y was officially adopted as the Loglan hyphen and retained as its buffer when certain modifications of the original proposals were made to enable it to play both roles. The dual role of y was described for the first time in Brown (1987).

6 Dr. R.A. McIvor has suggested that "vowel-buffering" may generate another set of Loglan dialects, namely those spoken by loglanists in whose shared native language vowel clusters were rare or difficult. The sounds he suggests as vowel-buffers in such dialects are the two semi-vowels [y] and [w]. They could be used to turn the i- and u-initial VV-series of the standard dialect into definite disyllables. Thus ia, which is standardly [yah], would become [ee-yah]; and ua, which is standardly [wah], would become [oo-wah]; and so on.

MRH notes in 2017 that this note no longer refers to anything in the text, as we have removed the material about buffered dialects. Our belief is that Loglan should have a standard dialect to start with; it does not have the number of interested speakers to allow cultivation of numerous variants.

7 In British English the 'z' of 'azure' is often pronounced as in 'size'. For such speakers the 's' of British 'vision' is a better clue to the sound of Loglan j.

8 Some word-makers feel that only the stressed syllable in these three words deserves the doubled continuant; and they therefore write them as *Rrl, *Mrrtl and *Brrtn. But this is a mistake (and so I have starred them); for all three words have two syllables, and both syllables require the vocalic value of its continuant. To be sure, all the stressed syllables are longer. But the short, unstressed syllables also require vowels. To see this--or rather, to hear it--consider the two English words 'burn' and 'burin'. The first is a monosyllable with just one vocalic consonant, and would be written Brrn (if it were a Loglan name); the second is distinctly disyllabic, and so would be written Brrnn (if it, too, were a name). The only difference between these two natural words is that one has the consonantal, the other, the vocalic allophone of Loglan /n/ in final position. Again this is a difference that is phonemic in English but not in Loglan...except, of course, in just such borrowed names.

MRH adds in 2017: I'm not as convinced as JCB that the correct spelling of "Earl" isn't Rrl; certainly the r is vocalic, but I'm not sure about the l. The 2013 implementation imposed the doubled spelling of continuants: this means that names are often spelled differently than they are in the original sources. Another effect on spelling of names which often involves continuants is that final triple consonants (as in "Holmes", if the l is to be pronounced) are not allowed; where one of the three consonants is a continuant, doubling it (with some effect on the pronunciation) is an often successful fix.

9 The difference between the two sounds represented in IPA by [y] and [Y], both of which occur in German, is too small to differentiate in Loglan. Both 'Muhler' which contains [y] and 'Münster' which contains [Y] may be written with Loglan w.

10 That there is still a category of sometimes-stressed syllables in Loglan morphology reflects the fact that the production of certain classes of Loglan words, principally the compound structure words, has not been observed long enough to furnish us with a clear understanding of the pause- and stress-regularities that will undoubtedly develop in this portion of the Loglan speech system as they have elsewhere. Indeed, perhaps they have already but have not been noticed.

11 Greenhood's and my conjecture (Brown and Greenhood, 1985) that early human speech was song would predict that it is.

12 Possible exceptions to this rule are the names of famous personages which now appear as quasi-predicates in the literature of science or scholarship: 'Marx', 'Freud' and 'Einstein', for example, as in 'Einsteinian relativity'. Do we borrow such names by sight or sound? Do we write Einstein and say [EYN-steyn], or say [IGHN-shtighn] and write Ainctain? This is an open question; there are good reasons in support of each policy. At the moment, among the scholars who have offered opinions on the matter, a slight preponderance seems to favor Einstein over Ainctain...that is, appearance over sound.

MRH adds in 2017: in the 2013 implementation, we take advantage of Steve Rice's useful observation that the so-called "Linnaean" construction is better understood as a universal construction for foreign names, in which foreign spellings can be used. So it is always la Ainctain on the one hand, and lao Einstein on the other.

13 The other closed monosyllable VC, which yields such words as 'of', 'in', and 'at' in English, is at present largely unused in Loglan.

14 With typographically more sophisticated equipment than has been used to set this book, and in handwriting, using an accent mark on the abnormally stressed vowel (as in Spanish), may come to be preferred.

MRH adds in 2017: we are silent in the main text about the fact that the 2013 implementation attaches all indications of stress to the end of the syllable rather than to the end of the vowel segment. The parser is designed to handle explicit stresses written in this way.

15 The original form of this rule was to pair vowels from the left; for it was obvious that the pair-from-the-right rule had some fairly nasty formal consequences. For example, consider the two names (i) Loioioioioioiois and (ii) Loioioioioioioios, the addition of the eighth o to the latter being their only difference. Given pairing from the right, (i) would be pronounced [loy-oy-oy-oy-oy-OY-oyss] while (ii) would be [loh-yoh-yoh-yoh-yoh-yoh-YOH-yohss]. Left-pairing gives the same result for (i) and the not-very-different [loy-oy-oy-oy-oy-oy-OY-ohss] for (ii), and it is all accomplished with no, or little, backtracking. (I am indebted to Dr. Guy L. Steele Jr. for this example.) For this essentially formal reason, pairing from the left was the earliest (1985) form of the pairing rule. But sadly enough empirical considerations soon overturned it. While using the pairing rule in the borrowing process described in Chapter 6, I found that about 90% of the odd length vowel-strings were of length 3; so not much backtracking is required whichever end you start from. More decisively, about 80% of the length-3 strings paired more naturally from the right than from the left. For example, once the Linnaean ending -ea is augmented by /i/, as turned out to be necessary to preserve the distinctions effected by the natural endings of Linnaean words, then the augmented ending -eai is a 3-string which pairs much more naturally as /-Eai/, that is, from the right, than it does from the left, which gives the trisyllable /-eAi/. The first result is pronounced as [-EIGH-igh], the second as [-eigh-AH-ee]; and it is clear that the first is far more reminiscent of the natural ending. Cases of this kind could be multiplied by the hundreds. The outcome of the pairing-rule study is one of the rare cases when a formally superior solution had to be abandoned in favor of a formally weaker one dictated by distributional considerations.

MRH adds in 2017: the rule in the original text on which JCB comments here was to group to the right. The more complex rule that we replaced this with in the 2013 implementation often has this effect, but not always. The rule we use has the formal advantage that it groups from the left again, though with a lookahead of three vowels and it sometimes groups to the right within the initial three vowels.
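The two simple strategies compared in Note 15 can be sketched as follows. This illustrates only plain pairing from the left and plain pairing from the right; it is not the 2013 rule with its three-vowel lookahead, and it only groups letters without deciding whether a given pair is pronounced as one syllable. The function names are ours.

```python
# Illustrative sketch only; these are the two historical strategies, not the 2013 rule.
def pair_from_left(vowels):
    """Group a vowel string into pairs starting at the left end."""
    return [vowels[i:i + 2] for i in range(0, len(vowels), 2)]


def pair_from_right(vowels):
    """Group a vowel string into pairs starting at the right end."""
    lead = len(vowels) % 2  # an odd string leaves one vowel alone at the front
    head = [vowels[:lead]] if lead else []
    return head + [vowels[i:i + 2] for i in range(lead, len(vowels), 2)]


even_string = "oi" * 7        # the fourteen vowels of name (i)
odd_string = "oi" * 7 + "o"   # the fifteen vowels of name (ii)

print(pair_from_left(even_string) == pair_from_right(even_string))
print(pair_from_left(odd_string))
print(pair_from_right(odd_string))
print(pair_from_left("eai"), pair_from_right("eai"))
```

For the even-length string the two strategies agree; for the odd-length one, pairing from the right changes every group, which is the formal objection the note describes.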

16 The sequence /tePROa/ will be heard in Loglan, not as the phrase te *proa with its disallowed word-form CCV'V, but as the word teproa, which resolves as the two-term complex tep+roa. This, in turn, deciphers as tepli rodja or "temple-grow". At the moment this is an unassigned metaphor in Loglan, that is, it is not the deriving metaphor of any Loglan complex. The temple-growers haven't arrived yet. But the point is, they could; and so teproa could become a word tomorrow afternoon.

17 Letting '#' stand for word-boundaries, 'Ø' for a null segment of a predicate word, '|' for a boundary of some segment, '/' as the mark of equally permissible alternatives, 'cc' for an impermissible initial consonant-pair, 'CC' for a permissible initial pair, 'vv' for a vowel-pair pronounced monosyllabically, and '(x)' for the statement that the segment x may occur none or more times, there are three types of predicate words: Type I, in which the stressed syllable comes just before the first C-pair; Type IIa, in which the stressed syllable comes after the first C-pair when it is a cc; and Type IIb, in which the stressed syllable follows the first C-pair when it is a CC. Their formulas are as follows:

I    #  C/.   | (V) | Ø   | Ø     | V'/vv' | CC/cc | (C) | V/vv  #
IIa  #  C/.   | (V) | Vcc | (C/V) | V'/vv' | Ø     | (C) | V/vv  #
IIb  #  CV/Ø  | Ø   | CC  | (C/V) | V'/vv' | Ø     | (C) | V/vv  #

What these formulas say is that the Loglan predicate has a uniform center composed of a stressed vowel or a stressed diphthong V'/vv', and a uniform tail which is the next occurring instance of a V or vv after the stressed center. In a Type I predicate a consonant-pair must by definition occur between the center and the tail; it may be either an impermissible initial cc or a permissible initial CC, hence CC/cc. Additionally, there may be a string of none or more consonants (C) between the CC/cc and the tail. (Iglu is a Type I predicate in which there are no additional C's.) In Type IIa and IIb there may also be none or more consonants (C) between the stressed center and the unstressed tail. The Type I head is composed of the first instance of a consonant or a pause C/. that occurs to the left of the stressed center. The head C/. may be optionally separated from the center by a string of none or more vowels (V). In Type II there must be a consonant-pair of some kind (CC/cc) to the left of the stressed center, and between that pair and the center there may be a string of none or more consonants or vowels (C/V) in any order. In Type IIa the earliest consonant-pair in the predicate is, by definition, an impermissibly initial cc. The cc must therefore be preceded by a V, and the V in turn must be preceded by either a consonant or a pause C/.; so the minimum Type IIa head is .Vcc (e.g., alk- in [ahl-KAHL-ee] alkali, a Type IIa predicate). But between the head C/. and the obligatory Vcc segment of a Type IIa predicate may be a string of none or more vowels (V). (Alkali has none.) In Type IIb predicates the earliest C-pair is, by definition, a permissibly initial CC. The CC is either initial, as it is in the predicate mrenu, or preceded by a CV-pair, as it is in pasnaodei = 'yesterday'; hence CV/Ø. Whether the first CC of the Type IIb predicate has CV or Ø as its prequel is settled by the "Slinkui Test" described in Chapter 6.

MRH: This note is certainly outdated in details. For example, the slinkui test was abandoned in the 1990's. We leave it for historical interest, and we may eventually correct it.

18 'It seems repugnant to me--if not linguistically immoral--to have something composed entirely of affixes.' (Prof. P.D. Seaman, personal communication and jocular.) I agree that 'affix' is not going to be comfortable for most linguists in this context. What I really mean is almost but not quite 'bound form', which would include some stems as well as affixes. But that phrase is both too long, and too technical to be used as frequently as the word that conveys this notion must be used in this book. 'Affix' is the next most accurate word and quite commonly understood to indicate bound forms. The inaccuracy of both technical terms for the current use is that some of the objects I shall be calling "affixes" are free forms, e.g., -sensi in tarsensi. But most Loglan complexes are "composed entirely of bound forms", namely all those made entirely of three-letter segments. In these cases, all segments of the word are (in my sense) affixes because, except in a rather pale semantic sense (in which I suppose the modificand in the defining metaphor could be argued to provide the stem), there is no stem. If Prof. Seaman can agree that many linguistic objects can be, and often are, composed entirely of such "bound forms", then perhaps he (and others) will be kind enough to place that interpretation on my usage of the English word 'affix'. If we were speaking Loglan, I would suggest djifoa ("join-form"), a word which is itself composed entirely of djifoas.

MRH: We have entirely replaced use of "affix" with use of "djifoa" in Loglan grammatical conversation. The name Affix still appears here and there in the computer grammar, alas.

19 The intelligibility studies on which this section and Tables 2.1 -3 are based were reported in Brown (1982b).

MRH adds, 2017: We do not necessarily take claims of pronounceability or intelligibility underlying the tables of permitted and forbidden sound combinations altogether seriously. We do note that natural languages have such tables implicit in them, which often do not correlate with actual difficulties of pronunciation: no English word starts with "vl", though English speakers have no difficulty pronouncing initial "vl". The phonetic rules give Loglan a particular esthetic character, in our view.

20 Words of CVV+CVV form do exist; but they are acronyms: for example, TaiVai = TV. As acronyms, such words are treated grammatically as predicates, but they are not morphologically detectable as such. Morphologically, CVV+CVV-form words are compound little words, and it is as such that the resolver discovers that they are acronyms and feeds them as predicates to the parser.

MRH: Acronyms are names rather than predicates after the 2013 reimplementation. They are still phonetically irregular in being name words which are not consonant-final.

21 A computer-executable algorithm that accomplishes word-resolution in well-formed Loglan utterances will be found in Loglan 6.

MRH: The computer parser developed since 2013 parses from the level of letters (and other characters) upward. It knows how to resolve words. So this aim has been achieved.

22 One of the three emphatic versions of this sentence, the one that would translate 'That's magnificent astronomy!', would have to have a pause between the stressed /TA/ and the rest of the sentence, according to the rule on p.80. This gives the production /TA . braGAIgratarSENsi/, which then resolves, as desired, as Ta bragaigra tarsensi! Without that pause, however, /TAbraGAIgratarSENsi/ becomes something else, namely Tabra gaigra tarsensi!, which is an imperative which I can't translate for you because tabra is not yet a word. This illustrates the importance of the rule on p.80 that stressed syllables that just precede predicates must be separated from those predicates by pauses.

23 The evolution of the morphology described in this chapter was reported in a series of papers over an eight-year period (Brown, J.C. 1979a,b,e, 1980a,b, 1982b, 1983a,b,c, 1987). Works that also contributed to the unfolding morphology were Barton (1978a,b,c), Brown, J.R. (1979), Carter (1981), Chapman (1987), Darwin (1978b, 1979), Johnson, R.W. (1978), Lovatt (1977), McIvor (1980, 1981a,b,c, 1983), Parks-Clifford (1977a,h,i, 1978, 1979, 1980), Parlette (1978), Prothero (1981), Rosenberger (1981) as well as the papers of Walsh and LeChevalier cited in Note 5.

The further development of Loglan morphology since 1989 is discussed in Appendix H of this volume and in MRH's reference grammar.

R1 The letter q was supposed to be pronounced like English "th"; the letter w was supposed to be pronounced like German u-umlaut; the letter x was supposed to be pronounced as the voiced fricative corresponding to g. These sounds were already being purged from the dictionary in the 1990's; in the 2013 implementation, we have removed the last vestiges of these from the dictionary and do not allow them to appear in Loglan except in foreign text. We have also added Loglan names for these letters which do not involve the now forbidden letters themselves, such as haiu, "the letter x". The other names are kaiu for q and vaiu for w.

R2 This was briefly not the case in the 1990's as can be read in Appendix H. For some time there was a ruling that the pauses between components of serial names such as Djan Braon were instances of a different pause phoneme, in order to support the construction of serial names like Djan Blanu, "John the Blue", and distinguish La Djan, blanu, a sentence, from La Djan Blanu, a name. It struck us (and no doubt others) as absurd that a minor feature such as serial names should have a complex systematic effect on the phonology: we fixed this by requiring that predicate components of serial names be introduced by the particle ci, allowing all pauses to be formally equal again: "John the Blue" is now La Djan ci Blanu. Note that, while there is a pause after Djan in the last utterance, it does not have to be marked with a comma.

R3 JCB here says "The pause used in nearly all these intervocalic contexts is a glottal stop. This is the sound--or rather, the brief absence of sound--that replaces intervocalic /t/ in some Northeastern dialects of American English...in Brooklyn [BAH.ll] for 'bottle', for example. (Try pausing briefly instead of saying the /t/ of 'bottle' in order to hear your own glottal stop.) Glottal stops are always represented by "close periods" in the guides, i.e., by periods without spaces around them."; Randall Holmes writing in 2017 disagrees. I (MRH) think that pauses should be actual pauses, however brief, allowing the glottal stop to appear between vowels in disyllabic pairs.

R4 The more complex formulas presented here are not the result of any innovation in the 2013 implementation: I am giving the full description of compound structure words found in Notebook 3, most of which is actually implicit in verbal descriptions later in the section, except for the point that compounds containing VV units contain only VV units, which is not made in the original 1989 text.

R5 JCB says in 1989: "A brief glide may occur between the two vocalic syllables, but not a glottal stop. The latter would cause the resolver to perceive a word-juncture between the two vowels." MRH writing in 2017 respectfully disagrees: we do not regard the glottal stop as an allophone of the pause, and so think that a glottal stop is appropriate as an alternative to a glide in the disyllables.

R6 As part of the 2013 implementation, in order to avoid nasty phonetic problems caused by unguarded acronymic constructions, we made the (semantically very reasonable!) decision that acronyms are names not predicates.

R7 As part of the 2013 implementation, we ruled that name words must resolve into syllables (i.e., must meet minimal standards of pronounceability). Very few names appearing in the Loglan corpus have had to be rewritten for this reason (some with foreign spellings have been re-articled with lao). Many more names have been rewritten because continuants are required to be doubled. Also, Loglan does not allow syllable-final (and so word-final) clusters of more than two consonants: where such terminal clusters occur in names, one of the consonants is often a continuant, and doubling it may fix the problem, as in Marrks, Hollmz.
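
The effect of doubling on word-final clusters in R7 can be illustrated with a rough sketch. It is not the actual parser: it assumes that the continuants are m, n, l and r, that a doubled continuant counts as a vocalic segment, and that the final cluster is simply whatever follows the last vocalic segment. The function names are hypothetical.

```python
import re

# Assumption: a vocalic segment is one of the six vowels or a doubled
# continuant (mm, nn, ll, rr). This is a simplification for illustration.
VOCALIC_SEGMENT = re.compile(r"[aeiouy]|([mnlr])\1")

MAX_FINAL_CONSONANTS = 2  # from R7: no final cluster of more than two


def final_cluster(name: str) -> str:
    """Return the consonants that follow the last vocalic segment."""
    word = name.lower()
    last_end = 0
    for match in VOCALIC_SEGMENT.finditer(word):
        last_end = match.end()
    return word[last_end:]


def final_cluster_ok(name: str) -> bool:
    """Check only the length limit on the word-final cluster."""
    return len(final_cluster(name)) <= MAX_FINAL_CONSONANTS
```

Under these assumptions the undoubled spellings Marks and Holmz end in three consonants (rks, lmz) and fail the check, while Marrks and Hollmz end in ks and mz and pass it.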

R8 The formal definition of the Loglan syllable was an absolute prerequisite for successfully parsing borrowed predicates in Loglan. Complex predicates are parsed by analysis into djifoa, not syllables per se. The details of the initial consonant clusters are in Notebook 3 (including the description of the criteria for three-consonant clusters). The description of the vowel segments is self-evident from the sources; doubling continuants used syllabically makes the work of the parser easier. The fact that two final consonants are allowed is determined by looking at actual borrowings and names. Finally, the rule forbidding a non-continuant final consonant from being followed by a continuant final consonant is nowhere attested but appears self-evident: such a pair of final consonants would be pronounced as another syllable. The universal application of the forbidden consonant triples (independent of where syllable breaks actually fall) seemed an inevitable choice. Our intention in any case is that though there may be options in placing syllable breaks between consonants, a change of syllable break between consonants is never phonemic (it will not change one word to another); as we have already seen, syllable breaks between vowels are phonemic. It is amusing to note that the maximum size of a consonant cluster at a syllable juncture in Loglan is five: two final consonants in a preceding syllable followed by an initial three-consonant cluster in a succeeding syllable.

In addition to observing that the syllable, though always an important concept in the Loglan definition, was never actually defined, we note that some of JCB's choices of syllable break in transcriptions in Notebook 3 are rather odd.
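
Two of the points in R8 are simple enough to state as a sketch: the rule that a non-continuant final consonant may not be followed by a continuant final consonant, and the arithmetic behind the maximum cluster at a syllable juncture. The set of continuants (m, n, l, r) is an assumption here, the function names are hypothetical, and none of the other constraints on consonant pairs or triples is checked.

```python
# Assumption: the continuants are m, n, l and r.
CONTINUANTS = frozenset("mnlr")

MAX_FINAL_CONSONANTS = 2  # at most two consonants may close a syllable
MAX_INITIAL_CLUSTER = 3   # at most three consonants may open a syllable


def final_pair_allowed(pair: str) -> bool:
    """Apply only the R8 rule: non-continuant + continuant is forbidden."""
    if len(pair) != 2:
        raise ValueError("expected exactly two consonants")
    first, second = pair.lower()
    return not (first not in CONTINUANTS and second in CONTINUANTS)


def max_junction_cluster() -> int:
    """Largest run of consonants across a syllable juncture."""
    return MAX_FINAL_CONSONANTS + MAX_INITIAL_CLUSTER
```

For example, a final pair such as tl is rejected (it would be heard as an extra syllable), while lt, ks and mz pass this one check, and the juncture maximum comes out as 2 + 3 = 5.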

R9 This paragraph is new in the 2017 revision. Parsing borrowings required automated recognition of stressed or stressable syllables: this was made much easier by forbidding vowel pairs with mandatory stresses in these words, and this actually corrected only one word: alkooli, "alcohol", was revised to alkoholi, arguably an improvement! The use of the stress-causing doubled vowels in complex predicates is entrenched in the vocabulary, and did occasion difficulties for the parser writer! The provisos about syllables with continuants in them are evident from the actual function of continuants as "glue" in borrowings, and I believe that a remark forbidding initial syllables of this kind already appears in Notebook 3. The rules for syllable breaks in borrowings preventing an illegal complex from being converted into a legal borrowing are technically interesting but really only a matter of orthography.

R10 The reform of irregular djifoa was undertaken in the 1990's and is reported in Appendix H. The original 1989 text describes a different construction of irregular djifoa which led to difficulties. It is interesting to note that for one reason and another JCB's syllable breaks in his transcriptions in this example had to be extensively revised to accord with the current parser.

R11 It appears that the only situation in which h could appear in obligatory syllable-final position is in names. All combinations hC are forbidden, so an h which was syllable-final but not word-final could appear only at the end of a syllable not followed by a regular vowel. If h appears in final position in a y-hyphenated CVC djifoa, as in rihygru (a corrected version of a predicate in Leith's novel), this can be articulated into syllables as rih-y-gru, expressing the new allophone of h, but also as ri-hy-gru, avoiding it.

R12 Two initial combinations were omitted from the 1989 list which actually occur in the dictionary. The initials sv and zl were reinstated. The first occurred in the word for "Swedish", which we could have given up; the second however occurred in the composite primitive for "lip", not something we wanted to renegotiate. Admitting new initials is not a cost-free move: because of the way the slinkui test was removed, it may cause y-hyphens to be required to be inserted in certain complex predicates.

R13 VccV predicates such as iglu, where the cc is a permissible initial pair, were eliminated in the 1990's. Thus JCB's favorite short word example became igllu. I doubled the l in the transcription, too.

R14 The rule implemented in the current parser is that any stressed syllable (whether emphatically stressed or otherwise) preceding a predicate must have an intervening pause. If the initial syllable of the predicate is unstressed (as it will be if the predicate has three or more syllables), it can be grabbed by a normally stressed syllable. JCB alludes to this rule in other places without the surprising qualification that the stress is emphatic.

