The Loglan PEG Grammar

This document and other related Loglan materials are intellectual property of the Loglan Institute, a Florida corporation which still exists, but the Institute freely permits and encourages the use of this and other Loglan materials for noncommercial purposes.

In this file I will develop the entire Loglan grammar on top of the phonetic proposal PEG notation.

A PEG (Parsing Expression Grammar) is made up of lines of the form class_name <- PEG notation

Each PEG notation describes a set of strings with conditions on the context in which they occur.

Concrete strings: 'string' or "string" denotes literally the string between the quotes (in this example, the six characters of the word string).

Classes of characters: [aeiou] describes the set of one-character strings which are either a, e, i, o, or u. Ranges can appear: [a-zA-Z] describes the union of the sets of lower case letters and upper case letters, considered as one-character strings.

If A and B are PEG notations, (A B) denotes a string of class A followed by a string of class B (in which the string of class A is the preferred string of this class read from the beginning of the source string).

If A and B are PEG notations, (A / B) denotes either a string of class A or a string of class B, with a string of class A being read by preference if possible. The fact that a preference is indicated in alternative lists makes PEG reading deterministic (in a sense, there are no ambiguities for a PEG grammar). The problem in a PEG corresponding to ambiguity in a BNF grammar is incorrectly ordered lists of alternatives.
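The effect of ordering alternatives can be seen in a small Python sketch. This is only an illustration under assumed names (lit, seq, choice, and end_of_text are hypothetical helpers, not part of the Loglan PEG engine): each parser takes a text and a position and returns the position after the match, or None on failure.

```python
# Hypothetical helpers for illustration; not the Loglan PEG engine.

def lit(s):
    """Match the literal string s at the current position."""
    return lambda text, pos: pos + len(s) if text.startswith(s, pos) else None

def seq(*parsers):
    """(A B): read each parser in turn; fail if any of them fails."""
    def parse(text, pos):
        for p in parsers:
            pos = p(text, pos)
            if pos is None:
                return None
        return pos
    return parse

def choice(*parsers):
    """(A / B): commit to the first alternative that succeeds."""
    def parse(text, pos):
        for p in parsers:
            end = p(text, pos)
            if end is not None:
                return end
        return None
    return parse

def end_of_text(text, pos):
    """Succeed only when nothing is left to read."""
    return pos if pos == len(text) else None

# 'a' is tried first and succeeds, so 'ab' is never tried.
badly_ordered = seq(choice(lit("a"), lit("ab")), end_of_text)
# The longer alternative comes first, so both inputs can be read.
well_ordered = seq(choice(lit("ab"), lit("a")), end_of_text)

print(badly_ordered("ab", 0))  # None
print(well_ordered("ab", 0))   # 2
print(well_ordered("a", 0))    # 1
```

In the badly ordered version the first alternative reads only the initial a, and once the choice has succeeded it is never revisited, so the whole notation fails on ab.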

If A is a PEG notation, (A)? represents a string of class A (preferred) or an empty string if there is no string of class A: this represents optional appearance of A. (A)* represents zero or more consecutive strings of class A (as many as possible) and (A)+ represents one or more consecutive such strings.
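A minimal Python sketch of the three repetition operators, again with assumed names (cls, opt, star, and plus are hypothetical helpers written for this illustration). A parser takes a text and a position and returns the position after the match, or None on failure.

```python
# Hypothetical helpers for illustration; not the Loglan PEG engine.

def cls(chars):
    """A character class such as [aeiou]: match one character from chars."""
    def parse(text, pos):
        if pos < len(text) and text[pos] in chars:
            return pos + 1
        return None
    return parse

def opt(p):
    """(A)?: read a string of class A if there is one, else read nothing."""
    def parse(text, pos):
        end = p(text, pos)
        return pos if end is None else end
    return parse

def star(p):
    """(A)*: read as many consecutive strings of class A as possible."""
    def parse(text, pos):
        while True:
            end = p(text, pos)
            # Stop on failure, or on an empty match (which would loop forever).
            if end is None or end == pos:
                return pos
            pos = end
    return parse

def plus(p):
    """(A)+: one string of class A, then as many more as possible."""
    def parse(text, pos):
        first = p(text, pos)
        return None if first is None else star(p)(text, first)
    return parse

vowel = cls("aeiou")
print(star(vowel)("aeixo", 0))  # 3
print(star(vowel)("xyz", 0))    # 0
print(plus(vowel)("xyz", 0))    # None
print(opt(vowel)("x", 0))       # 0
```

Because (A)* takes as many strings as possible and never gives any back, a notation such as ((A)* A) can never succeed.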

If A is a PEG notation, &(A) represents a length 0 string which is followed by a string of class A, and !(A) represents a length 0 string which is not followed by a string of class A. This gives us powerful lookahead features: for example, (!(A) B) represents a string of class B whose beginning is not also the beginning of a string of class A: it is tempting but not accurate to say that it does not have an initial segment of class A, because detection of a string of class A longer than the string of class B read would cause reading of this class to fail.

The period . represents the class of single characters (so !. is end of text).
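The lookahead operators and the end-of-text test can be sketched in Python as follows. The helper names (lit, seq, and_pred, not_pred, any_char) are assumptions made for this illustration and are not taken from the Loglan PEG engine. The example takes A to be 'abc' and B to be 'ab', which shows the point made above: the string ab read by B has no initial segment of class A, yet (!(A) B) still fails when a longer string of class A starts at the same place.

```python
# Hypothetical helpers for illustration; not the Loglan PEG engine.

def lit(s):
    """Match the literal string s at the current position."""
    return lambda text, pos: pos + len(s) if text.startswith(s, pos) else None

def seq(*parsers):
    """(A B): read each parser in turn; fail if any of them fails."""
    def parse(text, pos):
        for p in parsers:
            pos = p(text, pos)
            if pos is None:
                return None
        return pos
    return parse

def and_pred(p):
    """&(A): succeed without reading anything if A can be read here."""
    return lambda text, pos: pos if p(text, pos) is not None else None

def not_pred(p):
    """!(A): succeed without reading anything if A cannot be read here."""
    return lambda text, pos: pos if p(text, pos) is None else None

def any_char(text, pos):
    """The period: read any single character."""
    return pos + 1 if pos < len(text) else None

end_of_text = not_pred(any_char)  # the notation !.

A = lit("abc")
B = lit("ab")
not_a_then_b = seq(not_pred(A), B)

print(not_a_then_b("abd", 0))  # 2
print(not_a_then_b("abc", 0))  # None
print(end_of_text("ab", 2))    # 2
print(end_of_text("ab", 1))    # None
```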

New notations are introduced by lines

class_name <- PEG notation

This is not just an abbreviation facility, because such definitions may be mutually recursive.

A PEG notation applied to a source string will give either failure or a uniquely determined initial string of the source (parsed suitably); in a sense PEG is unambiguous. What corresponds as an issue to ambiguity for a BNF grammar is inappropriate choice of order of alternatives in PEG disjunctions (A / B): what often represents a problem with a grammar is what I call "preemption", where an earlier alternative reads an initial segment of a string where a later alternative could have read more of it.

It's possible to have a PEG go into an infinite loop and fail to produce a parse. My PEG generator has a termination checker, so the Loglan grammar does not have these problems. I have contemplated writing a preemption checker, but this is a rather difficult problem.

Dated updates now to appear here

1/1/2024: The rule for left boundary defense of names is now that after any name marker the orthography must show exactly where the next actual pause is. This is supported by addition of ~ (tilde), an allophone of whitespace (so it ends words) which must be followed immediately by a consonant and which is not expressed as a pause: this makes it easy to indicate blocks of words to be pronounced without a break. Also cleaned up uses of [,] in favor of named comma classes. There may be more of this to do.

11/30/2024: refinement of the correction of the 29th.

11/29/2024: Use of gio is now optional, except in statements to be converted by hue into inverse vocatives.

2/22/2022 Eliminated all use of ICI and ICA as binary connectives between non-sentences. Non-sentence utterances starting with ICA connectives remain and seem to be adequate for all actual uses along these lines.

2/20/2022 In addition, pulled the tightly binding ICI connectives into sentence instead of utterance structure. I doubt there is any practice in using them in their former role, so they will be experienced as a new feature.

2/19/2022 Major restructuring of sentences. Sentences with head terms with gi and goi (uttAx) are moved into class sen1, so they can be afterthought logically connected like any other sentences. Note that the fact this couldn't be done means that Loglan never really had OSV sentences in any practical sense. The real reason I need this is convenient isomorphism between Loglan and standard logical notation. This means that class uttA0 is no longer needed (removed in its one application in favor of sen1).

experimentally allowing terms before keksents, which are in effect headterms with [gio].

2/15/2022 first public release with systematic renaming.

work order 2/10/2022 no action taken yet. To go with [pau], ago, add [fau] from now and [vau] distance away to class PA00. [cioru] is rejected for cause.

2/9/2022 added the [bao] lambda quantifier for constructing abstract relations between more than two objects

2/6/2022 removed kue from NI0 and added it (allowing digit suffixes) to NU0.

2/5/2022 fixed problem with CI as name marker created in literal character cleanup; allowed [pi] to be prefixed to anything of class NI

2/2/2022 Large project: renaming of classes in the grammar. Identify ones which ought to be known to speakers (targets for having associated Loglan predicates). Establish standard suffixes for phonemes, morphemes, lexemes. About done, 2/12/2022.

2/2/2022 no modification yet. Small project: create class NUJI with nuji and nuja, and correct grammar so NUJI takes only arguments. At present, nuja is not supported and nuji can take predicates and modifiers. Later: class NUJI installed and properly restricted: nujiza and the like are not for the moment supported.

1/31/2022 removes a lot of use of literal characters by defining rules sp and stress2. Corrects some minor errors in literal character lists.

a note, not reflecting a modification. I'm wondering whether the pause required in [fo tonira] ([fotonira] means something quite different) should be a mandatory comma pause. It looks as if it might not be hard to implement.

No modification as yet: I am thinking of banning JIO followed by an imperative. JI should be used.

fixed a bug which broke the end of speech marker "#" #p$

1/29/2022 Created wrapper classes so that parses which mix letters and grammar classes are avoided.

1/29/2022 Also fixed a small bug in class headterms. Fixed an apparent bug in class NameWord ([hue] was omitted) which has never caused any identifiable trouble.

1/26/2022 complete elimination of the alternative parser and all constituent rules (including the word [gaa]) as I have abandoned this idea.

a note: there is a problem with interaction of quoted forms with alien text operators.

1/23/2022 made imperative important and cloned sentence and uttAx in versions which will not be marked as imperative if they lack subjects. Considering simply forbidding jio clauses to contain no subject sentences, but this is not implemented (ji should be used). Fixing the restriction on as a name marker to allow it to be followed by whitespace and a name.

1/22/2022: provisionally removing the requirement that the terms before the predicate in an SVO sentence contain no more than one untagged argument.

The option of using the particle [gio] before any sutori untagged arguments before the predicate remains.

1/21/2022 Starting a literate programming exercise: turn this document into HTML while preserving its performance as a PEG grammar. Also note that the alternative version is now turned off. The only component present is [gaa] and I do not see a reason for anyone to use it. The alternative parser is readily turned back on by changing the line statement1x. This version labels the default stressed syllable in a predicate in the PhoneticComplex parse.

a serious problem with ICA, an actual ambiguity which has existed since the beginning of the language, hopefully fixed: the fix is that an apparent ICA initial utterance which could without the period be a continuation of a sentence is read as such. The important point is that there is no audible difference between comma followed by ICA and period followed by ICA: we solve the problem by reading the latter as the former where possible.

11/24/2021 KIA, the one "word" deletion operator, is installed. What it actually does is a bit subtle.

2/4/2021 Imposed the rule that two final consonants cannot be consonants from voiced/unvoiced pairs with different voice. Also forbid second final consonant to be h.

I have further fine-tuning of djifoa gluing in mind.

Allow the -r glue to be expressed as -rr after all mandatory monosyllables, removing the annoying pronunciation problem?

I was thinking of allowing -hy gluing in other contexts, but it is actually a bad idea.

9/15/2019 installed semantic case tags with order distinctions for use with predicates with more than one argument of the same case.

one solution is beucine, beucito... another is beuzi, beuza, beuzu.

4/28/2019 Various debugging of the new predicate algorithm. Added CVVhy as a glued form for CVV djifoa.

added capitalization of djifoa glue! Confirming my apparent earlier decision that a CVV(h)y djifoa must be followed by a full predicate complex.

4/26/2019: this incorporates various revisions to the phonetics, correcting errors or clarifying rules, motivated by my development of the phonetics section of a new grammar document. The one notable change is that [ci] is now only a name marker if followed by an explicit pause. This only requires changes in writing in serial names. In speech, it is recommended that one not pause after [ci] except before a name word. The benefit is that non-serial-name related uses of [ci] no longer threaten mysterious needs to add explicit pauses before following name words.

I want to add the [zao] proposal of John Cowan. Done, 4/15/2019. The imperative pronoun [koo] has been added though not officially. I should also add [dao] for the dummy argument, but not today (it is in as of 4/18).

4/25 Making note of the idea that [ci] should not be a name marker unless followed by a pause. This would require that one pause before ci-marked names and it would remove some very confusing corrections for the false name marker problem. If we required the pause to be explicit we would be imposing the expectation that whitespace after [ci] is not a pause. Otherwise we could encourage writing a juncture after [ci] to deny presence of a pause, which is reasonable considering the meanings of [ci]. I am implementing the version with explicit pauses between [ci] and names and the directive not to pause after [ci] without explicit indication. This solution involves rewriting existing text only in the rare instances where [ci] precedes a name.

4/25/2019 Corrected some instances of (expanded) badstress. Now forbidding (C)VVVV initial predicates. Probably I should use class badstress systematically in defining cmapua.

4/24/2019 Final consonants in syllables cannot be followed by syllabic continuants.

this rationalizes the definition of SyllableA.

4/22 I am thinking of explicitly flagging imperative sentences; not changing

the grammar but making this visible in the parse. This might also have some

effects on logical connections. 4/23 created an imperative class for atomic

imperative sentences; this has no actual effect on parses, just

organizes them in a more enlightening way.

4/17-18 2019: updates commented out which make sentpred linkable with forethought

and afterthought connectives (making some uses of [guu] to share arguments

unnecessary). There are subtleties. Basically, untensed predicates without

argument lists will be linked by A and KA series connectives. Such a linked

set can be tensed as a whole. Such a linked set will share a following termset.

This will probably change many parses in the Visit and other legacy sources.

This required some really subtle adjustments to work right, divinable from

the actual rules given. Definitely experimental.

3/9/2019 further, extended LIU1 to handle [ainoi] and its kin

(actual mod is to class Cmapua) Further, fixing mismatch

between connective and A classes. One does now have to pause

before [ha] and its compounds.

3/9/2019 repaired bugs in negative attitudinals. A pause in a negative attitudinal of the [no, ui] form will not break it. [ainoi] didn't work for two reasons: the clauses in the definition of NOUI were in the wrong order, and the connective class mistakenly included [noi] so the phonetics checker was crashing! I had to move N and NOI earlier to make this work. Not yet installed in the other version.

1/26/2019 added [vie], JCB's "objective subjunctive" as a PA

class word. I should add this to the other file as well.

12/22/18: just a comment: one does not have to pause before [ha] and its compounds.

I do not know whether to fix this. One did not have to in LIP either. For the moment I will

leave it as it is. As a matter of style, one probably should pause.

10/6/18 minor adjustments, made only in this file. Allow [sujo] (a wicked thing to say). Do not

allow [futo]: suffixed conversion operators must be nu + suffix.

6/2 fixed LIO + alien text. I also fixed some other glitches described in the reference grammar.

5/11 making version without "alternative parser" features. This version allows GAA but it doesn't

do anything: the definitions of argumentA and kin are the only point of difference. Master version:

becomes "alternative" by reinstating alternative definitions of argumentA and kin. Further, made changes

recommended in the reference grammar. ALTERNATIVE -- this is actually my master version. Edit

this and revise the argumentA and kin entries to make the original version.

4/24 discovered and repaired a bug re ci-marked names suffixed to descriptions. Discovered a bug in numerical

descriptions yet to be fixed: [lio] needs to be an alien text marker, maybe taking double quotes. The description-

with-suffixed-name bug was actually quite gruesome. I think it is repaired.

4/23 streamlined definition of descriptn. Shouldn't change anything. It was remarkably tricky though; preserving the old form

in case of further trouble.

4/22 I think this will be the master grammar file, with alternative lines to turn off the

GAA-related features. (1/21/2022, they are now turned off)

4/22 allowing general predicates in gasent1. This removes an extreme oddity in parsing of imperatives.

I do not see any new dangers from this.

4/22 I changed the final element of a keksent to be a sentence (new class uttA0), not a general sentence fragment.

several parse errors in the Visit were uncovered by this.

4/22: note that I still have the obligation to restore the [zao] construction.

4/9/2018 the large subject marker GAA can also be used to defend the beginnings of gasents and imperatives

from absorbing trailing arguments into an unintended statement. In this context [gaa] may be followed by [ga] ;-)

4/8/2018 this is an alternative version in which an argument which starts an SVO sentence will not be accepted

as a trailing argument of a previous sentence. This allows neat termination of [lepo] clauses preceding

a subject, for example. Unlike the previous alternative approach, this seems to involve a single fairly

tidy change: it is all an issue of avoiding needs for explicit closure. Further refinement: SVO sentences

can be marked with GAA (which is not a tense: it appears optionally just before the predicate, or just

before sutori arguments marked with GIO if there are any), the "large subject marker": an argument which

starts an SVO sentence *not marked with GAA* will not be accepted as a trailing argument of a previous

sentence. This is a sufficiently complex grammar change that it requires thought: it is not conservative

in my usual sense. The fact that GAA carries a mandatory stress is virtuous. Its resemblance to the

particle GA when used as a tense is not a bad thing: it would often be used instead of GA to close

a [lepo] clause appearing as a subject, and it is perhaps better for that purpose. Note that GAA can

and often will be followed by a tense. This grammar change depends strongly on the previous ruling that the O in

SOV(O) sentences must be marked with [gio]: S gio O^n V (O^m).

nuu is an atomic A core and there is no nu-affix to A connectives and their kin

1/20/2018 redefined CA cores to include a possible NU prefix. This allows more logically connected tenses, for example.

1/13/2018 reorganized the internals of class PA in a way which should allow more things and not forbid anything legal now.

this is pursuant on an analysis of the classes NI and PA as phrases, rather than words, as I start writing a global lexicography

proposal document. Enforced explicit pauses after PA phrases appearing as arguments with a following modifier with an argument.

12/30/2017 fixed a problem with name markers in the class NameWord and made a slight change to the new option in NI (names as dimensions).

12/27/2017 installing an alternative treatment of acronyms under which they are simply names (suffix -n to acronyms in all uses).

supporting this requires no change at all to acronymic name usage (just use the -n versions with the usual rules for names),

and for dimension usage requires [mue] to be a name marker and support for [mue] PreName as an alternative suffix to NI.

12/27/2017 Frivolously fooling with the capitalization conventions. They ought to work better now...but I could have broken something.

the main new idea was to require that a capitalized embedded letteral actually be followed by lowercase if it was preceded by lowercase

(with the obvious exception for a letteral followed by a letteral). Also changed the rules for diphthongs in cmapua to make all-caps

legal for cmapua. The general idea is that one can start with a capital letter and stay capitalized until one hits a lower case letter,

at which point one can jump back up to caps only at a juncture (after which you can remain capitalized) or temporarily for a vowel

after z- (after which lower case resumes) or an embedded literal (after which lowercase resumes). The total effect is that this allows

attested capitalization patterns in Loglan (including capitalization of embedded literals as in possessive articles and acronyms)

and also allows all-caps for individual words (attested in Leith but suppressed in my version) and supports capitalization of components

of names as in [la Beibi-Djein] (by artful use of syllable breaks: Leith just has BeibiDjein, which does not work for me).

12/26/2017 Installed [niu] (quotation of phonetically legal but so far non-Loglan words). I did not make [niu] a name marker, so if one were to

use it with names (where it isn't really appropriate), one would have to pause initially: [niu, Djan].

I note in this connection that quotation of names with li...lu remains limited, since names by themselves are not

utterances: one needs the [la]. I fixed this as an exception in the previous parser; I may do it here or I may

not, haven't decided. Single name words can be quoted with [liu], of course, but not serial names.

12/24/2017 Refined treatment of vowel pairs for Cvv-V cmapua units. First 12/24 version rather disastrously

broken: this should be fixed!

12/23/2017 This is now completely commented, with minor local exceptions to which I will return later.

This document is the basis on which I will build all subsequent parsers, with due modifications to the comments.

The Python PEG engine and preamble files contain commands for constructing a Python parser from it directly.

12/22/2017 major progress on commenting the grammar

yet later 12/20: no change in performance of the grammar, extensive commenting in the

grammar section. Considerable changes in arrangement: for example, vocatives, inverse vocatives,

and free modifiers are moved to a much earlier point. I'm hoping to get a genuinely almost readable

commented grammar...

later 12/20 starting the process of commenting and editing the grammar, starting

at basic sentence structures. Notably rewrote the class [keksent] more compactly,

one hopes with no actual effect on parses.

12/20/2017 Do not require expression of pause after finally stressed cmapua before

vowel initial predicate as a comma, since the initial vowel signals the pause anyway.

Allow final stress in names. Fixed bug in CVVHiddenStress. Prevented

broken monosyllables in finally stressed CVV djifoa. refinement of caprule

12/19/2017 seem to have had a versioning failure and lost the fix which requires

CVVy djifoa to be followed by complete complexes. Restored.

12/18/2017 fixed a bug in treatment of stressed syllables in recognizing predicate starts. Also

narrowed the generalized VCCV rule to allow more of the quite unlikely space of predicates with lots

of vowels before the CC pair. Probably they should be banned (and none have ever been proposed with

more than three) but that rule is not the context in which to arbitrarily ban half of them. Some cleanup

of the display of parses, for which updated version of logicpreamble.py should also be uploaded. A refinement

to class "connective" checking that apparent logical connectives are not initial segments of predicates.

This has the effect of delaying the declaration of "connective" until after the declaration of

"predstart".

12/17/2017 further refinement of the 12/16 version: a couple of bugs spotted.

12/16/2017 There should be no change in parsing behavior, but the predstart ruleset is shorter and more intelligible, and I realized that Complex doesn't need a check for the anti-slinkui test (the requirement that certain initial CVC cmapua be y hyphenated which replaces the slinkui test) at all: the way predstart works already ensures that initial CV cmapua fall off in the excluded cases, the idea being that we test the front of a predicate without lookahead in all cases. Also addressed the subtle point that one wasn't forced to pause after a predicate before following y (not likely to arise as a problem).

12/14/2017 Corrected vowel grouping to avoid paradoxical vowel triples which are default

grouped in a way which becomes illegal if made explicit. SyllableA really should contain a final

consonant: the previous form was messing up vowel grouping. Serious bug where end of djifoa

and syllable resolution of a predicate may fail to agree. I think I blocked this by ensuring that

final djifoa are not followed by vowels. Other fine tuning of the complex algorithm. Also had

to repair the check for CVCCCV and CVCCVV predicates.

12/13/2017: added kie ( utterance ) kiu to class LiQuote. Did fine tuning to ensure that cmapua streams stop before [li] or [kie], that names can stop at double quotes or close parentheses, and that the capitalization rule ignores opening parentheses as well as double quotes. One can now adorn li lu with quotes (on the inside) in a reasonable way and adorn kie kiu with parentheses (on the inside) in a reasonable way. One cannot *replace* these words (or any words) with punctuation in my model of Loglan. Also, updates to comments, and (end of utterance) added as a marker of terminal punctuation.

Comments on the initial release of this grammar

This is now done, in a first pass. That is, the grammar is adapted and appears to work, more or less.

What is needed is comments on the lexicography and the grammar...Phonetics has now pretty clearly been sorted

from the grammar (there are some places where the phonetics accept grammar information with regard to punctuation).

Alien text is now handled somewhat differently. Some issues to do with quoting names are not finalized and have not been tested.

I added -iy and -uy as VV forms allowed in general in cmapua but not in other words; they are always monosyllabic. What this

immediately allows me to do is to give Y a name which is not phonetically irregular! [ziy] is supported: [yfi] is too, now.

capitalization is roughly back to where it was in the original, but all-caps are allowed.

acronyms are liable to be horrible.

Fixed the recursion problem in a way which will not be visible in ordinary parses. Streams of cmapua will always be broken at name or alien text markers (instead of using lookahead to check that we do not stand at the beginning of a name word or alien text word). The next cycle will then check for a name or alien text, and also check for badnamemarkers; no lookahead is happening while a stream of cmapua is being read except checking for the markers of names and alien text. This will change the way phonetic parses look (streams of cmapua will break (and sometimes resume) at name markers or alien text markers), but it will not change any grammatical parses.

Rule name conventions

These should now be enforced.

We define a way to sort rule names into layers, which an automated tool ought to be able to use.

Phonemes: a rule name with one or two lower case letters optionally followed by digits.

Intermediate phonetic and morphological groups: a rule name whose first two characters are an uppercase and a lowercase letter. This includes both classes of phonemes and classes of strings of phonemes.

Lexemes (words, take this with a grain of salt): names made entirely of uppercase letters. Note that names and alien text constructions will tend to be treated as single words including their markers.

Lexeme-like: Names made of more than one uppercase followed by lowercase letters are of lexeme-like things (name and alien text markers).

Lexeme precursors: Names made of uppercase letters followed by digits are precursors of lexeme classes. Cmapua components which are true affixes will be labelled with such a class and not with a lexeme class. The point here is that some of the lexeme classes have internal grammar.

Grammar classes: Names starting with at least three lower case letters (anything may follow) are grammar classes. Ones the speaker should be aware of will not usually be followed by digits.

Part I Phonetics
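Before the phonetics proper, a brief aside on the rule name conventions listed above: since they are meant to be usable by an automated tool, here is a rough Python sketch of such a sorter. The function name layer, the layer labels, and the exact regular expressions are my own reading of the conventions, not part of the grammar or of the Python PEG engine. In particular, the pattern for lexeme-like names (two or more uppercase letters followed by lowercase letters) is an assumption about what "more than one uppercase followed by lowercase letters" means.

```python
import re

# Patterns are tried in order; the first full match decides the layer.
# These regular expressions are an assumed reading of the conventions.
LAYERS = [
    ("phoneme", re.compile(r"[a-z]{1,2}[0-9]*")),
    ("grammar class", re.compile(r"[a-z]{3}.*")),
    ("lexeme", re.compile(r"[A-Z]+")),
    ("lexeme precursor", re.compile(r"[A-Z]+[0-9]+")),
    ("lexeme-like", re.compile(r"[A-Z]{2,}[a-z]+")),
    ("intermediate", re.compile(r"[A-Z][a-z].*")),
]

def layer(name):
    """Return the layer label for a rule name, or 'unclassified'."""
    for label, pattern in LAYERS:
        if pattern.fullmatch(name):
            return label
    return "unclassified"

for name in ["sp2", "Vo1", "NUJI", "PA00", "sen1", "uttA0"]:
    print(name, "->", layer(name))
```

Rule names taken from this document sort as expected under this reading: sp2 is a phoneme, Vo1 an intermediate group, NUJI a lexeme, PA00 a lexeme precursor, and sen1 and uttA0 grammar classes.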

Mod bugs, I have implemented all of Loglan phonetics as described in my proposal. Borrowing djifoa are pretty tricky.

I have now parsed all the words in the dictionary, and all single words of appropriate classes parse successfully.

I have added alien text and quotation constructions which do not conform to these rules; so actually

all Loglan text should parse, mod some punctuation and capitalization issues. The conventions for

alien text here are not the same as those in the current provisional parser.

I believe the conventions for forcing comma pauses before vowel initial cmapua and after names

except in special contexts have been enforced. In a full grammar, one probably would want

to disable pauses before vowel initial letterals (done). This grammar also does not support the lingering

irregularities in acronyms (and won't).

This grammar (in Part I) is entirely about phonetics: all it does is parse text into names (with associated initial

pauses or name markers), cmapua (qua unanalyzed streams of cmapua units),

borrowings and complexes, along with interspersed comma pauses and marks

of terminal punctuation. It does support conventions about where commas are required

and a simple capitalization rule. Streams of cmapua break when markers initial

in other forms are encountered (and may in some cases resume when the markers

are a deception).

a likely locus for odd bugs is the group of predstartX rules which detect apparent cmapua which

are actually preambles to predicates. These are tricky! (and I did indeed find some lingering

problems when I parsed the dictionary). Another reason to watch this rule predstart

is that it carries a lot of weight: !predstart is used as a lightweight test

that what follows is a cmapua (a point discussed in more detail later).

In reviewing this, I think that very little is different from 1990's Loglan (the borrowing djifoa

are post-1989 L1, but not my creation). Some things add precision without making anything in 1990's Loglan incorrect.

The requirement that syllabic consonants be doubled is new, and makes some 1990's Loglan names incorrect.

The requirement that names resolve into syllables is new, and makes some 1990's Loglan names incorrect,

usually because they end in three consonants.

The rule restricting final consonant pairs from being noncontinuant/continuant is new, but

does not affect any actual predicate ever proposed.

Enhancing the VccV rule to also forbid CVVV...ccV caused one predicate to be changed

([haiukre] became [haiukrre], and haiukre was a novelty anyway, using a new name for X in X-ray)

The exact definition of syllables and use of syllable breaks and stress marks is new (the close comma

was replaced with the hyphen, so Lo,is becomes Lo-is); but this does not make anything in 1990's Loglan

incorrect, it merely increases precision and makes phonetic transcript possible.

Forbidding doubled vowels in borrowings was new, was already approved, and caused us to change

[alkooli] to [alkoholi].

Formally allowing the CVccVV and CVcccV predicates without y-hyphens took a proposal in 2013 because

Appendix H was careless in describing their abandonment of the slinkui test, but the dictionary

makes it evident that this was their intent all along. The slinkui test had already been

abandoned in the 1990s.

Formally abandoning qwx was already something that the dictionary workers in the 1990's were working

on; we completed it.

Allowing glottal stop in vowel pairs and forbidding it as an allophone of pause is a new phonetic

feature in the proposal but not reflected in the parser, of course. Alternative pronunciations of

y and h and allowing h in final position are invisible or do not make any 1990's Loglan incorrect.

Permitting false name markers in names was already afoot in the 1990's and the basic outlines of our

approach were already in place. The rule requiring explicit pauses between a name marker not starting

a name word and the beginning of the next name word is new, but reflects something which was already

a fact about 1990's Loglan pronunciation: those pauses had to be made in speech

(and in the 1990's they had no tools to do relevant computer tests)! The requirement

that names resolve into syllables restricts which literal occurrences of name markers are actually

false name markers (the tail they induce in the name must itself resolve into syllables).

Working out the full details of borrowing djifoa was interesting: I'm not sure that I've done anything

*new* there; explicitly noting the stress shift in borrowing djifoa might be viewed as something

new but it is a logical consequence of JCB's permission to pause after a borrowing djifoa, which contains

explicit language about how it is to be stressed, and the

final definition of a borrowing djifoa as simply a borrowing followed by -y. The shift strikes

me as a really good idea anyway, because it marks djifoa with a pause after it as phonetically different

in an additional way other than ending with the very indistinct vowel y. My rules as given here do not

directly enforce the rule that a borrowing djifoa must be preceded by y but I think they indirectly

enforce it in all or almost all cases: the parser tries to read a borrowing djifoa before reading

any other kind of djifoa, so it is hard to see how to deploy a short djifoa in such a way that it would

fall off the head of a borrowing without using y.

These phonetics do not support certain irregularities in acronyms. We note that

it is now allowed to insert [, mue] into an acronym, which would be necessary for example

between a Ceo letteral and a following VCV letteral.

Sounds

the sound of silence

sp <- ([ ]+/[~] &Co1)
sp2 <- [ ]+
Tilde <- [~] &Co1

Vowels

#all vowels
Vo1 <- [aeiouyAEIOUY]
#regular vowels
Vo2 <- [aeiouAEIOU]

Consonants

#consonants
Co1 <- [bcdfghjklmnprstvzBCDFGHJKLMNPRSTVZ]
#consonants in voiced/unvoiced pairs
Cvoiced <- [bdgjvzBDGJVZ]
Cunvoiced <- [ptkcfsPTKCFS]
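As a quick cross-check of the consonant classes, here is a minimal Python sketch. It is not part of the grammar, and the set names are only illustrative. It confirms that the voiced and unvoiced classes are disjoint subsets of Co1 and lists the consonants that belong to neither class.

```python
# Lowercase members of the PEG character classes Co1, Cvoiced and Cunvoiced.
CO1 = set("bcdfghjklmnprstvz")
CVOICED = set("bdgjvz")
CUNVOICED = set("ptkcfs")

# Both classes sit inside Co1 and do not overlap.
assert CVOICED <= CO1 and CUNVOICED <= CO1
assert CVOICED.isdisjoint(CUNVOICED)

# Consonants that take part in no voiced/unvoiced pairing.
UNPAIRED = CO1 - CVOICED - CUNVOICED
print(sorted(UNPAIRED))  # ['h', 'l', 'm', 'n', 'r']
```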

bad voice pair (or pair second term of which is h)

forbidden as pairs of final consonants

Badvoice <- ((Cvoiced (Cunvoiced/[Hh]))/(Cunvoiced (Cvoiced/[Hh])))

Letters and capitalization

letters

Letter <- (![qwxQWX] [a-zA-Z])
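The negative lookahead in Letter has a direct analogue in Python regular expressions. The following is a rough sketch only, with hypothetical names (`LETTER`, `is_letter`), showing how `![qwxQWX] [a-zA-Z]` might be mirrored as `(?![qwxQWX])[a-zA-Z]`.

```python
import re

# Hypothetical regex mirror of the PEG rule Letter <- (![qwxQWX] [a-zA-Z]).
# (?!...) plays the role of the PEG negative lookahead "!".
LETTER = re.compile(r"(?![qwxQWX])[a-zA-Z]")

def is_letter(ch):
    """Return True when ch is a single character accepted by Letter."""
    return len(ch) == 1 and LETTER.fullmatch(ch) is not None

print(is_letter("a"), is_letter("q"), is_letter("-"))  # True False False
```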

a capitalization convention which allows what our current one allows and also allows all-caps.

if case goes down from upper case to lower case, it can only go back up in certain cases. This

does allow capitalization of initial segments of words. There is a forward reference to the grammar

in that free capitalization of embedded literals is permitted, and capitalization of vowels

guarded with z in literals as in DaiNaizA.

Lowercase <- (![qwx] [a-z])
Uppercase <- (![QWX] [A-Z])
caprule <- ([\"(]? &(([z] Vo1 (!Uppercase/&TAI0))/(Lowercase TAI0 (!Uppercase/&TAI0))/(!(Lowercase Uppercase) .)) Letter (&(([z] Vo1 (!Uppercase/&TAI0))/(Lowercase TAI0 (!Uppercase/&TAI0))/(!(Lowercase Uppercase) .)) (Letter/Juncture))* !(Letter/Juncture))

Junctures: syllable breaks and stresses

syllable markers: the hyphen is always medial so must be followed by a letter.

the stress marks can be syllable final and word final. A juncture is never followed

by another juncture.

Stress2 <- [\'*]
Juncture <- ((([-] &Letter)/Stress2) !Juncture)
Stress <- ([\'*] !Juncture)

Terminal punctuations and general characters

terminal punctuation

Terminal <- [.:?!;#]

characters which can occur in words

Character <- (Letter/Juncture)

Alien text

to really get all Loglan text, we should add the alien text constructions and the markers of alien text,

[lie], [lao], [sao], [sue] and certain quotations which violate the phonetic rules.

we adopt the convention that all alien text may be but does not have to be enclosed in quotes.

it needs to be understood that in quoted alien text, whitespace is understood as [, y,]; in the unquoted

version this is shown explicitly. This handling of alien text is taken from the final 1990's treatment

of Linnaeans = foreign names, and extended by us to replace the impossible treatment of strong

quotation in 1989 Loglan.

this is a little different from what is allowed in the previous provisional parser, but similar.

A difference is that all the alien text markers are allowed to be followed by the same sorts of alien text.

the forms with [hoi] and [hue] are required to have following quotes in written form to avoid

unintended parses, which otherwise become likely in case of typos in non-alien text cases.

Comma0 <- [,]? sp2
AlienText <- ((Comma0 [\"] (![\"] .)+ [\"])/(Comma0 (!Comma0 !Terminal .)+ (Comma0 [Yy] Comma0 (!Comma0 !Terminal .)+)*))
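To make the quoted and unquoted shapes of alien text concrete, here is a hedged Python approximation of Comma0 and AlienText. The names `ALIEN_TEXT` and `read_alien_text` are invented for illustration, and a backtracking regex is not a PEG engine, so this is a sketch of the rule's shape rather than a replacement for it.

```python
import re

COMMA0 = r",? +"                    # Comma0 <- [,]? sp2
RUN = r"(?:(?!,? +)[^.:?!;#])+"     # (!Comma0 !Terminal .)+
QUOTED = COMMA0 + r'"[^"]+"'        # Comma0 ["] (!["] .)+ ["]
# Unquoted form: runs of text joined by an explicit [, y,] style link.
UNQUOTED = COMMA0 + RUN + "(?:" + COMMA0 + "[Yy]" + COMMA0 + RUN + ")*"
ALIEN_TEXT = re.compile("(?:" + QUOTED + ")|(?:" + UNQUOTED + ")")

def read_alien_text(s):
    """Return the alien text read from the start of s, or None."""
    m = ALIEN_TEXT.match(s)
    return m.group(0) if m else None

print(repr(read_alien_text(' "Homo sapiens" ga')))  # ' "Homo sapiens"'
print(repr(read_alien_text(' Homo, y, sapiens.')))  # ' Homo, y, sapiens'
print(repr(read_alien_text(' Homo sapiens')))       # ' Homo'
```

The last call shows why whitespace must be written out as [, y,] in the unquoted version: a bare space ends the alien text.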

adding wrapper classes for alien text markers

HOIalien <- ([Hh] [Oo] [Ii])
HUEalien <- ([Hh] [Uu] Juncture? [Ee])
LIEalien <- ([Ll] [Ii] Juncture? [Ee])
LAOalien <- ([Ll] [Aa] [Oo])
LIOalien <- ([Ll] [Ii] Juncture? [Oo])
SAOalien <- ([Ss] [Aa] [Oo])
SUEalien <- ([Ss] [Uu] Juncture? [Ee])
AlienWord <- (&caprule ((HOIalien Juncture? &(Comma0 [\"]))/(HUEalien Juncture? &(Comma0 [\"]))/(LIEalien Juncture?)/(LAOalien Juncture?)/(LIOalien Juncture?)/(SAOalien Juncture?)/(SUEalien Juncture?)) AlienText)

while reading streams of cmapua, the parser will watch for the markers of alien text.

Alienmarker <- ((([Hh] [Oo] [Ii] Juncture? &( [\"]))/([Hh] [Uu] Juncture? [Ee] Juncture? &(Comma0 [\"]))/([Ll] [Ii] Juncture? [Ee] Juncture?)/([Ll] [Aa] [Oo] Juncture?)/([Ll] [Ii] Juncture? [Oo] Juncture?)/([Ss] [Aa] [Oo] Juncture?)/([Ss] [Uu] Juncture? [Ee] Juncture?)) !Vo1)

5/11/18 added [lio] as an alien text marker, to support numerals.

Complex Vowel Forms

the continuant consonants and the syllabic pairs they can form

Continuant <- [mnlrMNLR]
Syllabic <- (([mM] [mM] !(Juncture? [mM]))/([nN] [nN] !(Juncture? [nN]))/([rR] [rR] !(Juncture? [rR]))/([lL] [lL] !(Juncture? [lL])))

the obligatory monosyllables, and these syllables when broken by a usually bad syllable juncture.

The i-final forms are not obligatory mono when followed by another i.

MustMono <- (([aeoAEO] [iI] ![iI])/([aA] [oO]))
BrokenMono <- (([aeoAEO] Juncture [iI] ![iI])/([aA] Juncture [oO]))

the obligatory and optional monosyllables. Sequences of three of the same letter

are averted. Avoid formation of doubled i or u after ui or iu.

Mono <- (MustMono/([iI] !([uU] [uU]) Vo2)/([uU] !([iI] [iI]) Vo2))
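The distinction between obligatory and optional monosyllables can be illustrated with a small Python sketch. The function name `mono_kind` and the regex names are hypothetical, and the regexes only approximate the PEG rules MustMono and Mono by testing the start of a string.

```python
import re

VO2 = "[aeiouAEIOU]"
# MustMono <- (([aeoAEO] [iI] ![iI])/([aA] [oO]))
MUST_MONO = re.compile(r"[aeoAEO][iI](?![iI])|[aA][oO]")
# The two optional alternatives of Mono: i- and u-initial pairs,
# refused when they would create iuu or uii.
OPTIONAL_MONO = re.compile(rf"[iI](?![uU][uU]){VO2}|[uU](?![iI][iI]){VO2}")

def mono_kind(s):
    """Classify the vowel pair at the start of s."""
    if MUST_MONO.match(s):
        return "obligatory"
    if OPTIONAL_MONO.match(s):
        return "optional"
    return None

for pair in ("ai", "aii", "ao", "ia", "iuu", "ea"):
    print(pair, mono_kind(pair))
```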

vowel pairs of the form found in cmapua and djifoa.

(other than the special IY, UY covered in the cmapua rules)

The mysterious prohibition controls a permitted phonetic exception in djifoa gluing.

cmapua are never followed directly by vocalic continuants in any case.

Vv <- (!(!MustMono Vo2 Juncture? Vo2 Juncture? [Rr] [Rr]) (!BrokenMono Vo2 Juncture? Vo2))

the next vocalic unit to be chosen from a stream of vowels

in a predicate or name. This is different than in our Sources

and formally described in the proposal.

NextVowels <- (MustMono/(Vo2 &MustMono)/Mono/(!([Ii] Juncture [Ii] Vo1) !([Uu] Juncture [Uu] Vo1) Vo2))

5/11/18 forbidding consonantal vowels to follow the same vowel.

the doubled vowels that trigger the rule that one of them must be stressed

DoubleVowel <- (([aA] Juncture? [aA])/([eE] Juncture? [eE])/([oO] Juncture? [oO])/([iI] Juncture [iI])/([uU] Juncture [uU])/([iI] [Ii] &[iI])/([Uu] [uU] &[uU]))
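A hedged Python sketch of DoubleVowel follows. It assumes, as a simplification, that a juncture between two vowels is just one of the characters `-`, `'` or `*`; the full Juncture rule has extra lookahead conditions that are automatically met in this position. The names are illustrative only.

```python
import re

J = "[-'*]"  # simplified stand-in for Juncture between two vowels
DOUBLE_VOWEL = re.compile(
    rf"[aA]{J}?[aA]|[eE]{J}?[eE]|[oO]{J}?[oO]"   # aa, ee, oo, juncture optional
    rf"|[iI]{J}[iI]|[uU]{J}[uU]"                 # ii, uu need an explicit juncture
    r"|[iI][iI](?=[iI])|[uU][uU](?=[uU])"        # or a third i / u following
)

def starts_with_double_vowel(s):
    """True when s begins with a doubled vowel in the DoubleVowel sense."""
    return DOUBLE_VOWEL.match(s) is not None

print(starts_with_double_vowel("aa"))   # True
print(starts_with_double_vowel("ii"))   # False
print(starts_with_double_vowel("i-i"))  # True
```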

the mandatory "vowel" component of a syllable

Vocalic <- (NextVowels/Syllabic/[Yy])

Complex Consonant Forms

the permissible initial pairs of consonants, and the same pairs possibly

broken by syllable junctures.

Initial <- (([Bb] [Ll])/([Bb] [Rr])/([Cc] [Kk])/([Cc] [Ll])/([Cc] [Mm])/([Cc] [Nn])/([Cc] [Pp])/([Cc] [Rr])/([Cc] [Tt])/([Dd] [Jj])/([Dd] [Rr])/([Dd] [Zz])/([Ff] [Ll])/([Ff] [Rr])/([Gg] [Ll])/([Gg] [Rr])/([Jj] [Mm])/([Kk] [Ll])/([Kk] [Rr])/([Mm] [Rr])/([Pp] [Ll])/([Pp] [Rr])/([Ss] [Kk])/([Ss] [Ll])/([Ss] [Mm])/([Ss] [Nn])/([Ss] [Pp])/([Ss] [Rr])/([Ss] [Tt])/([Ss] [Vv])/([Tt] [Cc])/([Tt] [Rr])/([Tt] [Ss])/([Vv] [Ll])/([Vv] [Rr])/([Zz] [Bb])/([Zz] [Ll])/([Zz] [Vv]))
MaybeInitial <- (([Bb] Juncture? [Ll])/([Bb] Juncture? [Rr])/([Cc] Juncture? [Kk])/([Cc] Juncture? [Ll])/([Cc] Juncture? [Mm])/([Cc] Juncture? [Nn])/([Cc] Juncture? [Pp])/([Cc] Juncture? [Rr])/([Cc] Juncture? [Tt])/([Dd] Juncture? [Jj])/([Dd] Juncture? [Rr])/([Dd] Juncture? [Zz])/([Ff] Juncture? [Ll])/([Ff] Juncture? [Rr])/([Gg] Juncture? [Ll])/([Gg] Juncture? [Rr])/([Jj] Juncture? [Mm])/([Kk] Juncture? [Ll])/([Kk] Juncture? [Rr])/([Mm] Juncture? [Rr])/([Pp] Juncture? [Ll])/([Pp] Juncture? [Rr])/([Ss] Juncture? [Kk])/([Ss] Juncture? [Ll])/([Ss] Juncture? [Mm])/([Ss] Juncture? [Nn])/([Ss] Juncture? [Pp])/([Ss] Juncture? [Rr])/([Ss] Juncture? [Tt])/([Ss] Juncture? [Vv])/([Tt] Juncture? [Cc])/([Tt] Juncture? [Rr])/([Tt] Juncture? [Ss])/([Vv] Juncture? [Ll])/([Vv] Juncture? [Rr])/([Zz] Juncture? [Bb])/([Zz] Juncture? [Ll])/([Zz] Juncture? [Vv]))
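Since Initial and MaybeInitial differ only by the optional juncture, both can be generated from one table. The Python sketch below is an illustration under that assumption; the names `INITIAL_PAIRS`, `is_initial` and `starts_with_maybe_initial` are hypothetical, and the juncture is simplified to a single `-`, `'` or `*` character.

```python
import re

# The 38 permissible initial consonant pairs listed in the Initial rule.
INITIAL_PAIRS = frozenset(
    "bl br ck cl cm cn cp cr ct dj dr dz fl fr gl gr jm kl kr mr "
    "pl pr sk sl sm sn sp sr st sv tc tr ts vl vr zb zl zv".split()
)

def is_initial(pair):
    """True when the two letters form a permissible initial pair."""
    return pair.lower() in INITIAL_PAIRS

# MaybeInitial: the same pairs, optionally broken by one juncture mark.
MAYBE_INITIAL = re.compile(
    "|".join(f"{p[0]}[-'*]?{p[1]}" for p in sorted(INITIAL_PAIRS)),
    re.IGNORECASE,
)

def starts_with_maybe_initial(s):
    """True when s begins with an initial pair, possibly juncture-broken."""
    return MAYBE_INITIAL.match(s) is not None

print(is_initial("br"), is_initial("rb"))   # True False
print(starts_with_maybe_initial("b-ro"))    # True
```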

the permissible initial consonant groups in a syllable. Adjacent consonants should be initial pairs.

The group should not overlap a syllabic pair. Such a group is of course followed by a vocalic unit.

this rule for initial consonant groups is stated in NB3.

I forbid a three-consonant initial group to be followed by a syllabic pair. This seems obvious.

InitialConsonants <- ((!Syllabic Co1 &Vocalic)/(!(Co1 Syllabic) Initial &Vocalic)/(&Initial Co1 !(Co1 Syllabic) Initial !Syllabic &Vocalic))

the forbidden medial pairs and triples. These are forbidden regardless of placement

of syllable breaks.

each of these is actually a single consonant followed by an initial, and the idea was to identify CVC-CCV junctions which

would be hard to pronounce. But the placement of the syllable break is not relevant to the exclusion of the sequence.

Notice that the continuant syllabic pairs are excluded: this prevents final consonants from being included in such pairs.

NoMedial2 <- (([Bb] Juncture? [Bb])/([Cc] Juncture? [Cc])/([Dd] Juncture? [Dd])/([Ff] Juncture? [Ff])/([Gg] Juncture? [Gg])/([Hh] Juncture? Co1)/([Jj] Juncture? [Jj])/([Kk] Juncture? [Kk])/([Ll] Juncture? [Ll])/([Mm] Juncture? [Mm])/([Nn] Juncture? [Nn])/([Pp] Juncture? [Pp])/([Rr] Juncture? [Rr])/([Ss] Juncture? [Ss])/([Tt] Juncture? [Tt])/([Vv] Juncture? [Vv])/([Zz] Juncture? [Zz])/([CJSZcjsz] Juncture? [CJSZcjsz])/([Ff] Juncture? [Vv])/([Kk] Juncture? [Gg])/([Pp] Juncture? [Bb])/([Tt] Juncture? [Dd])/([FKPTfkpt] Juncture? [JZjz])/([Bb] Juncture? [Jj])/([Ss] Juncture? [Bb]))
NoMedial3 <- (([Cc] Juncture? [Dd] Juncture? [Zz])/([Cc] Juncture? [Vv] Juncture? [Ll])/([Nn] Juncture? [Dd] Juncture? [Jj])/([Nn] Juncture? [Dd] Juncture? [Zz])/([Dd] Juncture? [Cc] Juncture? [Mm])/([Dd] Juncture? [Cc] Juncture? [Tt])/([Dd] Juncture? [Tt] Juncture? [Ss])/([Pp] Juncture? [Dd] Juncture? [Zz])/([Gg] Juncture? [Tt] Juncture? [Ss])/([Gg] Juncture? [Zz] Juncture? [Bb])/([Ss] Juncture? [Vv] Juncture? [Ll])/([Jj] Juncture? [Dd] Juncture? [Jj])/([Jj] Juncture? [Tt] Juncture? [Cc])/([Jj] Juncture? [Tt] Juncture? [Ss])/([Jj] Juncture? [Vv] Juncture? [Rr])/([Tt] Juncture? [Vv] Juncture? [Ll])/([Kk] Juncture? [Dd] Juncture? [Zz])/([Vv] Juncture? [Tt] Juncture? [Ss])/([Mm] Juncture? [Zz] Juncture? [Bb]))

The Syllable

there are no formal rules about syllables as such in our Sources, which is odd since

the definition of predicates depends on the placement of stresses on syllables.

The first rule enforces the special point needed in complexes that

a CVC syllable is preferred to a CV syllable where possible; we economically apply

the same rule for default placement of syllable breaks everywhere, which is, with

that exception, that the break comes as soon as possible.

the SyllableB approach is taken if the following syllable would otherwise start with a syllabic pair.

the reason for this approach is that if one syllabizes a well formed complex in this way...

the syllable breaks magically fall on the djifoa boundaries. This does mean that the

default break in [cabro] is [cab-ro], which feels funny but is harmless. Explicitly breaking

it [ca-bro] will also parse correctly.

SyllableA <- (Co1 Vo2 FinalConsonant (!Syllable FinalConsonant)?)
SyllableB <- (InitialConsonants? Vocalic (!Syllable FinalConsonant)? (!Syllable FinalConsonant)?)
Syllable <- ((SyllableA/SyllableB) Juncture?)

The final consonant in a syllable. There may be one or two final consonants. A pair of final

consonants may not be a non-continuant followed by a continuant. A final consonant may not

start a forbidden medial pair or triple.

The rule that a final consonant pair may not be a non-continuant followed by a continuant

is natural and obvious but not in our Sources. Such a pair of consonants would seem to

naturally form another syllable.

a pair of final consonants cannot be differently voiced

FinalConsonant <- (!Syllabic !(&Badvoice Co1 !Syllable) (!(!Continuant Co1 !Syllable Continuant) !NoMedial2 !NoMedial3 Co1 !(Juncture? (Vo2/Syllabic))))
#!((!MaybeInitial)C1 juncture? !syllabic C1 juncture? !syllabic C1) !(&MaybeInitial C1 juncture C1 !(juncture? C1))

Varieties of Syllable

Here are various flavors of syllable we may need.

this is a portmanteau definition of a bad syllable (the sort not allowed in a borrowing).

SyllableD <- (&(InitialConsonants? ([Yy]/DoubleVowel/BrokenMono/(&Mono Vo2 DoubleVowel)/(!MustMono &Mono Vo2 BrokenMono))) Syllable)

this (below) is the kind of syllable which can exist in a borrowed predicate:

it cannot start with a continuant pair, it cannot have a y as vocalic unit,

and its vocalic unit (whether it has one or two regular vowels)

cannot be involved in a double vowel or an explicitly broken

mandatory monosyllable.

BorrowingSyllable <- (!Syllabic !SyllableD Syllable)

this is the final syllable of a predicate. It cannot be followed

without pause by a regular vowel.

VowelFinal <- (InitialConsonants? Vocalic Juncture? !Vo2)

syllables with syllabic consonant vocalic units

this class is only used in borrowings, and we *could* reasonably

require it to be followed by a vowel. But I won't for now.

for gluing this restriction would work, but we might literally borrow predicates

with syllabic continuant pronunciations.

SyllableC <- (&(InitialConsonants? Syllabic) Syllable)

syllables with y

SyllableY <- (&(InitialConsonants? [Yy]) Syllable)

an explicitly stressed syllable.

StressedSyllable <- ((SyllableA/SyllableB) Stress2)

Name Words

a final syllable in a word, ending in a consonant.

NameEndSyllable <- (InitialConsonants? (Syllabic/(Vocalic &FinalConsonant)) FinalConsonant? FinalConsonant? Stress? !Tilde !Letter)

The Pause

the pause classes actually hang on the letter before the pause.

whitespace which might or might not be a pause.

Maybepause <- (Vo1 Stress2? sp2 Co1)

explicit pauses: these are whitespace before a vowel or after a consonant, or comma marked pauses.

Explicitpause <- ((Co1 Stress2? sp2 &Letter)/(Letter Stress2? sp2 &Vo1)/(Letter Stress2? [,] sp2 &Letter))

The full analysis of names

these are final syllables in words followed by whitespace which might not be a pause.

the definition actually doesn't mention the maybepause class.

MaybePauseSyllable <- (InitialConsonants? Vocalic Stress2? &(sp2 &Co1))
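The two whitespace classes Maybepause and Explicitpause, which hang on the letter before the gap, can be mimicked with Python regexes. This is only a sketch under the assumption that Letter is exactly Vo1 plus Co1; the helper name `pause_kind` is hypothetical.

```python
import re

VO1 = "aeiouyAEIOUY"
CO1 = "bcdfghjklmnprstvzBCDFGHJKLMNPRSTVZ"
LETTER = VO1 + CO1  # every letter except q, w, x

# Maybepause <- (Vo1 Stress2? sp2 Co1)
MAYBE = re.compile(rf"[{VO1}]['*]? +[{CO1}]")
# Explicitpause: consonant before whitespace, whitespace before a vowel,
# or a comma-marked pause between letters.
EXPLICIT = re.compile(
    rf"[{CO1}]['*]? +(?=[{LETTER}])"
    rf"|[{LETTER}]['*]? +(?=[{VO1}])"
    rf"|[{LETTER}]['*]?, +(?=[{LETTER}])"
)

def pause_kind(text, i):
    """Classify the gap after the letter at index i of text."""
    if EXPLICIT.match(text, i):
        return "explicit"
    if MAYBE.match(text, i):
        return "maybe"
    return None

print(pause_kind("la Djan", 1))    # maybe
print(pause_kind("Djan ga", 3))    # explicit
print(pause_kind("le, mrenu", 1))  # explicit
```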

a name word (without initial marking) is resolvable into syllables and ends with a consonant.

PRENAME <- ((Syllable &Syllable)* NameEndSyllable)

this is a busted name word with whitespace in it -- but not whitespace at which one has to pause.

BadPreName <- (((MaybePauseSyllable sp)/(Syllable &Syllable))* NameEndSyllable)

This is a name marker followed by a consonant initial name word without pause.

I deployed a minimal set of name marker words; I can add the others whenever.

I have decided (see below) to retain the social lubrication words as vocative markers

*without* making them name markers, so one must pause [Loi, Djan]. By not allowing

freemods right after vocative markers in the vocative rule, I make [Loi hoi Djan] work as well,

without pause.

# MarkedName <- &caprule ((([Ll] !pause [Aa] juncture?)/ ([Hh] [Oo] !pause [Ii] juncture?) / ([Hh] [Uu] juncture? !pause [Ee] juncture?) / ([Cc] !pause [Ii] juncture?)/([Ll] [Ii] juncture? !pause [Uu] juncture?)/[Gg][Aa] !pause [Oo] juncture?/[Mm][Uu] juncture? !pause [Ee] juncture?) sp? &C1 &caprule PreName)

adding wrapper classes for name markers

LAname <- ([Ll] [Aa])
HOIname <- ([Hh] [Oo] [Ii])
CIname <- ([Cc] [Ii])
LIUname <- ([Ll] [Ii] Juncture? [Uu])
MUEname <- ([Mm] [Uu] Juncture? [Ee])
GAOname <- ([Gg] [Aa] [Oo])
HUEname <- ([Hh] [Uu] Juncture? [Ee])

second series is for marked names, no pauses after them

LAname2 <- ([Ll] !Explicitpause [Aa])
HOIname2 <- ([Hh] [Oo] !Explicitpause [Ii])
LIUname2 <- ([Ll] [Ii] Juncture? !Explicitpause [Uu])
MUEname2 <- ([Mm] [Uu] Juncture? !Explicitpause [Ee])
GAOname2 <- ([Gg] [Aa] !Explicitpause [Oo])
HUEname2 <- ([Hh] [Uu] Juncture? !Explicitpause [Ee])
MarkedName <- (&caprule ((LAname2 Juncture?)/(HOIname2 Juncture?)/(HUEname2 Juncture?)/(LIUname2 Juncture?)/(GAOname2 Juncture?)/(MUEname2 Juncture?)) sp? &Co1 &caprule PRENAME)

This is an unmarked name word with a false name marker in it.

FalseMarked <- (&PRENAME (!MarkedName Character)* MarkedName)

This is the full definition of name words. These are either marked consonant initial names without pause defined above,

names without false name markers beginning with explicit pauses (either comma marked or vowel-initial)

and name markers followed, with or without pause, by name words. In the latter case there must be at least

whitespace before a vowel initial name.

a series of names without false name markers and names marked with ci, separated by spaces, may be appended.

there is a look ahead at the grammar: a NameWord can be followed without explicit pause (there is whitespace and

a pause in speech!) by another

kind of utterance only in a serial name when what follows is of the form [ci] predunit, to be included

in the name.

Comma1 <- [,] sp2
NAMEWORD <- (((&caprule MarkedName)/(Comma1 !FalseMarked &caprule PRENAME)/(&Vo1 !FalseMarked &caprule PRENAME)/(&caprule (((LAname Juncture?)/(HOIname Juncture?)/(HUEname Juncture?)/(CIname Juncture? &(Comma0))/(LIUname Juncture?)/(MUEname Juncture?)/(GAOname Juncture?)) !Vo1 Comma0? &caprule PRENAME))) ((Comma0 !FalseMarked &caprule PRENAME)/(Comma0 &([Cc] [Ii]) NAMEWORD))* &((sp? [Cc] [Ii] predunit)/(&((Comma1)/Terminal/[\")]/!.) .)/!.))

this is the minimal set of name marker words we are using. We may add more.

I am contemplating adding the words of social lubrication as name markers, but in a more restricted

way than in the last provisional parser, in which I made them full-fledged vocative markers. [Actually,

I preserved their status as vocative markers without restoring their status as name markers, in the latest version].

adding [mue] as a name marker

Namemarker <- ((([Ll] [Aa] Juncture?)/([Hh] [Oo] [Ii] Juncture?)/([Hh] [Uu] Juncture? [Ee] Juncture?)/([Cc] &(Explicitpause/([Ii] Juncture? sp PRENAME)) [Ii] Juncture?)/([Ll] [Ii] Juncture? [Uu] Juncture?)/([Gg] [Aa] [Oo] Juncture?)/([Mm] [Uu] Juncture? [Ee] Juncture?)) !Vo1)

this is the bad name marker phenomenon that needs to be excluded. This captures the idea

that what follows the name could be pronounced without pause as a name word according to the

orthography, but the fact that whitespace is present shows that this is not the intention.

it is worth noting that name markers at heads of name words pass this test

(because I omitted the test that what follows is not a PreName in the interests

of minimizing lookahead);

but this test is only applied to strings that have already been determined not to

be of class NameWord.

changing BadNameMarker to test a new concept for protection of left boundaries of names which involves less lookahead and no reference to possibly distant name words: the next whitespace after a name marker but not immediately after it must be an actual pause.

# Badnamemarker <- (Namemarker !Vo1 [, ]? sp? BadPreName)
Badnamemarker <- (Namemarker !Vo1 Comma0? (Syllable &Syllable)* MaybePauseSyllable)

we test for the bad name marker condition at the beginning of each stream of cmapua,

and streams of cmapua stop before name markers (and may resume at a name marker

if neither a NameWord nor the bad marker condition is found).

We have at any rate completely solved the phonetic problem of names and their markers.

Predicate Start Test

predicate start tests: the idea is the same as class "connective" below, to recognize

the start of a predicate without recursive appeals to the whole nasty definition of predicate.

The reason to do it is to recognize when CV^n followed by CC cannot be a cmapua unit.

New implementation 4/28/2019. This allows only (C)V(V)(V) before the pair of consonants, for much less

potential lookahead.

Vthree <- ((Vo2 Juncture?) (Vo2 Juncture?) (Vo2 Juncture?))
Vfour <- ((Vo2 Juncture?) (Vo2 Juncture?) (Vo2 Juncture?) (Vo2 Juncture?))

predicate starting with two or three consonants: rules out CC(C)V(V) forms. Junctures in

the initial consonant group are ignored.

Predstart1 <- (((&MaybeInitial Co1 Juncture? MaybeInitial)/MaybeInitial) &Vo2 !(Vo2 Stress !Mono Vo2) !(Vo2 Juncture? Vo2 !Character) !(Vo2 Juncture? !Character))

an apparent cmapua unit followed by a consonant group which cannot start a predicate -- CV(V) case

Predstart2 <- (Co1 Vo2 Juncture? (Vo2 Juncture?)? !Predstart1 Co1 Juncture? Co1)

a stressed CV^n before a consonant group (CV(V) case)

Predstart3 <- (Co1 !Vthree (!StressedSyllable Vo2 Juncture?)? &StressedSyllable Vo2 Vo2? Juncture? Co1 Juncture? Co1)

other (C)V^n followed by nonpredicate

Predstart4 <- (Co1? Vo2 Juncture? (Vo2 Juncture?)? (Vo2 Juncture?)? !Predstart1 !(MaybeInitial Vo2) Co1 Juncture? Co1)

other stressed (C)V^n followed by consonant group

Predstart5 <- (Co1? !Vfour (!StressedSyllable Vo2 Juncture?)? (!StressedSyllable Vo2 Juncture?)? &StressedSyllable Vo2 Vo2? Juncture? !(MaybeInitial Vo2) Co1 Juncture? Co1)

forms with y; implemented CVVhy alternative for CVV cmapua

Predstart6 <- (Co1 (Vo2 Juncture?) ((Vo2 Juncture? [Hh]?)/(Co1 Juncture? (Co1 Juncture?)?)) [Yy])
Predstart <- (Predstart1/Predstart2/Predstart3/Predstart4/Predstart5/Predstart6)

it is worth noting that in the sequel we have systematically replaced tests &Cmapua

with !predstart. The former involves lots of lookahead and was causing recursion crashes

in Python. The phonetics and the grammar are both structured so that any string

starting with a name marker is tested for NameWord-hood before it is tested for

cmapua-hood; the only thing it is tested for later is predicate-hood, and predstart

is a rough and ready test that something might be a predicate (and at any rate

cannot be a cmapua).

Structure Word Phonetics

this class requires pauses before it, after all the phonetic word classes.

what is being recognized is the beginning of a logical connective.

To avoid horrible recursion problems, giving this a concrete phonetic definition

without much lookahead. This can go right up in the phonetics section if it works

(and here it is!).

single vowel cmapua syllables early for connectives

a <- ([Aa] !Badstress Juncture? !Vo1)
e <- ([Ee] !Badstress Juncture? !Vo1)
i <- ([Ii] !Badstress Juncture? !Vo1)
o <- ([Oo] !Badstress Juncture? !Vo1)
u <- ([Uu] !Badstress Juncture? !Vo1)
Hearly <- (!Predstart [Hh])
Nearly <- (!Predstart [Nn])

these appear here for historical reasons and could be moved later

Connective <- (sp? !Predstart ([Nn] [Oo] Juncture? !i)? (a/e/i/o/u/(Hearly a)/(Nearly uu)) Juncture? !Vo2 !(!Predstart [Ff] [Ii]) !(!Predstart [Mm] [Aa]) !(!Predstart [Zz] [Ii]))

cmapua units starting with consonants. This is the exact description from NB3. The fancy tail in each of the

three cases is enforcing the rule about pausing before a following predicate if stressed.

consonant initial cmapua units may not be followed by vowels without pause.

I am adding [iy] and [uy] (always monosyllable, yuh and wuh) as vowel pairs permitted in VV and CVV cmapua units.

it is worth noting that the "yuh" and "wuh" pronunciations of these diphthongs

are surprising to the English-reading eye.

The use for this envisaged is that the name [ziy] of Y becomes easy to introduce. Adding word space

is always nice, and these words seem pronounceable. I also made [yfi] possible: Y now has phonetically

regular names.

CmapuaUnit <- ((Co1 Mono Juncture? Vo2 !(Stress2 sp? &Co1 Predstart) Juncture? !Vo1)/(Co1 (Vv/([Ii] [Yy])/([Uu] [Yy])) !(Stress2 sp? &Co1 Predstart) Juncture? !Vo1)/(Co1 Vo2 !(Stress2 sp? &Co1 Predstart) Juncture? !Vo1))

A stream of cmapua is read until the start of a predicate or a name marker word or an alien text marker word or a quote or parenthesis marker word is encountered.

the stream might resume with a name marker word if it does not in fact start a name word and does not potentially start a name

word due to inexplicit whitespace (doesn't satisfy the bad name marker condition).

we force explicit comma pauses before logical connectives, but not before vowel initial cmapua in general;

other conditions force at least whitespace, which does stand for a pause, before such words.

detect starts of quotes or parentheses with li or [kie]

Likie <- (([Ll] [Ii] Juncture? !Vo1)/([Kk] [Ii] Juncture? [Ee] Juncture? !Vo1))

a special provision is made for NO UI forms as single words. [yfi] is supported.

Cmapua <- (&caprule !Badnamemarker ((!Predstart (Vv/([Ii] [Yy])/([Uu] [Yy])) !(Stress2 sp? &Co1 Predstart) Juncture? NOI0)/(!Predstart [Nn] [Oo] Juncture? !Predstart (Vv/([Ii] [Yy])/([Uu] [Yy])) !(Stress2 sp? &Co1 Predstart) Juncture?)/((!Predstart (Vv/([Ii] [Yy])/([Uu] [Yy])) !(Stress2 sp? &Co1 Predstart) Juncture?)+/(((!Predstart Vo1 !(Stress2 sp? &Co1 Predstart) Juncture?)/(!Predstart CmapuaUnit)) (!Namemarker !Alienmarker !Likie !Predstart CmapuaUnit)*))/(!Predstart Vo2 !(Stress2 sp? &Co1 Predstart) Juncture?)) !Vo1 !(Co1+ Juncture) !(sp? Connective))

I have apparently now completely solved the problem of parsing cmapua as well as name words.

Predicate Phonetics

Now for predicates.

Djifoa ("affixes")

the elementary djifoa (not borrowings)

various special flavors of these djifoa will be needed.

These are the general definitions.

The NOY and Bad forms are for use for testing candidate borrowings for resolution

with bad syllable break placements. Borrowings do not contain Y...

CVV djifoa with phonetic hyphens.

added checks to all cmapua classes: the vowel final ones, when not phonetically hyphenated, cannot

be followed by a regular vowel. This is crucial for getting the syllable analysis and the djifoa

analysis to end at the same point.

allowing h to be inserted before y in CVVy djifoa for a CVVhy form.

allowing -r glue to be expressed as -rr

some classes just for djifoa glue

wy <- [Yy]
ar <- [Rr]
en <- [Nn]
hh <- [Hh]
Dash <- [-]
Cvv <- (Co1 Vv ((Juncture? hh? wy Dash? &Complex)/(Juncture? ar ar? Juncture? &Co1)/(en Juncture? &ar)/(Juncture? !Vo2)))
CvvNoHyphen <- (Co1 Vv Juncture? !Vo2)
CvvHiddenStress <- (Co1 &DoubleVowel Vo1 Dash? Vo1 ((Dash? hh? wy Dash? &Complex)/(ar Dash? &Co1)/(en Dash? &ar)/(Dash? !Vo2)))
CvvFinalStress <- (Co1 Vv ((Stress2 hh? wy Dash? &Complex)/(ar Stress2 &Co1)/(Stress2 ar ar Juncture? &Co1)/(en Stress2 &ar)/(Stress2 !Vo2)))
CvvNoY <- (Co1 Vv ((Juncture? ar ar? Juncture? &Co1)/(en Juncture? &ar)/(Juncture? !Vo2)))
CvvNoYFinalStress <- (Co1 Vv ((ar Stress2 &Co1)/(Stress2 ar ar Juncture? &Co1)/(en Stress2 &ar)/(Stress2 !Vo2)))
CvvNoYMedialStress <- (Co1 !BrokenMono Vo2 Stress2 Vo2 Dash? !Vo2)

CCV djifoa with phonetic hyphens.

Ccv <- (Initial Vo2 ((Juncture? wy Dash? &Letter)/(Juncture? !Vo2)))
CcvStressed <- (Initial Vo2 ((Stress2 wy Dash? &Letter)/(Stress2 !Vo2)))
CcvNoY <- (Initial Vo2 Juncture? !Vo2)
CcvBad <- (MaybeInitial Vo2 Juncture? !Vo2)
CCVBadStressed <- (MaybeInitial Vo2 Stress2 !Vo2)

CVC djifoa with phonetic hyphens. These cannot be final and are always followed by a consonant (well, the

-y form may be followed by a vowel...)

an eccentric syllable break is supported if the CVC is y-hyphenated:

[me-ky-kiu] and [mek-y-kiu] are both legal. The default is the latter.

Cvc <- ((Co1 Vo2 !NoMedial2 !NoMedial3 Co1 ((Juncture? wy Dash? &Letter)/(Juncture? &Co1)))/(Co1 Vo2 Juncture Co1 wy Dash? &Letter))
CvcStressed <- ((Co1 Vo2 !NoMedial2 !NoMedial3 Co1 ((Stress2 wy Dash? &Letter)/(Stress2 &Letter)))/(Co1 Vo2 Stress2 Co1 wy Dash? &Letter))
CvcNoY <- (Co1 Vo2 !NoMedial2 !NoMedial3 Co1 Juncture? &Co1)
CvcBad <- (Co1 Vo2 !NoMedial2 !NoMedial3 Juncture? Co1 &Co1)
CvcNoYStressed <- (Co1 Vo2 !NoMedial2 !NoMedial3 Co1 Stress2 &Co1)
CvcBadStressed <- (Co1 Vo2 !NoMedial2 !NoMedial3 Stress2 Co1 &Co1)

the five letter forms (always final in complexes)

CcvCv <- (Initial Vo2 Juncture? Co1 Vo2 Dash? !Vo2)
CcvCvStressed <- (Initial Vo2 Stress2 Co1 Vo2 Dash? !Vo2)
CcvCvBad <- (MaybeInitial Vo2 Juncture? Co1 Vo2 Dash? !Vo2)
CcvCvBadStressed <- (MaybeInitial Vo2 Stress2 Co1 Vo2 Dash? !Vo2)
CvcCv <- ((Co1 Vo2 Juncture? Initial Vo2 Dash? !Vo2)/(Co1 Vo2 !NoMedial2 Co1 Juncture? Co1 Vo2 Dash? !Vo2))
CvcCvStressed <- ((Co1 Vo2 Stress2 Initial Vo2 Dash? !Vo2)/(Co1 Vo2 !NoMedial2 Co1 Stress2 Co1 Vo2 Dash? !Vo2))

the medial five letter djifoa

CcvCy <- (Initial Vo2 Juncture? Co1 wy Dash?)
CvcCy <- ((Co1 Vo2 Juncture? Initial wy Dash?)/(Co1 Vo2 !NoMedial2 Co1 Juncture? Co1 wy Dash?))
CcvCyStressed <- (Initial Vo2 Stress2 Co1 wy Dash?)
CvcCyStressed <- ((Co1 Vo2 Stress2 Initial wy Dash?)/(Co1 Vo2 !NoMedial2 Co1 Stress2 Co1 wy Dash?))

Borrowed Predicates

to reason about resolution of borrowings into both syllables and djifoa (we want to exclude the latter

but we need to define it adequately) we need to recognize where to stop. A predicate word ends either

at a non-character (not a letter or syllable mark: whitespace, comma or terminal punctuation) or it

has an explicit or deducible penultimate stress. Borrowings do not contain doubled vowels, so they

have to have explicit stress in the latter case.

analysis: the stressed tail consists of a stressed syllable followed by an unstressed syllable.

identifying an unstressed final syllable is complicated by recognizing which CVV combinations can

be one syllable. This will either be an explicitly stressed syllable followed by a single syllable

or a syllable suitable to be stressed followed by an explicitly final syllable. CVV djifoa can

contain both syllables in a tail and of course the five letter djifoa have to be tails. A never stressed

SyllableC (with a continuant) may intervene.

tail of a borrowing with an explicit stress

BorrowingTail1 <- (!SyllableC &StressedSyllable BorrowingSyllable (!StressedSyllable &SyllableC BorrowingSyllable)? !StressedSyllable &BorrowingSyllable VowelFinal)

tail of a borrowing or borrowing djifoa with no explicit stress

BorrowingTail2 <- (!SyllableC BorrowingSyllable (!StressedSyllable &SyllableC BorrowingSyllable)? !StressedSyllable &BorrowingSyllable VowelFinal (&wy/!Character))

tail of a stressed borrowing djifoa, different because stress is shifted to the end

BorrowingTail3 <- (!SyllableC !StressedSyllable BorrowingSyllable (!StressedSyllable &SyllableC BorrowingSyllable)? &BorrowingSyllable InitialConsonants? Vocalic Stress2 &wy)
BorrowingTail <- (BorrowingTail1/BorrowingTail2)

short forms that are ruled out: CCVV and CCCVV forms.

Ccvv <- ((InitialConsonants Vo2 Juncture? Vo2 Juncture? !Character)/(InitialConsonants Vo2 Stress2 !Mono Vo2 Juncture?))

VCCV and some related forms are ruled out (rule predstartF above is about this)

a continuant syllable cannot be initial in a borrowing and there cannot be successive continuant

syllables. There really ought to be no more than one!

borrowing, before checking that it doesn't resolve into djifoa

PreBorrowing <- (&Predstart !Ccvv !Cmapua !SyllableC (!BorrowingTail !StressedSyllable !(SyllableC SyllableC) BorrowingSyllable)* BorrowingTail)

ditto for an explicitly stressed borrowing

StressedPreBorrowing <- (&Predstart !Ccvv !Cmapua !SyllableC (!BorrowingTail !StressedSyllable !(SyllableC SyllableC) BorrowingSyllable)* BorrowingTail1)

borrowing djifoa without explicit stress (before resolution check)

PreBorrowing2 <- (&Predstart !Ccvv !Cmapua !SyllableC (!BorrowingTail !StressedSyllable !(SyllableC SyllableC) BorrowingSyllable)* BorrowingTail2)

stressed borrowing djifoa (before resolution check).

PreBorrowing3 <- (&Predstart !Ccvv !Cmapua !SyllableC (!BorrowingTail3 !StressedSyllable !(SyllableC SyllableC) BorrowingSyllable)* BorrowingTail3)

Now comes the problem of trying to say that a preborrowing cannot resolve into cmapua. The difficulty is with

recognizing the tail, so making sure that the two resolutions stop in the same place.

we know because it is a borrowing that there is at most one explicit stress, and it has to fall

in one of the cmapua! This should make it doable.

borrowing djifoa are terminated with y, so the final djifoa needs to take this into account

the idea behind both djifoa analyses is the same. If we end with a final djifoa followed by

a non-character, we improve our chances of ending the syllable analysis at the same point. We control

this by identifying djifoa with stresses in them: a medially stressed djifoa must be the last one

(and the syllable analysis will find its stressed syllable and end at its final syllable, the fact

that djifoa cannot be followed by vowels ensuring that the syllable analysis cannot overrun its end).

When the djifoa is finally stressed, the complex analysis ends with a further djifoa guaranteed to have

just one syllable, and the syllable analysis again will stop in the same place. The medial five letter forms

and borrowing djifoa of course are finally stressed mod an additional unstressed syllable which is skipped

by the syllable analysis, because it allows one to ignore an actually penultimate syllable with y or

a syllabic consonant. In the case where we never find a stress and end up at a final djifoa, the syllable

analysis will carry right through to the same final point.

in the attempted resolution of borrowings, our life is easier because we do not have

borrowing djifoa or medial five letter forms to consider, or any forms with y-hyphens.

RfinalDjifoa <- ((CcvCvBad/CvcCv/CvvNoHyphen/CcvBad) (&wy/!Character))
RmediallyStressed <- (CcvCvBadStressed/CvcCvStressed/CvvNoYMedialStress)
RfinallyStressed <- (CvvNoYFinalStress/CCVBadStressed/CvcBadStressed/CvcNoYStressed)
BorrowingComplexTail <- (RmediallyStressed/(RfinallyStressed ((&(Co1 Mono) CvvNoHyphen)/CcvBad))/RfinalDjifoa)
ResolvedBorrowing <- ((!BorrowingComplexTail (CvvNoY/CcvBad/CvcBad))* BorrowingComplexTail)

borrowed predicates

Borrowing <- (!ResolvedBorrowing &caprule PreBorrowing !(sp? Connective))

explicitly stressed borrowed predicates

StressedBorrowing <- (!ResolvedBorrowing &caprule StressedPreBorrowing !(sp? &Vo1 Cmapua))

This is the shape of non-final borrowing djifoa. Notice that a final stress is allowed.

The curious provision for explicitly stressing a borrowing djifoa and pausing is supported.

borrowing djifoa without explicit stress (stressed ones are not of this class!)

Note that one can pause after these (explicitly, with a comma, in which case the stress must be explicit too)

BorrowingDjifoa <- (!ResolvedBorrowing &caprule PreBorrowing2 ((Stress2 wy Comma1)/(Juncture? wy Dash?)))

stressed borrowing djifoa finally implemented!

StressedBorrowingDjifoa <- (!ResolvedBorrowing &caprule PreBorrowing3 wy Dash? (Comma1)?)

Complex Predicates

We resolve complexes twice, once into syllables and once into djifoa. We again have to ensure that

we end up in the same place! The syllable resolution is very similar to that of borrowings;

the unstressed middle syllable of the tail can be a SyllableY, and can also be a

SyllableC if the final djifoa is a borrowing.

A stressed borrowing djifoa with the property that the tail is still a phonetic complex is

a unit for this analysis.

note here that I specifically rule out a complex being followed without pause by y. I do not rule

this out for the vowel final djifoa because they can be followed by y at the end of a borrowing

djifoa.

DefaultStressedSyllable <- Syllable
PhoneticComplexTail1 <- (!SyllableC !SyllableY &StressedSyllable DefaultStressedSyllable (!StressedSyllable &(SyllableC/SyllableY) Syllable)? !StressedSyllable !SyllableY VowelFinal !Vo1)
PhoneticComplexTail2 <- (!SyllableC !SyllableY DefaultStressedSyllable (!StressedSyllable &(SyllableC/SyllableY) Syllable)? !StressedSyllable !SyllableY VowelFinal !Character)
PhoneticComplexTail <- (PhoneticComplexTail1/PhoneticComplexTail2)

note the explicit predstart test here.

PhoneticComplex <- (&Predstart !Ccvv !Cmapua !SyllableC ((StressedBorrowingDjifoa &PhoneticComplex)/(!PhoneticComplexTail !StressedSyllable !(SyllableC SyllableC) Syllable))* PhoneticComplexTail)

the analysis of final djifoa and stressed djifoa differs only in details from

what is above for resolution of borrowings. The issues about CVV djifoa with doubled

vowels are rather exciting.

a stressed borrowing djifoa with the tail still a phonetic complex is a black box unit for

this construction.

My approach imposes the restriction on JCB's "pause after a borrowing djifoa" idea that what follows

the pause must itself contain a penultimate stress: [igllu'ymao] is a predicate but [igllu'y, mao] is not,

while [iglluy', gudmao] is a predicate.

the analysis of the djifoa resolution process is the same as above, with additional remarks

about doubled vowel syllables: notice that where the complex tail involved a doubled vowel syllable

without explicit stress, we insist on that djifoa or the single-syllable next djifoa ending in

a non-character: in the absence of explicit stress, we always rely on whitespace or punctuation

to indicate the end of the predicate.

all sorts of subtleties about borrowings and borrowing djifoa are finessed by always looking for

them first. There are no restrictions re fronts of borrowings or borrowing djifoa looking like regular

djifoa; the fact that borrowing djifoa end in y and borrowings do not contain y makes it always

possible to tell when one is looking at the head of a borrowing djifoa. Regular djifoa just before a borrowing

djifoa need to be y-hyphenated so as not to be absorbed into the front of the borrowing (I don't believe

that I actually need to impose a formal rule to this effect, though I am not absolutely certain; it would

be difficult to formulate [and does appear in the previous version, where it is a truly unintelligible piece

of PEG code]).

FinalDjifoa <- ((Borrowing/CcvCv/CvcCv/CvvNoHyphen/CcvNoY) !Character)
MediallyStressed <- (StressedBorrowing/CcvCvStressed/CvcCvStressed/CvvNoYMedialStress)
FinallyStressed <- (StressedBorrowingDjifoa/CcvCyStressed/CvcCyStressed/CvvFinalStress/CcvStressed/CvcStressed)
ComplexTail <- ((CvvHiddenStress ((&(Co1 Mono) CvvNoHyphen)/CcvNoY) !Character)/(FinallyStressed ((&(Co1 Mono) CvvNoHyphen)/CcvNoY))/MediallyStressed/FinalDjifoa)
PreComplex <- ((!CvvHiddenStress !ComplexTail ((StressedBorrowingDjifoa &PhoneticComplex)/BorrowingDjifoa/CvcCy/CcvCy/Cvv/Ccv/Cvc))* ComplexTail)

originally I had complicated tests here for the conditions under which an initial

CVC cmapua has to be y-hyphenated: I was being wrong-headed, as the predstart rules

already enforce this (in the bad cases, the initial CV- falls off). The user will

simply find that they cannot put the word together otherwise. The previous version

did need this test because it actually used full lookahead to check for the start of a predicate.

Complex <- (&caprule &PreComplex PhoneticComplex !(sp? Connective))

Quotation and Parenthesis of well-formed Loglan utterances; word classes

format for the LI quote and KIE parenthesis LiQuote <- ((&caprule [Ll] [Ii] Juncture? Comma2? [\"] PhoneticUtterance [\"] Comma2? &caprule [Ll] [Uu] Juncture? !(sp? Connective))/(&caprule [Kk] [Ii] Juncture? [Ee] Juncture? Comma2? [(] PhoneticUtterance [)] Comma2? &caprule [Kk] [Ii] Juncture? [Uu] Juncture? !(sp? Connective)))

the condition on Word that a Cmapua is not followed by another Cmapua

with mere whitespace between was used by [liu] quotation, but is now redundant,

because I have required that [liu] quotations be closed with explicit pauses in all cases. Word <- (NAMEWORD/Cmapua/Complex/CcvNoY)

it is an odd point that all borrowings parse as complexes -- so when I parsed all the words the first time they all

parsed as complexes. A borrowing is a complex consisting of a single final borrowing djifoa!

I did redesign this so that borrowings are parsed as borrowings. (This is the class

I used to parse the dictionary).

Yes, CVC djifoa do get parsed as names in the dictionary, so the CVC case here is redundant. I actually

think that only the CCV djifoa actually get parsed as such. SingleWord <- (((Borrowing !.)/(Complex !.)/(Word !.)/(PRENAME !.)/CcvNoY) !.)

name word appearing initially without leading spaces is important, because one type of NameWord includes a leading comma.

The full phonetic utterance classes

PhoneticUtterance1 <- (NAMEWORD/(sp? LiQuote)/(sp? NAMEWORD)/(sp? AlienWord)/(sp? Cmapua)/(sp? '--')/(sp? '...')/(sp? Borrowing !wy)/(sp? Complex)/(sp? CcvNoY))+
PhoneticUtterance <- (PhoneticUtterance1/(Comma1)/Terminal)+

Interlude: Phonemes and Pauses

Consonants and vowel groups in cmapua

as noted above, !predstart stands in for the computationally disastrous &Cmapua

Badstress <- (Stress2 sp? &Co1 Predstart)
b <- (!Predstart [Bb])
c <- (!Predstart [Cc])
d <- (!Predstart [Dd])
f <- (!Predstart [Ff])
g <- (!Predstart [Gg])
h <- (!Predstart [Hh])
j <- (!Predstart [Jj])
k <- (!Predstart [Kk])
l <- (!Predstart [Ll])
m <- (!Predstart [Mm])
n <- (!Predstart [Nn])
p <- (!Predstart [Pp])
r <- (!Predstart [Rr])
s <- (!Predstart [Ss])
t <- (!Predstart [Tt])
v <- (!Predstart [Vv])
z <- (!Predstart [Zz])

the monosyllabic classes may be followed by one vowel

if they start a Cvv-V cmapua unit; the others may never

be followed by vowels. Classes ending in -b are

used in Cvv-V cmapua units.

the single vowel classes were moved before the class

connective in the phonetics section.

Vo3 <- (Juncture? Vo2 !Badstress)
aa <- ([Aa] Juncture? [Aa] !Badstress Juncture? !Vo1)
ae <- ([Aa] Juncture? [Ee] !Badstress Juncture? !Vo1)
ai <- ([Aa] [Ii] !Badstress Juncture? !Vo1)
ao <- ([Aa] [Oo] !Badstress Juncture? !Vo1)
ai2 <- ([Aa] [Ii] !Badstress Juncture? &(Vo2 Juncture? !Vo1))
ao2 <- ([Aa] [Oo] !Badstress Juncture? &(Vo2 Juncture? !Vo1))
au <- ([Aa] Juncture? [Uu] !Badstress Juncture? !Vo1)
ea <- ([Ee] Juncture? [Aa] !Badstress Juncture? !Vo1)
ee <- ([Ee] Juncture? [Ee] !Badstress Juncture? !Vo1)
ei <- ([Ee] [Ii] !Badstress Juncture? !Vo1)
ei2 <- ([Ee] [Ii] !Badstress Juncture? &(Vo2 Juncture? !Vo1))
eo <- ([Ee] Juncture? [Oo] !Badstress Juncture? !Vo1)
eu <- ([Ee] Juncture? [Uu] !Badstress Juncture? !Vo1)
ia <- ([Ii] Juncture? [Aa] !Badstress Juncture? !Vo1)
ie <- ([Ii] Juncture? [Ee] !Badstress Juncture? !Vo1)
ii <- ([Ii] Juncture? [Ii] !Badstress Juncture? !Vo1)
io <- ([Ii] Juncture? [Oo] !Badstress Juncture? !Vo1)
iu <- ([Ii] Juncture? [Uu] !Badstress Juncture? !Vo1)
ia2 <- ([Ii] Juncture? [Aa] !Badstress Juncture? &(Vo2 Juncture? !Vo1))
ie2 <- ([Ii] Juncture? [Ee] !Badstress Juncture? &(Vo2 Juncture? !Vo1))
ii2 <- ([Ii] Juncture? [Ii] !Badstress Juncture? &(Vo2 Juncture? !Vo1))
io2 <- ([Ii] Juncture? [Oo] !Badstress Juncture? &(Vo2 Juncture? !Vo1))
iu2 <- ([Ii] Juncture? [Uu] !Badstress Juncture? &(Vo2 Juncture? !Vo1))
oa <- ([Oo] Juncture? [Aa] !Badstress Juncture? !Vo1)
oe <- ([Oo] Juncture? [Ee] !Badstress Juncture? !Vo1)
oi <- ([Oo] [Ii] !Badstress Juncture? !Vo1)
oi2 <- ([Oo] [Ii] !Badstress Juncture? &(Vo2 Juncture? !Vo1))
oo <- ([Oo] Juncture? [Oo] !Badstress Juncture? !Vo1)
ou <- ([Oo] Juncture? [Uu] !Badstress Juncture? !Vo1)
ua <- ([Uu] Juncture? [Aa] !Badstress Juncture? !Vo1)
ue <- ([Uu] Juncture? [Ee] !Badstress Juncture? !Vo1)
ui <- ([Uu] Juncture? [Ii] !Badstress Juncture? !Vo1)
uo <- ([Uu] Juncture? [Oo] !Badstress Juncture? !Vo1)
uu <- ([Uu] Juncture? [Uu] !Badstress Juncture? !Vo1)
ua2 <- ([Uu] Juncture? [Aa] !Badstress Juncture? &(Vo2 Juncture? !Vo1))
ue2 <- ([Uu] Juncture? [Ee] !Badstress Juncture? &(Vo2 Juncture? !Vo1))
ui2 <- ([Uu] Juncture? [Ii] !Badstress Juncture? &(Vo2 Juncture? !Vo1))
uo2 <- ([Uu] Juncture? [Oo] !Badstress Juncture? &(Vo2 Juncture? !Vo1))
uu2 <- ([Uu] Juncture? [Uu] !Badstress Juncture? &(Vo2 Juncture? !Vo1))

adding the new IY and UY, which might see use some time.

they are mandatory monosyllables but do not take a possible additional

following vowel as the regular ones do. So far only used in [ziy].

iy <- ([Ii] [Yy] !Badstress Juncture? !Vo1)
uy <- ([Uu] [Yy] !Badstress Juncture? !Vo1)

The optional pause and commas

this is a pause not required by the phonetics. This is the only

sort of pause which could in principle carry semantic freight (the

pause/GU equivalence beloved of our Founder) but we have abandoned

this. There is one place, after initial no in an utterance, where

a pause can have effect on the parse (but not on the meaning, I believe,

unless a word break is involved).

this class should NEVER be used in a context which might follow

a name word. In previous versions, pauses after name words were included

in the name word; this is not the case here, so a PAUSE

after a name word would not be recognized as a mandatory pause.

in any event, as long as we stay away from pause/GU equivalence, this

is not a serious issue!

this class does do some work in the handling of issues surrounding the legacy

shape of APA connectives, concerning which the less said, the better.

OptPause <- (Comma1 !(Vo1/Connective) &caprule)

more punctuation

Comma <- ([,] sp2 &caprule)
Comma2 <- ([,]? sp2 &caprule)

Part II: Lexicography

In this section I develop the grammar of words in Loglan. I'll work by editing the original provisional PEG grammar.

I place the start of this section exactly here, just before two final items of

punctuation, because these items of punctuation look forward not only to lexicography

but to the full grammar!

Period and end of utterance

the end of utterance symbol [#] should be added in the phonetics

section as a species of terminal marker. Done. We do *not* actually

endorse use of this marker, but we can notionally support it and it is in

our sources. End <- ((sp? '#' sp utterance)/(sp !.)/!.)

this rule allows terminal punctuation to be followed by an inverse vocative,

a frequent occurrence in Leith's novel, and something which makes sense. Period <- (([!.:;?] (&End/(sp &caprule))) (invvoc Period?)?)

Letters with y will be special cases

idea: allow IY and UY (always monosyllables) as vowel combinations in cmapua only.

done: Y has a name now. [yfi] is also added.

The cmapua word classes

the classes in this section after this point are the cmapua word classes of Loglan (if they begin with sp? or a word class).

I suppose the alien text classes are not really word classes, but they are lexicographic items, as it were.

Paradoxically, the PA and NI classes admit internal explicit pauses. So of course do predicate words!

Loglan does admit true multisyllable cmapua: there are words made of cmapua units which have joints between

units at which one cannot pause without breaking the word. Lojban, I am told, does not.

this version has the general feature that the quotation and alien text constructions are not hacked:

they are supported by the phonetic rules (as dire exceptions, of course) and the grammatical constructions

conform with the phonetic layer. Alien text and utterances quoted with [li]...[lu] can be enclosed in double quotes.

LI only supports full utterances, for the moment. All alien text constructors take the same class as argument:

the vocative and inverse vocative *require* quotes to avoid misreading ungrammatical expressions with typos

as correct (inverse) vocatives.

Letterals (first approximation)

the names [yfi], [ziy] for Y are supported. The Ceo names are left as they are. I decided that a second short series

of letteral pronouns is actually a reasonable use of short words, and the Ceio words are there for other uses.

TAI0 <- ((Vo1 Juncture? m a)/(Vo1 Juncture? f i)/(Vo1 Juncture? z i)/(!Predstart Co1 ai)/(!Predstart Co1 ei)/(!Predstart Co1 ai2 u)/(!Predstart Co1 ei2 u)/(!Predstart Co1 eo)/(z [Ii] Vo1 !Badstress Juncture? !Vo1 (m a)?))

Logical and causal connectives

a negative suffix used in various contexts. Always a suffix: its use as a prefix in tenses was a mistake in NB3 and is, I

think, still supported in LIP. Ambiguities demonstrably followed from this usage (an example of how the demonstration

of non-ambiguity of 1989 Loglan was compromised by the opaque lexicography).

NOI0 <- (n oi)

the logical connectives. [A0] is the class of core logical connectives. [A] is the fully decorated logical connective with

possible nu- (always in nuno- or nuu) and no- prefixes, possible -noi suffix, and possible (problematic) PA suffix, closed

with -fi (our new proposal) or an explicit pause. A0 <- (&Cmapua (a/e/o/u/(h a)/(n uu))) A <- (sp? !Predstart !TAI0 (n [o])? A0 NOI0? !(sp PAWORD0 OptPause) !(PAWORD0 !OptPause [ ,]) (PAWORD0 ((f i)/&OptPause))?)
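
The decoration of a core connective described above can be pictured with a small Python sketch. It is only an illustration under simplifying assumptions: it covers the no- prefix, the A0 core and the -noi suffix, and ignores the PA suffix, the -fi closure, pauses, stress and the !Predstart and !TAI0 guards. The helper name `split_connective` is hypothetical, and a regular expression backtracks where a PEG would not, so this shows the shape of the word rather than the parsing strategy.

```python
import re

# Toy model of "(n [o])? A0 NOI0?" from class A.
# Assumption: PA suffixes, -fi closure, pauses and stress are left out.
_A_SHAPE = re.compile(r"(no)?(ha|nuu|a|e|o|u)(noi)?")


def split_connective(word):
    """Return (prefix, core, suffix) for a decorated connective, else None."""
    m = _A_SHAPE.fullmatch(word.lower())
    if m is None:
        return None
    return (m.group(1) or "", m.group(2), m.group(3) or "")


print(split_connective("noanoi"))
print(split_connective("nuu"))
print(split_connective("kanoi"))
```

The third call returns None because [ka] is a forethought connective root, not a member of A0.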

4/18 in connected sentpreds, fi must be used to close, not a pause.

A2 <- sp? !predstart !TAI0 (N [o])? A0 NOI? !(sp PANOPAUSES PAUSE) !(PANOPAUSES !PAUSE [ ,]) (PANOPAUSES (F i))?

A not closed with -fi or a pause

ANOFI <- (sp? (!Predstart !TAI0 ((n [o])? A0 NOI0? PAWORD0?)))
AONE <- A

versions of A with different binding strength

ACI <- (ANOFI c i)
AGE <- (ANOFI g e)

a tightly binding series of logical connectives used to link predicates

this also includes the fusion connective [ze] when used between predicates.

CA0 <- (((n o)? ((c a)/(c e)/(c o)/(c u)/(z e)/(c i h a)/(n u c u))) NOI0?)
CA1 <- (CA0 !(sp PAWORD0 OptPause) !(PAWORD0 !OptPause [ ,]) (PAWORD0 ((f i)/&OptPause))?)
CANOFI1 <- (CA0 PAWORD0?)
CA <- (sp? CA1)

the fusion connective when used in arguments ZE <- (sp? (z e))

sentence connectives. [I] is the class of utterance initiators (no logical definition).

the subsequent classes are inhabited by sentence logical connectives with various binding

strengths.

I <- (sp? !Predstart !TAI0 i !(sp PAWORD0 OptPause) !(PAWORD0 !OptPause [ ,]) (PAWORD0 ((f i)/&OptPause))?)
ICA <- (sp? i ((h a)/CA1))
ICI <- (sp? i CANOFI1? c i)
IGE <- (sp? i CANOFI1? g e)

forethought logical connectives KA0 <- ((k a)/(k e)/(k o)/(k u)/(k i h a)/(n u k u))

causal and comparative modifiers KOU0 <- ((k ou)/(m oi)/(r au)/(s oa)/(m ou)/(c iu))

negative and converse forms KOU1 <- (((n u n o)/(n u)/(n o)) KOU0)
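
The order of the prefix alternatives in KOU1 matters, because a PEG choice commits to the first alternative that succeeds and is not retried when a later part of the sequence fails. The following Python sketch illustrates this under simplifying assumptions: words are plain lowercase strings, and stress, junctures and the !Predstart guards on the letter classes are ignored. The combinators `lit`, `seq` and `choice` and the rule `KOU1_MISORDERED` are hypothetical, introduced only for the illustration.

```python
def lit(s):
    """Match the literal string s; return the new position or None."""
    def parse(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return parse


def seq(*parsers):
    """PEG sequence: every part must match, one after another."""
    def parse(text, pos):
        for p in parsers:
            pos = p(text, pos)
            if pos is None:
                return None
        return pos
    return parse


def choice(*parsers):
    """PEG ordered choice: commit to the first alternative that matches."""
    def parse(text, pos):
        for p in parsers:
            end = p(text, pos)
            if end is not None:
                return end
        return None
    return parse


def matches(parser, word):
    """True if the parser consumes the whole word."""
    return parser(word, 0) == len(word)


KOU0 = choice(*(lit(w) for w in ("kou", "moi", "rau", "soa", "mou", "ciu")))
# Same order as the rule above: the longest prefix comes first.
KOU1 = seq(choice(lit("nuno"), lit("nu"), lit("no")), KOU0)
# Hypothetical misordering: "nu" shadows "nuno".
KOU1_MISORDERED = seq(choice(lit("nu"), lit("nuno"), lit("no")), KOU0)

print(matches(KOU1, "nunokou"))             # True
print(matches(KOU1_MISORDERED, "nunokou"))  # False
```

In the misordered version the choice succeeds on [nu], KOU0 then fails on the remaining "nokou", and the whole sequence fails without ever trying [nuno].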

the full type of forethought connectives, adding the causal and comparative connectives KA <- (sp? (KA0/((KOU1/KOU0) k i)) NOI0?)

the last component of the KA...KI... structure of forethought connections KI <- (sp? (k i) NOI0?)

causal and comparative modifiers which are *not* forethought connectives

KOU2 <- (KOU1 !KI)

Quantity words

a test used to at least partially enforce the penultimate stress rule on quantifier predicates BadNIStress <- ((Co1 Vo2 Vo2? Stress (m a)? (m oa)? NI RA0)/(Co1 Vo2 Stress Vo2 (m a)? (m oa)? NI RA0))

root quantity words, including the numerals (removing [kue] for another use) NI0 <- (!BadNIStress ((k ua)/(g ie)/(g iu)/(h ie)/(h iu)/(n ea)/(n io)/(p ea)/(p io)/(s uu)/(s ua)/(t ia)/(z oa)/(z oo)/(h o)/(n i)/(n e)/(t o)/(t e)/(f o)/(f e)/(v o)/(v e)/(p i)/(r e)/(r u)/(s e)/(s o)/(h i)))

the class of SA roots, which modify quantifiers SA0 <- (!BadNIStress ((s a)/(s i)/(s u)/(ie (Comma2? !ie SA0)?)) NOI0?)

the family of quantifiers which double as suffixes for the quantifier predicates

this class perhaps should also include some other quantifier words. [re] for example ought to be handled in the same way as [ra,ri,ro].

No action here, just a remark. Added [bao], which forms lambda abstractions (relations) not statements, to be used in lepu clauses. RA0 <- (!BadNIStress ((r a)/(r i)/(r e)/(r u)/(r o)/(b ao)))

re and ru added to class RA 5/11/18

quantifier units consisting of a NI or RA root with [ma] 00 or [moa] 000 appended; to [moa] one can further

append a digit to iterate [moa]: [fomoate] is four billion, for example. [rimoa], a few thousand.

a NI1 or RA1 may be followed by a pause before another NI word other than a numerical predicate;

one is allowed to breathe in the middle of long numerals. I question whether the pause

provision makes sense in RA1. NI1 <- ((NI0 (!BadNIStress m a)? (!BadNIStress m oa NI0*)?) (Comma2 !(NI RA0) &NI)?) RA1 <- ((RA0 (!BadNIStress m a)? (!BadNIStress m oa NI0*)?) (Comma2 !(NI RA0) &NI)?)
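
The arithmetic of the [ma] and [moa] suffixes can be sketched in Python. This is an illustration only, under stated assumptions: the digit values are the usual Loglan numerals (ni = 0 through ve = 9), which this section does not list; only digit roots are handled, not RA roots such as [ri] or the other NI0 words; at most one digit may follow [moa], although the rule allows NI0*; pauses and stress are ignored. The name `unit_value` is hypothetical.

```python
import re

# Assumed digit values (standard Loglan numerals, not stated in this section).
DIGITS = {"ni": 0, "ne": 1, "to": 2, "te": 3, "fo": 4,
          "fe": 5, "so": 6, "se": 7, "vo": 8, "ve": 9}

_DIGIT = "|".join(DIGITS)
# digit root, optional [ma], optional [moa] with an optional iteration digit
_UNIT = re.compile(rf"({_DIGIT})(ma)?(moa({_DIGIT})?)?")


def unit_value(word):
    """Numeric value of a toy NI1 unit such as 'tema' or 'fomoate'."""
    m = _UNIT.fullmatch(word)
    if m is None:
        raise ValueError(f"not a recognised quantity unit: {word!r}")
    value = DIGITS[m.group(1)]
    if m.group(2):                      # [ma] appends 00
        value *= 100
    if m.group(3):                      # [moa] appends 000, possibly iterated
        repeat = DIGITS[m.group(4)] if m.group(4) else 1
        value *= 1000 ** repeat
    return value


print(unit_value("fomoate"))  # 4000000000
print(unit_value("tema"))     # 300
```

The first call reproduces the example in the text: [fo] is 4 and [moate] iterates [moa] three times, giving four billion.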

a composite NI word, optional SA prefix before a sequence of NI words or a RA word,

or a single SA word [which will modify a default quantifier not expressed],

possibly negated, connected with CA0 roots to other such constructs. NI2 <- (((SA0? (NI1+/RA1))/SA0) NOI0? (CA0 ((SA0? (NI1+/RA1))/SA0) NOI0?)*)

a full NI word with an acronymic dimension (starting with [mue], ending with a pause) or [cu] appended. I need to look up [cu]

and figure out its semantics. An arbitrary name word may now be used as a dimension, as well. NI <- (sp? (p i)? NI2 ((&(m ue) ACRONYM (Comma/&End/&Period) !(c u))/(Comma2? m ue Comma2? PRENAME !(c u)))? (c u)?)

mex is now identical with NI, but it's in use in later rules.

mex <- (sp? NI)

The overused CI

a word used for various tightly binding constructions: a sort of verbal hyphen.

also a name marker, which means phonetic care is needed (pause after constructions with [ci]).

CI <- (sp? (c i))

Acronyms

Acronyms, which are names (not predicates as in 1989 Loglan) or dimensions (in NI above).

units in acronym are TAI0 letterals, zV short forms for vowels, the dummy unit [mue], and NI1

quantity units. NI1 quantity units may not be initial. [mue] units may be preceded by pauses.

An acronym has at least two units.

it is worth noting that acronyms, once viewed as names, could be entirely suppressed as a feature of the

grammar by really making them names (terminate them with -n). I suppose a similar approach would work

for dimensions, allowing any name word to serve as a dimension. [mue] would be a name marker for use

with dimensions in this case. [temuedain], three dollars. Now supported.

ACRONYM <- (sp? &caprule ((m ue)/TAI0/(z Vo2 !Vo2)) ((Comma &ACRONYM m ue)/NI1/TAI0/(z Vo2 (!Vo2/(z &Vo2))))+)

Letterals and other pronouns

the full class of letterals, including the [gao] construction whose details I should look at. TAI <- (sp? (TAI0/((g ao) !Vo2 sp? (PRENAME/Predicate/CmapuaUnit))))

atomic non-letteral pronouns. #4/15/2019 reserved [koo] for a Lojban style imperative pronoun, though not officially adopting it. Also adding [dao] for a default, don't care argument, another Lojban feature. DA0 <- ((t ao)/(t io)/(t ua)/(m io)/(m iu)/(m uo)/(m uu)/(t oa)/(t oi)/(t oo)/(t ou)/(t uo)/(t uu)/(s uo)/(h u)/(b a)/(b e)/(b o)/(b u)/(d a)/(d e)/(d i)/(d o)/(d u)/(m i)/(t u)/(m u)/(t i)/(t a)/(m o)/(k oo)/(d ao))

letterals (not including [gao] constructions) and atomic pronouns, optionally suffixed with a digit. One should pause after the

suffixed forms, because [ci] is a name marker.

DA1 <- ((TAI0/DA0) (c i ![ ] NI0)?)

general pronoun words.

DA <- (sp? DA1)

Tenses, locatives and modals

roots for PA words: tense and location words, prepositions building relative modifiers. All can optionally be negated with -noi. They may also be quantified. They may also be closed with ZI class affixes. PA cores.

put the long list of atomic PA words in a wrapper

PA00 <- ((g ia)/(g ua)/(p au)/(v au)/(f au)/(p ia)/(p ua)/(n ia)/(n ua)/(b iu)/(f ea)/(f ia)/(f ua)/(v ia)/(v ii)/(v iu)/(c oi)/(d au)/(d ii)/(d uo)/(f oi)/(f ui)/(g au)/(h ea)/(k au)/(k ii)/(k ui)/(l ia)/(l ui)/(m ia)/(n ui)/(p eu)/(r oi)/(r ui)/(s ea)/(s io)/(t ie)/(v ie)/(v a)/(v i)/(v u)/(p a)/(n a)/(f a)/(v a)/(KOU0 !(n oi) !KI))
PA0 <- (NI2? (n u !KOU0)? PA00 (n oi)? ZI?)

the form used for actual prepositions and suffixes to A words, with minimal pauses allowed.

these are built by concatenating KOU2 and PA0 units, then linking these with CA0 roots (which can take

no- prefixes and -noi suffixes, and next to which one *can* pause), optionally suffixed with a class ZI suffix. PAWORD0 <- ((KOU2/PA0)+ ((Comma2? CA0 Comma2?) (KOU2/PA0)+)*)

prepositional words PAWORD <- (sp? PAWORD0)

class PA can appear as tense markers or as relative modifiers without arguments; here pauses

are allowed not only next to CA0 units but between KOU2/PA units. Like NI words, PA

words are a class of arbitrary length constructions, and we think breaths within them

(especially complex ones) are natural.

PAPHRASE0 <- (((KOU2/PA0)+ (((Comma2? CA0 Comma2?)/(Comma2 !mod1a)) (KOU2/PA0)+)*) !modifier)
PAPHRASE <- (sp? PAPHRASE0)
GA <- (sp? (g a))

the class of tense markers which can appear before predicates. TENSE <- (PAPHRASE/GA)

suffixes which indicate extent or remoteness/proximity of the action of prepositions.

ZI <- ((z i)/(z a)/(z u))

Articles and other descriptors

the primitive description building "articles". These include [la] which requires special

care in its use because it is a name marker. LE <- (sp? ((l ea)/(l eu)/(l oe)/(l ee)/(l aa)/(l e)/(l o)/(l a)))

articles which can be used with abstract descriptions: these include some quantity words.

this means that some abstract descriptions are semantically indefinites: I wonder if this

could be improved by having a separate abstract indefinite construction. LEFORPO <- (sp? ((l e)/(l o)/NI2))

the numerical/quantity article. LIO <- (sp? (l io))

structure words for the ordered and unordered list constructions.

LAU <- (sp? (l au))
LOU <- (sp? (l ou))
LUA <- (sp? (l ua))
LUO <- (sp? (l uo))
ZEIA <- (sp? z ei2 a)
ZEIO <- (sp? z ei2 o)

initial and final words for quoting Loglan utterances.

LIWORD <- (l i)
LUWORD <- (l u)

Quotations and other alien text constructions

quoting Loglan utterances, with or without explicit double quotes (if they appear, they must

appear on both sides). The previous version allowed quotation of names; likely this should

be restored. liquote <- ((sp? LIWORD Comma2? utterance0 Comma2? LUWORD)/(sp? LIWORD Comma2? [\"] utterance0 [\"] Comma2? LUWORD))
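
The requirement that double quotes, if present, appear on both sides can be sketched in Python. This is a toy check under loud assumptions: the quoted body is treated as opaque text rather than parsed as utterance0, words are separated by plain whitespace, and the optional Comma2 pauses are ignored. The helper name `li_quote_body` and the sample text are illustrative only.

```python
def li_quote_body(text):
    """Return the body of a toy [li]...[lu] quotation, or None if malformed."""
    words = text.split()
    if len(words) < 3 or words[0] != "li" or words[-1] != "lu":
        return None
    body = " ".join(words[1:-1])
    opens, closes = body.startswith('"'), body.endswith('"')
    if opens != closes or body == '"':
        return None                  # a double quote on one side only
    if opens:
        body = body[1:-1]            # strip the matching pair
    if '"' in body or not body.strip():
        return None                  # stray quote inside, or empty body
    return body


print(li_quote_body('li "mi cluva tu" lu'))
print(li_quote_body('li mi cluva tu lu'))
print(li_quote_body('li "mi cluva tu lu'))
```

The first two calls return the same body, and the third returns None because the opening quote has no partner.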

the foreign name construction. This is an alien text construction LAO <- (sp? &(LAOalien Juncture?) AlienWord)

the strong quotation construction. This is an alien text construction. LIE <- (sp? &(LIEalien Juncture?) AlienWord) LIOALIEN <- (sp? &(LIOalien Juncture?) AlienWord)

I am not sure this class is used at all. Lw <- Cmapua

articles for quotation of words LIU0 <- ((l iu)/(n iu))

this now imposes the condition that an explicit comma pause (or terminal punctuation, or end) must appear at the end of the

Word or PreName quoted with [liu]. This seems like a good idea, anyway.

this class appeals to the phonetics. Words and PreNames can be quoted. The ability to quote names

here may remove the need to quote them with [li]...[lu]. Of course, some Words are in fact phrases rather

than single words: we will see whether the privileges afforded are used. The final clause allows

use of letterals as actual names of letters.

added [niu]: didn't make it a name marker. LNIU <- (([Ll]/[Nn]) [iI] Juncture? [Uu]) LIU <- ((sp? LNIU Juncture? !Vo1 Comma2? (PRENAME/Word) &(Comma/Terminal/End))/(sp? (l ii TAI)))

the construction of foreign and onomatopoeic predicates. These are alien text constructions.

SUE <- (sp? &(([Ss] [Uu] Juncture? [Ee] Juncture?)/([Ss] [Aa] [Oo] Juncture?)) AlienWord)

Assorted left and right closers

left marker in a predicate metaphor construction CUI <- (sp? (c ui))

other uses of GA GATWO <- (sp? (g a))

ge/geu act as "parentheses" to make an atomic predicate from a complex of metaphorically

and logically connected predicates; [ge] has other left marking uses.

GE <- (sp? (g e))
GEU <- (sp? ((c ue)/(g eu)))

final marker of a list of head terms GI <- (sp? ((g i)/(g oi)))

used to move a normally prefixed metaphorical modifier after what it modifies. GO <- (sp? (g o))

marker for second and subsequent arguments before the predicate; NEW GIO <- (sp? (g io))

the generic right marker of many constructions. GU <- (sp? (g u))

various flavors of right markers.

It should be noted that at one point I executed a program of simplifying these to

reduce the likelihood that multiple [gu]'s would ever be needed to close an utterance.

first of all, I made the closures leaner, moving them out of the classes closed

to their clients so that they generally can be used only when needed.

Notably, the grammar of [guu] is quite different. Second,

I introduced some new flavors of right marker. All can be realized with [gu],

but if one knows the right flavor one can close the right structure with a single

right closure.
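
The difference between closing with repeated [gu] and closing with one flavored marker can be pictured with a toy stack model in Python. This is an assumption-laden sketch, not the grammar's mechanism: it supposes that each open structure accepts either its own flavored closer or the generic [gu], that [gu] closes only the innermost open structure, and that a flavored closer ends the nearest structure of its flavor while any structures nested inside it end implicitly. The function `close` and the sample stack are hypothetical.

```python
GENERIC = "gu"


def close(open_closers, word):
    """Apply one right marker to a stack of open structures.

    open_closers lists, outermost first, the flavored closer that each
    open structure accepts. Returns the stack left after the marker.
    """
    if not open_closers:
        raise ValueError("nothing to close")
    if word == GENERIC:
        return open_closers[:-1]          # closes only the innermost structure
    for i in range(len(open_closers) - 1, -1, -1):
        if open_closers[i] == word:
            return open_closers[:i]       # inner structures end implicitly
    raise ValueError(f"no open structure accepts {word!r}")


# A term list (guu) containing a subordinate clause (gui)
# containing an abstract predicate (guo).
stack = ["guu", "gui", "guo"]

print(close(stack, "guu"))
print(close(close(close(stack, "gu"), "gu"), "gu"))
```

Both calls leave an empty stack, the first with a single [guu] and the second with three [gu]'s.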

right markers of subordinate clauses (argument modifiers).

[gui] closes a different class than in the trial.85 grammar, with

similar but on the whole better results. GUIZA <- (sp? (g ui) (z a)) GUIZI <- (sp? (g ui) (z i)) GUIZU <- (sp? (g ui) (z u)) GUI <- (!GUIZA !GUIZI !GUIZU (sp? (g ui)))

right markers of abstract predicates and descriptions.

probably the forms with z are to be preferred (and the other

two are not needed) but I preserve all five classes for now.

GUO <- (sp? (g uo))
GUOA <- (sp? ((g uo2 a)/(g uo z a)))
GUOE <- (sp? (g uo2 e))
GUOI <- (sp? ((g uo2 i)/(g uo z i)))
GUOO <- (sp? (g uo2 o))
GUOU <- (sp? ((g uo2 u)/(g uo z u)))

right marker used to close term (argument/predicate modifier) lists.

it is important to note that in our grammar GUU is not a component of

the class termset, nor is it a null termset: it appears in other classes

which include termsets as an option to close them. The effects are similar

to those in the trial.85 grammar, but there is less of a danger that

extra unexpected closures will be needed. GUU <- (sp? (g uu))

a new closure for arguments in various contexts GUUA <- (sp? (g uu2 a))

a new closure for sentences. In particular, it

may have real use in closing up the scope of a list of

fronted terms before a series of logically connected sentences. GIUO <- (sp? (g iu2 o))

right marker used to close arguments tightly linked with JE/JUE. GUE <- (sp? (g ue))

a new closure for descpreds

GUEA <- (sp? (g ue2 a))

Miscellaneous clause constructors

used to build tightly linked term lists.

JE <- (sp? (j e))
JUE <- (sp? (j ue))

used to build subordinate clauses (argument modifiers).

JIZA <- (sp? ((j ie)/(j ae)/(p e)/(j i)/(j a)) (z a))
JIOZA <- (sp? ((j io)/(j ao)) (z a))
JIZI <- (sp? ((j ie)/(j ae)/(p e)/(j i)/(j a)) (z i))
JIOZI <- (sp? ((j io)/(j ao)) (z i))
JIZU <- (sp? ((j ie)/(j ae)/(p e)/(j i)/(j a)) (z u))
JIOZU <- (sp? ((j io)/(j ao)) (z u))
JI <- (!JIZA !JIZI !JIZU (sp? ((j ie)/(j ae)/(p e)/(j i)/(j a))))
NUJI <- (sp? n u !sp JI)
JIO <- (!JIOZA !JIOZI !JIOZU (sp? ((j io)/(j ao))))

Case tags, semantic and positional

case tags, both numerical position tags and the optional semantic case tags. DIO <- ((sp? ((b eu)/(c au)/(d io)/(f oa)/(k ao)/(j ui)/(n eu)/(p ou)/(g oa)/(s au)/(v eu)/(z ua)/(z ue)/(z ui)/(z uo)/(z uu))) ((c i ![ ] NI0)/ZI)?)

markers of indirect reference. Originally these had the same grammar as case tags,

but they are now different.

LAE <- (sp? ((l ae)/(l ue)))

The predicate constructor me

[me] turns arguments into predicates, [meu] closes this construction.

ME <- (sp? ((m ea)/(m e)))
MEU <- (sp? m eu)

Reflexive and conversion operators

reflexive and conversion operators: first the root forms, then those with

optional numerical suffixes.

NU0 <- ((n uo)/(f uo)/(j uo)/(n u)/(f u)/(j u)/(k ue))
NU <- (sp? ((((n u)/(n uo)/(k ue)) !(sp (NI0/RA0)) (NI0/RA0)?)/NU0)+ freemod?)

Abstract predicate constructors

I do *not* think

that [poia] will really be confused with [po ia], particularly

since we do require an explicit pause before [ia] in the latter case,

but I record this concern: the forms with z might be preferable.

constructions from sentences

PO1 <- (sp? ((p o)/(p u)/(z o)))
PO1A <- (sp? ((p oi2 a)/(p ui2 a)/(z oi2 a)/(p o z a)/(p u z a)/(z o z a)))
PO1E <- (sp? ((p oi2 e)/(p ui2 e)/(z oi2 e)))
PO1I <- (sp? ((p oi2 i)/(p ui2 i)/(z oi2 i)/(p o z i)/(p u z i)/(z o z i)))
PO1O <- (sp? ((p oi2 o)/(p ui2 o)/(z oi2 o)))
PO1U <- (sp? ((p oi2 u)/(p ui2 u)/(z oi2 u)/(p o z u)/(p u z u)/(z o z u)))

abstract predicate constructor from simple predicates POSHORT1 <- (sp? ((p oi)/(p ui)/(z oi)))

word forms associated with the above abstract predicate root forms

PO <- (sp? PO1)
POA <- (sp? PO1A)
POE <- (sp? PO1E)
POI <- (sp? PO1I)
POO <- (sp? PO1O)
POU <- (sp? PO1U)
POSHORT <- (sp? POSHORT1)

register markers

DIE <- (sp? ((d ie)/(f ie)/(k ae)/(n ue)/(r ie)))

freemods and freemod builders

vocative forms: I still have the words of social lubrication as

vocative markers. HOI <- (sp? ((h oi)/(l oi)/(l oa)/(s ia)/(s ie)/(s iu)))

the verbal scare quote. The quantifier suffix indicates how many preceding words are affected;

this is an odd mechanism. JO <- (sp? (NI0/RA0/SA0)? (j o))

markers for forming parenthetical utterances as free modifiers.

KIE <- (sp? (k ie))
KIU <- (sp? (k iu))
KIE2 <- (sp? k ie Comma2? [(])
KIU2 <- (sp? [)] Comma2? k iu)

marker for forming smilies. SOI <- (sp? (s oi))

a grab bag of attitudinal words, including but not restricted to the VV forms. UI0 <- (!Predstart ((!([Ii] Juncture? [Ee]) Vv Juncture?)/(b ea)/(b uo)/(c ea)/(c ia)/(c oa)/(d ou)/(f ae)/(f ao)/(f eu)/(g ea)/(k uo)/(k uu)/(r ea)/(n ao)/(n ie)/(p ae)/(p iu)/(s aa)/(s ui)/(t aa)/(t oe)/(v oi)/(z ou)/(l oi)/(l oa)/(s ia)/(s ii)/(t oe)/(s iu)/(c ao)/(c eu)/(s ie)/(s eu)/(s ie2 i)))

negative forms of the attitudinals. The ones with [no] before the two vowel forms are a phonetic exception. The others

should also be (though they present no pronunciation problem) so that they are resolved as single words.

There is a strong reason for [o] here. NOUI <- ((sp? UI0 NOI0)/(sp? n [o] Juncture? Comma? sp? UI0))

all attitudinals (adding the discursives nefi, tofi... etc)

there is a technical problem with mixing UI0 roots of VV and CVV shapes. UI <- (sp? (UI0+/(NI f i)))

the inverse vocative marker HUE <- (sp? (h ue)) Negation

occurrences of [no] as a word rather than an affix. NOWORD <- (sp? !KOU1 !NOUI (n o) !(Comma2? z ao Comma2? Predicate) !(sp? KOU0) !(sp? (JIO/JI/JIZA/JIOZA/JIZI/JIOZI/JIZU/JIOZU))) The large word classes (names and predicates)

Names, acronyms and PreNames from above. ACRONYMICNAME <- (ACRONYM &(Comma/Period/End)) DJAN <- (PRENAME/ACRONYMICNAME)

predicate words which are phonetically cmapua

"identity predicates". Converses are provided as a new proposal. BI <- (sp? (n u)? ((b ia)/(b ie)/(c ie)/(c io)/(b ia)/(b i)/(b ii)))

interrogative and pronoun predicates LWPREDA0 <- ((h e)/(d ua)/(d ui)/(b ua)/(b ui))

here I should reinstall the [zao] proposal.

the predicate words defined above in the phonetics section Predicate <- ((CmapuaUnit Comma2? z ao Comma2?)* Complex (Comma2? z ao Comma2? Predicate)?)

predicate words, other than the "identity predicates" of class [BI]

these include the numerical predicates (NI RA), also cmapua phonetically.

we are installing John Cowan's [zao] proposal here, experimentally, 4/15/2019

PREDA <- (sp? &caprule (Predicate/LWPREDA0/(![ ] NI RA0)))

Part 3: The Grammar Proper

Right markers turned into classes

guoa <- (OptPause? (GUOA/GU) freemod?)
guoe <- (OptPause? (GUOE/GU) freemod?)
guoi <- (OptPause? (GUOI/GU) freemod?)
guoo <- (OptPause? (GUOO/GU) freemod?)
guou <- (OptPause? (GUOU/GU) freemod?)
guo <- (!guoa !guoe !guoi !guoo !guou (OptPause? (GUO/GU) freemod?))
guiza <- (OptPause? (GUIZA/GU) freemod?)
guizi <- (OptPause? (GUIZI/GU) freemod?)
guizu <- (OptPause? (GUIZU/GU) freemod?)
gui <- (OptPause? (GUI/GU) freemod?)
gue <- (OptPause? (GUE/GU) freemod?)
guea <- (OptPause? (GUEA/GU) freemod?)
guu <- (OptPause? (GUU/GU) freemod?)
guua <- (OptPause? (GUUA/GU) freemod?)
giuo <- (OptPause? (GIUO/GU) freemod?)
meu <- (OptPause? (MEU/GU) freemod?)
geu <- GEU

Here note the absence of pause/GU equivalence. gap <- (OptPause? GU freemod?) The vocative and inverse vocative

this is the vocative construction. It can appear early because all of its components are marked.

the intention is to indicate who is being addressed. This can be handled via a name, a descriptive argument, a predicate or an

alien text name (the last must be quoted). The complexities of these grammatical constructions can be deferred until they are

introduced.

# HOI0 <- sp? [Hh] [Oo] [Ii] juncture?

restore words of social lubrication as vocative markers but not as name markers: [loi, Djan]

I do not allow a freemod to intervene between a vocative marker and the associated

utterance, to avoid unintended grabbing of subjects by the words of social lubrication when they are used

as vocative markers. This lets [Loi, Djan] and [Loi hoi Djan] be equivalent. The comma is needed in the

first because the social lubrication words are in this version not name markers.

HOI0 <- ((sp? (([Hh] oi)/([Ll] oi)/([Ll] oa)/([Ss] ia)/([Ss] ie)/([Ss] iu))) Juncture? !Vo1)
voc <- ((HOI0 Comma2? name)/(HOI Comma2? descpred guea? namesuffix?)/(HOI Comma2? argument1 guua?)/(sp? &([Hh] [Oo] [Ii] Juncture?) AlienWord))

this is the inverse vocative. It can appear early because all of its components are marked.

the intention is to indicate who is speaking. The range of ways this can be handled is similar to the range of ways it can be

handled for the vocative; there is the further option of a sentence (the [statement] class) and there is a strong closure option

for the case where an argument is used (to avoid it inadvertently expanding to a sentence).

HUE0 <- (sp? &caprule [Hh] [Uu] Juncture? [Ee] Juncture? !Vo1)
invvoc <- ((HUE0 Comma2? name)/(HUE freemod? descpred guea? namesuffix?)/(HUE freemod? argument1 &(freemod? gasent1a/gasent2/freemod? &subject statement))/(HUE freemod? statement giuo?)/(HUE freemod? argument1 guu?)/(sp? &([Hh] [Uu] Juncture? [Ee] Juncture?) AlienWord))

Free modifiers

this is the class of free modifiers. Most of its components are head marked (those that aren't appear just above),

and it is useful for it to appear early because these things appear everywhere in subsequent constructions. A free modifier,

of whatever sort, is a freely insertable gadget which modifies the immediately preceding construction, or the entire utterance

if it is initial.

NOUI is a negated attitudinal word. UI1 is an attitudinal word: these express an emotional attitude toward the

assertion (noting that EI marks questions (yes or no answer expected) and SEU marks utterances as answers).

SOI creates smilies in a general sense: [soi crano] indicates that the listener should imagine the speaker smiling;

similarly for other predicates.

DIE and NO DIE are register markers, communicating the social attitude of the speaker toward the one addressed: [die] for

example is "dear"

KIE...KIU constructs a full parenthetical utterance as a comment, which can be enclosed in actual parentheses inside

the marker words.

JO is a scare quote device.

deletion of a previous word or wordlike unit (or more than one) using K IA kiamod <- (Comma2? !(!PRENAME !Predstart k ia) ((PRENAME/LIU/AlienWord/(Cmapua (sp? !(k ia) !PRENAME !Predstart Cmapua)*)/Word) kiamod* Comma2? !PRENAME !Predstart k ia) Comma2?)
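The effect of [kia] can be sketched in Python. This is only an illustration under simplifying assumptions: every whitespace-separated token is treated as one deletable unit (the kiamod rule also allows a PRENAME, a LIU quotation, an AlienWord or a run of cmapua as the unit), and the function name apply_kia and the sample words are hypothetical, not part of the grammar.

```python
def apply_kia(words):
    """Return the words that survive once each [kia] deletes the unit before it.

    Assumption: one token is one deletable unit. A stack models the nesting
    allowed by kiamod*: a later [kia] deletes the nearest surviving unit.
    """
    kept = []
    for word in words:
        if word.lower() == "kia":
            if not kept:
                # Mirrors the guard in kiamod: [kia] needs a unit to delete.
                raise ValueError("[kia] with nothing before it to delete")
            kept.pop()
        else:
            kept.append(word)
    return kept


if __name__ == "__main__":
    print(apply_kia("mi cluva kia fundi tu".split()))
```

With this reading, two consecutive [kia] delete the two preceding units, which matches the way a kiamod may itself be followed by further kiamods before the closing [kia].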

the comma is a freemod with no semantic content: this is a device for discarding phonetically required pauses

and the speaker's optional pauses alike. The pause before a non-pause marked prename is part of the NameWord and so

is excluded. Ellipses and dashes are fancy pauses supported as freemods. freemod <- ((kiamod/NOUI/(SOI freemod? descpred guea?)/DIE/(NOWORD DIE)/(KIE Comma? utterance0 Comma? KIU)/(KIE2 Comma? utterance0 Comma? KIU2)/invvoc/voc/(Comma !(!FalseMarked PRENAME))/JO/UI/(sp? '...' (sp? &Letter)?)/(sp? '--' (sp? &Letter)?)) freemod?) Tightly bound arguments and lists thereof

the classes juelink to linkargs describe very tightly bound arguments which can be firmly attached to predicates in

the context of metaphorical modifications and the use of predicates in descriptive arguments.

note that we allow predicate modifiers (prepositional phrases) to be bound with [je/jue] which is not

allowed in 1989 Loglan, but which we believe is supported in Lojban. juelink <- (JUE freemod? (term/(PAPHRASE freemod? gap?))) links1 <- (juelink (freemod? juelink)* gue?) links <- ((links1/(KA freemod? links freemod? KI freemod? links1)) (freemod? AONE freemod? links1)*) jelink <- (JE freemod? (term/(PAPHRASE freemod? gap?))) linkargs1 <- (jelink freemod? (links/gue)?) linkargs <- ((linkargs1/(KA freemod? linkargs freemod? KI freemod? linkargs1)) (freemod? AONE freemod? linkargs1)*) Abstract argument constructions

class abstractpred supports the construction of event, property, and quantity predicates from sentences. These are

closable with [guo] if introduced with [po,pu,zo] and closable with suffixed variants of [guo] if introduced with suffixed

variants of [po,pu,zo] (a NEW idea but it is clear that closure of these predicates (and of the more commonly

used associated descriptions) is an important issue).

using sentenceclone so that subject free sentences will not be marked as imperative abstractpred <- ((POA freemod? uttAxclone guoa?)/(POA freemod? sentenceclone guoa?)/(POE freemod? uttAxclone guoe?)/(POE freemod? sentenceclone guoe?)/(POI freemod? uttAxclone guoi?)/(POI freemod? sentenceclone guoi?)/(POO freemod? uttAxclone guoo?)/(POO freemod? sentenceclone guoo?)/(POU freemod? uttAxclone guou?)/(POU freemod? sentenceclone guou?)/(PO freemod? uttAxclone guo?)/(PO freemod? sentenceclone guo?)) Atomic predicates (predunit)

predunit1 describes the truly atomic forms of predicate.

PREDA is the class of predicate words (the phonetic predicate words along with the special phonetic cmapua which are predicates, listed

above under the PREDA rule). NU PREDA handles permutations and identifications of arguments of PREDAs.

SUE contains the alien text constructions with [sao] and [sue], semantically quite different but syntactically handled

in the same way.

[ge]...[geu/cue] (the closing optional) can parenthesize a fairly complex predicate phrase and turn it into an atomic form. These

forms can have conversion or reflexive operators (NU) applied. I should look into why the class handled in the conversion case

is different. An important use of this is in metaphor constructions, but it has other potential uses.

abstractpred is the class of abstraction predicates just introduced above. These are treated as atomic in this grammar: it should

be noted that their privileges in the trial.85 grammar are (absurdly) limited.

[me]...[meu] (the closing optional, but important to have available) forms predicates from arguments, the predicate being true of the

objects to which the argument refers. [Ti me le mrenu] : this is one of the men we are talking about. predunit1 <- ((SUE/(NU freemod? GE freemod? despredE (freemod? geu Comma?)?)/(NU freemod? PREDA)/(Comma? GE freemod? descpred (freemod? geu Comma?)?)/abstractpred/(ME freemod? argument1 meu?)/PREDA) freemod?)

[no] binds very tightly to predunit1: a possibly multiply negated predunit1 (or an unadorned predunit1) is a predunit2. predunit2 <- ((NOWORD freemod?)* predunit1)

an instance of NO2 (class neg2) is one not absorbed by a predunit. Example: [Da no kukra prano] (X is a slow (not-fast) runner) vs

[Da no ga kukra prano] (X is not a fast runner, and in fact may not run at all).

neg2 <- (!predunit2 NOWORD)

a predunit3 is a predunit2 with tightly attached arguments. predunit3 <- ((predunit2 freemod? linkargs)/predunit2)

a predunit is a predunit3 or a predunit3 converted by the short-scope abstraction operators

[poi/pui/zoi] to an abstraction predicate. This is the kind of predicate which can appear as

a component in a serial name. predunit <- ((POSHORT freemod?)? predunit3)

a further "atomic" (because tightly packaged) form is a forethought connected pair

of predicates (this being the full predicate class defined at the end of the process)

possibly closed with [guu], possibly multiply negated as well.

the closure with guu eliminated the historic rule against kekked heads of metaphors. kekpredunit <- ((NOWORD freemod?)* KA freemod? predicate freemod? KI freemod? predicate guu?) The construction of metaphors

there follows the construction of metaphorically modified predicates,

along with tightly logically linked predicates.

CI and simple juxtaposition of predicates both represent modification of the second

predicate by the first. We impose no semantic conditions on this modification,

except in the case of modification by predicates logically linked with CA,

which do distribute logically in the expected way both as modifiers and as modified.

We do not regard [preda1 preda2] as necessarily implying preda2: we do regard

it as having the same place structure as preda2. It is very often but not always

a qualification or kind of preda2; in any case it is a relation analogous to preda2.

modification with CI binds most tightly.

we eliminated the distinction between the series of sentence and description

predicate preliminary classes: there seems to be no need for it even in the

trial.85 grammar. despredA <- ((predunit/kekpredunit) (freemod? CI freemod? (predunit/kekpredunit))*)

this is logical connection of predicates with the tightly binding CA

series of logical connectives. CUI can be used to expand the scope of

a CA connective over a metaphor on the left. [ge]...[geu] is used to expand

scope on the right (and could also be used on the left, it should be noted).

despredC is an internal helper of despredB that assists the function of CUI.

the !PREDA in front of CUI is probably not needed. despredB <- ((!PREDA CUI freemod? despredC freemod? CA freemod? despredB)/despredA) despredC <- (despredB (freemod? despredB)*)

tight logical linkage of despredB's despredD <- (despredB (freemod? CA freemod? despredB)*)

chain of modifications of despredD's (grouping to the left) despredE <- (despredD (freemod? despredD)*)

the GO construction allows inverse modification: [preda1 GO preda2] is [preda2 preda1] as it were.

there are profound effects on grouping. descpred <- ((despredE freemod? GO freemod? descpred)/despredE)
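The grouping effect of GO can be made concrete with a small Python sketch. It is a toy model only: each predicate is a single token, freemods and the CI/CA layers are ignored, and a modification is written as a pair (modifier, head). The function names chain and parse_descpred are hypothetical.

```python
def chain(tokens):
    """Left-grouping chain of modifications, as in despredE: ((a b) c)."""
    if not tokens:
        raise ValueError("a predicate chain needs at least one predicate")
    tree = tokens[0]
    for token in tokens[1:]:
        tree = (tree, token)  # (modifier, head)
    return tree


def parse_descpred(tokens):
    """Toy version of descpred <- (despredE GO descpred) / despredE."""
    if "go" in tokens:
        split = tokens.index("go")
        head = chain(tokens[:split])                   # despredE before [go]
        modifier = parse_descpred(tokens[split + 1:])  # right recursion after [go]
        return (modifier, head)                        # what follows [go] modifies what precedes it
    return chain(tokens)


if __name__ == "__main__":
    print(parse_descpred("a b go c d".split()))
    print(chain("c d a b".split()))
```

In this model [a b go c d] yields ((c d) (a b)), whereas the plain chain [c d a b] yields (((c d) a) b): the words are in the same modifier-before-head order, but the grouping differs.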

this version which appears in sentence predicates as opposed to descriptions differs

in allowing loosely linked arguments (termsets) instead of those linked with [je/jue] for the predicate

moved to the end by GO. sentpred <- ((despredE freemod? GO freemod? barepred)/despredE) Construction of sentence modifiers

the construction of predicate modifiers (prepositional phrases usable as terms along with arguments). mod1a <- (PAWORD freemod? argument1 guua?)

note special treatment of predicate modifiers without actual arguments.

the !barepred serves to distinguish these predicate modifiers from actual

"tenses" (predicate markers). mod1 <- ((PAWORD freemod? argument1 guua?)/(PAPHRASE freemod? !barepred gap?))

forethought connection of modifiers. There is some subtlety in

how this is handled. kekmod <- ((NOWORD freemod?)* (KA freemod? modifier freemod? KI freemod? mod)) mod <- (mod1/((NOWORD freemod?)* mod1)/kekmod)

afterthought connection of modifiers modifier <- (mod (AONE freemod? mod)*) Serial names (a flash point)

the serial name is a horrid heterogeneous construction! It can involve

components of all three of the major phonetic classes essentially!

However, I believe I have the definition right, with all the components

correctly guarded :-) name <- ((PRENAME/ACRONYMICNAME) ((Comma2? !FalseMarked PRENAME)/(Comma2? &([Cc] [Ii]) NAMEWORD)/(Comma2? CI predunit !(Comma2? (!FalseMarked PRENAME)))/(Comma2? CI ACRONYMICNAME))* freemod?) LAWORD <- (sp? [Ll] [Aa] Juncture?) LANAME <- (LAWORD Comma2? name) General construction of descriptive arguments

general constructions of arguments with "articles".

the rules here have the "possessive" construction as in [lemi hasfa; le la Djan, hasfa] embedded. These are not the same

construction in 1989 Loglan, though speakers might think they are. Here they are indeed the same. The "possessor" cannot

be "indefinite" (cannot start with a quantifier word); the possessor can be followed by a tense, as in

[le la Djan, na hasfa], "John's present house", by analogy with [lemina hasfa], which is accepted by LIP (because

LIP accepts [lemina] as a word).

there are other subtleties to be reviewed. descriptn <- (!LANAME ((LAU wordset1)/(LOU wordset2)/(LE freemod? ((!mex arg1a freemod?)? (PAPHRASE freemod?)?)? ((mex freemod? arg1a)/(mex freemod? descpred)/descpred))/(GE freemod? mex freemod? descpred)))

abstract descriptions. Note that abstract descriptions are closed with [guo] entirely independently of abstract predicates:

[le po preda guo] does not have a grammatical component [po preda guo]. This avoids the double closure often apparently necessary

in Lojban. abstractn <- ((LEFORPO freemod? POA freemod? uttAxclone guoa?)/(LEFORPO freemod? POA freemod? sentenceclone guoa?)/(LEFORPO freemod? POE freemod? uttAxclone guoe?)/(LEFORPO freemod? POE freemod? sentenceclone guoe?)/(LEFORPO freemod? POI freemod? uttAxclone guoi?)/(LEFORPO freemod? POI freemod? sentenceclone guoi?)/(LEFORPO freemod? POO freemod? uttAxclone guoo?)/(LEFORPO freemod? POO freemod? sentenceclone guoo?)/(LEFORPO freemod? POU freemod? uttAxclone guou?)/(LEFORPO freemod? POU freemod? sentenceclone guou?)/(LEFORPO freemod? PO freemod? uttAxclone guo?)/(LEFORPO freemod? PO freemod? sentenceclone guo?))

a wider class of basic argument constructions. Notice that LANAME is always read by preference to descriptn. Ciforsuffix <- ([Cc] [Ii]) namesuffix <- (&((Comma2 !FalseMarked PRENAME)/(sp? Ciforsuffix Juncture? Comma2? (PRENAME/ACRONYMICNAME))) ((sp? Ciforsuffix Juncture? Comma2?)/Comma2)? name) arg1 <- (abstractn/(LIO freemod? descpred guea?)/(LIO freemod? argument1 guua?)/(LIO freemod? mex gap?)/LIOALIEN/LAO/LANAME/(descriptn guua? namesuffix?)/LIU/LIE/liquote)

this adds pronouns (incl. the fancy [gao] letterals) and the option of left marking an argument with [ge] arg1a <- ((DA/TAI/arg1/(GE freemod? arg1a)) freemod?) Argument modifiers (subordinate clauses) argmod1 <- (((sp? (n o) sp?)? ((JI freemod? predicate)/(JIO freemod? sentence)/(JIO freemod? uttAx)/(JI freemod? modifier)/((JI/NUJI) freemod? argument1)))/((sp? (n o) sp?)? (((JIZA freemod? predicate) guiza?)/((JIOZA freemod? sentence) guiza?)/((JIOZA freemod? uttAx) guiza?)/((JIZA freemod? modifier) guiza?)/(JIZA freemod? argument1 guiza?)))/((sp? (n o) sp?)? ((JIZI freemod? predicate guizi?)/(JIOZI freemod? sentence guizi?)/(JIOZI freemod? uttAx guizi?)/(JIZI freemod? modifier guizi?)/(JIZI freemod? argument1 guizi?)))/((sp? (n o) sp?)? ((JIZU freemod? predicate guizu?)/(JIOZU freemod? sentence guizu?)/(JIOZU freemod? uttAx guizu?)/(JIZU freemod? modifier guizu?)/(JIZU freemod? argument1 guizu?))))

we improved the trial.85 grammar by closing not argmod1 but argmod with [gui]. But the labelled argument modifier constructors

when building an argmod1 have the argmod1 construction closed with the corresponding labelled right marker, of course. Thus

gui and guiza actually have different grammar.

trial.85 did not provide forethought connected argument modifiers, and we also see no need for them,

though they could readily be added. argmod <- (argmod1 (AONE freemod? argmod1)* gui?) Arguments resume

affix argument modifiers to a definite argument arg2 <- (arg1a freemod? argmod*)

build a possibly indefinite argument from an argument: [to le mrenu]

arg3 <- (arg2/(mex freemod? arg2))

build an indefinite argument from a predicate indef1 <- (mex freemod? descpred)

affix an argument modifier to an indefinite argument indef2 <- (indef1 guua? argmod*) indefinite <- indef2

link arguments with the fusion connective [ze] arg4 <- ((arg3/indefinite) (ZE freemod? (arg3/indefinite))*)

forethought connection of arguments. Note use of argx arg5 <- (arg4/(KA freemod? argument1 freemod? KI freemod? argx))

arguments with possible negations followed by possible indirect reference constructions. argx <- ((NOWORD freemod?)* (LAE freemod?)* arg5)

afterthought connection with the tightly binding ACI connectives arg7 <- (argx freemod? (ACI freemod? argx)?)

afterthought connection with the usual A connectives. Can't start with GE

to avoid an ambiguity (to which 1989 Loglan is vulnerable) involving AGE connectives. arg8 <- (!GE (arg7 freemod? (AONE freemod? arg7)*))

afterthought connection (now right grouping, instead of the left grouping above)

using the AGE connectives. GUU can be used to affix an argument modifier at this top level. argument1 <- (((arg8 freemod? AGE freemod? argument1)/arg8) (GUU freemod? argmod)*)
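The difference between the left grouping of the A connectives (arg8) and the right grouping of the AGE connectives (argument1) can be sketched in Python. This is an illustration only: arguments are single tokens, the sample connective words are merely plausible members of the AONE and AGE classes, and the helper names group_left and group_right are hypothetical.

```python
def _check(operands, connectives):
    if len(operands) != len(connectives) + 1:
        raise ValueError("need exactly one connective between each pair of operands")


def group_left(operands, connectives):
    """Fold from the left: a e b a c  ->  ((a e b) a c)."""
    _check(operands, connectives)
    tree = operands[0]
    for conn, operand in zip(connectives, operands[1:]):
        tree = (tree, conn, operand)
    return tree


def group_right(operands, connectives):
    """Fold from the right: a age b ege c  ->  (a age (b ege c))."""
    _check(operands, connectives)
    tree = operands[-1]
    for conn, operand in zip(reversed(connectives), reversed(operands[:-1])):
        tree = (operand, conn, tree)
    return tree


if __name__ == "__main__":
    print(group_left(["da", "de", "di"], ["e", "a"]))
    print(group_right(["da", "de", "di"], ["age", "ege"]))
```

The PEG rules themselves only recognize the strings; the left grouping of arg8 is a matter of how the flat repetition is interpreted, while the right grouping of argument1 is visible directly in its right recursion.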

possibly negated and case tagged arguments. We (unlike 1989 Loglan) are careful

to use argument only where case tags are appropriate. argument <- ((NOWORD freemod?)* (DIO freemod?)* argument1)

an argument which is actually case tagged. argxx <- (&((NOWORD freemod?)* DIO) argument) Term lists

arguments and predicate modifiers actually associated with predicates. term <- (argument/modifier)

a term list consisting entirely of modifiers. modifiers <- (modifier (freemod? modifier)*)

a term list consisting entirely of modifiers and tagged arguments. modifiersx <- ((modifier/argxx) (freemod? (modifier/argxx))*)

the subject class is a list of terms (arguments and predicate modifiers) in which all but possibly one

of the arguments are tagged, and there is at least one argument, tagged or otherwise. subject <- ((modifiers freemod?)? ((argxx subject)/(argument (modifiersx freemod?)?)))
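The condition stated for the subject class can be written down as a small Python check. This sketch encodes the prose description, not the PEG rule itself: each term is reduced to one of three hypothetical labels ("modifier", "tagged", "untagged"), and the function name is_subject is an assumption made for illustration.

```python
VALID_LABELS = {"modifier", "tagged", "untagged"}


def is_subject(terms):
    """True if the term list has at least one argument and at most one untagged one."""
    for term in terms:
        if term not in VALID_LABELS:
            raise ValueError(f"unknown term label: {term!r}")
    arguments = [t for t in terms if t != "modifier"]
    untagged = [t for t in arguments if t == "untagged"]
    return len(arguments) >= 1 and len(untagged) <= 1


if __name__ == "__main__":
    print(is_subject(["modifier", "tagged", "untagged", "modifier"]))
    print(is_subject(["untagged", "untagged"]))
```

Note that a real PEG parser would not reject a list with two untagged arguments outright; it would simply stop the subject after the first one and leave the rest for the following rule.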

these classes are exactly argument, but are used to signal

which argument position after the predicate an argument occupies.

I think the grammar is set up so that these will actually

never be case tagged, though the grammar does not expressly forbid it.

argumentA <- argument
argumentB <- argument
argumentC <- argument
argumentD <- argument

for argument lists not guarded against absorbing a following subject (now redundant) argumentA1 <- argument argumentB1 <- argument argumentC1 <- argument argumentD1 <- argument

a general term list. It cannot contain more than four untagged arguments (they will be labelled

with the lettered subclasses given above). terms <- ((modifiersx? argumentA (freemod? modifiersx)? argumentB? (freemod? modifiersx)? argumentC? (freemod? modifiersx)? argumentD?)/modifiersx)
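The lettering of untagged arguments in a term list can be sketched as follows. This is a simplified Python model in which terms are reduced to the hypothetical labels "modifier", "tagged" and "untagged"; the function name label_terms is an assumption, and raising an error on a fifth untagged argument stands in for the parser simply declining to absorb it.

```python
def label_terms(terms):
    """Replace each untagged argument by argumentA..argumentD, in order."""
    letters = iter("ABCD")
    labelled = []
    for term in terms:
        if term == "untagged":
            try:
                labelled.append("argument" + next(letters))
            except StopIteration:
                raise ValueError("a term list holds at most four untagged arguments")
        elif term in ("modifier", "tagged"):
            labelled.append(term)  # modifiers and tagged arguments keep their label
        else:
            raise ValueError(f"unknown term label: {term!r}")
    return labelled


if __name__ == "__main__":
    print(label_terms(["modifier", "untagged", "tagged", "untagged"]))
```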

terms list not guarded against absorbing a following subject (now the same as terms) terms1 <- ((modifiersx? argumentA1 (freemod? modifiersx)? argumentB1? (freemod? modifiersx)? argumentC1? (freemod? modifiersx)? argumentD1?)/modifiersx)

innards of ordered and unordered list constructions. These are something I totally rebuilt, as they were in a totally

unsatisfactory state in trial.85. Note the use of comma words to separate items in lists. word <- (arg1a/indef2) words1 <- (word (ZEIA? word)*) words2 <- (word (ZEIO? word)*) wordset1 <- (words1? LUA) wordset2 <- (words2? LUO)

the full term set type to be affixed to predicates.

forethought connection of term lists termset1 <- (terms/(KA freemod? termset2 freemod? guu? KI freemod? termset1))

afterthought connection of term lists. There are cunning things going on here getting [guu]

to work correctly. Note that [guu] is NOT a null term list as it was in trial.85. termset2 <- (termset1 (guu &AONE)? (AONE freemod? termset1 (guu &AONE)?)*)

there is an interesting option here of a list of terms followed by [go] followed by a predicate

intended to metaphorically modify the predicate to which the terms are affixed. Is there a reason

why we cannot have a more complex construction in place of terms? termset <- ((terms freemod? GO freemod? barepred)/termset2) The general verb phrase construction

this is the untensed predicate with arguments attached. Here is the principal locus

of closure with [guu], but it is deceptive to say that [guu] merely closes barepred,

as we have seen above, for example in [termset2]. barepred <- (sentpred freemod? ((termset guu?)/(guu &termset))?)

tensed predicates markpred <- (TENSE freemod? barepred)

there follows an area in which my grammar looks different from trial.85. Distinct parallel forms for

marked and unmarked predicates are demonstrably not needed even in trial.85. The behavior of the ACI

connectives is plain weird in trial.85; here we treat ACI connectives in the same way as A connectives, but

binding more tightly.

units for the ACI construction following -- possibly multiply negated bare or marked predicates.

the addition of shared termsets to logically connected predicates is handled differently here than in trial.85,

which uses a very elegant but dreadfully left-grouping rule which a PEG cannot handle. Any realistic situation

should be manageable. backpred1 <- ((neg2 freemod?)* (barepred/markpred))

ACI connected predicates. Shared termsets are added. Notice how we first group backpred1's then recursively

group backpreds. backpred <- (((backpred1 (ACI freemod? backpred1)+ freemod? ((termset guu?)/(guu &termset))?) ((ACI freemod? backpred)+ freemod? ((termset guu?)/(guu &termset))?)?)/backpred1)

A connected predicates; same comments as just above. Cannot start with GE to fix ambiguity with AGE connectives. predicate2 <- (!GE (((backpred (AONE !GE freemod? backpred)+ freemod? ((termset guu?)/(guu &termset))?) ((AONE freemod? predicate2)+ freemod? ((termset guu?)/(guu &termset))?)?)/backpred))

predicate2's linked with right grouping AGE connectives (A and ACI are left grouping). predicate1 <- ((predicate2 AGE freemod? predicate1)/predicate2)

identity predicates from above, possibly negated identpred <- ((NOWORD freemod?)* (BI freemod? argument1 guu?))

predicates in general. Note that identity predicates cannot be logically connected

except by using forethought connection (see above). predicate <- (predicate1/identpred) The sentence

The gasent is a basic form of the Loglan sentence in which the predicate leads.

The basic structure is (PA word (usually a tense) or [ga]) followed optionally by terms followed optionally by

[ga] followed by terms. The list of terms after [ga] (if present) will either contain

at least one argument and no more than one untagged argument

(a subject) [gasent1] or all the arguments of the predicate [gasent2]. We deprecate other arrangements possible in

1989 Loglan because they would cause unexpected reorientation of the arguments already given before [ga] as second

and further arguments were read after [ga]. [barepred] is an untensed predicate possibly with arguments; [sentpred]

is "simply a verb", i.e., a predicate without arguments.

there is a semantic change from 1989 Loglan reflected in a grammar change here:

in [gasent1] the final (ga subject) is optional. When it does not appear, the resulting

sentence is an observative (a sentence with subject omitted), not an imperative.

Imperatives for us are unmarked.

4/22 allowing general predicates in gasent. Otherwise the spaces of observatives and imperatives become quite confused. gasent1 <- ((NOWORD freemod?)* (freemod? &markpred predicate (GATWO freemod? subject)?)) gasent1a <- ((NOWORD freemod?)* (freemod? &markpred predicate (GATWO freemod? subject))) gasent2 <- ((NOWORD freemod?)* (TENSE freemod? sentpred modifiers? (GATWO freemod? subject freemod? GIO? freemod? terms?))) gasent <- (gasent2/gasent1)

this is the simple Loglan sentence in various basic orders. The form "gasent" is discussed just above.

Predicate modifiers can be prefixed to the gasent. The final form given here is the basic SVO sentence. The "subject" class is a list of terms

(arguments and predicate modifiers) containing at most one un-case-tagged argument. The most general SVO form is subject, followed optionally

by [gio] followed by a list of terms. (1989 Loglan allowed more than one untagged argument before the predicate, but this leads to practical problems

in which preceding constructions with errors in them may supply extra unintended arguments.) It should be noted that in NB3 JCB envisioned

a single argument before the predicate, followed by the predicate, which may itself contain further arguments. A gasent may optionally be negated

(even multiple times).

re [gio] and some other changes, in his comments on the NB3 grammar JCB often notes restrictions on appearances of term lists which he

intends but which he thought were hard to implement in the machine grammar. The appearance of just one argument before the "verb"

in an SVO sentence was one of these (though later he takes it as a virtue that the actual machine grammar supports SOV: we did not

consider it a virtue to have unmarked SOV after observing unintended parses appearing in the Visit text). Another example of this

(which would not have been hard for JCB to implement, in fact) is our restriction of the form "terms gasent" to "modifiers gasent".

His comments make it clear that he does not want arguments among those terms.

# statement <- (gasent/(modifiers freemod? gasent)/(subject freemod? freemod? (GIO freemod? terms1)? predicate))
statement <- (gasent/(modifiers freemod? gasent)/(subject freemod? freemod? (GIO? freemod? terms1)? predicate))

this is a forethought connected basic sentence. It is odd (and actual odd results can be exhibited) that the final segment in both

of these rules is of the very general class uttA1, which includes some quite fragmentary utterances usually intended as answers.

12/20/2017 I rewrote the rule in a more compact form. This rule looks ahead to the class [sentence] which we now develop;

for the moment notice that [sentence] will include [statement].

4/14 tentatively allowing initial modifiers here and leaving this out of uttA0 which replaces uttA1 below.

The intention is to eliminate weird sentence fragments.

this is where I could reinstall permission to use headterms without [goi] before keksents. I do not think I want to.

I have done this experimentally. keksent <- (terms? freemod? (NOWORD freemod?)* (KA freemod? headterms? freemod? sentence freemod? KI freemod? sen1))

cloned if not marked as imperative keksentclone <- (terms? freemod? (NOWORD freemod?)* (KA freemod? headterms? freemod? sentenceclone freemod? KI freemod? sen1clone))

sentence negation. We allow this to be set off from the main sentence with a mere pause, because generally

it does not differ in meaning from the result of negating the first argument or predicate modifier. neghead <- ((NOWORD freemod? gap)/(neg2 OptPause))

this class includes [statement], predicate modifiers preceding a predicate (which may contain arguments), a statement,

a predicate, and a keksent. Of these, the first and third are imperatives.

4/23/2019 added actual rule for imperative sentences. This should not

affect the parse in any essential way. imperative <- ((modifiers freemod?)? !gasent predicate)

clone of imperative for labelling which occurrences are actually imperative nosubject <- ((modifiers freemod?)? !gasent predicate) headterms <- (terms GI freemod?)+

[headterms] is a list of terms (arguments and predicate modifiers) ending in [gi]. Preceding a [sen1] with these

causes all predicates in the [sen1] to share these arguments. We propose either that the headterms arguments be directly

appended to the argument list of each component of the [sen1], or that there is an argument with a numbered case tag at the beginning

of the headterms list, and the list is inserted at the appropriate position in each component sentence. Neither of these is

the condition described in Loglan I, which presupposes that we always know what the last argument of each predicate used is.

this is the sentence class below prefixed with a list of fronted terms.

we think the [giuo] closure might prove useful. uttAx <- (headterms freemod? sentence giuo?)

cloned if not to be marked as imperative uttAxclone <- (headterms freemod? sentenceclone giuo?) sen1 <- ((neghead freemod?)* (imperative/statement/keksent/uttAx))

clone of sen1 with nosubject instead of imperative sen1clone <- ((neghead freemod?)* (nosubject/statement/keksentclone/uttAxclone))

the class [sentence] consists of sen1's afterthought connected with A connectives

The logical structure of a [sentence] may not be transparent. The effect of appending another (ICA sen1) to a [sentence] is to connect the previous part of the sentence to the new [sen1] with the ICA connective. In other words, this groups to the left, logically.

Classes [sentence] and [uttAx] play an important role in my proposed definition of scope: the scope of a quantifier is the smallest item of one of these classes

which includes all instances of the variable it binds and all scopes of quantifiers which appear after it in its scope (in the peculiar SOV order we use). Notice

that an initial segment of a [sentence] is not a [sentence]: scopes are to be closed under ICA connectives.

adding another layer to sentences, afterthought connected with ICI connectives, more tightly binding (and removing class uttD below) sentence1 <- (sen1 (ICI freemod? sen1)*) sentence1clone <- (sen1clone (ICI freemod? sen1clone)*)

a sentence will continue through terminal punctuation; this averts ambiguity with the higher level utterance construction with ICA. sentence <- (sentence1 ([!.:;?]? ICA freemod? sentence1)*) sentenceclone <- (sentence1clone ([!.:;?]? ICA freemod? sentence1clone)*) Utterances

weird answer fragments uttA <- ((AONE/mex) freemod?)

a broad class of utterances, including various things one would usually only say as answers. Notice

that this utterance class can take terminal punctuation. uttA1 <- ((links/linkargs/argmod/terms/uttA/NOWORD) freemod? Period?)

possibly negated utterances of the previous class. uttC <- (sentence Period?/(neghead freemod? uttC)/uttA1)

Higher level utterance construction with ICA connectives eliminated.

ICAUTT <- ICA
# uttE <- (uttC (ICAUTT freemod? uttC)*)

utterances of the previous class linked with I sentence connectives. uttF <- (uttC (I freemod? uttC)*)

the utterance class for use in the context of parenthetical freemods or quotations, in which it does not go to end of text. utterance0 <- (!GE ((ICAUTT freemod? uttF)/(!OptPause freemod Period? utterance0)/(!OptPause freemod Period?)/(uttF IGE utterance0)/uttF/(I freemod? uttF?)/(I freemod? Period?)) (&I utterance0)?)

Notice that there are two passes here: the parser first checks that the entire utterance

is phonetically valid, then returns and checks for grammatical validity.

the full utterance class. This goes to end of text, and incorporates the phonetics check. This incorporates the only situations

in which a freemod is initial. The IGE connectives bind even more loosely than the I connectives and right-group instead of

left grouping.

utterance <- (&(PhoneticUtterance End) (!GE ((ICAUTT freemod? uttF (&I utterance)? End)/(!OptPause freemod Period? utterance)/(!OptPause freemod Period? (&I utterance)? End)/(uttF IGE utterance)/(I freemod? Period? (&I utterance)? End)/(uttF (&I utterance)? End)/(I freemod? uttF (&I utterance)? End))))
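The two-pass behaviour of the utterance rule comes from the PEG and-predicate &(...), which checks a condition without consuming input. A minimal Python sketch of that mechanism follows. The combinators (regex, seq, and_pred, end) are hypothetical helpers written for this illustration, and the two toy patterns merely stand in for PhoneticUtterance and for the grammatical alternatives; they are not the Loglan definitions.

```python
import re


def regex(pattern):
    """Parser matching a regular expression at pos; returns the new position or None."""
    compiled = re.compile(pattern)

    def parser(text, pos):
        match = compiled.match(text, pos)
        return match.end() if match else None
    return parser


def seq(*parsers):
    """Parser running each parser in turn; fails as soon as one fails."""
    def parser(text, pos):
        for p in parsers:
            pos = p(text, pos)
            if pos is None:
                return None
        return pos
    return parser


def and_pred(check):
    """PEG &(check): succeed at the same position if check matches, consuming nothing."""
    def parser(text, pos):
        return pos if check(text, pos) is not None else None
    return parser


def end(text, pos):
    """PEG End: succeed only at the end of the text."""
    return pos if pos == len(text) else None


# Toy stand-ins (assumptions): "phonetic" = letters, spaces, commas, periods;
# "grammatical" = single-space separated words with an optional final period.
phonetic_utterance = regex(r"[a-z ,.]*")
grammatical = regex(r"[a-z]+( [a-z]+)*\.?")

# First pass: lookahead over the whole text. Second pass: the actual parse.
utterance = seq(and_pred(seq(phonetic_utterance, end)), grammatical, end)

if __name__ == "__main__":
    print(utterance("mi cluva tu.", 0))
```

Because the lookahead consumes nothing, the second pass starts again from the beginning of the text, so a string can pass the first check and still fail the second.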