On the Design of an Ideal Language, Revision 8 - Constructed Languages


June 8th, 2006


saizai
01:54 am - On the Design of an Ideal Language, Revision 8
[Originally posted 12 March 02. Last updated 8 June 06.]

Comment! Response is appreciated, and helps catalyze thought.

Note: this is a document under revision. Even *I* don't agree with myself all the time, and I expect people to disagree with the points I have to make here. As that happens, I'll do what I can to reconcile their opinions into some compromise, or at least list "dissenting opinions" alongside the major points, and give a decent answer as to why I said what I did.

Read others' comments! There are a *lot* of interesting ones, and some of the threads serve to explain my meaning a bit better than stated here. Not to mention that there are some excellent points of disagreement.

v8 8 June 06: added Principles of Noise Resistance and Entropy
v7: added the Principle of Good Representation
v6.6: changed title from "On Designing the Ideal Language" to "On the Design of an Ideal Language"; better reflects the no-single-answer aspect
v6.5: added anchor tags for easier navigation, copyright notice, various minor edits
v6: added Other Things to Consider: Temporal Order, Analog vs. Quantum Descriptors, and Purposely Wasting Space
v5: added Applying the PSD: Writing
v4.5: added lj-cut
v4: added the section on Combining/Utilizing Input Streams
v3: added the Principle of Cross-Modality and the Principle of Semantic Conservation
v2: added (per Axiem's reminder) the redundancy vs. corruption clause to the PSD



Important!: If you are wondering what exactly I mean by "ideal", and my other goals or meta issues like that, take a look at this post on CONLANG, which goes into some detail about that. Eventually I'll integrate it into this essay, but I haven't yet.


As you may know, I am (was?) attempting to construct my own language (known for now as Saigrok). However, I've run into a stumbling block - namely, my own ambition. I keep finding that I can't make a decision because I haven't yet decided on some higher-order feature of the language. I decided to compile a top-down view of what I want from it; language should not be designed in a bottom-up piecemeal fashion if one wants the top-down principles to hold.

Therefore, I'll try to describe here exactly what the desirable qualities of an ideal language are (or should be), and how exactly one could go about putting those ideals into a more concrete form.

First, let's define "language". I'm going to use "a system for transmitting or recording ideas". Is that ambiguous? Damn right. But as you'll see, there's a reason for that.


Guiding Principles:

0. Principle of Good Representation

All forms of language use should be as representative as possible of the actual thinking of the target population.

That is, as much as possible, all rules should be designed to match e.g. human neuropsychology, ways of thinking, etc. If the intent is to change these, then of course this need not be taken to mean "be the same as natural languages" - in fact, there may be methods of expression that are *closer* to "native" thought processes than currently available.

Some possible examples:
* basic color terms - based on biologically determined focal colors, i.e. red-green, blue-yellow, black-white
* non-classical categories / words being defined as closely as possible to the "real" - e.g. using prototypes, graded fit, etc
* classical categories can be defined by e.g. their functions - e.g. "sit-thing" (where "thing" is a morpheme) instead of "chair"

1. Principle of Least Effort

Slang, as well as general "language evolution", has generally resulted from some more-difficult form being "corrupted" to an easier one. (e.g.: "thee" being removed, "whom" -> "who", "television" -> "TV", vowel shift, etc.)

Therefore, the language should *start* with simplicity in mind. This means that things should be "regular" (linguistic term, meaning "hopefully the rules don't have many exceptions") as much as possible, that vocabulary should be as dense as possible (long words for oft-used concepts, especially when shorter words are not "taken", *will* be broken down with natural use), etc.

An example from ASL is that most signs that are physically difficult to make - palm out around chest, below waist, hands together above shoulder, etc. - tend to become simplified into ones that don't involve any strain.

2. Principle of Semantic Density

Any medium used - e.g., speech, 2d static visuals ("writing"), 3d static visuals ("sculpture"), 2d moving visuals ("movies"), 3d moving visuals ("live performance" [maybe eventually "movies", when tech evolves]), touch, etc. (I'll have more on this later) - should be used optimally.

"Optimally"...

This means that

a) everything that *can* be done (bounded by the PLE), is done - in speech, for example, use of all available phonemes, tones, etc.

b) simpler things are done first. For example, the nonsense word "aijmapnargath" should be much later on in the vocabulary than "jaf". Or another example would be ICQ numbers: start from 1 and work up. Why assign #9143018 when #1402 isn't taken?

c) simpler things are reserved for simpler things. A word for the rotational axis of a particular molecule of some new-age plastic should be implicitly more difficult than a word for "good".

d) all available mediums are used to their fullest potential. This is bounded by a few things.
First, the capacity of the receiver(s) to interpret - e.g., radio & deaf people don't work well. (Clause: sometimes this is desirable, in that a multi-channel communication is interpreted at different levels by those able to receive it on different levels [e.g., signing "this is a lie" while talking, for the respective benefits of the person in front of you and the person listening to the room's mic].)
Second, the capacity of the sender and the medium to *encode* the data in the first place. Does it have to be static, as in a written document or movie? Is it interactive? Do you have the benefit of three dimensions, or four? Can you *produce* it? (e.g.: singing tones, or writing ideographs, or using color, or instruments [music carries data, damn it!])
Third, the density of the medium as used. How many WPM of English can native users do? What about ASL? Manual alphabets? Etc...

e) yes, I said *all* available mediums.

That means that if you're communicating with someone in front of you, and both of you are ordinary non-impaired humans, you should be using your full body movement (bounded by the PLE), full vocal capacity, etc. If it's dark, or someone's blind, you should be using touch instead. Etc.

SIMULTANEOUSLY.

HOWEVER... as Axiem points out (and I forgot to mention on first revision), there comes a point at which you must trade semantic space for redundancy. Such is the case with the armed forces' alpha/bravo/charlie alphabets, and with .rar format "recovery" space (an added 1% or so of space can protect against a surprising amount of corruption).

Thus, there should be a means of doing this - adding "buffer space" to the data - in whatever mode presented. However, it should *not* be a rigid thing; after all, I said "optimal". That means different things in different conditions - clear or foggy, quiet or noisy, etc. Ignoring this means, on one side, having to repeat (or simply losing the message, or losing precision [as is the case in many examples of humorously misplaced/missing commas]), and on the other, losing precious semantic space and thereby conveying less information.
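The "simpler things are done first" rule in (b) above can be sketched mechanically: enumerate every possible form over a phoneme inventory, shortest first, and hand forms out in that order. A toy sketch in Python - the five-symbol inventory is made up purely for illustration, not a proposal for any real phonology:

```python
from itertools import product

def words_shortest_first(phonemes, max_len):
    """Yield every possible word over the inventory, shortest first."""
    for length in range(1, max_len + 1):
        for combo in product(phonemes, repeat=length):
            yield "".join(combo)

# Toy five-phoneme inventory (purely illustrative).
inventory = ["a", "i", "j", "f", "n"]
candidates = list(words_shortest_first(inventory, 2))
print(candidates[:7])  # ['a', 'i', 'j', 'f', 'n', 'aa', 'ai']
```

Assigning "jaf" long before "aijmapnargath" then falls out automatically: the next concept simply takes the next unused form off the list (with the PLE pruning any forms that are hard to produce).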

3. Principle of Desired Clarity

Every statement (though "statement" may well be an inaccurate word for a gesture or other "unusual" mode) should be exactly as semantically precise as the sender wishes.

It should be no less - if you want to specify "table" over "some sort of furniture designed for things to be placed upon" (like shelves, chairs, desks, etc.), you should be able to do so.

Neither should it be any *more* precise. First, if you want to know where somebody's conveyance is, you should not need to first know what method of conveyance they used (car, train, motorcycle, horse, feet...) Second, if you want to be ambiguous, you should be able to... and, as an important sub-principle, ambiguity should always be *implicit*. If the gender-neutral pronoun is more difficult to produce than "he" or "she", it will be received as a *deliberate* ambiguity. Of course, that too should be able to be expressed, but it should be different from *implicit* ambiguity, in that the former is inclusive and the latter exclusive.

4. Principle of Default Simplicity

The simplest concepts should be the easiest to render. E.g., gender-neutral pronouns should be slightly simpler / easier than gender- or quantity-specific ones. The more complex the idea, the more correspondingly complex its expression.

5. Principle of Iconicity

As much as possible, the medium used should represent the thing expressed. This is hard to explain, but it's an intuitive principle.

If you're making a sign for "rain", for example, wiggling your fingers in a downward sweep is more "natural" than, say, making a circling motion with your fists. The same works with other mediums also; harsh concepts should *sound* harsh when heard, whereas gentle ones should be more mellifluous.

There are two cautionary notes to this principle, however.

First, there is the danger of culture bias. Onomatopoeia in spoken languages is a good example; I doubt most English speakers would recognize the Japanese equivalent of "woof woof" or "hee haw", nor vice versa. Also, a sign for "money" that symbolizes a sack of coins could well be outdated in fifty years when everybody uses plastic (or other yet-to-be-devised means of exchange). So if there's any question as to the Platonic nature of the representation, it should be completely arbitrary.

Second, there is the implication of this principle: that entities unfamiliar with the rules of expression (i.e., people who don't know the language) will have an easier time understanding it, because it is as "intuitive" / "natural" as possible. The problem is that sometimes, this is *not* a desirable feature - like when one is trying to be secretive. However, I believe that some form of encryption should be devisable, and the base nature (by the PDS) of the language should be intuitive.

6. Principle of Cross-Modality

Anything should be expressable in any/all available means.

There should be absolutely *nothing* lost in "mode shift" - e.g., the written transcript of a radio talk show. This includes all subtleties and other "meta" features that one normally ignores in English, like vocal intonation, pitch, speed, sarcasm, etc.

However, there are two clauses to this.

First, it may be desirable (*optionally*) to drop meaning (like the fact that someone used a word in a derogatory fashion) in favor of brevity, simply because some modes (like those available in communicating with the deaf-blind) are so limited in "bandwidth". I stress however that this is an OPTIONAL and (if relevant) explicit drop; if you want a full mode shift, so be it; it'll just take longer.

Secondly, some mediums may not allow for quite the degree of implicit or other meta-contextual meanings - how, for example, would you indicate that someone had a sarcastic voice when mode-shifting to touch signing? Pressure of the fingers? So, if need be, a shift from implicitness to explicitness is allowable, following the PDS: it's dropped unless you add it explicitly.

7. Principle of Semantic Conservation

Simply put, there should be no such thing as a "nonsense" or "incorrect" phrase. This overlaps with the PSD.

In English, for example, the phrase "man got job now" is ungrammatical, though composed of acceptable parts - even though one could guess at its "proper" translation. However, why not have this *mean* something? I call this "wasted space". Another example: the non-existent, yet short and easily-pronounced word "bock" (unless I'm missing some extremely rare jargon...). Why? Yet we have words like "inexperienced".

There are (again) several warning clauses to this.

First, one must leave "space" for new, yet-unformed vocabulary, and an "official" means of its creation. I find English's way - make up a word that isn't yet taken - rather haphazard. How much space to leave, and how "valuable" (i.e., short words are more "desirable"), is an open question.

Second, similar to the previous mention of clarity vs. density, the first things to go (if there is some sort of "static" or "corruption") should be the higher-end ones; if a message is garbled, its basic meaning should remain intact; oh well if you lose the speaker's emotion.

Third, there is the (open) question of overlap. The word "blue" in English means several things - a color, a mood (depressed), a type of media (soft-pornographic), a blue *thing* ("the blue"), etc. Or "rehd" (when spoken) - a color, past tense of the verb "read", etc.

What to do about it? Should there be a one-to-one correlation of meaning and form? I think perhaps not. If a form can "hold" several meanings, like English words, let it, so long as a) those meanings would not, in most cases, be confused with each other (contextual clarification) and b) those meanings can easily be distinguished (by the PDC) with slightly more effort (e.g., "get", meaning #4, but less obtuse).

Finally, there is the question of how to deal with the fact that, in a fully conserved system, "noise" would have meaning. Literally speaking, everything you hear, see, smell, etc., should (in principle) carry some meaning. How do you choose which are and are not relevant? A hard (and open) question. (Another example: somebody speaking in sign language ...)


8. Principle of Noise Resistance

Communications should be comprehensible despite whatever noise occurs during their production, transmission, or reception.

Ideally, this would be a variable, or at least multi-setting thing, so that you can scale your noise resistance (and by extension, other sacrifices made for its sake) to the needs of the situation. However, it could also just be pegged at some decided-upon middle ground that covers a majority of relevant situations.
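As a concrete (if crude) model of such a scalable setting, here is a repetition code in Python whose redundancy factor can be dialed up for noisy conditions and down for quiet ones. Real systems use much smarter codes (e.g. the Reed-Solomon-style recovery records in .rar archives mentioned under the PSD); repetition is just the simplest to show:

```python
from collections import Counter

def encode(message, factor):
    """Repeat each symbol `factor` times; a higher factor buys more buffer space."""
    return "".join(ch * factor for ch in message)

def decode(received, factor):
    """Majority vote within each chunk recovers from sparse corruption."""
    chunks = [received[i:i + factor] for i in range(0, len(received), factor)]
    return "".join(Counter(chunk).most_common(1)[0][0] for chunk in chunks)

noisy = list(encode("jaf", 3))
noisy[1] = "x"                    # one symbol garbled in transit
print(decode("".join(noisy), 3))  # jaf
```

Factor 1 spends nothing on redundancy (and recovers nothing); factor 3 survives any single error per chunk at triple the cost - exactly the trade of semantic space for safety.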

[Thanks to And Rosta for pointing this out.]


9. Principle of Entropy a.k.a. Principle of High Signal:Noise Ratio (SNR)

The language should have as high an entropy as possible, as a weighted average over all likely contexts, conversations, and soliloquies.

Entropy is a measure of how random any given chunk of data is. That is, how much real *information* does it have? The idea here is to maximize the amount of information you receive, and minimize the amount of repetition of unnecessary, expected, default, or otherwise excess data.
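Shannon entropy makes the "how much real information" question measurable. A small Python sketch, computing bits per symbol from the frequency distribution of a message (a crude per-symbol model that ignores context, but it illustrates the idea):

```python
from collections import Counter
from math import log2

def entropy_bits(text):
    """Shannon entropy of the symbol distribution, in bits per symbol."""
    counts = Counter(text)
    total = len(text)
    return sum((c / total) * log2(total / c) for c in counts.values())

print(entropy_bits("aaaaaaaa"))  # 0.0 - pure repetition carries no information
print(entropy_bits("abababab"))  # 1.0
print(entropy_bits("abcdabcd"))  # 2.0 - more varied, more bits per symbol
```

Maximizing the language's entropy over likely conversations means pushing each utterance toward the top of this scale: less predictable filler, more payload per symbol.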


Comments:


(Deleted comment)
From:saizai
Date:March 12th, 2002 07:37 pm (UTC)
Doh! I was going to cover that. [edit edit]
From:embolus
Date:March 12th, 2002 10:18 pm (UTC)

comments from the linguistically lay...

clearly you've got suggestions from people who are better versed in your endeavor than me. But my reaction follows:
- overall i like your approach (esp the principle of default simplicity & ple).
- i also like that you are going to try and incorporate more mediums of expression; however that obviously poses its own limitations on when/where you can use this language.
- again i'm no expert, but can you make the written form truly phonetic? i know that HINDI is a phonetic language but we don't have representations for vocal inflections/pitch.
- are you saying that you don't want culture bias? could be a good thing.

I'm interested in watching this develop. thanks for sharing.
From:saizai
Date:March 13th, 2002 11:07 am (UTC)

Re: comments from the linguistically lay...

1. Yay! Any other principles you'd like to add?

2. Not at all! The idea is (and I'll write at length on this later) to use any *and all* "input streams" available, to their fullest extent. I should add another principle [edit edit] to explain that - any concept should be fully expressible in any one or combination of streams, with the only limit being the data throughput (translation: saying something isn't as fast as saying, acting, and having text scroll across the screen). In fact, this is a high point of the principles - that this Uberlanguage, because it uses *any and all* streams available, could equally be used by and to anybody, no matter what their disabilities (blind, deaf, both, ...), so long as they retain *some* form of sensory input to be used as a medium.

Without having to learn a "new" language just for them, as it would merely be a somewhat "crippled" version of the default.

3. I'm no expert either. Bah and humbug!

Who said anything about making the written form phonetic? Yes, it's possible - use the IPA (international phonetic alphabet; I think an earlier thread in this group covers it). The question is, is it *desirable*?

Perhaps you're confusing words; all spoken languages are by definition "phonetic", in that they use "phonemes" (the basic sound-units, like "r" or "ah"). "Tonal", perhaps, like some Asian languages? These have written representations, too; the Chinese romanization systems are examples. (Chinese doesn't have its own alphabet per se.)


4. YES! No culture bias! I think that's one of the major flaws of nearly every attempt to date - most notably Esperanto, which is so blatantly english/french/spanish/german-based.

Ideally, any given person - no matter what their background - should have an equal chance of learning this language, and an equal use of it (not like Japanese trying to speak English - damn r vs. l ;-)), limited only by their own inherent abilities. (Intellectual and input/output capacity - not everybody has hands (or *two* hands, or whatever else), after all.)

5. Don't just watch, *participate*. It's through discussion, feedback, arguments, etc., that things evolve. I try to take in all viewpoints I can, but sometimes I overlook one or don't give it sufficient thought. Or simply don't think about some implication (or forget it, like with Axiem's point).
From:dweezie
Date:March 12th, 2002 11:38 pm (UTC)

I like the phrase "linguistically lay"

I too am a budding Linguist, so this endeavor of yours is fascinating to me. I'd feel like I am talking out of my ass, so to speak, if I analyze any of your theory. So, I'm going to watch it unfold and see what I can learn.

I'm so glad I joined this community. It makes me feel like a kid in a candy store again!
From:saizai
Date:March 13th, 2002 10:53 am (UTC)

Re: I like the phrase "linguistically lay"

Feel free to talk out of your ass, or whatever other orifice or animal you choose. Although we generally prefer typed English, as you just used. :-)

I, incidentally, though moderator of the Conlangs group, have no "official" status as a linguist whatsoever. Took a couple (boring) courses in a community college, and know a few, but that's it. And this essay (if that's even the correct term - "monograph", perhaps?) was written without any planning or extensive research involved.

So I too could be accused of "writing out of my ass".

I just seem to do it so well that nobody realizes it. ;-)
From:marqwithaq
Date:March 14th, 2002 05:14 pm (UTC)
I love your post. I agree with just about everything here.

However, I am wondering if you have heard of the <lj-cut> tag?
From:saizai
Date:March 14th, 2002 05:26 pm (UTC)
Certainly have, and use it.

I wasn't certain however whether I *ought* to in this case.

BTW, "just about everything" implies that you disagree with something.

So, tell me what you like and what you don't, and what I've missed.

And of course, keep that praise a'comin'. I'm a bit starved. *g*
(Deleted comment)
From:tonique
Date:March 15th, 2002 12:07 pm (UTC)
[Now in its right place]

Uh, I can't really comment all points. This has to suffice now! I believe that languages can be developed to be ideal for a particular use (or some such) but not universally. I haven't got much evidence for this -- it's one of my (rather numerous) opinions about language. So, I comment from the viewpoint of (some) natural languages.

1. Principle of Least Effort
[...]

Therefore, the language should *start* with simplicity in mind. This means that things
should be "regular" (linguistic term, meaning "hopefully the rules don't have many
exceptions") as much as possible, that vocabulary should be as dense as possible (long words for oft-used concepts, especially when shorter words are not "taken", *will* be broken down with natural use)


This seems to be true to an extent. The sign (or marker) for Finnish conditional is -isi-, to which suffixes of person are suffixed. Third person singular has no suffix here, though, and the final i is often dropped. Thus we get forms as tulis, antais, unohtais 'would come, give, forget'.

Sometimes a word seems to be lengthened, though. An example would be the partitive singular forms of kala 'fish'. The standard Finnish form is kalaa but many dialects and, hence, many speakers use kallaa instead. This is called "general gemination" and it occurs in originally single consonants before a long vowel (that is in an unstressed syllable). But in southwestern dialects, the form is kalla, after the shortening of the unstressed long vowel.

2. Principle of Semantic Density
Any medium used [...] should be used optimally. -- This means that


a) everything that *can* be done (bounded by the PLE), is done - in speech, for eample, use of all available phonemes, tones, etc.

Are you suggesting we should have all "phonemic space" used? Finnish has a great unused space in monosyllabic words that end in long vowel or diphthong. I composed a list of these existing words and the fraction of existing words is about 21% (or 28%, if I'm more allowing...) to all possible words.

b) simpler things are done first. For example, the nonsense word "aijmapnargath" should be much later on in the vocabulary than "jaf". Or another example would be ICQ numbers: start from 1 and work up. Why assign #9143018 when #1402 isn't taken?

Of course, suffixing may produce long words very quickly! The Proto-Finno-Ugric that has been theorized and reconstructed is something where suffixes are already added to form longer words. Moreover, in the reconstruction, only the vowels a/ä and e exist later than the first syllable. -- Um, I forgot what I was going to write here, so I must skip this... :p

[...]
Thus, there should be a means of doing this - adding "buffer space" to the data - in
whatever mode presented.


It seems to me that all natural languages have plenty of buffer space. In English, one is probably the rigid word order and particles; in Finnish, the suffixes. The buffer space is evident in speaking: we add little words and gestures. And context is all-important.
From:saizai
Date:March 15th, 2002 07:51 pm (UTC)
[Thanks.]

Opinions are handwaving unless backed up. So, can you try to provide some more concrete examples / explanations for why it is necessarily impossible to craft a universally ideal language? Especially when that language is non-static; the thing on modalities, for example, gives you *options* rather than *limitations*.

Sometimes a word seems to be lengthened, though. An example would be the partitive singular forms of kala 'fish'. The standard Finnish form is kalaa but many dialects and, hence, many speakers use kallaa instead. This is called "general gemination" and it occurs in originally single consonants before a long vowel (that is in an unstressed syllable). But in southwestern dialects, the form is kalla, after the shortening of the unstressed long vowel.

Define "partitive" for the nonlinguists, please.

Also, could you provide a phonetic version of those words? Most of us don't know Finnish, and reading from an English perspective, those seem to be identical. Could you also provide some explanation for why the word becomes longer (if you know of any)?

Are you suggesting we should have all "phonemic space" used?

Yes, with the clauses listed. Do you have an argument against doing so? (Incidentally, the PSC addresses this more directly.)

I should point out that suffixing is *not* the only means of "complicating" a base; you can add inflection, tonalities, cross-modalities, etc.

It seems to me that all natural languages have plenty of buffer space. In English, one is probably the rigid word order and particles; in Finnish, the suffixes. The buffer space is evident in speaking: we add little words and gestures. And context is all-important.

I don't consider that buffer space, but rather *wasted* space, since the rule-breaking versions (*"I John eat.") have *no* meaning whatsoever. I think one could devise *far* more optimal forms of redundancy. Also, I tried to emphasize that this "buffer space" should be - as with all other features of the proposed language - *optional*.

[note to nonlinguists: it's customary to precede "incorrect" examples with an asterisk]

In fact, that is probably something I should stress overall. Absolutely everything should be *optional*, from cross-modality to redundancy. This means (for example) that, yes, you should have a means for communicating, merely by touch, that you are being sarcastic, without any actual referent (or with an implicit referent, which is semantically different [e.g., middle vs. passive voice]).
From:(Anonymous)
Date:June 10th, 2002 03:36 pm (UTC)

Your ideas are interesting

Second, if you want to be ambiguous, you should be able to... and, as an important sub-principle, ambiguity should always be *implicit*. If the gender-neutral pronoun is more difficult to produce than "he" or "she", it will be received as a *deliberate* ambiguity. Of course, that too should be able to be expressed, but it should be different from *implicit* ambiguity, in that the former is inclusive and the latter exclusive.

Deliberate ambiguity does strike me as being useful, but why not allow for accidental ambiguity too? It could come in handy, in somewhat odd circumstances.
From:saizai
Date:June 12th, 2002 09:09 pm (UTC)

Re: Your ideas are interesting

Define "accidental" ambiguity, as different from implicit or explicit ambiguity.

And posting anon, remember to write your name.
From:relsqui
Date:January 19th, 2003 04:50 pm (UTC)
What about expandability? As the world changes around us, there is a constant need for new vocabulary. Would there be any precedent for adding it?

The reason English uses words like "inexperienced" instead of words like "bock" is that it comes from roots that are already familiar to someone who speaks the language (in + experience--can that be broken down further?--+ ed). It seems to me that it would actually be more difficult to learn a language whose vocabulary was based on brevity rather than familiarity; in fact, that's a contradiction to your Principle of Iconicity.

Speaking of contradictions, what about the Principle of Cross-Modality versus the Principle of Least Effort? If something can be expressed using only gestures, or only sounds, wouldn't the communicator choose to do that instead of combining the two and adding a song and dance? You say yourself that important concepts which are complicated or difficult to express will be broken down into slang.

Also, you speak of making the language easily accessible to learners from any culture or mother tongue, but using so many forms of expression would make it vastly difficult to learn or even describe, coming from any background. I have no serious history with any Asian language, but it would be easier for me to learn Chinese than a language with all the permutations and variations which you describe.

Oh, yes. And hi! I'm new here ; ) just grabbed this link from the userinfo intending to come back to it and ended up reading the whole thing.
From:saizai
Date:January 19th, 2003 11:21 pm (UTC)
Welcome.

Vocab generation: I think I touched on this, though it's been so long since I wrote it that I'm not sure. It may have been in one of the follow-up posts.

Basically, my intent is to have a systematic method of creating new vocabulary - be it at the moment of language creation, or farther down the line. This method would (in theory) be fairly easy, and derivative in some way. I have not yet thought through how to one-up the "change or agglutinate an existing word" method, so I'll beg out of answering that until later.

As for PCM vs. PLE / learning... yes, learning other modes will be difficult for many people. Just as learning ASL as opposed to Spanish - most people are not used to using any but the audio/temporal mode, with the rest sprinkled in haphazardly through gesture and whatnot. There's not much to be done about *that*.

However, those modes, subtracting their being different modes in the first place, would still conform to the PLE and other relevant principles. I.e., they should be regular, easy to understand, etc. A somewhat difficult distinction.

Cross-modality is probably one of the more controversial portions of all this. It has the potential of increasing the possible expression - in breadth, depth, and/or speed - by orders of magnitude. However, it is something that (as far as I know) does not have any extant analogue whatsoever, so it is hard for me to predict how difficult it would be to learn... or, for that matter, to design.

I do believe that, in principle, it should be possible to create cross-modal "modes" that still are simple and least-effort. Interleaved or some of the more complicated variants would obviously be much harder to design.

I could, however, point to the existence of very primitive cross-modals already: spoken English plus visual changes. For example, you can say something while shaking your head - that negates the meaning to whoever sees you do so. Or "I feel [so-so sign w/ hand]." From there, it's a matter of using a more complete and thought-out system than occasional gestures, and having grammatical rules for intertwining them.

I suppose my primary belief with regard to designing a language is that the more clever (and possibly complex) the design, the simpler ("elegant") the result. So, while doing a linguistic analysis of the language might be fairly difficult, using it should not be.

Note, though, that this is primarily a list of *goals*. It is not an implementation; that will come later.

And of course, the implementation will probably be far harder to do than writing the design goals. But then, I've never been known for choosing easy goals... ;-)
(Deleted comment)
From:saizai
Date:March 6th, 2003 02:05 am (UTC)

Re: efficiency vs beauty

Most of this is focused on mechanics, ideas, and suchlike. I did do a follow-up post specifically about aesthetics in conlangs. I do not think the two to be mutually exclusive, and that I focus on grittier things should not be taken to mean that I discount the need for beauty.

Agglutination is not a particularly graceful nor aesthetic (to me) way of creating new words.

Idiosyncrasies are *NOT* needed for expressiveness. English is both fucked up (to put it simply) and expressive; this is not causal, as far as I can tell. I would like to design a language that is both regular, logical, etc., *and* fully expressive. I also have a problem with the idea that limitations are the means by which expressiveness is conveyed; I don't believe that full power of expression would prevent one from using more flowery, analogy-laden, image-provoking prose. To the contrary, a *full* power of expression would by definition enhance that aspect as well.

Of course, all this should go with the clause that I haven't quite figured out exactly what makes a language beautiful yet, and thus cannot tailor an ideal one to be maximally so. (And yes, before you say that what I said is simplistic and unitary, I fully acknowledge that there may be multiple maximums.)

Also, note that it is 2:04 am. That's my excuse.
From:tangledweave
Date:October 17th, 2003 11:51 pm (UTC)
I must state right off that it's not that in some absolute sense I don't like your clauses; it's just that I think for my conlang some of them are inappropriate. My conlang is at present pretty much at its conception stage and can't even be referred to as a blastula.

One example is the principle of least effort and the principle of default simplicity. I see my conlang as centered around a number of concepts which would be extremely complex from a regular linguistic perspective. Many of these words will end up being rather simple for this reason. Mainly my point is that the so-called simplicity of a concept seems rather biased to me, and has more to do with frequency of use in English than with any objective measurement.

Your discussion of using multiple input streams reminds me a lot of the rune language of the Sartans and the Patryns in "The Death Gate Cycle". In case you are unfamiliar, the rune language is used for discussion (in its purely verbal form, anyway), but when one wishes to cast a spell, three levels of communication are required (for Sartans, anyway). Sartans sing in their language while dancing in an extremely meaningful manner, drawing symbols in the air with their feet and hands, and are also projecting particular thoughts with their minds. The Sartan writing system is a web just as you described, based on a root rune and the runes that modify, add to, and otherwise inform the meaning of the rest of the text flanking it on all sides (again in a particular pattern).

Another language I'm hoping to work on (completely non-spoken) would be for a race of spider-like creatures that I envisioned. The spiders are capable of expressing simple thoughts by bending the joints of their pedipalps and forelegs in particular ways. This system is also the equivalent of shouting in human languages. For more intimate conversation, they would utilize movement of particular sets of hairs (I'm going to explain the possibility of this by saying that the multiple eyes of the spiders and their particular brain morphology give them the ability to identify and remember subtle placement differences).

okay, enough rambling from me.


From:saizai
Date:October 18th, 2003 04:13 am (UTC)
One example is the principle of least effort and the principle of default simplicity. I see my conlang as centered around a number of concepts which would be extremely complex from a regular linguistic perspective. Many of these words will end up being rather simple for this reason. Mainly my point is that the so-called simplicity of a concept seems rather biased to me, and has more to do with frequency of use in English than with any objective measurement.

Elaborate, please? My guess is that you misunderstood my intent with the PLE & PDS.

Deathgate Cycle

Read it, enjoyed it. Excellent series, and yes I remember the rune-writing. A good concept, methinks, but ah how to implement it?

Also, please note that web-writing is merely a *possibility*. I do not claim it to be The Perfect Ideal; rather, I am attempting to describe qualities which make an ideal. These can be present in different forms; my guess is that there are an infinitude of possible 'ideal' languages, especially since tradeoffs are inevitable and thus will make one version more suited to a particular circumstance than another.
From:jeremysmith
Date:November 20th, 2003 03:58 pm (UTC)
That is a very insightful paper, Ilya - although I hesitate to agree that all available media should be used; there may be media which can be used for expression but which won't be comfortable for everyone, and thus should be avoided. For example, it is very possible for me to communicate by grabbing someone and sticking my tongue in their mouth, but it probably isn't a very good idea in most cases. This is why interpersonal communication seems to be limited to aural and visual media - it's safe because it's not invasive. Otherwise, movies may have developed in such a way that the story is told through, say, an anal probe rather than a movie screen. Not very appealing :P

As for schools, you definitely ought to (re)apply to Berkeley, as well as some other U.C.s. I hesitate to suggest UCI because, while I'm having a great experience here, your GPA and test scores make you vastly overqualified. You may want to focus on your specific interests rather than cognitive science, because you are going to end up wasting a lot of time with a lot of stupid biology classes which you will hate. Pick your greatest passion from the subjects you mention (I would think this would be compsci, although your interests may have shifted) and major in it. Then minor in the other two (or double major and pick a minor). Pick the major carefully, as it may influence the schools you get into (i.e. it will be much easier to get in somewhere as a linguistics major than as an engineering major - nobody does linguistics, while everyone and their mother does engineering).

I'm probably going to stay an extra year at UCI and minor in linguistics, more for application to natural language processing than anything else (although it is an interesting, if inexact, science).
From:jeremysmith
Date:November 20th, 2003 04:04 pm (UTC)
Incidentally, regarding cross-modality: as many forms as possible should be related in such a way that, given a set of rules rather than a knowledge of all the forms, one could extrapolate one form from another. For example, Tolkien's "Tengwar" writing system is constructed in such a way that each (consonant) glyph can be broken down into pieces which represent the activity of different parts of the mouth in producing the sound which the glyph represents when read. Thus the characters are lexicographically arranged into a sort of gradient of sounds rather than an arbitrary ordering.
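The featural idea described here can be sketched mechanically. In the following toy (a sketch of mine, not actual Tengwar - the stroke shapes and feature inventory are invented for illustration), each glyph is composed from one piece per articulatory feature, so a reader who knows the rules can derive a glyph they have never seen:

```python
# Hypothetical featural writing system: a glyph is built from pieces that
# each encode one articulatory feature. None of these shapes are Tengwar.
PLACE = {"labial": "|", "dental": "||", "velar": "|||"}
VOICING = {True: ")", False: "("}  # voiced vs. voiceless marker

def glyph(place: str, voiced: bool) -> str:
    """Compose a consonant glyph from its articulatory features."""
    return PLACE[place] + VOICING[voiced]

# Knowing only the rules, you can extrapolate any form:
assert glyph("labial", False) == "|("
assert glyph("velar", True) == "|||)"
```

The same composition rule also yields a natural "gradient" ordering: sorting glyphs by their feature pieces groups sounds by place and voicing rather than arbitrarily.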
(Deleted comment)
From:saizai
Date:January 24th, 2004 01:03 am (UTC)

Re: Comments on ODIL [part 1]

Sai, I've finished reading through your ODIL document. Very impressive, especially for someone whose profile says he took only a couple of boring community college classes in linguistics.

Thank you. ;-)

At the time I wrote this, I believe I had one (very feeble) intro linguistics class, and some scattered reading of my own as caught my interest. I've since had a different intro linguistics class which actually covered linguistics (as opposed to child development, very basic Chomsky, and grammar), but that too didn't seem to teach me much, unfortunately. It was the only one available, though. *shrug*

I've really had very little formal training or reading on the subject (probably the most formal linguistic books I've actually read were The Signs of Language and its sequel); this represents almost entirely my own thought. I'm pretty satisfied with how it turned out, since I wrote nearly all of it in one sitting. ;-)
(Deleted comment)
From:joxn
Date:March 16th, 2004 11:45 pm (UTC)
I have a number of interrelated comments.

First, I wonder what you think of lojban. While it is much less ambitious in scope than your project, it does have (and successfully fulfill) a design goal which you only implicitly state, namely that of grammatical unambiguity (you implicitly demand this with your Principle of Cross-Modality). In lojban, if you hear a complete grammatical utterance, you can unambiguously parse it into a stream of words which you can then unambiguously assign grammatical roles. Of course, there is no guarantee of semantic unambiguity, but the language does also attempt to provide a pretty complete implementation of the Principle of Desired Clarity. Of course, in pursuit of grammatical unambiguity, lojban had to violate your principle 2.a.

Second, I think it is a misconception that semantic space is "precious". Humans can only process information at a certain speed; you only need enough semantic space to carry that amount of information per unit time -- any more is superfluous.

Third, the PSC is probably impractical in its own right. Leaving all the other complexity out of the question and dealing only with spoken language, I can't even imagine what kind of grammar one could develop in which no utterance is ungrammatical. Of course, just because I can't imagine it doesn't mean it doesn't exist. But my imagination is quite vivid.

Finally, in light of points two and three above, I'd say that the biggest problem is that you don't take explicitly into account the human language-learning capability, a capability which is natural (by which I mean, biological) and constrains the forms language can take. This is the principle of "linguistic universality", which in some form (be it strong or weak) is pretty much accepted by the linguistic community. I should think that the most important characteristic in the design of an "ideal language" is that it be speakable by humans!
From:saizai
Date:March 17th, 2004 06:01 pm (UTC)
I've only briefly looked at Lojban; it's on my list of "things to eventually get to".

How do you see the PCM as implying grammatical unambiguity?

If anything, I do argue that a certain amount of ambiguity *is* necessary (or desirable, at least) - note the PDC. I haven't figured out how this should be implemented, though; look up my post about "explicit implicity" if you can find it. PDC *would* require that, if you want to be absolutely unambiguous, the grammar should support you in this. But that would be an *option*. And by the PDS/PSC, it would probably take longer to say than something which is ambiguous. (Unless its ambiguity is of a "greater order", e.g. having a word be ambiguous between "frog" and "orange" vs. between "man" and "woman".)

Also, I don't see how [PSD] 2a conflicts with unambiguity.

Humans can only process information at a certain speed

I've yet to see a convincing proof of this that does not take into account the potential for language constituting a restriction of some kind. If you can point me to one, please do. If one were to believe that posit, then your conclusion does make sense, and that would mean that the "bounding" effect of redundancy vs. PSD would be set at that "certain speed".

Not to mention - what makes you think that current languages *achieve* that "certain speed"? What is it exactly, and in what measurement?

I can't even imagine what kind of grammar one could develop in which no utterance is ungrammatical

Why not? Putting aside the possible reasons (as mentioned) for *wanting* ungrammatical utterances, that is. One would simply need a set of morphosyntax rules that are capable of having an interpretation for any given combination. (This would include leaving some rules as "free" - e.g. having "I store go" and "I go store" being semantically equal and legit; there are languages in which this is true.) What part(s) do you have a problem with? Example?
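The "leave some rules free" idea can be made concrete with a toy sketch (mine, not a proposal from the essay; the suffixes and role names are invented). If every word carries an explicit role marker, then any permutation of the same words parses to the same meaning, so no ordering is ungrammatical:

```python
# Hypothetical role-marking suffixes; in such a grammar, word order is free
# because grammatical roles are read off the words themselves.
ROLE_MARKERS = {"-a": "agent", "-o": "object", "-v": "verb"}

def parse(utterance: str) -> dict:
    """Assign grammatical roles from suffixes, ignoring word order entirely."""
    roles = {}
    for word in utterance.split():
        stem, suffix = word[:-2], word[-2:]
        roles[ROLE_MARKERS[suffix]] = stem
    return roles

# "I store go" and "I go store" come out semantically equal and legit:
assert parse("i-a store-o go-v") == parse("i-a go-v store-o")
```

This only frees up *ordering*, of course; a grammar in which literally no utterance is ungrammatical would also need an interpretation for missing or doubled roles, which is where the harder design work lies.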

constrains the forms language can take

Actually, that's one of the main reasons for doing this. I don't accept the (circular) argument that current languages represent the full limits of what humans are capable of; that seems just plain silly (not to mention it's a logical fallacy). I have yet to see any proof whatsoever of this posit, for that matter; linguistic universals are descriptive, not prescriptive. But again, if you can show me a proof, please do, as I would find it interesting. However, I refuse to take it as an assumption.

Please name a couple examples of physical (i.e. biological) constraints upon language that would cause problems with what I've proposed?
From:ouwiyaru
Date:January 3rd, 2005 07:20 pm (UTC)
I found this a stimulating read, if I didn't always agree with your aesthetics. In any case, I have a few comments:

2. Principle of Semantic Density
b) What do you mean by 'first'? Do you mean in the dictionary, or in the construction of the language?
c) I disagree with this point from a number of angles:

i) This will relegate 'jargon' - the handful of unique words most important to any given conversation - to longer words. That may begin as just an aesthetic issue, but it will put some words out pretty far in your lexicon. A more 'optimal' strategy would make some of the most key 'complex' words short, so you can also build jargons which do not take too long to express when they are used all at once. This will make your distribution of short words look more fractal when mapped against how complex the idea is.

ii) This will fail your PLE in the long term, since people will eventually start using 'generic'/small words for more specific things. Eventually, the generic word will take on the new meaning. This is documented, for example in The Evolution of Grammar (Bybee, et al).

iii) New words would have to come at the end

7. Principle of Semantic Conservation
I think you need to be more precise about what you mean by "acceptable parts." Phonemes? Words? Can the words have rules? For example, your ungrammatical sentence would be fine if you added articles to the nouns (e.g. 'The man got a job now'). Besides English's somewhat arbitrary decision to separate the words when writing, how is this different from saying that definiteness/indefiniteness must be specified on nouns? I suppose you're saying

On the other hand, I very much like your idea that someone could 'unpack' their words to articulate their expression better. This might potentially violate 2b of the PSD, if I understand it correctly.

1. Temporal Order
"What color is the cat?"
Possible answers:
"Black"
"The cat is black"
What makes the end of a sentence any more likely to be cut off than the beginning? If the speaker knows this is true, then they can answer the first way, right?

[general program of filling all input streams with 'dense' meaning]
This kind of project seems basically mystical in the real sense. I would be scared that such a culture would impose too much thought-control on its people. If you just want to make a happy tune, but the tune means "Down with the capitalists!" or "Your mother was a hamster...", then you will censor yourself and try to change the meaning to something more mild - losing some possibly potent subconscious expression that should NOT have had a meaning. So I see this as, in fact, undesirable.
From:saizai
Date:January 3rd, 2005 09:33 pm (UTC)
I found this a stimulating read, if I didn't always agree with your aesthetics.

Excellent. ;-)

[PSD (b)]: What do you mean 'first?' you mean in the dictionary, or in construction of the language?

In the creation of the vocabulary. That is, if you have a "finished" (heh) vocabulary, it should try to "fill up" the e.g. possible 2-syllable words before embarking on the 3-syllable or 4-syllable ones. So there should be a much higher percentage of "unused" possible N+1-syllable words than N-syllable ones. Although "saving space" intentionally (5%?) for the creation of new words later (or for use as placeholders / acronyms / pronouns / etc) is potentially useful too.
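This fill-shorter-shapes-first strategy, with a reserved margin for later coinages, is easy to sketch. The following is a minimal illustration of mine (the tiny CV phonology and the 5% reserve figure are placeholders, not anything specified in the essay):

```python
from itertools import product

# Hypothetical toy phonology: 4 consonants, 3 vowels, every syllable is CV.
CONSONANTS = ["p", "t", "k", "s"]
VOWELS = ["a", "i", "u"]
SYLLABLES = [c + v for c, v in product(CONSONANTS, VOWELS)]  # 12 syllables

def word_space(n_syllables: int) -> list:
    """All possible words of exactly n syllables."""
    return ["".join(w) for w in product(SYLLABLES, repeat=n_syllables)]

def assign(concepts: list, reserve: float = 0.05) -> dict:
    """Fill shorter word shapes first, holding back ~5% of each length
    for new words, placeholders, acronyms, etc. later."""
    lexicon = {}
    remaining = list(concepts)
    n = 1
    while remaining:
        space = word_space(n)
        usable = space[: int(len(space) * (1 - reserve))]
        for word in usable:
            if not remaining:
                break
            lexicon[remaining.pop(0)] = word
        n += 1
    return lexicon
```

With 15 concepts and this inventory, the first 11 get one-syllable words (12 shapes minus the reserve) and the rest spill over into the much roomier two-syllable space, leaving a far higher percentage of unused N+1-syllable shapes than N-syllable ones, as described above.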

iii) New words would have to come at the end

See above on reserving space.

c) I disagree with this point from a number of angles: [...]

Good point. Perhaps it should be better phrased as "simplest, given context". One other potential solution would be to have some sort of context-dependent variables (set implicitly or explicitly) which then modify the meanings of words. E.g., set context to "computer jargon"; after that, "foo" may mean something radically different than in another context. Thus, you could reserve a certain number of words (of different lengths) for use as jargon terms, which would have different meanings (though as similar as possible) in different jargon-contexts. You could then add explicit tags to make them more complex but more specified, if you want to refer to an out-of-context jargon term.

The two problems I see with this immediately are that, for one, it would need a better way of coping with multi-jargon conversations. (Perhaps a shortened form of the specifier would work?) For two, there is the potential for confusion... but that could be just explained as "in the context /foo/, /foobar/ -> /bar/; /bar/ without context is ambiguous (=? /foobar/, /garbar/, ...)" - a sort of contraction.
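The context-variable mechanism might look something like this sketch (the word, contexts, and tag syntax are all invented for illustration; nothing here comes from the essay): an ambient context resolves a reserved word, and an explicit tag overrides it for out-of-context reference.

```python
# Hypothetical jargon lexicon: the same short word carries related but
# distinct meanings in different jargon-contexts.
LEXICON = {
    "general":   {"bug": "an insect"},
    "computing": {"bug": "a defect in a program"},
    "espionage": {"bug": "a hidden microphone"},
}

def meaning(word: str, context: str = "general") -> str:
    """Resolve a word in the active context, falling back to general use."""
    return LEXICON.get(context, {}).get(word) or LEXICON["general"][word]

def tagged_meaning(tagged_word: str) -> str:
    """An explicit tag like 'computing:bug' overrides the ambient context -
    the longer, 'more specified' form for out-of-context reference."""
    context, word = tagged_word.split(":")
    return meaning(word, context)

assert meaning("bug") == "an insect"
assert meaning("bug", "computing") == "a defect in a program"
assert tagged_meaning("espionage:bug") == "a hidden microphone"
```

A shortened tag, as suggested above for multi-jargon conversations, would just be a second, abbreviated key into the same context table.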

[PSC continued in another comment]
From:creases
Date:June 13th, 2006 07:46 pm (UTC)
Hi, I drifted here from conlangs.

This has been mentioned before and I don't want to belabor the point, but what constitutes "ideal" depends on your purpose. One person's "ideal" could be an exercise in using as many phonemic distinctions as can be found in natural languages (for example, among occlusives: voiced, aspirated, ejective, postnasalization, gemination, etc. etc.). But another person's "ideal" could be maximum comfort as evinced in a survey of the most common phonemes. These two ideals would be at cross-purposes (the latter would severely limit the phoneme inventory, perhaps to as few as 10 phonemes), but each is motivated by an ideal.

My point is not that yours is just one among many ideals (you've already acknowledged that); my point is that it's important for you to make clear what your "ideal" is, what is motivating it, how it differs from other possible design goals, etc. As it is, it looks like you're going for maximal saturation of the vocal range. Is that right? Why does that interest you?
From:saizai
Date:June 13th, 2006 07:57 pm (UTC)
This is *on* conlangs :-P. I take it you mean that you saw my recent update post.

Of course that's true about ideal, and it's a question I've answered (albeit not here, and should update that here). Read the post I linked to - http://listserv.brown.edu/archives/cgi-bin/wa?A1=ind0606b&L=conlang - "On the design of an ideal language - finally, my (long!) response". It should be a fairly comprehensive reply.
From:music_dissident
Date:September 5th, 2007 09:52 pm (UTC)
I see comments here from all over time, so I don't feel too awkward posting a new one.

Have you considered memetic replication fidelity of an ideal language at all? (In other words, how faithfully is it reproduced when taught to others.) This raises the question of what features of the language itself preserve its characteristics as it is transferred from one speaker to a new one, and what modes of language transmission are ideal for the language. Does the language only survive effectively when a written component exists? A recorded bank of prescriptive materials, audio, visual, etc.? Or is the language preserved with a high level of fidelity with only normal human transmission, independent of other aids?
From:saizai
Date:September 6th, 2007 06:39 am (UTC)
I have. To be honest, I think that this is fairly problematic, as the mechanisms by which language evolves only support *some* of these ideals. Some others it will be neutral or detrimental to. Prescriptivism is a whole set of problems, which I'll just say I don't believe can work in a living language and leave it at that.

I think that to address it, what you need is for it to be built with its evolution in mind. Some of the natural processes should be, as it were, "pre-washed" - phonetic and orthographic simplification, for example, because if they're not already smoothed out, they're going to be the first to go. And you need room to grow, e.g. a robust morphological or other system so that new words (both derived and fundamental) can be made yet still fit well, etc.

An example where prescriptivism failed is Toki Pona - people tried to adopt it as an auxlang, and force it to have a "rich vocabulary", which fundamentally goes against its philosophy of simplicity. I'm not sure how this could be fixed, if at all; John Clifford's talk at the last Language Creation Conference may be helpful for this.
