Classification of Signs/Morphology: Day One
by Edward Vajda

The elements of any communication system are called signs.  The great Swiss linguist Ferdinand de Saussure noted that signs are composed of form and meaning.  He called these two halves of the sign the signifier (the form, the expression) and the signified (the meaning, function, content).  We have seen how all communicative signals, even those of animal systems of communication, contain signs composed of two halves, form + meaning.

Signs can be classified as icons or symbols based on the degree of physical similarity between form and meaning. 

Icon--sign physically resembles what it represents: map, picture of deer, the cigarette on a no smoking sign, photograph, mug shot.  Icons are based upon physical similarity (either visual or auditory).  The term icon comes from the Greek for picture; it also means the picture of a saint in the Russian Orthodox Church and the picture-directions used in Apple computer systems.

Think up some icons. Both religious icons and computer icons are also icons in the linguistic sense just described.  Occasionally, words can be iconic, if they are based on sound symbolism, or onomatopoeia: Navaho chidí, car; Cherokee word for pig, siqua; English whippoorwill, chickadee.

But most words are not iconic because they do not show any natural connection between their sound and their meaning.  Most words are the type of sign called a symbol.  A symbol is a sign in which the relation between form and meaning is arbitrary, based neither upon resemblance nor upon any other natural physical connection. Instead, the form-meaning connection is based on convention within the given speech community.  De Saussure noted that in most cases the union between sound and meaning in human language is completely arbitrary and conventional: cf. the words for tree and rock in different languages.

Iconicity can be a matter of degree (realistic vs. more abstract picture; cf. the Russian and English words for "sneeze").      

Think up some symbols other than words. Many other human signs outside the realm of language are also symbols: red and green traffic light, the black and white checkered flag at a race track.  Such symbols, being arbitrary, vary from culture to culture, just as words do:

a) Royalty is symbolized by purple in Europe, yellow in China.

b) Death is symbolized by black in Europe, white in Japan.

c) Most gestures are symbols, cf. the nodding or shaking of the head, or the Russian sign meaning Let's drink.  

Signs can also be classified in terms of the presence or absence of natural physical contiguity between form and meaning--as an index or a non-index.

Index--is a sign used in direct spatial and temporal connection with its meaning, often in the sense of event and consequence: smoke--fire, wind vane--direction of wind. 

Icons and symbols may be used as indexes or non-indexes, but words are not natural indexes.

The iconic images of a man and a woman are indexes when they denote men's and women's bathrooms; they are icons when they stand for the man or woman themselves. 

Signs in animal systems of communication seem to be used exclusively as indexes--to point to something directly present in space and time--almost to the point of being physical reactions to the immediate environment. 

Words in human language may also be used as indexes, such as shouting the word Fire! in the presence of a fire. But most human communication is non-indexical: the topic of conversation is not in the immediate context.

Any form of communication is a system of signs, but only human language is composed mainly of arbitrary signs (of symbols).  Only a few words are partly motivated (onomatopoeic words), and even these are partly dependent on the arbitrary sound system of each language.

When we classify the signs of human and animal communication as icons, symbols or indexes, striking differences emerge.  Animal systems contain some symbolism, such as the bees' use of gravity to symbolize the sun as a direction reference.  But even such symbolic signs are used exclusively as indexes by animals, since they are capable of being expressed only as reactions to what is directly present in time and space.  Human language is, in fact, the only system which need not use its constituent signs as an index.


Words.  The most obvious sign in human language is the word.  To the non-linguist, the word is the most obvious semantic unit of language.  But what is a word? We will see that there is no precise definition of "word" that might apply in all instances.

1) In English and many other languages, a word is separated in print from other words by a space: The hunter captured a tigress. We are so used to reading in this way that we would have difficulty picking out the words if the spaces were not there.  Acatsatsonamat.  This, of course, is not true of all written languages (in Old Russian texts, words were written together).  Also, how would one define the concept of word in an unwritten language? (Many of the world's languages are still unwritten, or are written only by linguists using the International Phonetic Alphabet and not by native speakers.)

2) Words can be separated from other words in speech by a pause, but that is not necessarily done.  The phenomenon of pauses in the flow of speech is called juncture: What are you doing?  Whacha doing?  Sometimes parts of words may also be separable by pauses: im--possible; un--believable.  There the pause separates two morphemes, yet intuition tells us that we are still pronouncing only one word.  Sometimes intuition tells otherwise: Behave is one word not two, but see the cartoon on textbook p. 42.

3) Another way to define the concept of word is by its syntactic separability from other words:

The /bad hunter /cruelly captured/ a /poor tigress.

Words in set expressions cannot be separated by the insertion of other words: soap opera, wee hours of the night, eat the hair of the dog that bit you.

But sometimes morphemes can be separated from one another by other morphemes, much like words can be separated by other words: social, unsocial, antisocial, antiunsocial.  Or unbelievable, un-friggin'-believable, guaran--damn--teed.

4) Yet another way to define words is in terms of meaning, the degree of semantic independence the unit has.  Words often denote single ideas, but may not: I closed the box. I sealed up the box.  The word up is not semantically independent from the verb sealed.  It is what is called a verbal particle.  Morphemes may also show a degree of semantic independence, such as the prefixes un, non, which have exactly the same function as the word not: He is unkind, He is not kind.

American linguist Leonard Bloomfield (1887-1949) thought of words as minimal free forms, the smallest units of speech that can meaningfully stand on their own.  This definition covers the majority of words, but some items that are treated as words in writing never really stand on their own in natural speech, such as English the and of, or French je for I.

5) One final angle to use in trying to define words is to contrast them with sentences.  Words tend to be a fixed part of the language that must be learned before being used.  If you invent a new word, other people will not readily understand the word (cf. wug). New sentences, on the other hand, are constantly invented, or generated, rather than being learned; yet, new sentences are readily understood by people who have never heard them before. 

Some words may also be readily invented by a speaker and easily understood by a listener (antilinguistic attitude). Such words have as high a degree of natural generability within the system of the English language as sentences do.

The concept "word," then, turns out to be a complex and fuzzy category, in which several different and sometimes contradictory parameters play a role: juncture, stress, concreteness of meaning, syntactic separability, and generability.  There is a gray area between the notion of morpheme and word, on the one hand, and word and phrase, on the other.  "Wordness," then, is not an absolute property, but rather a matter of degree of possession of several different properties--some of which are contradictory.  There is no more precise definition that would apply equally well to words across all languages.


However the concept "word" is defined, the set of words in a language is called the lexicon of the language (lexicon means dictionary in Greek).  Every language has thousands of words.  English may have the most with over 450,000, according to Webster's Third International Dictionary; Russian is said to be second with 240,000.  These high numbers are due to extensive scientific and technical vocabularies in all sorts of fields.

Every other language also has tens of thousands of different words. In fact, the number of words in any language is far greater than the 15,000 different words used in all of Shakespeare's plays.  And every language has the potential to add many thousands more new words given exposure to new concepts.  Navaho has over 500 special terms used in relation to ceremonial dances; these have no good equivalents in other languages.  The Sahaptin language spoken in the Yakima and Warm Springs Reservations has special words for 30 or more varieties of lomatiums, a genus of wild plants related to the carrot which bear edible roots.  English must resort to scientific terms in Latin to distinguish these plants.  And the Sahaptin terms actually describe the different genetic varieties of these plants with a higher degree of accuracy than the Latin.

Classification of Words.  Let's now examine some of the ways linguists classify the symbols we call words. Words may be classified according to whether or not they have concrete meaning:

Content words denote entities, actions, or qualities that can be experienced or imagined: sun, red, man, day.  Nouns, verbs, adjectives, and adverbs of manner (-ly) are content words.

Function words are the linguistic glue that links parts of a phrase or sentence together.  They have little or no concrete meaning outside their grammatical function in the sentence. Function words include articles, conjunctions, prepositions, verbal particles: the, a, if, and, or, to, because.  Some function word categories are a bit more like content words: certain prepositions, pronouns, and adverbs of place and time: here, there, then, now, above/below, after, vs. besides, to.

Function words can more easily be defined by explaining how they are used in sentences; content words can more easily be defined by reference to the world.

The content--function distinction for parts of speech is a logical continuum, another fuzzy distinction rather than an absolute one--but a useful one to make nonetheless. For instance, verbs are content words, but linking verbs like be/am/is/are are more like function words.  Also, pronouns like he, she, although they are function words, have more content than demonstrative pronouns such as this, that.  Interjections like yuck, oh, ouch are also somewhere in the middle of the content/function distinction.

The number of function words in a language is rather small, and native speakers know all of them.  By contrast, there are many thousands of content words, and there is probably no speaker--even the most educated--who knows them all. An analysis of the vocabulary used in AP news communications found 350,000 different words; fewer than 1,000 of these were function words.

Content and function words can also be distinguished according to how readily each group admits new members to its ranks.  We said that language is creative.  Some elements in language are easily inventible.  We can talk about the "generability" of language units.  Brand new sentences, of course, are invented every day by small children; new nouns, verbs, and adjectives also enter the language every year.  Content word categories readily admit new members to their ranks; thus they are said to be productive and are called open classes.  The main productive, open class, content word categories are: the verbs, nouns, adjectives (in some languages), adverbs of manner.  It is possible to coin new words in these categories (bogosity, unmicrowavability, morphed-out, pig out). 

Function words, on the other hand, belong to unproductive or closed classes; such classes contain fixed and usually rather small numbers of words: examples are conjunctions, prepositions, articles, linking verbs, adverbs of time and place.  It is not possible to sit down and invent a new article or preposition, for instance.  (cf. the Lewis Carroll poem Jabberwocky.)  New members of unproductive classes of words develop only gradually through time.  When looking at elements in language it is helpful to keep in mind whether they belong to productive or unproductive classes.

Morphemes.  The word, however, is not the smallest unit of form with a specific meaning.  Take a look at the word unrenewable.  It can be broken up into sound groups with specifically identifiable meanings: un-re-new-able.  Specifiable sound groups with specifiable meaning are known as morphemes.  A word can further be divided into individual sounds that reoccur in other words, but these sounds, for the most part, have no meaning in and of themselves.  The English word apple can be broken up into separate sounds or sound groups, but the resulting phonetic units cannot be assigned any particular meaning.  The word apple is not only a word, it is also a morpheme; the single word unrenewable, on the other hand, consists of four morphemes.  Morphemes are usually not unique to any single word; they are repeated in other words and add the same nuance of meaning each time: repay, reinvent, retell, rewrite, workable, reusable.  Or fat, fatter; thin, thinner; but the -er of writer has a very different meaning and thus is a different morpheme: the -er of redder and the -er of writer are homomorphs (two morphemes with identical form but unrelated meanings). (Note the example of fat, fatter; red, redder and mention the existence of allomorphs.)
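The division of unrenewable into un-re-new-able can be imitated with a toy affix-stripping routine.  This is only a sketch: the prefix and suffix lists below are tiny and hypothetical, chosen for this one illustration, and the routine looks at form alone.  Because a morpheme is defined by meaning as well as form, a form-only routine makes mistakes (it would wrongly split red into re-d), which is exactly why real morphological analysis cannot be purely mechanical.

```python
# Toy affix stripper. The affix lists are illustrative assumptions,
# not an inventory of English affixes.
PREFIXES = ["un", "re"]
SUFFIXES = ["able"]

def segment(word):
    """Split a word into prefixes, a root, and suffixes by greedy stripping."""
    prefixes, suffixes = [], []
    changed = True
    while changed:                       # peel prefixes from the left
        changed = False
        for p in PREFIXES:
            # Always leave at least one character for the root.
            if word.startswith(p) and len(word) > len(p):
                prefixes.append(p)
                word = word[len(p):]
                changed = True
                break
    changed = True
    while changed:                       # peel suffixes from the right
        changed = False
        for s in SUFFIXES:
            if word.endswith(s) and len(word) > len(s):
                suffixes.insert(0, s)
                word = word[:-len(s)]
                changed = True
                break
    return prefixes + [word] + suffixes

print("-".join(segment("unrenewable")))  # un-re-new-able
print("-".join(segment("apple")))        # apple (a single morpheme)
```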

Although the word morphology comes from the Greek morphe, which means form, the study of morphemes involves meaning (semantics) as well as phonological form.  Morphology thus involves the study of specifiable sounds or sound groups with specifiable meanings (whereas phonology is the study of sounds without regard for any particular meaning). 

Classification of morphemes/Morphology day two

So far we have discussed only the form of words.  Aristotle believed that the word constituted the basic unit of meaning in a language.  But, as we have seen, words are not the most basic semantic units since many words can be broken down into smaller parts, each of which has a predictable meaning that reoccurs in other words: hunter, captured, tigress.  We call these smallest units of meaning morphemes. A morpheme may be part of a word or an entire word.

Morphemes may be classified according to how many syllables long they are: in English morphemes tend to be one or two syllables long but many are several syllables long: salamander, triskaidekaphobia, armadillo.  In English, as in many languages, the longer, polysyllabic morphemes tend to be those borrowed from other languages, where originally they were divisible into separate morphemes: perestroika.  In Mandarin Chinese, 89% of morphemes are monosyllables: some, however, are two or three syllables, most of these being borrowed forms: putao, "grape" (borrowed from Persian budag); a few are not: hudian, "butterfly" (originally, each syllable probably had a separate meaning).

Morphemes can also be classed according to the order they appear in a word.  Each word contains at least one morpheme that denotes its basic meaning or function--that morpheme is called the root.  Morphemes added to modify the meaning of the root are called affixes.  In polymorphemic words, morphemes are named according to where they occur in relation to the root: prefix-root-suffix.  English has many affixes, both prefixes and suffixes: un-, non-, etc.  In some languages words change to reflect syntactic usage (cf. Latin, Georgian, Russian, or English -s of the 3rd person sing.)  Traditionally, when analyzing such languages, linguists call a suffix that conveys grammatical information an ending.  All other morphemes in the word except the ending are referred to as the stem, which may consist of a bare root or a root + affixes: re - write + s.

There are other types of affixes besides suffixes and prefixes.  An infix is inserted inside another morpheme (see text p. 44 for Bontoc infix um, which makes nouns and adjectives into verbs: fikas--fumikas, strong--be strong; kilad--kumilad, red--be red; fusul--fumusul, enemy--be an enemy).  A circumfix is placed around another morpheme; circumfixes occur in Georgian and other Caucasian languages, many Native American languages (see text p. 45 for Chickasaw circumfix ik ... o, which negates verbs and adjectives: chokma he is good--ikchokmo he isn't good; lakna it is yellow--iklakno it isn't yellow); Russian na--sya "to perform an action to satisfaction," and even in German past passive participles: lieben --> geliebt.
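The Bontoc and Chickasaw patterns just cited can be written out as two small string operations.  This is a sketch fitted only to the examples given here: it assumes that every Bontoc root begins with a single consonant and that every Chickasaw stem ends in a single vowel, which holds for these forms but is not claimed for the languages as a whole.  The function names are invented for the illustration.

```python
def bontoc_infix(root):
    """Insert the infix um after the first consonant: fikas -> fumikas.

    Assumes the root starts with exactly one consonant.
    """
    return root[0] + "um" + root[1:]

def chickasaw_negate(stem):
    """Wrap the stem in the circumfix ik ... o: chokma -> ikchokmo.

    Assumes the stem ends in one vowel, which the final o replaces.
    """
    return "ik" + stem[:-1] + "o"

for root in ["fikas", "kilad", "fusul"]:
    print(root, "-->", bontoc_infix(root))

for stem in ["chokma", "lakna"]:
    print(stem, "-->", chickasaw_negate(stem))
```

Note that the infix breaks the root into two discontinuous pieces (f...ikas), and the circumfix is itself a single morpheme in two pieces (ik...o); neither can be handled by simply cutting a word at one point.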

Free and bound morphemes  

Morphemes can also be classified according to whether they are capable of standing alone as separate words or must be connected to other morphemes to make a free-standing word.  

Free morphemes are morphemes that may be used as separate words.  In English, many root morphemes are free. And many English content and function words consist of single morphemes: hunt, kill, the.  The same is even truer of Chinese, Vietnamese and some other East Asian languages.  This is not true of all languages. 

In Navaho only a very few root morphemes denoting basic concrete concepts can stand as separate words: shash bear, chizh firewood, oljéé moon.  All of the remaining words in Navaho contain special affixes.  This is part of the reason that Navaho has borrowed very few words from English.  Perhaps as few as 30 nouns in Navaho are known to be borrowed, and no verbs: mósí, chidí, béeso.  (In English, over 75% of the words are borrowed.)  So the grammatical structure of a particular language seems to play some role in why one language borrows many words and another does not.

Bound morphemes

Normally, all affixes are bound morphemes: -ive, -ness, un-.  Rarely, an affix can be used as a separate word: ex, ism.

Bound morphemes may also be roots, as in English with many Latin borrowings consisting of prefix + bound verb root: reject, project, object, inject, eject all contain the bound root ject.  We say that ject is a bound morpheme because although it has a still identifiable meaning, to throw, it cannot be used without a prefix in the basic meaning 'throw.'

Other bound roots include the 'couth' of 'uncouth,' and the ceive and mit of receive and permit.  Most Anglo-Saxon verb roots in English (i.e., the ones not borrowed from Latin or Norman French), however, can be used as free morphemes: run, take, have, etc.  Many English nouns may likewise exist as free morphemes: sun, man, knife, etc.

Classification of words according to their morphemic structure

Simple words consist of a single morpheme. (20% of English words, including many basic content words as well as function words; 40% of Chinese words.)

Complex words consist of root plus at least one affix. As we will see, nearly all words in languages such as Navaho and Swahili are complex words (as are more than half of all English words).  In Mandarin Chinese very few words (less than 5%) are complex.

Compound words consist of two free morphemes, usually two root morphemes connected together.  Many English compounds consist of a noun + noun: trashcan, fighter-bomber, mailbox, lazybones.  In English the separate roots of a compound word may be written together, written as if separate words (compound word vs. a phrase: White House vs. white house), or spelled with a hyphen. There are a tremendous number of compound words in English. In Mandarin Chinese, over 50% of words are compound words (dianhua). In other languages, such as Russian, there are fewer compounds than in English.
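The three-way classification into simple, complex, and compound words can be stated as a short decision procedure over a word's morphemic analysis.  The sketch below assumes the analysis has already been done by hand: each word is supplied as a list of (form, kind) pairs, where kind is either "root" or "affix".  That representation and the function name are assumptions made for this illustration only.

```python
def classify(morphemes):
    """Classify a word from its hand-supplied morphemic analysis.

    morphemes: list of (form, kind) pairs, with kind "root" or "affix".
    """
    roots = sum(1 for _, kind in morphemes if kind == "root")
    affixes = len(morphemes) - roots
    if len(morphemes) == 1:
        return "simple"        # a single morpheme
    if roots >= 2:
        return "compound"      # two or more roots joined together
    if roots == 1 and affixes >= 1:
        return "complex"       # one root plus at least one affix
    return "unclassified"

print(classify([("hunt", "root")]))                                  # simple
print(classify([("re", "affix"), ("tell", "root"), ("s", "affix")])) # complex
print(classify([("mail", "root"), ("box", "root")]))                 # compound
```

A word such as fighter-bomber, which contains two roots and also affixes, comes out as a compound under this ordering of the tests; that is a design choice of the sketch, not a claim about how every linguist would label it.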

Complex words differ from compound words in the degree of semantic independence of each part.  Teachers and spelling experts often argue about whether to place a hyphen between the two root morphemes of compounds and treat them as separate words or to write them together as a single word.  But only a linguist engaged in morphological analysis would hyphenate a complex word like re-tell-ing-s; sometimes a plus sign rather than a hyphen is used to separate ending from the stem: re-tell-ing+s.

Let's take a closer look at complex words.  We noted that extra meanings can be attached to root morphemes by means of various affixes (prefixes, suffixes) and occasionally by infixes and circumfixes.  These various types of "fixes" (to use a bound root morpheme as a free morpheme) are all, of course, bound morphemes, since they cannot stand as independent words.  Affixes can be classified according to whether they have some degree of concrete meaning or serve as purely grammatical markers.

Derivational morphemes add concrete meanings to words.  They create new words out of old ones, separate lexical items.  There are quite a large number of derivational morphemes in English, both prefixes and suffixes: -er, -ess, -hood, -ive, -ness, re-, un- (but English lacks infixes and circumfixes). In contrast, there are very few derivational morphemes in Chinese: ex. kuai fast --> kuaizi chopstick.

Inflectional morphemes express more purely grammatical relations between words in a phrase or sentence.  They serve to mark different grammatical forms of the same word.  There are very few (only eight productive inflections) in all of English: three for verbs: -ed, -s, -ing; three for nouns: -s, -'s, -s'; and two for adjectives: -er, -est.  (There are some unproductive ones too, like the plural -en in oxen, and the participial -en in given.) The meanings added by inflectional morphemes are more purely grammatical than those added by derivational morphemes.

In this way, inflectional morphemes are more like function words and often alternate with them to express the same grammatical relation: the group's meeting--the meeting of the group; smaller--more small.  Derivational morphemes are more like content words and often alternate with them in paraphrases of the same meaning: tigress--female tiger, dislike--not like.

Derivational and inflectional morphemes do not represent two absolute opposites any more than do content and function words: once again, there is a continuum.  Some derivational morphemes add more purely concrete meanings than others: teach-er, re-write vs. good-ness; likewise, there is an element of concreteness in the meaning of certain inflectional morphemes: English noun plural (kid--kids can be paraphrased as one kid--several kids) vs. 3rd person singular verb endings. The difference between buy--buys cannot be reduced to a paraphrase.  Such degrees of abstractness can also be shown to have psychological reality.  For example, English has three inflectional morphemes with the phonetic form [s]: noun plural marker, noun marker for genitive of possession, and verb marker for 3rd singular.  Studies of child language acquisition have shown that children learning English invariably learn the noun plural first, the genitive second and the 3rd singular verb marker last.  Thus, children learning English always learn the three functions of -s in the order of more simple to more abstract.

Problems with morphological analysis/Morphology day three

Do the words redder and hunter contain the same morpheme -er?  No, because the meaning is different.  A morpheme is defined by form as well as meaning. Homonymy and synonymy may complicate morphological analysis.

You may have noticed other problems with morphological analysis. Not all words can be broken up into neat little morphemes.  Just as language is not a finite set of words, so the lexicon of a language cannot be described as a finite set of morphemes.  Morphological analysis has all sorts of untidy edges.  Here are three problems:

1) Sometimes morphemes change shape when they are combined.  The addition of affixes (prefixes, suffixes) to form complex words--the phenomenon so prevalent in European languages that gave the original impetus for the study of morphemes--itself often creates problems for morphological analysis.  Morphemes only sometimes combine in a way that completely maintains the integrity of each: anti-dis-establish-ment-ari-an-ism.  Morphemes combine precisely in this way in Swahili and other Bantu languages of Africa.  But often the presence of one morpheme causes phonetic changes in surrounding morphemes: red--redder, explain--explanation, electric--electricity, severe--severity, long--length.  This is called fusion.

In the previous examples, the root morpheme changed to accommodate a suffix.  Sometimes the shape of an affix changes to conform to the shape of the root to which it is added: cf. Kazakh plural: at-tar horse-s; it-ter dog-s.  When the vowels of the root morpheme dictate the vowel in the affix, we have a phenomenon called vowel harmony.  Compare English plurals: /s/, /z/, /ez/.  Different versions of the same morpheme are called allomorphs.  The study of how phonological rules affect the shape of morphemes is called morphophonology.  (Also mention partial suppletives, and suppletives.)
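The choice among allomorphs can be sketched as a rule that inspects the root.  Two caveats apply.  For English, the function below takes the final sound of the noun in a rough ASCII transcription (not its spelling), and the sound classes listed are the usual textbook ones: /ez/ after sibilants, /s/ after other voiceless consonants, /z/ elsewhere.  For Kazakh, only the -tar/-ter pair cited above is modeled, and the sorting of Latin-transcribed vowels into back and front sets is an assumption of this sketch; nothing here describes the full Kazakh plural system.

```python
# English: final sounds in a rough ASCII transcription (an assumption).
SIBILANTS = {"s", "z", "sh", "zh", "ch", "j"}
VOICELESS = {"p", "t", "k", "f", "th"}

def english_plural(final_sound):
    """Pick the plural allomorph from the last sound of the noun."""
    if final_sound in SIBILANTS:
        return "/ez/"
    if final_sound in VOICELESS:
        return "/s/"
    return "/z/"            # voiced consonants and vowels

# Kazakh: vowel classes for Latin transcription (an assumption).
BACK_VOWELS = set("aou")
FRONT_VOWELS = set("ei")

def kazakh_plural(root):
    """Choose -tar or -ter by the last vowel of the root (vowel harmony)."""
    for ch in reversed(root):
        if ch in BACK_VOWELS:
            return root + "-tar"
        if ch in FRONT_VOWELS:
            return root + "-ter"
    raise ValueError("no vowel found in root")

print(english_plural("t"), english_plural("g"), english_plural("s"))
print(kazakh_plural("at"), kazakh_plural("it"))
```

In both cases the meaning "plural" stays constant while the form is predicted from the sounds of the root, which is what makes these forms allomorphs of one morpheme rather than separate morphemes.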

2) Semantic shift. Very often the meaning of a word changes through time so much that the meaning of the whole is no longer the sum of its morphological parts (postman, chairman, thinkable).  This is happening to compounds such as blackboard (yellow blackbird) and has already happened to the compound bigwig.  At some point such compounds become single morphemes.  Here, the morphemic structure merely hints at the etymology, or origin of the word, not at its present meaning (historically, important public figures wore showy, powdered wigs--this is never true of "bigwigs" nowadays).  Other examples: disease, malaria.  Semantic change through time thus complicates morphemic analysis, and the morphemic structure may reflect an original, historical meaning that may be obscured in the modern language.

3) Sometimes a morpheme is attached to another form that looks like a morpheme but is not repeated in other words.  It is unique in the language. The remaining morphological residue cannot be assigned a meaning and therefore is not a normal morpheme, since a morpheme by definition has some regularly recurring meaning: huckle-berry, luke+warm, in+ert, un+couth.  Unique bound roots such as huckle, luke, etc. are known as cranberry morphs (although cran is no longer a cranberry morpheme since it occurs in other words).  Usually in these cases, the cranberry morpheme was borrowed from another language: cran- is borrowed from Dutch.

Minor word formation techniques (the major ones being compounding and affixation)

There are several minor techniques of word formation (in addition to prefixing, suffixing, infixing, and circumfixing) that also confound any attempt to divide all words into neat little units called morphemes.

a) Null affixation.  Sometimes a new meaning is added to a word without any change in form. This can be a grammatical inflectional concept, such as in sheep (is) vs. sheep (are). Or it can be a derivational concept, such as (to) run vs. (a) run. In such cases the morpheme is called a zero morpheme and the process of creating a new meaning with it is called null affixation. Using null affixation to change one part of speech into another is known as morphological conversion. Zero morphemes pose problems for the division of words into morphemes because, technically, they have no form.

b) Sometimes two morphemes of identical form are compounded together.  This technique is called reduplication.  Many Polynesian languages use reduplication.  Hawaiian wiki to move fast --> wikiwiki quick, helu count-->  heluhelu read.  But reduplication may be only partial, as in: 'uku flea --> 'u'uku tiny.  Partial reduplication complicates morphological analysis.  Once again, we have a case where the form of the morphemes reflects the origin of the word rather than its modern form-meaning connection.  (Cf. the problem in the book, p. 70; also give the Swahili example with dogodogo meaning very small.)
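Full and partial reduplication, as in the Hawaiian examples above, can be written as two one-line string operations.  This is a sketch fitted to the cited forms: partial reduplication is modeled as copying the first two symbols of the root, which assumes the root begins with a consonant plus a vowel and treats the glottal stop (written ') as a consonant.  The function names are invented for the illustration.

```python
def reduplicate(root):
    """Full reduplication: wiki -> wikiwiki."""
    return root + root

def partial_reduplicate(root):
    """Partial reduplication: copy the initial consonant + vowel.

    Assumes the root begins with one consonant followed by one vowel;
    the glottal stop (') counts as the consonant in 'uku.
    """
    return root[:2] + root

print(reduplicate("wiki"))          # wikiwiki
print(reduplicate("helu"))          # heluhelu
print(reduplicate("dogo"))          # dogodogo (the Swahili example)
print(partial_reduplicate("'uku"))  # 'u'uku
```

The partial case shows why reduplication troubles a morpheme-by-morpheme analysis: the copied piece 'u is not a morpheme with a form of its own, since its shape is determined entirely by the root it attaches to.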

c) Sometimes two perfectly good morphemes are blended together so much that it is impossible to divide them into sound-meaning units.  Compare: reusable, where each morpheme combines and in doing so completely preserves its original form, to smog (smoke + fog).  Such combinations are called blends. Such blends are becoming more and more popular in Modern English and often are used in advertising: Great Eggspectations (the name of a restaurant).  Each of them can only be described as a new morpheme.  The proper linguistic name for such a blended morpheme is portmanteau morpheme. (Also mention contractions, which are a type of syntactic blend.)

d) Clipped words are words that lose part of themselves, usually the end syllable or syllables, thus tearing apart good morphemes at the sub-morphemic level: grad, prof, dorm.  Many familiar forms of proper names are clipped words: Liz, Flo, Ed, Al.   Clipped words may form compounds: polysci, physed. These are called stub compounds.

e) Acronyms (words made from the first part of several words, usually from the first sound) also cannot be divided into morphemes in the conventional way: scuba, radar, AIDS.  If the names of the letters are pronounced, as in DJ, VIP, OK, USA, the result is a special type of acronym called an alphabetism.  Clipped words, acronyms, and alphabetisms are all referred to as abbreviations.

f) Back formations result when a word consisting of a single morpheme is reinterpreted as consisting of two. For example, inept and uncouth are yielding ept and couth; enthusiasm is yielding enthuse. Some of these are not fully accepted yet, but historically many back formations have become completely accepted words: act from action; pease (once the singular form, like rice) yielded a new singular pea.

g) Sometimes the exact opposite occurs: an extra morpheme is added which completely reproduces the meaning of the first: irregardless.  These types of agglomerations are considered mistakes and do not usually become words.  The process of agglomeration often occurs when words are borrowed from one language to another because speakers of the receiver language do not know the morphological divisions inherent in the borrowed word: Russian komiksy, klipsy, Zaliv Pyudzhet Saund.  This complicates morphological analysis because two morphemes contribute the same meaning to a word.

h) It might be added here that proper names also complicate morphological analysis: sandwich, Sequoia, sideburns, hooker.  A common noun resulting from a proper noun is called an eponym.  There are a few thousand eponyms in English.  The creation of eponyms is one way that new free morphemes can enter the language. The presence of eponyms doesn't complicate morphological analysis because they, as well as the proper nouns that spawn them, are usually single morphemes.  But eponyms do alter the morphological structure of the language, since many of them are multisyllabic rather than only one or two syllables long.