Syntax (by Edward J. Vajda)

Let us now move on to another major structural aspect of language, syntax.  The word syntax derives from the Greek word syntaxis, which means arrangement.  Morphology deals with word formation out of morphemes; syntax deals with phrase and sentence formation out of words.

What is a sentence?

Although everyone knows or thinks they know what a word is and what a sentence is, both terms defy exact definition.  The sentence as a linguistic concept has been defined in over 200 different ways, none of them completely adequate.  Here are the most important attempts at defining the sentence:

The traditional, or common-sense, definition states that a sentence is a group of words that expresses a thought.  The problem comes in defining what a thought is.  The phrase an egg expresses a thought, but is it a sentence?  A sentence like I closed the door because it was cold expresses two thoughts and yet it is one sentence.

Another definition is that a sentence is a group of words expressing a topic (old information) and some comment (new information) about that topic: John left.  (Notice how intonation--which is a part of phonology--interacts closely with syntax in delimiting topic from comment--another example of the grammatical interconnectedness of all the so-called levels of language.)  The problem with the topic-comment definition is that many sentences have no clear topic and comment structure:  It's raining.

The grammatical definition states that the sentence is the largest unit to which syntactic rules can apply.  In terms of syntactic categories, most sentences--at least in English--can be divided into a subject and a predicate.  This applies to sentences with a clear topic/comment structure (John -- left) and to those without one (It -- is raining).  (The word it here is the so-called dummy it used to fill the subject slot for impersonal verbs in English; cf. prshí, snezí.)

Another problem with grammatical, or syntactic, definitions of the sentence is that not all sentences--even in English--are divisible into subject and predicate.  Some sentence types have no internal syntactic structure; there is no distinction between subject and predicate:

a) Emotive sentences such as Gee!  Wow. Darn!  Yes!  No! 

b) Imperatives:  Go! Leave! Taxi! All aboard!  Down with alcohol!

c) Elliptic sentences: Who took the car?  John.

d) Small talk phrases:  Hello. Good-bye. Good morning.

In polysynthetic languages a single word serves as a complete sentence much more frequently.  In such languages, morphology rather than syntax usually expresses the distinction between subject and predicate.

Types of sentences containing a subject and a predicate

Syntax usually examines sentences that have a clear inner division into subject and predicate.  There are 3 types of subject/predicate structured sentences:

a) a simple sentence contains at least one subject and one predicate:  John read Pushkin.

b) a compound sentence is two or more simple sentences joined into a single sentence:  John read Pushkin and Mary read Updike.  Each simple sentence maintains its own internal syntactic structure.  They may be joined by a coordinating conjunction such as and or or, or asyndetically (without a conjunction).

c) a complex sentence is a sentence in which one of the syntactic roles is played by an embedded sentence:  I made students read Chomsky.  The simple sentence students read Chomsky plays the role of object of the verb made.  Because the syntax of the two parts of a complex sentence is intertwined, it is often not possible to divide them into two free-standing simple sentences:  *I made.  Students read Chomsky.  Another example of a complex sentence:  I saw Mary run.

Complex sentences, then, are said to consist of a main clause, with a subordinate clause embedded into its structure (the subordinate clause is often referred to as an embedded sentence).  In phrase structure notation a subordinate clause, or embedded sentence, is notated as S', pronounced s-bar.

The word that connects a subordinate clause to a main clause, such as the word that in I know that you snore, is known as a subordinating conjunction; in syntactic analysis a subordinating conjunction is known as a complementizer, and is notated as Comp.  In some English complex sentences the complementizer is optional, in others obligatory: I know (that) you snore. vs. I hate when you snore.  (If the complementizer has a temporal meaning it can't be left out.)

Parts of speech

Words and phrases can be grouped according to their sentence building functions.  Syntactic classes of words are traditionally called parts of speech.  English has the following parts of speech: verb, noun, adjective, adverb, pronoun, preposition, verbal particle (the off in turn off the light), and article.

Note the following test to determine what is a preposition and what is a verbal particle in English: 

a) The mouse ran up the clock.--Up the clock he ran.  (Prepositional phrases can be fronted.)

b) The man ran up a big bill.--*Up the big bill he ran.  (Verbal particles cannot.)  Also:  The mouse ran up it (the pronoun is the object of the preposition and can follow the preposition) but not *The mouse ran it up.  But:  The man ran it up (the pronoun is the object of the verb and follows the verb), not *The man ran up it.

Not all languages have the same parts of speech.  Many languages have postpositions rather than prepositions, like Georgian skolashi, to school; skoladan, from school.  Serbo-Croatian, Slovak and many other languages have clitics (clitics are affixes attached to phrases instead of single words):  Dal som knigu prijatel'ovi / Knigu som dal prijatel'ovi / Prijatel'ovi som dal knigu (I gave it to my friend).  Spanish uses the object marking clitics le and lo after verbs: Dice mi lo

A common assertion is that all languages have at least nouns and verbs.  It is true that all languages have some means of conveying information as a concept or as an event, but what a noun or verb is differs from language to language.  In the Salishan languages of the Puget Sound, a single word can be translated into English as village and a village exists or there is a village; in other words, morphemes denoting stationary concepts are often bound roots that require verbal affixes to stand as words.  So parts of speech--even nouns and verbs--turn out to be at best fuzzy categories across languages, not identical or even present in every language.  Some people think of parts of speech or grammatical categories as similar to protons, electrons and neutrons in how they contribute to the structure of languages, but such is not the case.  The form/meaning connections differ from language to language.  There are universal tendencies, but these do not seem to be absolute universal properties.

Parts of speech are based on syntactic function, not concrete, extra-linguistic meaning.  Notice that words in different syntactic classes can have the same concrete meaning and differ only in their ability to combine with other words:  The sky darkens,  the darkening of the sky, a dark sky, the darkness of the sky.

Thus syntactic patterns as well as syntactic categories cannot be said to be limited to any concrete real-world meaning; they are linguistic structures relevant for expressing meaning and yet have no specific meaning of their own.  Note Chomsky's famous semantically anomalous statement:  Colorless green ideas sleep furiously.  This sentence is utter nonsense but it is nonsense stated in English and conforms perfectly to a complex set of rules of English syntax; thus one is tempted to devise a surrealist interpretation of it.  The utterance *green sleep colorless furiously ideas is not a sentence of English at all and even the most imaginative person could not devise a meaning for it.

Syntactic atoms

The basic unit of syntax is not the word, but the syntactic atom, defined as a structure that fulfills a basic syntactic function. Syntactic atoms may be either a single word or a phrase that fulfills a single syntactic function.

Fido ate the bone.

The dog ate the bone.

The big yellow dog ate the bone.

Our dog that we raised from a puppy ate the bone.

Elements with syntactic equivalence all belong to the same type of syntactic atom (NP, VP).

A language also contains specific rules for properly connecting syntactic atoms to form sentences--these are called phrase structure rules (look at problem 5 on page 116).  The sentence: The big yellow dog ate the bone. is well formed because it uses the parts of speech in a way that conforms to the rules of English syntax.  The string of words: big the ate bone dog yellow the, is not a sentence because it violates syntactic rules.  It is often not even possible to assign any meaning to a syntactically ill-formed utterance. 

This is why the syntactic rules of a language can be followed perfectly to produce illogical or semantically highly improbable sentences: The bone ate the big yellow dog.  Since a new context could be imagined to render such a statement at least fictionally logical, it is fortunate that our language has a ready made means of expressing it.  The fact that syntactic structures are not restricted in the meanings they may express is one reason why we can so easily produce novel sentences never before heard.  The semantic independence of the phrase structure rules is one of the main factors that provides for the infinite creativity of human language.  Animal communication systems don't have any structural units that serve to express meaning yet are themselves independent of any specific meaning.

Syntactic Relations and phrase structure rules

Let's examine syntactic relations within English sentences.  One approach is to divide the words of a sentence into phrases (defined as words closely associated with one another syntactically).  This technique is known as parsing.  The most fundamental division is between subject and predicate.  (Of course, this is because we are cheating and ignoring sentence types that lack this division.)  Phrases containing different parts of speech can serve one and the same function.

The big yellow dog //ate /bones 

He //ate the old bone.

The big yellow dog //slept.

The dog //growled at John.

Each of these sentences consists of a subject and a predicate.  But in each sentence different syntactic types of words or combinations of words constitute subject and predicate.  Different combinations of parts of speech fulfilling the same syntactic function are said to be syntactically equivalent.  It is possible to write rules describing syntactic equivalence.  These rules are called phrase structure rules. These rules use special symbols designed exclusively for syntactic descriptions.  Grammatical terms or graphic notation devices devised to describe language structure are examples of meta-language, defined roughly as language about language.  The syntactic metalanguage used in writing phrase structure rules involves mainly abbreviations from English words for parts of speech.

S--> NP VP  A sentence consists of a noun phrase and a verb phrase. (These correspond to subject and predicate.)

NP--> (art) (adj) N or NP --> pronoun
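The two rules above can be sketched as a tiny recognizer in Python.  This is only an illustration under stated assumptions: the word lists, the function names, and the VP rule (a verb optionally followed by a noun phrase) are hypothetical additions made for the example and are not part of the rules given above.

```python
# Toy lexicon (hypothetical; chosen to match the example sentences in the text).
LEXICON = {
    "art": {"the", "a"},
    "adj": {"big", "yellow", "old"},
    "N": {"dog", "bone", "bones"},
    "pronoun": {"he", "she", "it"},
    "V": {"ate", "slept"},
}

def parse_np(words, i):
    """NP --> (art) (adj) N  or  NP --> pronoun.
    Return the index just past the NP, or None if no NP starts at i."""
    if i < len(words) and words[i] in LEXICON["pronoun"]:
        return i + 1
    if i < len(words) and words[i] in LEXICON["art"]:
        i += 1  # optional article
    if i < len(words) and words[i] in LEXICON["adj"]:
        i += 1  # optional (single) adjective
    if i < len(words) and words[i] in LEXICON["N"]:
        return i + 1
    return None

def parse_vp(words, i):
    """Assumed rule for the sketch: VP --> V (NP)."""
    if i < len(words) and words[i] in LEXICON["V"]:
        i += 1
        after_np = parse_np(words, i)
        return after_np if after_np is not None else i
    return None

def is_sentence(text):
    """S --> NP VP: accept only if an NP followed by a VP uses up every word."""
    words = text.lower().rstrip(".").split()
    i = parse_np(words, 0)
    if i is None:
        return False
    return parse_vp(words, i) == len(words)
```

With these assumptions, is_sentence("The dog ate the bone.") is accepted and the scrambled string big the ate bone dog yellow the is rejected.  Note that The big yellow dog slept is also rejected, because the NP rule as written allows at most one adjective; a fuller rule would have to let the adjective repeat.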

(Go over exercise 5 on page 116 in the textbook.)

Phrase structure rules are said to be recursive.  That is, identical elements in the structure of a phrase can repeat.  These repeating elements are sometimes known as parallel items in a series.

Parallel subjects: the sentence John came--John, Bill, and Mary came. is a simple sentence with a recursive subject.   (Compare John came and Bill came which is a compound sentence each part of which has a simple subject.)

Parallel verbs: Caesar came, saw, and conquered.

Parallel modifiers

adverbs:  a very good book--a very, very good book; or

adjectives: a green and red and pink and blue book.

Parallel compound sentences: I came and Bill came and Mary came and...

Multiple subordinate clauses in a complex sentence: I know an old lady who swallowed a fly which was chased by her cat who had been bored because there was nothing to do in the house that Jack built when he. . .

Remember the ability of syntactic elements to occur in multiples is known as recursion.

It is possible to write an entire book consisting of just one single recursive complex sentence.  The property of recursion means that it is impossible to propose limits on the length of sentences.  No one will ever be able to state with certainty what the longest possible sentence can be.  There are a limited number of words in each language, but a potentially infinite number of sentences.  This realization prompted 19th century German linguist Wilhelm von Humboldt to say:  "Language makes infinite use of finite means."  Such a statement could not be made about animal systems of communication, in which the number of messages is strictly limited.
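The idea of finite means and unbounded output can be illustrated with a short Python sketch.  The function name and the single recursive rule used here (S --> "I know that" S) are assumptions chosen for the illustration; the only point is that a fixed vocabulary plus one recursive rule yields a new, longer sentence at every depth.

```python
def embed(depth):
    """Build a sentence with `depth` nested subordinate clauses.

    Assumed recursive rule for the illustration: S --> "I know that" S
    """
    if depth == 0:
        return "you snore"  # base case: a simple sentence
    return "I know that " + embed(depth - 1)

# Each depth gives a different sentence built from the same five words.
sentences = [embed(n) for n in range(4)]
```

Here embed(2) gives I know that I know that you snore.  No depth is the last one: whatever sentence the function returns, calling it with depth + 1 returns a longer one.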

The syntax/morphology interface/ Day two

      1) The syntactic atom is the basic unit of syntax; syntactic structures are made up of other syntactic structures; although syntax is separate from meaning (we can have syntactically correct sentences which are utterly anomalous semantically), it is not possible to separate syntax from morphology completely: there are some instances where specific phrase structure rules are constrained semantically, for instance in VP = V + NP  (Subcategorization rules for verbal complements).

      Let's take a closer look at verb phrases, which are more complex than noun phrases.  First of all, VP can = a single verb (V): He ate; or the verb may have an auxiliary (aux): He was eating, He has eaten; or the verb phrase might contain verb + dependent words.  There are several types of verb-dependent words, known collectively as verbal complements:  He ate yesterday (Adv); He ate meat (NP); He ate in the cafe (PP).  Object noun phrases, prepositional phrases, and adverbs all fulfill the same syntactic function--the verb complement.  (Yesterday we noted that in language typology the complement is notated as O.)

      The noun phrase complements of action verbs are called direct objects: He kicked the ball.  Verbs that can take a direct object are called transitive verbs.  Some transitive verbs are obligatorily transitive: that is, they cannot be used without a complement: *He made.  Other transitive verbs may omit the object: I write vs. I write a letter.

      Verbs that cannot take a direct object at all are called intransitive.  For instance, the verb sleep cannot take a direct object complement:  He slept (yesterday, at home), but not *He slept a fish.

      The complements of linking verbs are called predicate nominals, which may be either nouns or adjectives: John arrived healthy.  We became ill.

      Sometimes the same verb can have two different meanings, one requiring a direct object, the other a predicate nominal: We smelled the roses.  The chef made (created) a good salad. vs. The roses smelled good.  He made (became) a good chef.  

      The study of what grammatical form may or may not be used after a verb is called verb government.  It is also known as lexical subcategorization, the point being that it is not enough to know the meaning of a word and what part of speech the word belongs to.  One must also know additional requirements about how the word may or must combine with other words in a phrase.
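Lexical subcategorization can be pictured as extra information stored with each verb.  The Python sketch below is a hypothetical encoding, not a standard notation: the dictionary name, the three labels, and the function are assumptions made for the illustration, and the verbs are the ones used in the examples above.

```python
# Hypothetical subcategorization frames for three verbs discussed above.
# "required": a direct object must follow; "optional": it may; "none": it cannot.
SUBCAT = {
    "made": "required",   # *He made.
    "write": "optional",  # I write. / I write a letter.
    "slept": "none",      # *He slept a fish.
}

def well_formed(verb, has_direct_object):
    """Check a verb phrase against the verb's direct-object requirement."""
    frame = SUBCAT[verb]
    if frame == "required":
        return has_direct_object
    if frame == "none":
        return not has_direct_object
    return True  # optional: fine with or without an object
```

Under this encoding well_formed("made", False) fails, matching *He made, while well_formed("write", False) and well_formed("write", True) both succeed.  Knowing that all three words are verbs is not enough; the frame has to be looked up word by word.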

      In polysynthetic languages verb government is part of morphology.  There is no clear division between morphology and syntax that can be drawn across all languages; where the line falls varies from language to language.

Phrases and heads

      Since they cannot be defined as having specific meanings, syntactic atoms (single words or whole phrases) are defined by how they interact with syntactic rules.

1) They do not allow reordering of their constituents, and they move only as whole units: It's the bone the dog ate.  The bone, he ate it.  (These are a cleft sentence and a sentence with left dislocation.)  You can't front only part of a syntactic atom, any more than you can change the order of morphemes in a word (rewrite but not *write-re): *The big, he ate the bone.  (NOTE: When used as examples, grammatically ill formed sentences and words are traditionally marked by an asterisk *.  This also applies to morphologically ill formed words:  *ingrun, *runre.)

2) One may not anaphorize, or substitute for, only a part of a morphologically complex syntactic atom: I like the tea's flavor.  I like its flavor.  But not: Here is coffee and here is a coffeepot.  *I like its pot.

3) Also, if a morphologically complex syntactic atom takes inflectional endings, then only the head can be so modified, not any of the subordinate constituents. (Workaholic--workaholiclike, *workedaholic, *workingaholic.)

The head of a syntactic atom can sometimes be a zero morpheme: withstand, grandstand; leaf --> maple leaf, Toronto Maple Leafs; fly --> fly out (a window), a fly ball --> to fly out (in baseball): He flied out.

Notice that noun phrases often have internal rules.  English noun phrases observe a strict word order: article, adverb, adjective, noun.  Noun phrase structure rules differ from language to language: In French, Hawaiian, and many other languages the adjectives come after the noun.  In many languages the form of articles or adjectives changes to reflect the gender of the noun.  When words in a phrase change grammatically to accommodate one another the process is called concord or agreement.  French is a good example: le petit garçon vs. la petite fille; German: das Haus; der Apfel; die Blume.  In such cases we say that the noun is the head of the phrase, since it causes other words to change and yet remains unaffected by whatever adjective or article is added to it.  In English, the head of the syntactic unit called the sentence is the subject NP, since the verb agrees with it and not the other way around.  Each syntactic atom has its head.

Diagramming sentences, how to deal with ambiguity

Let's now turn to instances of ambiguity in syntax.  Sometimes a sentence or phrase allows for two different syntactic interpretations.

Parsing using parentheses to show syntactic relations can disambiguate such a phrase as: old men and women
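The two bracketings of old men and women can be written out explicitly.  The Python sketch below uses nested tuples as a hypothetical encoding of the parentheses (the variable and function names are assumptions for the illustration); it shows that the two structures differ while their linear word order is identical.

```python
# The two readings of "old men and women" as nested tuples (hypothetical encoding).
reading_1 = (("old", "men"), "and", "women")   # only the men are old
reading_2 = ("old", ("men", "and", "women"))   # both the men and the women are old

def flatten(tree):
    """Return the words of a bracketed structure in linear order."""
    if isinstance(tree, str):
        return [tree]
    words = []
    for part in tree:
        words.extend(flatten(part))
    return words

def bracket(tree):
    """Render a structure with parentheses around each phrase."""
    if isinstance(tree, str):
        return tree
    return "(" + " ".join(bracket(part) for part in tree) + ")"
```

Here bracket(reading_1) gives ((old men) and women) and bracket(reading_2) gives (old (men and women)), yet flatten returns the same four words in the same order for both.  The ambiguity lies in the grouping, not in the string of words.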

Other sentences do not lend themselves to such a linear approach.  Sometimes the words that belong to the same syntactic unit are separated by other words: The book that was lying under all the other books is the most interesting.  Tree diagrams can be used to show such "long distance" grammatical relations.

Consider also the sentence The fish is too old to eat.  Here, parsing and even tree diagramming cannot separate out the two potential meanings.  In such cases of semantic ambiguity, paraphrases can be used to express two meanings hidden in a single linear form:

The fish is too old for the fish to eat.  The fish is too old to be eaten.

Noam Chomsky, a linguist at MIT, became interested in the phenomenon of syntactic ambiguity.  He noticed that languages contain systematic ways of paraphrasing sentences: 

a.) Active sentences can regularly be turned into passives: The boy kicked the ball.--> the ball was kicked by the boy. (passive transformation)

b.) Statements can be regularly turned into questions: He is there.--> Is he there?  (interrogative transformation)
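A regular paraphrase relation of this kind can be pictured as a rule that rearranges the parts of a sentence.  The Python sketch below is a deliberately narrow illustration of the interrogative case, not Chomsky's formalism: the auxiliary list and the function name are assumptions, and the rule only handles a one-word pronoun subject followed directly by an auxiliary.

```python
AUXILIARIES = {"is", "was", "has", "can", "will"}  # assumed, incomplete list

def to_question(statement):
    """Move the auxiliary in front of the subject: 'He is there.' -> 'Is he there?'

    Assumes a one-word pronoun subject followed directly by an auxiliary.
    """
    words = statement.rstrip(".").split()
    if len(words) < 2 or words[1].lower() not in AUXILIARIES:
        raise ValueError("no auxiliary directly after a one-word subject")
    subject, aux, rest = words[0], words[1], words[2:]
    # Lowercase the subject unless it is the pronoun "I".
    subj = subject if subject == "I" else subject.lower()
    return " ".join([aux.capitalize(), subj] + rest) + "?"
```

For example, to_question("He is there.") returns Is he there?  The same rule applies to any statement of that shape, which is what makes the relation between statement and question systematic rather than a property of one particular sentence.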

He came to believe that such parallel syntactic means of expressing the same meaning were simply surface manifestations of deeper structural units of language.  To study and describe such deep structures, he devised the theory of transformational grammar.  The three main tenets of this theory are:

1) The surface forms of a language are reducible to a limited number of deep structures.  The same deep structure is manifested in several different ways in actual sentences.  This is similar to the use of the principle of allomorphs to describe morpheme variants.

2) These deep structures are universal--in other words, the same for all languages of the world; only the rules for deriving the surface forms from the deep structures differ from language to language.

3) The reason these deep structures are universal is that they are inborn, part of the human genetic code; being inborn they help children discover the surface forms of language so quickly.

Transformational grammar has maintained its popularity since 1957 when Noam Chomsky published his first book, Syntactic Structures.  But major problems continue to dog the theory.  The main problems are:

Transformational rules only work for sentences composed of separate noun and verb phrases.  We have seen that not all sentences are of this type.

Mainly English data was used to find these supposedly universal deep structures.  Usually one of the paraphrases is taken as the basic one and the other derived from it: cf. active and passive.  But active is not more basic in all languages; Japanese uses the passive as its more basic form.

No deep structures have been described that would apply across all languages.  Structural universals tend to be proposed, then discarded as data from new languages disprove them.  There seem to be universal tendencies in syntax, but no universal has yet been proven to exist that would be more specific than the general creativity in humans.

Thus, no real progress has been made in writing a universal grammar that would be applicable to all human languages.  In chemistry we have the Periodic Table of Elements--all substances on earth can be seen as compounds of a finite set of elements.  Human language doesn't seem to work this way, and no such table of universal grammar elements has been found.

Definitions of Grammar

Since sentence formation is the most obvious and frequent manifestation of creativity in any language, the syntactic rules of a language are often referred to as the grammar of the language.  But morphology and phonology are also part of the grammar in that they, too, are creative tools. 

Here it might be pertinent to mention a few other definitions of the term grammar that are widely used.

a) A descriptive grammar is a description of the structure of a language in all its aspects--morphology, syntax, phonology--which attempts to portray the language as accurately as possible in terms of how it is naturally used by speakers.

b) A prescriptive grammar is a description of a language which assigns value judgments to competing ways native speakers use in forming words or sentences.  Prescriptive grammars do not attempt to describe the language as it is naturally spoken, but rather to tell the speakers how they best should speak it.

c) A third type, grammars of foreign languages written for second language learners, falls in between the other two types.  They represent attempts to describe a language as it is spoken by natives in order to tell non-natives how to speak it.

When thinking of grammar in the general, descriptive sense, remember that there is no absolute division between syntax, morphology, and phonology.  Even in the same language these so-called levels of language are not completely separate.

It is not always possible to separate phonology from syntax.  For instance, certain phonological rules depend on syntax.  Look at these examples from fast speech: What are you doing? where are = an auxiliary verb, becomes Whacha doin?  But What are you? where are = the main verb of the predicate, can't be run together as *Whacha?  Similarly, I'm going to work now (in the sense of I am planning to work now)--> I'm gonna work now.  But I'm going to work now, in the sense of setting out for work, can't be contracted.  The phonetic environment is the same, but the syntactic class the words belong to affects which of them can and cannot be contracted.

Morphology and syntax also interact, as we have seen.  Compound words are part of morphology, yet they are dependent on syntactic parameters, as well.  Compound words or adj/noun combinations that act as single words can express different syntactic functions.  One must understand these underlying syntactic functions to understand the meaning of the words: magnifying glass, falling star vs. looking glass, laughing gas.

The difficulty of completely separating morphology, syntax, and phonology is especially evident when comparing different languages.  What in one language is a part of syntax in another language will be a part of morphology, a fact particularly evident when comparing analytic languages like Chinese to polysynthetic languages like Eskimo.