Universal Grammar

I have recently read an article in Newsweek concerning the evolution of language and the so-called Universal Grammar. Long story short, the text is about the ways in which languages evolve. After all, we know there are thousands of languages spoken all over the world today (the exact estimate for the year 2009 made by Ethnologue is 6,909; see an article by the Linguistic Society of America for more information), and many of them are related to each other, forming numerous language ‘families’. How did we get this diversity? Of course, many factors influence the spread and differentiation of languages. Geography is one of them (the distance between two or more speech communities, the type of terrain on which these communities reside, the type and frequency of contact, etc.). Social factors also have a say in how languages evolve. As for internal factors, the very way in which a language system is organised can differ from speaker to speaker, and well-known, learned grammatical structures can gradually acquire new properties.

The typical assumption about language change over time and space is that we first invent new words, which spread to other members of our linguistic community and (possibly) also to the neighbouring communities. Grammar, i.e. sentence structure and morphological relations between words, then follows, or adjusts to the newly created words and phrases. Hence the lexicon changes (or expands) first. This can also include semantic changes, i.e. changes in the meaning or interpretation of the same lexical item. Take, for instance, the word ‘trigger’, which literally means the part of a gun that, when pulled, makes the gun fire. Nowadays the meaning of the word ‘trigger’ is extended to many other abstract or metaphorical contexts. Thus, an emotion, attitude or action of any kind can be triggered by a wide array of things, and the word itself is used either as a noun or as a verb. Many taboo words also appear or become mitigated as a result of semantic changes over time. From a slightly different perspective, one may think that pronunciation is what changes first in language evolution. Small differences in how we produce consonants and/or vowels may lead to changes in whole words, and even to the creation of new lexical items. Typically then, no one would assume that native speakers simply come up with a new grammatical rule and start applying it suddenly at some point in time. Thus, grammar does not seem to be the leader of language change… Although I must say that grammar-led change is not at all untenable, even if such a turn of events is a bit counterintuitive. We can imagine that a speaker omits a preposition or changes it slightly, or omits a plural marker (which could be attributed both to morphology and to phonology) in continuous speech. They might also change the order of some words for convenience or repeat certain structures for some reason. This might then be replicated by others, and voilà – we have language change!

According to a study* conducted on 81 Austronesian languages, reported in the Proceedings of the National Academy of Sciences, and contrary to expectations, grammar changes faster than the lexicon. It is also more affected by neighbouring languages, whereas the lexicon is more likely to differ when a new language is formed. As the study was based on languages from the same region, which have certain similarities, the researchers were able to model change on a timescale. Their findings suggest that language contact affects syntax more than the lexicon, and that different aspects of language evolve at differing speeds. The fact that grammatical changes are faster than vocabulary changes is somewhat surprising. As argued by a linguistics professor from Stony Brook University, however, the study was conducted on Pacific languages spoken by small communities. These communities have many features in common: social structure and relations, living conditions, climate and geography. All of these affect the extent of the vocabulary people use, the notions they employ and the objects they denote, as well as the type of interactions they engage in that require specific communicative tools. Under such conditions it is only natural that the vocabulary does not change much over time, while the means of expression do.

Interestingly, the study was linked to one of the major hypotheses about language used in research today: Universal Grammar (following Chomsky 1975). The simplified definition of this notion is that grammar is innate, which means that we are born with an instinctive knowledge of basic grammatical rules. This makes it easier for us to acquire our native language: we simply apply the rules of Universal Grammar to the structures we hear and then reproduce them, and further learn to produce novel sentences and word combinations that have never been produced before. In this sense, if grammar is so susceptible to change, unlike the lexicon, the theory of Universal Grammar is not completely right. But, as I said, the above definition is a simplified one. At this point, following decades of linguistic investigations and insights based on numerous studies on both infants and adults, I doubt there are many scholars who simply think that we are born with a grammar. Rather, we should treat it as a general framework governing complex brain computations, or even more simply as a metaphor.

The hypothesis of Universal Grammar (abbreviated as UG) is based on the assumption that some basic language properties are shared across all languages. Given these structural similarities, we can assume a common body of knowledge or scheme we are all born with. This is the principle of universality, which, unfortunately, is not completely borne out, as many pieces of evidence point to diversity rather than sameness. As a result, the number of features shared by all languages is reduced to quite a small set. For instance, all languages have vowels and consonants (but again, I have recently heard of a language that is completely devoid of vowels… which, if true, makes it extremely difficult to argue in favour of universality).

The main problem with UG is that it is fairly vague and there is not much agreement on the extent or limits of this notion. Also, there is not much evidence in support of the strong interpretation of the term (i.e. that we are endowed with a grammar at birth, before any exposure to language). The existence of Creole languages is often mentioned in support of UG. Creoles are natural languages which evolved from pidgins, i.e. simplified systems of communication created for the purposes of trade and business relations in distant places. Although pidgins are usually mixtures of language structures from two or more different cultures, children born to parents using these systems develop full-fledged natural languages based on a grammatically impoverished input. The fact that the outcome has all the essential properties of natural languages is said to confirm the existence of a universal grammar. The same can be said of deaf children whose hearing parents learn how to sign in order to ensure everyday communication. Such children usually develop a sign language with all the necessary structural and grammatical properties. This new communication system has nothing to do with a spoken language expressed with the use of signs (a signed language). See, for instance, the famous case of the emergence of Nicaraguan Sign Language.

Another important argument used in support of the UG hypothesis is the so-called ‘poverty of the stimulus’, which means that without UG children would be unable to learn all the complex linguistic structures because they are exposed to scarce input. In other words, if something is (apparently) unlearnable, i.e. cannot be deduced from the input, then it must form part of UG – it is innate. This is indeed a difficult argument to refute, especially when we think about the simplified motherese that caretakers tend to use when addressing babies. Complex adult structures are simply not there during the first stages of language development. The arguments against this central tenet of UG are typically limited to logic. Still, we might think that assuming that everything has to be explicitly provided in the input for us to be able to learn it is a bit narrow-minded. Our brain mechanisms are sophisticated enough to acquire and deduce complex structures. We learn by trial and error – if something is not in the input and we produce it, we will find out sooner or later that it is ungrammatical and learn the new rule. Besides, many studies now show that human learning mechanisms are to a large extent probabilistic, hence ‘poverty of the stimulus’ does not necessarily mean that Universal Grammar does the work. Our brain circuits do it.

In an interesting paper on the ‘suspect’ nature of UG, Dąbrowska (2015) lists arguments typically used to support Chomsky’s hypothesis. I decided to enumerate them here for convenience, each followed by a summarised discussion of why it is not necessarily convincing. As the discussion below is quite lengthy, you can skip directly to the conclusions (“So what is Universal Grammar?”) if you’re not interested in such details 🙂

1. Language Universals: (All) human languages share certain properties.

This is the principal tenet of UG and at the same time a very controversial issue. On the one hand, certain properties are strikingly similar across languages: the existence of the same or similar grammatical categories, the fact that sentences are typically built from the same basic elements (subject, verb, object), semantic categories, and phonetic and phonological units such as words, syllables and segments. The same goes for certain features (e.g. the division into nasal and oral sounds, consonantal and vocalic, lateral vs. central, etc.) and natural classes of sounds (sound groups that share common properties and behave in the same way in certain contexts). It is typically assumed that the consonant-vowel syllable is universal (languages that have other types of syllables always have this type as well) and that the presence of voiced stops entails the presence of voiceless ones, but not the other way around… There are numerous pieces of converging evidence showing that the world’s languages are in some ways alike. On the other hand, however, language diversity is enormous and some languages seem to have nothing to do with others in many respects. There are tone languages (Chinese, Vietnamese), pitch-accent languages (Swedish) and stress languages (English, Spanish), languages that have only three tenses vs. ten or more, languages with and without cases, articles or prepositions. There are languages with and without diphthongs, those that have 3 vowels (like Quechua) and those that have 16 (like French). Based on such data, how can we say that language is universal? I will not try to resolve this issue. I can only say that UG assumes that only some basic elements (features) and/or parameters are innate; the rest is subject to cross-linguistic variation. After all, we also have to account for the fact that completely unrelated languages from two different parts of the globe tend to converge on certain basic properties…

2. Convergence: Children are exposed to different input yet converge on the same grammar.

According to some more recent studies, there are substantial individual differences between the grammars and the metalinguistic knowledge of adult speakers from the same community (Dąbrowska 2012). What is more, there are significant differences between competence (our implicit, internal knowledge) and performance (the way we use this knowledge in communication). So, the ‘same’ grammar may only be an illusion. Individual grammars may be very similar but not exactly alike. In the more abstract, theoretical sense, slight individual differences can occur given that learning and language use depend on many social, environmental and cognitive factors. Nevertheless, the end result is always some approximate version of the grammar, i.e. the same grammar of a given language acquired thanks to our shared innate mechanisms. Such a grammar, in the abstract sense, can show different structural preferences, lexical choices, phonetic outputs, but is virtually the same across the members of a given speech community. At least that is how I see the second argument in favour of UG.

From another perspective, why would children raised in the same culture, with the same linguistic background, learn different grammars? After all, they are exposed to the same grammar, as used by their caretakers. Moreover, individual differences based on poor input or on social status are typically levelled out in a school setting or even earlier, during interactions with other children and adults and through contact with standardised forms (school, television). Thus, the convergence argument does not seem so strong to me. Assuming deductive learning based on input and correction on the part of the caretaker, we can easily get the ‘rules’ of a language with sufficient exposure and consolidate these ‘rules’ through practice. At the end of the process, we have acquired the same (or very similar) rules as our neighbours. This is largely because the broad principles of a language are quite rigid – there is not much room for deviation (individual features). Personal speech markers tend to predominate in the realms of lexicon and phonology (different choice of words, special intonation, typical expressions, particular pronunciations). Conversely, we might ask why this is so… Because of UG? The reasoning can go both ways.

3. Poverty of the Stimulus: Children acquire knowledge for which there is no evidence in the input.

In this case, Dąbrowska’s arguments are limited to logic-based ones, which I find viable, but not entirely convincing. In line with my interpretation of this premise above, I find ‘poverty of the stimulus’ difficult to refute per se, although the existence of this problem is not a conclusive argument for UG.

4. No Negative Evidence: Children know which structures are ungrammatical and do not acquire over-general grammars in spite of the fact that they are not exposed to negative evidence.

Although there is no direct negative evidence, children are corrected by adults (although not reliably) and, when they are not corrected, they are asked for clarification, which (when repeated) children interpret as correction on the part of the parent. In such circumstances learners conclude that their productions are incorrect. Otherwise, they probabilistically distinguish significant gaps, i.e. structures that are systematically absent from the input, from accidental ones (Scholz & Pullum 2002). Besides, one should not underestimate learning from experience (the real world) as one of the main paths of acquisition in children. Also, kids tend to overgeneralise, which means that they treat a regular pattern as one that applies everywhere. Only then do they learn the exceptions (e.g. they first use the -ed ending for all verbs and only later restrict it to regular ones). The latter issue is an interesting one from the point of view of phonological theory construction. Typically, there is a lot of pressure to construct theories that do not overgeneralise, but this constraint seems unnatural in learning. As this issue is not directly related to the UG topic, I will resist the temptation of pursuing it further 🙂
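To make the probabilistic point a bit more concrete, here is a minimal sketch of how an absence can become informative. It is my own toy illustration, not the model discussed by Scholz & Pullum (2002). The function name `prob_absent_by_chance` and the numbers are hypothetical, and the sketch assumes, unrealistically, that utterances are independent and that a grammatical structure would turn up at a fixed rate. Under those assumptions, the chance of never hearing the structure in n utterances is (1 − rate)^n, which shrinks quickly as the input grows.

```python
def prob_absent_by_chance(expected_rate: float, n_utterances: int) -> float:
    """Probability of hearing a structure zero times in n_utterances,
    assuming it is grammatical and occurs independently at expected_rate.

    Toy assumption: every utterance is an independent draw.
    """
    if not 0.0 <= expected_rate <= 1.0:
        raise ValueError("expected_rate must be between 0 and 1")
    if n_utterances < 0:
        raise ValueError("n_utterances must be non-negative")
    return (1.0 - expected_rate) ** n_utterances


if __name__ == "__main__":
    # Hypothetical rate: the structure would show up in 1% of utterances.
    for n in (10, 100, 1000):
        print(n, prob_absent_by_chance(0.01, n))
```

The more input a learner has heard without ever encountering a structure, the less plausible it is that the gap is accidental, so the absence itself starts to work as indirect negative evidence.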

5. Species Specificity: We are the only species that has language.

Yes, we are the only species that has language in the specialised sense that we usually think of, but this is an argument neither for nor against UG. Also, it is worth noting that other species do have language, although not in the sense of grammar and vocabulary-based systems. Language as a tool of communication is not uncommon in the non-human world; our brains have evolved to specialise in the computation and use of a perhaps more sophisticated system, but this does not mean that bees, ants or apes have no language or that their interaction systems are not innate. It is also interesting to note that in our sci-fi projections of how other, more evolved species might communicate, language is not necessarily speech- or even sign-based… Communication at the level of neural networks is often imagined – a system not unlike the way bacteria have been reported to communicate (Prindle et al. 2015). I would be wary of any narrow(-minded) definitions of language, regardless of the context. Let’s just say that speech characterised by high phonetic complexity combined with high conceptual and structural complexity is the domain of the human species. As our social needs outgrew those of our closest relatives (apes) we had to develop a more sophisticated, fine-grained communication system. Whether this is now imprinted somehow in our genes is another matter and not an easy question to answer. For sure, our capacity to acquire and develop languages is a result of our current genetic makeup. All the machinery is there. But does it include a ready-made grammar? Or a grammar-building model? A framework… guideline… blueprint?

6. Ease and Speed of Child Language Acquisition: Children learn language quickly and effortlessly, on minimal exposure.

Children are very adept at learning a language, but they do need a lot of (good quality) exposure on a daily basis, from birth or even earlier up to roughly five years of age. Also, research suggests that mere exposure (watching TV, listening to others, etc.) is not enough. Interaction is crucial for adequate language acquisition – children have to practise and learn from their mistakes in real-life communication.

7. Uniformity: All children acquiring language go through the same stages in the same order.

It is definitely true that certain stages in language acquisition are ‘kind of universal’, but they result from a number of perception and production constraints. For instance, parents typically use simplified language which includes short phrases and words of one or two syllables. This helps the child recognise words as units. In longer sentences, children have to identify a set of cues related to rhythm, melody and stress to be able to segment speech. Prominent units are recognised more easily, and more frequent structures are acquired faster. From the point of view of production, the speech apparatus has certain limitations. At birth, children are unable to speak as their larynx is not in the appropriate position. When the larynx lowers and makes space in the supralaryngeal cavities, babies are finally able to utter sounds other than cries. They start to ‘try out’ their speech apparatus. The easiest sounds come out first. The posterior consonant [g] is one of them. Otherwise, sounds produced by the lips are definitely easier to pronounce and do not require as much precision as the movement of the tongue in e.g. [d] or [t]. No wonder then that a child typically calls her Mom first!

It is also worth mentioning that there are multiple differences in the acquisition of specific sounds, word structures and grammatical features across languages, depending on the individual characteristics of the native tongue and on the frequency of use of certain patterns (e.g. passive structures, regular/irregular past tense, sound clusters, lexical stress). Furthermore, children differ both in their learning speeds and styles. They can acquire the same language at different paces and in different ways depending on the type and quality of input (speech they are exposed to), social and family circumstances, their individual preferences, number of interactions with other adults vs. children, etc. They can also differ from one another in terms of comprehension vs. expression – some children will be more expressive from the beginning, others will be more hesitant when speaking and focus more on understanding and internalising the utterances of others.

8. Maturational Effects: Language acquisition is very sensitive to maturational factors and relatively insensitive to environmental factors.

There are certain biological limitations to language acquisition. This assumption is based on the so-called ‘window of opportunity’ or ‘critical period hypothesis’ (Lenneberg 1967, Krashen 1976) according to which brain plasticity changes with time and after a certain period (usually around 6 years of age, although some studies indicate 10 years as the absolute upper threshold) it is difficult or even impossible to learn linguistic structures in a native-like fashion (see also Morgan 2014). The second part of Chomskian argumentation, however, goes off course. As duly noted by Dąbrowska (2015), environmental factors are not unimportant in language acquisition. On the contrary – they can influence the process. Copious research has been done on the role of input quality and amount of information a learning child is exposed to. It has been reported, for instance, that children of parents who communicate in a language that is not native to them can fail to acquire a native level of grammatical structure. Also, quite obviously, the process of language acquisition is affected by the change of environment (including linguistic environment, e.g. country or region). Moreover, children of Chinese or Spanish parents living in England won’t just miraculously acquire English until they interact with English speakers (in kindergarten or at school).

In argument 8, the claim is that the time-locked capacity of children to learn a native language is strictly related to access to Universal Grammar. This somehow suggests that after the critical period it is difficult or impossible to use our innate capacity to process linguistic structures. Such a situation, however, would be too much of a limitation. Why should we have universal language competence given at birth if we are to be stripped of it after the first years of our lives? Neurobiological evidence is definitely more convincing. Given that our brains are still developing after birth, our neural network plasticity changes with time. Once neural connections are well established, those that are no longer needed after the initial fast development phase are simply abandoned (a process known as synaptic pruning**). Thus, learning abilities decrease with age. Moreover, as time goes by, the structure of the neural networks responsible for learning also changes, and we tend to rely more on declarative (explicit) rather than implicit memory, hence the way we acquire and process new pieces of information is different (Ullman 2006, Newport 1990).

9. Dissociations between Language and Cognition: Some clinical populations have (relatively) normal language and impaired cognition; some have impaired cognition and (relatively) normal language.

Evidence for the dissociation between linguistic functions and general cognition in various types of disorders (e.g. aphasia, SLI – Specific Language Impairment, Williams syndrome, etc.) is taken to point to the existence of a ‘language module’ or ‘language instinct’, as Steven Pinker called it (1994). Nevertheless, this dissociation should be considered partial, and particular impairments should be treated as predominantly linguistic or predominantly cognitive rather than exclusively so (see Dąbrowska’s examples). Thus, evidence for an absolutely separate language function is inconclusive. This is in line with the current knowledge of how the brain works: there are no definitive areas of specialisation that would be dedicated to one type of processing only, be it linguistic or not. Rather, language processing is tightly related to a series of other cognitive processes, both sensory (auditory and visual perception, for instance) and higher-order (combining sound and structure with meaning, semantic and pragmatic processing, decision-making, memory, etc.).

10. Neurological Separation: Different brain circuits are responsible for representing/processing linguistic and non-linguistic information.

This idea is based on outdated knowledge about the brain. It is true that certain areas are predominant in language processing, e.g. the so-called Broca’s area and Wernicke’s area, but a) other brain areas also take part in linguistic processing, and b) these areas also take part in non-linguistic processing, which makes their ‘specialisation’ rather dubious, or at least exaggerated. Also, the location of these areas is not firmly predetermined at birth. First, around 90-95% of right-handed people have their ‘language faculties’ located in the left hemisphere, while around 20-30% of left-handed people have them located in the right hemisphere (Corballis 2014, Mazoyer et al. 2014). Other people may have a more scattered distribution of language functions in the brain. Second, the incredible plasticity of our brains makes it possible for one area to take over the functions of a different (e.g. neighbouring) area if needed, for instance after a brain injury or a stroke. Thus, one ‘magic’ location is not the correct interpretation of UG or of a ‘language instinct’.

So what is Universal Grammar?

According to Nevins et al. (2009),

“The term Universal Grammar (UG), in its modern usage, was introduced as a name for the collection of factors that underlie the uniquely human capacity for language”.

All true. UG is therefore a general philosophical framework based on the observation that certain features and structures are prevalent across different languages and might even be universal. Also, learning patterns and linguistic behaviour converge to some extent. Human brains are endowed with a unique capacity for detecting and interpreting incoming data from the speech signal (as well as other modalities), and for learning from them despite the limited time and the restricted contexts in which these data are presented. Although not all the possible combinations of sounds, words and grammatical structures are presented to us in the first years of life, we are able to process them in a systematic way, create meaningful categories and extract rules and constraints as to what is correct or not. We do not only use the different tokens and types of units we acquire in predictable ways; we also creatively reconnect and recombine them, and produce novel structures based on our everyday experience, needs and wants. This is facilitated by the neural capacities of our brains, by the reasoning and other cognitive processing skills we evolved in the course of our long life on Earth. We have the necessary hardware and the genetic code needed to operate it, but I seriously doubt that we have a specific program installed via the umbilical cord in the womb. I prefer to see Universal Grammar as an underlying evolutionary mechanism, one of many that make us who we are.

I will conclude with a quote from another popular article commenting on the study cited at the beginning of this post:

“The ‘myth’ of language history: languages do not share a single history but different components evolve along different trajectories and at different rates”.

As we can see, the same study can lead to different interpretations, just as the same set of languages can lead to different theories about them. Who said that all aspects of language change to the same degree or in the same way in a given time period? As a linguist I feel obliged to deny that this is the status quo in our research programmes… Quite the contrary is true, so the above conclusion is no novelty. And the hypothesis that languages share a common predecessor (mother language) does not mean that they share a single history. With that in mind, I remain sceptical.

* The description comes from the Newsweek article mentioned at the beginning of this post, but note that this is only an interpretation of the original paper (Greenhill et al. 2017). It is definitely worth reading the original study to avoid misinterpretation of the actual findings.

** Synaptic pruning – a process forming part of brain development. It is estimated that the size of the human brain increases five-fold from birth to adulthood. In early brain development, millions of synaptic connections are created and then eliminated by the onset of puberty. The process is said to be connected with learning (see e.g. Chechik et al. 1998, Low & Cheng 2006, Abitz et al. 2007).


Abitz, Damgaard; et al. (2007). Excess of neurons in the human newborn mediodorsal thalamus compared with that of the adult. Cerebral Cortex 17 (11): 2573–2578.

Chechik, G; Meilijson, I; Ruppin, E (1998). Synaptic pruning in development: a computational account. Neural computation 10 (7): 1759–77.

Chomsky, Noam (1975). Reflections on Language. New York, NY: Pantheon.

Chomsky, Noam (2000). The Architecture of Language. New Delhi: Oxford University Press.

Corballis, Michael (2014). Left Brain, Right Brain: Facts and Fantasies. PLoS Biology 12(1): e1001767.

Dąbrowska, Ewa (2012). Different speakers, different grammars: individual differences in native language attainment. Linguistic Approaches to Bilingualism 2: 219–253.

Dąbrowska, Ewa. (2015). What exactly is Universal Grammar, and has anyone seen it? Frontiers in Psychology 6: e852.

Greenhill, Simon J.; Wu, Chieh-Hsi; Hua, Xia; Dunn, Michael; Levinson, Stephen C.; Gray, Russell D. (2017). Evolutionary dynamics of language systems. PNAS 114(42): E8822–E8829.

Krashen, S. D. (1976). The critical period of language acquisition and its possible bases. In: Aronson, D.R., & Rieber, R.W. (eds.), Developmental Psycholinguistics and Communication Disorders. New York.

Lenneberg, Eric H. (1967). Biological Foundations of Language. New York: John Wiley & Sons, Inc.

Low, L.K.; Cheng, H.J. (2006). Axon pruning: an essential step underlying the developmental plasticity of neuronal connections. Philos Trans R Soc Lond B Biol Sci. 361: 1531–1544.

Mazoyer, Bernard; Zago, Laure; Jobard, Gaël; Crivello, Fabrice; Joliot, Marc; Perchey, Guy; Mellet, Emmanuel; Petit, Laurent; Tzourio-Mazoyer, Nathalie (2014). Gaussian Mixture Modeling of Hemispheric Lateralization for Language in a Large Sample of Healthy Individuals Balanced for Handedness. PLoS ONE 9 (6): e101165. DOI: 10.1371/journal.pone.0101165.

Morgan, G. (2014). Critical Period in Language Development. In P. Brookes & V. Kempe (eds), Encyclopedia of Language Development. Sage press.

Nevins A., Pesetsky D., Rodrigues C. (2009). Pirahã exceptionality: a reassessment. Language 85: 355–404.

Newport E. L. (1990). Maturational constraints on language learning. Cognitive Science 14: 11–28.

Pinker, Steven (1994). The Language Instinct. New York: Morrow.

Prindle, Arthur; Liu, Jintao; Asally, Munehiro; Ly, San; Garcia-Ojalvo, Jordi; Süel, Gürol (2015). Ion channels enable electrical communication in bacterial communities. Nature. DOI: 10.1038/nature15709.

Scholz B.C., Pullum G.K. (2002). Searching for arguments to support linguistic nativism. Linguistic Review 19: 185–223.

Ullman, M.T. (2006). The declarative/procedural model and the shallow structure hypothesis. Applied Psycholinguistics 27: 97–105.