Is language learned or created? How communication shapes human languages

Primary Reference: Smith, K., & Culbertson, J. (2025). Communicative pressures shape language during communication (not learning): Evidence from case-marking in artificial languages. Cognition, 263, 106164.

Language is one of the most intricate tools humanity has ever developed—simultaneously instinctive and endlessly complex. We begin learning it before we can walk, rely on it every day to make sense of the world, and now, remarkably, we’re teaching machines to use it too. Nevertheless, there is a great deal that we still do not understand about it. A universe of questions remains about our seemingly innate capacity for language, the structure of the world’s many tongues, and how the two relate. To glean insights into these mysteries, cognitive scientists—especially linguists—often look across languages to see what properties they share, and what properties they differ in. One such property in which surface-level differences have revealed deep similarities is grammatical case—how we encode the subject and object of a sentence.

Atypical Objects and Differential Case Marking

Even if you’ve never heard the term grammatical case, you’ve likely encountered it when talking about your pronouns. When we refer to our pronouns as “she/her/hers” or “they/them/theirs”, we are presenting three cases of each pronoun: the subject (nominative), the object (accusative), and the possessive, respectively. In English, these case distinctions show up mostly in pronouns. But in other languages, case is marked across a wider range of words, often using prefixes, suffixes, or changes to accompanying words such as articles.

For example, consider the German sentence, “Den Radfahrer mied das Auto.” A word-for-word translation would be “The cyclist avoided the car”, but that isn’t what it means. The sentence actually means “The car avoided the cyclist.” Why? Because den is the accusative form of the masculine article, which marks “the cyclist” as the object and leaves “the car” (das Auto) as the subject. German uses case marking to identify grammatical roles, allowing it to vary word order without causing confusion. By contrast, English relies more on word order (typically Subject-Verb-Object, or SVO). So, while German can say either “Den Radfahrer mied das Auto” or “Das Auto mied den Radfahrer” with the same meaning, English can’t swap “the cyclist” and “the car” without changing the meaning.

Many natural languages can vary word order in this way because they have case marking. In some of these languages, case markers are available but not always required. Such languages are said to exhibit Differential Case Marking (DCM): a pattern in which case markers are used in some sentences but not others, depending on what needs to be communicated.

In languages that exhibit DCM, researchers have found that speakers tend to mark “atypical” sentences (i.e., ones that might otherwise be confusing) but leave “typical” sentences unmarked. An example of an atypical sentence is one in which an inanimate entity like “the car” plays the role of the subject and acts on an animate being like “the cyclist.” That’s not what we usually expect, so languages with DCM often mark these unexpected roles to make the sentence clearer.

There is some debate regarding the fundamental mechanisms that give rise to this asymmetry. Does it come from how we learn language, or from how we use it during communication? Either way, many researchers agree that case markers are often omitted in typical, easy-to-interpret sentences and added when things get more complex, perhaps in the name of efficiency.
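
The marking tendency described in this section can be sketched as a toy rule. This is only an illustration under simplifying assumptions, not a model from any of the cited studies: the names `Noun`, `is_typical`, and `render`, the binary animacy feature, and the marker "-ka" are all hypothetical, and the sketch treats only "animate subject acting on inanimate object" as typical. Real DCM systems are graded and sensitive to more than animacy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Noun:
    """A noun with a single, simplified animacy feature (an assumption)."""
    word: str
    animate: bool


def is_typical(subject: Noun, obj: Noun) -> bool:
    # Assumption: the only "typical" configuration is an animate subject
    # acting on an inanimate object.
    return subject.animate and not obj.animate


def render(subject: Noun, verb: str, obj: Noun, marker: str = "-ka") -> str:
    # Toy DCM rule: leave the object unmarked in typical sentences and
    # attach the (hypothetical) case marker in atypical ones.
    obj_form = obj.word if is_typical(subject, obj) else obj.word + marker
    return f"{subject.word} {verb} {obj_form}"


cyclist = Noun("cyclist", animate=True)
car = Noun("car", animate=False)

print(render(cyclist, "avoided", car))  # cyclist avoided car
print(render(car, "avoided", cyclist))  # car avoided cyclist-ka
```

In this sketch the marker appears only where the roles run against expectation, which is the sense in which omitting it elsewhere can be seen as efficient.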

Learning vs. Communication

To untangle this, researchers Fedzechkina and colleagues (2012) conducted an artificial language experiment. They created a language and trained participants to use it across four days. Their goal was to see whether learners prefer communicatively efficient patterns even when they’re just learning the rules, not using the language for real communication. 

Their results were surprising: participants tended to preserve patterns that favored efficiency, even though they weren’t explicitly told to do so. This suggests that the bias toward communicative clarity might be baked into the learning process itself, rather than something that arises only when we use language to communicate.

But this view isn’t universally accepted. Some scholars argue that language learning favors simplicity, even if it makes communication harder at times. In contrast to the results of Fedzechkina et al. (2012), more recent research from Kirby et al. (2015) suggests that linguistic features such as DCM, which improve communicative efficiency but add complexity, are likely to emerge only once learners start using the language to communicate with others.

Example tasks from the experiments in Smith and Culbertson (2025). During training trials (a–b), participants learned an artificial language. Those who met proficiency criteria advanced to comprehension trials (c–e), which tested their ability to identify sentence subjects and objects. In sentence production trials (f), participants produced sentences, which were used to assess their use of case marking in both non-communicative and communicative contexts. Finally, in the matching trial (g), participants acted as listeners, selecting which of two similar options the speaker intended to communicate, based on case-marking cues.

A New Experiment

So, which is it? Is efficiency baked into the way we learn languages? Or do we prefer to learn simpler structures that become more complex as we begin to use them interactively? To resolve these seemingly contradictory findings, Smith and Culbertson (2025) attempted to replicate the results of Fedzechkina et al. (2012). Rather than replicating the experiment exactly, however, they also aimed to address some shortcomings they had noticed in its methodology, chiefly hidden biases in the examples used to teach participants the artificial language. Their revised design also used a substantially larger sample (N = 341, compared to the original N = 20) and added a communicative task to test the competing prediction that DCM-like patterns emerge during communication rather than learning.

In their artificial language task, Smith and Culbertson (2025) found that, contrary to Fedzechkina et al. (2012), participants who had just learned the language did not reproduce the relationship between animacy and case marking that is most often seen in real languages with DCM. However, participants did begin to reproduce this relationship when they were tasked with communicating with a fictional character using the artificial language. These findings suggest that communicative efficiency is not baked into language learning. Instead, we prefer simpler rules and structures when we learn a language. The fact that so many of the world’s languages exhibit these kinds of complex, efficiency-increasing structures, however, suggests that communicative context may play a critical role in shaping language structure, even in early stages of language use.

Results from the communication experiment in Smith and Culbertson (2025). Across all four experimental conditions, participants were more likely to reproduce the asymmetry in marking animate versus inanimate objects that is characteristic of Differential Case Marking (DCM) during interaction (i.e., communication) tasks than during recall (i.e., learning) tasks.

Why It Matters

The findings from Smith and Culbertson (2025) provide strong empirical support for the hypothesis advanced by Kirby et al. (2015): that communicative pressures, rather than learning alone, play a central role in shaping language structure. This has important implications for our understanding of the relationship between language and cognition. Specifically, the results bolster the view that language evolves to balance two competing demands: learnability and communicative efficiency. By showing that structural features like Differential Case Marking emerge more readily through use than through passive learning, this research challenges accounts that attribute language structure primarily to innate constraints. Instead, it adds to a growing body of evidence suggesting that the deep cross-linguistic similarities we observe arise not from hardwired linguistic templates, but from shared patterns in how humans use language to communicate.


Additional References:

Fedzechkina, M., Jaeger, T. F., & Newport, E. L. (2012). Language learners restructure their input to facilitate efficient communication. Proceedings of the National Academy of Sciences, 109(44), 17897–17902. https://doi.org/10.1073/pnas.1215776109

Kirby, S., Tamariz, M., Cornish, H., & Smith, K. (2015). Compression and communication in the cultural evolution of linguistic structure. Cognition, 141, 87–102. https://doi.org/10.1016/j.cognition.2015.03.016

Smith, K., & Culbertson, J. (2025). Communicative pressures shape language during communication (not learning): Evidence from case-marking in artificial languages. Cognition, 263, 106164. https://doi.org/10.1016/j.cognition.2025.106164