Bruce Mannheim on Linguistic and Genetic Groupings: A Rebuttal

Influential linguist Bruce Mannheim commented on the likelihood of overlap between genetic and linguistic patterns of variation in connection with the recent resuscitation (entirely outside the circle of professional linguists) of Joseph Greenberg’s classification of American Indian languages.

“Let me add that I am skeptical when someone says that a biological genetic grouping corroborates a historical linguistic grouping or vice versa for a simple reason: genetic material and language are transmitted by different mechanisms (I’ll skip my usual joke about this), so in principle a one-to-one correspondence should be surprising.”

I couldn’t disagree more. And not just because languages and genes are, first, transmitted as part of natural social processes (unlike archaeological artifacts that are left behind by natives and collected by scholars) and, second, transmitted by different parts of the same body. Mannheim’s statement rings of the continuing dominance of synchronic linguistics over diachronic linguistics in the way the majority of linguists perceive linguistic material. Mannheim’s skepticism borders on denial of historical linguistics and linguistic classifications into language families altogether.

On a very fundamental level, similarities between languages can come from common descent, borrowing or convergence. Our use of such terms as “cognate set,” “parent language,” “linguistic kinship,” “language family” and “language family tree” reflects the original insight by William Jones, Franz Bopp, Jacob Grimm, August Schleicher and other pioneers of the comparative method in linguistics that languages show a certain pattern of similarity because their speakers are related. Of course, the Neogrammarians didn’t specify what this relationship between speakers means, as they didn’t have the privilege of access to population genetic data. But without this particular deictic reference from linguistic form to the speaker’s origin, which the Neogrammarians described as “linguistic kinship,” we would be left with only typological similarities between languages. (As a side note, in response to William Jones and others, the founder of American anthropology, Lewis H. Morgan, circumscribed a subset of typological similarities between languages, namely similarities found in the semantics of kinship terms, to build a phylogenetic classification of kinship systems as a complement to the genealogical classification of languages. By doing this, he added the dimension of historical continuity between social structures related to biological reproduction to the “relationships between speakers” inferred from phonology, morphology and grammar. Claude Lévi-Strauss then built on his method and showed that kinship systems vary in response to the underlying marriage rules. Now we are beginning to realize that these marriage rules affect genetic variation.)

The very existence of “genealogical,” or “phylogenetic,” kinship between languages implies that they derive from a single parent language or “proto-language.” This parent language was spoken by a population that shared a certain geographic territory. This language was different from other ancient languages, although it was related to some of them and it exchanged vocabulary with others. The population of speakers of that parent language had a certain molecular genetic composition that, on some level, set it apart from other populations. This population must have shared some genetic material with other populations either through common descent or through gene flow. Once a population migrates, it becomes isolated from its former neighbors in both the genetic and the linguistic sense, and isolation breeds differentiation in both senses.

What Mannheim has in mind are situations of language shift, or “extreme borrowing,” in which a population abandons its parent language(s) and adopts the language of another, usually politically and economically dominant, population. African Americans speak a dialect of American English, but their ancestors did not come from Europe. While true, this doesn’t imply that American English is unrelated to British English or that North America wasn’t colonized by a migration of English-speaking Europeans after 1492. African Americans shifted away from their West African linguistic heritage and adopted American English, just as they absorbed European genes. There’s perfect parallelism between linguistics and genetics in both vertically and horizontally transmitted patterns of variation. But it’s a matter of painstaking reconstruction, rather than of quick visual inspection, to arrive at a correct conclusion.

There are definitely cases in which real parallelism between genetics and linguistics is manipulated and abused to bolster a certain theory derived from scholars’ unique interpretation of either genetic or linguistic data. A case in point is Luca Cavalli-Sforza’s use of linguistic megalocomparison to support his global phylogenetic trees derived from allele frequencies. Or the continuing use by geneticists of Joseph Greenberg’s tripartite classification of New World languages into Eskimo-Aleut, Na-Dene and Amerind families. Or the attempt by Greenberg to defend his tripartite classification of New World languages by partnering with geneticist Stephen Zegura and odontologist Christy Turner. Or when Gcochran defends Greenberg by referencing genetic data that supposedly sets “Amerinds” apart from Eskimo-Aleuts and Na-Dene. But in reality all three Greenbergian “linguistic” divisions of New World variation carry the same “Amerindian” or “First American” (in Reich et al.’s terminology) molecular component, which implies that, when it comes to superset classifications, Greenberg erred in the splitting direction, not in the lumping one. In addition, he misidentified the linguistic markers that define the subgrouping among languages outside of Eskimo-Aleut and Na-Dene. In a word, Greenberg was wrong in all possible ways. (He was nevertheless one of the brightest minds of his generation and I’m happy to have known him during my years at Stanford.)

The reason all these attempts to map linguistics onto genetics and back failed is not that linguistic and genetic variation are incompatible due to entirely different transmission mechanisms. They failed because the respective models were flawed on internal linguistic or genetic grounds. Colin Renfrew (“Introduction: The Nostratic Hypothesis, Linguistic Macro-Families and Prehistoric Studies,” in The Nostratic Macrofamily and Linguistic Palaeontology. Cambridge, U.K., 1998, p. XVII) observes that the patterns of genetic and linguistic variation in the Americas are inherently similar, not because they reflect an underlying homogeneity (implied in Greenberg’s lumping of all languages outside of Na-Dene and Eskimo-Aleut into one single classificatory behemoth, Amerind) but because Amerindians are very fragmented on both genetic and linguistic levels. Linguistically, this heterogeneity manifests itself as the division of New World languages into some 140-150 phylogenetic stocks; genetically, it translates into the world’s highest Fst levels (a statistic that measures intergroup differentiation), found among American Indians.
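For readers unfamiliar with Fst, here is a minimal sketch of how the statistic can be computed in its simplest textbook form, (Ht - Hs) / Ht, for a single biallelic locus. It assumes equally sized subpopulations, and the function names and allele frequencies are hypothetical illustrations, not data from any study mentioned here. Published estimates use more elaborate estimators averaged over many loci.

```python
def heterozygosity(p):
    """Expected heterozygosity at a biallelic locus with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst(freqs):
    """Textbook Fst = (Ht - Hs) / Ht for one biallelic locus.

    freqs: allele frequencies of the same allele in each subpopulation.
    Assumption: all subpopulations are weighted equally.
    """
    if not freqs:
        raise ValueError("need at least one subpopulation")
    p_bar = sum(freqs) / len(freqs)
    h_total = heterozygosity(p_bar)  # Ht: heterozygosity of the pooled population
    if h_total == 0.0:
        return 0.0  # allele fixed or absent everywhere: no differentiation to measure
    h_sub = sum(heterozygosity(p) for p in freqs) / len(freqs)  # Hs: mean within groups
    return (h_total - h_sub) / h_total


# Hypothetical frequencies, for illustration only.
print(fst([0.5, 0.5]))            # identical groups: 0.0
print(fst([0.0, 1.0]))            # groups fixed for different alleles: 1.0
print(round(fst([0.2, 0.8]), 2))  # strongly differentiated groups: 0.36
```

The higher the value, the more of the total variation lies between groups rather than within them, which is the sense in which high Fst levels among American Indians signal fragmentation.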

Genetic and linguistic pictures overlap quite substantially across the globe, not just when it comes to population divergence but also when it comes to gene flow and borrowings (see on Pygmies and non-Pygmies here and on Indo-Europeans and North Caucasians here and in the comments section here). Likewise, the comparative method in linguistics continues to be successful in classifying the world’s languages. This doesn’t mean that we should rest on our laurels and interpret every genetic haplogroup as associated with one language and one language only, or every linguistic connection between disparate populations as evidence for their common descent. Instead, we should be applying the principles of the comparative method – the careful sorting of signals of horizontal and vertical transmission – to both linguistic and genetic material. An example of such methodological cross-pollination is the upcoming study by Joseph Pickrell et al. (2012) in which they factored admixture into a southern African phylogenetic tree and arrived at a new, non-basal position of Khoisans. As I wrote in my earlier post today,

“It appears that population geneticists are only now beginning to realize what has been a cornerstone of historical linguistics for more than a century. Effects of horizontal transmission (gene flow in the case of genetics and borrowing in the case of linguistics) as well as effects of homoplasy or convergence need to be sorted out prior to arriving at a realistic phylogeny.”

(via Gene Expression)