Popular and learnèd in Chinese dialects

Jerry Norman

doi:10.1017/S1356186324000427

Published online by Cambridge University Press: 22 April 2025

Abstract

This article classifies individual lexemes in Chinese dialects into four categories: popular, learnèd, colloquial, and literary. Popular and learnèd refer to the origins of a word: whether it has been transmitted orally or learned in an educational context. Colloquial and literary refer to usage. The traditional Chinese terms for distinguishing character readings, wén 文 and bái 白, literally ‘written’ and ‘spoken’, do not correspond neatly to the four categories that are proposed here. This article illustrates the differences between all six terms, mainly by using standard Mandarin and Běijīng dialect, and secondarily by using words from Mĭn and other dialects.

Keywords

Chinese linguistics; Chinese dialects; lexical use

Type: Article
Information: Journal of the Royal Asiatic Society, First View, pp. 1–13

DOI: https://doi.org/10.1017/S1356186324000427
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright: Copyright © David Prager Branner, 2025. Published by Cambridge University Press on behalf of The Royal Asiatic Society

In treating Chinese dialects, I have proposed a scheme in which lexemes are viewed as belonging to four categories: popular, learnèd, colloquial, and literary. A popular form is one that has been transmitted orally from the earliest times; a learnèd form is one that has been learned in an educational context. Colloquial and literary refer to usage: a colloquial form is one that is used in everyday speech; a literary form is borrowed from written languages and is recognised as such.

In the dialect of Běijīng, the words represented by the character 得 are dēi, dǎi, děi, and dé. Both dēi and dǎi are popular lexemes; they belong to the oldest core of the dialect and have been transmitted orally over many generations. Both are used in the sense of ‘to catch’; they seem to be local variants of the same word.Footnote ¹ Děi, another popular form, is found as a modal auxiliary meaning ‘must, have to’. In all literary contexts, the character 得 is read as dé. Though not a popular form, dé is nonetheless used colloquially in the sense of ‘to obtain’.
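
The two axes of this scheme, origin and usage, can be written down as a small data model. The following Python sketch is an illustration only: the class, field, and function names are assumptions introduced here, and the colloquial label given to the three popular forms is inferred from their description as part of the spoken core of the dialect rather than stated outright.

```python
from dataclasses import dataclass
from enum import Enum


class Origin(Enum):
    POPULAR = "popular"   # transmitted orally over many generations
    LEARNED = "learnèd"   # learned in an educational context


class Usage(Enum):
    COLLOQUIAL = "colloquial"  # used in everyday speech
    LITERARY = "literary"      # borrowed from written language and recognised as such


@dataclass(frozen=True)
class Lexeme:
    form: str      # romanised form
    graph: str     # character conventionally used to write it
    gloss: str
    origin: Origin
    usage: Usage


# The four Běijīng words written with 得, classified as in the paragraph above.
# Assumption: the popular forms are labelled colloquial by inference.
DE_WORDS = [
    Lexeme("dēi", "得", "to catch", Origin.POPULAR, Usage.COLLOQUIAL),
    Lexeme("dǎi", "得", "to catch", Origin.POPULAR, Usage.COLLOQUIAL),
    Lexeme("děi", "得", "must, have to", Origin.POPULAR, Usage.COLLOQUIAL),
    Lexeme("dé", "得", "to obtain", Origin.LEARNED, Usage.COLLOQUIAL),
]


def forms_by_origin(lexemes, origin):
    """Return the romanised forms whose origin matches the given category."""
    return [lx.form for lx in lexemes if lx.origin is origin]
```

Keeping origin and usage as separate fields makes it possible to record a form such as dé, which is learnèd in origin but colloquial in usage, without forcing it into a single wén or bái slot.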

Popular forms comprise the very heart of the spoken language and are generally known to all the speakers of a given local dialect. The overwhelming majority of Chinese characters, on the other hand, have only a learnèd or literary pronunciation.

A similar distinction is made in many dialect descriptions in which character readings are differentiated according to whether they are wén 文 or bái 白. This way of looking at things takes Chinese characters as the fundamental units of a Chinese dialect and does not correspond exactly to the distinctions made above. The wén/bái distinction can only be made by a literate person; yet, for most of Chinese history, only a small percentage of the population was literate.

Ever since Westerners first came into contact with the Chinese, they have been fascinated by the Chinese writing system, and with good reason. It must have seemed almost inconceivably exotic to them and, as most of these people were themselves literate in one or more Western languages (all of which were written in alphabets), they quite naturally directed much of their attention to the Chinese writing system. The introduction of modern linguistic methods into China in the early parts of the last century did not succeed in changing this way of viewing the Chinese language, that is, essentially as a large set of graphic forms, each of which was supplied with one or more ‘readings’ or pronunciations. It would not be too misleading to say that, in this era, the Chinese writing system became the Chinese language in the minds of most sinologists, at least in the sense that their attention was directed almost entirely to the written lexicon. Early dialect studies for the most part consisted of listing Chinese characters and telling how they were read in a given locality. The result was that such early dialect studies contained little information on the popular language. Early grammars and dictionaries by missionaries and foreign residents of China, by contrast, remain an invaluable source for the study of the popular lexicon of many Chinese dialects.

This situation has changed in recent years. Now, descriptions of Chinese dialects regularly contain sections of spoken vocabulary as well as sample sentences, texts, and information on grammar. The same studies, nonetheless, also contain substantial information on the readings of characters. A major contribution to the study of the popular and colloquial lexicon has been the publication of the Hànyŭ fāngyán cíhuì 漢語方言詞彙 by a group from Peking University.Footnote ²

The popular stratum of a dialect may not be very large. Centuries of influence from the literary language and regional koines have in many cases reduced the popular proportion of a given dialect to a small corpus of everyday spoken words, such as the pronouns and demonstratives, common body parts, basic motions and states, and some flora and fauna designations. The Běijīng dialect, now the basis of the official national language, illustrates the sort of situation that I have in mind. Here, for example, the characters that are used to write many popular forms do not exhibit a regular semantic or phonological correspondence pattern with the lexicographic tradition. Let us examine a few such forms.

wǒ 我 ‘first-person singular pronoun, I, me’. If this form were regular phonologically, then we would expect ě, which was in fact formerly the formal reading pronunciation but is now obsolete.Footnote ³ This form for the first person must then have been transmitted as a part of an oral, popular tradition that did not correspond to the native lexical tradition that is enshrined in rime books such as the Qièyùn 切韻 or the late Nánběi Cháo 南北朝 lexical inventory.

nǐ 你 ‘second-person singular pronoun, you’. Nǐ is commonly considered an irregular development from Early Chinese 爾.

tā 他 ‘third-person singular pronoun, he, she, him, her’. If regular, then we would expect 他 to be pronounced tuō, which in fact is an old, now obsolete pronunciation.Footnote ⁴

zhè 這 ‘this’. In this case, the graph that is used for the word is totally unrelated both phonologically and semantically. Zhè clearly forms a part of the popular lexicon.

nà 那 ‘that’. Here, the graph possesses only a tenuous and problematic relationship with the word. Historically, 那 appears first as an interrogative; furthermore, no qù 去 tone reading exists in early lexica. The more colloquial form nèi represents a fusion of an original *nè plus the numeral yī ‘one’; the main vowel of nè is no doubt influenced by the vowel of zhè ‘this’ by analogy.

Modern Standard Chinese (MSC) has two common negatives that occur before verbs and adjectives: bù 不, which negates intentional actions and adjectives, and méi 沒, which signifies that an action has not taken place or is not taking place. Neither of these negatives exhibits a regular connection to the characters with which they are written. From the medieval lexicographic tradition (e.g. as found in the Qièyùn), we would expect 不 to be pronounced fǒu and not bù. From comparative dialectology, it is clear that the modern pronunciation goes back to a rù-tone reading (cf. Yángzhōu 揚州 pəʔ⁷).Footnote ⁵ The Guǎngyùn 廣韻 rime dictionary lists 不 under all four tones but it is clear that the rù-tone reading is borrowed from fú 弗. Now, 弗 is thought to be a fusion of 不 and zhī 之, a third-person objective pronoun. It looks, then, as if modern bù should actually be associated with fú rather than its usual modern graph. But there is still a problem. Why did the initial of 不 not dentilabialise as one would expect for a word in its phonological placement in the medieval phonological system? Much has been written about this problem, but what it all boils down to is that 不 cannot descend directly from the readings for this character that we find in older rime dictionaries. Bernhard Karlgren actually created a reading *puət to account for the relevant modern forms, but there is no textual basis for this. It very much looks as though what we have here is a popular form for this negative that cannot be traced in any regular fashion to the native lexicographic tradition. We can actually simplify Karlgren's *puət to *put; forms going back to *put occur in almost all Mandarin dialects as well as in some Central dialects such as Xiāng 湘 and Gàn 贛. These facts are just one of several reasons why we should not assume that all modern dialect forms go back to some uniform medieval construct. The negative méi presents some similar problems. It also almost certainly goes back to a rù-tone word; compare it with Yángzhōu məʔ⁷. The most likely explanation for Běijīng méi is that it is based on the Classical Chinese negative wèi 未. But, here again, we encounter some problems. If my suggestion is valid, then we should have expected the initial m of 未 to dentilabialise. Moreover, 未 is not a rù-tone word, but a qù-tone word. Again, here, what we must have is an old fusion of 未 with 之, yielding a textually unattested *mut that later became *mət with a dissimilation of the vowel. In the Běijīng dialect, this negative occurs most often with a following yǒu 有 and this is probably what accounts for its final ending in an i. Again, what we see is a popular form that deviates from what we would expect from a strict application of the rules for converting a medieval reconstruction like that of Karlgren. Popular forms often require us to recognise developments that are outside the usual rule-based etymology of Karlgren and his adherents.

Another interesting word category in the Běijīng dialect comprises the prepositions. In actuality, almost all of the commonly used prepositions in MSC are borrowings from the learnèd stratum of the language. For zài 在 ‘in, at, etc.’ in the vernacular language of Běijīng, we find popular forms, most of which are etymologically obscure: dài, dǎi, āi, and gēn 跟.Footnote ⁶ Gēn, which also means ‘with’ and sometimes ‘from’, seems to have been a kind of general prepositional form like Classical yú 於. For cóng 從 ‘from, since’, we find in more vernacular language jiē, jiě, qiě, and dǎ 打; all of these forms are etymologically unclear.Footnote ⁷ For modern hé 和 ‘with, and’, we find gēn 跟 in the popular language; gēn is in origin a verb meaning ‘follow’. Hé itself is a relatively late word that first occurred in Sòng 宋 Dynasty texts; the older Classical word was yŭ 與. In modern written Chinese, both hé and yŭ are used, but they are clearly loans from the early vernacular language or from Classical Chinese. The modern instrumental preposition is normally yòng 用; the more popular form is ná 拿, itself a verb of relatively late provenance meaning ‘to take’. From these prepositional forms, we can see that modern written and spoken Standard Chinese mix forms of both popular and learnèd origins. Many such forms are ‘colloquial’ in that they are now common in the spoken language but etymologically represent intrusions from written forms of the language.

Numerous common verbs are also popular forms, often hidden under graphs to which they are etymologically quite unrelated. Below are a few such forms.

Zhàn 站 ‘to stand’. In this sense, zhàn is very late in the written record, occurring first in early vernacular texts of the Míng 明 and Qīng 清 Dynasties. The Classical word was lì 立, which still survives in a few dialects like Sūzhōu 蘇州. Throughout Southern China, we find variations of the word jì 徛.Footnote ⁸ The origin of zhàn in the sense of ‘to stand’ is obscure.

Chī 吃 ~ 喫 ‘to eat’. The first thing to observe is that neither of the graphs with which chī is written is etymologically correct. The graph 吃 properly means ‘to stutter’ and has nothing to do with ‘eat’, and phonologically it cannot be related to chī. The graph 喫 does mean ‘to eat’ but its medieval initial (kʰ-) is incompatible with the modern pronunciation. The origin of this popular word for ‘to eat’ is basically a mystery. It is virtually universal in Mandarin dialects and widespread in the dialects of Central China.

diào 掉 ‘fall’. The Classical word, which is still used in many southern dialects, was luò 落; this word survives as a popular form in Běijīng lào, which is lexically restricted to a few common expressions such as làozhěn 落枕 ‘to get a crick in the neck’; the common Běijīng word for ‘to fall’ is diào.Footnote ⁹ This meaning of the graph 掉 is not found in the medieval dictionaries. It is not until Táng 唐 times that we find the meaning ‘to throw down’. Hence, we can say that the use of 掉 to write diào ‘to fall’ is a sort of jiǎjiè 假借 ‘loan-graph’ usage.

The word diào ‘fall’ reminds one of another verb: diū 丟. Looked at historically, there should not be such a syllable in the modern Běijīng dialect. All of the words in which a dental (d or t) is followed by an i should go back to a so-called fourth-division rime. These rimes were characterised by a limited number of initial consonants that could co-occur with them; Karlgren, basing himself on an incorrect interpretation of Sino-Korean, believed that these fourth-division rimes had a fully vocalic medial i. Later linguists, including Luó Chángpéi 羅常培, who were basing themselves on Chinese transcriptions of Buddhist terms in medieval texts, rejected Karlgren's medial i and proposed that the fourth-division rimes all had e as their main vowel without a preceding medial. Karlgren's tieu became simply teu in this interpretation. A syllable like teu (Karlgren's tieu) regularly becomes diāo in modern Běijīng speech. An example of such a word is diāo 貂 ‘sable’ from a Medieval Chinese form teu (tieu). Now a sort of dilemma arises: what could the earlier origin of a syllable like diū ‘lose’ be? In fact, it cannot be placed within the conventional categories of Medieval Chinese based on the Qièyùn rime system. The Fāngyán diàochá zìbiǎo of 1955Footnote ¹⁰ provisionally places it in the rime yōu 幽 but the initial of diū cannot occur in this rime because, as Y. R. Chao showed back in the 1940s, yōu is a third-division-type rime.Footnote ¹¹ The upshot is that, from the point of view of conventional Chinese historical phonology, it is an impossible syllable, as there is no place for it in the traditional syllabary that is based on the Qièyùn. Furthermore, diū is a highly colloquial word; it never occurs in literary contexts and is restricted to spoken usage. Hence, diū must be considered a popular word with an unclear etymology.

ná 拿 ‘to take’. An earlier form of this graph was 拏, with which it is homophonous. Ná does not occur in Classical Chinese texts, in which a number of other forms were used. It appears, then, that ná is an early word of popular origin; it now occurs widely in Chinese dialects, including those of the Mandarin, Wú, Gàn, Hakka, and Xiāng groups.Footnote ¹²

gěi 給 ‘to give’. The regular reading pronunciation of this graph is jǐ. In early texts (in which it is still conventionally read jǐ), it means ‘abundant, plentiful’ or ‘to serve’ or ‘to work for’. Somewhat later, jǐ is attested in the meaning of ‘to provide’ and it is probably from this sense that the meaning ‘to give’ comes. However, the meaning here is not the chief problem. 給 is the only graph in MSC that is read gěi; moreover, it is difficult to understand how jǐ (from earlier *kip) could regularly evolve into gěi in the modern language. In general, velars before high front vowels like i and ü palatalise; jǐ then is the regularly expected development of this form. Related forms are pervasive in Mandarin dialects but none of the related forms shows a rù-tone pronunciation. Hence, it seems doubtful that 給 actually is the etymological source for modern gěi. It must clearly be identified as a popular form of uncertain origin.

dǎ 打 ‘to hit’. In the Guǎngyùn, there are two readings for this character: one would yield a modern Běijīng form zhěng and the other dǐng. Clearly, neither of these traditional readings can be the source of the modern dǎ. So where does the form dǎ come from? Examination of other words that are pronounced [ta] and [tʰa] shows that most of them come from entering-tone (rù) readings. Those that do not, such as 打 and 大, exhibit irregular behaviour when looked at from the point of view of the Qièyùn system. If in fact 打 goes back to an original rù-tone reading, then one strong possibility is that it should be linked to the graph 搭, which itself is a character of Nánběi Cháo vintage from a character meaning ‘hit’: 撃 ‘strike, hit’.

It is well known that the popular development of rù-tone words with voiceless initials in Medieval Chinese is either to the first or third tone in Běijīng speech. 搭, then, is a perfectly feasible source for modern Běijīng dǎ. At this point, we encounter two further problems. Forms like Běijīng dǎ are found throughout the Mandarin dialect group, always with a shàng-tone reading, even in dialects such as Tàiyuán, Héféi, and Yángzhōu, in which we might expect rù-tone forms.Footnote ¹³ This suggests that a majority of Mandarin dialects have borrowed this reading from a north-eastern dialect like Běijīng, where the tonal changes described above took place. Readings with nasal codas (agreeing with the Guǎngyùn reading) are also found: Sūzhōu taŋ³ and Wēnzhōu tiɛ³ (which derives from an earlier Wú form with a nasal coda).Footnote ¹⁴ However, even in these forms, which seem closer to the Guǎngyùn reading, the initial is irregular; from a medieval zhī 知 initial, one would expect an affricate in modern Wú forms.

lā 拉 ‘to pull’. This graph is found in the Shuōwén jiězì with the meaning ‘to break, to snap’. Only in the Táng Dynasty is the meaning ‘pull’ attested in texts. Its fǎnqiè is based on a rù reading that ends in p. In modern Běijīng dialect, rù-tone words with sonorant initials in the vast majority of cases yield a qù-tone pronunciation; hence, we would expect 拉 to be pronounced là instead of lā. How do we explain the tonal anomaly? In the modern standard language, we find a number of rù-tone words that have sonorant initials with the yīnpíng tone. These words may all be considered popular forms; semantically, they mostly seem to refer to actions that are performed with the hands. Some further examples are: lēi ‘tie something tight’, mā ‘rub, wipe’, mō ‘rub’, and niē ‘pinch, hold between the thumb and the finger’. From a strictly neogrammarian point of view, there should be no words in the standard language with sonorant initials in the yīnpíng tonal category, but in fact there are numerous such cases. Almost all such words are of popular origin, as is 拉. Any account of Běijīng historical phonology has to reckon with such forms. The first step in any historical analysis would have to be to collect all such forms and then to see what features they may have in common. Here, I am chiefly interested in pointing out that the Běijīng dialect (the basis of the modern standard language) possesses a great many popular forms that are not easily accounted for in the usual historical studies. This means that a history of any Chinese dialect cannot be content with the study of words that belong solely to the learnèd or literary portions of the lexicon. Moreover, a study of the popular forms would deal with some of the most basic and high-frequency words in any dialect.
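
The collecting step suggested above can be sketched in a few lines of Python. This is only an illustration: the record type and function name are hypothetical, the word list is limited to the examples cited in this paragraph, and luò 落 is added as a regular control on the assumption (not stated above) that it is a rù-tone word with a sonorant initial.

```python
from typing import NamedTuple

# Modern Běijīng tone numbers used in this sketch.
YINPING, QU = 1, 4


class Word(NamedTuple):
    pinyin: str
    tone: int    # modern Běijīng tone number
    gloss: str


# Rù-tone words with sonorant initials. The first five are the examples
# cited in the text; luò is an assumed regular control added here.
RU_SONORANT_WORDS = [
    Word("lā", YINPING, "pull"),
    Word("lēi", YINPING, "tie something tight"),
    Word("mā", YINPING, "rub, wipe"),
    Word("mō", YINPING, "rub"),
    Word("niē", YINPING, "pinch"),
    Word("luò", QU, "fall"),
]


def irregular_forms(words, expected_tone=QU):
    """Return forms whose modern tone differs from the regularly expected one."""
    return [w.pinyin for w in words if w.tone != expected_tone]


# Forms that deviate from the expected qù tone are candidates for the
# popular stratum and would then be compared for shared features.
candidates = irregular_forms(RU_SONORANT_WORDS)
```

A list like `candidates` is only a starting point: deviation from the expected tone flags a form for closer study, but does not by itself establish popular origin.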

It seems clear that every Chinese dialect possesses what I would call a popular lexical layer. This layer consists of forms that have been transmitted orally rather than as a part of a reading system for Chinese characters. When the description of a Chinese dialect consists chiefly of a listing of character readings, most of the forms that are obtained in this way will be learnèd or literary forms. A minority will be popular forms. This is clearest when a character has both a wén 文 and a bái 白 reading. So-called wén readings belong to either the learnèd or the literary strata. A learnèd form can actually be colloquial (as in the case of dé 得 mentioned above); literary forms are part of the traditional written language and, although they may occur when a speaker quotes a literary aphorism, they do not form a part of the colloquial language. An example of a literary form is wù 勿 ‘prohibitive particle’. Wù is never used colloquially to issue negative commands; for this meaning, one has to use bié 別 or búyào 不要. On the other hand, wù will be known to a great majority of literate people because it occurs in written phrases that are taken from the old literary language. It is important to remember that Modern Chinese is actually a mixture of popular, colloquial, learnèd, and literary forms.

In eliciting character readings from a native speaker, an investigator, when asking about the character 勿, might be told that it is read wù but that we do not normally use it in actual speech; instead, we say bié or búyào.

At this point, the investigator would most likely conclude that wù is not really a component of the spoken language, but purely the reading of a character from the old literary language.

Although Chinese is commonly described as a monosyllabic language, such an observation is based more on the written language than on any actual spoken variety. Běijīng speech possesses a whole host of bisyllabic words: gālár ‘corner’, gēda ‘knot’, zhǎba ‘wink’, gūlu ‘wheel’, hāla ‘rancid’, tērlou ‘slurp’, húlu ‘gourd’, tātar ‘room, place’.Footnote ¹⁵ In all of the forms given above, the second syllable is unaccented or tonally neutral. In a majority of cases, forms like these are hard to pin down etymologically. In a minority of cases, they may be loanwords from non-Hàn languages: tātar, for example, is a loan from Manchu tatara (boo) ‘a place to lodge or camp’. Although there are conventional ways to write many such words with Chinese characters, the characters that are used rarely shed much light on the actual etymologies of such words. All words of this type are an integral part of the popular language of Běijīng. Although some of them may by now be obsolete (like tātar), others still form a part of everyday speech. We can see, then, that the popular level of speech in Běijīng is very considerable; curiously, there has never been a systematic study of this aspect of the local dialect.

It is not always easy to sort out what is of popular origin and what is not. In my early fieldwork on Mĭn dialects, I slowly began to realise that it was critical to distinguish between those lexical forms that have a popular origin and those that are of learnèd or literary origin. If one wished to reconstruct the early linguistic system from which the modern Mĭn dialects take their origin, it seemed to me imperative to single out forms that had likely been transmitted in an unbroken chain in spoken form from the earliest settlers in the Mĭn regions. In cases in which a character was identified as having a wén and bái reading, clearly the bái was more likely to be of popular origin. But not all of the words in Mĭn dialects can be represented by etymologically appropriate characters; some have to be represented by so-called xùndú 訓讀 readings. In this fashion, for example, Xiàmén 廈門 tshui⁵ ‘mouth’ will be represented by the character zuǐ 嘴 even though, etymologically, it has nothing to do with this character; likewise, one might write Xiàmén kha¹ with the character jiǎo 腳 even though, again, such a written representation is wrong etymologically speaking. (We saw above that this device is also used in the Běijīng dialect, the basis of the modern standard language.)

When recording the Xiàmén dialect from a native speaker, one might be told when asking the pronunciation of the character 嘴 that it is read tsui³ but that the actual spoken word is tshui⁵.Footnote ¹⁶ At this point, the fieldworker might wonder whether this is a case of wénbái yìdú 文白異讀, that is, whether both pronunciations are associated etymologically with the character 嘴, or whether the second form is a case of xùndú. To resolve this question, one must have a general knowledge of how words in the Xiàmén dialect relate to the various medieval rime dictionaries. In this case, provided that one was familiar with rules that relate Xiàmén dialect to Medieval Chinese, it would be clear that the form tshui⁵ could not have the same origin as tsui³; in other words, the pronunciation tshui⁵ is merely a xùndú form and consequently has a different etymological origin from tsui³.

In other cases of wénbái yìdú, both of the elicited forms may in fact be from the same etymon: one a reading pronunciation and the other a vernacular or popular pronunciation. The Xiàmén forms for 賊 are tsɪk⁸ and tshat⁸; the first reading is from the literary stratum and is used chiefly when reading texts aloud, whereas the second is the actual everyday word for ‘thief’ and is used in vernacular speech.Footnote ¹⁷ Both pronunciations can be linked etymologically to the character 賊, but only the second form belongs to the popular stratum; that is, it has been transmitted orally over many generations from the language that is ancestral to the modern Mĭn dialects. In the case of tshat⁸, we would say that it has its origin in Common Mĭn, a hypothetical language that represents the oldest stratum of Mĭn vocabulary. The reading tsɪk⁸, on the other hand, must represent a much later form that is derived from the medieval rime-book tradition (cf. Qièyùn dzək).

In other cases, the etymological identity of a given form is not so easy to establish. The general Mĭn word for ‘house’ is Xiàmén tshu⁵ (Fúzhōu tshuo⁵, Jiàn’ōu tshiɔ⁵).Footnote ¹⁸ This word is frequently written with the character 厝, but this cannot represent the true etymological source of this word, as it fails both semantically and phonologically. Looked at comparatively, the Mĭn word clusters with a number of words in the rime yù 遇, whereas cuò 厝 is found in the mù 暮 rime; the meaning of cuò is either ‘whetstone’ or ‘put, place’. We can see, then, that the character 厝 semantically is totally unrelated to the meaning ‘house’ and cannot be the etymology of the Mĭn word in question.

Back in the 1970s when I was in Taiwan, I was browsing through dictionaries one day in a bookstore. Quite by accident, I opened one dictionary to a page on which the word shù 戍 occurred; this word is generally a verb in early texts that means ‘to garrison a border area’ (戍邊). In addition, it begins to appear as a noun in Nánběi Cháo 南北朝 texts, meaning ‘border garrison, a fortified camp on the frontier’. It occurred to me that this might be the origin of the Mĭn words for ‘house’. Fújiàn was a border region well into the medieval period and it would not have been unusual if the first buildings to have been built there by Chinese settlers were some sort of fortified structures, which might well have been known by the word 戍. The semantic development would have been ‘frontier outpost’ → ‘building for military use’ → ‘building, house’.

The important thing to notice here is that no Mĭn dialect speaker would connect the character 戍 with the vernacular word for ‘house’; 戍 is a popular word but is not a báidú 白讀, which is a colloquial reading for a particular character. Another example of this sort is the Coastal Mĭn word for ‘kill’: Xiàmén thai² (Fúzhōu thai², Cháozhōu thai²).Footnote ¹⁹ (This word also occurs in Mǐnběi dialects but in the restricted sense of ‘slaughter livestock’.)

I have shown elsewhere that this word is to be associated with the character 治 in its píng-tone reading of chí ‘manage, rule’ but euphemistically also ‘slaughter, clean a slaughtered animal or fish’.Footnote ²⁰ Again, no ordinary Mĭn speaker would equate the word for ‘kill’ with 治 chí etymologically. Therefore, the Mĭn forms for ‘kill’ that are cited above cannot be considered báidú forms of the character 治. The Xiàmén and Fúzhōu words meaning ‘kill’ are popular forms but they do not fit into the scheme of wénbái yìdú.

As indicated earlier, the terms popular and colloquial have to do with usage; both refer to words that are used in everyday language, but the two types differ in terms of their origins. Popular words are transmitted in the spoken language from one generation to another, largely independently of the written language and the traditional reading systems that are found in virtually every dialect in China. Popular words can be compared to those words in modern Romance languages that derive from the earliest times when a variety of spoken Latin (Vulgar Latin) was brought to the area in which the modern dialect or language is spoken. Modern Romance languages also have many colloquial words that do not derive from these early forms of spoken Latin, but represent borrowings from written Latin sources of subsequent ages. Romance linguists keep these two types of words strictly separate. When comparing Chinese dialects, linguists generally do not make such a strict separation, but rather depend on the traditional notion of wén and bái; my problem with these terms is that they refer to Chinese characters and not to words directly. This is probably an artefact of how dialect data are conventionally collected; a fieldworker begins with a list of characters, asking a native speaker how he or she reads these characters. In a minority of cases, the local consultant will point out that, although the reading of the character is such and such, we actually say something else. These bái forms may be etymologically related to the learnèd reading of the character, but they may also be totally unrelated, as the two Mĭn examples given above demonstrate. I began surveying dialects via the traditional method of recording character readings but, over the years, I gradually switched to putting more emphasis on gathering actual vernacular vocabulary by using a survey list of common words rather than characters. 
Instead of asking how the character 狗 ‘dog’ is read, I would ask, ‘what do you call this particular animal?’ ‘What do you call a male and female dog, a puppy?’ In this way, I think, one obtains a much better idea of what the popular language of a certain place is. One also obtains forms that would be difficult to elicit by simply asking how characters are read. In the past couple of decades, more and more works on Chinese dialects have included extensive sections on vocabulary. I believe that there has been a gradual realisation that the old character-centred approach was largely deficient. In many ways, Chinese dialectology has matured considerably and we can now look forward to a more sophisticated and penetrating analysis of dialect data.

I now come to a vexing problem. When beginning to analyse data from a Chinese dialect, how do we actually identify which forms are popular, colloquial, learnèd, and literary? Above, I may have given the impression that this process is not very difficult when, in fact, the difficulties are many. Let us now return to the Běijīng dialect. What are some of the factors that are important in judging the popular or learnèd status of a word? Let us take the word zhŭ 煮 ‘boil, cook’ as an example. It is without question a colloquial word because it is a free form that is used in the everyday spoken language. By free, I mean that it can stand alone. Zhèzhī jī zěnme yàng? Zhŭ. ‘What should we do with this chicken? Boil it.’ A word probably has a greater chance of being a popular form if it can be used freely in this way, although this is not a sufficient reason for identifying a form as popular. Such a form could, for example, be a relatively recent loan from another dialect or from written language and not a form that was inherited from the earliest form of the dialect. For example, is the word shé 蛇 a popular form in Běijīng speech? The more vernacular term is chángchong 長蟲.Footnote ²¹ Could it be, then, that shé in origin is an intrusion from a learnèd source in which it is the usual word for ‘snake’?

What about a word like zhuō 捉 ‘catch, capture’? One can say zéi zhuōle ma? ‘Did they catch the thief?’; one might answer méi zhuō ‘no, they didn't catch him’, in which zhuō behaves like any other free verb. Yet, in truly colloquial language, one is likely to use zhuā 抓 in the sense of ‘catch, capture’. In this case, it appears that zhuō is not a true popular form, but a synonym that was borrowed from a learnèd source. In other cases, it may be difficult to decide on the popular status of a word if it fails to show any peculiarities that would lead one to assign it to the popular stratum.

Consider a word like lái 來 ‘come’. This is very clearly a free form of very high frequency in everyday language, making us feel that it must be a word of popular origin. Yet, phonologically, there is nothing that sets this form apart from other words in the same medieval sound class. When we look to other dialects, we find that the verb meaning ‘come’ corresponds quite regularly to the Běijīng form in most cases. There are some exceptions, however. Fúzhōu li² ‘come’ cannot be derived from the same form as Běijīng lai².²² The latter form is related to a medieval *ləi ^平 (落哀切), which is a first-division-type word. Fúzhōu li², on the other hand, must come from a medieval form ljï ^平 (里之切). The Guǎngyùn has only the first reading for 來. The Jíyùn, however, has a reading 陵之切, which corresponds to the Fúzhōu form. In Early Chinese, then, we must recognise two forms for ‘come’: *’li and *li. As forms that are related to Early Chinese *li are rather rare in Chinese dialects, the alternation must have disappeared fairly early on in the majority of Chinese dialects. (Note that the Shàowŭ 邵武 dialect also has a form that is related to the Fúzhōu word: li².) In a case like this, I would say that we have to recognise two alternating forms of the word ‘come’, both of which belong to the popular stratum in a given dialect.
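As a purely illustrative aside, the bookkeeping behind this conclusion can be sketched in a few lines of Python. The sketch simply groups the three dialect forms cited above under the medieval reading that each is said to reflect. The names `FORMS`, `SOURCE_BY_FINAL`, `final_of`, and `group_by_source` are hypothetical, and the lookup from final to medieval source is a deliberately crude assumption that encodes only the two identifications made in the text; it is not a general sound-correspondence table.

```python
from collections import defaultdict

# Dialect forms for 'come' as cited in the text
# (the superscript tone category is written here as a plain digit).
FORMS = {
    "Běijīng": "lai2",
    "Fúzhōu": "li2",
    "Shàowŭ": "li2",
}

# Assumed lookup: final (rime) -> medieval source.
# It encodes only the two identifications stated in the text.
SOURCE_BY_FINAL = {
    "ai": "*ləi (Guǎngyùn 落哀切)",
    "i": "ljï (Jíyùn 陵之切)",
}


def final_of(form: str) -> str:
    """Strip the tone digit and the one-letter initial, leaving the final."""
    return form.rstrip("0123456789")[1:]


def group_by_source(forms: dict[str, str]) -> dict[str, list[str]]:
    """Group dialect names under the medieval source their form reflects."""
    groups: dict[str, list[str]] = defaultdict(list)
    for dialect, form in forms.items():
        source = SOURCE_BY_FINAL.get(final_of(form), "unassigned")
        groups[source].append(dialect)
    return dict(groups)


if __name__ == "__main__":
    for source, dialects in group_by_source(FORMS).items():
        print(source, "<-", ", ".join(dialects))
```

Any form whose final is not in the lookup falls into an "unassigned" group, which is where forms needing further etymological work would collect.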

The word qián 錢 has two meanings. The earliest meaning was ‘a kind of agricultural tool’, but this meaning is found only in archaic texts such as the Shījīng 詩經. Later, it came to be used for a unit of currency: a copper coin with a square hole in the centre. Whether these two meanings are in any way related is a matter for historians of the writing system. The use of qián as a unit of weight is later than that denoting money.²³ In Modern Chinese, qián has come to be the ordinary generic term for money in a majority of dialects. How do we decide whether this word is a popular word or a word of learnèd origin? In Běijīng speech, it has a single reading: qián. There is no distinction between a báidú and a wéndú form. It is a common word that is used freely in a syntactic sense. There seems to be no competing, more vernacular term. Hence, one might conclude that qián is most likely to be taken as a popular form. From the point of view of medieval phonology, it is a perfectly regular development of Guǎngyùn 昨仙切 (dzjӓn). In cases such as this, in which the historical status of a certain word is uncertain, we can sometimes find clues from other dialects. In the Hànyŭ fāngyīn zìhuì 漢語方音字彙, two readings for 錢 are found in the Sūzhōu dialect: Sūzhōu ziɪ² and diɪ². For Wēnzhōu, only the reading di² is given; the Wēnzhōu reading is clearly related to the Sūzhōu form diɪ².²⁴ However, a note reveals that both the Sūzhōu and Wēnzhōu forms in question are xùndú forms and that the two forms with initial d- are actually to be identified etymologically with the character 鈿 (tián ‘a flower-shaped adornment’). Clearly, then, neither of these dialects gives us any hint as to the status of the Běijīng form. The only dialect that shows a true wénbái distinction is Xiàmén 廈門: tsiɛn² (wén) and tsĩ² (bái). In the Xiàmén dialect, forms with nasalised vowels virtually all belong to the bái stratum. But does this mean that the Xiàmén form for ‘money’ is actually a popular word? I have the impression that present-day Xiàmén literary readings have a rather recent origin and that many of them overlay an earlier, more local system of readings. If so, this might then mean that some of the words with bái readings are not actually popular words in the sense that they go back to the protolanguage underlying the present Mĭn dialects, but rather represent forms from an earlier literary reading system.

Why might I think this? Above, the medieval reading for the character 錢 was given; the medieval initial cóng 從 in Fúzhōu and other Mǐndōng 閩東 dialects shows two developments. More vernacular forms have initial s- whereas more learnèd forms have initial ts-. (It should be noted here that Mĭn dialects do not maintain a clear distinction between the medieval initials cóng and xié 邪.)

Acknowledgements

David Prager Branner thanks the Norman family for making it possible to publish this essay posthumously, and Ms. Kathy Yen (Yán Kěxīn 顏可昕) and Prof. R. VanNess Simmons (Shǐ Hàoyuán 史浩元) for assistance during the editing process. The preface was added in response to questions from anonymous reviewers, as Prof. Norman was no longer alive to answer them in the text itself.

Conflicts of interest

The author and the posthumous editor declare none.

Afterword to ‘Popular and learnèd in Chinese dialects’

David Prager Branner

The late Jerry Norman (1936–2012) was a specialist in Chinese dialect and historical phonology and Manchu studies. He left this unpublished article on an idea that was key in his research and it seems to me that the time is overripe for publishing it. It was originally delivered in a panel entitled ‘Local Language in Local Chinese Culture’ at the annual meeting of the Association for Asian Studies on 27 March 1998 in Washington, DC. The present version appears to have been edited in 2011 and 2012.

The scholarly import of this article

Based on his study of Romance and Indo-European historical linguistics, Norman came to feel that Sinology had neglected the idea of a dichotomy between ‘popular’ and ‘learnèd’ language. Sinology does indeed recognise colloquial versus literary registers or styles (‘口語’ versus ‘書面’). In addition, there is a distinctively Chinese contrast in the way in which a given written graph is read: as wén 文 or bái 白 (‘literary’ versus ‘ordinary speech’) pronunciation. But neither register nor pronunciation is quite the same as the split between popular and learnèd, which is a matter of the origin of words. By ‘origin’, we really mean whether a word has been evolving continuously in the mouths of speakers for aeons or has entered speech more recently, from a written form of some sort. There is some overlap among the three dichotomies and Norman argues that all three are important to Chinese historical linguistics, but that neither register nor pronunciation can stand in for the actual origin of words, which he considers the most important of the three.

The special importance of the differing origins of modern words has to do with Norman's criticism of the work of Bernhard Karlgren (1889–1978) on Chinese historical phonology. As Karlgren's own research mainly took place about a century ago, it may seem that this criticism is a needless fixation on a matter of the historical philosophy of language. But Norman's position was that the Karlgren model continued to inform almost all research in the field, despite its age. In a few words, he felt that that influence was unsound.²⁵

One of Norman's key ideas was that the historical reconstruction of Chinese should be based on the comparison of actual popular-stratum words in living dialects, whereas Karlgren's model treats philological records as the core evidence, merely supplementing them with evidence of living dialects. The written records are not merely related to philological content; what they have to tell us about Chinese language is also often highly abstract and necessarily subject to interpretation.

In this matter, what Norman advocated is the norm in the practice of comparative-historical linguistics of most well-defined modern language families—in Romance, Altaic, Indo-European, and many others. The much greater emphasis on philological records in Chinese studies is anomalous in comparison.

For Old Chinese, long-range comparison (with Tibeto-Burman languages and others) remains prominent, as do rhyming practice in early texts, the structure of the ancient writing system, and the massive tradition of ancient scholia on written texts. Norman felt that all four of those forms of evidence were tendentious in that they led to the devaluing of the evidence of popular-stratum forms that are found in real, living language. He considered popular-stratum forms to be the primary evidence in reconstructing the phonology of their own ancestors.

It remains true even today that Chinese historical linguistics is carried out by using a model that is much more heavily philological, and based on highly abstract information, than that used for comparative Romance or Indo-European languages. That shows the continuing power of the Karlgren model.

The practical import of this article in the historical-comparative study of language

So much for scholarly import. How does attention to popular language actually affect the way in which we, as sinologists who are interested in language, practise our craft?

Although I studied Sinology for some years through a rigorous philological programme, the component of my education that took place at Norman's hands was unlike that of almost any other traditional sinologist. Popular language is simply not a normal topic of study for philologists of Chinese, who deal primarily with written records of all periods.

Those of my fellow students of Sinology who did doctoral research in Chinese-speaking parts of the world generally lived at universities, often attended lectures and seminars, and interacted on a daily basis with intellectuals. I spent my research years in small cities and sometimes villages, and usually worked with retired schoolteachers, small-town doctors, and sometimes farmers and migrant workers.

My academic cohort were immersed in the names of learnèd books and erudite scholars, styles of literature and bureaucratic titles, and movements in thought and policy among the Chinese intelligentsia of different eras. The Chinese vocabulary of those students was that of the vast high register of educated language. I, in contrast, spent my time asking about numerous concrete movements of the hand and leg in ordinary activities, processes in agriculture and cooking and other traditional activities, objects encountered locally among ordinary people, local place names, and so on. I once went to see a pig being slaughtered at four in the morning so that I could be sure of the names in the local dialect of all the internal organs, not as elements of anatomy, but as objects that mattered in the lives of real people.

Many people find words like those interesting, but that is not why the fieldworker collects them.

For the fieldworker, words like these are the adamantine basis of the traditional comparative method. When we collect these words from a capable native speaker, we are hoping to find morphemes that are comparable across geographically separate dialects and languages. It is only by fitting a morpheme into a correspondence set, joining geographically separate tongues in a regular relationship, that we have a basis for concluding that the morpheme may be of popular origin. That it is of colloquial register, or belongs to the báidú 白讀 subset of character readings, does not alone guarantee that.

The traditional comparative method is the real context of Norman's article. Emphasis on popular language was surely the most distinctive component of the education that I received from Norman. Yet, in traditional Sinology, an emphasis on real words, of ‘popular’ origin, has essentially no place at all.

A matter of orthography

I have been asked why the word learnèd is spelled here with an accent. The reason is that the presence or absence of the accent distinguishes between meanings that differ, depending on whether they are pronounced in two syllables or one. In two syllables, learnèd means ‘scholarly’ or ‘educated’, but if we write learned, without an accent, some readers may understand the word in one syllable to mean ‘having been acquired through experience or study’. Other words of this variably accented type include agèd, wingèd, blessèd, doggèd, and markèd. The accented forms are a distinct minority, but they are nonetheless attested in actual usage. And, here, where the term is a key element of the whole presentation, I think the accent is appropriate.

David Prager Branner, 16 July 2024, Taipei

Footnotes

²⁵ Some, but only a portion, of Norman's view is apparent in the ‘manifesto’ that he and South Coblin published in 1995; see J. Norman and W. South Coblin, ‘A new approach to Chinese historical linguistics’, Journal of the American Oriental Society 115.4 (1995), pp. 576–584. A much fuller exposition of Norman's thinking, in his own words, is forthcoming in the afterword to R. VanNess Simmons et al., Jerry Norman's Early Chinese and Common Dialectal Chinese: Collected Essays with Representative Syllabaries (Hong Kong).

References

¹ Běijīng words for ‘to catch’: dēi is documented in Gāo Àijūn 高艾軍 and Fù Mín 傅民, Běijīng huà cídiǎn 北京話詞典 (Běijīng, 2013), pp. 212, 193; dǎi in Dǒng Shùrén 董樹人, Xīnbiān Běijīng fāngyán cídiǎn 新編北京方言詞典 (Běijīng, 2010), p. 100.

² Běijīng Dàxué 北京大學, Hànyŭ fāngyán cíhuì 漢語方言詞彙. Běijīng Dàxué 北京大學, Zhōngguó Yŭyán Wénxué Xì Yŭyán Xué Jiàoyán Shì 中國語言文学系語言學教硏室. Wénzì Gǎigé Chūbǎn Shè 文字改革出版社 (Běijīng, 1964).

³ Zhōngguó Dà Cídiǎn Biānzuǎn Chù 中國大辭典編纂處, (ed.), Guóyŭ cídiǎn 國語辭典, (Shànghǎi, 1943), p. 3765.

⁴ Ibid, p. 825.

⁵ Běijīng Dàxué, Hànyŭ fāngyán cíhuì, p. 447.

⁶ [Author's original note: Běijīng Dàxué 北京大學, Hànyŭ fāngyán cíhuì 漢語方言詞彙, 2nd edn (Běijīng, 1995). Běijīng Dàxué 北京大學, Zhōngguó Yŭyán Wénxué Xì Yŭyán Xué Jiàoyán Shì 中國語言文学系語言學教硏室. Yŭwén Chūbǎn Shè 語文出版社. Revised and significantly expanded from the 1964 edition, p. 611.] Note that, unlike Chén Gāng 陳剛, Běijīng fāngyán cídiǎn 北京方言詞典 (Běijīng, 1990) (the author writes that the bulk of his work on the content took place between 1943 and 1958) and Dǒng, Xīnbiān Běijīng fāngyán cídiǎn, material in Běijīng Dàxué, Hànyŭ fāngyán cíhuì is not a report of primary lexicographic fieldwork. Běijīng words for ‘in, at, etc.’: dài is documented in Dǒng, Xīnbiān Běijīng fāngyán cídiǎn, p. 100; dǎi in Chén, Běijīng fāngyán cídiǎn, p. 50; āi in Dǒng, Xīnbiān Běijīng fāngyán cídiǎn, p. 1; gēn in Chén, Běijīng fāngyán cídiǎn, p. 90 and Dǒng, Xīnbiān Běijīng fāngyán cídiǎn, p. 163.

⁷ [Author's original note: Běijīng Dàxué, Hànyŭ fāngyán cíhuì, p. 612.] Běijīng words for ‘from, since’: jiē is documented in Chén, Běijīng fāngyán cídiǎn, p. 132; jiě in Dǒng, Xīnbiān Běijīng fāngyán cídiǎn, pp. 228–229; qiě in Chén, Běijīng fāngyán cídiǎn, p. 226 and Dǒng, Xīnbiān Běijīng fāngyán cídiǎn, p. 374; dǎ in Dǒng, Xīnbiān Běijīng fāngyán cídiǎn, p. 84.

⁸ Běijīng Dàxué, Hànyŭ fāngyán cíhuì, p. 286.

⁹ Làozhěnr ‘因睡眠時頭頸部姿勢不好或感受風寒而致使頸部疼痛 [to get a pain in the neck, brought about by sleeping with the head and neck in a bad position or being affected by wind or cold]’ (Dǒng, Xīnbiān Běijīng fāngyán cídiǎn, p. 270).

¹⁰ Dīng Shēngshù 丁聲樹 and Lǐ Róng 李榮, Fāngyán diàochá zìbiǎo 方言調查字表; published initially as Zhōngguó Kēxué Yuàn Yŭyán Yánjiù Suǒ fāngyán diàochá zìbiǎo 中國科學院語言研究所方言調查字表 (Běijīng, [1955] [1981] 2004).

¹¹ Chao, Yuen Ren, “Distinctive distinctions and non-distinctive distinctions in Ancient Chinese”, appeared as “Distinctions within Ancient Chinese”, Harvard Journal of Asiatic Studies 5.3–4 (1941), pp. 203–233.

¹² Běijīng Dàxué, Hànyŭ fāngyán cíhuì, p. 259.

¹³ Ibid, p. 280.

¹⁴ Ibid, p. 280.

¹⁵ Today, most of these forms tend to be described as colloquial words in Mandarin, as standardised in the twentieth century. Zhōngguó Dà Cídiǎn Biānzuǎn Chù (ed.), Guóyŭ cídiǎn, for instance, lists all of them without indicating that they are local Běijīng words and highly colloquial. Less well known among this list are tērlou and tātar. The former is romanised somewhat inconsistently in various sources, likely because of ambiguity introduced by Pīnyīn when a syllable of unclear etymology has a rhotacised coda -r: tērlou (Zhōngguó Dà Cídiǎn Biānzuǎn Chù (ed.), Guóyŭ cídiǎn, p. 708); tə̄rlou (Chén, Běijīng fāngyán cídiǎn, p. 269); tēirlou’ (Dǒng, Xīnbiān Běijīng fāngyán cídiǎn, p. 451). Tātar is indeed of Manchu origin (Zhōngguó Dà Cídiǎn Biānzuǎn Chù (ed.), Guóyŭ cídiǎn, p. 706; Jīn Shòushēn 金受申, Běijīng huà yŭhuì 北京話語彙 (Běijīng, 1965), p. 196; Chén, Běijīng fāngyán cídiǎn, p. 269; Dǒng, Xīnbiān Běijīng fāngyán cídiǎn, p. 445). But, interestingly, some derived forms have been described in Běijīng dialect, showing that it has evolved productively within Běijīng dialect itself and is not simply a fossilised Manchu loan (Chén, Běijīng fāngyán cídiǎn, p. 354):

• tāta yǎnr “窄小的場所（’眼兒’ 為 ‘所在地’） [small, cramped place (‘yǎnr’ means ‘location’)]”;
• tātar dā “地頭蛇，地方惡霸＜【滿】tatan i da（地方首領）[local bully or powerful regional despot < Manchu tatan i da (local chief)]”;
• tātar fáng “待朝房 [room in which one waits for an official audience with an important person]”.

¹⁶ W. Campbell, A Dictionary of the Amoy Vernacular (Táinán, 1913), p. 797; C. Douglas, Dictionary of the Vernacular or Spoken Language of Amoy (London, 1873), p. 96.

¹⁷ Douglas, Dictionary of the Vernacular or Spoken Language of Amoy, pp. 33, 64.

¹⁸ Běijīng Dàxué, Hànyŭ fāngyán cíhuì, p. 122. The Jiàn’ōu form is from the author's own field notes; see J. Norman, “Three Min etymologies”, Cahiers de linguistique Asie Orientale 13.2 (1984), pp. 175–189, at pp. 176–181.

¹⁹ Běijīng Dàxué, Hànyŭ fāngyán cíhuì, p. 300.

²⁰ J. Norman, “The verb 治—a note on Mǐn etymology”, Fāngyán 方言 1979.3 (1979), pp. 179–181.

²¹ Dǒng, Xīnbiān Běijīng fāngyán cídiǎn, p. 50.

²² R. S. Maclay and C. C. Baldwin, An Alphabetic Dictionary of the Foochow Dialect (Fúzhōu, 1870), p. 480.

²³ Author's original note: Hànyŭ Dà Zìdiǎn Biānjí Wěiyuán Huì 漢語大字典編輯委員會 (ed.), Hànyŭ dà zìdiǎn 漢語大字典 (Chéngdū and Wŭhàn, 1986), vol. 6, pp. 4217–4218.

²⁴ Běijīng Dàxué 北京大學, Hànyŭ fāngyīn zìhuì 漢語方音字彙. Běijīng Dàxué Zhōngguó Yŭyán Wénxué Xì Yŭyán Xué Jiàoyán Shì 北京大學中國語言文学系語言學教硏室. Wénzì Gǎigé Chūbǎn Shè 文字改革出版社 (1989 revised edn; original edn published 1962), p. 252.
