Introduction
Welsh, like many other languages, shows marked grammatical differences between formal and informal language, meaning that some forms are primarily used in writing or in formal oral situations and others are used mainly or exclusively in informal oral contexts. The use of possessive forms is a case in point: a colloquial form is largely absent from grammars and textbooks but very much present in speech, although some consider it nonstandard (Davies, 2016; Jones, 1998; Thomas, 1988). This paper will establish how the three possible variants pattern for a range of both known and hitherto unexamined internal and external factors, and in doing so, contribute to a better understanding of the feature.
Previous studies examining variation in the choice of possessive forms show that the colloquial variant is used more among younger speakers (e.g., Davies, 2016). However, they have not focused on sociolinguistic factors such as speech context and home language, so we do not know to what extent these might influence the use of the colloquial variant. Our study examines young people in Welsh-medium education in both north and south Wales and attempts to (1) examine the linguistic and social influences on the selection of different possessive variants and (2) ascertain the extent to which sociolinguistic competence is acquired, a question that is completely novel in the Welsh context. In a context such as Wales, where the Welsh Government (2017) hopes to count a million speakers by 2050 [Footnote 1] and where Welsh-medium schools teach students who have Welsh as their home language and ones who do not in the same classroom [Footnote 2], insight into how different variants of a feature are used in different stylistic contexts is vital, as it will allow us to better understand to what extent the children without Welsh as a home language share their peers’ home-acquired patterns.
Previous studies focusing on the acquisition of sociolinguistic competence in immersion classroom situations (Mougeon, Rehner, & Nadasdi, 2010; Regan, Howard, & Lemée, 2009) have tended to find that speakers of a second language (L2) do not fully acquire informal styles and consequently do not demonstrate the full linguistic range that native (L1) speakers have. This is in contrast to studies that have examined less conventional learning contexts (e.g., French in study abroad: Regan et al., 2009; acquisition of English following migration: Diskin, 2017; Drummond, 2011, 2013; and English in a lingua franca context: Durham, 2014), which have found that learners are able to match L1 speakers’ social and linguistic constraints in some cases. The teaching situation in Wales is different from the immersion contexts studied, however, and this may affect the acquisition of sociolinguistic competence. As noted, students who speak Welsh at home are in the same Welsh-medium classes as children who do not, giving us an opportunity to examine a situation in which the acquisition of stylistic variation might be influenced by both peer-to-peer interaction and classroom instruction. This paper will examine how students at Welsh-medium schools from different linguistic backgrounds compare with respect to stylistic variation, and in doing so, contribute to our wider understanding of the acquisition of sociolinguistic competence.
Possessives in Welsh
There are three ways to form possessive pronouns in Welsh: a literary variant, a sandwich variant, and a colloquial variant (see Davies, 2016 for a discussion of these terms). Table 1 shows how each variant is formed.
The literary variant is formed by attaching a pronoun to the front of the noun phrase. These can come in the form of prefixed pronouns (e.g., fy mam ‘my mum’) or infixed pronouns (e.g., i’m mam ‘to my mum’), depending on whether the preceding word ends with a vowel (Borsley, Tallerman, & Willis, 2007:157–158).
The sandwich variant is formed with a prenominal pronoun, like the literary variant, along with a post-nominal pronoun. The suffixed pronouns are described by Awbery (1994:1) as echo pronouns because they provide information already given in the phrase. The practice of “echoing” the pronoun of the possessor after the noun may have emerged due to the phonological similarity in certain dialects between ei ‘his’/‘her’ and eu ‘their’ (King, 2016:94). [Footnote 3]
The colloquial variant is formed solely with a post-nominal pronoun. This variant is the newest of the three and has been found to be used increasingly frequently among younger speakers of Welsh. For example, Jones (1998:74), examining Welsh speakers in Rhymney, found that while it was used about 25% of the time by speakers above 60, it represented 40% of instances for those 20–39 and 75% of instances for speakers 7–19. The link between stratified age variation and language change in Jones’ (1998) data is evidence of this variant becoming increasingly frequent.
Despite this variant showing evidence of being increasingly common in Welsh, this form has often been seen as ungrammatical and remains heavily stigmatized. Awbery (1976:16) argued that it was not permissible for the possessive pronoun to follow the head noun (e.g., ci fe ‘[the] dog [of] him’). King’s (2016:93–94) Modern Welsh, which is considered a comprehensive reference to colloquial and literary Welsh grammar, notes that “while it is common … [it] is widely regarded as sub-standard.” Similarly, Borsley et al. (2007:159) noted that the construction was possible but deemed it nonstandard. They also hypothesized that development of the colloquial variant may be an extension of non-pronominal noun phrases with possession (e.g., car Megan ‘Megan’s car’) where the possessor follows the noun. This is relevant because, as will be discussed further below, the colloquial form has at some points been ascribed to language acquisition processes (in children, but particularly in non-L1 speakers of Welsh; Willis, 2016).
Nevertheless, all three variants are found in speech and writing, even if, as shown by Watkins (1977), the colloquial variant is indeed found more often in speech and the literary variant more in writing. This distinction between mediums is tied to formality, as the literary form is seen as more formal and the colloquial form as more informal. The sandwich construction sits between the two in terms of formality and has previously been described as a way of drawing literary language closer to speech (Watkins, 1977:153).
While there is some research on possessive pronoun use in different styles (Davies, 2016; Thomas, 1988), across regions (Jones, 1998), class (Thomas, 1988), and age (Davies, 2016), these studies do not consider the relative influence of a number of linguistic and social factors, and none consider home language. This reveals a gap in terms of how speakers, particularly young speakers, vary according to sociolinguistic and linguistic factors such as home language and context of use.
Data and methods
The data presented here belong to a wider project examining the acquisition of sociolinguistic competence in Welsh immersion classrooms. [Footnote 4] To assess the extent to which bilingual Welsh-English children in Welsh immersion classes from different backgrounds and different parts of Wales showed stylistic variation, two schools were sampled: one in the north, in an area where the percentage of Welsh speakers is relatively high, and one in the south, where the percentage of Welsh speakers in the community is relatively low (this factor will be referred to as region in the analysis). Participating students (who were all 16 or 17 years old) in both schools were separated into two groups: those who mostly used Welsh as a home language (WHL) and those who mostly used English as a home language (EHL) [Footnote 5] (home language). Students were recorded in three contexts representing different levels of formality (context). Further details on these factors are presented in the following sections. All participants were considered Welsh-English bilinguals on the basis that they had attended Welsh-medium or bilingual secondary education (age 11–16) prior to attending their sixth form, where classes and examinations were held in Welsh as a first language.
Region
The project targeted two areas of Wales: Gwynedd in the north and Cardiff in the south (Figure 1). Gwynedd is generally perceived to be part of the Welsh-speaking “heartlands” (Coupland, 2012), with 64% of people in the area reporting that they are able to speak Welsh (Welsh Government, 2022). Cardiff is the capital city of Wales and is a densely populated metropolitan area. The proportion of Welsh speakers in Cardiff is lower than the national average due to different patterns of language shift across Wales. The Census results for Wales showed that 18% of the country’s population were able to speak Welsh, while in Cardiff, 12% of people reported that they could speak Welsh (Welsh Government, 2022).
In schools in Gwynedd, Welsh is the sole or main medium of instruction for most pupils: 9,127 pupils are in Welsh-medium schools and 6,178 are in bilingual schools where Welsh is the medium of instruction at least 80% of the time, which together is 90% of the total of 17,038 pupils (StatsWales, 2022). We examined a sixth form college in Gwynedd (students aged 16–18 preparing for their final school exams). A total of 10 pupils from the selected school participated in the study. They were from a class of 12 following a bilingual Psychology A Level course.
In Cardiff, 8,478 pupils are in Welsh-medium education out of a total of 56,837 pupils (i.e., 15%) (StatsWales, 2022). [Footnote 6] This shows that pupils studying through the medium of Welsh are a minority in the capital. A total of eight pupils from the selected Cardiff school participated in the study. They were from across year 12, studying various A Level courses.
The data in both schools were collected in 2021 during the global COVID-19 pandemic. This delayed the collection process and meant that we recruited fewer students than we had originally planned. All the tasks were conducted online and will be detailed in the Context section.
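The two enrolment percentages quoted for Gwynedd and Cardiff can be re-derived from the pupil counts given in the text. The short sketch below only repeats that arithmetic; the function and variable names are ours and are not part of the original study.

```python
def share(part: int, total: int) -> float:
    """Return part as a percentage of total."""
    return 100 * part / total

# Gwynedd: Welsh-medium pupils plus pupils in bilingual schools with
# Welsh as the medium of instruction 80%+ of the time (figures from the text).
gwynedd_share = share(9_127 + 6_178, 17_038)

# Cardiff: pupils in Welsh-medium education (figures from the text).
cardiff_share = share(8_478, 56_837)

print(round(gwynedd_share), round(cardiff_share))
```

Rounded to whole percentages, these shares match the 90% and 15% reported above.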
Home language
In both locations, students were classified by whether they used mostly Welsh or English at home. We did this by asking students what language(s) they used with their parents and other family. Most were either fully Welsh- or English-speaking at home, but some used either Welsh or English with different family members. Students reporting more than 50% Welsh use were classified as WHL and those with less than 50% as EHL. As Table 2 shows, we had roughly similar numbers in the two communities and across EHL and WHL groups. [Footnote 7]
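The classification rule described above can be sketched as a small function. This is an illustration only: the function name and the representation of reported Welsh use as a proportion between 0 and 1 are assumptions, and because the text does not say how an exact 50/50 split was treated, the sketch refuses to classify that case rather than guess.

```python
def classify_home_language(welsh_share: float) -> str:
    """Classify a student as WHL or EHL from the reported share of Welsh used at home.

    welsh_share is a hypothetical 0.0-1.0 summary of the self-reported data.
    """
    if not 0.0 <= welsh_share <= 1.0:
        raise ValueError("welsh_share must be between 0 and 1")
    if welsh_share == 0.5:
        # The stated rule covers only 'more than 50%' and 'less than 50%'.
        raise ValueError("an exact 50% split is not covered by the stated rule")
    return "WHL" if welsh_share > 0.5 else "EHL"


print(classify_home_language(1.0))  # fully Welsh-speaking home
print(classify_home_language(0.2))  # mostly English-speaking home
```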
Context
In order to offer something to the schools and students in exchange for their help and to deal with the fact that COVID-19 meant that the classroom and peer interactions we were hoping to obtain could not be collected, the first author put together a set of careers training recordings and exercises for students to watch and discuss in peer groups. Following this, she gave each student a mock job interview using the skills they had acquired from the recordings and exercises. A week or two after that, she had a more informal chat with them individually to obtain more casual data. The data collection consisted of two sessions (the conversation and interview contexts) for each participant with the researcher and six workshop sessions with their peers, which participants completed in their own time over the course of 2–3 weeks (the peer-group context). These contexts represent different levels of formality:
1. The conversation context: sociolinguistic interviews with the researcher in which the participants talk about their lives.
2. The peer-group context: participants interact with their peers during six semi-spontaneous workshop sessions.
3. The interview context: participants undertake a mock job interview with the researcher.
The mock job interview was designed to be the most formal context. The students were asked to prepare for job interview questions such as “Tell me about a time when you showed good time management” in the run-up to the interview. [Footnote 8] This was the first time they had met the interviewer. The other two contexts were relatively informal; the peer-to-peer context and sociolinguistic interviews were loosely structured conversations around set topics (careers training module for peer-to-peer interactions, and home life and hobbies for the sociolinguistic interview). The sociolinguistic interview and job interview with the researcher were conducted and recorded on Zoom. In the peer-to-peer context, the researcher was not present, and the students recorded their own conversations using their mobile phones. We found that the format of the workshops meant that some students were more formal in their language use than in the conversation context, despite speaking solely to their peers, while for others, the sociolinguistic interview was seen as less casual than the peer-to-peer workshops. [Footnote 9]
We chose these three contexts to establish to what extent the students were able to style-shift in less formal contexts. As noted in the Introduction, previous research examining the acquisition of sociolinguistic competence found that while L2 speakers do acquire some variation patterns (Durham, 2014), they are less likely to acquire informal variants as classroom settings often privilege formal variants (Mougeon et al., 2010; Regan et al., 2009). We expect that the situation in Welsh classrooms is somewhat different (not least because Welsh home language children are present too), and having two separate casual contexts will allow us to fully establish what is going on.
The data were collected in June and July 2021; this was judged to be a suitable time for data collection by gatekeepers in the schools as it was after exams and before the summer holidays. Approximately 34 hours of speech data across the different contexts and in both schools were collected from participating students. This comes out to just over a quarter of a million words when transcribed.
Feature extraction
All possible possessive pronouns were extracted from the data. Fixed phrases were excluded, such as yn fy marn i ‘in my opinion’, as were any unclear tokens (n = 2). We also noted earlier that the literary variant can contain prefixed or infixed pronouns; both these types were transcribed as prefixed pronouns, because of their rarity. For example, ‘n chwaer ‘my sister’ was transcribed as fy chwaer (the same is applied to infixed pronouns in the sandwich variant, e.g., ‘n chwaer i was coded as fy chwaer i).
A total of 1,968 possessive pronoun tokens were extracted: 871 from Gwynedd and 1,097 from Cardiff. When broken down by home language, 1,226 tokens came from students from Welsh-speaking homes (WHL), while 742 came from students from English-speaking homes (EHL).
The sections below discuss the social and linguistic factors considered in our analysis and summarize previous findings (where they exist).
Social factors
Due to our focus on 16- to 18-year-olds and the fact that our data are not evenly distributed in terms of gender across the various groups (there are fewer males overall), we do not consider age or gender in our analyses, even though the colloquial variant has consistently been found to be used at higher rates by younger speakers (Davies, 2016; Hatton, 1988; Jones, 1998; Thomas, 1988; Watkins, 1977). We instead considered the social factors outlined below.
Context
The main research on style differences in the use of possessive pronoun variants focuses on written and oral differences. Borsley et al. (2007:158) found that literary Welsh primarily uses the literary variant, although Watkins (1977:153) noted that some authors make a “conscious effort” to use the sandwich form in modern literary language. The colloquial construction, as noted above, is considered ungrammatical by many and is found less often in writing. It is instead reported to occur more often in speech, although it has also been used to depict L2 speech in literary works (Willis, 2016).
In addition, there is variation within the spoken register, as demonstrated in Davies (2016:41). Davies used a corpus of “spontaneous, informal speech” (2016:33) to analyze possessive pronoun variant use of 151 participants who were predominantly from north-west Wales and found that the three variants were used, even though the context remained consistently informal. Young (2019) examined Welsh teachers and focused on their reported use of different features across different formality contexts. Not only did the teachers report significant variation in their own use of the colloquial form in in-classroom and out-of-classroom contexts, but they also reported that they were more likely to correct their students’ use of the colloquial possessive variant as the context of use became more formal. This suggests that the colloquial variant will be used at higher rates in the more informal speech contexts. Using the three different contexts we collected, we aim to establish if this is the case and if all sets of speakers follow this pattern.
Home language differences
There is no previous research focusing on differences in use in possessive pronouns according to home language (or between early and late learners of Welsh), although Robert (2009:104) identified the colloquial variant as a potential indicator of new speaker speech. It is worth noting that learning materials for adult learners (National Centre for Learning Welsh, 2023) tend to focus on the literary and sandwich variants and that, as discussed above, many Welsh grammars do not list the colloquial form as an option. We aim to establish whether Welsh home language speakers are more or less likely to use the colloquial variant than non-Welsh home language speakers.
Regional differences
King (2016:93) reported that variation in variant use exists from region to region, although he also noted that the literary and sandwich variants tend to be seen as the standard forms even in spoken language. However, much previous work has been conducted at a single location. Awbery (1994) considered Pembrokeshire in south-west Wales, and Roberts (1988) examined Pwllheli in north-west Wales, making it more difficult to compare regions. Jones (1998) conducted the only comparative work thus far, considering Rhymney (south-east Wales) and Rhosllanerchrugog (north-east Wales). She found that the use of the literary form was more common in the south-east Wales cohort, while in north-east Wales, the colloquial variant was more frequent. We aim to establish whether there are differences between the north and the south in our data as well.
Linguistic factors
We considered two linguistic factors that have previously been found to affect the rates of the variants (grammatical person and lexical frequency). Given the lack of comprehensive variationist research on Welsh possessive pronouns, we also considered two other factors that have been found to be relevant for similar features in other languages (language of noun and noun category). [Footnote 10]
Grammatical person
Davies (2016:55) compared rates of the three variants in third-person singular and first-person plural forms and found that the colloquial variant was more common in both for younger speakers, whereas the colloquial variant was less frequent in the third-person singular for older speakers. King (2016:94) suggested that the sandwich variant may be preferred to the literary variant in the third-person singular in order to distinguish between the prenominal masculine pronoun ei and the feminine pronoun ei (e.g., ei gar o ‘his car’ and ei char hi ‘her car’), which without the mutation and post-nominal pronoun would be identical.
Unlike previous work, we consider all grammatical persons, although overall token numbers mean that we will combine some forms together in the statistical analysis. Examples of each grammatical person with the three possessive variants are presented in Table 3 to demonstrate how this factor was coded in the data. The type of mutation that occurs with the sandwich and literary forms is also given.
Lexical frequency
Davies (2016:44) found that certain frequently used possessed nouns (such as tŷ ‘house’) were only used with the colloquial variant and hypothesized that high frequency nouns that show limited variation could point to conventionalized “set phrases” (e.g., tŷ ni ‘[our] home’). Variationist sociolinguistic research has increasingly examined frequency effects on morphosyntactic variation (e.g., Erker & Guy, 2012; Linford & Shin, 2013), particularly in the field of L1 and L2 acquisition. As there is little research on high frequency nouns in Welsh and conventionalized “set phrases,” we felt it was important to consider this factor in our analysis.
Following previous research, we decided to determine lexical frequency within our own corpus (cf. Erker & Guy, 2012). To do this, we created a list of the most frequent nouns in the corpus (i.e., those occurring at a rate higher than two instances per 10,000 words of the corpus, which was about .02%) and then compared these to the nouns that occurred in our set of possessive pronoun tokens. For ease, instances of plural, singular, masculine, and feminine forms were counted as a single noun. For example, athro ‘male teacher’, athrawes ‘female teacher’, and athrawon ‘teachers’ were all coded as athro. [Footnote 14] Nouns that were found both in our possessive pronoun tokens and in the frequency list were counted as frequent, and all other nouns were counted as infrequent.
Out of the 362 different noun forms found in the possessive tokens, 51 were found to be frequent and to represent a total of 1,150 tokens, which is 58% of the overall possessive pronoun tokens. In the analysis below, we will establish whether, as in Davies (2016), frequent nouns are more likely to be used with the colloquial form.
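The frequency coding described above can be sketched as follows. This is a minimal illustration rather than the procedure actually used: the lemma table, the function names, and the toy input are hypothetical, and only the threshold (more than two instances per 10,000 words) and the collapsing of athro, athrawes, and athrawon come from the text.

```python
from collections import Counter

# Hypothetical lemma table: gendered and plural forms collapse to one noun.
LEMMAS = {"athrawes": "athro", "athrawon": "athro"}


def lemma(noun: str) -> str:
    """Map a noun form to the single form under which it is counted."""
    key = noun.lower()
    return LEMMAS.get(key, key)


def frequent_nouns(corpus_nouns, corpus_word_count, per_10k=2.0):
    """Return the nouns occurring at a rate higher than per_10k per 10,000 words."""
    counts = Counter(lemma(n) for n in corpus_nouns)
    threshold = per_10k * corpus_word_count / 10_000
    return {noun for noun, count in counts.items() if count > threshold}


def code_frequency(possessed_noun: str, frequent: set) -> str:
    """Code a possessed noun as frequent if it is on the frequency list."""
    return "frequent" if lemma(possessed_noun) in frequent else "infrequent"


# Toy corpus of 10,000 words in which 'athro' forms occur three times.
toy_nouns = ["athro", "athrawes", "athrawon", "ci"]
frequent = frequent_nouns(toy_nouns, corpus_word_count=10_000)
print(code_frequency("athrawon", frequent), code_frequency("ci", frequent))
```

Counting lemmas before applying the threshold matters here: each of the three teacher forms occurs only once in the toy corpus, so none would pass the threshold individually.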
Language of possessed noun
Welsh contains loanwords and calques from English, and speakers are also known to code-switch and insert English words when speaking Welsh (Deuchar, Donnelly, & Piercy, 2016). In order to see if this may have an effect on variant choice, we coded for recent loanwords using the following criteria: where an English loanword appeared in Geiriadur Prifysgol Cymru (the standard historical Welsh language dictionary) as a Welsh word, it was coded as Welsh. Other English words were coded as English. This is a commonly used criterion in codeswitching research in Welsh (Prys, 2016).
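The coding criterion above amounts to a dictionary lookup. The sketch below illustrates it under stated assumptions: the function name is ours, the judgment of whether a noun is of English origin is passed in as a flag, and the headword set is a toy stand-in that has not been checked against the real Geiriadur Prifysgol Cymru.

```python
def code_noun_language(noun: str, english_origin: bool, headwords: set) -> str:
    """Code a possessed noun as 'Welsh' or 'English'.

    headwords is a hypothetical stand-in for the dictionary's Welsh headword list.
    """
    if not english_origin:
        return "Welsh"
    # An English-origin word listed in the dictionary as a Welsh word counts as Welsh.
    return "Welsh" if noun.lower() in headwords else "English"


# Toy headword list for illustration only, not real dictionary data.
toy_headwords = {"ffôn"}

print(code_noun_language("ffôn", english_origin=True, headwords=toy_headwords))
print(code_noun_language("boss", english_origin=True, headwords=toy_headwords))
print(code_noun_language("calon", english_origin=False, headwords=toy_headwords))
```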
Possessed noun category
Nouns can be categorized according to their alienability or inalienability. The alienability of possession is determined by “whether the object can exist apart from its possessor” (Nichols, 1988:575), and thus nouns such as calon ‘heart’ and brawd ‘brother’ are examples of inalienable possession. On the other hand, if the possessed object can exist apart from its possessor, it is considered alienable, as in the case of nouns such as ffôn ‘phone’ or swydd ‘job’. Oceanic languages, spoken in Papua New Guinea, Melanesia, Polynesia, and Micronesia, distinguish grammatically between alienable and inalienable possessive constructions (Lichtenberk, Vaid, & Chen, 2011). English does not have such a grammaticalized distinction, and it is unknown whether Welsh does. We thus examined this to determine its potential effect. We initially coded using Lichtenberk et al.’s (2011) subcategories of alienability, but in our results, we present only the broad division between inalienable and alienable.
Statistical analysis
Mixed-effects logistic regression analyses were conducted in order to determine the effect of external and internal factors on the variation of the possessive pronoun variant. Statistical modeling was carried out using the lme4 (Bates, Mächler, Bolker, & Walker, 2015) and lmerTest (Kuznetsova, Brockhoff, & Christensen, 2017) packages for R (R Core Team, 2021). Fixed effects (such as home language and context) are factors that are replicable in further studies. By including random effects (speaker and word), which are sampled randomly, modeling can account for inter-speaker variation (Johnson, 2009). The mixed-effects analysis will present a two-way breakdown of the variants: the colloquial variant on the one hand and the other two (non-colloquial) variants on the other.
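For readers less familiar with this type of model, its general form can be written out as below. This is a sketch under assumptions: the notation is ours, and we assume random intercepts only for speaker and word, since the text does not state whether random slopes were included.

```latex
% y_t = 1 if token t is the colloquial variant, 0 if literary or sandwich.
% s[t] is the speaker who produced token t; n[t] is its possessed noun (word).
\[
\operatorname{logit} \Pr(y_t = 1)
  = \beta_0 + \mathbf{x}_t^{\top}\boldsymbol{\beta} + u_{s[t]} + w_{n[t]},
\qquad
u_s \sim \mathcal{N}(0, \sigma_u^2), \quad
w_n \sim \mathcal{N}(0, \sigma_w^2)
\]
```

Here the vector x_t holds the fixed-effect predictors for token t (for example region, home language, and context), while u and w are the speaker and word intercepts that absorb speaker-specific and noun-specific tendencies toward the colloquial variant.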
Summary of factors
Table 4 presents each of the random and fixed factors influencing the possessive construction. The levels presented here were built into the statistical model (the baseline level of each factor is in italics).
Results
Factor-by-factor distribution
The overall distribution for the different groups and factors will be discussed individually before moving onto the statistical analysis. A total of 1,968 tokens were extracted and coded in the data. As Table 5 shows, the colloquial variant is the most frequently selected form, followed by the literary form. Given the age of the speakers and the fact that previous studies have shown that there are higher rates of the colloquial form in younger speakers, this is not unexpected.
Region
We now turn to region, considering the data from the location in the north (Gwynedd, n = 871) and in the south (Cardiff, n = 1,097) separately (Figure 2). Recall that Gwynedd has a higher proportion of Welsh speakers in the community and that previous research (Jones, 1998) found higher rates of the colloquial form in north-east Wales than in south-east Wales.
Figure 2 shows that while the rates for the colloquial variant are indeed higher in Gwynedd (71% versus 48% in Cardiff), the patterning remains the same. The colloquial form is the most frequent form in both, followed by the literary variant, while the sandwich variant is the least frequent. By examining home language and region together, we will be able to establish whether the difference is due to home language or whether it reflects a genuine difference in region. We will present this alongside context in Table 6.
Context
The three contexts—a mock job interview, a peer-to-peer conversation, and a sociolinguistic interview—are presented from most formal to least formal. Table 6 presents the percentages of the three variants for the four region–home language groups in the three contexts, while Figure 3 presents a two-way split for the variants (colloquial versus literary + sandwich).
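The percentages in a table of this kind, and the two-way split used in the figure, can be derived from coded tokens as in the following sketch. The token records, field names, and function names are hypothetical; the toy data are not drawn from the study and serve only to show the computation.

```python
from collections import Counter, defaultdict


def to_binary(variant: str) -> str:
    """Collapse the three variants into colloquial versus non-colloquial."""
    return "colloquial" if variant == "colloquial" else "non-colloquial"


def rates_by_group(tokens, binary=False):
    """Return the percentage of each variant per (region, home language, context) group."""
    counts = defaultdict(Counter)
    for tok in tokens:
        group = (tok["region"], tok["home_language"], tok["context"])
        variant = to_binary(tok["variant"]) if binary else tok["variant"]
        counts[group][variant] += 1
    return {
        group: {v: 100 * n / sum(c.values()) for v, n in c.items()}
        for group, c in counts.items()
    }


# Toy tokens for a single group (illustrative values only).
toy_tokens = [
    {"region": "Cardiff", "home_language": "WHL", "context": "job interview", "variant": v}
    for v in ["colloquial", "colloquial", "literary", "sandwich"]
]

print(rates_by_group(toy_tokens))
print(rates_by_group(toy_tokens, binary=True))
```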
In terms of differences based on region and home language, Table 6 makes it clear that while the EHL students in Gwynedd are “extreme” colloquial variant users, the difference between regions is not due solely to them, as the WHL students are also higher than both groups in Cardiff.
More generally, the EHL students show higher rates of the colloquial than the WHL students, but it is nonetheless the most frequent form for all groups, followed by the literary form. The rates of use for the colloquial and literary forms for the Cardiff WHL students are very similar, which puts them at odds with the other three groups. Gwynedd’s EHL students have far lower rates of the literary and sandwich forms due to their near-categorical use of the colloquial form.
Given our findings for region and home language, we present the results of each subsequent factor broken down across these four groups. In terms of context, Table 6 shows that all four groups have the highest rate of colloquial variants in the sociolinguistic interview, but the difference between peer-to-peer and sociolinguistic interview is less marked for the two WHL groups than for the EHL groups. The job interview context is very different from the other two in Cardiff, underlining that the more formal context is triggering a shift away from the colloquial variant in favor of the other two variants (it is important to note that it is not purely a shift toward the literary form as the sandwich form is used roughly a third of the time in those contexts). This is the case for both the WHL children and the EHL children, which underlines that the EHL children have acquired at least some stylistic aspects.
For Gwynedd, the situation is less clear because of the higher rates of the colloquial form overall, but by and large, it does seem the WHL students have the same pattern as the Cardiff students (job interview showing a sharper stratification than peer-to-peer and sociolinguistic interview). The EHL students show a shift, but for them, the division is between sociolinguistic interview versus job interview and peer-to-peer.
We will turn to the internal factors next, but at this point, the main findings are that (1) the colloquial variant is the most frequently used variant overall, (2) there is a substantial difference in use between Gwynedd and Cardiff, and (3) the EHL students show style shifting depending on the level of formality of the context like their WHL peers, suggesting they have acquired L1-like sociolinguistic competence.
Lexical frequency
Table 7 presents the rates of the three variants for frequent and infrequent nouns for the four groups.
For all four groups (though only marginally for Cardiff Welsh home language students), infrequent nouns are more likely to occur with the literary and sandwich forms. The two Welsh home groups show that the decrease in the colloquial form is mainly linked to an increase in the sandwich variant as the rates of the literary variant barely shift. More frequent nouns appear, then, to show higher rates of colloquial use, and this is the case for both the WHL and the EHL speakers.
Grammatical person
We now move onto the second factor that has previously been studied, grammatical person.Footnote 15 Tables 8 and 9 show the distribution for each group. Recall that Davies (Reference Davies, Durham and Morris2016) had found that rates of the colloquial third-person singular and the first-person plural forms increased in younger speakers.
Low numbers in some categories (especially for the Gwynedd EHL group) mean that some rates may not be fully accurate, but for the two WHL groups, the two categories that Davies (Reference Davies, Durham and Morris2016) studied (third-person singular and firs-person plural) have higher rates of the colloquial form than overall usage. With respect to the third-person singular, this is also the case for the EHL groups. The first-person singular generally has higher rates of the colloquial variant than the remaining pronouns for all four groups.
When considering the other two variants, the patterns seem to be tied to community more than to home language. For the Gwynedd speakers, second-person plural has the highest rates of both the literary and the sandwich forms. In Cardiff, by contrast, the rates of the literary form are highest in second-person singular, and the rates of the sandwich form are highest in first-person plural. On this basis, the EHL students share some, but not all, of the general patterning with their WHL counterparts but do share the more local patterns of use.
Although we included this factor in initial models, we found that it was never significant and that removing it improved the fit. We have nevertheless provided the group breakdowns here, because it is one of the few factors that have been previously studied.
Possessed noun category
A total of 590 (30%) alienable constructions and 1,382 (70%) inalienable constructions were identified in the corpus. Table 10 shows the rates of the three possessive pronoun constructions in alienable and inalienable possessum words, once again by region and home language.
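The two percentages follow directly from the token counts. As a quick arithmetic check, the shares can be recomputed as follows (a minimal sketch; the variable names are illustrative and are not part of the original analysis):

```python
# Token counts reported in the text for the two possessed noun categories.
counts = {"alienable": 590, "inalienable": 1382}

total = sum(counts.values())  # all possessive tokens in the corpus

# Share of each category, rounded to the nearest whole percent.
shares = {category: round(100 * n / total) for category, n in counts.items()}

for category, n in counts.items():
    print(f"{category}: {n} of {total} tokens ({shares[category]}%)")
```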
Except for Cardiff WHL, inalienable nouns seem to favor the colloquial form slightly more than alienable nouns.
Possessed noun language
Although all the most frequent possessed nouns in the corpus were Welsh, there were 71 English nouns with possessive forms in the corpus. Most English possessed nouns only appeared once in the corpus, but some were repeated (e.g., cousin [n = 10], job [n = 7], boss [n = 7]). Table 11 presents the possessive variants used by possessed noun language and speaker group.
Before turning to the differences in rates of the variants, it is worth pointing out that the rates of English versus Welsh nouns differ in the two communities. In Gwynedd, English nouns represent 8%–9% of the overall tokens, whereas in Cardiff, they represent 3%–4%. This is important for the overall distribution of the variants in the two communities, as English nouns are more likely to occur with the colloquial form than Welsh nouns. The overall numbers of English nouns are relatively low, however, and all four groups show a lower rate of the colloquial form with Welsh nouns, which demonstrates that the pattern is shared. For the Cardiff WHL group, the literary form is in fact used more frequently (42%) than the colloquial form (41%) with Welsh nouns.
Initial trends
Across the factor groups analyzed, there are differences in rates and patterns across the two communities. Within each community, there are differences in rates for the EHL and WHL students, but for many of the factors considered, the patterning is shared. This suggests that the EHL speakers have acquired the constraints of their peers, but that it is vital to consider this with respect to their local peers and not with how this feature might pattern elsewhere. We now turn to the statistical analysis.
Statistical analysis
In mixed-effects modeling, the colloquial variant was used as the dependent variable, which means that the model shows the likelihood of a speaker producing the colloquial variant (possessed noun + post-nominal pronoun) compared to the literary (prenominal pronoun + possessed noun) and sandwich (prenominal pronoun + possessed noun + post-nominal pronoun) variants. A general-to-specific approach was taken to the statistical modeling (Baayen, 2008:205; Nance, 2015:565). The first model included all predictors, as shown in the following R model formula: VARIANT ∼ CONTEXT * HOME LANGUAGE * REGION + NOUN LANG + CATEGORY + FREQUENCY + (1|PARTICIPANT) + (1|WORD). Nonsignificant predictors and interactions were then removed from the model one at a time, and each reduced model was compared to the first model using a series of ANOVAs. If the removal of a nonsignificant predictor or interaction improved the fit of the model, then the reduced model was retained. Following the analysis of the entire dataset, four separate models were then fitted, one for each group (Gwynedd EHL, Gwynedd WHL, Cardiff EHL, and Cardiff WHL).
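The model-trimming step can be illustrated as a likelihood-ratio comparison between a fuller model and the same model with one term removed, which is the kind of test that R's anova() reports for nested mixed models. The Python sketch below is illustrative only: it assumes that the ANOVA comparisons were likelihood-ratio tests, the function names are hypothetical, and the log-likelihood values are invented for the example rather than taken from the study.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over i < df/2 of (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= (x / 2) / i
            total += term
        return math.exp(-x / 2) * total
    # Odd df: two-sided normal tail plus a finite correction series.
    tail = math.erfc(math.sqrt(x / 2))
    term, total = math.sqrt(x), 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return tail + math.sqrt(2 / math.pi) * math.exp(-x / 2) * total


def likelihood_ratio_test(loglik_full: float, loglik_reduced: float, df_diff: int):
    """Compare two nested models; returns (chi-square statistic, p-value)."""
    statistic = 2 * (loglik_full - loglik_reduced)
    return statistic, chi2_sf(statistic, df_diff)


# Hypothetical log-likelihoods, for illustration only (not values from the study).
stat, p = likelihood_ratio_test(loglik_full=-1100.0, loglik_reduced=-1100.6, df_diff=1)

# A large p-value means the fuller model does not fit significantly better,
# so the simpler model would be the one carried forward.
print(f"chi-square = {stat:.2f}, p = {p:.3f}")
```

Under this reading, a term with several levels (such as context, with three levels) would be tested with more than one degree of freedom, which is why the helper accepts any positive integer df.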
Regression tables for the best-fitting model contain an intercept corresponding to a baseline combination of levels. Tables 12–16 show the fixed factors that were significant predictors of colloquial variant use. Regression coefficients (β) (labeled Estimate in these tables) for each term indicate deviations from the intercept and are included alongside z-values and p-values for the levels associated with each factor. A positive significant coefficient indicates that the named factor level was more likely to result in the production of the colloquial variant than the baseline factor level. Conversely, a negative significant coefficient indicates that the named factor level was less likely to result in the production of the colloquial variant than the baseline factor level. Grammatical person was not retained in the final model.
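To make the reading of the coefficients concrete, the sketch below converts an intercept and one coefficient from the log-odds scale into predicted probabilities of the colloquial variant. It assumes a logistic (logit-link) model, which is consistent with the z-values reported but is an assumption here. It also ignores the random effects for participant and word, and the numbers are hypothetical rather than taken from Tables 12–16.

```python
import math


def inv_logit(log_odds: float) -> float:
    """Convert a value on the log-odds scale to a probability."""
    return 1 / (1 + math.exp(-log_odds))


# Hypothetical values, for illustration only (not taken from Tables 12-16).
intercept = 0.4   # log-odds of the colloquial variant at the baseline levels
estimate = -0.9   # deviation from the intercept for one named factor level

p_baseline = inv_logit(intercept)
p_level = inv_logit(intercept + estimate)

# A negative estimate lowers the predicted probability relative to the baseline;
# a positive estimate would raise it.
print(f"baseline: {p_baseline:.2f}, named level: {p_level:.2f}")
```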
In the overall model, three factors show a significant effect: region, context, and noun language. For region, Gwynedd has higher rates than Cardiff, underlining that there is a clear regional dimension. For context, the job interview is least likely to show use of the colloquial form, with the peer-to-peer context and particularly the sociolinguistic interview showing higher rates. With respect to noun language, colloquial possessive constructions are more likely to contain English nouns than Welsh nouns.
The model in Table 12 also shows a three-way interaction between context, home language, and region. The interaction suggests that the effect of context differs across the home language and region groups. The effect is negative, meaning that the effect of the sociolinguistic interview on the production of the colloquial variant is smaller in the Gwynedd WHL group than in the Cardiff EHL group. In other words, context affects the production of the colloquial variant differently across the region and home language groups, and in some cases, these differences are significant. To examine this interaction further, and to investigate potential differences in the acquisition of sociolinguistic competence, we examined the data for the home language and region groups separately. Tables 13–16 present the model for each group. Note that in these models, noun category was not significant, and its omission significantly improved the model fit.
Although no factor reaches significance for the Gwynedd EHL group (possibly because of lower overall token numbers), context and noun language are significant for the other three groups, and noun frequency is significant for the Cardiff WHL group. For noun language, the direction of the effect, that is, higher rates of the colloquial form with English nouns than Welsh nouns, is the same across all four groups. This is also the case for noun frequency: all groups have higher rates of the colloquial form for frequent nouns than for infrequent nouns. For context, however, there is a home language effect. Both Gwynedd and Cardiff WHL speakers increase their rates of the colloquial form from job interview to sociolinguistic interview to peer-to-peer, but for Cardiff EHL, the pattern is job interview to peer-to-peer to sociolinguistic interview. For Gwynedd EHL, the pattern is different again: peer-to-peer, then job interview, then sociolinguistic interview.
It seems then that the EHL speakers share the key internal factors of variation with their WHL peers, and they do vary in terms of context (unlike what was found in previous studies); however, they nonetheless do not completely share the hierarchy of the WHL speakers. Why might this be? We turn to this in the Discussion section.
Discussion
Although we initially expected peer-to-peer to be the most informal context (as no researcher was present), we found that this was not always the case for the students. First, the fact that the students were doing school-related tasks might have added to the formality of the situation. Second, in some of the sociolinguistic interviews, students commented that they did not use Welsh with some of their peers in informal situations. It stands to reason that if students were using Welsh with peers with whom they would more naturally speak English, they might end up being slightly more formal than in situations where they would more usually use casual Welsh (e.g., chatting to someone in Welsh). Although the students who reported not using Welsh with peers were from both WHL and EHL backgrounds, the majority of them spoke English at home. We suspect that this may partly account for the hierarchy differences found between the WHL and EHL groups.
On the whole, however, within individual communities, patterns and constraints are shared to a large degree. The EHL students use the colloquial form at higher rates but share the patterning to at least some extent, which indicates that they have acquired the patterns of the WHL students. The fact that there are clear regional differences and that they are found in the EHL groups too underlines this.
Other than noun language, none of the internal factors (including those discussed previously) came up as statistically significant. It may be that the generally high rates of the colloquial form left little room to uncover other effects or that none of them were relevant. Analysis of older speakers (who would be expected to have lower rates of the incoming colloquial form) may find that some did once contribute to the variation patterns.
Conclusions
Our aims in this analysis were twofold: to gain greater understanding of ongoing change in the use of the possessive pronoun variants in Welsh and to establish whether students in Welsh-medium schools from English-speaking home environments shared the patterns and constraints of their peers who came from Welsh-speaking homes (and who consequently had more opportunities to use Welsh and acquire sociolinguistic competence).
With respect to ongoing change, we can confirm what previous studies have suggested but also add new aspects to consider. For example, there is a significant difference in the rates of use of the colloquial variant between north and south Wales. Although rates of the colloquial variant were high throughout (58%), Gwynedd speakers used that variant 71% of the time and Cardiff speakers 48%. The overall high rates of the colloquial also serve to confirm that the use of this variant is increasing, especially in the younger age groups (which all our speakers belonged to). The effect of style, which had been previously discussed but not quantitatively analyzed, was confirmed in our analyses, with the most formal situation showing the lowest use of the colloquial form.
The high use of the colloquial variant underlines that, despite its exclusion from some dictionaries and grammars (aimed at L2 speakers) and its supposed nonstandardness, it is a part of the contemporary Welsh possessive pronoun system and, if the trend continues into future generations, may become the only variant. Currently, the literary and the sandwich variants are the only forms that tend to be discussed with Welsh learners. Our results suggest that it may be necessary to introduce the colloquial variant too, in order to improve the functional competence of students (see Auger, 2002, for a discussion of the use of colloquialisms in formal teaching in Canadian immersion). Even if it turns out that it is not used frequently in writing (our analysis did not examine this), it is a very frequent form in speech across contexts of differing formality. Future work could examine the speech of adult and later learners of Welsh, who may not be exposed to the more colloquial forms.
In terms of the sociolinguistic competence perspective, the Welsh speakers with EHL backgrounds show awareness of the fact that some variants are more appropriate in more formal contexts. They also demonstrate the ability to style-shift broadly similarly to their WHL peers. The other factors constraining the use of the colloquial variant are again shared across the groups. This is very different from earlier studies examining classroom acquisition of stylistic differences. It seems then that Welsh-medium classes where children with different backgrounds use Welsh together do enable children to acquire sociolinguistic competence.
There remain some differences between the groups, however; by and large, the EHL students view peer-to-peer settings, at least those of the type in our study, as more formal than WHL students do and consequently use less of the colloquial form there than in the sociolinguistic interview. It may be, however, that with more opportunities to use Welsh with peers in non-classroom settings, the EHL students would quickly shift to a more WHL pattern.
Competing interests
The authors declare none.