Do bilinguals get the joke? Humor comprehension in mono- and bilinguals

Emilia V. Ezrina; Virginia Valian

doi:10.1017/S1366728922000347


Published online by Cambridge University Press: 07 July 2022


Emilia V. Ezrina*
Affiliation: The City University of New York, Graduate Center, New York, USA; Hunter College, New York, USA
Virginia Valian*
Affiliation: The City University of New York, Graduate Center, New York, USA; Hunter College, New York, USA
*Address for correspondence: Emilia V. Ezrina or Virginia Valian, Ph.D., Department of Psychology, Hunter College, Room 510 TH, 695 Park Avenue, New York, NY 10065. Email: [email protected], [email protected]


Abstract

Understanding jokes may differ between mono- and bilinguals because of differences in lexical access; fluency and sense of humor may also be relevant. Three experiments examined English-language joke comprehension in monolingual (n = 91) and bilingual (n = 111) undergraduates, Russian–English bilinguals (n = 39), and MTurk monolinguals (n = 77). Participants rated jokes and non-jokes in English as funny or not funny. We assessed the effects of bilingualism, language dominance, fluency, sense of humor, experience, and motivation on response time (RT) and sensitivity (d′) in identifying jokes. Bilingualism predicted neither RT nor d′ in monolingual and English-dominant bilingual undergraduates; English fluency predicted d′. Russian-dominant bilinguals were slower than English-dominant bilinguals but were more, not less, sensitive to humor. MTurk monolinguals were faster than undergraduates and equally sensitive; sense of humor predicted sensitivity. Overall, humor processing is alternately affected by fluency, sense of humor, and motivation, depending on the population. Bilingualism per se is not a factor.

Keywords

bilingualism; humor processing; fluency; sense of humor

Type: Research Article
Information: Bilingualism: Language and Cognition, Volume 26, Issue 1, January 2023, pp. 95–111

DOI: https://doi.org/10.1017/S1366728922000347
Copyright: Copyright © The Author(s), 2022. Published by Cambridge University Press

Introduction

What does it take to understand a joke in the dominant and non-dominant language? What other factors, besides language, could affect joke processing? We know relatively little about the mechanism of humor processing and the role of individual differences both across and within groups. In this paper we investigate the effects of a) knowing more than one language, b) fluency in English, c) sense of humor, and d) inferred skill and motivation.

For a joke to be successful, the first part (the setup) must activate one meaning in a listener or reader, while the punch line (the funny part of the joke) activates a different, less expected meaning. Humor is thus a type of mental exercise, a problem-solving task in which listeners and readers first access one set of verbal representations and then access a different set (Attardo, 1994; López & Vaid, 2017; Suls, 1972). The task is to understand the second meaning of the joke without an explicit explanation.

This has been labeled the world's funniest joke (Wiseman, 2002):

Two hunters are out in the woods when one of them collapses. He doesn't seem to be breathing and his eyes are glazed. The other man whips out his phone and calls the emergency services. He gasps, “My friend is dead! What can I do?” The operator says, “Calm down. I can help. First, let's make sure he's dead.” There is a silence; then a gunshot is heard. Back on the phone, the guy says, “OK, now what?”

At first, we understand “Let's make sure he's dead” to mean ‘check to see if he is really dead’. However, the sound of the gun and the punchline, “OK, now what?”, tell us that the hunter interpreted the suggestion differently, as ‘shoot him so that he is definitely dead’. Understanding the joke requires understanding the situation – what can happen in hunting – but it also requires the reader to realize that “let's make sure he's dead” has more than one interpretation. Access to multiple meanings can vary among speakers, and bilinguals may or may not understand all the meanings that a monolingual does.

Cross-language interaction studies show that even the non-dominant language can affect the dominant one (see Kroll, Bogulski & McClain, 2012 for review). Thus, lack of fluency or familiarity can reduce access to some meanings in the non-dominant language, and the presence of the second language can reduce access in the dominant language. For example, since bilinguals speak each of their two languages less frequently than monolinguals speak one language, lexical connections are weaker, and lexical access may therefore be more difficult (Gollan & Acenas, 2004; Gollan, Slattery, Goldenberg, Van Assche, Duyck & Rayner, 2011). Or, since both of the bilingual's languages appear to be active and compete for selection at any given time, the activation of the second language (L2) may affect access to the first language (L1) (Green, 1998; Kroll, Bobb & Wodniecka, 2006; Meuter & Allport, 1999).

Depending on their background and motivation, monolinguals may also vary in what they find funny. Even if people understand that a joke is supposed to be funny, and understand why, they may still not find it funny. Not only do people vary in whether they find a joke funny, but they vary in how quickly they get the punchline. Jokes vary in how much esoteric knowledge they require.

A unified theory of humor would encompass not just cognitive, but also cultural, contextual, pragmatic, social, and motivational factors associated with understanding and appreciating humor (Hull, Tosun & Vaid, 2017; López & Vaid, 2017). Laughing at jokes is a pleasant and rewarding emotional activity as well as a cognitive one. Here, however, we focus on the cognitive aspect of verbal humor, using short one-liner jokes.

What makes jokes funny: Humor, cognition, and motivation

Since one of the main carriers of verbal humor is semantic incongruity, it can provide an important insight into bilinguals’ semantic access. Incongruity theories (Attardo & Raskin, 1991; Raskin, 1987; Suls, 1972) present humor processing as a two-stage task. The first stage consists of forming a prediction of the likely outcome of the setup. When that prediction is violated by the punch line, the second stage consists of understanding why the new, unpredicted meaning fits. Incongruity is not sufficient for a text to be funny (Giora, 1991). The second meaning needs to be much less accessible, or almost inaccessible, so that the listener will not project it as a possible continuation of the setup. Semantic access to multiple meanings precedes any potential incongruity detection and is therefore the primary factor in humor processing. This point is particularly relevant for the relation between humor and bilingualism. Not knowing all meanings of a word or having very slow access to word meanings would alter humor processing.

Empirical evidence demonstrates that processing jokes is harder than processing non-jokes. Eye-tracking data show that reading jokes produces more regressions and requires longer viewing times than reading non-jokes (Coulson, Urbach & Kutas, 2006), and the punchline receives more fixations than the last elements of non-jokes (Ozdemir & Uysal, 2016). ERP work attributes these processing difficulties to both semantic processing demands and surprise detection (Coulson & Kutas, 2001).

Verbal humor thus requires not only higher-level processing compared to neutral text, but also semantic access, which may vary across individuals. Additionally, some findings indicate that motivation may alter funniness ratings (Ayçiçeği-Dinn, Şişman-Bal & Caldwell-Harris, 2018), perhaps because more highly motivated individuals work harder to understand jokes. Another variable that might independently influence participants’ responses is sense of humor (Martin, Puhlik-Doris, Larsen, Gray & Weir, 2003). Humor and motivation might be related: having a good sense of humor might motivate a reader to look for, and hence find, a joke (Martin, 1996; Martin & Lefcourt, 1984).

Humor, cognition, and bilingualism

Since jokes require rapid lexical access to less accessible word meanings, bilinguals and second language learners may be at a disadvantage. Lower proficiency in a language could slow humor processing in that language, cause failure to understand the joke, or both. Even if a speaker's two languages are equally dominant, semantic representations in one may map imperfectly onto the representations in the other. In that case, accessed meanings could compete with each other, potentially resulting in eventually successful yet slower processing. Note, however, that low fluency on the part of a monolingual could also result in slower processing or failure to understand a joke.

Since, by hypothesis, humor entails incongruity resolution, a cognitive skill in which bilinguals sometimes show an advantage (Bialystok, 2009; Costa, Hernández & Sebastián-Gallés, 2008; Costa, Hernández, Costa-Faidella & Sebastián-Gallés, 2009), bilinguals might understand jokes faster than monolinguals.

Relatively little is known about bilinguals’ processing of humor. Bilinguals who are reading in their non-dominant language spend disproportionately more time reading jokes than non-jokes compared to monolinguals (Ozdemir & Uysal, 2016). Joke-reading thus appears to be more difficult for bilinguals, but whether bilinguals find the joke funny is unknown.

Therefore, another question arises: do bilinguals get the joke in their non-dominant language as well as in their dominant language? In general, lower proficiency L2 speakers find jokes funnier in their first language than in their L2 (Ayçiçeği-Dinn et al., 2018; Erdodi & Lajiness-O'Neill, 2012, except for L1-dominant English–Hungarian speakers). Joke rating in L2, but not in L1, tends to be associated with subjective ease of understanding and L2 proficiency (Ayçiçeği-Dinn et al., 2018), suggesting that linguistic competence affects understanding and appreciation of humor in L2. Besides proficiency, L2 humor appreciation may be related to psychological investment in the language: high-proficiency L2 speakers who were teachers and interpreters rated L2 jokes as funnier than L1 jokes (Ayçiçeği-Dinn et al., 2018), suggesting a role for motivation.

Psychological investment in a language may not be the only non-linguistic factor affecting humor processing. Expertise, motivation, and other individual difference variables may affect performance. For example, researchers often use MTurk as a convenient and fast method to recruit participants (as does this study). Non-naïveté of MTurk participants may improve their performance, because they are more practiced, skilled in responding quickly, and savvy about the structure of experiments (Aguinis, Villamor & Ramani, 2021; Harms & DeSimone, 2015; Hauser, Paolacci & Chandler, 2019). Additionally, since MTurkers receive payment and the desirable masters status (if they are “good” participants), completing more studies in less time at a high level is in their interest. Another individual difference variable is sense of humor. People who pride themselves on having a good sense of humor may be more attuned to and better able to see the (intended) humor in a joke. In this study we include MTurk participants and measure all participants’ sense of humor.

Humor can be considered an example of figurative language, along with metaphor, irony, and sarcasm (Vaid, 2000). Metaphors are better remembered in the dominant language (Harris, Friel & Mickelson, 2006) and are used more by translators who translate into their dominant language (Saygin, 2001). Furthermore, late non-balanced proficient bilinguals can have delayed access to figurative and ironic meanings and prioritize literal meanings in their L2 (Bromberek-Dyzman, 2015; Cieślicka, 2006; Matlock & Heredia, 2002).

In summary, bilingualism studies suggest humor processing differs between a speaker's dominant and non-dominant language (Ayçiçeği-Dinn et al., 2018; Erdodi & Lajiness-O'Neill, 2012; Ozdemir & Uysal, 2016). In the less dominant language, humor is processed more slowly (Ayçiçeği-Dinn et al., 2018; Ozdemir & Uysal, 2016) and at a shallower level. That ultimately results in subjective perception of jokes as less funny in L2 than L1 (Cieślicka, 2006; Matlock & Heredia, 2002). Whether that result is due to speaking multiple languages or to lack of fluency in the language of the joke is not clear. In addition, other variables, such as motivation and sense of humor, may interact with language status.

The present study

This study compares English joke comprehension among four groups of participants to determine the roles of English language proficiency (measured via picture-naming and a verbal fluency task), self-rated sense of humor, and, indirectly, motivation and skill. The four participant groups were: monolingual English-speaking college students, bilingual English-dominant college students, bilingual Russian-dominant adults recruited from Russia (all of whom were immersed in a Russian-speaking environment), and monolingual English-speaking Amazon Mechanical Turk (MTurk) adults.

Participants read 40 short passages in English via rapid serial visual presentation (RSVP) and rate each passage as ‘funny’ or ‘not funny’. We measure participants’ response time and sensitivity to the jokes (d′) and assess their sense of humor via a questionnaire based on the Humor Styles Questionnaire (Martin et al., 2003).

In Experiment 1, we compare the two college student groups in order to detect the role of knowing an additional language in two demographically similar groups, one monolingual and the other composed of English-dominant bilinguals. The latter might take longer to make their judgments than monolinguals because several meanings and semantic schemata are activated simultaneously through both languages (Ayçiçeği-Dinn et al., 2018; Kroll & Stewart, 1994), complicating semantic access. On the other hand, English-dominant bilinguals may be so skilled at suppressing their non-dominant language that they might either show no difference from monolinguals or even show advantages (Bialystok, 2009; Costa et al., 2008, 2009).

In Experiment 2 we assess the role of English dominance by comparing the English-dominant bilinguals from Experiment 1 with Russian-dominant bilingual adults. For Russian-dominant speakers the accessibility of meaning and semantic schemata is presumably weaker, and could result in longer reaction times and less understanding of jokes. Although our choice of jokes was intended as pan-cultural, to the extent that jokes are culture-bound, their meanings may be less accessible to speakers in another culture as well as language. Some bilinguals may be at a disadvantage regarding the emotional component of meaning (Caldwell-Harris & Ayçiçeği-Dinn, 2009; Marian & Kaushanskaya, 2008; Pavlenko, 2002; Rosselli, Vélez-Uribe & Ardila, 2017) and thus be less sensitive to humor in English.

Finally, in Experiment 3, we assess the importance of motivation and skill by comparing monolingual MTurk participants with the monolingual English students (from Experiment 1). We expect MTurk participants to be faster than college students, since the non-naïveté of MTurkers is a common concern (Aguinis et al., 2021; Hauser et al., 2018; Harms & DeSimone, 2015). Since both groups share the same culture, we expect the two groups to be equally sensitive to jokes.

General method

Materials

Jokes

The materials consisted of 40 jokes (the average length was 14.6 words) from various internet sources and 40 non-jokes (the average length was 13.9 words). The non-jokes replaced the punch lines with a neutral phrase, eliminating the humor, as the example below demonstrates. By choosing short jokes we could test a relatively large number of jokes and easily create non-funny versions by changing the punch line. We use items that were independently rated as funny, along with non-funny variants of those items. That allows us to tease apart sensitivity (whether people get the joke) and speed of processing (how long it takes to make the judgment).

Example:

Funny: I asked to switch seats on a plane because I was seated next to a crying baby. Apparently, that's not allowed if it's yours.

Not funny: I asked to switch seats on a plane because I was seated next to a crying baby. Apparently, that's not allowed if all the seats are taken.

The jokes were selected from a pool of 79 jokes culled from various sources. Six individuals (five of whom were native speakers of English) rated the humor of the jokes on a scale from 1 to 10, with 1 being not funny and 10 being funny. Jokes with a mean rating above 5 were selected for the experiments. Possibly offensive jokes and jokes containing cultural references were avoided. Forty jokes fitting our criteria were selected.

In the majority of jokes the humor was based on incongruity. However, some funny passages lacked obvious incongruity (e.g., “The biggest lie I was told in school was that I wouldn't always have a calculator with me”; “Just the thought of having insomnia keeps me awake at night”). Several jokes relied on violating expected semantic scripts, rather than engaging various meanings of a word. For example, the joke “A clean house is a sign of a broken computer” creates the following expectation: a clean house is a sign of neat people. The joke violates this expectation through the broken computer punch line, which replaces neat people with bored people. Semantic access, which can vary across individuals and particularly across bilinguals, is required to understand the humor of all jokes.

Comprehension was assessed with a group of Russian-dominant young adults (N = 5) who read 40 items (20 jokes and 20 non-jokes; no tester was presented with both the funny and the not funny version of the same item) and decided whether each item was funny. Their responses were not recorded because their funniness judgments were irrelevant to the assessment of grammar and vocabulary comprehension. The testers were asked whether they had any problems understanding particular words or grammar. We were able to determine that all our items could be understood, and, consequently, no items were removed from the set of 40. The final list, as well as the percent of correct responses by group, is provided in Appendix 1.

Each participant in the experiment proper read 40 items, 20 jokes and 20 non-jokes. For each participant, the 20 jokes were selected randomly and the non-jokes were the complement group. Thus, each item appeared only once per session in either the funny or not funny version. Each session included five practice trials.
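The item assignment just described can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation (the tasks themselves were written in JavaScript and TypeScript); the function name `assign_versions`, the integer item indices, the seed, and the shuffled presentation order are all assumptions.

```python
import random

def assign_versions(n_items=40, n_jokes=20, seed=None):
    """Randomly choose which items appear as jokes; the rest appear as non-jokes."""
    rng = random.Random(seed)
    items = list(range(n_items))
    joke_items = set(rng.sample(items, n_jokes))
    # Each item appears exactly once, in either its funny or its not-funny version.
    trials = [(item, "joke" if item in joke_items else "non-joke") for item in items]
    # Assumption: presentation order is shuffled; the source does not specify the order.
    rng.shuffle(trials)
    return trials

trials = assign_versions(seed=1)
```

Because the non-jokes are the complement of the randomly selected jokes, no participant can see both versions of the same item.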

Objective measure of English fluency

Participants’ fluency in English was measured by picture naming and semantic verbal fluency. The 36 pictures in the picture-naming task were taken from the Snodgrass and Vanderwart (1980) standardized picture set. Participants named the pictured items by typing the name in the appropriate field; in the scoring procedure, each correct response was credited 1 point. The total number of correct responses was used in the analysis. The scoring was carried out using Python code comparing each response to a set of anticipated potential responses. It accounted for potential typos and misspellings as well as alternative names for the pictured objects. The answer was scored correct if it matched one of the predefined answer alternatives.
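A minimal sketch of this kind of scoring is given below. It is not the authors' script: the function name, the picture identifiers, the normalization step (lowercasing and trimming whitespace), and the example answer sets are hypothetical and serve only to show how matching against predefined alternatives might work.

```python
def score_picture_naming(responses, accepted):
    """Count responses that match one of the predefined alternatives for their picture."""
    total = 0
    for picture_id, response in responses.items():
        # Assumed normalization: lowercase and collapse surrounding/internal whitespace.
        normalized = " ".join(response.lower().split())
        if normalized in accepted.get(picture_id, set()):
            total += 1  # each correct response is credited 1 point
    return total

# Hypothetical answer sets: a target name, an alternative name, and a misspelling.
accepted = {
    "pic01": {"couch", "sofa", "sopha"},
    "pic02": {"giraffe", "girafe", "giraff"},
}
responses = {"pic01": " Sofa ", "pic02": "girafe"}
score = score_picture_naming(responses, accepted)
```

Listing anticipated misspellings explicitly keeps the scoring transparent, at the cost of having to enumerate them in advance.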

In the verbal fluency task, participants named as many animals as they could in 60 seconds, by typing the words in the appropriate field; in the scoring procedure each response was credited 1 point. The total number of responses was used in the analysis. Scoring was carried out using code in Qualtrics that provided the count of all responses, assuming that each response was on a separate line. The cases containing more than one word per line or an empty line were assessed manually. The coding of this task thus used extremely liberal criteria.

For further analysis, we calculated a composite measure of fluency by combining the picture naming and verbal fluency scores, based on the positive correlation between the two measures (r = .35 (Pearson), p < .001). First, we normalized each scale using z-scores and calculated each participant's z-score in each task. Then, we averaged those values to obtain a single value representing objective fluency for each participant.
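The composite can be illustrated with the short Python sketch below. The function names and the three-participant data are hypothetical, and the use of the sample standard deviation is an assumption, since the source does not say which variant was used.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize a list of scores (sample standard deviation; an assumption)."""
    m, sd = mean(values), stdev(values)
    return [(v - m) / sd for v in values]

def composite_fluency(picture_naming, verbal_fluency):
    """Average each participant's two z-scores into one fluency value."""
    z_pn = z_scores(picture_naming)
    z_vf = z_scores(verbal_fluency)
    return [(a + b) / 2 for a, b in zip(z_pn, z_vf)]

# Hypothetical raw scores for three participants.
composite = composite_fluency([30, 33, 36], [15, 20, 25])
```

Standardizing first puts the two tasks on a common scale, so that the task with the larger raw range does not dominate the average.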

Language background assessment

To assess the demographic and language background, participants completed a questionnaire based on the LHQ (Li, Zhang, Tsai & Puls, 2014) and LEAP-Q (Marian, Blumenfeld & Kaushanskaya, 2007). First, participants listed the languages they knew in order of dominance. Then, participants rated their subjective fluency in listening, speaking, reading, and writing in English and other languages on a scale of 1–7; the scores for each skill were averaged. If participants’ average scores in their second language were 3.5 or higher, they were classified as bilingual. The purpose of this assessment was to establish the language status of the participants: bilingual or monolingual, as well as language dominance (the language with the highest average score was classified as dominant). In all cases the participant's indicated order of dominance coincided with dominance assessed through the self-rating. Only this information about bilingualism and dominance was included in the subsequent analysis. In addition, participants provided information about their age, gender, and handedness.
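The classification rule can be sketched as follows. The function name, the data layout, and the example ratings are hypothetical; treating the language with the second-highest average as the "second language" is also an assumption, as is the handling of exact ties (the first-listed language wins).

```python
def classify_language_status(ratings, threshold=3.5):
    """ratings maps language -> {skill: self-rating on the 1-7 scale}."""
    averages = {lang: sum(skills.values()) / len(skills)
                for lang, skills in ratings.items()}
    # Rank languages by average self-rating; the highest is taken as dominant.
    ranked = sorted(averages, key=averages.get, reverse=True)
    dominant = ranked[0]
    second_avg = averages[ranked[1]] if len(ranked) > 1 else None
    is_bilingual = second_avg is not None and second_avg >= threshold
    return ("bilingual" if is_bilingual else "monolingual"), dominant, averages

# Hypothetical participant.
ratings = {
    "English": {"listening": 7, "speaking": 7, "reading": 7, "writing": 7},
    "Spanish": {"listening": 5, "speaking": 4, "reading": 3, "writing": 2},
}
status, dominant, averages = classify_language_status(ratings)
```

In this example the Spanish average is exactly 3.5, so the participant falls on the bilingual side of the cutoff.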

Subjective assessment of sense of humor

Participants filled out a six-question sense of humor questionnaire (see example below; refer to Appendix 2 for full questionnaire) based on the Humor Styles Questionnaire (Martin et al., 2003) and modified to maximize ease of understanding. Participants used a 0–100 slider scale, anchored at one end with ‘strongly disagree’ and at the other end with ‘strongly agree’. The scale was not visibly numbered. The average of the six questions was used for a humor score.

Example:

I am a naturally funny person – I easily make other people laugh.

Procedure

Tasks were administered as a part of a more extended procedure in which participants also performed three executive function tasks for a different study not reported here. The tasks were presented in the following order: joke judgment task, humor questionnaire, executive function tasks, picture naming, verbal fluency, and language history questionnaire. All tasks were completed on-line in English. The entire procedure took between 40 and 60 minutes. All tasks were implemented as a client-side web application using JavaScript and TypeScript. The humor and the language history questionnaires were administered through Qualtrics.

Stimuli were displayed word by word using the Rapid Serial Visual Presentation (RSVP) method at a (non-rapid) rate of four words per second in the center of the computer screen in large black font (Potter, 1993). Thus, all participants received the same presentation speed, had no control over the rate at which each word was presented, and could not backtrack. The paradigm ensured that all participants read at the same rate; it prevented long saccades and regressions. Participants saw each trial and each word only once. Each trial was preceded by a fixation cross for 1000 ms, followed by a joke or its non-joke equivalent. At the end of a trial the participants rated the item as ‘funny’ or ‘not funny’ by pressing the “p” or the “q” keys on the keyboard, respectively. The participants were instructed to respond as fast as they could.

Each response and response time (i.e., time from the last word of the phrase onset until the key press) was recorded in a data frame. Response times and accuracy were used in subsequent analyses. Sensitivity was assessed through d′ – a function of the number of correct and incorrect answers the participant gave in the joke and non-joke conditions – hits, correct rejections, false alarms, and misses. A ‘hit’ occurred if the trial contained a joke and the participant responded that it was funny; a ‘miss’ occurred if the participant responded that the joke was not funny; a ‘false alarm’ occurred if a participant rated a non-funny trial as funny; and a ‘correct rejection’ occurred if the participant rated a non-funny trial as not funny. Our use of signal detection theory in non-objective detection tasks follows existing research. For example, signal detection theory has been used in judgments of grammatical acceptability (Huang & Ferreira, 2020), sensitivity to gender cues (Mitrofanova, Urek, Rodina & Westergaard, 2021), and referral rates of general practitioners (Kostopoulou, Nurek, Cantarella, Okoli, Fiorentino & Delaney, 2019).
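For readers unfamiliar with d′, the sketch below computes it from the four response counts using the standard signal-detection definition, d′ = z(hit rate) − z(false-alarm rate). The source does not state the exact formula or how extreme rates were handled, so the log-linear correction used here (adding 0.5 to each count and 1 to each total) is an assumption, as are the function name and the example counts.

```python
from statistics import NormalDist

def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index: z(hit rate) - z(false-alarm rate)."""
    n_jokes = hits + misses
    n_non_jokes = false_alarms + correct_rejections
    # Assumed log-linear correction: keeps both rates strictly between 0 and 1,
    # so the inverse normal CDF is always defined.
    hit_rate = (hits + 0.5) / (n_jokes + 1)
    fa_rate = (false_alarms + 0.5) / (n_non_jokes + 1)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)

# Hypothetical participant: 14 of 20 jokes rated funny, 4 of 20 non-jokes rated funny.
sensitivity = d_prime(hits=14, misses=6, false_alarms=4, correct_rejections=16)
```

A participant who responds "funny" equally often to jokes and non-jokes gets d′ = 0; larger positive values indicate better discrimination between the two.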

Data analysis

Data cleaning for reaction time (RT) removed responses below 200 ms and above 8000 ms (2.5% of data) as well as incorrect answers, so that only accurate responses were included in the RT analysis. Accuracy ranged from 65–71% for jokes and 78–84% for non-jokes across groups. Given the subjectivity of humor and the absence of any “norms” for jokes, we consider this degree of accuracy acceptable. Most participants got most of our jokes and rejected most of our non-jokes; we achieved the desired effect from humor. The one-way ANOVAs comparing hit and correct rejection rates across groups were not significant (F(3, 314) = 1.2, p = .31 and F(3, 314) = 2.24, p = .08, respectively).

Data cleaning for d′ removed overly short or long responses but included both accurate and inaccurate responses. Inaccurate responses are necessary for d′ analysis.
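The two cleaning rules can be summarized in a small Python sketch. The trial representation (dictionaries with `rt_ms` and `correct` keys), the function name, and the example trials are hypothetical; the 200 ms and 8000 ms cutoffs are taken from the text and treated here as inclusive bounds, which is an assumption.

```python
def clean_trials(trials, for_dprime=False, lower_ms=200, upper_ms=8000):
    """Drop overly short or long responses; drop incorrect ones only for RT analysis."""
    kept = [t for t in trials if lower_ms <= t["rt_ms"] <= upper_ms]
    if not for_dprime:
        # RT analysis uses accurate responses only.
        kept = [t for t in kept if t["correct"]]
    return kept

# Hypothetical trials.
raw = [
    {"rt_ms": 150, "correct": True},    # too fast: removed from both analyses
    {"rt_ms": 900, "correct": True},
    {"rt_ms": 1200, "correct": False},  # kept for d' only
    {"rt_ms": 9000, "correct": True},   # too slow: removed from both analyses
]
rt_trials = clean_trials(raw)
dprime_trials = clean_trials(raw, for_dprime=True)
```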

For both RT and d′ we fitted hierarchical regression models to determine whether handedness, age, gender, fluency score, humor score, and language group were significant predictors. This analysis was carried out separately for the three language group pairs: English-dominant bilingual and monolingual college students residing in the US (Experiment 1); English-dominant US bilinguals and non-English-dominant bilinguals who were native speakers of Russian residing in Russia (Experiment 2); monolingual college students and monolinguals from MTurk residing in the US (Experiment 3).

Descriptive statistics of raw data (means and standard deviations, as well as percentages for categorical variables) for the independent and dependent variables of interest are provided in Table 1. For all subsequent analyses we used log-transformed response time values. This method avoided any potential effects of skewed distributions. The accepted level of significance for all analyses is α = .05.

Table 1. Means (SD) for Reaction Time, d′, Fluency, and Humor in the Four Experimental Groups

Experiment 1

Method

Participants

Initially, 195 participants were recruited from the undergraduate subject pool in the expectation of a sample of appropriate size. Since the sample did not contain enough monolinguals, an additional group of 72 monolingual undergraduate college students from the same pool was recruited, for a total of 267 initial participants. Data of 202 participants were used for further analysis. The excluded participants a) failed to complete the entire task, b) experienced software failure, or c) were non-English-dominant bilinguals. (This experiment specifically targeted English-dominant bilinguals.) All participants reside in the US and are immersed in English.

Participants were divided into monolingual and bilingual groups based on self-assessments of their proficiency in listening, speaking, reading, and writing in their language(s). The participants rated each ability on a scale from 1 to 7, 1 being not fluent at all, and 7 being native. The average score was then used as the criterion for inclusion in the bilingual group. If the average across the four abilities in an individual's L2 was lower than 3.5, the individual was categorized as functionally monolingual. Data of 91 English native speaking monolinguals (35 male, 56 female) and 111 bilinguals (37 male, 74 female) were included. Of the bilinguals, 96 were English-dominant and 16 were lifelong balanced bilinguals. The latter were included because they are immersed in English and are expected to perform on a par in daily life with English-dominant bilinguals. Bilingual participants spoke a variety of second languages; most numerous among them were Spanish (N = 54), Chinese, Mandarin, or Cantonese (N = 27), and Bengali (N = 18). All but eight were immersed in an English academic environment at least since middle school; the eight were immersed in an English academic environment in high school or college. The mean age was 21 (SD = 4.7) for the monolingual group and 20 (SD = 3) for the bilingual group. All students received course credit for their participation, completed the tasks voluntarily, and signed an online informed consent form.

Results

For descriptive statistics for these groups, refer to Table 1. Only correct response times between 200 and 8000 ms were entered into the RT analysis. The overall correct response rate across jokes and non-jokes was 73% for monolinguals and 73% for bilinguals.

Four hierarchical regressions were separately calculated for response time and d′ as dependent variables. The first model included age, gender, and handedness as predictors, the second model added fluency, the third model added humor scores, and the fourth model added language group.

Response time

As shown in Table 2, no model for response time was significant, no model had any significant predictors, and no model accounted for any appreciable variance. Monolingual and bilingual college students responded equally quickly to the jokes and non-jokes. Model 1 yielded a significant intercept (p < .001), with adjusted R² = −.001 (F(3, 198) = .9, p = .44); Model 2: F(4, 197) = .7, p = .60, adjusted R² = −.006; Model 3: F(5, 196) = .77, p = .57, adjusted R² = −.006; Model 4: F(6, 195) = .9, p = .58, adjusted R² = −.006. (A negative adjusted R² is possible when R² is very small relative to the number of predictors.)
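The remark about negative adjusted R² can be checked with the standard formulas for an ordinary least-squares model with an intercept. The sketch below recovers R² from the reported overall F statistic for Model 1 and then applies the usual adjustment. The function names are hypothetical, and because the reported F is rounded, the result is only approximate.

```python
def r_squared_from_f(f_value, df_model, df_resid):
    """Recover R^2 from an overall F test: F = (R^2 / k) / ((1 - R^2) / df_resid)."""
    return (f_value * df_model) / (f_value * df_model + df_resid)


def adjusted_r_squared(r2, n, k):
    """Standard adjustment: 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Model 1 for response time, as reported: F(3, 198) = .9, so k = 3 and
# n = 198 + 3 + 1 = 202, which matches the number of analyzed participants.
r2 = r_squared_from_f(0.9, df_model=3, df_resid=198)
adj = adjusted_r_squared(r2, n=202, k=3)
print(f"R^2 = {r2:.4f}, adjusted R^2 = {adj:.4f}")
```

With an R² this close to zero, the penalty term (n − 1)/(n − k − 1) pushes the adjusted value slightly below zero, which is in line with the small negative values reported.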

Table 2. Estimates (St. Errors) of Hierarchical Regressions for response time and d′ in American Mono- and Bilingual Groups.

‘*’, ‘**’, and ‘***’ indicate levels of significance at α = .05, α = .01, and α = .001 respectively.

d′

As shown in Table 2, Model 1 for d′ yielded only a significant intercept, with adjusted R² = .009 (F(3, 198) = 1.61, p = .19). Model 2, which added fluency, was significant (F(4, 197) = 2.96, p = .02, adjusted R² = .04); fluency was a significant predictor (t(197) = 2.6, p = .01). Model 3, which added humor score, was significant (F(5, 196) = 2.90, p = .015, adjusted R² = .05); only fluency was significant (t(196) = 2.66, p = .008). Model 4, which added language group, was significant (F(6, 195) = 2.4, p = .03, adjusted R² = .04); again, only fluency was significant (t(195) = 2.7, p = .008).

Model 2, which included fluency, significantly improved upon Model 1 (F(2, 197) = 6.87, p = .01). Model 3 did not differ from Model 2 (F(3, 196) = 2.55, p = .11), and Model 4 did not differ from Model 3 (F(4, 195) = .12, p = .73). Thus, neither sense of humor nor language status contributed to explaining variance above and beyond fluency.
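The step-to-step comparison in a hierarchical regression is an F test on the change in R² between nested models. The sketch below uses the textbook formula and, as an illustration, recovers the R² values for the d′ Models 1 and 2 from their reported overall F statistics. The function names are hypothetical, the inputs are rounded values from the text, and the returned degrees of freedom follow the textbook convention (number of added predictors, residual df of the fuller model).

```python
def r_squared_from_f(f_value, df_model, df_resid):
    """Recover R^2 from an overall F test of an OLS model with intercept."""
    return (f_value * df_model) / (f_value * df_model + df_resid)


def f_change(r2_reduced, r2_full, n, k_full, q):
    """F test for adding q predictors to a nested model.

    n: number of observations; k_full: predictors in the fuller model.
    Returns (F, numerator df, denominator df).
    """
    df_resid = n - k_full - 1
    f = ((r2_full - r2_reduced) / q) / ((1 - r2_full) / df_resid)
    return f, q, df_resid


# d' models as reported: Model 1 F(3, 198) = 1.61; Model 2 F(4, 197) = 2.96
r2_m1 = r_squared_from_f(1.61, 3, 198)
r2_m2 = r_squared_from_f(2.96, 4, 197)
f, df1, df2 = f_change(r2_m1, r2_m2, n=202, k_full=4, q=1)
print(f"F change = {f:.2f} on ({df1}, {df2}) df")
```

Run on these rounded inputs, the statistic comes out close to the reported value of 6.87 for the Model 1 versus Model 2 comparison.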

Discussion

Our comparison of monolingual and bilingual college students showed no response time difference in joke detection, nor did any of our measures account for reaction times. Thus, none of age, sex, handedness, fluency, humor sensitivity, or language group variables played a role in accounting for the variance in response time.

Although fluency did not affect response time, it did predict sensitivity: high fluency was associated with high d′. It may be surprising that fluency played a role, since all participants were college students enrolled in English-language courses. The fluency range was, however, rather large; bilinguals had apparently lower fluency than monolinguals on average (.08 and .20, respectively), but the difference was not significant (t(198.7) = −1.7, p = .07). The association between fluency and sensitivity might reflect a common mechanism – semantic processing – underlying humor comprehension, picture naming, and verbal fluency. No other variables, including language group, predicted differences in d′.

Our methods allowed us to distinguish between the importance of being bilingual and the importance of being fluent in English. The contribution of fluency suggests that fluency, rather than bilingual status by itself, contributes to joke comprehension. It further suggests that whatever semantic complexity is added by knowing more than one language does not automatically interfere with joke comprehension. Bilingualism neither contributes to nor detracts from joke comprehension in the dominant language. Sensitivity to jokes in English on the part of monolinguals and English-dominant bilinguals appears to be affected only by English fluency. Our results suggest that a bilingual's non-dominant language is either not accessed during English comprehension or is accessed with little cognitive cost.

Based on the results of Experiment 1, in which response time was not influenced by any measured variable, and sensitivity was affected only by fluency, we conclude that sensitivity and response time involve separate mechanisms.

Experiment 2

The lack of difference between English-speaking monolinguals and English-dominant bilinguals might simply reflect the fact that English dominance and living in the United States made our humor easy to detect. Accordingly, we sought a group of participants for whom English was not a dominant language. Experiment 2 examines humor detection in native Russian speakers living in Russia for whom English was a second language. If the US participants’ sensitivity was due to their facility with English and their residence in the US, then Russians should perform less well than American monolinguals, responding more slowly and getting the joke less often.

Method

Participants

Sixty-five Russian native speaker bilinguals and multilinguals participated in the study. The participants were recruited through personal contacts and social media. All but two participants were Russian native speakers (one was a balanced Russian–Ukrainian bilingual, and one was a Russian–English bilingual). All participants were proficient in English and immersed in the Russian language. Data of 26 participants were excluded as incomplete cases, leaving data from 39 participants (23 female, 16 male). The average age of the participants was 26.4 (SD = 5.6). The participants received no compensation, completed the tasks voluntarily, and signed an online informed consent form.

Results

Descriptive statistics (means and standard deviations) for response time and d′ are provided in Table 1. Only correct responses were included in the response time analysis. The rate of correct responses for the Russian participants was 77%. Data from the bilingual participants from Experiment 1 are included for comparison. We compared the Russian bilinguals with the bilingual group of undergraduate students in Experiment 1, using the same four hierarchical models.

Response time

All models were significant (unlike Experiment 1, where no models were significant). Model 1 was significant (F(3, 146) = 4.541, p = .004, adjusted R² = .07). Age was significant (t(146) = 3.4, p = .0009), with older individuals having longer response times (Pearson r = .29, p = .0003). Model 2, adding fluency, was significant (F(4, 145) = 5.3, p = .0006, adjusted R² = .10). Both age (t(145) = 2.33, p = .02) and fluency (t(145) = −2.641, p = .01) were significant, with lower fluency associated with longer response times. Model 3, adding humor, was significant (F(5, 144) = 4.25, p = .001, adjusted R² = .10). Age and fluency remained significant (t(144) = 2.31, p = .022, and t(144) = −2.53, p = .01, respectively), but humor did not contribute. Finally, although Model 4, which added language group, was significant (F(6, 143) = 3.65, p = .002, adjusted R² = .10), there were no significant predictors. A follow-up ANOVA showed that Model 2 – which included fluency – differed significantly from Model 1 (F(2, 145) = 6.79, p = .01). Models 2 and 3 did not differ (F(3, 144) = .32, p = .57), nor did Models 3 and 4 (F(4, 143) = .73, p = .39). These results show that response time in bilinguals is related to fluency, not language group.

d′

For d′, Model 1 was not significant (F(3, 146) = 2.37, p = .08, adjusted R² = .03) but showed a marginally significant effect of age (t(146) = 1.97, p = .051), with older individuals showing more sensitivity to humor. Model 2 was significant (F(4, 145) = 3.69, p = .007, adjusted R² = .07); all predictors except gender were significant (age t(145) = 2.83, p = .005; handedness t(145) = −2.11, p = .037, with higher d′ for left-handed individuals; fluency t(145) = 2.73, p = .007). Model 3 was significant (F(5, 144) = 3.20, p = .01, adjusted R² = .07), but the addition of humor did not improve the model (age t(144) = 2.84, p = .005; handedness t(144) = −2.20, p = .03; fluency t(144) = 2.59, p = .01). Model 4 was significant (F(6, 143) = 4.2, p = .0006), with adjusted R² = .11. Both fluency (t(143) = 3.7, p = .0003) and language group (t(143) = −2.93, p = .004) were significant. Unexpectedly, the Russian participants had significantly higher d′ values than American college students. Sense of humor was not a significant predictor.

Follow-up ANOVAs showed that Model 2 differed significantly from Model 1 (F(2, 145) = 7.85, p = .006). Model 3 did not differ from Model 2 (F(3, 144) = .54, p = .26). Model 4 improved significantly on Model 3 (F(4, 143) = 8.59, p = .004). The difference between the bilingual American college students and the Russians was not confined to fluency; (lack of) fluency detracted from Russians’ d′, while something about being Russian that is not captured by fluency contributed to their being more sensitive than Americans. The hierarchical regressions for Experiment 2 are summarized in Table 3.

Table 3. Estimates (St. Errors) of Hierarchical Regressions for Response Time and d′ in Russian and American Bilingual Groups.

‘*’, ‘**’, and ‘***’ indicate levels of significance at α = .05, α = .01, and α = .001 respectively.

We found no evidence for a speed-accuracy trade-off. The analysis showed a significant negative correlation (Pearson r (37) = −.61, p < .001) between response time and d′, suggesting lower sensitivity is associated with longer time. Thus, Russians take longer, but their greater response time is not directly linked to their greater accuracy.
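The logic of this check can be sketched with a plain Pearson correlation. A speed-accuracy trade-off would appear as a positive correlation between response time and d′ (slower participants being more sensitive); a negative correlation points the other way. The data below are made up purely for illustration, and the function name is hypothetical. For the sample in the text, the degrees of freedom are n − 2 = 39 − 2 = 37.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical per-participant values (NOT the study's data)
mean_rt_ms = [2100, 2600, 3000, 3400, 4100]
d_primes = [1.9, 1.6, 1.5, 1.1, 0.8]

r = pearson_r(mean_rt_ms, d_primes)
# r > 0 would be consistent with a trade-off; r < 0 is not
print("trade-off pattern" if r > 0 else "no trade-off pattern", round(r, 2))
```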

Discussion

Analysis of the response time data in English-dominant and Russian-dominant bilinguals suggests that longer response time for Russians is primarily due to their lower English fluency, as measured by picture naming and verbal fluency tasks. For the American bilingual participants, English was either the dominant language or co-dominant language. In contrast, the Russians’ English experience was limited. The contribution that bilingualism makes to response time seems accounted for by fluency; once fluency is in the mix, participant group is not a predictor.

But d′ was different. Russians – surprisingly – were more sensitive to our jokes than Americans were. In this experiment, as in Experiment 1, English fluency was a significant predictor of d′. Although Americans as a group were more fluent than Russians, Russians nevertheless outperformed American bilinguals. Our Russian participants may have been especially interested in the study or especially highly motivated participants. The Russians’ participation was voluntary and uncompensated, suggesting genuine interest and desire to help research. We excluded a higher percentage of Russians as incomplete cases than of any other group; the remaining Russians may have been particularly highly motivated performers. Highly motivated participants in some studies have shown strong performance (e.g., Ayçiçeği-Dinn et al., 2018).

One might be tempted to invoke the stereotype of Russians as using humor and satire, euphemisms, and double-speak as playing a role in Russians’ sensitivity to our jokes. Note, however, that Russians did not score higher than Americans on their self-assessed sense of humor, and sense of humor did not play a role in accounting for d′ differences. There is thus no empirical support for this speculation and no evidence that Russians in particular look for double meanings in foreign texts.

Taken together, Experiments 1 and 2 suggest that bilinguals indeed get the joke. Bilingualism neither facilitates nor interferes with humor processing in a systematic way. American young adult bilinguals who are (co-)dominant in English are as fast and as sensitive to humor as American young adult monolinguals. Semantic fluency, as measured by picture naming and verbal fluency, is a significant variable in accounting for getting the joke within the American participants.

As would be expected, Russian bilinguals are slower than American bilinguals, a difference accounted for by Russians’ lower fluency. But, although slower than American bilinguals, Russians are more, not less, sensitive to our jokes, even though their lower fluency was a negative contributor. Our jokes are thus not bound to American culture, even though they were taken from American sources. Some property of Russians other than fluency and sense of humor contributes to their superiority in getting the joke.

Experiment 3

Experiment 2 compared two different bilingual groups. In Experiment 3 we compare two different monolingual groups in order to further assess the possible roles of humor and motivation to do a good job and to assess skill at being a participant. The first group is the monolingual undergraduate college students from Experiment 1. They participated in our study for course credit. The participants in the second group were recruited from the Amazon Mechanical Turk (MTurk), who may be highly motivated to perform quickly (in order to make more money faster) and well (in order to achieve status as “good” subjects). MTurk subjects are known to be non-naïve to a variety of experimental procedures, having been involved in many studies. This could also increase their performance. Thus, Experiment 3 is intended to explore the role of factors that are not related to language.

Method

Participants

One hundred and three participants were recruited through Amazon Mechanical Turk (MTurk). The goal was to recruit monolinguals only; 19 participants who were not functionally monolingual were excluded from analysis. Out of the remaining 84 participants, data of an additional five were excluded as incomplete cases. The data of 77 participants (43 female, 34 male) were analyzed. All remaining participants were monolingual English speakers resident in the US. The average age of the participants was 37 (SD = 10.5). All participants received $6.50 for completing the tasks, completed the tasks voluntarily, and signed an online informed consent form.

Results

Descriptive statistics (means and standard deviations) for response time and d′ are provided in Table 1. Only responses for correct answers entered the analysis. The rate of correct responses for the MTurk group was 77%.

Since this group consisted of monolinguals only, it was compared with the monolingual undergraduate student group using the same hierarchical models.

Response time

For response time, Model 1 was significant (F(3, 164) = 6.11, p = .0006, adjusted R² = .08). Age was significant, with, surprisingly, older age predicting faster responses (t(164) = −3.95, p = .0001). Model 2 was significant (F(4, 163) = 4.66, p = .001, adjusted R² = .08), with age again a significant negative predictor (t(163) = −3.91, p = .0001); fluency did not play a role. Model 3 was significant (F(5, 162) = 3.96, p = .002) with adjusted R² = .08; it again yielded a significant effect of age (t(162) = −3.95, p = .0001); neither fluency nor humor played a role. Model 4 was significant (F(6, 161) = 9.12, p < .001) with adjusted R² = .23: MTurkers were significantly faster than our monolingual college student participants (t(161) = 5.59, p < .001); right-handers were (surprisingly) significantly faster than left-handers (t(161) = −2.34, p = .02). In addition, gender emerged as a significant predictor (t(161) = 1.9, p = .05), with women responding faster than men.

A follow-up ANOVA showed that Models 1 and 2 did not differ from each other (F(2, 163) = .047, p = .50), nor did Models 2 and 3 (F(3, 162) = 1.37, p = .24); Model 4 was significantly different from the others (F(4, 161) = 31.21, p < .001), and accounted for the most variance.

d′

For d′, Model 1 was significant (F(3, 164) = 6.85, p = .0002, adjusted R² = .10). There was a significant effect of age, with older participants more sensitive than younger ones (t(164) = 4.26, p < .001). Model 2 was also significant (F(4, 163) = 5.56, p = .0003), with adjusted R² = .10. Age was again significant (t(163) = 4.14, p < .001); fluency did not play a role. Model 3 was significant (F(5, 162) = 7.30, p < .001), with adjusted R² = .16. There were significant effects of age (t(162) = 4.17, p < .001) and humor score (t(162) = 3.55, p = .0005). Model 4 was also significant (F(6, 161) = 6.05, p < .001) with adjusted R² = .15. There were again significant effects of age (t(161) = 3.0, p = .003) and humor score (t(161) = 3.51, p = .0006), but no effect of group.

Models 1 and 2 did not differ significantly (F(2, 163) = 1.72, p = .19), while Model 3 explained more variance than did Model 2 (F(3, 162) = 12.6, p = .0005), showing the importance of humor in this comparison. Model 4 did not differ from Model 3 (F(4, 161) = .01, p = .9), indicating that adding the group predictor did not explain more variance than age and humor. The hierarchical regressions for Experiment 3 are summarized in Table 4.

Table 4. Estimates (St. Errors) of Hierarchical Regressions for Response Time and d′ in MTurkers and Monolingual College Student Groups.

‘*’, ‘**’, and ‘***’ indicate levels of significance at α = .05, α = .01, and α = .001 respectively.

Discussion

The third experiment, comparing two groups of monolinguals, shows that monolinguals from MTurk are remarkably faster than undergraduate student monolinguals, despite their older age. The response time difference cannot be explained by language status, since both groups are monolingual. Neither fluency nor sense of humor predicted response time. MTurkers’ speed presumably reflects their greater practice with on-line tasks and their motivation to finish a task quickly in order to be paid. (We return to these points in the General Discussion.)

MTurkers and our monolingual college students did not differ in sensitivity, although MTurkers had a numerically higher d′. Instead, both age and sense of humor predicted sensitivity. MTurkers were on average older than our college students, with equivalent humor questionnaire scores. In this combined group of monolinguals, sense of humor and age provided an edge. Sense of humor did not emerge in comparisons between mono- and bilinguals (Experiment 1) or between different groups of bilinguals (Experiment 2), only here in Experiment 3. In addition, fluency did not predict sensitivity in this experiment, unlike in Experiments 1 and 2. That is presumably because, among monolinguals, fluency is less of an issue.

General discussion

This study explored different factors that account for joke comprehension and humor processing. In three experiments, two monolingual and two bilingual groups read and rated jokes and non-funny sentences. The results demonstrate that different factors contribute to a different extent in different groups. Bilingualism can be a factor, but it can interact with or be overshadowed by other variables, depending on the groups that are being compared. Here, depending on the experiment and the measure (response time or d′), age, handedness, fluency, sense of humor, language status, and inferred motivation could all influence participants’ performance. Table 5 summarizes our findings, providing adjusted R² values and lists of predictors for the model that accounted for most variance.

Table 5. Summary of Regression Results

If humor is approached as a cognitive and motivational exercise of incongruity resolution, we might expect to find different contributions of factors like age (Ruch, McGhee & Hehl, 1990; Ruch, 2008; Schaier & Cicirelli, 1976), language fluency (Bell, 2011; Bell & Attardo, 2010), and sense of humor as a function of the groups that are being compared. Most tasks require the integration of a number of different cognitive processes as well as motivation to perform well (Valian, 2015). Performance on any task is the outcome of the joint operation of many factors. Depending on the characteristics of the groups that are being compared, one or another cognitive property may come to the fore as playing an explanatory role. The experiments described here demonstrate the multiplicity of factors that may be implicated in responding to a joke.

In the first experiment, we compared a monolingual and a bilingual group from the same population pool – undergraduate psychology students. We found no response time or sensitivity (d′) differences between the two groups. The bilinguals – who were largely English dominant – did not suffer from having another language available. Apparently, participants’ L2 neither impaired nor helped humor processing. This shows equally strong connections between English words and their meanings for monolinguals and English-dominant bilinguals.

Fluency was a significant predictor of d′ in both comparisons involving monolingual and bilingual participants. That suggests that the same mechanisms might support semantic and lexical access (picture naming and verbal fluency) and humor processing. A regression analysis including age, handedness, fluency, and sense of humor, run on each group separately, showed that fluency was a significant predictor for bilinguals (t(107) = 2.9, p = .008), but not for monolinguals (t(87) = 1.1, p = .29). This finding suggests that semantic access may work differently for individuals with high and low fluency. Thus, for sensitivity to humor, fluency may or may not play a role when more than one language is involved.

It is worth noting the diverse linguistic background of the bilingual participants in Experiment 1. They spoke a plethora of second languages, including but not limited to Spanish, Chinese, Bengali, French, German, Punjabi, Russian, Japanese, Korean, Albanian, Tagalog, etc. This diversity introduces an unmeasured level of noise in the data. However, it reflects the demographic situation in New York City and thus contributes to the ecological validity of this study.

Based on existing research (Ayçiçeği-Dinn et al., 2018; Ozdemir & Uysal, 2016; also see Pavlenko, 2002, for a review of bilingualism and emotions), we hypothesized that bilinguals would process humor differently in L1 and in L2. We thus expected a difference between our U.S. bilinguals, who are English-dominant, and our Russian bilinguals, who are native Russian speakers and Russian-dominant. For Russians, lexical and semantic access in English could be slower or mediated by Russian (Kroll & Stewart, 1994). Therefore, we expected slower reaction times overall from Russians, which we found, and fluency was the main determinant. We also expected lower sensitivity, but that hypothesis was not supported. Instead, Russians were superior to our bilingual U.S. students.

Fluency was also a predictor for d′ among U.S. and Russian bilinguals. Our fluency tasks were designed to measure semantic access. As we conceptualized fluency, it is a combination of being able to name pictures quickly and accurately and being able to generate a set of animal names in 60 seconds. Both require semantic access. A higher fluency score was associated with higher d′, suggesting that non-English language dominance may impede semantic and conceptual access in addition to slowing down the processing.

Nonetheless, the fluency effect was overridden by the effect of language – or nationality – group. Russians were unexpectedly more sensitive to humor, despite their lower proficiency. Russian bilinguals understood our jokes as well as or better than English native speakers, even though college students were the group on which we validated the humor differences between our funny and not funny versions, and even though the Russian participants required more time to access the meaning of the sentences.

If we look at the response time data only, Russians’ longer response time shows reduced semantic access, and their lower proficiency scores corroborate this explanation. Yet the sensitivity data paint a different picture. Russians find our jokes funnier than do college students, despite the fact that the jokes are in their non-dominant, less fluent language. Furthermore, the Russians’ high d′ scores suggest successful semantic access. In order to activate the expected and the unexpected meanings in the setup and punch line of the joke, and thus detect and resolve semantic incongruity, semantic access is required.

One possible resolution is that slower semantic access does not entail lack of access: the Russians’ semantic access was slow but not reduced. Our fluency tasks involved more time pressure; if the Russians had had more time they might have been more accurate in picture naming and generated more animal names. Recall as well that we had to exclude a large proportion of our potential Russian pool. The high d′ of the participants who remained may be due to their higher motivation (Ayçiçeği-Dinn et al., 2018). We speculate that Russians’ high sensitivity to humor may be associated with interest in the task or personal investment in the English language. Since we have no information about participants’ occupations, we do not know if the Russians used English for work, as did Ayçiçeği-Dinn et al.'s high performing participants. However, since the participants had dedicated years to learning English and volunteered to participate in the study, the investment explanation remains a possibility. Another proposed speculation is related to Russian culture in which individuals may look for humor and double meanings. Further research would be required to support that conjecture.

Fluency played no role in the comparison of the two monolingual groups, neither in response time nor in sensitivity. The two groups had almost identical – and high – fluency scores. It is not surprising that fluency would only play a role when comparing groups that differ in fluency. The absence of a role for fluency made possible a role for other factors. In contrast, when comparing bilingual groups or monolingual and bilingual groups, fluency becomes more important as a factor.

In the case of response time, MTurk participants were considerably faster than the monolingual college students. Their older age was not a problem. In fact, the MTurk group outperformed all other groups in response time. As a group they are both highly practiced and highly motivated to complete experiments as quickly as possible. In general, MTurkers show faster response times than other samples (e.g., Hauser et al., 2019) and were faster here as well. Skill and practice can outweigh age.

In the case of d′, what came to the fore were age and sense of humor. Being older was associated with greater sensitivity to our jokes. There was a hint of an age advantage in the comparison of the two bilingual groups as well. But the main determinant in our monolingual groups was sense of humor, even though the humor questionnaire scores did not distinguish the two groups. Although we had included sense of humor as a variable because we expected that having a good sense of humor would be an advantage in detecting jokes, we only saw a role for it in the comparison of the two monolingual groups. We conjecture that sense of humor is overridden in the other comparisons by the strength of fluency.

Thus, we suggest that the superiority of MTurkers is mostly due to their expertise, motivation, and attitude. MTurk workers participate in response-time research more often than introductory psychology college students. They are paid for participating in research, motivating them to complete studies more quickly to earn more compensation in a shorter time. They are also interested in providing good quality output in order to have a good record on MTurk. In contrast, college students participate for course credit and receive credit regardless of whether they complete the study. Further, the study was completely voluntary for the MTurk participants and an alternative to a quiz for college students, which could reflect on their respective attitudes.

Limitations

One limitation, as shown in the comparison of the monolingual groups, was the difficulty of matching participants by variables other than language status. With regard to the bilingual groups, the undergraduate group is diverse in their non-English backgrounds; they reported speaking a wide variety of other languages. All Russian participants, on the other hand, spoke English as their non-dominant language, sometimes in addition to other languages. Although we cannot assess the effects of those differences, it is possible that we increased the noise in the data. It would be desirable to use subjects from subject pools that are matched for cultural and language diversity. Matching bilinguals by first and second language could be important. Similarly, it is important to acknowledge the role of language immersion. Considering the broad representation of second languages in the bilingual sample (Experiments 1 and 2), we were unable to control for this variable in the study.

On the other hand, our participant groups, particularly the undergraduate students and Russians, reflect the actual environments they live in. Some groups are more diverse while some are less so, making a study involving those groups ecologically valid. By comparing different groups with different types of internal diversity, one can begin to establish the range of variables that affect language processing.

The jokes we used were culturally neutral, which could be seen as a plus or a minus. On one hand, our data show that jokes designated by the experimenter as culturally neutral actually are; they thus eliminate one source of bias. On the other hand, some jokes are culturally specific and are likely to show cultural differences even if language is kept constant, as generational differences in what seems funny attest. Our jokes also did not typically require knowledge of or sensitivity to idioms.

Another limitation to this study is that participant groups differ not only by language status (bilingual, monolingual, English-dominant and non-dominant), but also by type of motivation for participation. The college students received course credit regardless of whether they completed the procedure, so the results may be an underestimate of their response time and sensitivity. Comparing the two student groups levelled this limitation. Russian participants volunteered to take part in the study for no reward, so their results are more likely to reflect genuine interest in research and a fair assessment of their response time and sensitivity. Finally, MTurkers received a monetary reward for their performance, so they were motivated to complete the task quickly and provide high quality results since they received the reward only after their results were reviewed. Thus, the comparisons between bilingual college students and Russians and monolingual college students and MTurkers reveal not just linguistic, but also motivation-related differences.

Another possible limitation concerns the materials we used. While the humor of most of the jokes is based on incongruity, in a few of them it is hard, if not impossible, to pin-point. But even the few items potentially lacking incongruity were deemed funny by over half of our participants (53%, 57% and 71% for items 11, 16, and 18 respectively).

Humor in general is very subjective, as demonstrated by participants’ accuracy varying between 65-71% for jokes and 78-84% for non-jokes. This means not only did some participants reject our jokes as not funny, they accepted our non-jokes as funny. This lack of unanimity allowed calculating sensitivity, and, once again, highlights the role of individual variation.
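The link between these accuracy figures and sensitivity can be sketched with the standard signal-detection formula d′ = z(hit rate) − z(false-alarm rate), where a hit is a joke rated funny and a false alarm is a non-joke rated funny. This is a minimal sketch under stated assumptions: the function name is hypothetical, the example rates are illustrative values picked from within the reported ranges, and any correction for rates of exactly 0 or 1 (not described in this section) is left out.

```python
from statistics import NormalDist


def d_prime(hit_rate, false_alarm_rate):
    """Standard signal-detection sensitivity: z(H) - z(FA).

    Rates must lie strictly between 0 and 1; NormalDist.inv_cdf raises
    StatisticsError otherwise, so extreme rates would need a correction.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)


# Illustrative values only: 68% of jokes rated funny (hits) and
# 81% of non-jokes rated not funny, i.e. a 19% false-alarm rate.
joke_accuracy = 0.68
non_joke_accuracy = 0.81
print(round(d_prime(joke_accuracy, 1 - non_joke_accuracy), 2))
```

A participant who rated jokes and non-jokes as funny equally often would have a d′ of zero, regardless of how often they said "funny" overall; this is why d′ separates sensitivity from response bias.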

The final limitation is the assessment of fluency by means of typing. Participants differ in typing skill and keyboard layout familiarity. Time-sensitive tasks involving typing, especially in L2, can put some individuals at a disadvantage. We think that the significant positive correlation (r = .35, p < .001) between scores in the time-sensitive verbal fluency task and the untimed picture naming task justifies our use of both scores in the composite fluency measure.

Future directions

Our study showed that highly proficient bilinguals are able to process humor in English as successfully as native speakers. The role of proficiency, however, remains unclear. Bilinguals with lower proficiency in English might perform more poorly than native speakers and highly proficient bilinguals, while bilinguals with even greater English proficiency could “catch up” with native speakers not only in sensitivity but in speed of processing as well; indeed, our English-dominant bilinguals’ results suggest that. Therefore, one possible development of this study is to include groups that differ in English proficiency.

Secondly, language and culture are tightly related, as are culture and humor. Cultural background was not evaluated in this study, and future studies could do so. One possibility is to include culture variables in the statistical analysis, such as mono- or biculturalism, identification with the culture of the country of residence and with the culture of the home country, or participants’ ethnic group. Furthermore, studies could include culturally marked stimuli, as well as controversial subjects, taboos, and strong language. A significant contribution to this line of research would be made by testing bilingual participants in each of their two languages and comparing them to two groups of monolingual controls, one for each language.

Conclusion

In three experiments we explored linguistic and other determinants of humor processing. In Experiment 1, we found no difference either in humor sensitivity or decision response time between monolinguals and English-dominant bilinguals. Both groups were proficient enough in English to perform at the same level. Only fluency predicted sensitivity to the jokes (d′).

We also found suggestive evidence of a further set of individual difference variables: expertise, motivation, and attitude toward the task. We conjecture that some combination of those variables distinguished the undergraduates and MTurkers, who differed in response time. Groups that differ in their motivation as well as in language status might therefore show differences because of the former rather than the latter.

When we compared English-dominant bilinguals with Russian-dominant bilinguals, we found partial support for the hypothesis that participants would have an advantage in humor processing in their dominant language. The bilingual college students performing in English, their dominant language, were faster than Russian participants performing in their non-dominant language. Thus, a more extreme difference in language proficiency is associated with processing speed. One highlight of the study is that despite the differences in response time, language dominance does not seem to affect humor sensitivity (d′). In fact, English non-dominant bilinguals showed higher sensitivity to humor than English-dominant bilinguals, despite their lower fluency. Sense of humor did not play a role here.

Acknowledgements

The authors thank Martin Chodorow, Paul Feitzinger, and Josephine O'Malley for their input in developing this study. We thank Mary C. Potter for discussions on implementing Rapid Serial Visual Presentation (RSVP). We thank Michelle Antonov, Andre Eliatamby, Suzanne van der Feest, Amelia Lambelet, Xiaomeng Ma, and Qihui Xu for their thoughtful feedback and comments throughout the preparation of this study.

Competing interests

The authors declare none.

Data availability

The data that support the findings of this study are openly available in an OSF repository at https://osf.io/vtfwg/.

Appendix 1

Stimuli

Appendix 2

Humor Questionnaire

Sense of Humor Questionnaire

Indicate how much the following statements are applicable to you. Use the slider bar.

Q1 If I am feeling depressed, I can cheer myself up with humor.

Never______ All the time

Q2 I am a naturally funny person – I easily make other people laugh.

Very strongly disagree______ Very strongly agree

Q3 Even when I'm by myself, life's absurdities amuse me.

Never______ All the time

Q4 I laugh and joke with my friends.

Never______ All the time

Q5 I enjoy making other people laugh.

Very strongly disagree______ Very strongly agree

Q6 I don't need to be with other people to feel amused – I find things to laugh about even when I'm by myself.

Never______ All the time

References

Aguinis, H, Villamor, I, and Ramani, RS (2021) MTurk research: Review and recommendations. Journal of Management 47, 823–837. https://doi.org/10.1177/0149206320969787 CrossRef Google Scholar

Attardo, S (1994). Linguistic theories of humor (Vol. 1). Berlin: Mouton de Gruyter. https://doi.org/10.1515/9783110219029 Google Scholar

Attardo, S and Raskin, V (1991). Script theory revis(it)ed: Joke similarity and joke representation model. Humor-International Journal of Humor Research 4, 293–348. https://doi.org/10.1515/humr.1991.4.3-4.293 CrossRef Google Scholar

Ayçiçeği-Dinn, A, Şişman-Bal, S and Caldwell-Harris, CL (2018). Are jokes funnier in one's native language? Humor 31, 5–37. https://doi.org/10.1515/humor-2017-0112 CrossRef Google Scholar

Bell, N D (2011). Humor scholarship and TESOL: Applying findings and establishing a research agenda. TESOL Quarterly 45, 134–159. https://doi.org/10.5054/tq.2011.240857 CrossRef Google Scholar

Bell, N and Attardo, S (2010). Failed humor: Issues in non-native speakers’ appreciation and understanding of humor. Intercultural Pragmatics 7(3), 423–447. https://doi.org/10.1515/iprg.2010.019 CrossRef Google Scholar

Bialystok, E (2009). Bilingualism: The good, the bad, and the indifferent. Bilingualism: Language and Cognition 12, 3–11. https://doi.org/10.1017/s1366728908003477 CrossRef Google Scholar

Bromberek-Dyzman, K (2015). Irony processing in L1 and L2: Same or different? In Heredia, RR (ed.), Bilingual figurative language processing. Cambridge: Cambridge University Press, pp. 268–298. https://doi.org/10.1017/CBO9781139342100.014 CrossRef Google Scholar

Caldwell-Harris, CL and Ayçiçeği-Dinn, A (2009). Emotion and lying in a non-native language. International Journal of Psychophysiology 71, 193–204. https://doi.org/10.1016/j.ijpsycho.2008.09.006 CrossRef Google Scholar PubMed

Cieślicka, A (2006). Literal salience in on-line processing of idiomatic expressions by second language learners. Second Language Research 22, 115–144. https://doi.org/10.1191/0267658306sr263oa CrossRef Google Scholar

Costa, A, Hernández, M and Sebastián-Gallés, N (2008). Bilingualism aids conflict resolution: Evidence from the ant task. Cognition 106, 59–86. https://doi.org/10.1016/j.cognition.2006.12.013 CrossRef Google Scholar PubMed

Costa, A, Hernández, M, Costa-Faidella, J and Sebastián-Gallés, N (2009). On the bilingual advantage in conflict processing: Now you see it, now you don't. Cognition 113, 135–149. https://doi.org/10.1016/j.cognition.2009.08.001 CrossRef Google Scholar PubMed

Coulson, S and Kutas, M (2001). Getting it: Human event-related brain response to jokes in good and poor comprehenders. Neuroscience Letters 316, 71–74. https://doi.org/10.1016/s0304-3940(01)02387-4 CrossRef Google Scholar PubMed

Coulson, S, Urbach, TP and Kutas, M (2006). Looking back: Joke comprehension and the space structuring model. Humor 19, 229–250. https://doi.org/10.1515/humor.2006.013 CrossRef Google Scholar

Erdodi, L and Lajiness-O'Neill, R (2012). Humor perception in bilinguals: Is language more than a code?. Humor 25, 459–468. https://doi.org/10.1515/humor-2012-0024 CrossRef Google Scholar

Giora, R (1991). On the cognitive aspects of the joke. Journal of Pragmatics 16, 465–485. https://doi.org/10.1016/0378-2166(91)90137-m CrossRef Google Scholar

Gollan, TH and Acenas, LAR (2004). What is a TOT? Cognate and translation effects on tip-of-the-tongue states in Spanish-English and Tagalog-English bilinguals. Journal of Experimental Psychology: Learning, Memory, and Cognition 30, 246–269. https://doi.org/10.1037/0278-7393.30.1.246 Google Scholar PubMed

Gollan, TH, Slattery, TJ, Goldenberg, D, Van Assche, E, Duyck, W and Rayner, K (2011). Frequency drives lexical access in reading but not in speaking: The frequency-lag hypothesis. Journal of Experimental Psychology: General, https://doi.org/140(2), 186–209. 10.1037/a0022256CrossRef Google Scholar

Green, DW (1998). Mental control of the bilingual lexico-semantic system. Bilingualism: Language and Cognition 1, 67–81. https://doi.org/10.1017/s1366728998000133 CrossRef Google Scholar

Harms, PD and DeSimone, JA (2015). Caution! MTurk workers ahead—Fines doubled. Industrial and Organizational Psychology 8, https://doi.org/183-190. 10.1017/iop.2015.23CrossRef Google Scholar

Harris, RJ, Friel, BM and Mickelson, NR (2006). Attribution of discourse goals for using concrete-and abstract-tenor metaphors and similes with or without discourse context. Journal of Pragmatics 38, 863–879. https://doi.org/10.1016/j.pragma.2005.06.010 CrossRef Google Scholar

Hauser, D, Paolacci, G and Chandler, J (2019). Common concerns with MTurk as a participant pool: Evidence and solutions. In Kardes, FR, Herr, PM and Schwarz, N (eds.), Handbook of research methods in consumer psychology. New York: Routledge, pp. 319–337. https://doi.org/10.31234/osf.io/uq45c Google Scholar

Huang, Y, and Ferreira, F (2020). The application of signal detection theory to acceptability judgments. Frontiers in Psychology 11:73. https://doi.org/10.3389/fpsyg.2020.00073 CrossRef Google Scholar PubMed

Hull, R, Tosun, S and Vaid, J (2017). What's so funny? Modelling incongruity in humour production. Cognition and Emotion 31, 484–499. https://doi.org/10.1080/02699931.2015.1129314 CrossRef Google Scholar PubMed

Kostopoulou, O, Nurek, M, Cantarella, S, Okoli, G, Fiorentino, F and Delaney, BC (2019). Referral decision making of general practitioners: a signal detection study. Medical Decision Making 39, 21–31. https://doi.org/10.1177/0272989×18813357CrossRef Google Scholar PubMed

Kroll, JF, Bobb, SC and Wodniecka, Z (2006). Language selectivity is the exception, not the rule: Arguments against a fixed locus of language selection in bilingual speech. Bilingualism: Language and Cognition 9, 119–135. https://doi.org/10.1017/s1366728906002483 CrossRef Google Scholar

Kroll, JF, Bogulski, CA and McClain, R (2012). Psycholinguistic perspectives on second language learning and bilingualism: The course and consequence of cross-language competition. Linguistic Approaches to Bilingualism 2, 1–24. https://doi.org/10.1075/lab.2.1.01kro CrossRef Google Scholar

Kroll, JF and Stewart, E (1994). Category interference in translation and picture naming: Evidence for asymmetric connections between bilingual memory representations. Journal of Memory and Language 33, 149–174. https://doi.org/10.1006/jmla.1994.1008 CrossRef Google Scholar

Li, P, Zhang, F, Tsai, E and Puls, B (2014). Language history questionnaire (lhq 2.0): A new dynamic web-based research tool. Bilingualism: Language and Cognition 17, 673–680. https://doi.org/10.1017/S1366728913000606 CrossRef Google Scholar

López, BG and Vaid, J (2017). Psycholinguistic approaches to humor. In Attardo, S (ed.), The Routledge handbook of language and humor. New York: Routledge, pp. 267–281. https://doi.org/10.4324/9781315731162-19 CrossRef Google Scholar

Marian, V and Kaushanskaya, M (2008). Words, feelings, and bilingualism: Cross-linguistic differences in emotionality of autobiographical memories. The Mental Lexicon 3, 72–91. https://doi.org/10.1075/ml.3.1.06mar CrossRef Google Scholar PubMed

Marian, V, Blumenfeld, HK and Kaushanskaya, M (2007). The language experience and proficiency questionnaire (leap-q): Assessing language profiles in bilinguals and multilinguals. Journal of Speech, Language, and Hearing Research 50, 940–967. https://doi.org/10.1044/1092-4388(2007/067)CrossRef Google Scholar PubMed

Martin, RA and Lefcourt, HM (1984). Situational Humor Response Questionnaire: Quantitative measure of sense of humor. Journal of Personality and Social Psychology 47, 145–155. https://doi.org/10.1037/0022-3514.47.1.145 CrossRef Google Scholar

Martin, RA, Puhlik-Doris, P, Larsen, G, Gray, J and Weir, K (2003). Individual differences in uses of humor and their relation to psychological well-being: Development of the humor styles questionnaire. Journal of Research in Personality 37, 48–75. https://doi.org/10.1016/s0092-6566(02)00534-2 CrossRef Google Scholar

Matlock, T and Heredia, RR (2002) Understanding phrasal verbs in monolinguals and bilinguals. In Heredia, RR and Altarriba, B (eds.), Bilingual sentence processing. Amsterdam: Elsevier, pp. 251–274. https://doi.org/10.1016/S0166-4115(02)80014-0 CrossRef Google Scholar

Meuter, RF and Allport, A (1999). Bilingual language switching in naming: Asymmetrical costs of language selection. Journal of Memory and Language 40, 25–40. https://doi.org/10.1006/jmla.1998.2602 CrossRef Google Scholar

Mitrofanova, N, Urek, O, Rodina, Y and Westergaard, M (2021). Sensitivity to microvariation in bilingual acquisition: morphophonological gender cues in Russian heritage language. Applied Psycholinguistics, 1–39. https://doi.org/10.1017/s0142716421000382 Google Scholar

Ozdemir, M and Uysal, H (2016). The time course of meaning activation in jokes: Bilinguals vs. monolinguals. Turkophone 1, 5–19. Retrieved from https://dergipark.org.tr/en/pub/turkophone/issue/18996/200502 Google Scholar

Pavlenko, A (2002). Bilingualism and emotions. Multilingua 21, 45–78. https://doi.org/10.1515/mult.2002.004 CrossRef Google Scholar

Potter, MC (1993). Very short-term conceptual memory. Memory & Cognition 21, 156–161. https://doi.org/10.3758/bf03202727 CrossRef Google Scholar PubMed

Raskin, V (1987). Linguistic heuristics of humor: A script-based semantic approach. International Journal of the Sociology of Language 1987, 11–26. https://doi.org/10.1515/ijsl-1987-6503 CrossRef Google Scholar

Rosselli, M, Vélez-Uribe, I and Ardila, A (2017). Emotional associations of words in L1 and L2 in bilinguals. In Ardila, A, Cieślicka, AB, Heredia, RR and Rosselli, M (eds.), Psychology of bilingualism, Cham: Springer, pp. 39–72. https://doi.org/10.1007/978-3-319-64099-0_3 CrossRef Google Scholar

Ruch, W, McGhee, PE and Hehl, FJ (1990). Age differences in the enjoyment of incongruity-resolution and nonsense humor during adulthood. Psychology and Aging 5, 348–355. https://doi.org/10.1037/0882-7974.5.3.348 CrossRef Google Scholar PubMed

Ruch, W (2008). Psychology of humor. In Raskin, V. (ed.), The primer of humor research. Berlin: de Gruyter, pp. 17–100. https://doi.org/10.1515/9783110198492.17 CrossRef Google Scholar

Saygin, AP (2001, March). Processing figurative language in multi-lingual task: Translation, transfer and metaphor. Paper presented in Proceedings of Corpus-Based and Processing Approaches to Figurative Language Workshop, Corpus Linguistics. Lancaster, UK: Lancaster University.Google Scholar

Schaier, AH, Cicirelli, VG (1976). Age differences in humor comprehension and appreciation in old age. Journal of Gerontology 31, 577–582. https://doi.org/10.1093/geronj/31.5.577 CrossRef Google Scholar PubMed

Snodgrass, JG and Vanderwart, M (1980). A standardized set of 260 pictures: Norms for name agreement, image agreement, familiarity, and visual complexity. Journal of Experimental Psychology: Human Learning and Memory 6, 174–215. https://doi.org/10.1037/0278-7393.6.2.174 Google Scholar PubMed

Suls, JM (1972). A two-stage model for the appreciation of jokes and cartoons: An information-processing analysis. The Psychology of Humor: Theoretical Perspectives and Empirical Issues 1, 81–100. https://doi.org/10.1016/b978-0-12-288950-9.50010-9 CrossRef Google Scholar

Vaid, J (2000). New approaches to conceptual representations in bilingual memory: The case for studying humor interpretation. Bilingualism: Language and Cognition 3, 28–30. https://doi.org/10.1017/s1366728900290111 CrossRef Google Scholar

Valian, V (2015). Bilingualism and cognition. Bilingualism: Language and Cognition 18, 3–24. https://doi.org/10.1037/12324-003 CrossRef Google Scholar

Wiseman, R (2002). Laughlab: The scientific search for the world's funniest joke. Final report retrieved from laughlab.com@ http://laughlab.couk.Google Scholar

Table 1. Means (SD) for Reaction Time, d′, Fluency, and Humor in American Mono- and Bilingual Groups in the Four Experimental Groups

Table 2. Estimates (St. Errors) of Hierarchical Regressions for Response Time and d′ in American Mono- and Bilingual Groups.

Table 3. Estimates (St. Errors) of Hierarchical Regressions for Response Time and d′ in Russian and American Bilingual Groups.

Table 4. Estimates (St. Errors) of Hierarchical Regressions for Response Time and d′ in MTurkers and Monolingual College Student Groups.

Table 5. Summary of Regression Results
