Introduction
Migration and health
There is a well-established link between population mobility, migration, and the epidemiology of certain diseases (Gushulak and MacPherson, 2006). While communicable diseases have traditionally been the focus of attention, the increasing global importance of migration has led to a renewed interest in other aspects of population health, in particular non-communicable diseases that are linked to genetic characteristics.
By combining various analytical methods, available data on Alzheimer’s disease (AD) can be used to identify determinants of health disparities in AD and their impact at the individual, community, and societal levels (Akushevich et al., 2023). The determinants may include ethnic, gender, and geographic factors.
Alzheimer’s disease and Volga Germans
AD can be either familial or sporadic. The familial form is autosomal dominant and has an early onset, occurring in individuals under 65 years of age. It accounts for only 1–5% of cases and is caused by mutations in the genes PSEN1 (presenilin 1) (MIM *104311) and PSEN2 (presenilin 2) (MIM *600759). PSEN1 is associated with AD type 3 (MIM 607822) and PSEN2 with AD type 4 (MIM 606889). Late-onset AD is classified as occurring in patients over the age of 65. The sporadic form, which accounts for 95% of cases, has no known genetic cause (Andrade-Guerrero et al., 2023).
In 1988, Bird et al. reported on presenile AD in five families in the United States. The disease was confirmed by autopsy and inherited as an autosomal dominant trait, present in both males and females for several generations (Bird et al., 1988). All five families were descended from immigrants known as Volga Germans who arrived in the United States between 1870 and 1920. In 1992, the authors studied 28 families of Volga German descendants with AD. Eighteen families originated from the villages of Frank and Walter, near Saratov, Russia. The authors suggested that the cause of AD may be a founder effect. However, a common affected ancestor could not be identified (Bird et al., 1992).
The diaspora of Volga Germans
To understand the context of the migration of Germans to Russia, it is necessary to consider the Seven Years’ War. This international conflict over the control of colonies in North America and India began in 1756 and ended in 1763. It involved several empires, kingdoms, and other political structures established in territories that are now nation-states, including Germany. Catherine II, the Empress of the Russian Empire, aimed to populate the border regions near Asia with settled European peasants. In 1763, she promoted a series of benefits for migrants to make the colonisation of the Volga attractive to the post-war weakened German population. These benefits included freedom of religion, temporary tax exemption, interest-free loans, internal self-government, and permanent exemption from military conscription. Between 1764 and 1769, settlers primarily from Hesse, a state in Germany with its current capital in Wiesbaden, arrived on the lower Volga near the city of Saratov. During this time, 104 colonies were established, with a total population of 22,246 inhabitants (Pohl, 2009).
In 1864, Tsar Alexander II made amendments to the original agreement. By 1874, German immigrants were required to register for military service. This event marked the beginning of the diaspora, particularly to the United States, Canada, Australia, and Brazil (Pohl, 2009). In 1878, 1,100 Volga Germans arrived in Argentina. They settled in the provinces of Entre Ríos, Buenos Aires, Santa Fe, Chaco, and Córdoba, where they founded numerous rural colonies that still exist today. Immigration to Argentina continued until the First World War (AufderHeide, 2006). Estimates from descendants’ associations suggest that there are 2,000,000 Volga Germans in the country. Individual migration stories are well documented and accessible through the websites of the descendant societies and/or their social networks, including Alemanes del Wolga en Argentina (http://www.alemanesdelwolga.com.ar), Centro Cultural Argentino Wolgadeutsche, Federación Argentina de Descendientes de Alemanes del Volga (http://fadadav.org.ar), and Asociación Argentina de Descendientes de Alemanes del Volga Unser Licht.
These resources contain family registers, lists of surnames, and information on the colonies in which the families settled.
Epidemiology of Alzheimer’s disease in Argentina
Research on AD has been concentrated on particular geographic regions and groups, as evidenced by studies conducted by Larraya et al. (2004), Melcon et al. (2010), Méndez et al. (2018), Itzcovich et al. (2020), García and Comesaña (2021), and Dalmasso et al. (2023). Currently, there is no national institutional registry of individuals diagnosed with AD. However, death certificates can serve as a valuable source of data. The Department of Health Statistics and Information (DEIS) is responsible for preparing periodic statistical reports that include an official registry of all deaths, their immediate causes, and any associated or pre-existing conditions. The International Classification of Diseases (ICD-10) is used to calculate morbidity and mortality statistics. The quality of this information is validated and serves as a critical input, although there may be variations between years or regions.
The aim of this study is to contribute to the epidemiology of AD by analysing the spatial distribution of all deaths related to AD, the spatial distribution of surnames of Volga German origin, and the association between these two phenomena in the country.
Materials and methods
Data sources
Argentina, with a population of 46,654,581 according to the National Institute of Statistics and Censuses (INDEC, 2023), is the second largest country in South America. It is divided into 23 provinces, one federal district, and 529 minor subdivisions known as departments. The provinces are further grouped into five geographical regions (see Fig. 1): Northwest, or NWA (Catamarca, Jujuy, La Rioja, Salta, Santiago del Estero, and Tucumán); Northeast, or NEA (Corrientes, Chaco, Formosa, and Misiones); Cuyo (Mendoza, San Juan, and San Luis); Central or Pampean (Buenos Aires, Córdoba, Entre Ríos, La Pampa, Santa Fe, and the Autonomous City of Buenos Aires); and Patagonia (Río Negro, Neuquén, Chubut, Santa Cruz, and Tierra del Fuego). The Central/Pampean region has the highest population density and income levels. In the 19th and 20th centuries, it was the most common destination for transcontinental migrations.
Since 2012, all Argentine citizens over the age of 16 have the right to vote, and the electoral roll is compiled annually. For this study, the 2015 Electoral Registry and national death records from 2005 to 2017 were used, as they share the same territorial data organisation. We analysed publicly available anonymous information. According to Argentine legislation, ethical approval was not required in these cases.
The Surnames of the Volga Germans
Surnames are sociocultural variables that result from historical and cultural processes. They are an important resource in bioanthropology and human population genetics. The linguistic and/or geographic origin of surnames can serve as a proxy for ethnicity in demographic and spatial analyses (Mateos, 2014; Albeck et al., 2017). A list of surnames was compiled using the information available on the websites of the aforementioned associations of Volga German descendants. The study compared the 50 most common Volga surnames (VSs) in Argentina to the 50 most common surnames in the State of Hesse, Germany, as provided by Forebears (https://forebears.io/germany/hesse#surnames), and the 50 most common German surnames (excluding the State of Hesse) based on telephone user records of 30,000,000 individuals used by Rodriguez-Larralde et al. (1978). To ensure accuracy, a conservative approach was adopted for the comparative analysis, due to the potential for incorrect surname recording during migration in the 19th century. The threshold for homonymy was set at the difference of a diacritical mark or accent, such as a dieresis.
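The diacritic-level matching rule described above can be illustrated with a minimal Python sketch. It assumes that two spellings count as the same surname when they differ only in diacritical marks; the function names and the sample surnames are hypothetical and are not taken from the lists used in the study.

```python
import unicodedata


def normalise_surname(name: str) -> str:
    """Return an upper-case key with diacritical marks removed."""
    decomposed = unicodedata.normalize("NFD", name.strip())
    # Drop combining marks (Unicode category Mn), e.g. the dieresis in "ü".
    stripped = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
    return stripped.upper()


def shared_percentage(list_a, list_b) -> float:
    """Percentage of surnames in list_a that also occur in list_b."""
    keys_a = {normalise_surname(n) for n in list_a}
    keys_b = {normalise_surname(n) for n in list_b}
    return 100.0 * len(keys_a & keys_b) / len(keys_a)


# Hypothetical example lists (illustrative only, not the study's data).
volga = ["Müller", "Schmidt", "Weber", "Bauer"]
hesse = ["Muller", "Schmidt", "Becker", "Wagner"]
share = shared_percentage(volga, hesse)
```

Under this rule a transliterated spelling such as Mueller would not be matched with Müller, which is in keeping with a conservative threshold; whether the study treated such variants differently is not stated.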
Using the Bulsarapp application (Morales et al., 2021), surnames of Volga origin found in the 2015 electoral register were georeferenced. The frequency of these surnames was calculated by determining the number of individuals with VSs per 1000 voters (VS*1000) at the departmental, provincial, and regional levels.
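As a rough illustration of the VS*1000 indicator, the following Python sketch aggregates voter counts at a chosen territorial level. The record layout, field names, and counts are hypothetical; the sketch also assumes that rates at the provincial and regional levels are computed from pooled counts rather than by averaging departmental rates, which the text does not specify.

```python
from collections import defaultdict


def vs_per_1000(records, level):
    """Aggregate voter counts by `level` and return VS carriers per 1000 voters."""
    voters = defaultdict(int)
    carriers = defaultdict(int)
    for rec in records:
        key = rec[level]
        voters[key] += rec["voters"]
        carriers[key] += rec["vs_voters"]
    return {key: 1000.0 * carriers[key] / voters[key] for key in voters}


# Hypothetical departmental counts (not real data).
records = [
    {"region": "Central", "province": "A", "department": "A1", "voters": 2000, "vs_voters": 50},
    {"region": "Central", "province": "A", "department": "A2", "voters": 3000, "vs_voters": 25},
    {"region": "Cuyo", "province": "B", "department": "B1", "voters": 5000, "vs_voters": 5},
]
by_department = vs_per_1000(records, "department")
by_province = vs_per_1000(records, "province")
```

In this toy example the two departments of province A have rates of 25.0 and about 8.3 per 1000 voters, while the pooled provincial rate is 15.0, which shows why the aggregation level matters.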
Deaths due to Alzheimer’s disease
Although the death certificate is an official document signed by a medical professional, it may contain inaccuracies if data on the general state of health prior to death are not well known or would require forensic examinations to be conclusive. Therefore, we used a broad criterion and selected the following ICD-10 codes: G30.0 (AD with early onset, usually before the age of 65), G30.1 (AD with late onset, usually after the age of 65), G30.8 (other AD), G30.9 (AD, unspecified), and G31 (other degenerative diseases of the nervous system, not elsewhere classified). The DEIS (2023) provided the time series of the number of deaths reported in Argentina from 2005 to 2017. Specific death rates (SDRs) related to AD were calculated per 1000 deaths (AD*1000) at departmental, provincial, and regional levels for the entire period.
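The selection of certificates and the AD*1000 rate can be sketched as follows. This is an illustration under stated assumptions: codes are assumed to be stored with a dot (for example G30.9), G31 is assumed to include its subcategories, and the helper names and sample codes are hypothetical.

```python
# ICD-10 codes named in the text; G31 is matched together with its subcategories
# (an assumption about how the category was applied).
AD_CODES = ("G30.0", "G30.1", "G30.8", "G30.9", "G31")


def is_ad_related(icd_code: str) -> bool:
    """True when a certificate code falls under the broad AD selection."""
    code = icd_code.strip().upper()
    return code in AD_CODES or code.startswith("G31.")


def sdr_per_1000(ad_deaths: int, all_deaths: int) -> float:
    """AD-related deaths per 1000 deaths from all causes."""
    return 1000.0 * ad_deaths / all_deaths


# Hypothetical certificate codes for one department (illustrative only).
codes = ["G30.9", "I21.9", "G31.8", "C34.1", "G30.0", "J18.9", "G30.9", "I50.0"]
ad_count = sum(is_ad_related(c) for c in codes)
department_sdr = sdr_per_1000(ad_count, len(codes))
```

Applied to the national totals reported in the Results (17,226 AD-related deaths among 4,115,216 recorded deaths), the same function returns approximately 4.19 per 1000 deaths.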
Statistical and spatial analysis
To assess the impact of VS on SDRs, we used a generalised linear model with mixed effects. We employed three distribution types: Poisson (Mod. Poiss), Negative Binomial (Mod. BN), and zero-inflated Poisson (Mod. ZIP). We diagnosed the goodness of fit using Q-Q plots and assessed each model’s relative quality using the Akaike information criterion (AIC). We considered the political division (departments within provinces) as a random term. The incidence rate ratio was calculated by exponentiating the regression coefficient (beta). Furthermore, the Nagelkerke coefficient of determination was used to assess the proportion of variance explained by the model.
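The two model-quality measures named above can be written out in a few lines of Python. This is a sketch of the standard formulas only, not of the mixed-model fitting itself; the log-likelihoods and parameter counts below are hypothetical placeholders rather than values from the study, and the way a null model is defined for a mixed model is left as an assumption.

```python
import math


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2.0 * n_params - 2.0 * log_likelihood


def nagelkerke_r2(ll_null: float, ll_model: float, n_obs: int) -> float:
    """Cox-Snell pseudo-R2 rescaled by its maximum attainable value."""
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n_obs)
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n_obs)
    return cox_snell / max_cox_snell


# Hypothetical log-likelihoods and parameter counts (not the study's values).
candidates = {
    "Poisson": {"loglik": -120.0, "k": 3},
    "NegBin": {"loglik": -95.0, "k": 4},
    "ZIP": {"loglik": -110.0, "k": 4},
}
aic_by_model = {name: aic(m["loglik"], m["k"]) for name, m in candidates.items()}
best_model = min(aic_by_model, key=aic_by_model.get)
```

With these placeholder numbers the negative binomial candidate has the lowest AIC (198 against 228 and 246) despite carrying one more parameter than the Poisson candidate, because its gain in log-likelihood outweighs the complexity penalty.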
The study used the Global Moran Index and the Local Indicator of Spatial Autocorrelation to analyse the spatial distribution of VSs and SDRs, as well as the combination of both variables. Spatial autocorrelation analysis is a statistical method used to determine whether a group of entities, such as regions, provinces, and departments, and their attributes, such as VS and SDR values, exhibit clustered, dispersed, or random patterns across a territory. The analysis can aid in identifying spatial relationships and patterns that may not be immediately apparent (Anselin, 1995). Significance was determined at the 0.05 significance level using the Monte Carlo test (999 permutations) under the null hypothesis of no spatial association. The analyses were conducted using GeoDa 1.14 software and the GeoPandas and PySAL function libraries for the Python programming language.
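To make the permutation logic concrete, the following pure-Python sketch computes a univariate Global Moran's I with binary contiguity weights and a one-sided pseudo p-value from 999 random permutations. It is a simplified stand-in, not the GeoDa or PySAL implementation used in the study; the function names, the toy data, and the one-dimensional neighbour structure are all hypothetical, and local and bivariate statistics are not covered.

```python
import random


def morans_i(values, neighbours):
    """Global Moran's I with binary weights given as {index: [neighbour indices]}."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(len(adj) for adj in neighbours.values())  # sum of all weights
    cross = sum(z[i] * z[j] for i, adj in neighbours.items() for j in adj)
    return (n / s0) * cross / sum(d * d for d in z)


def permutation_p_value(values, neighbours, n_perm=999, seed=0):
    """Observed I and a one-sided pseudo p-value for positive autocorrelation."""
    rng = random.Random(seed)
    observed = morans_i(values, neighbours)
    shuffled = list(values)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # reassign values to locations at random
        if morans_i(shuffled, neighbours) >= observed:
            extreme += 1
    return observed, (extreme + 1) / (n_perm + 1)


# Hypothetical example: ten units on a line, each adjacent to its neighbours.
values = [1, 1, 1, 1, 1, 10, 10, 10, 10, 10]
neighbours = {i: [j for j in (i - 1, i + 1) if 0 <= j < 10] for i in range(10)}
observed_i, p_value = permutation_p_value(values, neighbours)
```

In this toy layout the low and high values form two contiguous blocks, so the observed index is strongly positive (7/9) and few random reassignments reach it, which is the pattern a significant positive autocorrelation result reflects.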
Results
Table 1 compares the 50 most common VSs in Argentina, Hesse (HS), and Germany (GS). Argentina and HS share 24% of the same surnames, while only 14% of the family names are common to all three lists.
According to the National Institute of Statistics and Censuses (INDEC, 2023), Argentina’s total population in 2015 was 43,131,966. The electoral roll for that year listed 30,530,194 registered voters, representing over 70% of the country’s population, with 373,709 different surnames. Of these voters, 326,922 individuals (1.22%) had a VS, distributed across a total of 1,109 surnames of this origin.
Table 2 displays the geographic distribution of VS carriers. The province of La Pampa, located in the Central region, had the highest frequency (VS*1000), while San Juan (Cuyo region) had the lowest. Additionally, Table 2 summarises death certificate data, which show that between 2005 and 2017, there were 4,115,216 recorded deaths in Argentina, with 17,226 (4.19 per 1000 deaths) attributed to or associated with AD. At the provincial level, La Pampa had the highest SDR, while the province of Santa Cruz had the lowest. Of all deaths related to AD, 68% were women and 31% were men. In the remaining cases, the biological sex of the deceased could not be determined. The most common code in the database was G30.9 (AD, unspecified), appearing in 85% of certificates. For death certificates with code G30.0 (early-onset AD), the mean age was 73.82 for women (SD 14.89) and 68.23 for men (SD 12.01).
All spatial analyses indicate positive autocorrelation with p-values below 0.001. Figure 2 displays choropleth maps that show the variability in VS and SDR by department. Several departments in the Central/Pampean region have the highest concentration of voters with surnames of Volga origin (see Fig. 2a). The maps show notably low SDR values in the departments of the Northwest (NWA), Northeast (NEA), and southern regions of the country (Fig. 2b). After calculating the bivariate Moran’s I for the two data sources (VS and SDR), three clusters with high and non-random values (Moran’s I = 0.19) were identified. Hot spots are located in the Central/Pampean region and in northern Patagonia, with high specific mortality rates related to AD and a high frequency of VSs.
Conversely, cold spots, or areas with low values, are located in the Northwest and Northeast regions (Fig. 2c). No significant spatial associations were found in the rest of the country.
The Q-Q and residual plots for the Poisson, Negative Binomial, and zero-inflated Poisson models are shown in Fig. 3. The AIC measures how well a model fits the data while penalising complexity. A model with a lower AIC is considered better than one with a higher AIC. The Negative Binomial model had the lowest AIC value (Table 3) and suggests that for each unit increase in VS, the SDR increases by 0.4%. The frequency of VSs alone explains 43.53% of the observed variability of SDR, according to the Nagelkerke R-squared.
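The link between a regression coefficient on the log scale and a percentage change in the rate can be shown with a short Python sketch. The coefficient below is a hypothetical value chosen so that the incidence rate ratio is close to 1.004 (an increase of about 0.4% per unit); the fitted coefficient itself is not reported in this section.

```python
import math


def incidence_rate_ratio(beta: float) -> float:
    """IRR for a one-unit increase in the predictor under a log link."""
    return math.exp(beta)


def percent_change(beta: float, units: float = 1.0) -> float:
    """Percentage change in the expected rate for an increase of `units`."""
    return (math.exp(beta * units) - 1.0) * 100.0


# Hypothetical coefficient chosen so that the IRR is close to 1.004.
beta_vs = 0.004
irr = incidence_rate_ratio(beta_vs)
change_1 = percent_change(beta_vs)       # one additional VS carrier per 1000 voters
change_10 = percent_change(beta_vs, 10)  # ten additional VS carriers per 1000 voters
```

Because effects multiply on the rate scale, ten additional units correspond to an increase of slightly more than ten times the one-unit change (about 4.08% rather than exactly 4%).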
Discussion
Diagnosing AD requires a combination of clinical evaluation, neuroimaging, and biomarkers. However, economic constraints often limit comprehensive studies, leading to underrepresentation or a lack of knowledge about the frequency of certain mutations. Molecular studies of genes associated with this disease are infrequent in Argentina. In their study, Bird et al. (1988) described families that shared a single N141I mutation in the PSEN2 gene. Although more than 10 additional mutations in PSEN2 have been reported, the N141I mutation has only been found in families of Volga origin, suggesting its specificity to this population group (Blauwendraat et al., 2016). Llibre-Guerra et al. (2020) conducted a meta-analysis in Latin America to identify pathogenic variants of autosomal dominant AD, including an investigation into the presence of N141I. Twenty-four variants were detected in 3,583 individuals at risk, mostly of European ancestry and typically attributable to founder effects. The frequency of these variants was higher in Colombia, followed by Puerto Rico and Mexico. A meta-analysis of 47 countries that reported variants in at least one of the three genes APP, PSEN1, and PSEN2 showed that the N141I variant is only found in Argentina, Germany, and the United States (Dehghani et al., 2021).
In 2010, Yu et al. identified a haplotype based on six single-nucleotide polymorphisms that cover the PSEN2 gene. The frequency of this haplotype was 0.64 in Volga Germans carrying the N141I mutation, compared to 0.26 in Volga Germans without the mutation and 0.13 in Europeans typed by the Centre d’Etude du Polymorphisme Humain. The study suggests that the N141I mutation in PSEN2 may have occurred before the emigration from the Hesse region to Russia. The discovery of families with this mutation living in Argentina and Germany suggests the possibility of additional cases sharing this common ancestry (Yu et al., 2010; Muchnik et al., 2015). The mutations responsible for familial forms have known biochemical consequences that are likely to be at the root of sporadic AD. Early interventions can delay or even prevent dementia in asymptomatic individuals and families at risk, as well as slow progression in those with symptoms (Bateman et al., 2011).
A study conducted at the beginning of the 21st century in Germany provided estimated values on the prevalence and incidence of dementing illness through large-scale epidemiological studies and meta-analyses (Bickel, 2000). The prevalence of dementia is higher in the states of Baden-Württemberg, Bavaria, Lower Saxony, North Rhine-Westphalia, and Hesse. This region is also the homeland of the Volga Germans. However, in recent times, there has been a change in this trend, particularly due to the increased frequency of cases of late-onset AD in female patients (Ziegler and Doblhammer, 2009; Lange et al., 2017). These differences are primarily attributed to women’s longer life expectancy, as advanced age remains the greatest risk factor for AD (Chêne et al., 2015). Of all the registered deaths due to AD in Argentina, 68% were women and 31% were men. A significant difference in the distribution of SDR and VS was observed between the Central/Pampean, Cuyo, and Patagonia regions compared to the Northwest (NWA) region. The NWA region comprises various environments, including the Andean foothills, where populations live at altitudes above 2,500 metres. Genomic studies indicate that the NWA populations have the highest proportion of the Central Andean ancestry component (Muzzio et al., 2018; Luisi et al., 2020). Meanwhile, individuals from the province of Misiones, in the Northeast (NEA) region, have the highest proportion of Central/Northern European ancestry. This aligns with the historical record of settlement of Polish, German, Danish, and Swedish colonies in this province (Luisi et al., 2020).
Geographical factors and the biological history of a population can also have an impact on the risk of developing dementia (Alzheimer’s Association, 2020). The migration of Volga Germans to Argentina should be distinguished from the migration of other German groups. According to Germanic sources, the massive migration took place between 1883 and 1890. However, the majority of the contingents that left through the Port of Bremen until 1870 went to the United States, Canada, and Brazil. Argentina only became a destination of interest after 1870. German migrants left from ports in Hannover and Bremen, and mainly came from cities in northern Germany, but also from the Duchy of Hesse (de Flachs, 1994). According to the Argentine national censuses, German migrants represented a minority, comprising no more than 2.35%, 1.70%, and 1.14% of the total population in the 1869, 1895, and 1914 censuses, respectively. The immigrants were mainly concentrated in the same provinces as the Volga Germans, but they also settled in other destinations in Argentina. For example, 2.3% of the German immigrants registered in the aforementioned censuses resided in the Northwest (NWA) region, and the majority of them were in the province of Tucumán. Volga migration remained stable in the provinces of Buenos Aires and La Pampa (Central/Pampean region). As shown in Table 2, the highest SDR in the country was recorded in the province of La Pampa (10.41). For this calculation, all deaths were considered according to ICD-10 codes G30.0 (early-onset AD), G30.1 (late-onset AD), G30.8 (other AD), G30.9 (AD, unspecified), and G31 (other degenerative diseases of the nervous system). The rate was also calculated specifically for codes G30.0 and G30.9 to better weigh the impact of deaths from early-onset AD and to account for the possibility of under-diagnosis. Once again, the province of La Pampa had the highest SDR (9.61), with 483 deaths in the period analysed.
German surnames display considerable lexical, phonological, and morphological variation, which is reflected in their distribution across different regions (Dräger and Schmuck, 2009). The comparison of the 50 most common surnames shows that the Volga Germans in Argentina are more closely related to the population of Hesse than to any other German state. This makes these surnames excellent markers for analysing large databases. Several dialects are spoken in Hesse. This linguistic diversity is also reflected in the specificity of family names. According to Rodriguez-Larralde et al. (1978), who based their research on the Lasker isonymic distance between German cities, this state is part of a cluster in southern Germany. The region is home to speakers of Rhine Franconian (including Hessian), Alemannic, and Bavarian (Konig, 1978). In Russia, the Volga German colonies were linguistic isolates with limited bilingualism and a situation of internal diglossia. Hipperdinger (2017) suggests that this sociolinguistic characteristic led to the replacement of Russian with Spanish in Argentina, a situation that lasted until the second half of the twentieth century. The VSs maintained their distinctiveness in both contexts.
Conclusion
AD is a significant health problem, especially due to the ageing population. Familial forms of AD represent a small percentage of cases but are critical to study. This requires a comprehensive understanding of the biology of the population. The combination of historical and official documents, such as electoral rolls, census data, and health information, can be used to analyse population dynamics. Our study uses a methodological approach that provides a coherent view of the spatial distribution of deaths from AD in Argentina and its relationship to migration processes by combining diachronic (surnames) and synchronic (death certificates) data. Nearly 150 years after the establishment of the first colonies, the population descended from the Volga migration remains highly concentrated in the southeastern departments of La Pampa Province and the southwestern departments of Buenos Aires Province. Within the Central Region, these two areas are contiguous. This demographic behaviour has health consequences. These departments are included in a statistically significant cluster with a high frequency of surnames of Volga origin and high SDRs from AD. Tracing surnames by origin is a cost-effective method for distinguishing structures within a seemingly homogeneous social group. These analyses provide a reliable basis for guiding patient recruitment in medical research and reducing sampling error by identifying where and when a pre-existing genetic pattern is likely to persist. This approach may be useful for describing complex migration scenarios in other Latin American countries undergoing similar population processes.
Acknowledgements
We express our gratitude to Dr. Rolando González-José and the authorities of the Cámara Nacional Electoral Argentina for providing us with access to the 2015 Electoral Register.
Funding statement
This work was supported by Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), grant PIP 2021–2023 number 11220200101902.
Competing interests
The authors declare no competing interests.
Ethical standard
The authors assert that all procedures contributing to this work comply with the ethical standards of the relevant national and institutional committees on human experimentation and with the Helsinki Declaration of 1975, as revised in 2008.