AI, big data, and quest for truth: the role of theoretical insight

Tuba Bircan

doi:10.1017/dap.2024.36

Published online by Cambridge University Press: 18 October 2024


Author affiliation: BRISPO, Department of Sociology, Vrije Universiteit Brussel, Brussels, Belgium

Abstract

This paper explores the dynamic interplay between advanced technological developments in AI and Big Data and the sustained relevance of theoretical frameworks in scientific inquiry. It questions whether the abundance of data in the AI era reduces the necessity for theory or, conversely, enhances its importance. Arguing for a synergistic approach, the paper emphasizes the need for integrating computational capabilities with theoretical insight to uncover deeper truths within extensive datasets. The discussion extends into computational social science, where elements from sociology, psychology, and economics converge. The application of these interdisciplinary theories in the context of AI is critically examined, highlighting the need for methodological diversity and addressing the ethical implications of AI-driven research. The paper concludes by identifying future trends and challenges in AI and computational social science, offering a call to action for the scientific community, policymakers, and society. Positioned at the intersection of AI, data science, and social theory, this paper illuminates the complexities of our digital era and inspires a re-evaluation of the methodologies and ethics guiding our pursuit of knowledge.

Keywords

AI; big data; computational social science; social theory

Type: Commentary
Information: Data & Policy, Volume 6, 2024, e44

DOI: https://doi.org/10.1017/dap.2024.36
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright: © The Author(s), 2024. Published by Cambridge University Press

Policy Significance Statement

This research elucidates the transformative role of Big Data in migration studies, offering policymakers a compass for navigating the complexities of contemporary migration flows. It critically examines the integration of innovative data sources, such as mobile phone records and online activity, highlighting their potential to fill gaps in traditional migration statistics and enhance real-time responsiveness. The paper underscores the necessity for ethical stewardship and multidisciplinary collaboration in harnessing these data sources, ensuring that policies are not only informed by robust empirical evidence but are also aligned with ethical imperatives. By doing so, it provides a strategic blueprint for evidence-based policymaking that is attuned to the nuanced realities of migration and respects the dignity of individuals.

1. Introduction

There are truths to be discovered; that knowledge is possible.

Plato

In the rapidly evolving landscape of data, driven by unparalleled technological progress, we are confronted with a pivotal question: Does the deluge of data necessitate a theoretical underpinning for the discovery of truth? The burgeoning fields of Artificial Intelligence (AI) and Big Data are not only redefining our capabilities to gather and analyze vast datasets but are also challenging our traditional understanding of scientific methodologies. This paper aims to dissect the dynamic interplay between these advanced technologies and the enduring relevance of theoretical frameworks in unearthing deeper truths within data-rich environments.

As we navigate this interconnected and data-saturated world, the rapid expansion of data collection outpaces existing regulatory frameworks, which grapple with adapting to novel norms of data accumulation. This phenomenon is observable across a broad spectrum of surveillance tools, including public security apparatus, satellites, and omnipresent Wi-Fi networks, as well as in methods centered on the user, such as mobile phone tracking and social media analytics. These methods significantly enrich the data landscape, offering a comprehensive perspective on complex phenomena. Yet, to harness these vast opportunities effectively, sophisticated computational methodologies, particularly in the domain of AI, are indispensable.

The ascendance of Big Data and AI, at the nexus of data science and social theory, plays a critical role in addressing these multifaceted inquiries. It engenders a symbiotic relationship between empirical data and theoretical insights, which is vital for deciphering and shaping the contours of our digital society. This burgeoning field has reignited a crucial debate within the scientific community regarding the enduring role of theory in scientific methodology. Is it becoming obsolete in the face of AI’s capabilities, or is it more necessary than ever?

Critics of AI, such as Boyd and Crawford (2012), highlight the ethical implications and potential biases embedded within AI systems, underscoring the necessity for a robust theoretical and ethical framework to guide AI development and implementation. Relatedly, Burrell (2016) discusses the opacity inherent in machine learning algorithms, which often conceal biases and decision-making processes, thus advocating for increased transparency through theoretical understanding. A less discussed yet significant concern is the shift in reasoning patterns, away from traditional inductive and deductive reasoning and toward Big Data analytics’ emphasis on pattern recognition. This shift prompts a critical examination: In an era dominated by extensive and instantaneous data, does theory retain its significance?

This commentary seeks to explore the complex dynamics between data, theory, and truth in the context of rapid technological advancements. By fostering a critical yet forward-thinking dialogue on how to employ these tools ethically and innovatively, we aim to contribute to the scientific and policy-making communities’ efforts to navigate this complex landscape. We will explore the implications of Big Data and AI beyond mere pattern recognition, examine the integration of theoretical frameworks in computational social science (CSS), and propose methodologies for integrating theory and reasoning in the digital age. Each phase of our discussion is designed to build upon the last, forming a cohesive argument that champions a synergistic approach to the integration of computational capabilities with theoretical insight.

In doing so, we acknowledge and build upon the seminal works of Astleitner (2024) and Cabrera (2021), who have robustly argued for the integration of theoretical frameworks in the analysis of Big Data. We propose a novel approach that addresses gaps in the current discourse. Our work uniquely focuses on the practical integration of these theoretical frameworks within computational tools and methodologies, offering new strategies for operationalizing these theories in the face of rapidly evolving AI technologies. We aim to extend the discussion beyond the theoretical necessity in data sciences to include concrete examples of theory-driven data analysis that enhance both ethical oversight and methodological robustness. By doing so, we not only underscore the enduring relevance of theoretical insights in the digital age but also illustrate innovative ways these theories can be pragmatically applied to ensure that technological advancements in AI and Big Data contribute positively to our societal and scientific objectives.

2. Big data and AI: beyond pattern recognition

Computers are useless. They only give you answers.

Picasso

Big Data and AI, prominent forces in the digital revolution, radically redefine our understanding of pattern recognition and data interpretation. This technological renaissance necessitates a critical appraisal of theory’s role in effectively harnessing these tools.

In healthcare, particularly diagnostic imaging, AI algorithms, trained on vast datasets, have showcased their remarkable ability to identify pathologies, occasionally surpassing the discernment of experienced radiologists. This achievement is not merely a testament to the computational might of AI but also illuminates the foundational medical theories that steer both the training of these algorithms and their interpretative processes. For instance, algorithms designed to detect tumors on radiographs must rely on established medical knowledge about tumor characteristics, which shapes the training data and algorithmic parameters. This interplay of data-driven technology and theory emphasizes the essential role of theoretical frameworks in enhancing the accuracy and reliability of AI applications in healthcare.

However, while AI’s capability to identify complex patterns is indispensable, it also brings forth significant challenges. Misinterpretations by AI, driven by data devoid of theoretical underpinning, can lead to significant ramifications, as exemplified by the Cambridge Analytica scandal. This instance serves as a stark reminder of the dire outcomes when AI is employed without a robust theoretical and ethical framework, highlighting the critical need for integrating theory to navigate the ethical implications and biases inherent in AI systems.

The advent of deep learning and neural networks has been pivotal in advancing AI’s capabilities. These technologies, capable of processing vast datasets, have sparked debates concerning their “black box” nature, raising pertinent questions about their interpretability and accountability. Insights grounded in theory regarding algorithmic transparency are indispensable, ensuring these tools are leveraged not only for their computational power but also for their ethical integrity.

In addressing the “black box” issue, Burrell (2016) emphasizes the opacity of machine learning algorithms and the challenges this poses for accountability and transparency. By integrating theoretical insights into the development and deployment of these technologies, we can begin to peel back the layers of the “black box” to reveal the decision-making processes within, thereby enhancing their interpretability and trustworthiness.

In the quest to exploit AI and Big Data, the indispensability of theory is unequivocal. Theoretical frameworks guide the formulation of hypotheses, the design of algorithms, and the interpretation of outcomes, safeguarding scientific validity and ethical responsibility. The discussion herein advocates for a novel, integrated approach where theory and technology are seen not as disparate entities but as symbiotic components of scientific inquiry. This approach promises not only to amplify our capabilities for data interpretation but also to ensure adherence to scientific rigor and ethical standards.

Therefore, embracing theoretical insight in AI and Big Data is not merely a philosophical stance but a practical necessity. By fostering a robust coaction between data-driven insights and theoretical understanding, we position ourselves to unlock profound discoveries while adhering to the highest standards of scientific integrity and ethical conduct. This section has illustrated the essential role of theory in ensuring that AI and Big Data technologies are not only powerful but also principled in their application, contributing meaningfully to our ongoing quest for knowledge and truth in an increasingly complex world.

3. Theoretical frameworks in computational social science: an interdisciplinary synthesis

The purpose of computing is insight, not numbers.

Richard Hamming

The burgeoning discipline of CSS represents a paradigm shift, where the integration of diverse theoretical frameworks from sociology, psychology, economics, and beyond is crucial for a profound comprehension of social phenomena. This interdisciplinary fusion, foundational to CSS, leverages robust theoretical insights to illuminate the intricate structures and dynamics of societies. For instance, Emile Durkheim’s seminal theory of social integration and anomie (Durkheim, 1893/1984) serves as a lens through which we can analyze patterns of social cohesion and fragmentation within digital communities. Furthermore, Daniel Kahneman’s “Dual Process Theory” (Kahneman, 2011), which distinguishes between intuitive and analytical thought processes, offers invaluable insights into the behavioral patterns manifest in social media interactions. This theory, when applied through CSS methodologies, allows us to dissect the subtle nuances of human decision-making that are amplified in digital environments.

The strategic interactions among rational decision-makers, as explored through John Nash’s game theory (Nash, 1951), resonate within CSS by utilizing algorithmic models to simulate and scrutinize scenarios. This application sheds light on collective behaviors and decision-making processes within virtual environments, demonstrating the value of integrating established theories with modern computational techniques. Moreover, the era of Big Data amplifies the significance of integrating ethical theories, ensuring that computational analyses navigate the moral complexities of data handling without perpetuating biases or inequalities (Mittelstadt et al., 2016).

However, traditional disciplinary theories often face challenges in encapsulating the multifaceted nature of contemporary society amid relentless technological advancement. The rapid evolution of AI and the proliferation of Big Data illustrate technological shifts that traditional frameworks, established in eras of slower change and data scarcity, struggle to fully apprehend. This discrepancy underscores the need for adaptive theoretical approaches that evolve in tandem with technological innovation and its societal ramifications (Lazer et al., 2009).

As noted by Canali (2016), the integration of Big Data and epistemology in projects like EXPOsOMICS demonstrates how Big Data can influence our understanding of causality in complex systems. Such integration necessitates revisiting and potentially revising traditional epistemological theories to better fit the data-rich landscapes we now navigate.

The interconnected nature of global challenges necessitates interdisciplinary approaches, drawing insights from multiple fields to cultivate a more holistic understanding. Classical theories, while foundational, often lack the flexibility to integrate such diverse perspectives, highlighting the value of CSS as a bridge between disciplines (Castells, 1996; Mayer-Schönberger and Cukier, 2013).

3.1. Extending the theoretical frameworks

Extending these theoretical frameworks within CSS to encompass recent interdisciplinary insights and address the limitations of traditional theories in grappling with digital era challenges will provide a more robust understanding of social phenomena. For example, the field of network science provides innovative methods for analyzing social structures through network analysis, offering new perspectives on social phenomena (Barabási, 2016). These approaches highlight the dynamic, rapidly evolving nature of digital landscapes and the emerging issues they present, such as digital inequalities and the impact of algorithmic decision-making.

Emerging issues such as digital inequalities, the impact of algorithmic decision-making, and the ethics of AI require theoretical frameworks that are not only interdisciplinary but also flexible and responsive to the pace of digital innovation (O’Neil, 2016; Eubanks, 2018). This is crucial for ensuring that CSS does not merely react to technological advancements but actively guides them in a manner that is ethically sound and socially responsible.

In conclusion, the advancement of CSS necessitates a commitment to theoretical rigor and methodological diversity, ensuring that our exploration of social phenomena is both grounded in a rich theoretical tradition and attuned to the complexities of the digital age. The integration of computational methods with established and emerging social theories promises not only to enhance our understanding of societal dynamics but also to navigate the ethical and practical challenges posed by the digital transformation of society. Advocating for a synthesis of the old and the new ensures that our scientific endeavors are both innovative and deeply rooted in a broad spectrum of scholarly knowledge, enhancing the societal relevance and impact of our research.

4. Integrating theory and reasoning in the digital age: challenges and paradigms

Isn’t the science built on questioning what others believed as truth?

Socrates

In this digital age, traditional scientific paradigms face unprecedented challenges. The reciprocity between vast data quantities and advanced computational technologies has necessitated a profound re-evaluation of how theoretical frameworks are integrated and applied. A poignant example of this is the phenomenon of “echo chambers” and “filter bubbles” in online social networks. Pariser’s (2011) concept of the “filter bubble,” where algorithms determine the information users see online, reinforcing existing beliefs, highlights a challenge not entirely anticipated by traditional media theories. These digital constructs exacerbate social polarization and misinformation, underscoring the need for theories that account for the algorithms’ role in shaping public discourse (Sunstein, 2017).

Furthermore, the rapid proliferation of gig economy platforms, such as Uber and Airbnb, has challenged traditional economic theories (Rosenblat, 2018), which have struggled to adapt to the nuances of digital marketplaces and their impact on labor rights and housing markets. These platforms represent a shift in how labor is commodified and managed, raising questions about the adequacy of existing employment laws and economic models that fail to capture the transient and often precarious nature of gig work.

Similarly, the use of AI in predictive policing illustrates the limitations of relying solely on data-driven approaches in areas requiring a nuanced understanding of social contexts. Systems that predict crime hotspots, though intended to enhance public safety, often rely on historical data that may perpetuate biases against marginalized communities, raising ethical concerns not fully addressed by existing theories on crime and policing (Richardson et al., 2019). This not only highlights deficiencies in traditional criminology theories but also stresses the ethical risks associated with the uncritical application of data analytics in sensitive societal areas.
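The way historical records can perpetuate a disparity may be easier to see in a deliberately simplified sketch. The toy model below is an illustration only: the two districts, the starting record counts, and the rule that the fraction of incidents recorded equals a district's patrol share are all hypothetical assumptions, not drawn from any cited study. Both districts are assumed to have identical true incident counts; only the inherited records differ.

```python
def recorded_share(assume_equal_rates=False, records_a=60.0, records_b=40.0,
                   true_incidents=100.0, rounds=20):
    """Track district A's share of recorded incidents over successive rounds.

    All numbers are hypothetical. Both districts have the same number of
    true incidents per round; only the historical records differ.
    """
    shares = []
    for _ in range(rounds):
        share_a = records_a / (records_a + records_b)
        shares.append(share_a)
        # Patrol allocation: either copy the historical record, or act on a
        # prior assumption that the underlying rates are equal.
        patrol_a = 0.5 if assume_equal_rates else share_a
        # Assumption: the fraction of incidents recorded in a district
        # equals that district's share of patrols.
        records_a += true_incidents * patrol_a
        records_b += true_incidents * (1.0 - patrol_a)
    return shares


data_driven = recorded_share(assume_equal_rates=False)
theory_informed = recorded_share(assume_equal_rates=True)
print(f"records-driven share for A:   {data_driven[0]:.3f} -> {data_driven[-1]:.3f}")
print(f"equal-rates prior share for A: {theory_informed[0]:.3f} -> {theory_informed[-1]:.3f}")
```

Under these assumptions, allocating patrols from the records alone reproduces the initial imbalance indefinitely, whereas an allocation guided by an explicit prior about the underlying rates lets the records drift back toward parity. The sketch says nothing about real policing systems; it only shows why data cannot correct a bias that shapes how the data are collected.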

A particularly striking example of the limitations of purely data-driven approaches is evident in the use of AI for migration management in the EU. In the last decade, many governments and agencies employed AI not just for predicting migration flows but also for managing these movements at a granular level. Applications ranged from processing asylum applications to deploying border surveillance technologies. While these systems were lauded for their efficiency and ability to handle large volumes of data swiftly, they frequently failed to address the complex human realities behind migration. Such predictive models often overlook the nuanced reasons behind migration, such as persecution, conflict, or the impact of climate change, which require deep socio-political understanding to manage ethically and effectively.

These technological applications highlight a critical oversight: while AI can offer logistical support, it lacks the capability to fully understand or address the socio-political implications of migration, such as integration challenges, cultural nuances, and individual human rights concerns. This over-reliance on quantitative data and AI algorithms risks simplifying migration into a series of management problems to be “solved” rather than complex human situations needing compassionate and informed responses. The secretive nature of data use, as critiqued by the European Data Protection Supervisor, further complicates this, raising ethical concerns about privacy and the potential misuse of personal data (European Data Protection Board & European Data Protection Supervisor, 2021).

This scenario calls for an urgent need for theoretical innovation in CSS that transcends traditional disciplinary boundaries, advocating for the development of integrative theories adept at investigating the complexities of the digital age. Such theories should not only reflect the technological underpinnings of societal transformations but also proactively consider the ethical implications of digital innovations to foster a just and equitable digital future.

Addressing the societal challenges of the digital age demands an evolution in our theoretical frameworks. This evolution entails a shift toward more dynamic, adaptive theories capable of grappling with the rapid pace of technological change and its multifaceted impact on society. By integrating insights from data science, ethics, and technology studies, we can forge a theoretical foundation robust enough to navigate the digital landscape’s complexities, ensuring that our pursuit of knowledge leads to equitable and sustainable outcomes for society at large.

The correspondence theory, attributed to Aristotle, Plato, and Socrates, which posits a link between truth and reality, now faces challenges in an age where truth is increasingly fragmented. The rapid technological progression marked by an exponential increase in data volume and complexity prompts a pivotal shift in methodologies.

The advent of supercomputers, AI, and machine learning heralds a new era, where observational inferences can be drawn from data patterns, potentially bypassing established scientific reasoning. This evolution, while presenting unprecedented opportunities, also harbors risks—chief among them, the erosion of the scientific method’s core principles. Hence, maintaining unbiased, falsifiable, and reproducible scientific reasoning is indispensable in the Big Data era, emphasizing the primacy of guiding questions before poring over data analysis.

The ethical ramifications of this paradigm shift are profound and intertwine methodological concerns with ethical considerations. The phenomenon of algorithmic opacity, as highlighted by Burrell (2016), complicates this further. As AI-driven research delves deeper, not only does it challenge the traditional dichotomy between inductive and deductive reasoning, but it also demands a critical examination of how these algorithms make decisions. This opacity often leaves practitioners grappling with the “black box,” making it imperative to develop methodologies that increase transparency and accountability.

In response to these challenges, we advocate for a theoretical and methodological renaissance in CSS, aiming to reconcile the digital age’s societal challenges with robust scientific inquiry. We propose a novel integrative approach that combines the rigor of classical scientific methods with the innovative capabilities of modern technology. This approach not only facilitates a deeper understanding of complex data sets but also ensures that our methodologies evolve in line with ethical and social advancements.

For instance, integrating theoretical insights from social sciences with computational methods could involve employing simulation models that reflect both current societal dynamics and theoretical understandings of social behavior. These models can serve as test beds for hypotheses, allowing for iterative refinement in a controlled yet realistic digital environment. Such integration ensures that the insights derived are both empirically valid and theoretically grounded.
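As one hedged illustration of a simulation used as a test bed, the sketch below implements a small bounded-confidence opinion model (in the style usually attributed to Deffuant and colleagues; the choice of model is ours and is not taken from the works cited here). The theoretical assumption is that two people adjust their opinions toward each other only when they already differ by less than some tolerance. The hypothesis to test is that a narrower tolerance leaves the population split into more opinion clusters. Function names, parameter values, and the cluster-counting rule are all illustrative assumptions.

```python
import random


def simulate_opinions(n_agents=200, tolerance=0.2, rate=0.5,
                      steps=50_000, seed=0):
    """Run a bounded-confidence opinion model and return final opinions."""
    rng = random.Random(seed)
    opinions = [rng.random() for _ in range(n_agents)]  # opinions in [0, 1)
    for _ in range(steps):
        i, j = rng.sample(range(n_agents), 2)  # pick two distinct agents
        gap = opinions[j] - opinions[i]
        # Theory-driven rule: interaction only within the tolerance band.
        if abs(gap) < tolerance:
            opinions[i] += rate * gap
            opinions[j] -= rate * gap
    return opinions


def count_clusters(opinions, min_gap=0.05):
    """Count groups of opinions separated by more than min_gap."""
    ordered = sorted(opinions)
    clusters = 1
    for left, right in zip(ordered, ordered[1:]):
        if right - left > min_gap:
            clusters += 1
    return clusters


wide = count_clusters(simulate_opinions(tolerance=0.6))
narrow = count_clusters(simulate_opinions(tolerance=0.1))
print(f"clusters with wide tolerance:   {wide}")
print(f"clusters with narrow tolerance: {narrow}")
```

The point of such a model is not realism but refutability: the tolerance parameter encodes a theoretical claim, and the simulated outcome can be compared against observed patterns and revised when it fails.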

Hence, by embracing dynamic, adaptive theories and methodologies, we position ourselves to navigate the complexities of the digital landscape effectively, ensuring our pursuit of knowledge contributes to a just, informed, and equitable digital future. This integrative approach promises not only profound discoveries but also a principled, purpose-driven application of technology in scientific research, aligning with our ethical obligations and the broader goals of societal well-being.

5. Computational social science and social dynamics

Models are to be used, not believed.

Henri Theil

As we probe further into the nexus of disciplines that form CSS, it becomes evident that this field uniquely synthesizes computer science, statistics, and social sciences, enabling a comprehensive approach to understanding social dynamics. This amalgamation leverages vast digital datasets and sophisticated computational methods, offering unprecedented insights into human interactions and societal patterns.

Despite the objectivity and factual inferences data-driven methodologies in CSS promise, they often grapple with the intricacies of interpreting complex human behaviors. While computational methods proficiently delineate “what” is present within large datasets, spotting patterns, anomalies, and trends, the deciphering of causality behind these phenomena poses significant challenges. Anderson’s (2008) assertion that correlations may eclipse causations in data-rich environments provokes a re-evaluation of the role of theoretical frameworks amidst the data deluge. Theoretical insights are indispensable for providing a scaffold that aids in interpreting computational outcomes and ensuring they are scientifically vetted.
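The gap between detecting a pattern and explaining it can be shown with a minimal synthetic example. In the sketch below, two variables x and y have no causal link to each other; both are driven by a third variable z. The variable names, the data-generating process, and the sample size are assumptions made purely for illustration. A pattern-finding step sees a clear correlation between x and y; only a prior hypothesis about z tells the analyst to control for it.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def residuals(ys, zs):
    """Remove the linear effect of zs from ys using least squares."""
    n = len(ys)
    mean_y, mean_z = sum(ys) / n, sum(zs) / n
    slope = (sum((z - mean_z) * (y - mean_y) for z, y in zip(zs, ys))
             / sum((z - mean_z) ** 2 for z in zs))
    return [y - mean_y - slope * (z - mean_z) for y, z in zip(ys, zs)]


rng = random.Random(42)
n = 5000
z = [rng.gauss(0, 1) for _ in range(n)]   # hypothetical common cause
x = [zi + rng.gauss(0, 1) for zi in z]    # driven by z, not by y
y = [zi + rng.gauss(0, 1) for zi in z]    # driven by z, not by x

raw = pearson(x, y)                                   # pattern in the data
partial = pearson(residuals(x, z), residuals(y, z))   # after controlling for z
print(f"raw correlation:     {raw:.2f}")
print(f"partial correlation: {partial:.2f}")
```

The data alone never reveal that z should be measured or controlled for; that decision comes from a theory of how the variables relate.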

Network analysis and simulation models stand out in CSS for their capacity to dissect complex social systems. Network analysis elucidates the architecture and dynamics of social networks, proving indispensable for exploring online interactions, organizational frameworks, and epidemiological spread. Simulation models, on the other hand, enable researchers to probe virtual social systems, generating insights critical for policy development and strategic decision-making. The wide adoption of AI and machine learning has revolutionized pattern recognition, yet their “black box” nature often obscures the reliability and interpretability of findings. This opacity accentuates the indispensable role of theory in CSS, where scientifically vetted theories provide a scaffold for interpreting computational outcomes.

As CSS continues to evolve, embracing methodological pluralism, interpretative approaches, and explanatory models is imperative (Törnberg and Uitermark, 2021). The synergy between computational techniques and theoretical understanding forms the cornerstone of CSS, enabling a comprehensive understanding of the complex nature of social interactions and phenomena. Theories in CSS offer more than just an explanatory context; they provide a lens to view and understand social dynamics’ complexities. These theories help situate computational findings within the broader narrative of human behavior and societal structures.

For example, theories of social capital and network theory can explain the dynamics observed in online communities or the spread of information in social networks. Real-life examples of this interdisciplinary synergy include studies on political polarization in social media spaces, where CSS methodologies, combined with theories of political communication, unpack the mechanisms driving ideological divides (Bail et al., 2018). Similarly, research into social mobility patterns using Big Data analytics offers insights into economic theories of inequality and opportunity (Chetty et al., 2014).

In conclusion, the journey of CSS is characterized by both its capacity to manipulate extensive datasets and its dedication to theoretical precision. The fusion of cutting-edge computational methods with foundational social theories is vital, enriching our comprehension of complex social phenomena and anchoring our discoveries in a richer, more contextualized framework. This synthesis not only enhances our understanding but also ensures that our technological advances in data analytics and machine learning are leveraged responsibly and ethically, contributing to the broader objectives of human well-being and societal advancement.

6. Data, evidence, and knowledge in the age of AI

It is a capital mistake to theorise before one has data.

Sir Arthur Conan Doyle, Sherlock Holmes

Scrutinizing the concept of “evidence” in both traditional and computational frameworks for Big Data and AI is paramount. Big Data, often perceived as inherently unstructured and arbitrary, is fundamentally purpose-driven. The mechanisms of data creation, storage, and processing are always influenced by specific objectives, which may inadvertently lead to biased representations and potentially unfair outcomes.

The critical evaluation of AI analytics, dependent on data volume, requires a nuanced understanding that the most extensive datasets do not necessarily equate to the most representative or accurate datasets. Herschel and Miori (2017) argue that ethical considerations in Big Data practices must address the complexities of data curation and usage to prevent biases and ensure transparency. This scrutiny is crucial as the assumption that “Big Data is better data” can obscure significant biases, makeshift measurements, lack of transparency, and poor theoretical underpinning.
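The claim that volume does not guarantee representativeness can be checked with a short synthetic experiment. The sketch below is an assumption-laden illustration: the population, the 30 percent trait rate, and the inclusion probabilities (people with the trait are three times as likely to leave a digital trace) are invented for the example. It compares a very large but self-selected dataset with a small simple random sample.

```python
import random

rng = random.Random(7)

# Hypothetical population: each person either has a trait (True) or not.
population = [rng.random() < 0.30 for _ in range(200_000)]
true_rate = sum(population) / len(population)

# "Big" dataset: inclusion depends on the trait itself (selection bias).
# Assumed inclusion probabilities: 0.60 with the trait, 0.20 without.
big_sample = [person for person in population
              if rng.random() < (0.60 if person else 0.20)]

# Small dataset: a simple random sample of 1,000 people.
small_sample = rng.sample(population, 1000)

big_estimate = sum(big_sample) / len(big_sample)
small_estimate = sum(small_sample) / len(small_sample)

print(f"true rate:               {true_rate:.3f}")
print(f"big sample (n={len(big_sample)}):  {big_estimate:.3f}")
print(f"small sample (n={len(small_sample)}): {small_estimate:.3f}")
```

Under these assumptions the large dataset is far off the true rate while the small random sample is close to it, and collecting more self-selected records would not shrink the gap. Correcting it requires an explicit model of how the data came to exist.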

To address these challenges, a fusion of models and theoretical insights is essential. Theoretical frameworks guide the ethical gathering, curation, and interpretation of data, ensuring that AI and machine learning models are built and utilized responsibly. While social theory can effectively address key methodological and analytical questions that technical solutions fail to answer (Radford and Joseph, 2020), the epistemological challenges necessitate a robust integration of theory to interpret data accurately and effectively (Canali, 2016).

In the social sciences, evidence goes beyond data. Schools of thought and methodologies transform data into information, and through critical thinking and analysis, this information evolves into knowledge. The essence of truth lies in the veracity of knowledge, an ideal increasingly challenged by the vast swathes of data generated in the digital age. Pietsch (2021) emphasizes that while algorithms and data processing techniques offer sophisticated means to handle large datasets, they do not replace the need for a theoretical basis that enhances understanding and ensures the application of knowledge is both meaningful and ethically sound.

As AI and Big Data continue to advance in social sciences, establishing a robust theoretical foundation is imperative, enabling the transformation of Big Data into insightful and meaningful knowledge. Such an approach not only respects the integrity of data sources and processes but also enhances the societal value of the information derived from them. This relationship between data-driven and theory-driven methodologies fosters a synergistic environment where technological advances and theoretical insights mutually enhance each other, leading to more profound and ethically grounded discoveries.

In short, evidence and knowledge demand a rigorous commitment to theoretical integration and ethical consideration. This commitment ensures that our advancements in AI and Big Data not only push the boundaries of what is technologically possible but also adhere to the highest standards of scientific rigor and ethical responsibility, ultimately contributing to a well-informed and equitable society.

7. Conclusion: a call to action

We can’t solve problems by using the same kind of thinking we used when we created them.

Albert Einstein

As we stand on the verge of extraordinary technological breakthroughs in AI and CSS, we must confront the dual challenge and opportunity these advancements present. The profound integration of computational innovation with ethical, theoretical, and socially responsible frameworks is not only advantageous but imperative.

The digital transformation of society demands a robust response from the academic community, policymakers, and practitioners. It compels us to think critically about how we integrate and apply technological innovations. The call to action is clear: we must foster a deep and ongoing collaboration between the realms of technology development and social theory. This synergy will enable us to harness the full potential of AI and Big Data while ensuring that our advancements promote social equity and uphold human dignity.

To achieve this, we propose a strategic framework that emphasizes the following key actions:

7.1. Enhanced interdisciplinary collaboration

Bridging computational techniques with theoretical insights requires continuous dialogue between computer scientists, sociologists, ethicists, and other stakeholders. This collaboration will enrich our understanding of technological impacts and foster innovations that are both socially informed and technologically sound.

7.2. Regulatory and ethical frameworks

As technology evolves, so too must our regulatory and ethical frameworks. This involves not only adapting existing policies but also envisioning new governance structures that anticipate future developments and prevent misuse. Stricter data privacy laws, transparent algorithmic processes, and inclusivity in AI design are essential steps toward accountable and fair technological practices.

7.3. Public engagement and policy advocacy

Effective policy change requires informed public discourse. By engaging with and educating the broader community about the implications of AI and CSS, we can create a well-informed citizenry that actively participates in shaping the technological landscape. Policymakers must be urged to create laws and regulations that reflect the nuanced realities of technological impacts, fostering an environment where innovation flourishes within the bounds of ethical and social responsibility.

7.4. Focused research on ethical AI use

We advocate for targeted research initiatives that explore the ethical dimensions of AI application, particularly in critical areas such as healthcare, criminal justice, migration, and public administration. These efforts should aim to develop models of best practices that ensure AI tools enhance societal welfare without compromising ethical standards.

In conclusion, this paper calls not only for continued innovation in the fields of AI and CSS but also for a profound commitment to the ethical integration of these technologies into society. It is our collective responsibility to ensure that the pursuit of knowledge remains a pursuit of truth, guided by ethical principles and a deep commitment to societal welfare. This vision for the future of AI and CSS is not merely aspirational but a necessary direction to ensure that our technological advancements yield benefits for all segments of society. This conclusion serves to galvanize action toward a more informed, just, and equitable digital future, ensuring that our technological advances reflect our highest values and aspirations.

Competing interest

The author declares none.

References

Anderson, C (2008) The end of theory: The data deluge makes the scientific method obsolete. Wired Magazine 16(7).

Astleitner, H (2024) We have big data, but do we need big theory? Review-based remarks on an emerging problem in the social sciences. Philosophy of the Social Sciences 54(1), 69–92.

Bail, CA, Argyle, LP, Brown, TW, Bumpus, JP, Chen, H, Hunzaker, MF, Lee, J, Mann, M, Merhout, F and Volfovsky, A (2018) Exposure to opposing views on social media can increase political polarization. Proceedings of the National Academy of Sciences 115(37), 9216–9221.

Barabási, A-L (2016) Network Science. Cambridge: Cambridge University Press.

Boyd, D and Crawford, K (2012) Critical questions for big data. Information, Communication and Society 15(5), 662–679.

Burrell, J (2016) How the machine “thinks”: Understanding opacity in machine learning algorithms. Big Data & Society 3(1), 1–12.

Cabrera, F (2021) The fate of explanatory reasoning in the age of big data. Philosophy & Technology 34(4), 645–665.

Canali, S (2016) Big data, epistemology and causality: Knowledge in and knowledge out in EXPOsOMICS. Big Data & Society 3(2), 1–11.

Castells, M (1996) The Rise of the Network Society. Oxford: Blackwell.

Chetty, R, Hendren, N, Kline, P and Saez, E (2014) Where is the land of opportunity? The geography of intergenerational mobility in the United States. The Quarterly Journal of Economics 129(4), 1553–1623.

Durkheim, É (1893/1984) The Division of Labor in Society (Halls, WD, Trans.). New York, NY: The Free Press.

Eubanks, V (2018) Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor. New York, NY: St. Martin’s Press.

European Data Protection Board & European Data Protection Supervisor (2021) EDPB-EDPS Joint Opinion 03/2021 on the Proposal for a Regulation of the European Parliament and of the Council on European Data Governance (Data Governance Act). Available at: https://www.edpb.europa.eu/system/files/2021-03/edpb-edps_joint_opinion_dga_en.pdf

Herschel, R and Miori, VM (2017) Ethics & big data. Technology in Society 49, 31–36.

Kahneman, D (2011) Thinking, Fast and Slow. New York: Farrar, Straus and Giroux.

Lazer, D, Pentland, A, Adamic, L, Aral, S, Barabási, A-L, Brewer, D, … Van Alstyne, M (2009) Computational social science. Science 323(5915), 721–723.

Mayer-Schönberger, V and Cukier, K (2013) Big Data: A Revolution that Will Transform how we Live, Work, and Think. London, UK: John Murray.

Mittelstadt, B, Allo, P, Taddeo, M, Wachter, S and Floridi, L (2016) The ethics of algorithms: Mapping the debate. Big Data & Society 3(2), 2053951716679679.

Nash, J (1951) Non-cooperative games. In Annals of Mathematics, Princeton, NJ: Princeton University Press, pp. 286–295.

O’Neil, C (2016) Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. New York, NY: Crown.

Pariser, E (2011) The Filter Bubble: What the Internet Is Hiding from you. London: Penguin UK.

Pietsch, W (2021) Big Data. Cambridge: Cambridge University Press.

Radford, J and Joseph, K (2020) Theory in, theory out: The uses of social theory in machine learning for social science. Frontiers in Big Data 3, 18.

Richardson, L, Schultz, JM and Crawford, K (2019) Dirty data, bad predictions: How civil rights violations impact police data, predictive policing systems, and justice. New York University Law Review 94, 192–233.

Rosenblat, A (2018) Uberland: How Algorithms Are Rewriting the Rules of Work. Oakland, CA: University of California Press.

Sunstein, CR (2017) #Republic: Divided Democracy in the Age of Social Media. Princeton, NJ: Princeton University Press.

Törnberg, P and Uitermark, J (2021) For a heterodox computational social science. Big Data & Society 8(2), 20539517211047724.
