Artificial Intelligence and Collective Memory: An Experimental Framework for Algorithmic Preservation of Historical Narratives Using NLP and Deep Learning
Keywords:
Collective memory, artificial intelligence, natural language processing, mBERT, historical preservation
Abstract
We propose a research framework for preserving collective memory by applying artificial intelligence to historical texts. The quantitative experiments proceeded in two stages: we first validated the processing pipeline on the top 10,000 documents of the Europeana 1914–1918 dataset (Europeana Collections, n.d.) in three European languages, and then replicated the analysis on 1.2 million records in eight languages (Arabic, English, French, German, Kurdish, Russian, Spanish, and Swahili) dated between 1850 and 2023 (CherLenta's Roubiki dataset, n.d.). The mBERT transformer model was used for sentiment classification and named entity recognition, achieving average F1-scores of 0.87 and 0.79, respectively, on the complete dataset. We show that transformer-based models (Devlin et al., 2019) significantly outperform a Word2Vec+LSTM baseline (Mikolov et al., 2013) on these tasks. In addition, our temporal analysis of the extracted sentiment shows significant associations with major historical events in news archives. However, the model performs substantially worse on low-resource languages (Kurdish and Swahili) than on high-resource languages (English, French, and German), which points to the need for further work on extending corpora and processing methods. The paper also provides an integrated methodological framework that combines the interdisciplinary approaches of collective memory theory and social research with the computational and applied aspects of its further development, and it addresses key ethical challenges in AI-assisted historiography, including algorithmic bias, transparency, and cultural sensitivity.