Natural Language Processing Algorithms to Improve Digital Marketing Data Quality and its Ethical Implications

Pons, Sergi; Huertas-Garcia, Ruben; Lengler, Jorge; Nascimento, Daniel

Authors

Sergi Pons

Ruben Huertas-Garcia

Jorge Lengler

Daniel Nascimento

Abstract

The ethical implications of personalization in digital marketing are significantly greater when companies adapt their marketing actions to individual consumer preferences. While this approach helps to reduce oversaturation and a sense of irrelevance among consumers, it also raises concerns about privacy and potential algorithmic bias. One form of personalization is self-referencing, where companies use the customer’s name in all communications with that person. For this to be effective, customer data must be accurate and sourced from a high-quality database. This study presents a real case of data mining by a lead generation company, illustrating the sequential process of cleaning a database containing the names and surnames of 100,000 customers. In the final filtering step, we compared the performance of two Natural Language Processing (NLP) algorithms, Levenshtein and RapidFuzz, using ratio tests. The results demonstrate that the Levenshtein algorithm outperformed RapidFuzz, the former achieving a 93.43% clean dataset compared to the latter’s 92.93%. Finally, we discuss the ethical challenges posed by the privacy-personalization paradox, explore the theoretical and managerial implications, and propose future research directions that balance digital marketing interests with consumer privacy.
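As a rough illustration of the final filtering step described in the abstract, the sketch below scores a raw customer name against a small reference list using a Levenshtein-based similarity ratio. It is not the authors' pipeline: the function names, the 0.8 threshold, the lowercase normalization, and the choice to normalize the distance by the longer string's length are all assumptions made for this example. The ratio functions in the Levenshtein and RapidFuzz Python packages use their own normalizations and may return different scores.

```python
from typing import Optional, Sequence


def levenshtein_distance(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or
    substitutions needed to turn `a` into `b` (two-row dynamic programming)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def similarity_ratio(a: str, b: str) -> float:
    """Similarity in [0, 1]. Assumption: distance is normalized by the
    length of the longer string; library ratios may differ."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein_distance(a, b) / longest


def closest_reference_name(
    raw_name: str,
    reference_names: Sequence[str],
    threshold: float = 0.8,  # hypothetical cut-off, not taken from the study
) -> Optional[str]:
    """Return the best-matching reference name, or None if no candidate
    reaches the threshold."""
    candidate = raw_name.strip().lower()
    best_name, best_score = None, 0.0
    for ref in reference_names:
        score = similarity_ratio(candidate, ref.lower())
        if score > best_score:
            best_name, best_score = ref, score
    return best_name if best_score >= threshold else None
```

Under these assumptions, a record such as "Sergii" would be mapped to the reference name "Sergi", while a string with no close match would return `None` and could be flagged for manual review rather than used in a personalized message.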

Citation

Pons, S., Huertas-Garcia, R., Lengler, J., & Nascimento, D. (in press). Natural Language Processing Algorithms to Improve Digital Marketing Data Quality and its Ethical Implications. Psychology and Marketing.

Journal Article Type: Article
Acceptance Date: Mar 7, 2025
Deposit Date: Mar 10, 2025
Journal: Psychology and Marketing
Print ISSN: 0742-6046
Electronic ISSN: 1520-6793
Publisher: Wiley
Peer Reviewed: Peer Reviewed
Public URL: https://durham-repository.worktribe.com/output/3705677