Hybrid Weighted Retrieval of Twitter Users for Temporally Relevant Full-Text Querying in the Media Industry

Hodgson, Ryan; Wang, Jingyun; Cristea, Alexandra I.; Graham, John

doi:10.1109/iiai-aai-winter58034.2022.00018

Authors

Ryan Hodgson ryan.t.hodgson@durham.ac.uk
Post Doctoral Research Associate

Dr Jingyun Wang jingyun.wang@durham.ac.uk
Assistant Professor

Professor Alexandra Cristea alexandra.i.cristea@durham.ac.uk
Professor

John Graham

Abstract

Barriers to the delivery of journalistic content to suitable media outlets present difficulties to both journalists and publishing houses. These may take the form of barriers to the identification of key individuals to whom the content is relevant, and who may be influential within their domain topic. To address this problem, we present a methodological approach to automated recommender systems for journalists or content writers, which enables the identification of social media accounts, based upon information retrieval techniques. This work investigates three concepts which, to our knowledge, have yet to be applied to such a domain. Firstly, how user-defined descriptions, commonly associated with user profiles on the Twitter platform, may impact upon the perceived accuracy of recommendations. Secondly, the application of a full-text query facility to accept journalistic-content queries. Finally, an investigation into time decay as part of the recommendation-ranking methodology. This results in a proposed novel hybrid weighting methodology, which provides advantages over baseline information retrieval techniques, and demonstrates the influence of temporal decay ranking upon model performance. As part of this work, a vectorised tweet dataset is provided, comprising 390,750 unique tweets, for use in wider research. In addition, we demonstrate an implementation in a real-world commercial setting, and provide an anonymised dataset to assist in future research into this domain.

Citation

Hodgson, R., Wang, J., Cristea, A. I., & Graham, J. (2022). Hybrid Weighted Retrieval of Twitter Users for Temporally Relevant Full-Text Querying in the Media Industry. https://doi.org/10.1109/iiai-aai-winter58034.2022.00018

Presentation Conference Type	Conference Paper (Published)
Conference Name	2022 13th International Congress on Advanced Applied Informatics Winter (IIAI-AAI-Winter)
Start Date	Dec 12, 2022
End Date	Dec 14, 2022
Publication Date	2022
Deposit Date	May 2, 2023
Publisher	Institute of Electrical and Electronics Engineers
DOI	https://doi.org/10.1109/iiai-aai-winter58034.2022.00018
Public URL	https://durham-repository.worktribe.com/output/1134886
Publisher URL	https://ieeexplore.ieee.org/xpl/conhome/1801921/all-proceedings