Yaakov HaCohen-Kerner
Author profiling: Gender prediction from Tweets and images: Notebook for PAN at CLEF 2018
HaCohen-Kerner, Yaakov; Yigal, Yair; Shayovitz, Elyashiv; Miller, Daniel; Breckon, Toby
Authors
Yair Yigal
Elyashiv Shayovitz
Daniel Miller
Professor Toby Breckon toby.breckon@durham.ac.uk
Abstract
Author profiling deals with identifying various details about the author of a text (e.g., age, cultural background, gender, native language, personality). In this paper, we describe the participation of our teams (yigal18 and miller18; both teams contain the same people, listed in a different order) in the PAN 2018 shared task on author profiling, which involves identifying authors' gender where, for each author, 100 tweets and 10 images are provided. The authors were grouped by the language of their tweets: English, Spanish, and Arabic. We describe our pre-processing, feature sets, machine learning methods, and accuracy results. The best results using the textual features were achieved using the MLP method after applying the L normalization and using 9,000 word unigrams for English, 10,000 word unigrams and one orthographic feature for Spanish, and 7,000 word unigrams and one orthographic feature for Arabic. We also tried various additional feature sets, including style-based feature sets. In most cases, these features did not improve the results, and in a few cases they even hurt them. The best result (61.54%) for the visual features was obtained by the LR method using all the features (SIFT, Color, and VGG); the best basic feature set is VGG. The best result for the combined features was achieved using model2 (miller18), with a weight of 0.75 for the best textual model and a weight of 0.25 for the NN classifier (Keras) using only the 1,000 VGG features.
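The weighted combination described at the end of the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the choice of "female" as the positive class, and the 0.5 decision threshold are assumptions made for illustration. Only the weights (0.75 for the textual model, 0.25 for the image model) come from the abstract.

```python
def fuse_probabilities(p_text: float, p_image: float,
                       w_text: float = 0.75, w_image: float = 0.25) -> float:
    """Weighted average of two models' probabilities for the same class.

    p_text and p_image are assumed to be each model's probability that the
    author belongs to the positive class (hypothetically, "female").
    """
    if abs(w_text + w_image - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return w_text * p_text + w_image * p_image


def predict_gender(p_text: float, p_image: float, threshold: float = 0.5) -> str:
    """Label an author from the fused probability (threshold is an assumption)."""
    fused = fuse_probabilities(p_text, p_image)
    return "female" if fused >= threshold else "male"


if __name__ == "__main__":
    # The textual model dominates: a confident text score outweighs a
    # disagreeing image score.
    print(predict_gender(p_text=0.8, p_image=0.2))
    print(predict_gender(p_text=0.4, p_image=0.7))
```

With these weights, the image model can only change the decision when the textual model's probability is already close to the threshold.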
Citation
HaCohen-Kerner, Y., Yigal, Y., Shayovitz, E., Miller, D., & Breckon, T. (2018, October). Author profiling: Gender prediction from Tweets and images: Notebook for PAN at CLEF 2018. Presented at CEUR Workshop Proceedings, Torino, Italy
Presentation Conference Type | Conference Paper (published) |
---|---|
Conference Name | CEUR Workshop Proceedings |
Start Date | Oct 22, 2018 |
Acceptance Date | Jan 1, 2018 |
Publication Date | Jan 1, 2018 |
Deposit Date | Feb 23, 2025 |
Publicly Available Date | Feb 27, 2025 |
Print ISSN | 1613-0073 |
Peer Reviewed | Peer Reviewed |
Volume | 2125 |
Public URL | https://durham-repository.worktribe.com/output/3536319 |
Publisher URL | https://ceur-ws.org/Vol-2125/ |
Files
Submitted Conference Paper
(570 Kb)
PDF