Authorship attribution in twitter: a comparative study of machine learning and deep learning approaches

Aouchiche, Rebeh Imane Ammar; Boumahdi, Fatima; Remmide, Mohamed Abdelkarim; Madani, Amina

doi:10.1007/s41870-024-01788-z

Original Research
Published: 17 March 2024

International Journal of Information Technology (2024)

Rebeh Imane Ammar Aouchiche¹,
Fatima Boumahdi¹,
Mohamed Abdelkarim Remmide¹ &
Amina Madani¹

Abstract

As social media platforms gain popularity and influence, content integrity and user accountability become more critical. Authorship attribution (AA) is a powerful tool for tackling these issues because it identifies the real author of online posts. This study proposes an AA approach that uses machine learning and deep learning algorithms to predict the author of unknown posts on social media platforms. It introduces Temporal Convolutional Networks (TCN) for short texts, investigates the effectiveness of combining Long Short-Term Memory (LSTM) and Convolutional Neural Networks (CNN), and explores the use of an Autoencoder combined with an AdaBoost classifier. The approach was tested on a Twitter dataset through multiple experiments across various scenarios, achieving 52.77% accuracy in AA.
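The abstract does not describe the TCN configuration used in the study. As general background only, the building block that TCNs are usually associated with is the causal dilated 1-D convolution, which the following minimal sketch illustrates in plain Python. The function names, kernel weights, and toy input are all assumptions made for illustration; none of them come from the article.

```python
from typing import List, Sequence


def causal_dilated_conv1d(
    x: Sequence[float], kernel: Sequence[float], dilation: int = 1
) -> List[float]:
    """Causal 1-D convolution (hypothetical helper, not from the article).

    output[t] depends only on x[t], x[t - d], x[t - 2d], ... where d is the
    dilation. Positions before the start of the sequence count as zeros,
    so the output has the same length as the input.
    """
    if dilation < 1:
        raise ValueError("dilation must be >= 1")
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation  # look back k * dilation steps, never forward
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Number of input positions one output can see after stacking the layers."""
    return 1 + (kernel_size - 1) * sum(dilations)


# Toy input standing in for a sequence of per-character feature values.
signal = [1.0, 2.0, 3.0, 4.0, 5.0]
layer1 = causal_dilated_conv1d(signal, kernel=[1.0, 1.0], dilation=1)
layer2 = causal_dilated_conv1d(layer1, kernel=[1.0, 1.0], dilation=2)

print(layer1)                       # [1.0, 3.0, 5.0, 7.0, 9.0]
print(layer2)                       # [1.0, 3.0, 6.0, 10.0, 14.0]
print(receptive_field(2, [1, 2]))   # 4
```

With all-ones kernels, each value in `layer2` is the sum of the current input and the three before it, which shows how stacking layers with growing dilation widens the context without looking at future positions. A real TCN would add learned weights, multiple channels, nonlinearities, and residual connections.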

Data availability

Not applicable.

Author information

Authors and Affiliations

LRDSI, Department of Computer Science, University of Blida 1, Blida, Algeria
Rebeh Imane Ammar Aouchiche, Fatima Boumahdi, Mohamed Abdelkarim Remmide & Amina Madani

Corresponding author

Correspondence to Rebeh Imane Ammar Aouchiche.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Human and animal participants

This article does not contain any studies with human participants or animals performed by any of the authors.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Aouchiche, R.I.A., Boumahdi, F., Remmide, M.A. et al. Authorship attribution in twitter: a comparative study of machine learning and deep learning approaches. Int. j. inf. tecnol. (2024). https://doi.org/10.1007/s41870-024-01788-z

Received: 12 November 2023
Accepted: 13 February 2024
Published: 17 March 2024
DOI: https://doi.org/10.1007/s41870-024-01788-z
