License: arXiv.org perpetual non-exclusive license
arXiv:2401.13782v1 [cs.DL] 24 Jan 2024

Tweets to Citations: Unveiling the Impact
of Social Media Influencers on AI Research Visibility

Iain Xie Weissburg
Department of Electrical and Computer Engineering
University of California, Santa Barbara
Santa Barbara, CA, 93106
ixw@ucsb.edu
Mehir Arora*
Department of Computer Science
University of California, Santa Barbara
Santa Barbara, CA, 93106
Liangming Pan
Department of Computer Science
University of California, Santa Barbara
Santa Barbara, CA, 93106
William Yang Wang
Department of Computer Science
University of California, Santa Barbara
Santa Barbara, CA, 93106
*Equal Contribution. Corresponding Author.
Abstract

As the number of accepted papers at AI and ML conferences reaches into the thousands, it has become unclear how researchers access and read research publications. In this paper, we investigate the role of social media influencers in enhancing the visibility of machine learning research, particularly the citation counts of papers they share. We have compiled a comprehensive dataset of over 8,000 papers, spanning tweets from December 2018 to October 2023, alongside 1:1 matched controls based on publication year, venue, and abstract topics. Our analysis reveals a significant increase in citations for papers endorsed by these influencers, with median citation counts 2-3 times higher than those of the control group. Additionally, the study delves into the geographic, gender, and institutional diversity of highlighted authors. These findings highlight the expanding influence of social media in scholarly communication and underscore the importance of an evolving ecosystem in today’s digital academic landscape.

1 Introduction

Figure 1: The number of papers accepted to top AI/ML conferences (solid) and shared by influencers (dashed) from 2014-2023 Li (2023).

In the evolving landscape of artificial intelligence and machine learning (AI/ML), the exponential increase in conference papers, as depicted in Figure 1, alongside rapid technological advancements, has significantly transformed the dissemination of scholarly knowledge. A notable aspect of this transformation is the practice of online preprint sharing, with platforms like ArXiv becoming particularly prominent within the AI/ML community. This phenomenon allows for early access to research, often months before official publication, raising questions about the evolving relevance and reception of these papers in traditional academic forums. Most notably, how do people select papers to read in the online age?

This paper aims to explore the changing dynamics of academic discourse within the AI/ML community, emphasizing the constructive role of social media in addressing the challenges posed by the sheer volume of literature. We specifically focus on the case study of two influential 𝕏 (formerly Twitter) users, AK (@_akhaliq) and Aran Komatsuzaki (@arankomatsuzaki), to understand how their social media activities aid in the curation and visibility of research. These influencers have emerged as pivotal figures in navigating the flood of information, akin to journalists in civic society, highlighting and contextualizing significant works for the community.

Acknowledging the vital function of these influencers as curators, we examine the impact of their endorsements on the citation counts of shared papers. However, we also underscore the importance of maintaining a balanced research ecosystem. An over-reliance on a select group of curators may inadvertently skew the research landscape, emphasizing certain topics or perspectives over others. Therefore, we advocate for a responsible approach to curation, encouraging influencers to maintain a journalistic integrity that includes showcasing diverse research topics, authors, and institutions.

Our study addresses the challenge of comprehending paper consumption in today’s fast-paced academic environment and seeks to unravel the factors influencing the recognition and impact of academic works amid an abundance of research and evolving norms in paper selection and review. We delve into the dual role of AK and Komatsuzaki in academic communication, observing their activity on platforms like 𝕏 and Hugging Face. Our primary aim is to determine whether papers endorsed by these influencers receive statistically higher citations compared to non-endorsed works, thereby shedding light on their impact on paper visibility and potential biases in the field.

Contributions.

To this end, we provide the following three main contributions to the discussion about the increasing impact of academic social media figures in the AI/ML domain:

• Comprehensive Dataset with Control Samples. We collect a dataset of shared papers from AK and Komatsuzaki containing key paper information. We select control samples through precise matching, using publication details and text embeddings as markers of paper quality. We verify that our methodology successfully controls for paper quality, evidenced by similarly distributed conference review scores. From this, we can rule out the common assumption that influencers share "higher-quality" papers, which naturally would generate more citations.

• Thorough Analysis of Citations and Demographics. Our comprehensive analysis aims to determine if papers shared by these influencers receive higher citation counts compared to non-shared papers, offering insights into the impact of social media on academic research dissemination. We recognize the beneficial role of influencers in curating research amidst a deluge of publications, yet we highlight the need for balanced and diverse curation practices.

• Proposals for Future AI/ML Information Sharing. Finally, we propose that the academic community, particularly conference organizers, engage in a future discussion on evolving the conference system. This evolution is necessary to address the core challenge of managing an overwhelming number of submissions, ensuring that quality research is recognized and disseminated effectively. This approach will help ensure a more balanced and comprehensive understanding of research quality and relevance in these changing times.

2 Related Work

For over a decade now, social media platforms, 𝕏 (then Twitter) in particular, have been studied as an influential means of scholarly communication. Darling et al. (2013) discuss the potential role of Twitter in many stages of the ’life-cycle’ of academic authorship and publication. Eysenbach (2011) provides early evidence to support social media sharing as a predictor of higher citation counts in medical publications. More recently, studies have shown a statistically significant relationship between Twitter presence and citation counts across fields, especially after the first tweet Peoples et al. (2016); Vaghjiani et al. (2021). Others, using randomized trials, suggest that the correlation between tweets and citations is insignificant Tonia et al. (2016); Branch et al. (2023).

Thus far, the literature has focused on social media as a forum for multi-user online discussion. In our work, we examine top-down dissemination from singular influencers, with many more followers than previously studied. Additionally, we focus on the AI/ML space, while many previous works examine fields with a much lower volume of repository publications (biology, medicine, etc.). We also contribute a geographic and gender analysis of influencer-sharing patterns and provide recommendations to conferences, influencers, and the wider community on managing the evolving field of Artificial Intelligence research.

3 Data Collection

                              AK       Komatsuzaki
Unique Papers                 9,171    1,273
Papers with all attributes    8,267    1,194
Matched Papers                7,462    1,152
Table 1: The number of unique papers tweeted by each influencer, the subset of papers with desired attributes available through the Semantic Scholar API (title, abstract, year of publication, venue of publication, and citation count), and the number of papers finally included in our analysis after matching.
AK                        Komatsuzaki
Name            Freq.     Name             Freq.
S. Levine       85        L. Zettlemoyer   31
Furu Wei        82        Quoc V. Le       28
Jianfeng Gao    71        Yi Tay           26
L. Zettlemoyer  64        Furu Wei         23
Ziwei Liu       62        M. Dehghani      18
Table 2: The top 5 most common authors shared by each user and the number of papers where they are credited.

We model our analysis on retrospective cohort studies, in which a treatment and control group with identical underlying covariates are compared to determine the average treatment effect. In our case, we assume that a paper’s citation count is most strongly influenced by elapsed time, quality, and topic. While elapsed time is simple to measure, paper quality and topic are difficult to quantify. We use the publication venue and year as a proxy for quality and use a text embedding of the paper’s title and abstract to approximate the topic.
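As a concrete illustration of the topic covariate, the sketch below computes a normalized title-plus-abstract embedding; the specific encoder (sentence-transformers' all-MiniLM-L6-v2) is an illustrative assumption, not necessarily the exact model used.

from sentence_transformers import SentenceTransformer
import numpy as np

# Hypothetical encoder choice; any sentence encoder producing fixed-length
# vectors would fit the method described above.
model = SentenceTransformer("all-MiniLM-L6-v2")

def topic_embedding(title, abstract):
    # Embed the concatenated title and abstract, then L2-normalize so that
    # Euclidean distance between embeddings is monotone in cosine similarity.
    vec = model.encode(title + " " + abstract)
    return vec / np.linalg.norm(vec)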

As such, our data collection process consists of three parts: (1) collecting the Target Set, the papers tweeted by @_akhaliq and @arankomatsuzaki, (2) collecting a large dataset of potential papers to match against, and (3) forming the Control Set by matching papers from (1) to papers from (2) with respect to the year of publication, the publication venue, and a text embedding of title and abstract. We detail each step below.

Target Set. The Target Set is collected by searching for document identifiers in the 𝕏 feeds of both influencers (identifiers are found in links to arXiv.org or huggingface.co/papers). We use the Semantic Scholar API to query every document’s title, abstract, year of publication, venue of publication, and citation count. We also collect the authors of each paper but do not use this information in our matching. Notably, some papers are either not available on Semantic Scholar or are not freely accessible. Hence, we remove any paper that lacks a required attribute. The number of unique papers and papers with all attributes for each influencer is shown in Table 1.
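A minimal sketch of this collection step, assuming the public Semantic Scholar Graph API; the regex and error handling are illustrative, not the exact pipeline used.

import re
import requests

# arXiv IDs appear in links to arxiv.org; huggingface.co/papers links carry
# the same IDs. This regex covers only the modern YYMM.NNNNN ID format.
ARXIV_ID = re.compile(
    r"(?:arxiv\.org/(?:abs|pdf)|huggingface\.co/papers)/(\d{4}\.\d{4,5})"
)
FIELDS = "title,abstract,year,venue,citationCount,authors"
API = "https://api.semanticscholar.org/graph/v1/paper/arXiv:{}"

def papers_from_tweets(tweets):
    records = []
    for text in tweets:
        for arxiv_id in set(ARXIV_ID.findall(text)):
            resp = requests.get(API.format(arxiv_id), params={"fields": FIELDS})
            if resp.status_code != 200:
                continue  # not indexed on Semantic Scholar or not accessible
            paper = resp.json()
            # Drop papers lacking any required attribute, as in Section 3.
            required = ("title", "abstract", "year", "venue", "citationCount")
            if all(paper.get(k) is not None for k in required):
                records.append(paper)
    return records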

We find that roughly 90% of papers tweeted by either influencer have all attributes available. We express moderate caution about the minority of papers not included in our analysis, as they may be systematically different from papers we do include. However, we believe the large number of papers remaining in our analysis is sufficient to draw meaningful conclusions.

Figure 2: Mean OpenReview scores of tweeted papers vs. non-tweeted controls from 6 major ML conferences, the kernel density estimate of the joint distribution of scores, and the line y = x. This shows that the quality of papers in both sets is roughly equivalent.

Control Set. To build our Control Set, we first collect a large dataset of papers presented at the same venues and in the same years as those in the Target Set. Specifically, for every instance of a paper published in year y𝑦yitalic_y at venue v𝑣vitalic_v, we query the Semantic Scholar API for all papers published in year y𝑦yitalic_y at venue v𝑣vitalic_v. We sample as many papers as the Semantic Scholar Database has available, yielding 247,993 unique papers, and 124,940 papers with all required attributes. This dataset forms the corpus from which we match papers to our target set.

The choice of matching algorithm has been the subject of fierce debate; we follow the recommendations in King and Nielsen (2019) by performing exact matching on our categorical variables (publication venue and year) and Euclidean distance matching on our continuous variable (topic embedding). Note that for L2-normalized vectors, minimizing Euclidean distance is equivalent to maximizing cosine similarity. Optimal matching can be reduced to the linear sum assignment problem, for which many efficient algorithms exist Crouse (2016). We use the implementation available in SciPy Virtanen et al. (2020). Alternate matching algorithms did not significantly change our results.

Using this methodology, we match each paper in the Target Set to a paper in the Control Set with respect to the gathered covariates. We exclude any paper for which we cannot find an exact match on publication venue or year. For matching on text embeddings, we set a cutoff of 0.6 for cosine similarity, which ensures close topic similarity in matched pairs while retaining 91% of AK’s tweeted papers and 96% of Komatsuzaki’s tweeted papers.
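A minimal sketch of the within-stratum matching, assuming L2-normalized embeddings already grouped by (venue, year); variable names and data layout are illustrative.

import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

def match_stratum(target_emb, control_emb, min_cos_sim=0.6):
    # Within one (venue, year) stratum: squared Euclidean distance between
    # L2-normalized vectors equals 2 - 2 * cosine similarity, so minimizing
    # total distance maximizes total cosine similarity.
    cost = cdist(target_emb, control_emb, metric="sqeuclidean")
    rows, cols = linear_sum_assignment(cost)  # optimal one-to-one assignment
    pairs = []
    for i, j in zip(rows, cols):
        cos_sim = 1.0 - cost[i, j] / 2.0
        if cos_sim >= min_cos_sim:  # apply the 0.6 similarity cutoff
            pairs.append((i, j, cos_sim))
    return pairs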

Qualitatively, the matched pairs are very similar in topic, almost always covering the same subfield of research (for example, diffusion models applied to image generation), approaching the same problems, and using similar or the same methods. We note that the modal cosine similarity is 0.7, where matches are considerably closer than at the cutoff value of 0.6, but we show matched pairs across a range of cosine similarities to demonstrate that our method is successful at all levels.

As before, we express caution about dropping papers from the Target Set, since this may introduce bias. However, we find similar results when we do not drop any papers from the Target Set. For more details on dropping papers, alternate matching algorithms, and example matched pairs, see Appendix A.

Review Scores. To ensure that our matching has successfully controlled for quality, we examine the review scores of experimental and control pairs from selected conferences. We extract the review data from ICLR, AAAI, ICML, NeurIPS, EMNLP, and KDD using OpenReview (2023). Across both paired datasets, we find 991 out of 8,525 unique pairs with available scores.

We plot the mean review score of each treatment paper against the mean score of its paired control (Figure 2). For conferences that do not use numbered review scores, we assign numbers based on those of other conferences (7: Accept, 5: Borderline Accept, 3: Borderline Reject, 2: Reject). Using the same three significance tests from Table 3, we do not find sufficient evidence to reject the null hypothesis that the control and experimental scores are from the same distribution (p-value > 0.5). Assuming that mean review scores are an accurate measure of paper quality, we can conclude that we have effectively controlled for paper quality in our datasets.

Figure 3: Plots showing the distribution of citations in the two experimental datasets and matched control samples: (a) @_akhaliq dataset histogram, (b) @arankomatsuzaki dataset histogram, (c) @_akhaliq dataset violin plots, (d) @arankomatsuzaki dataset violin plots. Citation counts are scaled with the natural logarithm using numpy.log1p. Both comparisons show that papers shared by influencers have attained significantly higher citations for all three quartiles than those in the control sets.

4 Analysis

To answer the central question of our work, we compare the citation impact of papers shared by AK and Komatsuzaki against our control sets. We then analyze the geographic distributions and author demographics of their selected papers.

Contrasting Analysis. For the contrasting analysis, we test the following hypotheses:

  • Null: Influencer-shared papers have the same citation count as others in the same field.

  • Alternative: Influencer-shared papers have a higher citation count than others in the same field.

Figure 4: 2-sample Q-Q plots comparing the experimental and control set distributions across every quantile for (a) the @_akhaliq dataset and (b) the @arankomatsuzaki dataset. To build the plot, citation counts are log-scaled, normalized to the control distribution (z-scores), sorted, and paired in order. The dotted line shows an equal distribution; points above the line indicate a higher experimental quantile, and vice versa. The plots show that both experimental distributions are consistently higher, especially closer to the median.
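The construction in Figure 4 is straightforward to reproduce; a minimal sketch follows, with plotting details as illustrative assumptions.

import numpy as np
import matplotlib.pyplot as plt

def two_sample_qq(treated, control):
    # Log-scale both samples, z-score against the control distribution,
    # sort, and pair order statistics (equal sample sizes after matching).
    t = np.sort(np.log1p(np.asarray(treated, dtype=float)))
    c = np.sort(np.log1p(np.asarray(control, dtype=float)))
    mu, sigma = c.mean(), c.std()
    t_z, c_z = (t - mu) / sigma, (c - mu) / sigma
    plt.scatter(c_z, t_z, s=8)
    lims = [min(c_z[0], t_z[0]), max(c_z[-1], t_z[-1])]
    plt.plot(lims, lims, linestyle="dotted")  # y = x: equal distributions
    plt.xlabel("control quantiles (z-score)")
    plt.ylabel("experimental quantiles (z-score)")
    plt.show()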

We compare our paired target and control sets, as described in Section 3. We find that papers tweeted by AK have a median citation count of 24 (95% CI: 23, 25) versus 14 (95% CI: 13, 15) in the control group, and papers tweeted by Komatsuzaki have a median citation count of 31 (95% CI: 27, 34) versus 12 (95% CI: 10.5, 13.5). Visually, both experimental set distributions are skewed toward higher citation counts compared to their corresponding control sets (Figure 3). In the violin plots (Figures 3(c) and 3(d)), the three quartiles and max values are all higher in both of the shared paper distributions than in the controls. In the 2-sample Q-Q plots (Figure 4), the normalized quantiles are consistently higher for the test distributions.
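One standard way to obtain such confidence intervals is a percentile bootstrap over the median; the sketch below illustrates that approach under this assumption, not necessarily the exact procedure used.

import numpy as np

def median_ci(citations, n_boot=10_000, alpha=0.05, seed=0):
    # Resample with replacement, take the median of each resample, and
    # report the percentile interval of the bootstrap medians.
    rng = np.random.default_rng(seed)
    x = np.asarray(citations)
    boot_medians = np.median(
        rng.choice(x, size=(n_boot, len(x)), replace=True), axis=1
    )
    lo, hi = np.quantile(boot_medians, [alpha / 2, 1 - alpha / 2])
    return np.median(x), (lo, hi)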

The Q-Q results are further strengthened by the Cliff’s Delta values for each pairwise sample (Table 3). Using the effect size thresholds proposed by Hess and Kromrey (2004), the Cliff’s Delta shows a large effect for Komatsuzaki’s sharing and a medium effect for AK’s. This suggests that influencer sharing has a practically significant effect on the outcome variable, the citation count of the papers.
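Cliff’s Delta is simple to compute directly: the probability that a shared paper’s citation count exceeds a control’s, minus the reverse. A minimal sketch:

import numpy as np

def cliffs_delta(treated, control):
    # P(treated > control) - P(treated < control), estimated over all pairs
    # via broadcasting; O(n*m) memory, which is manageable at this scale.
    t = np.asarray(treated)[:, None]
    c = np.asarray(control)[None, :]
    return (t > c).mean() - (t < c).mean()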

Finally, we establish statistical significance with three tests comparing the distributions of the experimental data with those of the control sets: Epps-Singleton (ES), Kolmogorov-Smirnov (KS), and Mann-Whitney U (MWU). None of these tests assumes a normal distribution, which is essential for our data (especially the controls). Table 3 shows the results, all with p-values well below even a stringent α = 0.001. From this, we can strongly reject the null hypothesis that the citation distributions for the influencer-shared papers and the control papers are the same.
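All three tests are available in SciPy; a minimal sketch follows. The one-sided "greater" alternative for Mann-Whitney U matches our alternative hypothesis, but the exact sidedness used is an assumption here.

from scipy import stats

def distribution_tests(treated, control):
    # Two-sample tests of distributional difference; none assumes normality.
    return {
        "Epps-Singleton": stats.epps_singleton_2samp(treated, control).pvalue,
        "Kolmogorov-Smirnov": stats.ks_2samp(treated, control).pvalue,
        "Mann-Whitney U": stats.mannwhitneyu(
            treated, control, alternative="greater"
        ).pvalue,
    }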

                       AK           Komatsuzaki
Epps-Singleton         p < 0.0001   p < 0.0001
Kolmogorov-Smirnov     p < 0.0001   p < 0.0001
Mann-Whitney U         p < 0.0001   p < 0.0001
Cliff’s Delta          0.227        0.333
Table 3: Two-sample statistical measures and significance tests for the difference in distributions of the two experiment and control datasets. These tests suggest a statistically significant difference between the experimental and control distributions for both influencer datasets.

Overall, the correlation of citation count with influencer tweets, but not with review scores, points to a shift in how the community finds and reads papers. While top conference acceptance (i.e., review score) has traditionally been a primary indicator of future citation count Lee (2018), we have shown that the sharing practices of far-reaching influencers are now a significant indicator of future research impact through citations.

Geographic Distributions. In exploring the evolving landscape of machine learning (ML) paper dissemination, it is essential to consider the implications of a more centralized curation model, particularly as it relates to geographic and gender diversity in scholarly works. Our approach is to present data and observations that highlight trends and do not attribute intentional bias to the influencers involved.

Figure 5: Geographic heatmaps of the unique affiliations of authors of (a) AK-shared and (b) Komatsuzaki-shared papers. High density (red) on the map represents a large number of unique institutions, not a large number of papers, citations, or authors from the given area. Both influencers shared from institutions around the world, with especially large hotspots in the US and Europe.

Our analysis begins by examining the geographic distribution in the dissemination of ML papers. Given the American affiliations of AK and Aran Komatsuzaki, we explore whether this translates into a geographic skew in the papers they share. To contextualize our findings, we refer to the geographic distribution of AI repository publications from the Stanford HAI 2023 AI Index Report (Figure 6). We view this data in particular because our selected influencers share papers from repositories (i.e., arXiv).

To test this, we collect the geographic data of the shared papers. First, we use Semantic Scholar and dblp (2023) to collect the affiliation data of all listed authors from each test set. We then use the Nominatim geocoding API to find the approximate latitude and longitude of each affiliation. However, many of the reported affiliations are not suitable for the API and produce incorrect geocodes. To combat this, we manually adjust visibly inaccurate coordinates using publicly available addresses online. From this information, we perform a reverse geocode with Nominatim to find the country of each affiliation, then use majority voting to assign each publication a country. At this point, we can see in Figure 5 that both influencers share papers from around the world. Finally, we aggregate these countries into the same geographic areas used in the HAI Report and plot using a similar format (Figure 7).
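A minimal sketch of this pipeline, assuming geopy’s Nominatim wrapper; the user agent string is hypothetical, and the manual correction step described above is omitted.

from collections import Counter
from geopy.geocoders import Nominatim

geolocator = Nominatim(user_agent="affiliation-mapper")  # hypothetical agent

def paper_country(affiliations):
    # Geocode each author affiliation, reverse-geocode to a country, and
    # assign the paper a country by majority vote over its authors.
    countries = []
    for aff in affiliations:
        loc = geolocator.geocode(aff)
        if loc is None:
            continue  # unresolved affiliations feed the "Unknown" category
        rev = geolocator.reverse((loc.latitude, loc.longitude), language="en")
        country = rev.raw.get("address", {}).get("country") if rev else None
        if country:
            countries.append(country)
    return Counter(countries).most_common(1)[0][0] if countries else "Unknown"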

To account for the discrepancy between the date ranges in the HAI Report and our influencers’ activity, we limit our analysis to the overlap between them. We further restrict the range to 2018-2021 because of the low sample size of shared papers pre-2018 (only 1 for AK and 4 for Komatsuzaki).

During these years, Figure 6 indicates a slight decline in the United States’ share of AI repository publications following its peak. This decrease may suggest a maturation of the AI field, where research is becoming more globally dispersed. Concurrently, the European Union and United Kingdom demonstrate a modest uptick in publications after a consistent decline from 2010-2017, while China’s share continues to rise.

In contrast, the sharing patterns of the influencers from 2018 to 2021, shown in Figure 7, demonstrate a notable deviation from these global trends. Specifically, AK’s shared publications exhibit a sharp decline in the initially overwhelming "Unknown" category, with a concurrent and dramatic rise in the United States’ share. This seems to indicate better affiliation reporting rather than a change in AK’s sharing practices, as the shares from other areas stay relatively constant. Komatsuzaki’s data exhibits a consistent focus on U.S.-affiliated papers, with the notable absence of some geographic regions until later years.

While the global landscape of AI publications suggests an increase in diversity and a more even distribution of research output, our data presents a skewed alignment, favoring the United States. We must note, though, that using solely self-reported affiliations can have an inherent bias toward the United States. For example, many researchers affiliated with multinational organizations are assigned to the United States (where headquarters are located) despite working with a branch in another area. Additionally, we must note the prominence of the "Unknown" category in the data of both influencers, where affiliations were not found.

Nevertheless, our results highlight the potential for centralized individuals to shape the perceived narrative of AI research prominence.

Figure 6: AI repository publications by geographic area and publication year, 2010-2021. Data extracted from Figure 1.1.19 of the Stanford HAI 2023 AI Index Report Maslej et al. (2023).

Figure 7: Influencer-shared publications by geographic area and publication year, 2018-2023, from Semantic Scholar and DBLP affiliation data: (a) @_akhaliq-shared papers and (b) @arankomatsuzaki-shared papers.

Gender Distributions. Beyond geographic diversity, gender diversity is crucial in Computer Science and Engineering, fields historically dominated by men. We extract author names and affiliations as described above, filtering to only the first author of each paper. For gender classification, we use the AMiner Scholar Gender Prediction API, which categorizes authors as "male," "female," or "UNKNOWN" based on name and, if available, affiliation. The API uses a majority vote of the results from three sources: Google Image Search with facial recognition, the Facebook Generated Names List Tang et al. (2011), and WebGP Gu et al. (2016).

To ground our view of the overall gender distribution in the field, we reference the Taulbee survey’s reported gender distribution of US Ph.D. awardees and faculty in CS and related fields from 2021-2022 Zweben and Bizot (2023). To match the classifications we have available, we will consider the binary reported genders from the survey.

Our analysis reveals an 80:20 male-to-female ratio among authors with identifiable genders in the @_akhaliq dataset and an 81:19 ratio in the @arankomatsuzaki dataset. These ratios align somewhat with the Taulbee survey’s reported 77:23 ratio among computing Ph.D. awardees and deviate slightly more from the 76:24 ratio among faculty. These deviations may stem from a trend toward increasing female representation in the ML space; the survey is recent data, while our influencer data spans several years into the past.

5 Discussion

Our analysis suggests that ML influencers strongly affect paper visibility, indicating a change in how ideas are propagated through the community. We discuss the downstream implications of influencers on the community and make recommendations on how to help improve the paper curation problem and enhance equity in publication visibility.

Influencers and the ML Community. Influencers serve as pivotal curators in the ML landscape. With the explosive growth of machine learning research, the community increasingly relies on social media to keep up-to-date with new developments. This reliance has naturally led to the rise of centralized influencers. Their role in streamlining the dissemination process is akin to that of journalists in news media, making novel ideas and breakthroughs more accessible. Just as journalists are indispensable to civic society, influencers enable the ML community to stay informed on the bleeding edge.

However, we urge caution against excessive dependence on a limited set of information sources. Research inherently involves bottom-up exploration of a wide array of topics to uncover groundbreaking ideas. Focusing on a handful of individuals’ highlighted papers necessarily offers a narrow view of the research landscape. We encourage the community to keep the online academic space competitive, allowing for a diversity of highly visible ideas. This goal can be achieved through active participation in an open, community-driven curation process, enhancing the variety of prominent ideas.

Additionally, just as we expect journalists to maintain integrity by highlighting both prominent and underreported stories, influencers have a responsibility to share a diverse array of ideas. This includes showcasing various techniques and subtopics within their areas of focus, thereby exposing community members to new approaches and concepts.

Enhancing Equity. Unquestionably, we would rather have more voices than fewer in the machine learning community. The trend towards influencer-led dissemination in the ML community offers new avenues to enhance equity. Our geographic analysis indicates a bias towards U.S.-centered research in influencer-shared papers. While this reflects the current U.S. prominence in AI/ML repository publications, it also reveals the potential to include research from a broader range of regions, including Europe, Asia, Africa, and Latin America. Similarly, our gender distribution analysis, though not indicating a bias in influencer-shared content, highlights the dramatic disparity between male and female authors within ML. This disparity presents an opportunity to foster gender diversity in the field.

While we specifically examine geography and gender, these principles apply broadly: proactive efforts in the online domain can significantly enhance equity. We encourage influencers to actively promote visibility for diverse groups globally.

The geographical and social background of an influencer could inadvertently influence the diversity of shared papers. We emphasize that this does not necessarily indicate bias among current influencers but serves as a call for vigilance against potential biases as the AI/ML domain expands. This connects back to our recommendation for fostering a diverse online space where everyone can contribute to broadening the spectrum of shared ideas.

6 Conclusion

In this study, we have explored the evolving landscape of academic discourse in AI/ML, focusing on the role of social media influencers in the dissemination and recognition of scholarly works. Our investigation into the activities of AK and Aran Komatsuzaki on 𝕏 and Hugging Face has provided crucial insights into how influencers are shaping the visibility and impact of research papers in this rapidly advancing field.

Influencers as Catalysts for Visibility: Our comprehensive analysis reveals that papers shared by AK and Komatsuzaki receive statistically higher citation counts than non-endorsed works, confirming the significant role these influencers play in amplifying the reach of specific research. This influence is not merely a reflection of sharing ’higher-quality’ papers, as our quality-matched controls demonstrate, but a testament to their ability to highlight and contextualize substantial findings for the community.

Influencers as Curators: Influencers serve a key role in the AI/ML space by making novel ideas and breakthroughs more accessible. However, it is crucial to balance their influence with a commitment to diversity. Reliance on a select few for information could result in a limited perspective on the vast research landscape. Therefore, we advocate for a diverse, competitive online academic space where a wide array of ideas and research can gain visibility.

Implications for Academic Practices: Our findings have significant implications for how the AI/ML academic community approaches the dissemination and reception of research. The rising influence of social media figures in academic communication necessitates a reevaluation of traditional modes of paper selection and review. We propose that conference organizers and academic institutions engage in a dialogue to evolve the conference system and peer-review processes, ensuring they adapt to the changing norms and continue to recognize and disseminate quality research effectively.

Future Directions: This study opens several avenues for future research. One potential area is to investigate the extent to which the trends observed in AI/ML apply to other scientific domains. Additionally, further exploration into the mechanisms behind the influence of social media on academic recognition, such as the role of algorithms and network effects, could provide deeper insights into managing the balance between traditional and modern dissemination channels.

In conclusion, our research highlights the growing interplay between social media and academic research dissemination in the AI/ML community. The influencer phenomenon, while beneficial in curating and highlighting significant research, brings with it challenges that necessitate careful consideration and action from the academic community. By fostering responsible curation practices and evolving traditional academic systems, we can ensure a balanced and inclusive research ecosystem that recognizes and promotes a wide array of scholarly contributions.

References

  • Branch et al. [2023] Trevor A. Branch, Isabelle M. Côté, Solomon R. David, Josh Drew, Michelle LaRue, Melissa C. Márquez, E. Chris M. Parsons, D. Rabaiotti, David Shiffman, David A. Steen, and Alexander L. Wild. Controlled experiment finds no detectable citation bump from Twitter promotion. bioRxiv, 2023. URL https://api.semanticscholar.org/CorpusID:262087567.
  • Crouse [2016] David F. Crouse. On implementing 2D rectangular assignment algorithms. IEEE Transactions on Aerospace and Electronic Systems, 52(4):1679–1696, 2016. doi: 10.1109/TAES.2016.140952.
  • Darling et al. [2013] Emily S. Darling, David Samuel Shiffman, Isabelle M. Côté, and Joshua A. Drew. The role of twitter in the life cycle of a scientific publication. ArXiv, abs/1305.0435, 2013. URL https://api.semanticscholar.org/CorpusID:7583994.
  • dblp [2023] dblp. dblp Computer Science Bibliography: Monthly Snapshot Release of October 2023. https://dblp.org/xml/release/dblp-2023-10-01.xml.gz, 2023. Accessed: 2023-10-01.
  • Eysenbach [2011] Gunther Eysenbach. Can Tweets Predict Citations? Metrics of Social Impact Based on Twitter and Correlation with Traditional Metrics of Scientific Impact. Journal of Medical Internet Research, 13, 2011. URL https://api.semanticscholar.org/CorpusID:2157129.
  • Gu et al. [2016] Xiaotao Gu, Hong Yang, Jie Tang, and Jing Zhang. Web User Profiling Using Data Redundancy. In Proceedings of the 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM ’16, page 358–365. IEEE Press, 2016. ISBN 9781509028467.
  • Hess and Kromrey [2004] M. R. Hess and J. D. Kromrey. Robust Confidence Intervals for Effect Sizes: A Comparative Study of Cohen’s d and Cliff’s Delta under Non-normality and Heterogeneous Variances. In Annual Meeting of the American Educational Research Association, volume 1. Citeseer, 2004.
  • King and Nielsen [2019] Gary King and Richard Nielsen. Why Propensity Scores Should Not Be Used for Matching. Political Analysis, 27(4):435–454, 2019. URL https://doi.org/10.1017/pan.2019.11.
  • Lee [2018] Danielle H. Lee. Predictive power of conference-related factors on citation rates of conference papers. Scientometrics, 118:281–304, 2018. URL https://api.semanticscholar.org/CorpusID:53247921.
  • Li [2023] Xin Li. Conference-Acceptance-Rate, 2023. URL https://github.com/lixin4ever/Conference-Acceptance-Rate. GitHub repository. Accessed: 2023-12-18.
  • Maslej et al. [2023] Nestor Maslej, Loredana Fattorini, Erik Brynjolfsson, John Etchemendy, Katrina Ligett, Terah Lyons, James Manyika, Helen Ngo, Juan Carlos Niebles, Vanessa Parli, Yoav Shoham, Russell Wald, Jack Clark, and Raymond Perrault. Artificial Intelligence Index Report 2023, 2023.
  • OpenReview [2023] OpenReview. OpenReview API v1. https://api.openreview.net, 2023. Accessed: 2023-10-26.
  • Peoples et al. [2016] Brandon K. Peoples, Stephen R. Midway, Dana K. Sackett, Abigail J. Lynch, and Patrick B. Cooney. Twitter Predicts Citation Rates of Ecological Research. PLoS ONE, 11, 2016. URL https://api.semanticscholar.org/CorpusID:13689703.
  • Tang et al. [2011] C. Tang, K. Ross, N. Saxena, and R. Chen. What’s in a Name: A Study of Names, Gender Inference, and Gender Behavior in Facebook. In J. Xu, G. Yu, S. Zhou, and R. Unland, editors, Database Systems for Advanced Applications, volume 6637 of Lecture Notes in Computer Science, Berlin, Heidelberg, 2011. Springer. doi: 10.1007/978-3-642-20244-5_33.
  • Tonia et al. [2016] Thomy Tonia, Herman Van Oyen, Anke Berger, Christian Schindler, and Nino Künzli. If I tweet will you cite? The effect of social media exposure of articles on downloads and citations. International Journal of Public Health, 61:513–520, 2016. URL https://api.semanticscholar.org/CorpusID:27919192.
  • Vaghjiani et al. [2021] Nilan G Vaghjiani, Vatsal Lal, Nima Vahidi, Ali Ebadi, Matthew Carli, Adam P. Sima, and Daniel H Coelho. Social Media and Academic Impact: Do Early Tweets Correlate With Future Citations? Ear, nose, & throat journal, page 1455613211042113, 2021. URL https://api.semanticscholar.org/CorpusID:237292881.
  • Virtanen et al. [2020] Pauli Virtanen, Ralf Gommers, Travis E. Oliphant, Matt Haberland, Tyler Reddy, David Cournapeau, Evgeni Burovski, Pearu Peterson, Warren Weckesser, Jonathan Bright, Stéfan J. van der Walt, Matthew Brett, Joshua Wilson, K. Jarrod Millman, Nikolay Mayorov, Andrew R. J. Nelson, Eric Jones, Robert Kern, Eric Larson, C J Carey, İlhan Polat, Yu Feng, Eric W. Moore, Jake VanderPlas, Denis Laxalde, Josef Perktold, Robert Cimrman, Ian Henriksen, E. A. Quintero, Charles R. Harris, Anne M. Archibald, Antônio H. Ribeiro, Fabian Pedregosa, Paul van Mulbregt, and SciPy 1.0 Contributors. SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python. Nature Methods, 17:261–272, 2020. doi: 10.1038/s41592-019-0686-2.
  • Zweben and Bizot [2023] Stuart Zweben and Betsy Bizot. CRA 2022 Taulbee Survey: Record Doctoral Degree Production; More Increases in Undergrad Enrollment Despite Increased Degree Production. Computing Research News, May 2023.

Appendix A Target-Control Matching

Score: 0.60

Target Title: Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs

Target Abstract: Do language models have beliefs about the world? Dennett (1995) famously argues that even thermostats have beliefs, on the view that a bel…

Control Title: An Interpretability Illusion for BERT

Control Abstract: We describe an "interpretability illusion" that arises when analyzing the BERT model. Activations of individual neurons in the network…

Score: 0.68

Target Title: I2D2: Inductive Knowledge Distillation with NeuroLogic and Self-Imitation

Target Abstract: Commonsense capabilities of pre-trained language models dramatically improve with scale, leading many to believe that scale is the…

Control Title: Do language models have coherent mental models of everyday things?

Control Abstract: When people think of everyday things like an egg, they typically have a mental image associated with it. This allows them to correctly judge…

Score: 0.77

Target Title: MViTv2: Improved Multiscale Vision Transformers for Classification and Detection

Target Abstract: In this paper, we study Multiscale Vision Transformers (MViTv2) as a unified architecture for image and video classification, as…

Control Title: Object-Region Video Transformers

Control Abstract: Recently, video transformers have shown great success in video understanding, exceeding CNN performance; yet existing video trans…

Score: 0.78

Target Title: 3D Clothed Human Reconstruction in the Wild

Target Abstract: Although much progress has been made in 3D clothed human reconstruction, most of the existing methods fail to produce robust results…

Control Title: UNIF: United Neural Implicit Functions for Clothed Human Reconstruction and Animation

Control Abstract: We propose united implicit functions (UNIF), a part-based method for clothed human reconstruction and animation with raw scans and…

Score: 0.80

Target Title: Open-Source Large Language Models Outperform Crowd Workers and Approach ChatGPT in Text-Annotation Tasks

Target Abstract: This study examines the performance of open-source Large Language Models (LLMs) in text annotation tasks and compares it with propr…

Control Title: Can ChatGPT Reproduce Human-Generated Labels? A Study of Social Computing Tasks

Control Abstract: The release of ChatGPT has uncovered a range of possibilities whereby large language models (LLMs) can substitute human intelligence…

Score: 0.85

Target Title: Leveraging Knowledge in Multilingual Commonsense Reasoning

Target Abstract: Commonsense reasoning (CSR) requires models to be equipped with general world knowledge. While CSR is a language-agnostic process…

Control Title: It’s All in the Heads: Using Attention Heads as a Baseline for Cross-Lingual Transfer in Commonsense Reasoning

Control Abstract: Commonsense reasoning is one of the key problems in natural language processing, but the relative scarcity of labeled data holds back the prog…

Table 4: Randomly sampled examples, matched pairs of papers tweeted by AK. Controls show similar topics at all levels of similarity. At high levels of similarity, papers tackle the same problems, use the same methods, and even have similar titles. Score refers to cosine similarity.

Appendix B Additional Analysis Plots

Figure 8: Additional plots showing the distribution of citations in the two experimental datasets and matched control samples: (a) @_akhaliq dataset cumulative distribution functions, (b) @arankomatsuzaki dataset cumulative distribution functions, (c) @_akhaliq dataset box plots, (d) @arankomatsuzaki dataset box plots. Citation counts are scaled with the natural logarithm using numpy.log1p.

Figure 9: Histogram and kernel density estimate of the OpenReview score difference between experiment and control samples in the merged dataset.

Appendix C Limitations

While our study provides valuable insights into the role of influencers in the machine learning (ML) community, there are several limitations that must be acknowledged:

Binary Gender Analysis: Our study utilized a gender prediction API that only outputs binary genders (male and female), limiting our ability to capture the full spectrum of gender identities. This binary approach oversimplifies the complex nature of gender and excludes non-binary, transgender, and gender-nonconforming individuals. Consequently, our findings may not accurately represent the diversity of gender identities within the ML community.

Accuracy with Non-Western Names: The gender prediction API may exhibit lower accuracy when analyzing non-Western names. This limitation potentially introduces bias in our gender distribution analysis, particularly in underrepresenting or misrepresenting authors from diverse cultural and ethnic backgrounds. As a result, the study’s conclusions regarding gender disparity might not fully capture the global diversity of the ML community.

Self-Reported Affiliations: The affiliations in our dataset are self-reported, which can lead to inconsistencies such as misspellings or missing location information. This limitation affects the accuracy of our geographical analysis, as the data may not accurately reflect the true affiliations or locations of the authors. Consequently, our observations regarding geographical concentration and diversity must be interpreted with caution.

Lack of Randomized Control Trial: The study did not include a randomized control trial, often considered the gold standard for establishing causality. Without this, we cannot conclusively determine whether the patterns observed are directly attributable to the influencers’ activities or are coincidental. This limitation means that our findings, while suggestive of trends, cannot definitively establish cause-and-effect relationships between influencer endorsements and paper visibility or citation counts.

In summary, these limitations highlight the need for cautious interpretation of our findings. Future research should consider incorporating more inclusive gender identification methods, improving the accuracy of cultural and geographical representation, and employing more rigorous experimental designs, such as randomized control trials, to strengthen the conclusions drawn from such studies.