Friday, February 26, 2016

Measuring User Influence in GitHub: The Million Follower Fallacy


Type: Publication (accepted)

Venue: 3rd International Workshop on CrowdSourcing In Software Engineering; CSI-SE 2016
May 16, 2016, Austin, TX, USA

Authors: Ali Sajedi Badashian, Eleni Stroulia
Department of Computing Science, University of Alberta, Canada

Abstract
Influence in social networks has been extensively studied for collaborative-filtering recommendations and marketing purposes. We are interested in the notion of influence in Software Social Networks (SSNs). More specifically, we want to answer the following questions: 1) What does “influence” mean in SSNs? Given the variety of interaction types supported in these networks and the abundance of centrality-type metrics, what is the nature of the influence captured by these metrics? 2) Are there silos of influence in these platforms, or does influence span thematic boundaries?

To investigate these two questions, we first conducted an in-depth comparison of three influence metrics in GitHub (the largest code-sharing and version-control system): number of followers, number of forked projects, and number of project watchers. Next, we examined how the influence of the top software engineering people in GitHub is spread over different programming languages.
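
As an illustration of what comparing such metrics can look like, the sketch below computes pairwise Spearman rank correlations between the three per-user counts. The abstract does not state which comparison method the paper uses, so the choice of Spearman correlation is an assumption, and the user counts are made-up toy values, not data from the study.

```python
from itertools import combinations


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (spread_x * spread_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data: one entry per user, same user order in each list.
metrics = {
    "followers": [1200, 300, 50, 800, 10],
    "forks": [40, 500, 20, 60, 5],
    "watchers": [90, 900, 35, 120, 8],
}

for name_a, name_b in combinations(metrics, 2):
    rho = spearman(metrics[name_a], metrics[name_b])
    print(f"{name_a} vs {name_b}: rho = {rho:.2f}")
```

A high correlation between two metrics would suggest they rank users similarly; a low one would suggest they capture different characteristics.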

Our results indicate (a) that the three influence metrics capture two major characteristics: popularity and content value (code reusability) and (b) that the influence of influentials is spread over more than one programming language, but there is no specific trend toward any two programming languages.

Keywords:
Software engineering, influence, software repositories, programming languages, software social network analysis, crowdsourcing