Open Source .NET – 2 years later

A little over 2 years ago Microsoft announced that they were open sourcing large parts of the .NET framework, and as Scott Hanselman said in his recent Connect keynote, the community has been contributing in a significant way:

Over 60% of the contribution to .NET Core come from the community

You can see some more detail on this number in the talk ‘What’s New in the .NET Platform’ by Scott Hunter:

Connect talk - Community Contributions per month

This post aims to give more context to those numbers and allow you to explore patterns and trends across different repositories.


Repository activity over time

First we are going to see an overview of the level of activity in each repo, by looking at the total number of ‘Issues’ (created) or ‘Pull Requests’ (closed) per month. (Yay sparklines FTW!!)

Note: Numbers in black are from the most recent month, with red showing the lowest and green the highest previous value. You can toggle between Issues and Pull Requests by clicking on the buttons, hover over individual sparklines to get a tooltip showing the per-month values, and click on the project name to go to the GitHub page for that repository.
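
For anyone wanting to reproduce the per-month numbers behind the sparklines, the bucketing step can be sketched roughly as below. This is only an illustration, not the code used for this post: it assumes you already have a list of ISO 8601 timestamps (the `created_at` or `closed_at` style values GitHub returns), and the function name and sample data are made up.

```python
from collections import Counter
from datetime import datetime


def monthly_counts(timestamps):
    """Count items per calendar month from ISO 8601 UTC timestamps."""
    counts = Counter()
    for ts in timestamps:
        # Assumed format: "2016-10-03T12:34:56Z"
        moment = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ")
        counts[moment.strftime("%Y-%m")] += 1
    # Sort by month so the result can be plotted left to right
    return dict(sorted(counts.items()))


# Hypothetical sample data, not real repository numbers
sample_created = [
    "2016-10-01T09:30:00Z",
    "2016-09-14T08:00:00Z",
    "2016-10-20T17:45:00Z",
]
print(monthly_counts(sample_created))  # {'2016-09': 1, '2016-10': 2}
```

Feeding it the creation dates of Issues, or the closing dates of Pull Requests, gives one series per repo, which is all a sparkline needs.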

The main trend I see across all repos is a sustained level of activity for the entire 2 years; things didn’t start with a bang and then tail off. In addition, many (but not all) repos show increasing activity month-by-month. The PRs in CoreFX and the Issues in Visual Studio Code (vscode) are clear examples of this: their best months have been the most recent.

Finally, one interesting ‘story’ that jumps out of this data is the contrasting levels of activity (PRs) across the dnx, cli and msbuild repositories, as highlighted in the image below:

Comparison of dnx v cli v msbuild

If you don’t know the full story, initially all the cmd-line tooling was known as dnx, but in RC2 it was migrated to .NET Core CLI. You can see this on the chart: activity in the dnx repo decreased at the same time that work in cli ramped up.

Following that, in May this year, the whole idea of having ‘project.json’ files was abandoned in favour of sticking with ‘msbuild’. You can see this change towards the right of the chart, where there is a marked increase in msbuild repo activity as the improvements that had been made in cli were ported over.


Methodology - Community v. Microsoft

But the main question I want to answer is:

How much Community involvement has there been since Microsoft open sourced large parts of the .NET framework?

(See my previous post to see how things looked after one year)

To do this we need to look at who opened the Issue or created the Pull Request (PR) and specifically if they worked for Microsoft or not. This is possible because (almost) all Microsoft employees have indicated where they work on their GitHub profile, for instance:

David Fowler Profile

There are some notable exceptions, e.g. @shanselman clearly works at Microsoft, but it’s easy enough to allow for cases like this. Before you ask, I only analysed this data, I did not keep a copy of it stored in MongoDB to sell to recruiters!!
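
As a rough sketch of that classification rule (not the actual code behind this post), the check boils down to looking at the free-text company field on a profile and falling back to a hand-maintained list of exceptions. The function names, the override set and the sample logins below are assumptions made for illustration.

```python
# Hypothetical override list: people known to work at Microsoft whose
# profile does not say so (@shanselman is the exception named above).
KNOWN_MICROSOFT_LOGINS = {"shanselman"}


def is_microsoft(login, company):
    """Guess whether a contributor works at Microsoft."""
    if login.lower() in KNOWN_MICROSOFT_LOGINS:
        return True
    # The company field is free text and may be missing entirely
    return "microsoft" in (company or "").lower()


def classify(login, company):
    """Label a contributor as 'Microsoft' or 'Community'."""
    return "Microsoft" if is_microsoft(login, company) else "Community"


# Made-up logins and company values, for illustration only
print(classify("example-employee", "Microsoft Corp."))  # Microsoft
print(classify("shanselman", None))                     # Microsoft
print(classify("example-user", "Contoso"))              # Community
```

A substring match like this is deliberately loose, so it will also catch values such as ‘@Microsoft’, but it can misclassify someone who merely mentions Microsoft in that field.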

Overall Participation - Community v. Microsoft

This data represents the total participation from the last 2 years, i.e. November 2014 to October 2016. All Pull Requests and Issues are treated equally, so a large PR counts the same as one that fixes a spelling mistake. Whilst this isn’t ideal, it’s the simplest way to get an idea of the Microsoft/Community split.
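
To make the ‘everything counts as one’ rule concrete, here is a minimal sketch of how such a split could be computed. It assumes each Issue or PR has already been labelled with the group of its author; the function name and the sample labels are invented and are not real repository data.

```python
from collections import Counter


def participation_split(author_groups):
    """Return {group: (count, whole-number percentage)}.

    Every Issue or PR contributes exactly one unit, whatever its size.
    """
    counts = Counter(author_groups)
    total = sum(counts.values())
    return {
        group: (n, round(100 * n / total))
        for group, n in counts.items()
    }


# Invented sample: one label per Pull Request
sample_prs = ["Microsoft", "Microsoft", "Microsoft", "Community"]
print(participation_split(sample_prs))
# {'Microsoft': (3, 75), 'Community': (1, 25)}
```

Weighting by lines changed or by review effort would be possible too, but it would need far more data per item than a simple count does.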

Note: You can hover over the bars to get the actual numbers, rather than percentages.

Issues: Microsoft Community
Pull Requests: Microsoft Community

The general pattern these graphs show is that the Community is more likely to open an Issue than submit a PR, which I guess isn’t that surprising given the relative amount of work involved. However, it’s clear that the Community is still contributing a considerable amount of work. For instance, the CoreCLR repo only has 21% of its PRs from the Community, but this still accounts for almost 900!

There are a few interesting cases that jump out here. For instance, Roslyn gets 35% of its Issues from the Community, but only 6% of its PRs; clearly getting code into the compiler is a tough task. Likewise, it doesn’t seem like the Community is that interested in submitting code to msbuild, although it does have my favourite PR ever:

Fix legacy msbuild issues


Participation over time - Community v. Microsoft

Finally we can see the ‘per-month’ data from the last 2 years, i.e. November 2014 to October 2016.

Note: You can inspect different repos by selecting them from the pull-down list, but be aware that the y-axis on each graph is re-scaled, so the maximum value will change each time.

Issues: Microsoft Community
Pull Requests: Microsoft Community

Whilst not every repo is growing month-by-month, the majority are, and those that aren’t at least show sustained contributions across 2 years.


Summary

I think it’s clear that the Community has got on board with the new Open-Source Microsoft, producing a sustained level of contributions over the last 2 years. Let’s hope it continues!
