
People in your software supply chain

Published 2022-05-31 by Seth Larson
Reading time: 6 minutes

For many open source consumers the "logical units" being depended on are libraries. However, the libraries themselves are only a product of what consumers are actually depending on: people.

You're likely to recognize the names of popular Python libraries like numpy, Requests, and Django. But could you name the maintainers of any of those libraries? How about the maintainers of transitive dependencies or who triaged your bug report last week?

This situation isn't unique to the Python ecosystem; all open source ecosystems are structured similarly. Thousands of hours of labor are hidden behind every package installed with $ pip install. The modern software supply chain is both miraculous and terrifying.

Production and consumption of open source software are maturing for the better, and along the way we shouldn't forget that the leaves in our dependency trees are living and feeling people.

This article uses examples from the Python ecosystem but likely applies to many ecosystems and projects.

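[Diagram: the dependency tree of requests (urllib3, idna, chardet) with the people behind those projects as its leaves: Nate Prewitt, Seth Michael Larson, Quentin Pradet, Andrey Petrov, Ian Stapleton Cordasco, Daniel Blanchard, Erik Rose, Mark Pilgrim, Andreas Jung, Kim Davies]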


Dependencies you won't find in requirements.txt

Invisible dependencies, invisible maintainers

There are many projects and people that go uncounted due to their place in the dependency tree or circumstances of the ecosystem. Projects for tooling and infrastructure around packaging are severely undercounted in terms of "number of dependents".

These are tools like pip, twine, and virtualenv and infrastructure like PyPI. pip itself has only ~80,000 dependent repositories listed on GitHub because pip is not typically listed in a dependency list or lock file. It's obvious to anyone who knows the Python ecosystem that this isn't representative of pip's actual number of dependents, which is roughly every person installing any package from PyPI.

The people behind projects are also severely undercounted. Counting "maintainers" is typically done using metadata from package repositories like PyPI because there are no other easy ways to programmatically gather this information. This method leaves out the many maintainers who aren't listed in package metadata. Maintainers don't only manage releases and write code; they also secure funding, mentor new contributors, triage issues, and review pull requests. None of these tasks are captured by looking only at package metadata and commit authors.
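To make the limitation concrete, here is a minimal sketch of what a metadata-only count can see. It assumes the standard core metadata fields (Author, Author-email, Maintainer, Maintainer-email); the helper name listed_people and the sample package are hypothetical and exist only for illustration.

```python
from email.parser import Parser

# Core metadata fields that can name a person.
PEOPLE_FIELDS = ("Author", "Author-email", "Maintainer", "Maintainer-email")


def listed_people(metadata) -> set[str]:
    """Return the names (or bare addresses) visible in package metadata.

    `metadata` is any object with a `get_all(field)` method, such as the
    result of `importlib.metadata.metadata("some-installed-package")`.
    """
    people = set()
    for field in PEOPLE_FIELDS:
        for value in metadata.get_all(field) or []:
            # Naive split: good enough for a sketch, wrong for quoted commas.
            for entry in value.split(","):
                name = entry.split("<")[0].strip() or entry.strip(" <>")
                if name:
                    people.add(name)
    return people


# Hypothetical metadata for a made-up package.
SAMPLE = """\
Metadata-Version: 2.1
Name: example-package
Version: 1.0
Author: Jane Example
Maintainer-email: Sam Sample <sam@example.org>, pat@example.org
"""

print(sorted(listed_people(Parser().parsestr(SAMPLE))))
```

Whoever triages issues, reviews pull requests, or secures funding for this made-up package never shows up in the result unless they also happen to be listed in one of those four fields.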

Dependency lists and metrics are becoming important to the companies waking up to the importance of supporting open source software. Organizations suddenly need to traverse their dependency trees to see who they've quietly been depending on for years. When projects and maintainers aren't on the traversal path they're at risk of being left out of discovery for financial support, recognition, and hiring.

This is a disservice to everyone: maintainers lose out and organizations aren't supporting people they're depending on.



Requests dependencies on GitHub listing an archived fork repo instead of pypa/wheel due to github/feedback#6456

Most of the work to improve this situation falls on platforms and organizations like GitHub, PyPA, and the PSF. I'm hopeful the roadmaps for these organizations reflect the dire need to recognize these less prominent members of the open source ecosystem.

Support maintainers through tightening standards

Being a maintainer of an open source project requires running fast just to stay still. Every project requires security responses with fixes, updates to dependencies, and support for new language versions, features, and platforms.

The list of "must haves" has been growing to include 2FA, password managers, and automated publishing with API tokens, and we'll see signed packages and metadata in the near future. All of these are excellent improvements that maintainers should be implementing, but they also take time and are mostly being mandated for large numbers of volunteers.

With every additional hurdle that projects must clear to meet current standards, we lose time that would have been spent on other tasks like triage, bug fixes, writing documentation, or engaging with the community. When the amount of work demanded from maintainers becomes too much, we lose maintainer time to burnout, disinterest, and frustration.

Tidelift's approach to sustainably fostering high quality open source projects is to pay maintainers to configure account security, triage and fix security issues, and keep up-to-date project metadata. Tidelift also provides their own security response team to help maintainers triage security reports and request CVEs.

Dustin Ingram's talk at PyCon US 2022 on securing the open source software supply chain is a must watch for both open source consumers and producers. The talk left me feeling like we're in good hands when it comes to the Python packaging ecosystem. I especially loved the mention of giving away hardware keys to maintainers immediately following the announcement of required 2FA for popular projects.

Advocate on behalf of maintainers

The open source ecosystem is changing dramatically. Maintainers are forced to meet new demands but are not supported nearly enough. If you don't see an initiative for supporting open source maintainers within your organization then now is a great time to start raising awareness. Learn about and support the initiatives happening inside ecosystems your organization depends on or across all of open source.

If your organization is interested in sponsoring packages or maintainers it can be daunting to get started. Below I've listed three initiatives where companies have funded open source to use as inspiration. The common themes among the three examples are choosing projects the organization actually depends on and letting employees nominate or vote on which projects receive funds.

When doing dependency analysis, don't forget to include tools and infrastructure in your search. Your team likely depends on these projects as much as, if not more than, its application dependencies.
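One way to act on this is to merge the names found in a requirements file with a hand-maintained list of tooling before deciding who to support. The sketch below is only an illustration: the function names and the TOOLING set are assumptions, the parsing handles only simple name-based requirement lines, and the tool names are the ones mentioned earlier in this article.

```python
import re

# Assumption: tools a team uses but never lists as dependencies.
# Extend this by hand for your own organization.
TOOLING = {"pip", "twine", "virtualenv"}


def normalize(name: str) -> str:
    """Normalize a project name the way PEP 503 describes."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_names(requirements_text: str) -> set[str]:
    """Extract project names from simple requirements.txt content."""
    names = set()
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and options like -r
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.add(normalize(match.group()))
    return names


def funding_candidates(requirements_text: str, tooling=TOOLING) -> set[str]:
    """Application dependencies plus the tooling that lists never mention."""
    return requirement_names(requirements_text) | {normalize(t) for t in tooling}


example = "requests==2.27.1\nDjango>=4.0  # web framework\n\n-r other.txt\nnumpy\n"
print(sorted(funding_candidates(example)))
```

The tooling list is deliberately manual. As discussed above, these projects do not appear on any automated traversal path, so someone has to add them on purpose.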

Spotify's FOSS Fund

The Spotify FOSS fund aims to support projects that aren't corporately backed, align with company values, and are actively maintained. The initial amount for the fund is €100,000 with the goal of supporting even more projects in the future. Projects were selected using dependency data across repositories in addition to nominations from developers.

Sentry's FOSS Fund 155

Sentry published their FOSS Fund 155 which gave nearly $155,000 to open source maintainers. The amount donated was calculated using an article written by Gratipay founder Chad Whitacre which recommends a $2,000 donation per engineer per year. Sentry's donation included a range of donations to foundations like the PSF and projects like Psycopg and Vue.js. The projects receiving donations were determined using dependency analysis and asking for employee nominations and votes.
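The per-engineer recommendation is easy to turn into a starting budget. This is a minimal sketch of that arithmetic only; the function name and the headcount in the example are hypothetical and do not describe Sentry's actual figures.

```python
def annual_foss_budget(engineers: int, per_engineer_usd: int = 2_000) -> int:
    """Yearly open source budget at a flat rate per engineer."""
    if engineers < 0:
        raise ValueError("engineers must be non-negative")
    return engineers * per_engineer_usd


# Hypothetical organization with 50 engineers.
print(annual_foss_budget(50))
```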

Indeed's FOSS Contributor Fund

Indeed announced the FOSS Contributor Fund blueprint which aims to help interested organizations democratize financial contributions to open source by allowing employees who contribute to open source to vote on which projects receive funds every quarter. Nomination criteria for projects include being in use by the organization, using an OSI approved license and having an approved mechanism for receiving funds. Example funding mechanisms include Open Collective, Patreon, PayPal, and GitHub Sponsors.

Indeed also sponsors over 100 individuals on GitHub Sponsors. The focus on individual sponsorships in addition to projects and organizations aligns with what we know about how open source works today: a small number of contributors make up a large share of the total contributions per project and typically maintain multiple projects.



This work is licensed under CC BY-SA 4.0