GitHub just released its latest State of the Octoverse report with some astonishing numbers. Unfortunately, some of the numbers—like the claim of 40 million developers—are not just astonishing, they’re wrong.
I’m not suggesting some nefarious intent to deceive. GitHub folks aren’t like that. But by conflating accounts with developers, GitHub isn’t helping us get any closer to accurate data on the developer population.
More importantly, we don’t need to artificially inflate developer numbers in order to establish their importance.
40 million, sure. But 40 million of what?
It’s surprising that GitHub bothers to lead with the 40 million number at all, given that it immediately adds a caveat: 40 million refers to “the total number of non-spammy user accounts on GitHub as of September 30, 2019, regardless of their activity status.” OK, so we’re not talking about developers, but 40 million accounts is still impressive, right?
Analyst Lawrence Hecht was first to rain on the numbers parade, arguing, “Just because someone creates a GitHub account doesn’t mean they should be considered a developer. It is fascinating how many of these accounts become ‘inactive’ within a month of them being created.” Oof.
No, a GitHub account does not necessarily correspond to a developer. After all, I have a GitHub account, but I am hardly a developer, and I know plenty of folks in product marketing and product management who are on GitHub but aren’t developing software there or anywhere else.
Not only that, but many of those same accounts immediately go dark, or sit fallow for years, as is the case with Tom Krazit’s. Then there are plenty of real, individual developers who have multiple accounts, as Ian Massingham does.
Surely, if we care at all about developers, we should be most interested in those who are actively contributing code. Hecht has lamented that “most of these [Octoverse] figures [represent] inactive people.”
By contrast, other efforts, like Adobe open sourceror Fil Maj’s attempts to measure corporate contribution rankings, do focus on active contributors. This is something GitHub could easily do but doesn’t. GitHub notes active contributors to a variety of projects in the Octoverse report, so clearly they have the data.
Not to worry. Analyst firms have done their best to measure developer populations. For example, IDC pegs the developer counts as follows:
- 13 million full-time software developers
- Seven million part-time software developers
- 4.2 million non-compensated software developers
That makes 24.2 million total software developers globally. This rings true with other estimates like that of Evans Data, which reported 23 million developers in 2018 and expects 27.7 million by 2023.
OK, whatever. But, as Jono Bacon queries, “I am not sure why the average user/developer needs to care” how many developers there are on GitHub or anywhere else.
The developer numbers that matter
Some, like investor Ethan Kurzweil, are investing real money based on estimates of current and future developer populations.
Referencing GitHub’s 40 million number, for example, he declared it a “strong leading indicator that say[s] the market for developer technologies of tomorrow is going to be very bright indeed.” He’s almost certainly right, but not because of the erroneous 40 million number. Other figures in the same report make the case far better.
For example, GitHub’s Octoverse report lists the first-contributions repository as one of last year’s top repositories by contributions (ranking fourth overall). That’s amazing because, as Hecht pointed out, this repo is designed to help beginners learn how to contribute to open source projects.
Given the ever-increasing importance of open source to individuals and organisations, the growth in the number of contributors to that repository (more than 15,000 and rising) is more significant than 40 million total accounts, real or imaginary.
Or, relatedly, how about the 1.3 million first-time contributors in 2019? Or the fact that dramatically more open source contributions come from outside the U.S. than inside (80 per cent outside versus 20 per cent within)?
Or that Asia accounted for 36 per cent of private repositories in 2019? Indeed, worldwide, Hong Kong, Singapore, and Japan are the fastest-growing countries and regions by number of contributors, while China sits behind only the U.S. in terms of open source use (measured by clones and forks).
These are the numbers that matter, because these kinds of numbers shape industries, yes, but also societies. We don’t have 40 million global developers, but we do have a swelling population of developers, with most of the activity happening outside the U.S. The 40 million marketing number doesn’t matter, but these facts do.