A new major version of Postgres is released each year around
September. I took a
look at which companies contributed the most to the most recent
release. This is not trivial for a few reasons.
First, many people contribute independently of their employer. Throughout
this article, when I say something like “the companies that X”, I
mean the people employed by those companies.
Second, measuring contributions is fraught. Lines of code added or removed might
include documentation or generated code. You can count commits instead, but
different developers commit differently: some split work into many small
commits while others land a few large ones.
Third, this analysis only looked at contributions to the Postgres
server itself and not to the broader ecosystem of key projects for
Postgres users including pgbouncer, pgjdbc, pgvector, pgadmin,
postgrest, postgis, and so on.
Fourth, who employs whom is in flux. I made a good-faith
effort to identify each contributor’s employer through public data
such as LinkedIn, personal websites, and email domains. Among nearly 260
contributors, I couldn’t identify the employer of 46. There
are likely mistakes at the margins.
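
The measurement caveats above can be made concrete with a minimal sketch of how per-company totals might be tallied. This is not necessarily how the numbers in this article were produced. It assumes the input is the text output of `git log --numstat --format=@%ae`, it attributes each commit to the git author email, and both `EMPLOYER_BY_EMAIL` (with its placeholder entries) and `summarize` are hypothetical names. In the Postgres repository the patch author is often credited in the commit message rather than in the git author field, so a real analysis would likely need extra parsing.

```python
from collections import defaultdict

# Hypothetical email-to-employer mapping with placeholder entries.
# A real mapping would have to be researched by hand.
EMPLOYER_BY_EMAIL = {
    "alice@example.com": "Example Corp",
    "bob@example.org": "Example Org",
}


def summarize(log_text, employer_by_email):
    """Tally insertions, deletions, commits, and contributors per employer.

    Expects the output of: git log --numstat --format=@%ae
    Each commit is a line "@<author email>" followed by numstat lines
    of the form "<added>\t<deleted>\t<path>".
    """
    stats = defaultdict(
        lambda: {"insertions": 0, "deletions": 0, "commits": 0, "contributors": set()}
    )
    company = None
    for line in log_text.splitlines():
        if line.startswith("@"):
            # Header line: start of a new commit.
            email = line[1:].strip().lower()
            company = employer_by_email.get(email, "::Unknown::")
            stats[company]["commits"] += 1
            stats[company]["contributors"].add(email)
            continue
        parts = line.split("\t", 2)
        if len(parts) != 3 or company is None:
            continue  # blank separator line or unexpected input
        added, deleted, _path = parts
        # Binary files are reported as "-" for both counts; skip those.
        if added != "-":
            stats[company]["insertions"] += int(added)
        if deleted != "-":
            stats[company]["deletions"] += int(deleted)
    return stats
```

Note that this counts every changed line equally, so documentation and generated files inflate the totals unless paths are filtered first.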
Nonetheless, the data tells an interesting story. The companies behind
the most commits (EnterpriseDB, then Microsoft) are not the companies with
the most people who committed (Postgres Professional, then Amazon).
Here are the top 20 companies by commits, along with lines added,
lines deleted, and the number of unique contributors.
| Company | Insertions | Deletions | Commits | Contributors |
|---|---|---|---|---|
| EnterpriseDB | 307947 | 219678 | 709 | 20 |
| Microsoft | 46246 | 17908 | 559 | 16 |
| Amazon | 41026 | 27949 | 537 | 24 |
| Snowflake | 28838 | 18052 | 333 | 4 |
| Databricks | 7244 | 6303 | 116 | 4 |
| NTT | 3523 | 1455 | 110 | 16 |
| ::Unknown:: | 7009 | 2977 | 102 | 46 |
| Fujitsu | 8686 | 2105 | 99 | 11 |
| Supabase | 2172 | 9254 | 51 | 3 |
| Postgres Professional | 9630 | 1760 | 48 | 30 |
| Datadog | 5770 | 1108 | 30 | 2 |
| Apple | 9501 | 2016 | 29 | 2 |
| SRA OSS K.K. | 1271 | 239 | 29 | 3 |
| University of Cambridge | 6280 | 1785 | 28 | 1 |
| ::Freelancer:: | 638 | 249 | 26 | 7 |
| Yandex | 1326 | 411 | 20 | 2 |
| Cybertec | 3241 | 196 | 20 | 6 |
| MotherDuck | 1106 | 114 | 20 | 1 |
| Tiger Data | 1124 | 202 | 16 | 4 |
| Tencent | 202 | 22 | 16 | 1 |
::Unknown:: covers the 46 contributors whose employer I couldn’t
find. ::Freelancer:: covers contributors who work independently.
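
The gap between the commit ranking and the contributor ranking can be checked directly against the table. The sketch below copies the Commits and Contributors columns for the named companies (leaving out the ::Unknown:: and ::Freelancer:: rows) and sorts by each column in turn. The `ROWS` and `top` names are illustrative, and the result only reflects the rows shown here, not the full dataset.

```python
# (company, commits, contributors), copied from the table above.
ROWS = [
    ("EnterpriseDB", 709, 20),
    ("Microsoft", 559, 16),
    ("Amazon", 537, 24),
    ("Snowflake", 333, 4),
    ("Databricks", 116, 4),
    ("NTT", 110, 16),
    ("Fujitsu", 99, 11),
    ("Supabase", 51, 3),
    ("Postgres Professional", 48, 30),
    ("Datadog", 30, 2),
    ("Apple", 29, 2),
    ("SRA OSS K.K.", 29, 3),
    ("University of Cambridge", 28, 1),
    ("Yandex", 20, 2),
    ("Cybertec", 20, 6),
    ("MotherDuck", 20, 1),
    ("Tiger Data", 16, 4),
    ("Tencent", 16, 1),
]


def top(rows, column, n=2):
    """Return the names of the n rows with the largest value in `column`."""
    ranked = sorted(rows, key=lambda row: row[column], reverse=True)
    return [row[0] for row in ranked[:n]]


by_commits = top(ROWS, 1)
by_contributors = top(ROWS, 2)
print(by_commits)       # ['EnterpriseDB', 'Microsoft']
print(by_contributors)  # ['Postgres Professional', 'Amazon']
```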
Individual commits also tell fun stories. The sole
Intel-affiliated commit refactored the way Postgres checks for
SSE4.2 support in its CRC-32C calculations, as preparatory work for
future optimizations of the CRC-32C code.
Another interesting commit came from first-time contributor Sophie
Alpert, who fixed a bug around filtering on ctid range checks, a
bug that “appears to have been present since the introduction of TID
scans in 1999”.
I’ll continue the exploration of Postgres development in future
posts. Let me know what you’d like to
learn more about!
About the author
Phil is the founder of The Consensus. Before this, he contributed to
Postgres products at EnterpriseDB, cofounded and led marketing at
TigerBeetle, and was an engineering manager at Oracle. He runs the
Software Internals Discord, the NYC Systems Coffee Club, the
Software Internals Email Book Club, and co-runs NYC
Systems. @eatonphil