Apache Kafka: 12-Month Engineering Intelligence Digest
Period: March 2025 through February 2026
Sources: GitHub (PRs, reviews, commits, comments), Jira, dev@kafka mailing list
Summary
The defining story of Kafka's past year is a project in architectural transition: three major protocol rewrites (KIP-848 consumer groups, KIP-932 share groups, KIP-1071 streams groups) shipped concurrently while a remarkably stable corps of 25-30 consistent contributors held the codebase together through stewardship work that outweighed feature output every single month. Chia-Ping Tsai reviewed more than 2,000 PRs across the year while authoring fewer than 10, functioning simultaneously as the project's most critical bottleneck and its most important quality gate. The project's review capacity is concentrating, not diversifying: the top 5 reviewers' share of review load rose from roughly 60% to 68-70% while the total reviewer count declined.
The Year in Numbers
| Period | PRs Merged | Unique Authors | Active Reviewers | Stewardship % |
|---|---|---|---|---|
| Mar 2025 | ~207 | 57 | 50+ | 62% |
| Apr-Jun 2025 | ~1,469 | 65+ | 55+ | 62% |
| Jul-Sep 2025 | ~380 | 60+ | 50+ | 65% |
| Oct-Dec 2025 | ~475 | 59 | 51->44 | 68% |
| Jan-Feb 2026 | ~259 | 54 | 46 | 73% |
The stewardship share of merged PRs climbed steadily from 62% in March 2025 to 73% in early 2026. Over the same period, feature work declined from 21% to under 8% as the project shifted from active KIP implementation toward stabilization.
Highlights
1. Chia-Ping Tsai: Architectural Gravity
chia7712 (Chia-Ping Tsai) reviewed an estimated 2,100+ PRs over 12 months while authoring fewer than 10. No other open-source contributor in this dataset comes close. His review breadth touched 60-70 unique authors per month, spanning every subsystem: core, clients, streams, connect, storage, tools.
His review volume per month:
| Month | PRs Reviewed | Review Comments |
|---|---|---|
| Mar 2025 | 199 | 394 |
| Apr | 116 | 196 |
| May | 103 | 275 |
| Jun | 95 | 245 |
| Jul | 75 | 210 |
| Aug | 89 | 280 |
| Sep | 110 | 202 |
| Oct | 177 | 152 |
| Nov | 218 | 157 |
| Dec | 186 | 155 |
| Jan 2026 | 70 | 130 |
| Feb 2026 | 69 | 120 |
The pattern is not uniform. Volume dipped from July through September (75-110 reviews/month), then spiked from October through December (177-218), which correlates with the Kafka 4.2 release cycle ramping up. The drop to 69-70 reviews in January-February 2026 likely reflects a post-release lull, but it is still the highest PR-review volume of any reviewer in those months.
On the mailing list, Tsai participated in 18 threads and cast 9 votes (5 binding) during a single 30-day window. On Jira, he created 36 tickets and delegated 34 of them to a network of contributors, primarily Taiwanese developers: Ming-Yen Chung (7 tickets), Ken Huang (5), Cheng Yi Chang (3), and others. This is a fully integrated technical leadership profile operating simultaneously across three platforms.
His review comments show consistent probing behavior. On PR #19216, he flagged that throwing RuntimeException instead of LogCleaningException would crash the LogCleaner thread, forcing users to reconfigure: "That could be a terrible issue as users have to reconfigure the LogCleaner to bring them back." On PR #19232, he caught a concurrency bug: "please create a temporary reference for currentImage. otherwise, it may cause concurrent issue."
The risk: Tsai is a single point of failure for review throughput. If his review volume dropped by 50%, the project's merge rate would likely slow significantly. No succession plan for this role is visible in the data.
2. Three Protocol Rewrites, One Year
Kafka simultaneously shipped three major protocol initiatives across this period:
KIP-848 (New Consumer Groups): Completed in March 2025 when dajac deleted 14,831 lines of old group coordinator code (PR #19255). The old coordinator removal was timed to Kafka 4.0.0. This was the culmination of multi-year work.
KIP-932 (Share Groups): The largest sustained feature effort of the year. Active from March through December across 200+ PRs. Key implementers: chirag-wadhwa5, apoorvmittal10, smjn, adixitconfluent, AndrewJSchofield, ShivsundarR, JimmyWang6. The effort peaked in May (71 merged PRs tagged KIP-932) and wound down by December as the implementation shifted to documentation and cleanup. AndrewJSchofield was the primary architectural reviewer, contributing 155+ reviews across Q4 alone.
KIP-1071 (Streams Groups): Built in parallel by lucasbru (server-side) and cadonna (client-side) starting in March. The cadonna-lucasbru bidirectional review relationship was the most distinctive collaboration pattern of the year: cadonna reviewed lucasbru 96 times and lucasbru reviewed cadonna 66 times in March alone. cadonna's last significant activity was in June 2025, after which the streams group work shifted to other contributors. lucasbru continued as the Streams subsystem's most consistent dual-role contributor (authoring + deep review) through February 2026.
The simultaneous execution of three protocol rewrites without significant merge conflicts or blocking incidents is the kind of coordination that doesn't appear in any dashboard.
3. The Review Economy is Concentrating
The number of active reviewers declined while the top reviewers absorbed more load:
| Month | Active Reviewers | Top 5 Share of Reviews |
|---|---|---|
| Mar 2025 | 50+ | ~60% |
| Jun 2025 | 45+ | ~62% |
| Sep 2025 | 45 | ~65% |
| Dec 2025 | 44 | ~70% |
| Feb 2026 | 46 | ~68% |
The Q4 decline from 51 reviewers (October) to 44 (December) was partially offset by new reviewers emerging: Pankraz76 (13 reviews in December, zero before), kamalcph surging from 1 review in November to 23 in December. But these compensated for departures rather than expanding capacity.
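The "Top 5 Share of Reviews" column above can be reproduced from per-reviewer counts. The following is a minimal sketch, assuming a simple mapping of reviewer to review count for one month; the function name `top_n_share` and the example counts are hypothetical and are not taken from the dataset.

```python
from collections import Counter


def top_n_share(review_counts: dict, n: int = 5) -> float:
    """Return the fraction of all reviews performed by the n busiest reviewers."""
    total = sum(review_counts.values())
    if total == 0:
        return 0.0
    top = sum(count for _, count in Counter(review_counts).most_common(n))
    return top / total


# Hypothetical per-reviewer counts for a single month (illustrative only).
example_month = {
    "reviewer_a": 60, "reviewer_b": 30, "reviewer_c": 20,
    "reviewer_d": 15, "reviewer_e": 15, "reviewer_f": 12,
    "reviewer_g": 12, "reviewer_h": 12, "reviewer_i": 12,
    "reviewer_j": 12,
}

share = top_n_share(example_month)
print(f"Top-5 share: {share:.0%}")
```

Tracking this one number month over month is what exposes concentration even when the headline reviewer count looks roughly flat.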
Pure reviewers who merit special attention:
- junrao (Jun Rao): 0 PRs authored all year, approximately 150+ reviews. His comment depth is extraordinary: 8 comments per review in August 2025, 3-4 comments per review consistently. On PR #21065 (producer message corruption), his comment drove the architecture: "Each inflight batch has a pending RPC request in NetworkClient. It's probably better to deallocate the buffer when the RPC request completes." This is governance-level review, highly selective but deeply consequential.
- jsancio: 40 review comments across 8 reviews in January 2026 (5 comments/review). Focused on KRaft and metadata subsystems with cmccabe.
- kevin-wu24: Emerged as a consistent pure reviewer in Q4 (14->22->20 reviews), filling a gap in the KRaft/replication review load.
4. cadonna's Disappearance and What It Reveals
Bruno Cadonna (cadonna) was one of the most active contributors in March 2025: 7 PRs merged, 7 reviewed, 132 review comments. His client-side KIP-1071 work was architecturally critical. His GitHub activity tapered off by June 2025, and he did not appear in any month from July 2025 onward.
On Jira (as of Jan-Feb 2026), Cadonna still shows 11 comments, but these may be historical. On the mailing list, he has no visible activity.
His departure left a visible gap: the KIP-1071 client-side implementation that he was building (Streams membership manager, heartbeat request state management) had to be continued by others. The fact that the Streams group effort maintained momentum through mjsax's architectural review and contributions from aliehsaeedii, frankvicky, bbejeck, and Nikita-Shupletsov suggests the project absorbed the departure, but the original server-client partnership between lucasbru and cadonna was never replicated.
5. Mentorship Patterns: Who Is Developing Whom
The review data reveals deliberate, sustained mentorship relationships that persisted across multiple quarters:
chia7712's mentorship network (12-month aggregate):
| Mentee | Reviews Received from chia7712 | Trajectory |
|---|---|---|
| FrankYang0529 | ~150+ | Peaked Apr-Jun, steady reviewer |
| m1a2st | ~150+ | Grew from 3 merged/month to 10 merged/month |
| DL1231 | ~130+ | 4->8->15 merged PRs Oct-Dec (steepest growth) |
| mingyen066 | ~80+ | Steady stewardship contributor |
| frankvicky | ~70+ | Shifted to pure reviewer mid-year |
| TaiJuWu | ~50+ | Module migration specialist |
| Rancho-7 | ~40+ | Tapered off after Q2 |
The chia7712->DL1231 relationship is particularly notable: reviews escalated from 12 (October) to 21 (November) to 41 (December), correlating with DL1231's output growth from 4 to 15 merged PRs. This is the clearest "investment yields returns" pattern in the data.
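To make the correlation claim above concrete, here is a small sketch that computes a Pearson coefficient over the three monthly pairs quoted in the text (reviews received: 12, 21, 41; merged PRs: 4, 8, 15). The helper `pearson` is written here for illustration only. With just three data points the result says the two series moved together, not that one caused the other.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Oct, Nov, Dec 2025 values as quoted in the text.
reviews_from_chia7712 = [12, 21, 41]
merged_prs_dl1231 = [4, 8, 15]

r = pearson(reviews_from_chia7712, merged_prs_dl1231)
print(f"Pearson r over three months: {r:.3f}")
```

A value close to 1 is expected whenever two short series both rise monotonically, so this supports the "investment yields returns" reading only alongside the qualitative evidence.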
mjsax's streams mentorship (12-month aggregate):
| Mentee | Reviews from mjsax | Phase |
|---|---|---|
| Nikita-Shupletsov | ~120+ | Consistent across H2 2025 |
| lucasbru | ~60+ | Peer relationship |
| lucliu1108 | ~40+ | Q4 concentration |
| aliehsaeedii | ~50+ | Exploded in Feb 2026 (45 reviews in one month) |
| frankvicky | ~30+ | KIP-1271 implementation |
| gensericghiro | ~20+ | Q4 |
mjsax's February 2026 was extraordinary: 52 PRs reviewed, 379 review comments, driven entirely by the KIP-1271 (Record Headers in State Stores) feature push. He surpassed chia7712 in comment volume for the first time, becoming the single most prolific reviewer for that month.
Other persistent mentorship pairs:
- AndrewJSchofield -> chirag-wadhwa5 (KIP-932, ~100+ reviews)
- apoorvmittal10 -> chirag-wadhwa5 (KIP-932, ~60+ reviews; multiple senior contributors investing in one person)
- lianetm -> kirktrue (consumer subsystem, 24 reviews in October alone)
- jolshan -> rreddy-22 (transaction subsystem)
- dajac -> brandboat (group coordinator, ~25 reviews in Feb 2026)
6. Stewardship: The Work Nobody Wants
Stewardship work (tests, config cleanup, dependency updates, module migrations, documentation, flaky test fixes) constituted 62-73% of all merged PRs every single month. The contributors doing this work consistently are holding the project together:
sjhajharia: The year's stewardship champion. 22 merged cleanup PRs in Q3 alone, systematically working through every module (Connect, Tools, Server, Trogdor, JMH-Benchmarks, Storage). Net code deleter: removed 5,144 lines in August while adding 3,424. Nearly zero review activity, meaning sjhajharia consumes review capacity without contributing back to it, but the cleanup work they do is precisely the work that nobody else volunteers for.
clolov: 91 maintenance PRs across April-June 2025 (Mockito migration, JUnit 5 migration, test cleanup). Continued through Q4 with module moves. Zero high-discussion PRs, minimal review attention received. The single highest-volume stewardship contributor of the year.
mingyen066: 11 merged PRs in February 2026, all stewardship (Jetty CVE fixes across multiple branches, Python CI, test migration). Zero reviews given. A pure steward who appeared in chia7712's mentorship network (receiving 10+ reviews/month).
dejan2609: Drove the Gradle 8->9 migration (PR #19513), which generated 110+ comments across three consecutive months (April-June). Build infrastructure upgrades are among the most thankless work in open source.
mumrah: 15 merged PRs in March 2025, all build/CI infrastructure. Multiple hotfixes for branch protection, flaky test marking, CI configuration. This is the person who keeps the build green.
7. The KIP-1271 Sprint (February 2026)
February 2026 saw a concentrated feature push for KIP-1271 (Record Headers in State Stores) that reshaped the month's contributor landscape. 40 related PRs shipped from 6 authors, totaling +35,474/-3,820 lines:
- frankvicky: 10 merged PRs (WindowStore implementation), plus 20 reviews and 87 review comments
- aliehsaeedii: 7 merged PRs (KeyValueStore implementation), plus 24 reviews and 135 review comments
- bbejeck: 6 merged PRs (SessionStore implementation)
- mjsax: Architectural gatekeeper, 379 review comments
The three implementers actively cross-reviewed each other's work: aliehsaeedii reviewed frankvicky 31 times, frankvicky reviewed aliehsaeedii 14 times. This triangular review pattern (implementer peers + senior gatekeeper) is a model for how large features can ship without bottlenecking on a single reviewer.
The foundational PR #21408 (ValueTimestampHeaders) by frankvicky generated 96 review comments. aliehsaeedii's review probed thread safety: "What if multiple threads see that (headers == null && rawHeaders != null) and they all do the deserialization here?" and correctness: "if headers==null, then every call to headers() creates a new RecordHeaders() instance. This violates the expected behavior." High probing ratio on foundational code, exactly as it should be.
8. The Kafka 4.2.0 Release: Four Release Candidates
The Kafka 4.2.0 release required four release candidates between January 23 and February 16, 2026, the most visible governance event of the period. Each RC failure meant a blocking bug was found:
- RC1 (Jan 23): Started by Christo Lolov
- RC2 (Jan 27-30): Issues with KAFKA-16505, KAFKA-19571
- RC3 (Feb 2-9): Blocked by KAFKA-20115 (group coordinator metadata unload failure, fixed by brandboat in PR #21396)
- RC4 (Feb 10-16): Finally passed with 3 binding +1 votes
The 24-day, 4-RC cycle for a major release is notable. Cross-referencing with GitHub: the KAFKA-20115 fix that unblocked RC3->RC4 was brandboat's group coordinator metadata unload fix, which required 43 discussion items and was reviewed by dajac (13 reviews) and chia7712 (7 reviews). This is a case where a relatively junior contributor (brandboat had ~12 PRs total across Q4) shipped a release-blocking fix under intense review pressure.
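The 24-day figure follows directly from the RC dates listed above. A minimal sketch of the date arithmetic, assuming RC1 is represented by its single listed date (January 23) because no end date is given for it:

```python
from datetime import date

# (start, end) per release candidate, from the list above.
# RC1's end date is not stated; using its start date is an assumption.
rc_windows = {
    "RC1": (date(2026, 1, 23), date(2026, 1, 23)),
    "RC2": (date(2026, 1, 27), date(2026, 1, 30)),
    "RC3": (date(2026, 2, 2), date(2026, 2, 9)),
    "RC4": (date(2026, 2, 10), date(2026, 2, 16)),
}

cycle_start = min(start for start, _ in rc_windows.values())
cycle_end = max(end for _, end in rc_windows.values())
cycle_days = (cycle_end - cycle_start).days

print(f"{len(rc_windows)} RCs over {cycle_days} days")
```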
9. Role Trajectories: Who Changed
Several contributors underwent significant role shifts over the 12-month period:
frankvicky: Shifted from active author (7-10 merged PRs/month in Q1) to pure reviewer (0 merged, 34 reviews in June) and then back to heavy authoring in February 2026 (10 merged for KIP-1271). This oscillation between authoring and reviewing phases is a healthy pattern.
DL1231: The steepest growth curve. From 0 merged PRs in April to 15 in December. The chia7712 mentorship investment (12->21->41 reviews) correlates directly with DL1231's output escalation. By December, DL1231 was the most prolific author, with 14 of 15 PRs being stewardship work (module migration, test cleanup, remote storage refinement). A new stewardship leader emerging.
dajac: From 19 merged PRs in March (release manager role for 4.0) to near-zero in Q3, back to 14 in January 2026 (group coordinator refactoring), then down to 3 in February. Burst contributor pattern aligned with release cycles and coordinator work.
m1a2st: Authoring output held at 10 merged PRs in both March and September, while reviews given fell from 41 to 14. The shift from heavy reviewer to balanced author-reviewer reflects increasing seniority and ownership.
AndrewJSchofield: Extraordinary November (16 authored + 92 reviewed), sharply declining to 2 authored + 20 reviewed in December. The data cannot distinguish end-of-year burnout from a planned cooldown.
10. AI-Assisted Review Emergence
The copilot-pull-request-reviewer[bot] grew from 1-4 reviews per month in Q1 to 23 reviews in September, primarily reviewing lucasbru's and aliehsaeedii's PRs. A separate "Copilot" account posted 61 review comments in September 2025, all polishing-type: typo corrections, grammar, missing spaces. Zero probing or directing comments.
By Q4, the bot was reviewing 13-20 PRs per month. This is the earliest visible adoption of AI-assisted code review in the project, and it's notable that the comments are exclusively surface-level. The review economy's real work, the probing questions about concurrency, backward compatibility, and edge cases, remains entirely human.
Cross-Source Analysis
Jira-to-GitHub Pipeline
The Jira data (30-day window, Jan-Feb 2026) reveals the project's planning-to-shipping dynamics:
- 308 Jira issues, 80 resolved in-window
- Only 26% of assigned issues have linked PRs (code actually exists)
- 74% are planned but not yet coded
- 77 issues are assigned, still open, with no visible activity from the assignee
Chia-Ping Tsai's delegation network: 36 tickets created, 34 delegated to others. He created the tickets, assigned them to his mentorship network, reviewed the resulting PRs on GitHub, and closed the tickets on Jira. This is the full lifecycle of directed engineering work, visible only when you triangulate across three platforms.
Net ticket creators vs. net shippers:
| Creator (drives backlog) | Created | Resolved | Net |
|---|---|---|---|
| Chia-Ping Tsai | 36 | 2 | +34 |
| Matthias J. Sax | 17 | 1 | +16 |
| Mickael Maison | 14 | 2 | +12 |

| Shipper (executes work) | Resolved | Created | Net |
|---|---|---|---|
| David Jacot | 5 | 4 | +1 |
| TaiJuWu | 5 | 1 | +4 |
| TengYao Chi | 4 | 1 | +3 |
| Lan Ding (DL1231) | 4 | 1 | +3 |
The people shaping the backlog are a largely different set from the people shipping the code. This is not a dysfunction; in Apache's governance model, this split between PMC members who direct and committers/contributors who implement is by design.
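The Net columns in the two tables above are simple differences between tickets created and tickets resolved. The sketch below recomputes them from the table values and labels each person. The `classify` helper and its role names are illustrative assumptions, not categories defined by Jira.

```python
# (created, resolved) per person, copied from the two tables above.
jira_activity = {
    "Chia-Ping Tsai": (36, 2),
    "Matthias J. Sax": (17, 1),
    "Mickael Maison": (14, 2),
    "David Jacot": (4, 5),
    "TaiJuWu": (1, 5),
    "TengYao Chi": (1, 4),
    "Lan Ding (DL1231)": (1, 4),
}


def classify(created: int, resolved: int) -> str:
    """Label a person by whether they create or resolve more tickets."""
    if created > resolved:
        return "creator"
    if resolved > created:
        return "shipper"
    return "balanced"


roles = {name: classify(c, r) for name, (c, r) in jira_activity.items()}
nets = {name: abs(c - r) for name, (c, r) in jira_activity.items()}

for name in jira_activity:
    print(f"{name:<20} {roles[name]:<8} net +{nets[name]}")
```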
Mailing List Governance
In a single 30-day window (late Jan to mid-Feb 2026):
- 10 KIPs passed formal votes
- Median time from discussion to vote closure: ~11 days
- The Kafka 4.2.0 release required 4 RCs over 24 days
- Mickael Maison volunteered as 4.3.0 release manager
Voting power concentration: Andrew Schofield and Lucas Brutschy each cast 6 binding votes (the highest), followed by Chia-Ping Tsai (5) and Matthias J. Sax (4). These four plus David Jacot and Lianet Magrans form the effective governance core.
Stalled KIP: KIP-1270 (ProcessExceptionalHandler for GlobalThread) accumulated 17 discussion messages over 27 days but zero binding votes. The proposer (Arpit Goyal) cast 6 votes, all non-binding. This KIP needs a committer champion or it will die.
The Full Lifecycle Example
KIP-1271 (Record Headers in State Stores) provides the clearest end-to-end lifecycle visible across all three data sources:
- Mailing list: Proposed, discussed (8 messages), voted (passed with 3 binding +1 from Matthias J. Sax, Lucas Brutschy, Alieh Saeedi) in 9 days
- Jira: Tickets created (KAFKA-20121, KAFKA-20132, etc.) and assigned to implementers
- GitHub: 40 PRs shipped in February 2026, led by frankvicky, aliehsaeedii, and bbejeck, with mjsax providing 379 review comments as architectural gatekeeper
This is a healthy pipeline. The discussion-to-code lag was under 30 days. The review intensity was high (554 review comments across the feature). The cross-review pattern between implementers prevented single-reviewer bottlenecks.
Community Health Dimensions
Newcomer Welcoming
The Jira newbie label appears on 9 issues, indicating an active onboarding pipeline. On GitHub, chia7712's broad review coverage (67 unique authors in March alone) means nearly every new contributor receives review attention from the project's most experienced reviewer. This is a strength.
Interaction Breadth
| Contributor | Unique Interaction Partners (peak month) | Profile |
|---|---|---|
| chia7712 | 67+ | Broadest reach of anyone |
| FrankYang0529 | 30 | Broad community connector |
| lucasbru | 25+ | Cross-subsystem |
| m1a2st | 24 | Growing breadth |
| AndrewJSchofield | 20+ | Share groups domain |
Contributors working in isolation (0 reviews given): jim0987795064 (16 merged Q3), RaidenE1 (9 merged Q3), shashankhs11 (7 merged Sep). These contributors consume review capacity without contributing back. Not necessarily a problem at small scale, but worth monitoring.
Net Reviewer Ratio (12-Month)
The project's review load-bearers:
| Reviewer | Estimated 12-Mo Reviews | PRs Authored | Ratio |
|---|---|---|---|
| chia7712 | ~2,100+ | <10 | >200:1 |
| junrao | ~150+ | 0 | pure |
| mjsax | ~500+ | ~40 | 12:1 |
| AndrewJSchofield | ~400+ | ~50 | 8:1 |
| lucasbru | ~200+ | ~70 | 3:1 |
| frankvicky | ~300+ | ~40 | 7:1 |
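The Ratio column can be recomputed from the two count columns. This is a rough sketch using the table's approximate figures as point values: "~2,100+" is treated as 2100 and "<10" as 10, so the chia7712 result is a lower bound. The `review_ratio` helper is hypothetical.

```python
from typing import Optional

# (estimated 12-month reviews, PRs authored), taken from the table above.
# Approximate values ("~", "+", "<") are treated as exact for illustration.
review_load = {
    "chia7712": (2100, 10),
    "junrao": (150, 0),
    "mjsax": (500, 40),
    "AndrewJSchofield": (400, 50),
    "lucasbru": (200, 70),
    "frankvicky": (300, 40),
}


def review_ratio(reviews: int, authored: int) -> Optional[float]:
    """Reviews per authored PR, or None for a pure reviewer (no authored PRs)."""
    if authored == 0:
        return None
    return reviews / authored


ratios = {name: review_ratio(*counts) for name, counts in review_load.items()}

for name, ratio in ratios.items():
    label = "pure reviewer" if ratio is None else f"{ratio:.1f}:1"
    print(f"{name:<18} {label}")
```

With these inputs the sketch gives 12.5 for mjsax and 7.5 for frankvicky, which the table reports as 12:1 and 7:1.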
Consistency
The most consistent contributors (active every month for 12 months):
- chia7712, m1a2st, lucasbru, AndrewJSchofield, DL1231, mingyen066, mjsax
Burst contributors (high intensity, intermittent):
- dajac (release-cycle aligned), smjn (feature-aligned), sjhajharia (sustained burst Q3)
Dashboard vs. Reality
| What a Dashboard Shows | What Actually Happened |
|---|---|
| chia7712: fewer than 10 PRs authored | Reviewed 2,100+ PRs, mentored 7+ contributors, created 36 Jira tickets, cast 5 binding KIP votes, functioned as the project's architectural gravity center |
| DL1231: 15 PRs in December | Grew from 0 to 15 PRs/month over 8 months under chia7712's deliberate mentorship (12->21->41 reviews); 93% stewardship work holding the codebase together |
| sjhajharia: no feature PRs | Systematically cleaned every module in the project; net deleted thousands of lines of dead code |
| junrao: 0 PRs authored, ~150 reviews | Up to 8 comments per review (August 2025); a single comment on PR #21065 drove the architecture for a critical producer corruption fix |
| frankvicky: 0 PRs in June 2025 | Shifted entirely to pure reviewer role (34 reviews); came back in Feb 2026 to ship 10 PRs for KIP-1271 |
| cadonna: disappeared from stats in July | His March 2025 client-side KIP-1071 work was architecturally foundational; his departure was absorbed but never replaced 1:1 |
| 73% stewardship ratio | The project is stabilizing after shipping three concurrent protocol rewrites; this is health, not stagnation |
| Kafka 4.2.0: "released Feb 2026" | Required 4 release candidates over 24 days; a junior contributor (brandboat) shipped the release-blocking fix under intense review |
| KIP-1271: "40 PRs merged" | Three implementers cross-reviewing each other + senior gatekeeper; 554 review comments; the highest-concentration feature sprint of the year |
| clolov: 91 PRs in Q2 | Every single one was test modernization (Mockito migration, JUnit 5). Zero fanfare, zero feature PRs, zero visibility. The project's test infrastructure exists because of this person. |
Analysis period: March 1, 2025 through February 28, 2026. Data sources: GitHub API (apache/kafka), Jira (KAFKA project), dev@kafka.apache.org mailing list. All statistics reflect date-filtered activity within each monthly window. Claims cite specific PRs, tickets, or mailing list threads where available.