Apache Airflow Engineering Digest
Jan 23 – Feb 22, 2026 · GitHub PRs + PR Complexity Scores + GitHub Issues + dev@ mailing list
The headline
Jarek Potiuk runs this project. Not in the sense that he has the most complex code, but in the sense that he simultaneously operates as the top code shipper (63 PRs merged), the top reviewer (412 reviews), the top mailing list participant (60 messages), a top issue closer (25 issues closed, 18 other people's), the release manager for Airflow 2.11.1, and the primary enforcer against AI-generated spam PRs. No other contributor comes close on any single dimension, let alone all six. The complexity data adds nuance: only 1 of his 16 scored PRs was high-complexity. His impact comes from breadth and volume across every surface of the project.
The more surprising finding is what's happening below him. Ash Berlin-Taylor, who a dashboard would show as completely inactive (0 PRs, 0 lines, 0 issues, 0 mailing list messages), reviewed 148 PRs with 214 review comments. He approved only 7 times out of 148 reviews. His reviews target the project's hardest architectural PRs: 16 of the PRs he reviewed scored high-complexity (>= 0.5), including the top-scoring PR in the entire project (#58992, serde migration, 0.772). He is Airflow's architectural conscience, and he almost never says yes.
Meanwhile, Shahar Epstein, who a PR dashboard would show as a minor contributor (2 merged PRs, 238 lines), is the project's most prolific issue triager. She closed 39 issues this month, 37 filed by other people, commented 70 times across 43 issues, gave 240 PR reviews, cast 5 binding votes, and was promoted to PMC member during this period. She is the second-most-important person in the project, visible only when you cross-reference all four sources.
The complexity data reveals Airflow's real architecture of effort: 1,595 PRs were active this month, but only 548 had review comments. Of 4,692 classified review comments, 11% were probing (reviewer uncertainty), 80% directing, and 8% polishing. Probing ratio correlates 0.352 with review rounds, weaker than Kafka's 0.449, suggesting Airflow's review culture is more directive ("change this") and less interrogative ("should we?").
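The probing-vs-rounds correlation quoted above is a standard Pearson coefficient. A minimal sketch, with toy data standing in for the real per-PR figures (the actual dataset is not reproduced here):

```python
from math import sqrt
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Toy per-PR values, NOT the real Airflow data: probing ratio vs. review rounds.
probing_ratio = [0.11, 0.43, 0.21, 0.63, 0.22]
review_rounds = [5, 22, 44, 7, 23]
r = pearson(probing_ratio, review_rounds)
```

A value near 0.35, as reported for Airflow, means probing comments only loosely predict longer review cycles; Kafka's 0.449 suggests a tighter coupling there.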
Ash Berlin-Taylor: The architectural conscience
GitHub: 0 PRs merged · 148 reviews · 214 review comments · 0 issue comments · 0 lines
Mailing list: 0 messages
Zero code. Zero issues. Zero mailing list presence. His entire contribution is 214 review comments, and they are uniformly the highest-quality feedback in the dataset. He approved 7 times out of 148 reviews, a 5% approval rate.
The complexity data makes his role precise: 16 of the PRs he reviewed scored >= 0.5 complexity. He is not randomly distributed across the project; he concentrates on the hardest work. His top reviewee is amoghrajesh (33 reviews), who has the highest average complexity (0.400) among prolific contributors. ashb is essentially amoghrajesh's dedicated architectural reviewer.
On PR #58992 (move serde to task SDK, the highest-complexity PR at 0.772, probing ratio 0.43): he questioned API impact and deserialization behavior across the SDK boundary. On PR #57744 (move Config Parser to shared library, 0.684): he gave 6 detailed reviews, covering re-export patterns and dead-comment removal, and questioned why tests needed changing in a pure refactor. On PR #53821 (write-to-ES, 207 days to merge): "This can't? shouldn't? work in Airflow 3-- tasks logs are uploaded in the worker, and workers no long have access to the Database." He provided 12 reviews and 22 comments on this single PR, guiding it through fundamental Airflow 3 logging architecture concerns.
On PR #59764 (dag clear only_new): "I don't think we should ever change the version of an existing Dag Run. I'm fairly sure that will break a number of core assumptions."
On PR #57778 (task timeouts): "But let me be crystal clear: This is wrong. UP_FOR_RETRY is not a running state. It is a finished state, the task process is no longer running when this state is reached."
His most distinctive behavior: he almost never approves. Of 148 reviews: 133 COMMENTED, 7 CHANGES_REQUESTED, 7 APPROVED, 1 DISMISSED. He operates as a pure architectural advisor who leaves formal approval to other maintainers. A dashboard would show ashb as inactive. In practice, he is preventing design mistakes in the core scheduler, task runner, and logging architecture.
Jarek Potiuk: The everything contributor
GitHub PRs: 63 merged · 412 reviews · 179 review comments · 242 issue comments · 29,000+/16,000- lines
GitHub Issues: 3 opened · 25 closed (18 other people's) · 69 comments across 56 issues
Mailing list: 60 messages · 8 threads started · 32 threads participated · 12 votes (5 binding)
Complexity: 16 PRs scored, avg 0.223 · 1 high-complexity · 12 low-complexity
The complexity data reframes Potiuk's contribution. His 63 merged PRs include release synchronizations (+11,290/-7,021 in a single sync PR), dependency cleanup, SPDX licensing, CI changes, and docs updates. This is the mechanical work that keeps the project running. Only 1 of 16 scored PRs crossed the high-complexity threshold (#58825, pre-K hook imports check, 0.512).
His 412 reviews are broadly distributed: amoghrajesh (46), jscheffl (39), xBis7 (12), Arunodoy18 (8). He approves 60% of the time (245 APPROVED vs 14 CHANGES_REQUESTED), the highest approval rate among top reviewers. He is the final gatekeeper who unblocks frequently.
On the mailing list, he drove the entire Airflow 2.11.1 release lifecycle, started the most contentious discussion (AI-slop prevention, 14 messages, 7 participants), and raised security concerns about task provenance in AIP-67. His review comments enforce quality standards on GitHub: on PR #59407, "Please. If you use agents, do not commit this reasoning as part of your PR." On PR #62108: "We reported you to Github for Inauthentic / Spam activity."
He is simultaneously the project's builder, reviewer, release manager, governance participant, spam enforcer, and community organizer. The complexity data clarifies that his building is maintenance-grade: dependency updates, release syncs, CI fixes. That is exactly what the project needs from him, and nobody else is doing it at this scale.
The most complex work
The complexity scoring classifies every review comment as probing (reviewer uncertain), directing (reviewer knows the fix), or polishing (nits). The top PRs by composite score:
| PR | Title | Author | Score | Probing | Rounds | Comments | Pattern |
|---|---|---|---|---|---|---|---|
| #58992 | Move serde to task SDK | amoghrajesh | 0.772 | 43% | 22 | 23 | Core refactor, SDK boundary |
| #57354 | Upgrade email notifications to SmtpNotifier | amoghrajesh | 0.733 | 33% | 30 | 18 | Breaking change concern |
| #55139 | Move DagBag to task SDK | amoghrajesh | 0.716 | 21% | 44 | 24 | Foundational abstraction move |
| #61310 | Thread-safe auth manager init | stegololz | 0.700 | 63% | 7 | 8 | Unresolved, never merged |
| #60410 | YAML connection form metadata | amoghrajesh | 0.689 | 22% | 23 | 32 | JSONSchema pushback |
| #57744 | Move Config Parser to shared lib | amoghrajesh | 0.684 | 21% | 26 | 19 | Core refactor |
amoghrajesh dominates the complexity leaderboard. 4 of the top 6 highest-complexity PRs are his. These are the core Airflow 3 refactors: moving serde, DagBag, and the Config Parser to the task SDK. ashb reviewed all of them. The amoghrajesh-to-ashb pipeline is where Airflow's hardest architectural work happens.
PR #61310 (thread-safe auth manager initialization, probing ratio 63%) is the only top PR that never merged. Multiple reviewers (jscheffl, pierrejeambrun) questioned unrelated changes included in the PR, and the design questions about thread-safety remained unresolved. The highest probing ratio in the top 10, and the only one that didn't land.
PR #60410 shows ashb's review quality. amoghrajesh initially reimplemented JSONSchema validation; ashb pushed back, and jscheffl questioned unnecessary container classes. amoghrajesh responded by integrating JSONSchema directly. The review made the implementation fundamentally better.
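The probing/directing/polishing taxonomy described above can be sketched as a simple classifier feeding a composite score. The real Canopy classifier and its score weights are not published; the keyword lists, the `max_rounds` normalizer, and the 50/50 weighting below are purely illustrative assumptions:

```python
# Hypothetical keyword heuristics -- illustrative only, not Canopy's model.
PROBING = ("should we", "why", "what if", "not sure", "wonder")
POLISHING = ("nit", "typo", "rename", "spacing")

def classify(comment: str) -> str:
    """Bucket a review comment as probing, directing, or polishing."""
    text = comment.lower()
    if any(k in text for k in POLISHING):
        return "polishing"
    if any(k in text for k in PROBING):
        return "probing"
    return "directing"  # default: reviewer states a concrete fix

def composite_score(comments, rounds, max_rounds=50):
    """Blend probing ratio with normalized review rounds into a 0-1 score."""
    labels = [classify(c) for c in comments]
    probing_ratio = labels.count("probing") / len(labels) if labels else 0.0
    rounds_norm = min(rounds / max_rounds, 1.0)
    return 0.5 * probing_ratio + 0.5 * rounds_norm
```

Under this sketch, a PR like #58992 (probing ratio 0.43, 22 rounds) scores high because both ingredients are elevated, which matches the pattern in the table.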
Contributor complexity profiles
The complexity data distinguishes who does hard work from who does a lot of work:
| Contributor | PRs scored | Avg complexity | High (>=0.5) | Low (<0.3) | Pattern |
|---|---|---|---|---|---|
| xBis7 | 4 | 0.618 | 3 | 0 | Consistently hard: OTel, scheduler |
| johnhoran | 3 | 0.530 | 2 | 0 | K8s Pod Operator specialist |
| Lee-W | 5 | 0.500 | 3 | 0 | Partitioned assets, timetables |
| guan404ming | 6 | 0.463 | 3 | 1 | AssetGraph, dag run filtering |
| dabla | 8 | 0.441 | 4 | 2 | GenericTransfer, deferrable ops |
| amoghrajesh | 34 | 0.400 | 12 | 15 | Core refactoring + routine providers |
| potiuk | 16 | 0.223 | 1 | 12 | High volume, maintenance-focused |
| jscheffl | 13 | 0.192 | 1 | 11 | Maintenance + Edge UI plugin |
| Arunodoy18 | 16 | 0.191 | 0 | 14 | Routine contributions |
xBis7 has the highest average complexity (0.618) with zero low-complexity PRs. All work focused on OTel metrics and scheduler query optimization. PR #56150 (OTel env variables) took 84 review rounds and 108 comments over 143 days. jason810496 gave 6 reviews covering code reuse, error types, and module placement.
amoghrajesh is the most interesting profile: 34 scored PRs, 12 high-complexity and 15 low-complexity. The split reveals two distinct modes. Core Airflow 3 refactoring (serde migration, config parser moves) generates deep architectural review. Routine provider YAML metadata work passes quickly. He is simultaneously the project's deepest architectural contributor and a volume producer of mechanical changes.
potiuk and jscheffl, the project's two most active maintainers by review count, both focused their authored PRs this month on maintenance and tooling (avg 0.223 and 0.192). Their highest-impact work came through reviews, release verification, and governance rather than through the PRs they wrote.
The four reviewers
| Reviewer | Reviews | Comments | Approval rate | Top reviewee | Pattern |
|---|---|---|---|---|---|
| Potiuk | 412 | 179 | 60% | amoghrajesh (46) | High-volume gatekeeper, unblocks frequently |
| jscheffl | 317 | 210 | 49% | potiuk (29) | Balanced maintainer, mutual review with Potiuk |
| jason810496 | 241 | 204 | 34% | Owen-CH-Leung (21) | Review-heavy mentor, files 26 issues from review |
| ashb | 148 | 214 | 5% | amoghrajesh (33) | Architectural conscience, almost never approves |
Potiuk approves 60% of reviews, the highest rate, indicating a gatekeeper who actively unblocks. ashb approves 5%, the lowest, indicating an advisor who challenges but doesn't gate. jscheffl sits in between (49%), reviewing potiuk's work as often as potiuk reviews his, creating a mutual accountability loop.
jason810496 (Zhe-You Liu) is the second-most-prolific reviewer (241) with only 6 PRs opened. His reviews are 158 COMMENTED vs 82 APPROVED, and he also filed 26 issues this month (more than anyone else), discovering bugs through his review work and filing them systematically. He operates as a mentoring/quality layer.
Shahar Epstein (shahar1) rounds out the top 5 with 240 reviews. Her 21 CHANGES_REQUESTED (the highest count of any reviewer) shows she formally blocks PRs more than anyone else; only ashb approves less readily. Combined with her 39 issues closed and 5 binding votes, she operates at every layer: issue triage, PR review, and governance.
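The approval rates in the table fall directly out of GitHub's review states. A minimal sketch, using ashb's split from the digest (the list construction is illustrative; real data would come from the GitHub API):

```python
from collections import Counter

def approval_rate(review_states):
    """Share of a reviewer's reviews that ended in APPROVED."""
    counts = Counter(review_states)
    total = sum(counts.values())
    return counts["APPROVED"] / total if total else 0.0

# ashb's split from the digest: 133 COMMENTED, 7 CHANGES_REQUESTED,
# 7 APPROVED, 1 DISMISSED -> roughly 5% approval.
ashb = (["COMMENTED"] * 133 + ["CHANGES_REQUESTED"] * 7
        + ["APPROVED"] * 7 + ["DISMISSED"])
rate = approval_rate(ashb)
```

The same computation on potiuk's 245 APPROVED out of 412 gives the 60% figure quoted above.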
The AI-slop war
Airflow is actively fighting AI-generated spam, visible across both GitHub and the mailing list.
On the mailing list: Potiuk's "[DISCUSS] Stop assigning unknown contributors to issues (AI-slop prevention)" thread (14 messages, 7 participants) proposed removing the ability for unknown contributors to self-assign issues. The discussion produced a compromise: "soft assignments" that decay over time. Aritra Basu initially opposed, then came around. Damian Shaw proposed automatic unassignment.
On GitHub: Potiuk's review comments show active enforcement. PR #62108 was reported to GitHub for inauthentic spam. PR #61017 was flagged as "a completely hallucinated PR." PR #60901 was identified as "clearly AI generated - including the response."
The bot layer itself: GitHub Copilot generates 104 review comments (more than most human maintainers) but has zero formal review authority and never approves anything. The project uses AI as a supplementary reviewer while actively removing AI-generated human contributions. The irony is quantifiable: the AI reviewer produces more comments than 490 of 496 human contributors.
Release engineering
| Release | RCs | Result | Key detail |
|---|---|---|---|
| Airflow 3.1.7 | 2 | Shipped | RC1 had issues; accelerated RC2 with 6 binding +1 |
| Airflow 2.11.1 | 1 (rc2) | Shipped | Potiuk drove multi-week effort. 6 binding + 5 non-binding |
| Helm Chart 1.19.0 | 3 | Shipped | Jens found deployment issues in rc1. rc2 couldn't reach quorum. rc3 passed |
| Providers (Jan 27) | n/a | Shipped | Vincent Beck's first release as RM. 7 binding +1 |
| Providers (Feb 10) | n/a | Shipped | Shahar Epstein as RM. 5 binding +1 |
| AirflowCTL 0.1.2 | 1 | Stalled | Only 1 binding +1 (Jens). Bugra voted 0 (abstain) |
| AIP-94 | n/a | Passed | Decouple remote commands to airflowctl. 4 binding +1 |
The Helm Chart release is the story. Three RC cycles over 18 days. Jens Scheffler's deployment testing (Docker installs, kind clusters, Edge Executor integration, cryptographic signature verification, SVN checksums) caught issues that other voters missed. The rc2 vote failed not because anyone objected; it simply never attracted enough binding voters. Governance bottleneck: the release was ready, but the quorum couldn't be reached.
Top binding voters: Jens Scheffler (6), Bugra Ozturk (6), Shahar Epstein (5), Jarek Potiuk (5), Amogh Desai (4), Rahul Vats (4).
The issue layer
662 issues had activity this month. 216 opened, 248 closed. The label distribution tells the story:
| Label | Issues | Signal |
|---|---|---|
| kind:bug | 344 | Bugs dominate 2:1 over features |
| area:core | 271 | Core scheduler/executor is where the hard problems are |
| kind:feature | 182 | Feature requests filed but lag behind bug fixes |
| good first issue | 150 | Large onramp, but many sit unclaimed |
| needs-triage | 138 | 21% still need initial classification |
Who triages vs who files:
| Contributor | Opened | Closed | Comments | Signal |
|---|---|---|---|---|
| jason810496 | 26 | 12 | 29 | Top opener, files issues from review work |
| shahar1 | 5 | 39 | 70 | Top closer, triages others' issues |
| potiuk | 3 | 25 | 69 | Issue-level maintainer |
| vincbeck | 13 | 16 | 15 | Full-cycle on auth issues |
| amoghrajesh | 7 | 16 | 39 | Balanced: files and triages |
Shahar barely files issues (5) but closes 8x as many (39), almost all filed by other people. jason810496 files 26 issues while giving 241 PR reviews, discovering bugs through review and filing them systematically.
The longest-running and most-debated PRs
| PR | Author | Days/Rounds | Score | Status | Story |
|---|---|---|---|---|---|
| #45931 Secrets backends order | moiseenkov | 396+ days, 34 reviews | n/a | Still open | Nobody can agree on the approach. Crowiant provided 16 of 34 reviews |
| #53821 Write-to-ES fix | Owen-CH-Leung | 207 days, 69 rounds | 0.636 | Merged | ashb (12 reviews, 22 comments) + jason810496 (21 reviews) shepherded it through Airflow 3 logging concerns. Community users pressed for merge due to upgrade breakage |
| #56150 OTel env variables | xBis7 | 143 days, 84 rounds | 0.664 | Merged | 108 total comments. jason810496 drove review on module placement and error types |
| #53216 Version change indicators | choo121600 | 224+ days, 25 reviews | n/a | Still open | 77 comments, still iterating |
| #57610 Informatica provider | cetingokhan | 106 days, 23 reviews | 0.573 | Merged | First provider under AIP-95. jscheffl: "then you could be the first provider here following the new contribution model" |
The complexity scores separate these long-running PRs into two categories. PR #56150 (OTel, 0.664) and PR #53821 (write-to-ES, 0.636) were genuinely complex: high probing ratios, architectural debate, multiple reviewers disagreeing. PR #57610 (Informatica, 0.573) was process-complex: the first provider under a new governance model, where the review was about onboarding procedures as much as code quality.
Provider ecosystem
185 of 1,595 PRs (12%) touch provider code. The landscape:
- Google Cloud: 36 PRs. BigQuery lineage, Cloud Batch hooks, GKE operators, GCS sensors
- Amazon: 25 PRs. ECS/EKS system tests, SES email, Glue catalog
- FAB: 24 PRs. Yarn-to-pnpm migration (+6,181/-9,378 by jscheffl), auth manager
- Kubernetes: 16 PRs. Pod operator, secrets backend. johnhoran's work (avg complexity 0.530) is the hardest provider work in the dataset
- Edge: 15 PRs. Edge executor, multi-team (AIP-67)
- OpenLineage: 7 PRs. kacpermuda authored all 6 merged. He is the lineage domain owner
- New: Informatica. AIP-95, first provider under steward/contributor model. 106 days, 23 reviews
AIP governance
| AIP | Subject | Status | Key detail |
|---|---|---|---|
| AIP-94 | Decouple remote commands to airflowctl | Passed | 4 binding +1 (Jens, Potiuk, Dheeraj, Shahar) |
| AIP-95 | New provider: Informatica | Merged | First under new governance model. 15 messages, 9 participants |
| AIP-67 | Execution API access control for multi-team | Active debate | Potiuk: "no protection against the tasks making claims that they belong to." Security question unresolved |
| AIP-100 | Eliminate scheduler starvation on concurrency limits | Early | asquator proposed, Jens responded. Low engagement |
The AIP-67 debate is the most architecturally significant. Vincent Beck proposed spawn-based isolation. Potiuk rejected it: "does not change much and is not good for performance." He called for original thinking from Ash, Kaxil, and Amogh. The multi-team feature will ship as experimental in Airflow 3.2, with this security question still unresolved.
The complexity vocabulary
Airflow's review comments reveal what reviewers worry about most:
| Concern | Mentions | Signal |
|---|---|---|
| Breaking change | 33 | Airflow 3 migration creates constant backward-compatibility anxiety |
| Backward compatibility | 18 | Same concern, different phrasing (51 total mentions of compatibility) |
| Race condition | 4 | Thread safety in auth manager init, scheduler |
"Breaking change" appears 33 times across 548 scored PRs, nearly 3x Kafka's 12 mentions. Airflow is in the middle of a major version migration (2.x to 3.x), and almost every core PR triggers compatibility questions. This is the project's dominant complexity concern.
Cross-source view
| Contributor | PR reviews | PRs merged (complexity) | Issues closed | ML msgs | ML binding | Role |
|---|---|---|---|---|---|---|
| Potiuk | 412 | 63 (avg 0.223, 1 high) | 25 | 60 | 5 | Breadth across every surface |
| ashb | 148 | 0 | 0 | 0 | 0 | Architectural review, GitHub only |
| jscheffl | 317 | 32 (avg 0.192, 1 high) | 11 | 20 | 6 | Verification + governance |
| shahar1 | 240 | 2 | 39 | 20 | 5 | Triage across all sources |
| jason810496 | 241 | 2 | 26 opened, 12 closed | 6 | 0 | Review + issue discovery |
| amoghrajesh | 144 | 42 (avg 0.400, 12 high) | 16 | 18 | 4 | Hardest code + governance |
| pierrejeambrun | 202 | 20 | 13 | 0 | 0 | UI gatekeeper, GitHub only |
| vincbeck | 188 | 15 | 16 | 4 | 0 | Dependency manager + auth |
| Nataneljpwd | 81 | 3 | 0 | 0 | 0 | Mentor: Spark/K8s/Databricks |
| Copilot | 0 | 0 | 0 | 0 | 0 | 104 review comments, no authority |
The table reveals that amoghrajesh is the only person who ships high-complexity code AND participates in governance. Potiuk and jscheffl ship volume but their authored PRs this month were maintenance-focused. ashb reviews the hardest work but ships nothing. Shahar triages everything but codes little. The project's hardest architectural work flows through one person (amoghrajesh) reviewed by one other person (ashb), a two-person pipeline for the most consequential changes.
What a dashboard would show vs. what actually happened
| Dashboard says | What actually happened |
|---|---|
| potiuk: 63 PRs, 29,000 lines | Also 412 reviews, 25 issues closed, 60 mailing list messages, drove the 2.11.1 release, and actively reports AI-spam accounts. But only 1 of 16 scored PRs was high-complexity. His impact comes from covering every surface of the project at scale. |
| ashb: 0 PRs, 0 lines | 148 reviews, 214 review comments, 5% approval rate. 16 of his reviewed PRs scored high-complexity. He reviewed the serde migration, Config Parser move, write-to-ES feature, and every other hard architectural PR. Never says yes. |
| jason810496: 6 PRs, 241 reviews | Also filed 26 issues (the most of anyone). Discovers bugs through review and files them systematically. 158 COMMENTED vs 82 APPROVED: mentors more than he gates. |
| Shahar Epstein: 2 PRs, 238 lines | 39 issues closed (37 others'), 240 PR reviews, 20 mailing list messages, 5 binding votes. The project's #1 issue triager, promoted to PMC during this period. Only visible when you cross-reference all four sources. |
| amoghrajesh: 42 PRs merged | 12 of 34 scored PRs were high-complexity (the most of anyone). Owns the serde migration, SmtpNotifier upgrade, Config Parser move, and YAML metadata. ashb reviewed him 33 times. The project's deepest architectural contributor. |
| jscheffl: 32 PRs, 317 reviews | 6 binding votes (tied highest), 20 mailing list messages. Caught Helm Chart rc1 deployment issues that other voters missed. His 13 scored PRs this month were primarily maintenance and tooling; his highest-impact work was in review and release verification. |
| xBis7: 4 PRs | Highest average complexity (0.618) of any contributor. Zero low-complexity PRs. The OTel env variables PR took 84 rounds and 143 days. Quietly doing the hardest work in the project. |
| Copilot (AI): 0 PRs | 104 review comments, more than 490 of 496 human contributors. Never approves anything. The project uses AI to supplement review while removing AI-generated human contributions. |
| Helm Chart 1.19.0: shipped | Three RCs. rc2 failed not because anyone objected; it couldn't attract enough binding voters. Governance bottleneck. |
| 1,595 PRs active | Only 548 had review comments. 11% probing, 80% directing, 8% polishing. Breaking change (33 mentions) and backward compatibility (18) dominate: the Airflow 3 migration drives complexity. |
Generated by Canopy from four sources: GitHub PRs, PR complexity classification scores, GitHub Issues, and the dev@ mailing list. Cross-referenced into a single narrative. Jan 23 – Feb 22, 2026.