Apache Spark Community Health Report

February 14 -- 21, 2026 (7-day window)

Newcomer Welcoming

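| Reviewer | Newcomer PRs Reviewed | Total Reviews |
| --- | --- | --- |
| anishshri-db | 44 | 44 |
| cloud-fan | 27 | 81 |
| dongjoon-hyun | 26 | 68 |
| heyihong | 22 | 22 |
| viirya | 12 | 21 |
| zhengruifeng | 9 | 22 |
| pan3793 | 8 | 10 |
| gengliangwang | 7 | 9 |
| HyukjinKwon | 7 | 40 |
| HeartSaVioR | 7 | 7 |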

All 44 of anishshri-db's reviews went to PRs from newcomers or infrequent contributors in Structured Streaming. cloud-fan (27 of 81) and dongjoon-hyun (26 of 68) review broadly across experience levels.
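The share of each reviewer's attention that goes to newcomers can be read off the table as newcomer PRs reviewed divided by total reviews. The sketch below is illustrative only: the figures are copied from the table above, and the `newcomer_share` helper and the dictionary layout are assumptions, not part of the report's tooling.

```python
# (newcomer PRs reviewed, total reviews), copied from the table above.
reviews = {
    "anishshri-db": (44, 44),
    "cloud-fan": (27, 81),
    "dongjoon-hyun": (26, 68),
    "heyihong": (22, 22),
    "viirya": (12, 21),
    "zhengruifeng": (9, 22),
    "pan3793": (8, 10),
    "gengliangwang": (7, 9),
    "HyukjinKwon": (7, 40),
    "HeartSaVioR": (7, 7),
}


def newcomer_share(newcomer: int, total: int) -> float:
    """Fraction of a reviewer's reviews that went to newcomer PRs (hypothetical helper)."""
    return newcomer / total if total else 0.0


shares = {name: newcomer_share(n, t) for name, (n, t) in reviews.items()}

# Print reviewers from most to least newcomer-focused.
for name, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:15s} {share:.0%}")
```

A share of 100% marks a reviewer who reviewed only newcomer PRs this week; a lower share alongside a high total indicates broad review across experience levels.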

Interaction Breadth

| Contributor | Unique People | Role |
| --- | --- | --- |
| dongjoon-hyun | 19 | Broadest; interacts across all subsystems |
| cloud-fan | 18 | SQL and cross-cutting review |
| HyukjinKwon | 13 | PySpark and infra bridge |
| pan3793 | 12 | Build system and cross-cutting |
| zhengruifeng | 10 | PySpark review and planning |
| Yicong-Huang | 9 | Python serialization cluster |
| gaogaotiantian | 9 | PySpark testing and review |
| szehon-ho | 7 | SQL and data sources |
| HeartSaVioR | 6 | Streaming-focused cluster |
| anishshri-db | 5 | Streaming-only cluster |

dongjoon-hyun and cloud-fan function as connective tissue. anishshri-db and HeartSaVioR form a tight streaming cluster.

Helping vs Self-Promoting (Net Reviewer Ratio)

| Contributor | PRs Authored | PRs Reviewed | Net Reviewer |
| --- | --- | --- | --- |
| cloud-fan | 2 | 81 | +76 (overwhelmingly helping) |
| anishshri-db | 0 | 44 | +44 (pure helper) |
| dongjoon-hyun | 15 | 68 | +52 (mostly helping) |
| HyukjinKwon | 1 | 40 | +39 (mostly helping) |
| gengliangwang | 0 | 9 | +9 (pure helper) |
| mikhailnik-db | 0 | 6 | +6 (pure helper) |
| HeartSaVioR | 3 | 7 | -18 (primarily authoring) |
| holdenk | 1 | 3 | -20 (primarily authoring) |

Strong cadre of pure helpers. cloud-fan's 81:2 review-to-author ratio is extraordinary.

Top Net Reviewers

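| Rank | Contributor | Given | Received | Net |
| --- | --- | --- | --- | --- |
| 1 | cloud-fan | 84 | 8 | +76 |
| 2 | dongjoon-hyun | 69 | 17 | +52 |
| 3 | anishshri-db | 44 | 0 | +44 |
| 4 | HyukjinKwon | 40 | 1 | +39 |
| 5 | zhengruifeng | 26 | 10 | +16 |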

Top 4 net reviewers carry the project's quality burden. Concentration risk if any reduce activity.
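The Net column in the table above is consistent with reviews given minus reviews received. A minimal sketch of that arithmetic follows, using only the five rows shown; the variable names are illustrative assumptions, and the report's actual pipeline is not described in the source.

```python
# (reviews given, reviews received), copied from the Top Net Reviewers table.
top_reviewers = {
    "cloud-fan": (84, 8),
    "dongjoon-hyun": (69, 17),
    "anishshri-db": (44, 0),
    "HyukjinKwon": (40, 1),
    "zhengruifeng": (26, 10),
}

# Net reviewer score: reviews given minus reviews received.
net = {name: given - received for name, (given, received) in top_reviewers.items()}

# Rank from highest to lowest net score.
ranked = sorted(net.items(), key=lambda kv: kv[1], reverse=True)

# Reviews given by the top four, the group the concentration concern refers to.
top4_given = sum(top_reviewers[name][0] for name, _ in ranked[:4])

for rank, (name, score) in enumerate(ranked, start=1):
    print(f"{rank} {name:15s} {score:+d}")
print("Reviews given by top 4:", top4_given)
```

Tracking `top4_given` from week to week would be one way to see whether the review load is spreading out or concentrating further.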

Consistency (Jira Cross-Reference)

Consistently active: dongjoon-hyun (1,087 Jira + 15 PRs + 68 reviews), cloud-fan (622 Jira + 81 reviews), zhengruifeng (619 Jira + 22 reviews).

GitHub-only (implementation-focused): uros-db, AlSchlo, dichlorodiphen.

High Jira, lower GitHub this week: HyukjinKwon (859 assigned, but 1 PR + 40 reviews).

Summary

| Dimension | Strength | Concern |
| --- | --- | --- |
| Newcomer welcoming | Broad coverage from top reviewers | anishshri-db concentrated on few people |
| Interaction breadth | dongjoon-hyun (19) and cloud-fan (18) connect community | Streaming subsystem is a tight cluster |
| Helping vs self-promoting | Exceptional pure helper ratio | Few both author and review at scale |
| Net reviewer ratio | Top 4 carry quality burden | Concentration risk |
| Consistency | Strong Jira cross-reference signal | 7-day window too short for full measurement |
