🎯 Hypothesis Test
"Projects are concentrated similarly across contractors and districts"
Overall Statistics
High-level overview of the dataset analyzed
🎯 Concentration Analysis (HHI)
The Herfindahl-Hirschman Index (HHI) measures market concentration on a scale that tops out at 10,000. District HHI measures how concentrated contractors are within each district: a value of 10,000 means one contractor serves that district exclusively (monopoly risk). Contractor HHI measures how concentrated districts are within each contractor's portfolio: a value of 10,000 means the contractor works in only one district (specialization, not necessarily concerning).
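As a rough illustration of the two HHI views described above, the sketch below computes both from a list of project records. It assumes that shares are based on project counts (the report does not say whether counts or contract values are used), and the `projects` data, the `hhi` helper, and the `hhi_by_group` helper are all hypothetical names introduced for this example.

```python
from collections import Counter, defaultdict


def hhi(counts):
    """HHI on the 0-10,000 scale: sum of squared percentage shares."""
    counts = list(counts)
    total = sum(counts)
    if total == 0:
        raise ValueError("cannot compute HHI of an empty group")
    return sum((100.0 * c / total) ** 2 for c in counts)


def hhi_by_group(records, group_index, member_index):
    """Map each group to the HHI of its members' project shares."""
    tallies = defaultdict(Counter)
    for rec in records:
        tallies[rec[group_index]][rec[member_index]] += 1
    return {group: hhi(c.values()) for group, c in tallies.items()}


# Hypothetical (contractor, district) records, one per project.
# Assumption: shares are project counts, not contract values.
projects = [
    ("A", "North"), ("A", "North"), ("B", "North"), ("C", "North"),
    ("A", "South"), ("D", "South"),
    ("E", "East"), ("E", "East"),
]

# District HHI: how concentrated contractors are within each district.
district_hhi = hhi_by_group(projects, group_index=1, member_index=0)
# Contractor HHI: how concentrated districts are within each portfolio.
contractor_hhi = hhi_by_group(projects, group_index=0, member_index=1)
```

In this toy data, the East district is served by a single contractor, so its district HHI is 10,000, while contractor B works in only one district and therefore has a contractor HHI of 10,000.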
Competitive Market
Moderate Concentration
High Concentration
Key Interpretation
💡 Key Finding
Inequality Analysis (Gini Coefficient)
The Gini coefficient measures inequality in a distribution: 0 is perfect equality and 1 is perfect inequality.
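A minimal sketch of one common way to compute a Gini coefficient from a list of non-negative values, such as project counts per contractor or per district. Which quantity the report actually feeds into the coefficient is not stated, so the inputs below are hypothetical, as is the `gini` function name.

```python
def gini(values):
    """Gini coefficient of non-negative values, using the sorted-rank formula."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one value and a positive total")
    # G = 2 * sum(i * x_i) / (n * total) - (n + 1) / n, with i = 1..n ascending
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


# Hypothetical project counts (assumption: counts, not contract values).
equal_counts = [5, 5, 5, 5]       # every contractor has the same workload
skewed_counts = [0, 0, 0, 10]     # one contractor holds every project

gini_equal = gini(equal_counts)
gini_skewed = gini(skewed_counts)
```

With this formula a finite sample of n values reaches at most (n - 1) / n, so the fully skewed four-value example gives 0.75 rather than exactly 1.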
Interpretation
Independence Test (Chi-square)
Tests whether contractor selection and district assignment are independent variables.
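The sketch below shows how a chi-square statistic and Cramér's V could be derived from (contractor, district) pairs using only the standard library. It is an illustration under assumptions, not the report's actual pipeline: the function name and sample data are hypothetical, and the p-value is left out because it needs a chi-square distribution function (`scipy.stats.chi2_contingency` is one option if SciPy is available).

```python
from collections import Counter
from math import sqrt


def chi_square_and_cramers_v(pairs):
    """Return (chi-square statistic, degrees of freedom, Cramér's V)."""
    pairs = list(pairs)
    n = len(pairs)
    observed = Counter(pairs)
    row_totals = Counter(r for r, _ in pairs)
    col_totals = Counter(c for _, c in pairs)
    k = min(len(row_totals), len(col_totals))
    if k < 2:
        raise ValueError("need at least two categories on each axis")

    chi2 = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            # Expected count if contractor and district were independent.
            expected = r_total * c_total / n
            chi2 += (observed[(r, c)] - expected) ** 2 / expected

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    cramers_v = sqrt(chi2 / (n * (k - 1)))
    return chi2, dof, cramers_v


# Hypothetical data: each contractor works in exactly one district.
dependent = [("A", "North")] * 5 + [("B", "South")] * 5
chi2_dep, dof_dep, v_dep = chi_square_and_cramers_v(dependent)

# Hypothetical data: contractors spread evenly across districts.
independent = [("A", "North"), ("A", "South"), ("B", "North"), ("B", "South")]
chi2_ind, dof_ind, v_ind = chi_square_and_cramers_v(independent)
```

Perfect geographic specialization drives Cramér's V to 1, while an even spread gives a statistic and V of 0.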
What This Means
Concentration Pattern Correlation
Correlates HHI values between contractors and districts to test if concentration patterns are similar.
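For reference, a small sketch of Pearson and Spearman correlation on two already-paired lists. How the report pairs a district HHI with a contractor HHI is not stated, so the example takes the pairing as given; the sample values and function names are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson linear correlation of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical paired values: monotonic but not linear.
xs = [1, 2, 3, 4]
ys = [1, 4, 9, 16]
r_pearson = pearson(xs, ys)
r_spearman = spearman(xs, ys)
```

Because the relationship here is monotonic but curved, the Spearman value is 1 while the Pearson value falls slightly below 1, which is why reporting both can be informative.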
Interpretation
Evidence Summary
Breakdown of all four statistical tests
Test Criteria & Adjustments
1. HHI Similarity Test: Checks whether the average district HHI and the average contractor HHI are within 500 points of each other. The 500-point threshold allows moderate differences while still indicating similar concentration levels. Adjustment: The threshold accounts for the inherently different scales: districts naturally have lower HHI (more contractors per district), while contractors have higher HHI (fewer districts per contractor).
2. Gini Similarity Test: Checks whether the Gini coefficients for contractors and districts are within 0.1 of each other. This threshold allows moderate differences in inequality while still identifying similar distribution patterns. Adjustment: The 0.1 threshold balances statistical rigor with practical significance: a difference this small indicates similar inequality structures even if they are not identical.
3. Chi-square (Dependence) Test: Uses the standard significance threshold (p < 0.05) to test whether contractor selection and district assignment are independent; Cramér's V measures the effect size. Adjustment: None needed, since the threshold is standard. The test shows strong dependence, indicating that contractors are NOT randomly distributed across districts.
4. Correlation Test: Uses the standard significance threshold (p < 0.05) to test whether district HHI values correlate with contractor HHI values. Adjustment: None needed, since the threshold is standard. The lack of correlation suggests that concentration patterns operate differently: contractors specialize geographically while districts maintain competitive markets.
Result: 2 out of 4 tests support the hypothesis, leading to a "PARTIALLY SUPPORTED" verdict. Contractors and districts show dependence (Chi-square) and similar inequality patterns (Gini similarity), but they differ in concentration levels (HHI similarity fails) and their concentration patterns do not correlate (Correlation fails). This suggests that concentration runs in opposite directions: contractors concentrate in a few districts, while districts maintain competitive contractor markets.
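The four criteria above can be sketched as a small tally function. The thresholds (500 points, 0.1, p < 0.05) come from the text; everything else is an assumption made for illustration, including the function name, the inclusive reading of "within", and the input numbers, which are hypothetical and not the report's actual results.

```python
def evaluate_hypothesis(avg_district_hhi, avg_contractor_hhi,
                        gini_contractors, gini_districts,
                        chi_square_p, correlation_p,
                        hhi_tolerance=500.0, gini_tolerance=0.1, alpha=0.05):
    """Apply the four criteria; return (per-test results, number supporting)."""
    results = {
        # Assumption: "within" is read as inclusive (<=).
        "hhi_similarity":
            abs(avg_district_hhi - avg_contractor_hhi) <= hhi_tolerance,
        "gini_similarity":
            abs(gini_contractors - gini_districts) <= gini_tolerance,
        # Dependence counts as support, so a significant p-value passes.
        "chi_square_dependence": chi_square_p < alpha,
        "correlation": correlation_p < alpha,
    }
    return results, sum(results.values())


# Hypothetical inputs chosen only to reproduce the pass/fail pattern
# described in the text; they are not taken from the report.
results, supporting = evaluate_hypothesis(
    avg_district_hhi=1200.0, avg_contractor_hhi=6800.0,
    gini_contractors=0.62, gini_districts=0.58,
    chi_square_p=0.001, correlation_p=0.40,
)
```

With these inputs two of the four checks pass, the count that the report labels "PARTIALLY SUPPORTED"; how other counts map to verdicts is not stated in the text, so the sketch returns only the tally.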
Top 10 Most Concentrated Districts
Districts with the highest contractor concentration (HHI)
| Rank | District Engineering Office | HHI Score | Market Type |
|---|---|---|---|
Top 10 Most Concentrated Contractors
Contractors with the highest district concentration (HHI)
| Rank | Contractor Name | HHI Score | Specialization |
|---|---|---|---|
Statistical Methods Used
- Herfindahl-Hirschman Index (HHI): Measures market concentration by summing the squares of market shares
- Gini Coefficient: Measures statistical dispersion representing inequality in distribution
- Chi-square Test of Independence: Tests whether two categorical variables are independent
- Cramér's V: Effect size measure for categorical association (0 to 1 scale)
- Pearson Correlation: Measures linear relationship between two continuous variables
- Spearman Correlation: Non-parametric measure of rank correlation