📊 Data Sources

Comprehensive list of all data sources used in BetterGovPH visualizations

🏛️ Primary Government Data Sources

  • Budget Data (GAA - General Appropriations Act):
    @bettergovph/open-budget-data
    Open Budget by BetterGov.ph
    Used in: Budget Analysis, Budget-NEP Correlation, Budget-Flood Correlation
  • National Expenditure Program (NEP):
    @bettergovph/open-budget-data
    Open Budget by BetterGov.ph
    Used in: NEP Analysis, Budget-NEP Correlation
  • DPWH Program Budget Analysis:
    People's Budget Coalition - Citizens' Budget Tracker
    Used in: Budget-NEP Program Correlation, Program-level Analysis
    Comprehensive analysis of DPWH budget programs including Convergence and Special Support Program, Asset Preservation Program, Flood Management Program, Bridge Program, Network Development Program, Local Program, General Administration and Support, and Support to Operations across fiscal years 2018-2025
  • Flood Control Projects (DPWH):
    the team from extra.bayanwat.ch
    Sumbong sa Pangulo - Flood Control Map
    BetterGov.ph - Flood Control Projects Map
    Used in: Flood Control Analysis, Interactive Map, Budget-Flood Correlation, Flood-DIME Correlation
  • DIME (Digital Information for Monitoring and Evaluation):
    Department of Budget and Management
    Used in: DIME Infrastructure Analysis, Interactive Map, Flood-DIME Correlation
  • PhilGEPS Contract Data:
    @gabconcepcionph/dpwh-contracts-dashboard
    @csiiiv/philgeps-awards-dashboard
    Used in: Interactive Map, Contractor Verification
    Comprehensive transparency dashboard providing access to Philippine government procurement data from 2013-2025
  • Raw Philippine Data (Persons, Memberships & Legislative Documents):
    BetterGovPH Raw Philippine Data on HuggingFace
    Used in: Semantic Search, Political Analysis, Legislative Document Research
    Comprehensive dataset containing 45,400+ persons (politicians and public officials), 86,200+ memberships (political positions and party affiliations), and legislative documents (Senate and House bills). Available in Parquet format for easy analysis and DuckDB for SQL queries. Licensed under CC0 1.0 Universal (Public Domain).
  • Political Dynasty Data:
    Ateneo School of Government (ASoG) - Participate Project
    Used in: Dynasty Analysis, Dynasty Map, Family Tree Visualization
    Comprehensive dataset tracking political dynasties at the local government level (2004-2016) with 86,234 records across all provinces. Data accuracy and consistency are the responsibility of the original ASoG source.
  • Philippines GeoJSON Data:
    OSSPhilippines/geoph
    Used in: Dynasty Map, Province Boundaries, Interactive Visualizations
    Open-source GeoJSON data for the Philippines, providing accurate province and regional boundaries for mapping
  • Congressional District Mappings:
    Comprehensive congressional district mappings for Philippine provinces and cities. For provinces, municipalities are mapped to their respective congressional districts (397 of 417 non-partylist congressmen, 95.2% coverage). For cities, barangays are mapped using official PSGC data (136 cities with 7,418 barangays). Data combines official government sources with geographic boundary data from OSSPhilippines/geoph.
    Used in: District Analysis, Dynasty Projects, Province-District Filtering, Municipality-District Mapping, Congressman Project Matching
    Credit: Philippine Statistics Authority (PSA) - Philippine Standard Geographic Code (PSGC) for official municipality and barangay listings; Commission on Elections (COMELEC) - official congressional district boundaries and electoral district maps; House of Representatives - congressional district assignments and representative listings (17th-20th Congress, 2016-2025); OSSPhilippines/geoph - geographic boundary data for barangays and municipalities
  • SEC (Securities and Exchange Commission):
    SEC Philippines - Check with SEC
    Used in: SEC Contractors Database, Interactive Map
  • PhilGEPS Merchant Information:
    PhilGEPS Open Data - Merchant Information
    Used in: Contractor Registration Details, Merchant Verification
    Provides registration details and eligibility status for contractors registered in the PhilGEPS system. Data is queried using normalized contractor names from our database to extract registration information and compliance status.
  • Political Dynasty Network (Dynasty5):
    Dynasty5 Network Visualization
    Used in: Dynasty Analysis, Political Network Visualization
    Interactive network visualization of political dynasty relationships across regions in the Philippines, sourced from Ateneo School of Government (ASoG) Participate Project and Philippine Statistics Authority poverty data

🀝 Civil Society and Research Organizations

  • People's Budget Coalition (PBC):
    2026 Budget Analysis (GAB 2026). Used for exploratory charts in the NEP → GAB 2026 tab.
    Credit: People's Budget Coalition (PBC)
  • Senator Relationships Data:
    Additional data on senators' family relationships, extracted from images via OCR. Used in: Dynasty Analysis, Relationship Constellations.
    Credit: inyongmaasahan (inyongmaaasahan@gmail.com)
  • Records of flood-control contractors (PCIJ, 2025):
    Primary investigative basis for contractor ownership, valuations, and joint ventures powering dynasty contractor profiles and `/integrated` tabs.
    Credit: Philippine Center for Investigative Journalism
  • Rappler Politicontractors Tracker (2025):
    TRACKER: Public officials with links to government contractors
    Used in: Contractor verification for congressmen with zero-cache entries, cross-referencing beneficial ownership of firms such as A.D. Gonzales Jr. Construction, Sunwest Construction, and Quezon-based Suarez companies. The term “Politicontractors” originates from this Rappler series.
    Investigative map maintained by Rappler that documents public officials and the construction firms owned by their families. Provides vetted leads for integrating contractor patterns into autocomplete filters and cache generation.
  • Contractor-Dynasty Relationship Data:
    Investigative data on construction companies and contractors linked to political dynasties. Used in: Dynasty Analysis, Relationship Constellations, Contractor Connections.
    Credit: Rappler, Philippine Daily Inquirer, Manila Standard, Manila Times, Wikipedia, PhilAtlas - Investigative reporting and reference materials used to validate contractor-family linkages.
  • InfraWatch Augmented Projects (2016-2025):
    Supplemental infrastructure records used to highlight projects without matching PhilGEPS contracts and to enrich `/philgeps` search results.
    Credit: DPWH Infrastructure Status Microsite
  • DPWH Infrastructure Projects Data Pipeline:
    Automated scraping and parsing pipeline that converts DPWH infrastructure project tables into structured CSV datasets, providing contract coverage from 2016-2025 for validation of InfraWatch red-flag indicators.
    Credit: dpwh-infra-data-scraper by csiiiv
  • Flood Project Red Flag Classification:
    Additional flood project risk indicators and qualitative assessments layered onto DPWH datasets to surface red-flag items in `/flood` dashboards.
    Credit: UP-NCPAG GRIT Labs (NCPAGKilatis)

📅 Data Coverage & Range

  • Budget Data (GAA): 2020-2025 government budget allocations
  • NEP Data: 2020-2025 expenditure programs
  • Flood Control Projects: Multi-year DPWH infrastructure projects
  • DIME Projects: 2016-2026 infrastructure monitoring data
  • Contractor Data: Active SEC-registered contractors with government projects
  • Raw Philippine Data: 2004-2016 political memberships, legislative documents (various congress sessions)
  • Political Dynasty Data: 2004-2016 local government political dynasty records
  • Congressional District Mappings: Working coverage of 85 provinces and 22 cities with municipality-to-district and barangay-to-district mappings across 256 districts (7,418 barangays) supporting 416 of 417 non-partylist congressmen (99.8% coverage)

📈 Dataset Statistics

  • Budget Line Items: Comprehensive nationwide budget allocations
  • NEP Items: Detailed expenditure program entries
  • Flood Control Projects: DPWH flood control infrastructure nationwide
  • DIME Projects: 12,870+ major infrastructure projects worth ₱740B+
  • SEC Contractors: 2,100+ verified contractors with SEC registration data
  • PhilGEPS Data: Philippine government procurement data from 2000-2025, providing comprehensive transparency on contract awards and government transactions
  • Raw Philippine Data: 45,400+ persons, 86,200+ political memberships, and legislative documents (Senate and House bills) available for semantic search and analysis
  • Political Dynasty Records: 86,234 political dynasty records across all 82 provinces (2004-2016)
  • GeoJSON Boundaries: Complete province and regional boundary data for interactive mapping
  • Congressional District Mappings: 256 districts mapped across 85 provinces and 22 cities, with municipality-to-district and barangay-to-district coverage (7,418 barangays) supporting 416 of 417 non-partylist congressmen (99.8% coverage)
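
The coverage percentages quoted for the congressional district mappings are the share of the 417 non-partylist representatives whose districts are mapped. A minimal sketch of that arithmetic follows; the function name `coverage_percent` is hypothetical and not taken from the project's code.

```python
def coverage_percent(mapped: int, total: int) -> float:
    """Return mapped/total as a percentage rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be a positive count")
    if not 0 <= mapped <= total:
        raise ValueError("mapped must be between 0 and total")
    return round(100 * mapped / total, 1)


if __name__ == "__main__":
    # Counts as quoted in this document (non-partylist congressmen).
    print(coverage_percent(416, 417))  # 99.8
    print(coverage_percent(397, 417))  # 95.2
```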

🔬 Methodology & Processing

All data goes through the following processing steps to improve accuracy and usability:

  • Data Cleaning: Removal of duplicates, standardization of formats, and validation of entries
  • Fuzzy Matching: Approximate string matching to correlate contractor names across different datasets
  • Geolocation: Coordinate validation and mapping for spatial analysis
  • SEC Verification: Automated scraping and verification of contractor registrations
  • Correlation Analysis: Cross-referencing datasets to identify patterns and ensure consistency
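
The cleaning, fuzzy matching, and coordinate validation steps above could look roughly like the sketch below. It is an illustration under stated assumptions, not the project's pipeline: the suffix list, the 0.85 similarity threshold, the bounding box, the contractor names, and every function name are hypothetical. The matching uses `difflib.SequenceMatcher` from the Python standard library, which may differ from the algorithm the project uses.

```python
import re
import unicodedata
from difflib import SequenceMatcher

# Assumed legal-form suffixes to ignore; the project's actual list is not documented here.
LEGAL_SUFFIXES = {"inc", "incorporated", "corp", "corporation", "co", "company", "ltd"}

# Approximate bounding box for the Philippines (illustrative values only).
PH_LAT_RANGE = (4.0, 21.5)
PH_LON_RANGE = (116.0, 127.0)


def normalize_name(name: str) -> str:
    """Lowercase, strip accents and punctuation, and drop legal-form suffixes."""
    text = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    return " ".join(t for t in text.split() if t not in LEGAL_SUFFIXES)


def deduplicate(names):
    """Keep the first spelling of each name that normalizes to the same key."""
    seen, unique = set(), []
    for name in names:
        key = normalize_name(name)
        if key and key not in seen:
            seen.add(key)
            unique.append(name)
    return unique


def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] between two normalized names."""
    return SequenceMatcher(None, normalize_name(a), normalize_name(b)).ratio()


def best_match(name, candidates, threshold=0.85):
    """Return the most similar candidate, or None if none reaches the threshold."""
    scored = [(similarity(name, c), c) for c in candidates]
    if not scored:
        return None
    score, candidate = max(scored)
    return candidate if score >= threshold else None


def is_valid_ph_coordinate(lat: float, lon: float) -> bool:
    """Check that a point falls inside the approximate Philippine bounding box."""
    return (PH_LAT_RANGE[0] <= lat <= PH_LAT_RANGE[1]
            and PH_LON_RANGE[0] <= lon <= PH_LON_RANGE[1])


if __name__ == "__main__":
    # Made-up contractor names for illustration only.
    registry = ["Example Builders & Construction, Inc.", "Sample Roadworks Co."]
    print(best_match("EXAMPLE BUILDERS AND CONSTRUCTION CORP", registry))
    print(is_valid_ph_coordinate(14.5995, 120.9842))  # a point in Manila
    print(is_valid_ph_coordinate(120.9842, 14.5995))  # latitude and longitude swapped
```

Normalizing before comparing keeps differences in case, punctuation, and legal suffixes from lowering the score. The bounding-box check catches swapped latitude and longitude values, which are a common source of misplaced map points.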

📚 Academic References

Our corruption risk analysis is based on peer-reviewed academic research:

  • EOGO Corruption Risk Analysis:
    "Corruption risk and political dynasties: exploring the links using public procurement data in the Philippines"
    Authors: Daniel Bruno Davis, Ronald U. Mendoza, Jurel K. Yap
    Journal: Economics of Governance (2023)
    DOI: https://doi.org/10.1007/s10101-023-00306-4
    Used in: EOGO Corruption Risk Analysis, Political Dynasty Impact Assessment, CRI Implementation

💻 Open Source & Transparency

All source code and data processing scripts are publicly available.

🀝 Contact & Contributions

We welcome contributions, corrections, and suggestions to improve data quality and transparency.

For questions about data sources or to report issues, please visit our GitHub repository or contact the maintainers.