Methods for Maintaining Longitudinal Population Health Studies
1. Basic Computing Environment
Organize the computer and software
Prepare the Tools and Working (Project) Environments
Basic issues with Master and Confidential Environments
TOOLS_2019.ZIP contains FHOP macros and related files introduced in this volume.
2. Standardizing Variables Over Time
Time variables
Demographic variables
Confidential data elements
3. Preparing Master Files
Setup activities
The RDYR macro
Check longitudinal consistency
4. Special Issues with Birth and Fetal Death Files
Steps to make master files
Check longitudinal consistency
Geographic classification
Data quality
BC_FORMATS_2019.ZIP contains FHOP's current format library for use with birth certificate and fetal death data (1989-2018).
11METH_WORK_2020.PDF describes steps to clean work-related variables in the California 2010-2017 birth (mother and father) and 2014-2017 death (decedent) files and develop formats to classify those variables.
WORK_FORMATS_2020.ZIP contains the resulting format library.
5. Special Issues with Death Files
Steps to make master files
Check longitudinal consistency
Cause of death
Geographic classification
Data quality
DT_FORMATS_2018.ZIP contains FHOP's current format library for use with death certificate and fetal death data (1980-2018).
SD_GEOCODE7.PDF summarizes work to evaluate the quality of address data in the California Death Statistical Master file in 2005 (before electronic death registration) and 2007 (after electronic death registration). It also compares the accuracy of two geocoding systems used in California at that time.
6. Maintaining Hospital Formats
Structure of formats program
OSHPD facility labels
Centers for Medicare and Medicaid Services
Clinical Classification System (CCS)
Injury Classification
OSH_FORMATS_2019.ZIP contains the SAS format library FHOP currently uses for OSHPD inpatient admissions (1983 to 2018) and emergency department and ambulatory care encounters (2005 to 2018). The files listed below are the source for the formats.
DXFH2018.ZIP contains the last cross-classified lists of ICD-9 diagnoses (1983 to Sep-2015). This file is the source for formats that variously classify ICD-9 diagnosis codes.
DXTFH2018.ZIP contains the cross-classified lists of ICD-10 diagnoses (Oct-2015 to Dec-2017). This file is the source for formats that variously classify ICD-10 diagnosis codes.
GEMI9I10.ZIP contains the longitudinal crosswalk between the ICD-9 and ICD-10 diagnosis codes. This file is the source for formats that back-classify ICD-9 codes to be consistent with current ICD-10 groupings.
ICD10_CONVERSION_2020.PDF describes the work to validate the longitudinal GEMS crosswalk between the ICD-9 and the ICD-10 diagnosis codes with a focus on the Clinical Classification System, and particularly mental health (DXCH06) and conditions occurring during pregnancy, birth, and the puerperium (DXCH11).
PXAH2018.ZIP contains the last cross-classified lists of ICD-9 procedures (1983 to Sep-2015). This file is the source for the formats that variously classify ICD-9 procedure codes. CCS did not update ICD-9 procedure codes in 2015.
PXTFH2018.ZIP contains the cross-classified lists of ICD-10 procedures (Oct-2015 to Dec-2017). This file is the source for the formats that variously classify ICD-10 procedure codes.
7. Maintaining Geography Formats
The need for longitudinal geographic datasets
Standard administrative boundaries
Planning and policy geography
Data sets with geographic boundaries
GEOG_FORMAT_INPUT_2020.ZIP contains the SAS programs and the current input Excel file used to make the formats.
GEOG_FORMATS_2020.ZIP contains the full set of California geography formats in current use.
8. Annual Hospital Disclosure Report
Primary hospital data sets
Preparing AHDR data
Reconciling hospital events
9. Hospital Crosswalk
Why crosswalk is needed
Crosswalk methods and results
Crosswalk validation
Example: Hospital-level race/ethnic data quality
The following files have longitudinal hospital-level results
Birth Certificate
Patient Discharge
Emergency Department
10. Population Master Files
Department of Finance
National Population Estimates
Intercensal Small Area Population Estimates
Issues and Decisions to be made on Collecting, Coding and Reporting Race and Ethnicity for Public Health Indicators
The “Race/Ethnicity Guidelines”, approved in 2003 by the California Directors of Public Health (CDPH) and Health and Human Services (CHHS) for use by all programs, explicitly did not address how to handle multi-race coding for trend analysis. Further, the National Center for Health Statistics (NCHS) had not yet provided guidance on what to do when the same groups are not available over time or when the groups in the numerator and denominator do not match. This document discusses issues related to developing a standardized approach to coding and reporting race and ethnicity for data sets maintained by CDPH. The focus is on using these data sets to explore race/ethnic differences in indicators of health status and outcomes over time. (September 2011)
Creating Longitudinal Hospital-Level Data Sets
Per California regulations, hospital licenses are based on a given physical location. When a hospital disappears from a data file, the explanation is not readily apparent. We must determine whether the facility closed, merged, converted to consolidated reporting, or moved, any of which can result in a new license ID. Yet another possibility is that a new license ID was assigned to a facility at the same location. We developed a series of decision rules to resolve such issues in a longitudinally consistent manner. These included rules to handle changes in hospital identifiers, physical location, consolidated data reporting, ownership, organizational type, and structural capacity. This document provides a full discussion of the issues encountered in creating the hospital-level data sets, their resolution, and the creation of related analysis data sets and variables. (June 2004)
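As a rough illustration of one such check, the sketch below flags license IDs that drop out of the file from one year to the next and tests whether a previously unseen ID appears at the same address in the following year. It is written in Python for brevity and is not FHOP's actual rule set: the record layout (year, license ID, address), the function name, and the labels "new_id_same_location" and "needs_review" are all assumptions made for this example.

```python
from collections import defaultdict


def classify_disappearances(records):
    """Flag license IDs that vanish between adjacent years in the data.

    `records` is an iterable of (year, license_id, address) tuples.
    This layout is hypothetical, not the structure of any OSHPD file.
    """
    ids_by_year = defaultdict(set)
    ids_by_year_address = defaultdict(set)
    address_of = {}
    for year, license_id, address in records:
        key = address.strip().upper()  # crude address normalization
        ids_by_year[year].add(license_id)
        ids_by_year_address[(year, key)].add(license_id)
        address_of[(year, license_id)] = key

    results = {}
    years = sorted(ids_by_year)
    # Adjacent years present in the data are compared; a missing
    # calendar year would need separate handling.
    for year, next_year in zip(years, years[1:]):
        vanished = ids_by_year[year] - ids_by_year[next_year]
        for license_id in sorted(vanished):
            address = address_of[(year, license_id)]
            # IDs at the same address next year that did not exist this year
            successors = ids_by_year_address[(next_year, address)] - ids_by_year[year]
            if successors:
                results[(license_id, year)] = ("new_id_same_location", sorted(successors))
            else:
                # Closure, merger, move, or consolidated reporting:
                # these cannot be told apart from addresses alone.
                results[(license_id, year)] = ("needs_review", [])
    return results


example = [
    (2000, "A1", "1 Main St"),
    (2000, "B2", "9 Oak Ave"),
    (2000, "D4", "5 Elm Rd"),
    (2001, "C3", "1 main st "),
    (2001, "D4", "5 Elm Rd"),
]
flags = classify_disappearances(example)
```

In this toy input, license A1 is linked to the new ID C3 at the same address, while B2 is left for manual review. Distinguishing closures, mergers, moves, and consolidated reporting would require the additional information described in the full document.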
Methods to Prepare Hospital Discharge Data
OSHPD distributes Patient Discharge Data (PDD) to qualified researchers such as the Family Health Outcomes Project (FHOP). The FHOP human subjects protocols permit us to have the confidential PDD, for all discharges and ages, from 1983 forward. Currently we have processed all years through 2000 and are about to start with the 2001 and 2002 files. This document presents an overview of the methods we developed to create the core files we use as the source for the different PDD-based research and data products that FHOP distributes. (June 2004)