Virginia Tech
4 files

Data Cleaning for Deterministic Data Linkage for Sports and Recreational Injuries Using ICD-9 and ICD-9-CM

posted on 2022-06-22, 10:50 authored by Charlotte Baker, Robert Young

We have included SAS syntax examples for cleaning emergency department and inpatient hospital data from 2006 to 2012, coded in ICD-9-CM, from the Florida Agency for Health Care Administration. The provided syntax (in .sas and .py formats) demonstrates how we took the original data sets, restricted them to specific variables, reformatted them for our purposes, and transposed them to create two separate data sets, which were then merged into one all-encompassing data set for individual-level analysis. The syntax is covered under a GPL-3.0-or-later license.
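For readers who want a feel for the restrict, reformat, transpose, and merge sequence described above, the following is a minimal sketch in Python using only the standard library. It is not the deposited syntax. All field names (patient_id, visit_date, dx_seq, dx_code, admit_date, los_days), the sample rows, and the helper functions are hypothetical, and the reformatting step shown (inserting the decimal point into ICD-9-CM codes stored without one) is an assumed example of what "reformatted" might involve.

```python
from collections import defaultdict

# Hypothetical long-format emergency department rows:
# one row per patient, visit, and diagnosis position.
ed_long = [
    {"patient_id": "A1", "visit_date": "2010-03-01", "dx_seq": 1, "dx_code": "84500", "payer": "X"},
    {"patient_id": "A1", "visit_date": "2010-03-01", "dx_seq": 2, "dx_code": "E0070", "payer": "X"},
    {"patient_id": "B2", "visit_date": "2011-07-15", "dx_seq": 1, "dx_code": "8500", "payer": "Y"},
]
# Hypothetical inpatient rows, already one row per patient.
inpatient = [{"patient_id": "A1", "admit_date": "2010-03-02", "los_days": 3}]

KEEP = ("patient_id", "visit_date", "dx_seq", "dx_code")  # assumed variables of interest


def restrict(rows, keep):
    """Keep only the listed variables in every row."""
    return [{k: r[k] for k in keep} for r in rows]


def format_icd9(code):
    """Insert the decimal point: after 4 characters for E codes, after 3 otherwise."""
    code = code.strip().upper()
    if "." in code:
        return code
    head = 4 if code.startswith("E") else 3
    return code if len(code) <= head else f"{code[:head]}.{code[head:]}"


def transpose_wide(rows):
    """Turn one row per diagnosis into one row per visit (dx1, dx2, ...)."""
    wide = {}
    for r in rows:
        key = (r["patient_id"], r["visit_date"])
        rec = wide.setdefault(key, {"patient_id": r["patient_id"], "visit_date": r["visit_date"]})
        rec[f"dx{r['dx_seq']}"] = format_icd9(r["dx_code"])
    return list(wide.values())


def merge_on_id(left, right, key="patient_id"):
    """Deterministic (exact-match) left join on a shared identifier."""
    index = defaultdict(list)
    for r in right:
        index[r[key]].append(r)
    merged = []
    for l in left:
        for m in index.get(l[key]) or [{}]:  # keep unmatched left rows as-is
            merged.append({**l, **m})
    return merged


ed_wide = transpose_wide(restrict(ed_long, KEEP))
linked = merge_on_id(ed_wide, inpatient)
for row in linked:
    print(row)
```

The join here is a left join, so emergency department visits with no matching inpatient record are retained without inpatient variables. Whether the deposited syntax keeps unmatched records is not stated in this description.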

In addition, we have provided figures (author: Baker) demonstrating the linkage process. One figure (titled Figure 7) shows the steps for combining data from several years while retaining all variables. The second figure (titled Figure 8) shows the steps for merging the final data sets created in Figure 7 into one master data set. The figures are covered under a CC-BY-SA-4.0 license.
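The step of combining several years while retaining all variables can be illustrated with a short Python sketch. This is an assumption-laden illustration, not a reproduction of the figures: the function stack_years, the source_year column, and the sample variables (patient_id, dx1, ecode1) are hypothetical. The idea shown is that the stacked data set carries the union of all columns, with a missing value wherever a given year did not collect a variable.

```python
def stack_years(datasets):
    """Stack yearly row lists, keeping the union of all variables.

    datasets maps a year to a list of row dicts. Variables absent in a
    given year are filled with None instead of being dropped.
    """
    columns = []
    for _, rows in sorted(datasets.items()):
        for row in rows:
            for name in row:
                if name not in columns:
                    columns.append(name)

    stacked = []
    for year, rows in sorted(datasets.items()):
        for row in rows:
            full = {name: row.get(name) for name in columns}
            full["source_year"] = year  # hypothetical provenance column
            stacked.append(full)
    return columns + ["source_year"], stacked


# Hypothetical yearly extracts; the later year has an extra variable.
yearly = {
    2006: [{"patient_id": "A1", "dx1": "845.00"}],
    2007: [
        {"patient_id": "B2", "dx1": "850.0", "ecode1": "E917.0"},
        {"patient_id": "C3", "dx1": "812.00", "ecode1": None},
    ],
}

columns, stacked = stack_years(yearly)
print(columns)
for row in stacked:
    print(row)
```

Recording the source year on each row is a design choice made for this sketch so that the origin of a record stays traceable after stacking.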



University Libraries, Virginia Tech


  • English (US)


Florida Agency for Health Care Administration