Make sure your question is testable by the data present in SPARCS.
Read the SPARCS Data Dictionary to better understand the data elements.
Is there enough variation in the dataset?
Is there a sufficient number of cases?
Would the result be interesting?
Could the health care system, or individual providers, act on the result?
Some examples from previous students that are answerable with SPARCS:
Does day of admission correlate with length-of-stay for CHF?
Does severity of illness score correlate with length-of-stay for patients with drug and alcohol dependence?
How does hospital-level caseload relate to length-of-stay for those undergoing hip replacement?
Does a patient’s race affect the rate of cardiac catheterization among patients admitted with acute MI?
If you want to look at the data, use the SPARCS tools to filter the data by a given diagnosis or procedure and download it. There are two data download options you will see on our SPARCS site:
The 'All Hospitals' button downloads the data from all of the NY hospitals that treat that condition. This may be a huge dataset and could be too big for Excel.
The '8 Hospitals Only' button only downloads the data from 8 NY hospitals that we chose. These data are more manageable, will work in Excel, and are likely a better choice for an initial student project.
For the purposes of the student exercise, these data are best evaluated in a spreadsheet or basic statistical program.
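For readers who prefer a short script to a spreadsheet, a question such as "does day of admission correlate with length-of-stay?" can start with a simple group average. The following is a minimal sketch in Python, not a prescribed method. The column names "Admit Day of Week" and "Length of Stay" and the function name are assumptions for illustration, so check the header row of your own download, and the sample rows are invented rather than real SPARCS records.

```python
import csv
import io
from collections import defaultdict
from statistics import mean


def mean_stay_by_group(csv_file, group_col, stay_col):
    """Return {group value: mean length of stay} using rows with a numeric stay."""
    stays = defaultdict(list)
    for row in csv.DictReader(csv_file):
        raw = (row.get(stay_col) or "").strip()
        if not raw.isdigit():
            continue  # skip blank or non-numeric length-of-stay entries
        stays[row[group_col]].append(int(raw))
    return {group: mean(values) for group, values in stays.items()}


# Toy rows invented for illustration; column names are assumed, not confirmed.
sample = io.StringIO(
    "Admit Day of Week,Length of Stay\n"
    "MON,3\n"
    "MON,5\n"
    "FRI,6\n"
    "FRI,\n"
)
result = mean_stay_by_group(sample, "Admit Day of Week", "Length of Stay")
print(result)
```

To run this on a downloaded file, pass an open file object, for example open("my_download.csv", newline=""), in place of the in-memory sample. A group average is only a first look; it does not by itself show that a difference is statistically meaningful.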
Frequently Asked Questions
What permission do we need to use these data in a presentation or publication?
The SPARCS data are released under the open government public use license and made available through the Open NY initiative, which places no limitations “over its end use”, though it does require attribution to the NYS DOH. The full license is here. Relevant portion:
"Unless otherwise noted on an individual document, file, web page or other item, the Department of Health grants users permission to reproduce materials published by the Department on this Website so long as the Department of Health is noted as the source, and the date the web page was accessed, along with the date of publication of the material cited, is noted."
The Hospital Compare data are under a similar open license. Relevant portion: "Works of the U.S. Government are in the public domain and permission is not required to reuse them."
How do the data on this site differ from the raw SPARCS data available on health.data.ny.gov?
We have made some changes to these data to make them easier to use for medical students. Remember that you can always get the original raw data from the DOH site. Our changes include:
Updating the hospital names across all three years of data so they are consistent.
Where possible, matching the payer names to be consistent. For example, NY State changed 'Insurance Company' to 'Private Health Insurance' in 2014. We changed the 2012 and 2013 data to reflect that.
We do not include the provider license number in the CSV files exported from this system. You can still get those from the public data file on the DOH site.
We only include the first listed source of payment (payer) type.
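If you combine the raw DOH files with the data from this site, the payer-name change described above amounts to a lookup table. This is a minimal sketch in Python of that idea, not the code used to build this site. The only mapping taken from the text is 'Insurance Company' to 'Private Health Insurance'; the function name and the dictionary are hypothetical.

```python
# Old payer label -> label used from 2014 onward (the one rename named in the text).
PAYER_RENAMES = {"Insurance Company": "Private Health Insurance"}


def harmonize_payer(name):
    """Map an older payer label to its newer equivalent; leave other labels alone."""
    cleaned = name.strip()
    return PAYER_RENAMES.get(cleaned, cleaned)


print(harmonize_payer("Insurance Company"))
```

Any other renames you find when comparing years can be added to the dictionary in the same way.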
Looking for more open clinical data sets for your projects?
SPARCS only includes the first three digits of the patient's zip code. The field is left blank if the population of that area is less than 20,000. “OOS” indicates an out-of-state zip code.
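When analyzing by geography, it helps to separate these three cases before counting. The sketch below, in Python, sorts a zip code value into the categories described above. The function name and the category labels are hypothetical and chosen only for illustration.

```python
def classify_zip3(value):
    """Sort a SPARCS zip code field into one of the cases described in the text."""
    cleaned = (value or "").strip()
    if cleaned == "":
        return "suppressed"    # blank: area population under 20,000
    if cleaned.upper() == "OOS":
        return "out_of_state"  # patient lives outside New York
    if len(cleaned) == 3 and cleaned.isdigit():
        return "zip3"          # first three digits of the zip code
    return "unknown"           # anything the text does not describe


for example in ["100", "", "OOS"]:
    print(repr(example), classify_zip3(example))
```

Keeping the blank and "OOS" rows in their own categories, rather than dropping them, makes it clear how many cases cannot be placed on a map.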
What are the diagnosis and procedure codes used in SPARCS?