Review the data available on the State of New York open data portal: https://healthdata.ny.gov/
1. Select two data sets from the New York State open data portal that would help you address the problem of high costs and waste in the US healthcare system. Describe how each data set could help you address this problem.
2. Review your data sources and identify:
a) Which medical terminology systems are used in the data (SNOMED, CPT, etc.)?
b) (i) If you do not find any standardized codes, find at least two concepts in the data that could have been captured using standard codes.
b) (ii) If you do not find any standardized codes, choose two other data sets from the NY open data portal that do use standardized codes. Provide a snapshot (screen clipping) of each data set.
c) In 3-4 sentences, explain how these standardized codes might help your analyses.
d) Describe why analysts might find it challenging to integrate the data.
e) Review the data sets you have chosen, looking at both the data and the metadata, and consider which fields are available for linkage. Answer each of the questions below in 2-3 sentences:
1. How would you work with an analytics team to link data from the various data sources?
2. Why?
3. What fields could be used to link the data sources?
4. What kind of linkage methods might you try?
5. Would these fields be sufficient to obtain high-quality matches? Why or why not? How would you know?
6. What privacy or legal concerns might arise during the linkage effort?
7. Is there anything else you notice about the data?
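
To make the linkage questions in item e) more concrete, here is a minimal sketch of one possible approach: match on a shared identifier first (deterministic linkage), then fall back to approximate name matching for records with no identifier. The field names facility_id and facility_name, the 0.9 similarity threshold, and the sample rows are all hypothetical and are not taken from any data set on the portal. Treat this as an illustration to discuss with an analytics team, not a recommended method.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    cleaned = "".join(
        ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower()
    )
    return " ".join(cleaned.split())


def link_records(left, right, key="facility_id", name="facility_name",
                 threshold=0.9):
    """Return (left_row, right_row, method) tuples.

    `key` and `name` are hypothetical field names; `threshold` is an
    arbitrary illustrative cutoff that would need validation on real data.
    """
    # Index the right-hand rows by identifier, skipping blank identifiers.
    by_key = {row[key]: row for row in right if row.get(key)}
    links = []
    for row in left:
        # Step 1: deterministic match on the shared identifier.
        match = by_key.get(row.get(key))
        if match is not None:
            links.append((row, match, "exact"))
            continue
        # Step 2: fuzzy fallback on the normalized name.
        best, best_score = None, 0.0
        for cand in right:
            score = SequenceMatcher(
                None, normalize(row[name]), normalize(cand[name])
            ).ratio()
            if score > best_score:
                best, best_score = cand, score
        if best is not None and best_score >= threshold:
            links.append((row, best, "fuzzy"))
    return links


# Made-up sample rows, for illustration only.
left = [
    {"facility_id": "1001", "facility_name": "St. Mary's Hospital"},
    {"facility_id": "", "facility_name": "Albany Medical Center"},
    {"facility_id": "", "facility_name": "Unmatched Clinic"},
]
right = [
    {"facility_id": "1001", "facility_name": "Saint Marys Hosp"},
    {"facility_id": "2002", "facility_name": "ALBANY MEDICAL CENTER."},
]

links = link_records(left, right)
for l_row, r_row, method in links:
    print(method, "|", l_row["facility_name"], "->", r_row["facility_name"])
```

Recording the match method alongside each link makes it easier to review the fuzzy matches by hand, which is one way to judge match quality for question 5. Records that fall below the threshold are left unlinked rather than forced into a match.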