The data come from the 2020 annual CDC survey of 400,000 US adults on their health status. The dataset is a major part of the Behavioral Risk Factor Surveillance System (BRFSS), which conducts annual telephone surveys to collect health-related information from residents across the United States.
The code given below selects a random sample of 300 respondents from the original dataset. Use this sample of 300 respondents for all of your analysis.
Use the Heart Disease dataset to answer the following questions:
a. Is having heart disease independent of whether the respondent is diabetic? Include the two-way table and all your calculations in your answer.
b. Is there a relationship between having heart disease and the general health condition of the respondent? Include the two-way table, stacked bar graph, and all your calculations in your answer.
c. Is there a relationship between having heart disease and the gender of the respondent? Include the two-way table, stacked bar graph, and all your calculations in your answer.
d. Does physical activity lower the risk of having heart disease? If so, by how much?
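
The calculations behind questions like these can be sketched as follows. This is a minimal illustration only, written in plain Python, and it is not the code supplied with the assignment. The counts in `table` are made up for demonstration and are NOT taken from the BRFSS sample, and the function names (`expected_counts`, `chi_square`, `p_value_df1`, `relative_risk`) are hypothetical helpers. The sketch assumes a Pearson chi-square test of independence without a continuity correction, and the p-value shortcut applies only to a 2x2 table (1 degree of freedom).

```python
from math import erfc, sqrt


def expected_counts(table):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a two-way table."""
    expected = expected_counts(table)
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def p_value_df1(stat):
    """Upper-tail p-value for 1 degree of freedom only.

    A chi-square variable with 1 df is the square of a standard normal,
    so P(X > x) = erfc(sqrt(x / 2)).
    """
    return erfc(sqrt(stat / 2))


def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk in the exposed group divided by risk in the unexposed group."""
    return (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)


# Hypothetical two-way table (illustrative counts, not real survey data).
# Rows: physically active (yes, no); columns: heart disease (yes, no).
table = [
    [15, 215],  # active
    [10, 60],   # not active
]

stat, df = chi_square(table)
p = p_value_df1(stat)  # valid here because the table is 2x2 (df == 1)

# Relative risk of heart disease for active vs. not active respondents.
rr = relative_risk(15, 15 + 215, 10, 10 + 60)
risk_reduction = 1 - rr  # proportional reduction in risk, if rr < 1

print(f"chi-square = {stat:.3f}, df = {df}, p = {p:.4f}")
print(f"relative risk = {rr:.3f}, risk reduction = {risk_reduction:.1%}")
```

For tables with more than two rows or columns (for example, general health with five levels), the statistic and degrees of freedom are computed the same way, but the p-value must come from a chi-square table or a statistics package rather than `p_value_df1`. The expected counts should also be checked, since the chi-square approximation is usually considered unreliable when many of them fall below 5.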
B – Statistical Intuition in Data Science