Healthcare Dataset Stroke Data
Context: This dataset is used to predict whether a patient is likely to have a stroke, based on input parameters such as gender, age, various diseases, and smoking status. For data visualization purposes, I have taken a subset of the original test data by filtering it.
Motive: Which age groups, genders, and smoking statuses are more likely to have a stroke, and how do they compare with one another? Is there any correlation between age and stroke?
About the Data: The output (occurrence of stroke) is a categorical variable. Two of the inputs, gender and smoking status, are categorical and ordinal respectively; age is numerical.
Notes: "Unknown" in smoking status means the information is unavailable. "N/A" in other input fields means the field is not applicable.
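The two markers described in the notes can be kept distinct when the file is loaded. The following is a minimal sketch using pandas, not a definitive loader: the helper name `load_stroke_csv`, the column names (`bmi`, `smoking_status`, and so on), and the three sample rows are assumptions made up for illustration and are not taken from this description.

```python
import io

import pandas as pd


def load_stroke_csv(source):
    """Read the CSV so that "N/A" becomes NaN while "Unknown" stays a label.

    `source` can be a file path, a URL, or a file-like object.
    """
    # "N/A" is already among pandas' default missing-value markers;
    # it is listed here only to make the intent explicit.
    return pd.read_csv(source, na_values=["N/A"])


# Made-up rows for illustration only; column names are assumptions.
sample_csv = io.StringIO(
    "gender,age,bmi,smoking_status,stroke\n"
    "Male,67,36.6,formerly smoked,1\n"
    "Female,61,N/A,never smoked,1\n"
    "Female,8,17.2,Unknown,0\n"
)

df = load_stroke_csv(sample_csv)

# "N/A" cells are now missing values ...
n_not_applicable = int(df["bmi"].isna().sum())
# ... while "Unknown" remains an ordinary category that can be counted or plotted.
n_unknown = int(df["smoking_status"].eq("Unknown").sum())
```

Keeping "Unknown" as its own category, rather than dropping it, lets a chart show how many records lack smoking information.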
Data Source:
CSV Link: https://gist.githubusercontent.com/aishwarya8615/d2107f828d3f904839cbcb7eaa85bd04/raw/healthcare-dataset-stroke-data.csv
GitHub Data Link: https://gist.github.com/aishwarya8615/d2107f828d3f904839cbcb7eaa85bd04
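The questions under Motive could be explored along the following lines. This is a rough sketch under stated assumptions rather than the analysis actually performed: it assumes columns named `gender`, `age`, and `stroke` (with stroke coded as 0/1), the age bins are arbitrary, the helper names are hypothetical, and the four sample rows are invented for illustration.

```python
import pandas as pd


def stroke_rate_by(df, column):
    """Share of rows with stroke == 1 within each value of `column`."""
    return df.groupby(column, observed=True)["stroke"].mean()


def add_age_group(df, bins=(0, 40, 60, 120), labels=("0-40", "40-60", "60+")):
    """Return a copy of `df` with an `age_group` column (bin edges are arbitrary)."""
    groups = pd.cut(df["age"], bins=list(bins), labels=list(labels), include_lowest=True)
    return df.assign(age_group=groups)


def age_stroke_correlation(df):
    """Pearson correlation between age and the 0/1 stroke flag."""
    return df["age"].corr(df["stroke"])


# Made-up rows for illustration only; not taken from the dataset.
sample = pd.DataFrame(
    {
        "gender": ["Male", "Female", "Female", "Male"],
        "age": [30.0, 45.0, 70.0, 80.0],
        "stroke": [0, 0, 1, 1],
    }
)

rates_by_gender = stroke_rate_by(sample, "gender")
rates_by_age = stroke_rate_by(add_age_group(sample), "age_group")
r = age_stroke_correlation(sample)
```

Because stroke is binary, the Pearson coefficient here is the point-biserial correlation; a positive value indicates that stroke cases skew older in the sample. The same `stroke_rate_by` helper could be applied to a smoking status column.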