Heart disease is a problem that is often overlooked until it is too late. I set out to use a real dataset and Logistic Regression to better understand how some less commonly studied features (BMI, Age Category, Race, SleepTime, Physical Health, and Mental Health) relate to developing heart disease. You can read the whole paper here: Click here
Here are the results:
Based on the results of the project, the MLP Classifier outperformed the Logistic Regression model on both accuracy and F1 score. The KNN Ensemble model performed the worst in terms of MSE, although MSE is not always the best way to evaluate a classification model.
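To make the three metrics concrete, here is a minimal sketch in plain Python. The labels are toy values invented for illustration, not taken from the project's dataset, and the helper names are my own. It shows that for hard 0/1 predictions MSE is simply the misclassification rate, and that a model can score well on accuracy while its F1 score is zero.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(y_true, y_pred):
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    # Convention used here: F1 is 0 when there are no true positives.
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)


def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy labels (assumption): 2 positive cases out of 10.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
# A model that always predicts "no heart disease".
y_pred = [0] * 10

acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
err = mse(y_true, y_pred)
print(f"accuracy={acc:.2f}  f1={f1:.2f}  mse={err:.2f}")
```

On these toy labels the always-negative model gets most cases right, yet it never finds a positive case, which is why F1 is usually the more informative number when one class is rare.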
One possible explanation for the better performance of the MLP Classifier is that it is a more complex model, capable of capturing non-linear relationships between the features and the label. In contrast, Logistic Regression is a relatively simple model that assumes a linear relationship between the features and the log-odds of the label.
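The difference can be illustrated with a small sketch on XOR, a classic toy problem that is not linearly separable. Everything here is an assumption made for illustration: the data is not from the project, the grid search only stands in for fitting a logistic regression, and the network weights are hand-picked rather than trained. A linear decision rule tops out at 3 of 4 points, while a network with two hidden units gets all four.

```python
import itertools

# XOR toy data: the label is 1 when exactly one input is 1.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 1, 1, 0]


def linear_predict(w1, w2, b, x):
    # Logistic regression predicts class 1 when the log-odds
    # w1*x1 + w2*x2 + b is positive (probability above 0.5).
    return 1 if w1 * x[0] + w2 * x[1] + b > 0 else 0


def best_linear_accuracy():
    # Try every weight combination on a coarse grid from -4.0 to 4.0.
    grid = [i / 2 for i in range(-8, 9)]
    best = 0.0
    for w1, w2, b in itertools.product(grid, repeat=3):
        correct = sum(linear_predict(w1, w2, b, x) == t for x, t in zip(X, y))
        best = max(best, correct / len(X))
    return best


def relu(z):
    return max(0.0, z)


def tiny_mlp_predict(x):
    # Two hidden ReLU units with hand-picked (not trained) weights.
    h1 = relu(x[0] + x[1])
    h2 = relu(x[0] + x[1] - 1)
    out = h1 - 2 * h2
    return 1 if out > 0.5 else 0


linear_best = best_linear_accuracy()
mlp_preds = [tiny_mlp_predict(x) for x in X]
print("best linear accuracy:", linear_best)
print("tiny MLP predictions:", mlp_preds)
```

Whether the heart disease features actually interact in this way is not something the sketch can show; it only demonstrates the kind of pattern a linear log-odds model cannot represent.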
Based on these results, I would recommend the MLP Classifier for predicting heart disease, as it performed best on both accuracy and F1 score. It is also important to consider the SleepTime and MentalHealth features when building the model, as they appear to be the most important for predicting heart disease.
To answer the problem statement: sleep time and mental health did not appear to have a strong correlation with heart disease. The analysis put the optimal sleep time for avoiding heart disease at 6.517 hours, and the Mental Health value for good standing at 2.36 (out of 4).
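For readers who want to check a correlation like this themselves, here is a minimal sketch of the Pearson correlation between a numeric feature and a 0/1 label (often called the point-biserial correlation). The values below are invented toy data, not the project's dataset, and `pearson_r` is a helper written for this example rather than the method used in the paper.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


# Toy values (assumption): hours of sleep and a 0/1 heart disease label.
sleep_hours = [5, 6, 7, 8, 6, 7, 9, 4]
heart_disease = [1, 0, 0, 0, 1, 0, 1, 0]

r = pearson_r(sleep_hours, heart_disease)
print(f"correlation between sleep time and heart disease: {r:.3f}")
```

A value near 0 indicates a weak linear relationship, while values near -1 or 1 indicate a strong one. Note that a weak linear correlation does not rule out a non-linear relationship, such as risk rising at both very low and very high sleep times.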