
Paper deep dive

A Hybrid AI and Rule-Based Decision Support System for Disease Diagnosis and Management Using Labs

Muhammad Hammad Maqsood, Mubashir Sajid, Khubaib Ahmed, Muhammad Usamah Shahid, Muddassar Farooq

Year: 2026 · Venue: arXiv preprint · Area: cs.AI · Type: Preprint · Embeddings: 26

Intelligence

Status: succeeded | Model: google/gemini-3.1-flash-lite-preview | Prompt: intel-v1 | Confidence: 98%

Last extracted: 3/22/2026, 5:15:56 AM

Summary

This paper presents a hybrid Clinical Decision Support System (CDSS) that integrates AI predictive modeling (XGBoost) with a rule-based expert system to assist physicians in disease diagnosis. By analyzing laboratory results, the system provides a ranked list of likely diagnoses and confirms them using clinically validated rules, while utilizing SHAP values to provide explainability for AI-driven inferences.

Entities (5)

Clinical Decision Support System · system · 100%
ICD-10 · classification-standard · 100%
SHAP · explainability-method · 100%
XGBoost · algorithm · 100%
CureMD · organization · 95%

Relation Signals (3)

Clinical Decision Support System assign ICD-10

confidence 100% · This component assigns International Classification of Diseases (ICD-10) codes to patients

Clinical Decision Support System integrates XGBoost

confidence 100% · The system utilizes an XGBoost multi-class classifier to compute the probability of various potential diagnoses

Clinical Decision Support System uses SHAP

confidence 100% · Prediction Explanation via SHAP Values. By quantifying the impact of each feature on the inference

Cypher Suggestions (2)

Find all algorithms used by the CDSS · confidence 95% · unvalidated

MATCH (c:System {name: 'Clinical Decision Support System'})-[:INTEGRATES]->(a:Algorithm) RETURN a.name

List all classification standards supported by the system · confidence 90% · unvalidated

MATCH (c:System)-[:ASSIGN]->(s:ClassificationStandard) RETURN s.name

Abstract

Abstract: This research paper outlines the development and implementation of a novel Clinical Decision Support System (CDSS) that integrates AI predictive modeling with medical knowledge bases. It utilizes the quantifiable information elements in lab results for inferring likely diagnoses a patient might have. Subsequently, it suggests investigations to confirm the likely diagnoses, serving as an assistive tool for physicians. The system fuses knowledge contained in a rule-based expert system with inferences of data-driven predictors based on the features in labs. The data for 593,055 patients was collected from 547 primary care centers across the US to model our decision support system and derive Real-World Evidence (RWE) to make it relevant for a large demographic of patients. Our rule base comprises clinically validated rules, modeling 59 health conditions that can directly confirm one or more diseases and assign ICD-10 codes to them. The Likely Diagnosis system uses multi-class classification, covering 37 ICD-10 codes, which are grouped together into 11 categories based on the labs that physicians prescribe to confirm the diagnosis. This research offers a novel system that assists a physician by utilizing the medical profile of a patient and routine lab investigations to predict a group of likely diseases and then confirm them, coupled with providing explanations for inferences, thereby assisting physicians to reduce misdiagnosis of patients in clinical decision-making.

Tags

ai-safety (imported, 100%) · csai (suggested, 92%) · preprint (suggested, 88%)

Links

Open PDF directly →

Full Text

25,222 characters extracted from source content.


A Hybrid AI and Rule-Based Decision Support System for Disease Diagnosis and Management Using Labs

Muhammad Hammad Maqsood, Mubashir Sajid, Khubaib Ahmed, Muhammad Usamah Shahid, and Muddassar Farooq

CureMD Research, 80 Pine St 21st Floor, New York, NY 10005, United States
hammad.maqsood, mubashir.sajid, khubaib.ahmed, muhammad.usamah, muddassar.farooq@curemd.com
https://w.curemd.com/

Abstract. This research paper outlines the development and implementation of a novel Clinical Decision Support System (CDSS) that integrates AI predictive modeling with medical knowledge bases. It utilizes the quantifiable information elements in lab results for inferring likely diagnoses a patient might have. Subsequently, it suggests investigations to confirm the likely diagnoses, serving as an assistive tool for physicians. The system fuses knowledge contained in a rule-based expert system with inferences of data-driven predictors based on the features in labs. The data for 593,055 patients was collected from 547 primary care centers across the US to model our decision support system and derive Real-World Evidence (RWE) to make it relevant for a large demographic of patients. Our rule base comprises clinically validated rules, modeling 59 health conditions that can directly confirm one or more diseases and assign ICD-10 codes to them. The Likely Diagnosis system uses multi-class classification, covering 37 ICD-10 codes, which are grouped together into 11 categories based on the labs that physicians prescribe to confirm the diagnosis. This research offers a novel system that assists a physician by utilizing the medical profile of a patient and routine lab investigations to predict a group of likely diseases and then confirm them, coupled with providing explanations for inferences, thereby assisting physicians to reduce misdiagnosis of patients in clinical decision-making.
Keywords: Clinical Decision Making · Hybrid CDSS · Laboratory Data Analysis · AI in Healthcare

arXiv:2603.14876v1 [cs.AI] 16 Mar 2026

1 Introduction

Clinical Decision Support Systems (CDSS) are vital tools in modern healthcare, designed to enhance medical decisions by leveraging knowledge and patients' data to provide insights. These systems help physicians use RWE derived from real-world EMR data. They integrate data from a variety of sources to recommend actionable interventions, improving patient outcomes and streamlining healthcare processes. They assist healthcare providers by offering decision-support tools that are directly integrated into the clinical workflows of EMR systems. CDSS have been shown to improve value-based outcomes for patients by reducing the cost of healthcare services. CDSS tools have been proposed that either utilize predictive modeling or draw inference using the rules in a medical rule-base system [14].

Despite advances in technology, many clinical decision support systems remain restricted, primarily utilizing either rule-based algorithms or classical artificial intelligence (AI) techniques without fully integrating both approaches to harness their combined potential [4].

Knowledge-driven rule-based systems have historically enjoyed higher acceptability among physicians, as the rules can be clinically validated by providers. They incorporate domain knowledge directly into the system and assist in the decision making of a physician. In these systems, modeling knowledge representation and upgrading it continuously is a significant challenge; as a result, gaps in care emerge in these systems [10] as medical knowledge and associated technologies advance. Machine learning (ML) and big data analytics have enhanced the sophistication of CDSS systems.
Modern CDSS leverage ML models to offer predictive analytics and patient-centered recommendations [2][4]. These ML algorithms are able to achieve higher accuracy for a number of chronic diseases. Clinicians are, however, reluctant to trust these systems because of their black-box way of making inferences. The trust and acceptability of predictive ML models in healthcare is increasing [1], especially as assistive tools, for models that also explain their inference method. In this paper, we make efforts to add an explainability layer to all ML tools. The major contributions of this research include: (1) the fusion of a knowledge base with ML models to design a hybrid CDSS; (2) utilization of quantifiable information in laboratory data for designing likely-diagnosis models for patients; and (3) explanation of inferences to a physician using SHAP values, enabling informed decisions with the assistance of AI models.

2 The Proposed CDSS

2.1 System Overview

This paper introduces a CDSS that integrates Machine Learning (ML) with a rule-based framework to provide a robust support mechanism for diagnosing a wide range of diseases, with a focus on interpretability of the system, shown in Figure 1. It has the following components.

Diagnosis Confirming Module. This component assigns International Classification of Diseases (ICD-10) codes to patients based on a disease-specific rule base, built from the clinical guidelines of relevant associations or institutes, by factoring in demographics and laboratory results.

Likely Diagnosis Assistive Module. By analyzing laboratory results, which offer a quantifiable snapshot of a patient's health profile, this component predicts potential diagnoses and helps physicians focus their diagnostic efforts. Park et al. have demonstrated the potential for predicting diagnosis using laboratory results with the help of multi-class classification, achieving an F1 score of 0.76 [13].
Using multi-class classification on grouped diseases allows physicians to maintain focus during the process of diagnosis confirmation among a group of likely diagnosis options. For diseases that are deemed highly likely, the system also recommends further investigations to confirm these diagnoses, ensuring that the new information aids patient-centered decision-making in compliance with the clinical guidelines.

Fig. 1: Proposed Clinical Decision Support System
Fig. 2: CDSS Workflow

CDSS Output

1. Rule-Based Diagnosis Confirmation. The CDSS confirms a diagnosis based on a predefined rule set that is applied to lab results and outputs ICD-10 codes.
2. Multiclass Probabilities (Likely Diagnosis). The system utilizes an XGBoost multi-class classifier to compute the probability of various potential diagnoses based on lab data and presents a ranked list of likely diagnoses to clinicians.
3. Prediction Explanation via SHAP Values. By quantifying the impact of each feature on the inference, SHAP values provide clinicians with explainable insights into the rationale behind diagnostic inferences [16] and help develop trust, essential for mainstream adoption of ML-based CDSS [8].
4. Recommendation for Follow-Up Labs. When a likely diagnosis is identified, the system suggests follow-up investigations that could confirm the diagnosis, reducing the misdiagnosis rate.

3 Data Overview

The dataset utilized for developing and validating the proposed Clinical Decision Support System (CDSS) uses demographics (age and gender) of patients and laboratory results extracted from anonymized and de-identified CureMD EHR data, from 593,055 electronic medical records, making it a larger dataset than studies reported earlier in the literature, whose sample sizes range from 50,000 to 100,000 EMR records [7]. These data sources are curated to maintain high standards of privacy and confidentiality, adhering to HIPAA guidelines.
For instance, a systematic review of hospital length of stay prediction tools found a median sample size of around 53,211 records [7].

1. Geographical and Demographic Distribution. The data encompasses a wide distribution across different states, medical practices and demographics (age and gender) (see Figures 5 and 6 in the Appendix). The distribution of race is similar to the actual distribution of races [9] in the United States of America. This diversity mirrors actual real-world data, ensuring the system's applicability across various healthcare settings and the trustworthiness of the RWE derived.
2. Disease Distribution. The dataset includes a varied distribution of diseases, as seen in Table 1, allowing the system to handle multiple diagnostic scenarios and adapt to a broad spectrum of medical conditions.

To ensure a robust and relevant participant sample, eligibility criteria were established, and patients who had undergone laboratory tests within the specified one-year period prior to a potential diagnosis were selected. Patients without age and demographic information were filtered out. The first occurrence of the diagnosis was used as a reference date for training. For normal patients, labs in the latest one-year period were used.

3.1 Data Preparation

Apache Spark, via PySpark, was utilized to efficiently handle and process the dataset in our on-premise state-of-the-art lakehouse. Along with lab values, age and gender were used as features, since interpretation of labs requires age and gender [6][15]. Lab tests including CBC, CMP, lipid panel and liver function panel were also used. The dataset includes laboratory data collected over a 23-year period from 2000 to 2023. For the purpose of diagnosis, we consider only the laboratory data from a one-year period immediately preceding the confirmed diagnosis of a disease.
This time frame is selected to capture the most relevant biochemical changes that might indicate the onset of a disease, ensuring that the lab results used are recent and relevant [17][5]. Normal patients are selected on the basis that they have not been diagnosed with any of the ICD-10 codes in the pool of diseases. ICD-10 codes for classification were chosen and grouped together as shown in Table 1 based on feedback from medical professionals, leveraging expert clinical insights to inform the modeling process. Outliers were identified and removed based on standard deviation to clean clinical errors in data reporting, which was verified with the help of visual tools such as violin plots. Lab names were initially mapped using the IMO API to remove any errors in them, then further refined by mapping identical lab tests onto one key. This ensured that a specific lab was not repeated in the pool under a slightly different name. After mapping the lab names, the lab units were mapped and checked for consistency; values with inconsistent units were ignored. The mappings of both lab names and lab units were verified by clinical experts.

4 Methodology

4.1 Diagnosis Confirmation Methodology

The Diagnosis Confirmation System is designed to utilize lab results to confirm patient diagnoses.
It is based on a rule-based expert system, where each rule within the rule base can confirm a diagnosis if the conditions in its antecedents are met. These rules and conditions were extracted by a panel of physicians from the clinical guidelines. The rules look for anomalies in the lab reports according to the clinical guidelines, and then confirm a diagnosis.

Table 1: Disease Distribution

Disease | ICD-10 Codes | Count
Upper Respiratory Tract Infections (URTI) | J02, J03, J32, J31, J01, J06, J00 | 102,802
Gastroesophageal Reflux Disease (GERD) | K29, K21, K30, K20, K25 | 50,086
Lung diseases | J45, J44, J43, J42, J40, J20 | 49,121
Type 2 diabetes mellitus (T2DM) | E11 | 30,403
Anemia | D64, D63, D50 | 60,784
Kidney diseases | N18, N17 | 25,553
Hypothyroidism | E00, E01, E02, E03 | 24,084
Ischemic Heart disease (IHD) | I20, I21, I25 | 15,553
Dyslipidemia | E78 | 77,558
Vitamin D Deficiency | E55 | 69,122
Disorders of White Blood Cells | D70, D71, D72, D76 | 15,346
Normal | - | 72,643

Rule Base Design: Each rule in the rule base comprises multiple conditions that must all be true for the rule to confirm a diagnosis. Each condition comprises:

– The lab test identifier
– The comparison value
– The unit of measurement
– The type of comparison (e.g., greater than, less than, equal to)

Currently, the system includes rules for 59 health conditions, and we plan to expand it by incorporating rules from guidelines, extracted by our team of clinical analysts. The patient's lab results are evaluated against the stored rules to confirm a diagnosis. This process is fully automated and provides a robust framework for assigning ICD-10 codes based on the latest lab data.

4.2 Likely Diagnosis Methodology

Recognizing that all lab results might not be available, we train AI/ML models to infer the set of likely diagnoses based on available information without using any imputation. XGBoost was used for multi-class classification since it inherently manages missing values [3].
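The condition structure described in Section 4.1 (lab identifier, comparison value, unit, comparison type) can be sketched as a minimal rule evaluator. This is an illustrative assumption, not the paper's implementation: the class names, the `confirm` function, and the glucose threshold rule are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

# Comparison types supported by a condition (illustrative subset).
OPS: dict[str, Callable[[float, float], bool]] = {
    ">": lambda observed, ref: observed > ref,
    "<": lambda observed, ref: observed < ref,
    "==": lambda observed, ref: observed == ref,
}

@dataclass
class Condition:
    lab_id: str   # the lab test identifier
    value: float  # the comparison value
    unit: str     # the unit of measurement
    op: str       # the type of comparison

@dataclass
class Rule:
    icd10: str
    conditions: list[Condition]

def confirm(rule: Rule, labs: dict[str, tuple[float, str]]) -> bool:
    """A rule fires only if every condition matches an available lab
    result with a consistent unit; otherwise the diagnosis is not confirmed."""
    for c in rule.conditions:
        if c.lab_id not in labs:
            return False
        observed, unit = labs[c.lab_id]
        if unit != c.unit or not OPS[c.op](observed, c.value):
            return False
    return True

# Hypothetical rule: elevated fasting glucose suggests T2DM (E11).
t2dm = Rule("E11", [Condition("glucose_fasting", 125.0, "mg/dL", ">")])
print(confirm(t2dm, {"glucose_fasting": (140.0, "mg/dL")}))  # True
print(confirm(t2dm, {"glucose_fasting": (100.0, "mg/dL")}))  # False
```

Because every condition must hold, a rule with any missing or unit-inconsistent lab simply does not fire, which matches the paper's conjunctive "all conditions must be true" design.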
It is also explainable using SHAP [11], and it outperforms bagging ensembles [12].

Data Utilization and Partitioning: 80% of the total dataset was allocated for training and internal validation purposes. A stratified splitting approach was used to ensure that each split is representative of the overall population distribution. Cross-validation was performed using 5-fold cross-validation. The remaining 20% served as a standalone test set to evaluate the model's performance. A grid-search cross-validation approach was used to tune hyper-parameters including max_depth, learning_rate, n_estimators, gamma, subsample, colsample_bytree, and reg_lambda.

Class Imbalance: During experimentation, classes were first balanced by undersampling to the count of the smallest disease group. This resulted in a loss of data and a decrease in the performance of our model. Then weights for each class were initialized based on the inverse frequency of the disease in the training data. Through empirical studies, a heuristic was found: taking the square root of these weights yielded the best results. This approach helped enhance model sensitivity for less prevalent conditions without compromising the overall accuracy of the inference system.

5 Results and Discussion

We used the Top-N criterion (the N most probable diseases) from [13] to evaluate the accuracy of our likely-diagnosis multi-class model. In this criterion, we check whether the correctly diagnosed disease is within the N highest-probability diseases. We also used confusion matrices to check the precision of our models. We then compared the distribution of our predictions among diseases to the true distribution of diseases in the EMR data.

Table 2: Top-N Accuracy (%)

N        | 1     | 2     | 3     | 4     | 5     | 6     | 7     | 8     | 9     | 10    | 11
Accuracy | 31.18 | 52.65 | 66.43 | 76.00 | 83.10 | 88.43 | 92.43 | 95.40 | 97.49 | 98.87 | 99.6

5.1 Accuracy and Recall

Table 2 shows the accuracy results of the model for different values of N.
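The class-weighting heuristic described above (inverse-frequency weights, softened by a square root) can be sketched as follows. The function name, the toy label array, and the per-sample expansion are assumptions for illustration; in practice such a vector could be passed as `sample_weight` to `xgboost.XGBClassifier.fit`.

```python
import numpy as np

def sample_weights(y: np.ndarray) -> np.ndarray:
    """Square root of inverse-frequency class weights, expanded per sample."""
    classes, counts = np.unique(y, return_counts=True)
    inv_freq = len(y) / counts          # inverse frequency per class
    w = np.sqrt(inv_freq)               # the square-root heuristic
    per_class = dict(zip(classes, w))
    return np.array([per_class[label] for label in y])

# Imbalanced toy labels: class 0 is common, class 2 is rare.
y = np.array([0, 0, 0, 0, 1, 1, 2])
w = sample_weights(y)
print(w.round(3))  # rarer classes receive larger weights
```

The square root keeps rare-disease weights large enough to improve sensitivity, but smaller than raw inverse frequencies, which (per the paper's empirical finding) would otherwise over-penalize the majority classes.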
As can be seen from Table 2, the 80 percent threshold is achieved when the Top 5 approach is used. This threshold serves as an optimal trade-off between the number of predicted diseases and the diminishing gain in accuracy as the disease set is extended beyond 5. Table 3 shows the recall of individual diseases. Disease groups like URTI, Dyslipidemia and Anemia are predicted in the Top 5 with recalls of 0.943, 0.866 and 0.860, respectively. Even in the Top 1 results, the model's best performance is in distinguishing disease patients from normal patients, reaching a very high recall of 0.916 in the Top 5. The Top 5 approach is considered here since our goal is to predict the likely diagnoses, recommend the subsequent lab investigations, and confirm them using our rule base.

Table 3: Recall rates for various diseases at different rankings

Disease        | Top 1 | Top 2 | Top 3 | Top 4 | Top 5
URTI           | 0.332 | 0.629 | 0.810 | 0.902 | 0.943
GERD           | 0.235 | 0.384 | 0.504 | 0.659 | 0.802
Lung Disease   | 0.124 | 0.289 | 0.500 | 0.686 | 0.827
T2DM           | 0.202 | 0.356 | 0.493 | 0.596 | 0.685
Anemia         | 0.429 | 0.639 | 0.758 | 0.825 | 0.860
KD             | 0.359 | 0.599 | 0.708 | 0.776 | 0.815
Hypothyroidism | 0.171 | 0.279 | 0.375 | 0.451 | 0.528
IHD            | 0.126 | 0.213 | 0.308 | 0.397 | 0.488
Dyslipidemia   | 0.303 | 0.541 | 0.688 | 0.788 | 0.866
VD deficiency  | 0.310 | 0.542 | 0.668 | 0.752 | 0.823
DWBC           | 0.297 | 0.454 | 0.552 | 0.620 | 0.674
Normal         | 0.495 | 0.726 | 0.824 | 0.875 | 0.916

5.2 Confusion Matrices

The precision of the multi-class model trained for likely diagnosis is presented in Figure 4. It shows the density with which a disease is predicted correctly, which helps to observe what label is being predicted for which disease. It can be observed that the label 'normal' has the best results in the sense of being predicted as normal. Furthermore, the color density shows that the predicted label is the same as the true label for most of the diseases in the disease set, as is evident from the color density along the diagonal.
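The Top-N criterion used in Section 5 can be sketched in a few lines: a prediction counts as correct if the true class appears among the N highest-probability classes. The toy probability matrix and labels below are illustrative only.

```python
import numpy as np

def top_n_accuracy(probs: np.ndarray, y_true: np.ndarray, n: int) -> float:
    """Fraction of rows whose true class is among the n most probable classes."""
    # Sort class indices by descending probability, keep the first n per row.
    top_n = np.argsort(probs, axis=1)[:, ::-1][:, :n]
    hits = np.any(top_n == y_true[:, None], axis=1)
    return float(hits.mean())

probs = np.array([[0.6, 0.3, 0.1],
                  [0.2, 0.5, 0.3],
                  [0.1, 0.2, 0.7]])
y = np.array([1, 2, 2])  # true class per row
print(top_n_accuracy(probs, y, 1))  # only the third row is a Top-1 hit
print(top_n_accuracy(probs, y, 2))  # all three rows are Top-2 hits
```

Applied to the output of a multi-class probability model, sweeping n from 1 upward reproduces a Top-N accuracy curve of the kind reported in Table 2.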
It can be seen that some URTI patients are classified as lung disease patients, which makes sense, as both disease groups are related to the lungs and have a similar set of symptoms. For the other labels, the model is very precise, as can be seen from the color intensity in the center of the graph.

5.3 Prediction Distribution

Figure 3 shows the distribution of predicted labels compared to the true labels. When considering the Top 1, the predictions of our model match the true prevalence of these diseases in the EMR data very closely. For example, 15.68% of patients (approximately 18.5k) in the test data have URTI, and our model predicts a prevalence of URTI of 17.33%. This allows us to ensure that our model is inferring the correct group of diseases amongst the different disease groups, and is not over-predicting or under-predicting for any particular disease group. The distribution of weights while training the model helped bring the true vs. predicted counts closer to one another. The maximum difference can be seen in the case of Lung Disease, and it is lower than 3%.

6 Conclusion

In this paper, we propose a novel disease inference engine that fuses the knowledge of a rule-based expert system with ML prediction models to overcome the shortcomings of each. The likely-diagnosis system serves as an assistant to a physician, helping in correctly diagnosing a patient and in subsequent management according to the guidelines of the respective institutes and associations. The physician can then order further lab investigations to confirm diagnoses in the likely set of diagnoses. When the lab results are received in the EMR system, the expert system confirms the diagnosis based on the rules.

Fig. 3: Actual vs Predicted Counts with Percentages
Fig. 4: Precision in disease forecasting

For making the system explainable to a physician, the mean SHAP values (Appendix B) were used and the learned trends were validated. The mean SHAP values show the overall trend learned by the model, while the SHAP values for individual patient data (Figure 9) help explain the model's decision for a patient. This provides a physician with the reasoning behind why the inference model gives a certain output and can gain the physician's trust, if the explanations are grounded in medical knowledge. By optimizing diagnosis accuracy and reducing redundant, unnecessary tests, the proposed system can become a component of a CDSS and can significantly enhance its operational efficiency by reducing healthcare costs. Its ability to handle multiple diseases and integrate new information makes it a robust system that can adapt to various clinical needs, thus broadening its applicability and generalizability.

Limitations and Future Work

The study, while robust in its achievements, faces several limitations that could affect its application and generalizability. The scope of the research is limited to using laboratory data as the primary source of information for making predictions. While lab results are helpful for many conditions, the exclusion of other critical factors such as vital signs, social history, and patient-reported symptoms could limit the comprehensiveness and accuracy of the predictions. Reliance solely on lab investigations might introduce biases where certain diseases that manifest predominantly through other clinical signs or patient histories are underrepresented or inaccurately predicted.

The future direction of research includes using lab-specific data in a time window in which lab results are more meaningful.
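The two kinds of SHAP explanation described here (a global mean trend and a per-patient breakdown) reduce to a simple aggregation once SHAP values are available. In the sketch below the SHAP matrix is toy data and the feature names are hypothetical; in a real pipeline such values would come from a SHAP explainer (e.g. `shap.TreeExplainer`) applied to the trained XGBoost model.

```python
import numpy as np

# Toy SHAP values: rows are patients, columns are features (illustrative).
features = ["glucose", "hba1c", "age"]
shap_values = np.array([[ 0.8,  1.2, -0.1],
                        [-0.4,  0.9,  0.2],
                        [ 0.6, -1.1,  0.0]])

# Global explanation: mean absolute impact of each feature across patients,
# i.e. the quantity behind a mean-SHAP summary plot.
global_importance = np.abs(shap_values).mean(axis=0)
ranking = [features[i] for i in np.argsort(global_importance)[::-1]]
print(ranking)  # features ordered by overall influence

# Local explanation: signed per-feature contributions for one patient,
# showing which labs pushed this patient's prediction up or down.
patient0 = dict(zip(features, shap_values[0]))
print(patient0)
```

The global ranking corresponds to the "overall trend" plots in Appendix B, while the signed per-patient dictionary corresponds to the individual risk-feature explanation shown to a physician.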
Beyond recommending follow-up labs, the system will be enhanced to include recommendations for medications, procedures, surgeries, and treatment pathways. This would assist in managing patients using a systems medicine approach.

References

1. Julia Amann, Alessandro Blasimme, Effy Vayena, Dietmar Frey, Vince I Madai, and Precise4Q Consortium. Explainability for artificial intelligence in healthcare: a multidisciplinary perspective. BMC Medical Informatics and Decision Making, 20:1–9, 2020.
2. Geir Thore Berge, Ole-Christoffer Granmo, Tor Oddbjørn Tveit, Bjørn Erik Munkvold, AL Ruthjersen, and J Sharma. Machine learning-driven clinical decision support system for concept-based searching: a field trial in a Norwegian hospital. BMC Medical Informatics and Decision Making, 23(1):5, 2023.
3. Tianqi Chen and Carlos Guestrin. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 785–794, 2016.
4. Zhao Chen, Ning Liang, Haili Zhang, Huizhen Li, Yijiu Yang, Xingyu Zong, Yaxin Chen, Yanping Wang, and Nannan Shi. Harnessing the power of clinical decision support systems: challenges and opportunities. Open Heart, 10(2):e002432, 2023.
5. Goce Dimeski and Oliver Treacy. Biochemical tests for diagnosing and evaluating stages of chronic kidney disease. In Chronic Kidney Disease-Beyond the Basics. IntechOpen, 2022.
6. Anthony Fenton, Emma Montgomery, Peter Nightingale, A Michael Peters, Neil Sheerin, A Caroline Wroe, and Graham W Lipkin. Glomerular filtration rate: new age- and gender-specific reference ranges and thresholds for living kidney donation. BMC Nephrology, 19:1–8, 2018.
7. Swapna Gokhale, David Taylor, Jaskirath Gill, Yanan Hu, Nikolajs Zeps, Vincent Lequertier, Luis Prado, Helena Teede, and Joanne Enticott.
Hospital length of stay prediction tools for all hospital admissions and general medicine populations: systematic review and meta-analysis. Frontiers in Medicine, 10, 2023.
8. Caroline Jones, James Thornton, and Jeremy C Wyatt. Enhancing trust in clinical decision support systems: a framework for developers. BMJ Health & Care Informatics, 28(1), 2021.
9. Nicholas Jones, Rachel Marks, Roberto Ramirez, and Merarys Ríos-Vargas. 2020 census illuminates racial and ethnic composition of the country. United States Census Bureau, 12, 2021.
10. Guilan Kong, Dong-Ling Xu, Xinbao Liu, and Jian-Bo Yang. Applying a belief rule-base inference methodology to a guideline-based clinical decision support system. Expert Systems, 26(5):391–408, 2009.
11. Ziqi Li. Extracting spatial effects from machine learning model using local interpretation method: An example of SHAP and XGBoost. Computers, Environment and Urban Systems, 96:101845, 2022.
12. Adeola Ogunleye and Qing-Guo Wang. XGBoost model for chronic kidney disease diagnosis. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 17(6):2131–2140, 2019.
13. Dong Jin Park, Min Woo Park, Homin Lee, Young-Jin Kim, Yeongsic Kim, and Young Hoon Park. Development of machine learning model for diagnostic disease prediction based on laboratory tests. Scientific Reports, 11(1):7567, 2021.
14. Reed T Sutton, David Pincock, Daniel C Baumgart, Daniel C Sadowski, Richard N Fedorak, and Karen I Kroeker. An overview of clinical decision support systems: benefits, risks, and strategies for success. NPJ Digital Medicine, 3(1):17, 2020.
15. Sakon Suwanrungroj, Parichart Pattarapanitchai, Sirinart Chomean, and Chollanot Kaset. Establishing age and gender-specific serum creatinine reference ranges for Thai pediatric population. PLOS ONE, 19(3):e0300369, 2024.
16. Viswan Vimbi, Noushath Shaffi, and Mufti Mahmud.
Interpreting artificial intelligence models: a systematic review on the application of LIME and SHAP in Alzheimer's disease detection. Brain Informatics, 11(1):10, 2024.
17. Xiaoxia Wen, Ping Leng, Jiasi Wang, Guishu Yang, Ruiling Zu, Xiaojiong Jia, Kaijiong Zhang, Birga Anteneh Mengesha, Jian Huang, Dongsheng Wang, et al. Clinlabomics: leveraging clinical laboratory data by data mining strategies. BMC Bioinformatics, 23(1):387, 2022.

A Data Distributions

A.1 Race Distribution
Fig. 5: Race Distribution of Data

A.2 Statewise Data Distribution
Fig. 6: Distribution of Patients by States

B SHAP Plots

Fig. 7: SHAP summary plot for T2DM
Fig. 8: Summary plot showing absolute impact of feature on decision
Fig. 9: Important risk features and their impact for a given patient