Kawasaki disease (KD) is the leading cause of acquired heart disease in children. Prompt treatment can effectively lower the risk of severe complications, such as coronary aneurysms. However, accurately diagnosing KD at an early stage is difficult because its pathogenesis is unknown and it lacks pathognomonic features. In this study, we investigated data-driven approaches to early KD assessment using a cohort of 10,367 patients extracted from electronic health records. The clinical data are incomplete, and the missing values follow group-based patterns associated with different clinical assessment measures. To address this problem, we developed a method that integrates feature clustering, which enables a matrix-based representation, with convolutional neural networks (CNNs) for feature extraction and fusion, explicitly exploiting the multi-source data structure. Combined with missing data imputation methods, the proposed method achieved superior accuracy (an AUC of 0.97) compared with a number of benchmark methods. Our study highlights the feasibility of matrix-based feature representation and CNN-based feature extraction for mining incomplete clinical data to support medical decision-making, and shows the method's potential to improve clinical data mining.
Keywords: Clinical data mining; Convolutional neural networks; Electronic health records; Medical decision making