Abstract: The global rise in Emergency Department (ED) visits, exacerbated by COVID-19, has presented multiple challenges in recent years. Electronic Health Records (EHRs), as comprehensive digital repositories of patient health information, offer a pathway to constructing prediction systems that address these challenges. However, the heterogeneity of EHRs complicates accurate prediction. A notable challenge is the prevalence of high-cardinality nominal features (NFs) in EHRs. Because of their numerous distinct values, these features are often excluded from analysis, risking information loss and reduced accuracy and interpretability. This study proposes a framework that integrates a preprocessing technique with target encoding (TE-PrepNet) into machine learning (ML) models to address the challenges posed by NFs in MIMIC-IV-ED. We evaluate the performance of TE-PrepNet on two ED-based prediction tasks: triage-based hospital admission and ED reattendance within 72 hours at discharge time. Incorporating three NFs,