SMU Data Science Review


The federal Department of Health and Human Services spends approximately $830 billion annually on Medicare, of which an estimated $30 to $110 billion is lost to some form of fraud, waste, or abuse (FWA). Despite the federal government’s ongoing auditing efforts, FWA remains rampant, and modern machine learning approaches are needed to generalize and detect its patterns. Novel machine learning algorithms offer hope of detecting FWA. The Centers for Medicare & Medicaid Services (CMS) compiles publicly accessible datasets that contain vast quantities of structured data. These data, coupled with industry-standardized billing codes, provide many opportunities to apply machine learning to FWA detection. This research aims to develop a new machine learning model that generalizes the patterns of FWA in Medicare. This task is accomplished by linking provider and payment data with the List of Excluded Individuals and Entities to train an Isolation Forest algorithm on previously fraudulent behavior. Results indicate that anomalous instances occur in 0.2% of all analyzed claims, demonstrating machine learning models’ predictive ability to detect FWA.
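
The linking and scoring steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the records are synthetic, the field names (`npi`, `total_payment`, `claims`) and the two-feature setup are assumptions made for illustration, and the isolation forest is a small standard-library version of the general technique, fit on all rows, with the exclusion flag used only to check which flagged providers appear on the exclusion list.

```python
import math
import random

EULER_GAMMA = 0.5772156649

def c_factor(n):
    """Expected path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n

def build_tree(rows, depth, max_depth, rng):
    """Recursively split on a random feature at a random threshold."""
    if depth >= max_depth or len(rows) <= 1:
        return ("leaf", len(rows))
    dim = rng.randrange(len(rows[0]))
    lo = min(r[dim] for r in rows)
    hi = max(r[dim] for r in rows)
    if lo == hi:
        return ("leaf", len(rows))
    split = rng.uniform(lo, hi)
    left = [r for r in rows if r[dim] < split]
    right = [r for r in rows if r[dim] >= split]
    return ("node", dim, split,
            build_tree(left, depth + 1, max_depth, rng),
            build_tree(right, depth + 1, max_depth, rng))

def path_length(tree, x, depth=0):
    """Depth at which x lands, plus the expected extra depth for an unsplit leaf."""
    if tree[0] == "leaf":
        return depth + c_factor(tree[1])
    _, dim, split, left, right = tree
    return path_length(left if x[dim] < split else right, x, depth + 1)

def anomaly_scores(rows, n_trees=100, sample_size=256, seed=0):
    """Score each row in (0, 1); short average paths give scores closer to 1."""
    rng = random.Random(seed)
    m = min(sample_size, len(rows))
    max_depth = math.ceil(math.log2(m))
    trees = [build_tree(rng.sample(rows, m), 0, max_depth, rng)
             for _ in range(n_trees)]
    return [2.0 ** (-(sum(path_length(t, x) for t in trees) / n_trees) / c_factor(m))
            for x in rows]

# Synthetic provider payment records (hypothetical fields, not CMS data).
data_rng = random.Random(42)
payments = [{"npi": f"N{i:03d}",
             "total_payment": data_rng.gauss(50_000, 8_000),
             "claims": data_rng.gauss(400, 60)} for i in range(200)]
payments += [{"npi": "X001", "total_payment": 400_000.0, "claims": 3_000.0},
             {"npi": "X002", "total_payment": 250_000.0, "claims": 2_000.0}]

# Synthetic stand-in for the exclusion list, keyed by provider identifier.
excluded_npis = {"X001", "X002"}

# Link step: attach an exclusion flag to every payment record.
linked = [dict(p, excluded=p["npi"] in excluded_npis) for p in payments]

# Score step: rank providers by anomaly score and flag the top two.
features = [(p["total_payment"], p["claims"]) for p in linked]
scores = anomaly_scores(features)
ranked = sorted(zip(scores, linked), key=lambda pair: pair[0], reverse=True)
flagged = [p["npi"] for _, p in ranked[:2]]
print("flagged:", flagged)
print("flagged and excluded:", sorted(excluded_npis & set(flagged)))
```

In practice a maintained implementation such as scikit-learn's `IsolationForest` would likely replace the hand-written forest, and the number of flagged rows would follow from a contamination threshold rather than a fixed top-two cutoff.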

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial 4.0 License