Ramgopal Prajapat:

Learnings and Views

Fraud Analytics in Insurance

By: Ram on Jul 08, 2020

Fraud has been a significant cost drain for organizations across industries, and insurance is no different.

“Total cost of insurance fraud (excluding health insurance) is more than $40 billion per year in the US alone” - https://www.fbi.gov/stats-services/publications/insurance-fraud

Fraud can be initiated by an insurer's employees, by brokers and agents, or by customers (prospective and existing).

Two common types of fraud in Insurance are:

  • Application Fraud
  • Claim Fraud
    • False Claim
    • Inflated Claim


Application or Underwriting Fraud


An applicant furnishes incorrect or false information on a policy application to lower the premium. For example, a driver, Manu, applies for auto insurance and fills out the application form, which asks for his age and drinking habits. He provides false answers so that his premium is lower. In another scenario, he provides incorrect information about a previous policy. Both are examples of Application Fraud.

Insurance providers need a mechanism to tag applications as potentially fraudulent based on the information the applicant provides. An accurate predictive or machine learning model can estimate the probability that an application is fraudulent.
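As a rough illustration of what "estimating the probability of application fraud" can look like, here is a minimal sketch of a logistic regression model written in plain Python. Everything in it is an assumption made for illustration: the two features (whether the declared age and the declared prior-policy details conflict with an external record), the tiny made-up training set, and the function names. A real model would use far more data and features.

```python
import math

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def train_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on log loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(bias + sum(w * v for w, v in zip(weights, x))) - y
            for j, v in enumerate(x):
                grad_w[j] += err * v
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def fraud_probability(x, weights, bias):
    """Estimated probability that an application is fraudulent."""
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))

# Hypothetical, made-up history: [age_mismatch, prior_policy_mismatch]
# 1 = the declared detail conflicted with an external record, 0 = it matched.
applications = [[0, 0], [0, 0], [0, 1], [1, 0], [1, 1], [1, 1], [0, 0], [1, 1]]
was_fraud = [0, 0, 0, 0, 1, 1, 0, 1]

weights, bias = train_logistic(applications, was_fraud)
p_clean = fraud_probability([0, 0], weights, bias)
p_suspect = fraud_probability([1, 1], weights, bias)
print(f"no mismatches: {p_clean:.2f}, both mismatched: {p_suspect:.2f}")
```

The output is a score between 0 and 1 per application, which the insurer can then compare against a cut-off to decide which applications to review.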


Claim Fraud

Some customers misuse systems and processes by providing incorrect information for financial gain. Claim Fraud occurs when a customer applies for a reimbursement or claim they are not eligible for. The customer may submit forged documents to justify the claim (a false claim), and in some cases the claimant exaggerates the claim amount (an inflated claim).

For example, a health insurance policyholder, Romney, was admitted to a local hospital. He incurred medical expenses of $3,700 but colluded with hospital personnel to obtain receipts for $5,723. He then filed a claim for the higher amount with his health insurance provider.
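To put a number on the example above, the short sketch below computes how much the claim was inflated. It assumes the true expense is known (for instance, after an investigation), which is usually not the case when the claim is first received; the function name is hypothetical.

```python
def inflation(claimed, actual):
    """Return (excess amount, excess as a fraction of the actual expense)."""
    if actual <= 0:
        raise ValueError("actual expense must be positive")
    excess = claimed - actual
    return excess, excess / actual

# Figures from the example: $3,700 actually incurred, $5,723 claimed.
excess, ratio = inflation(5723, 3700)
print(f"Excess claimed: ${excess} ({ratio:.1%} above the actual expense)")
# prints: Excess claimed: $2023 (54.7% above the actual expense)
```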

The challenge for insurance providers is to balance customer experience on one side against the cost of fraudulent cases on the other.

Due to increased competition and rising customer expectations, insurance providers have to simplify and speed up their application and claim reimbursement processes. If more applications or claims are tagged for decline or review, operational costs go up and so does dissatisfaction among genuine customers.
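The trade-off described above can be sketched with a simple cut-off on a fraud score. The scores, labels, and thresholds below are made up for illustration, and the function name is hypothetical. The point is only that lowering the cut-off catches more fraud but also sends more genuine customers to review.

```python
def review_tradeoff(scores, is_fraud, threshold):
    """Count outcomes when every case scoring >= threshold is sent to review."""
    flagged = [f for s, f in zip(scores, is_fraud) if s >= threshold]
    passed = [f for s, f in zip(scores, is_fraud) if s < threshold]
    return {
        "reviewed": len(flagged),             # drives operational cost
        "genuine_delayed": flagged.count(0),  # genuine customers inconvenienced
        "fraud_caught": flagged.count(1),
        "fraud_missed": passed.count(1),
    }

# Hypothetical fraud scores and true outcomes (1 = fraud) for eight claims.
scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.55, 0.70, 0.90]
is_fraud = [0, 0, 0, 0, 1, 0, 1, 1]

strict = review_tradeoff(scores, is_fraud, threshold=0.3)
lenient = review_tradeoff(scores, is_fraud, threshold=0.6)
print("cut-off 0.3:", strict)
print("cut-off 0.6:", lenient)
```

With this toy data, the 0.3 cut-off reviews five claims, catching all three frauds but delaying two genuine customers, while the 0.6 cut-off reviews only two claims and delays no genuine customer but misses one fraud.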

Also, more and more insurance providers are moving to digital channels, where decisions have to be made faster.

Typically, there are a few ways to improve decision making when identifying fraudulent applications and claim requests: improve the modeling process, select the best modeling technique, and leverage enriched data sources.

Data from government agencies, social media, and external data providers can supply more reliable, and often more accurate, information for predictive models.

When historical information on fraud is available, supervised machine learning and deep learning techniques such as neural networks, random forests, and support vector machines (SVMs) can be used. When it is not, unsupervised or subjective segmentation techniques such as k-means clustering, tree-based anomaly detection (for example, isolation forests, an unsupervised relative of random forests), and Self-Organizing Maps (SOMs) could be tried.
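For the unsupervised case, one simple idea is to cluster claims with k-means and treat claims that sit far from every cluster centre as candidates for review. The sketch below is a plain-Python illustration under stated assumptions: the two features (already scaled), the made-up data, the choice of k = 2, and the use of distance to the nearest centroid as an anomaly score are all illustrative choices, not a recommended production setup.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means; the first k points seed the centroids (deterministic)."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        for c, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids

def anomaly_score(point, centroids):
    """Distance to the nearest centroid; larger means more unusual."""
    return min(math.dist(point, c) for c in centroids)

# Hypothetical scaled features per claim: (claim amount, claims filed so far).
claims = [
    (1.0, 1.0), (5.0, 5.0), (1.2, 0.9), (5.1, 4.8),
    (0.9, 1.1), (4.9, 5.2), (9.0, 1.0),  # the last claim fits neither group
]

centroids = kmeans(claims, k=2)
ranked = sorted(claims, key=lambda p: anomaly_score(p, centroids), reverse=True)
most_unusual = ranked[0]
print("Most unusual claim:", most_unusual)
```

A high score only says a claim is unusual compared with the rest; whether it is actually fraudulent still has to be established by review.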
