In this article originally published by Banking Strategies, PJ Rohall writes about the challenges of synthetic ID fraud and how data and advanced analytics can overcome them.

Synthetic identity fraud has been around for decades in the United States, but has only recently garnered scrutiny from financial institutions and service providers, as well as attention from the media. Rough estimates suggest that financial institutions are losing billions of dollars each year due to synthetic ID fraud, and the numbers appear to be growing. As this threat intensifies, how can financial institutions fight back?

Synthetic ID fraud began escalating when the Social Security Administration started randomizing the issuance of Social Security numbers in 2011. Randomized issuance enabled criminals to commit a new type of fraud by using any nine-digit number combination that did not have credit on file.

Another issue is that the financial industry doesn’t have an agreed-upon definition of synthetic ID fraud, and there is further ambiguity around its various subsets, which include first-party and third-party fraud. Without a common taxonomy, the industry struggles to put together a fully effective response.

Unlike many other types of attacks, synthetic ID fraud doesn’t necessarily victimize anyone immediately. Identities are cobbled together gradually at minimal cost, and fraudulent applications have traditionally been difficult to detect. With a little patience, fraudsters can build up multiple accounts across a range of credit products. When they’re ready to “bust out,” they use their accumulated credit all at once and then disappear, leaving the lender with defaulted loans and nobody to hold accountable.

The problem for financial institutions isn’t just stopping the fraud; it’s also determining where the resulting losses end up. Most synthetic ID fraud is lumped in with a financial institution’s legitimate customer credit defaults, artificially inflating credit risk and leaving little insight into the scope of the real problem. If you can’t quantify the losses, it’s hard to determine how to invest in preventative measures.

This year, the SSA launched the pilot program for electronic, consent-based verification of Social Security numbers, better known by its acronym, eCBSV. This program, targeted for industry-wide rollout in 2021, enables financial institutions to cross-reference specific application information with what the SSA has on file. Although only in its infancy, this program could play a significant role in mitigation strategies. Technology providers and fraud fighters across industries have also developed targeted data sets, advanced analytics and prescribed methodologies to help financial institutions tackle synthetic ID fraud.

The fundamental challenge in stopping synthetic ID fraud is understanding the depth and consistency of the data provided in the application. Can the data be verified over a long period of time, and is it consistent across the multiple data points in the application? Data consortiums can help determine this.

Financial institutions must also be more attuned to the types of inconsistencies in an application that would raise suspicions of synthetic ID fraud. For example, the applicant may use an invalid email address, or the phone number provided may link to multiple name/address combinations. This doesn’t guarantee that the applicant is a criminal, but it does send up red flags. The process of combining those aggregate signals must be streamlined through a single platform where machine learning, link analysis and investigator review can work in conjunction.
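
As a rough illustration of how such application signals might be collected, the Python sketch below lists the red flags present on a single application. The field names, the loose email pattern, the `phone_links` lookup and the one-identity threshold are all assumptions made for this example, not a description of any particular platform; a real system would likely draw on consortium data and trained models rather than fixed rules.

```python
import re
from dataclasses import dataclass

# Deliberately loose syntax check; an assumption for illustration only.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


@dataclass
class Application:
    name: str
    address: str
    email: str
    phone: str


def red_flags(app, phone_links, max_identities=1):
    """Return a list of reasons an application deserves a closer look.

    phone_links is a hypothetical lookup: phone number -> set of
    (name, address) pairs previously seen with that number.
    """
    flags = []
    if not EMAIL_PATTERN.match(app.email):
        flags.append("invalid_email")
    # Count distinct identities tied to the phone, including this applicant.
    identities = set(phone_links.get(app.phone, set()))
    identities.add((app.name, app.address))
    if len(identities) > max_identities:
        flags.append("phone_linked_to_multiple_identities")
    return flags


# Made-up records, purely for illustration.
history = {
    "555-0100": {("A. Smith", "1 Main St"), ("B. Jones", "9 Oak Ave")},
}
suspicious = Application("C. Lee", "4 Elm Rd", "not-an-email", "555-0100")
clean = Application("D. Kim", "7 Pine Ln", "d.kim@example.com", "555-0199")

print(red_flags(suspicious, history))
print(red_flags(clean, history))
```

A flag here is a prompt for review or for further scoring, not proof of fraud; the resulting list could be passed to a model or to an investigator queue.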

Stopping all synthetic ID fraud at the application stage would be ideal, but this is not realistic, so it’s important to continue monitoring the behavior of applicants beyond the approval process. These criminals will try to behave like “normal” customers before they bust out, but they still do things that stand out. These include frequently requesting credit line increases over a short period of time, a lack of common customer transactions (installment payments, automated bill pay enrollment, etc.) and constantly building up authorized user tradelines.
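
The behaviors listed above could be checked with simple rules along the following lines. This is only a sketch: the `AccountActivity` fields, the 90-day window and every threshold are hypothetical values chosen for the example, not industry standards.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class AccountActivity:
    # Dates on which the customer asked for a credit line increase.
    line_increase_requests: list = field(default_factory=list)
    has_installment_payments: bool = False
    has_autopay: bool = False
    authorized_user_tradelines: int = 0


def behavior_signals(activity, as_of, window_days=90,
                     max_requests=2, max_au_tradelines=3):
    """Return the post-approval warning signs present on an account.

    All thresholds are illustrative assumptions, not industry standards.
    """
    signals = []
    window_start = as_of - timedelta(days=window_days)
    recent = [d for d in activity.line_increase_requests
              if window_start <= d <= as_of]
    if len(recent) > max_requests:
        signals.append("frequent_line_increase_requests")
    if not (activity.has_installment_payments or activity.has_autopay):
        signals.append("no_common_customer_transactions")
    if activity.authorized_user_tradelines > max_au_tradelines:
        signals.append("many_authorized_user_tradelines")
    return signals


# Made-up account, purely for illustration.
today = date(2020, 6, 30)
account = AccountActivity(
    line_increase_requests=[date(2020, 4, 15), date(2020, 5, 20), date(2020, 6, 25)],
    authorized_user_tradelines=5,
)
print(behavior_signals(account, today))
```

Returning the full list of signals, rather than a single yes/no answer, keeps the individual behaviors visible so they can be weighed together.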

Considered separately, these actions may not say much, but taken together, they tell a different story. Financial institutions must be capable of making sense of signals within their data, which requires reliable third-party data sources, a strong understanding of the specific synthetic ID fraud behaviors and advanced analytical techniques.

Finally, it’s important to build up accurate labels of synthetic ID fraud, which is initially a challenging task because most financial institutions lump synthetic ID fraud with all other credit write-offs. Fraud teams should have a clear taxonomy for accurately labeling synthetic ID crimes. Over time, these labels can be used in supervised modeling, leading to a richer and more useful knowledge base.
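
One way to picture such a taxonomy is as a small set of explicit labels attached to each write-off, which then makes the losses quantifiable. In the sketch below the label names and the amounts are invented for illustration; each institution would define its own categories.

```python
from collections import defaultdict
from enum import Enum


class WriteOffLabel(Enum):
    # Hypothetical categories; an institution would define its own.
    CREDIT_DEFAULT = "credit_default"
    SYNTHETIC_FIRST_PARTY = "synthetic_first_party"
    SYNTHETIC_THIRD_PARTY = "synthetic_third_party"


def losses_by_label(write_offs):
    """Sum written-off amounts per label.

    write_offs is an iterable of (label, amount) pairs.
    """
    totals = defaultdict(float)
    for label, amount in write_offs:
        totals[label] += amount
    return dict(totals)


def synthetic_share(write_offs):
    """Fraction of total losses carrying any synthetic ID label."""
    totals = losses_by_label(write_offs)
    total = sum(totals.values())
    if total == 0:
        return 0.0
    synthetic = sum(amount for label, amount in totals.items()
                    if label is not WriteOffLabel.CREDIT_DEFAULT)
    return synthetic / total


# Made-up amounts, purely for illustration.
sample = [
    (WriteOffLabel.CREDIT_DEFAULT, 6000.0),
    (WriteOffLabel.SYNTHETIC_FIRST_PARTY, 3000.0),
    (WriteOffLabel.SYNTHETIC_THIRD_PARTY, 1000.0),
]
print(synthetic_share(sample))
```

Records labeled this way could later serve as the target variable for a supervised model.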