Simon Cooper, Engineering Fellow at Featurespace

That time when… I created the first version of the ARIC™ Risk Hub engine.

5 min read May 19, 2022


Simon Cooper

Engineering Fellow

Simon has been with Featurespace since 2013 and created the initial version of the ARIC engine. Constantly tinkering, he is still working on new features, prototypes, and modifications to the engine. Simon loves to code, the more intricate the system the better, and is continually surprised at how many transactions the ARIC engine is now responsible for. When not coding, Simon enjoys playing Victoria, his cello, in many bands and orchestras around Cambridge.


At the very start of Featurespace’s journey, we had a single customer running bespoke software for fraud detection on their gaming transactions. We knew how to create models that predicted various behaviors typically seen in financial fraud, and the data science team were already producing several proof-of-concept models for this purpose.

What we didn’t have was an all-purpose system to run these models in real time, processing live transactions. We realized that we could expand this technology into the wider financial and payment ecosystems to support banks and payment processors in detecting and preventing fraud. To achieve this, we needed to expand the models’ capability to understand a wider range of distinct types and patterns of fraud.

Our single example in production – the gaming deposits system – was processing transactions using custom rules. These rules were kept up to date by Featurespace, but the system was prone to slowdowns and bottlenecks, and needed constant ‘handholding’. We needed an engine that could run our data science models in real time, with minimal intervention from the customer and the Featurespace team, and that could be updated by the customer’s data analysts themselves.



The First ARIC™ Risk Hub Engine

This is what the first version of the ARIC Risk Hub engine was designed to do. It had no preconceived ideas about the models that would run on it, nor what those models actually did. It just gave the data science models exactly what they needed:

  • The events that the customer sent us in real-time.
  • Blocks of data in which the model could store information, irrespective of the concept each block represented – whether a customer, a merchant, a postcode, or a city.
  • And a way of taking the score returned by the model and giving it back to the customer in the format they needed it.

This structure gave us maximum flexibility to adapt our processing and models to fit the outcomes we needed to achieve as we expanded our deployments, and the types of fraud we were catching.

It was the engine’s responsibility to take the event, find the data needed by the model, and pass it to the model in the right order at the right time. The model could then analyze that data and produce a fraud score, which the engine returned to the customer. All within a few milliseconds.
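The division of labor described here can be sketched in a few lines of Python. This is only an illustrative sketch, not the ARIC Risk Hub implementation: the `Engine` class, the `entity_fields` parameter, the event field names, and the toy `velocity_model` are all hypothetical, chosen to show an engine that fetches per-entity state blocks, hands them to a model it knows nothing about, and returns the score.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Tuple

# All names below are hypothetical; none come from the ARIC Risk Hub codebase.
Event = Dict[str, Any]
State = Dict[str, Any]        # one block of model-owned data per entity
EntityKey = Tuple[str, str]   # (entity type, entity id), e.g. ("merchant", "m-9")

# A model receives the event plus its state blocks and returns a score.
Model = Callable[[Event, Dict[EntityKey, State]], float]


@dataclass
class Engine:
    model: Model
    entity_fields: Tuple[str, ...]   # event fields that identify entities
    store: Dict[EntityKey, State] = field(default_factory=dict)

    def process(self, event: Event) -> Dict[str, Any]:
        # 1. Find (or create) the state block for each entity in the event.
        blocks: Dict[EntityKey, State] = {}
        for name in self.entity_fields:
            if name in event:
                key = (name, str(event[name]))
                blocks[key] = self.store.setdefault(key, {})
        # 2. Hand event and state to the model; the engine interprets neither.
        score = self.model(event, blocks)
        # 3. Return the score in the shape the caller expects.
        return {"event_id": event.get("id"), "score": score}


def velocity_model(event: Event, blocks: Dict[EntityKey, State]) -> float:
    """Toy model: the score rises with the number of events seen per entity."""
    highest = 0
    for state in blocks.values():
        state["count"] = state.get("count", 0) + 1
        highest = max(highest, state["count"])
    return min(1.0, highest / 10)


engine = Engine(model=velocity_model, entity_fields=("customer", "merchant"))
first = engine.process({"id": "e1", "customer": "c-1", "merchant": "m-9"})
second = engine.process({"id": "e2", "customer": "c-1", "merchant": "m-3"})
```

In this sketch the second event scores higher than the first because the customer's state block has already seen one event. Swapping in a different model only requires passing a different callable to `Engine`, which mirrors the idea that the engine has no preconceived view of what a model does.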

The engine also had to scale. The same engine could be asked to process 10 events per second on a single machine running in the cloud, or 2,000 events per second running on a multi-node cluster in a customer’s data center. The ARIC Risk Hub models don’t need to know the size of the system; it’s the engine’s responsibility to scale the models appropriately.
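One common way to keep models unaware of cluster size is for the engine to route each entity's state to a node using a stable hash. The article does not say how the ARIC Risk Hub engine does this, so the sketch below is an assumption for illustration only; the `node_for` function and its routing rule are hypothetical.

```python
import hashlib


def node_for(entity_type: str, entity_id: str, node_count: int) -> int:
    """Pick a node for an entity using a stable hash (hypothetical routing rule)."""
    key = f"{entity_type}:{entity_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes as an integer so the result is the same on every machine.
    return int.from_bytes(digest[:8], "big") % node_count


# The same call works for a single machine and for a cluster;
# only node_count changes, and the model never sees it.
single = node_for("customer", "c-1", 1)
clustered = node_for("customer", "c-1", 8)
```

A plain modulo scheme like this moves most entities when the node count changes, so a production system would more likely use something like consistent hashing; the point here is only that routing lives in the engine, not in the model.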

As well as the engine, we also created our own language – AMDL – designed to be simple enough for non-programmers to use while fitting exactly how the engine worked, so that it scaled with the engine with no changes needed on the analyst’s side. It could also be updated with no intervention from systems administrators and no disruption to event processing.

ARIC™ Risk Hub Deployment

The very first deployments of the engine were bespoke and fairly inflexible, but from them we learnt what was shared across deployments and what differed. The engine has been designed to be flexible across the core concepts of events, entities, and scores. Since those initial years, each deployment has given us more information on the use cases the models can be applied to, which has allowed us to keep refining the engine to meet our customers’ growing needs.

Not long after the first machine learning models were created, customers were able to update data science models on the fly. In 2016, we implemented a new way of processing entities that allowed the engine to scale regardless of the entity patterns in the incoming events. In 2019, it became possible to add new models to an existing system that already held matured data, with no disruption to processing.

It’s a testament to the ARIC Risk Hub engine’s architecture that in 2021 we introduced Automated Deep Behavioral Networks, a recurrent neural network-based architecture which dramatically improves fraud detection rates. This latest innovation in machine learning models can be run on the existing engine with no modifications. To the ARIC Risk Hub engine, it was just another model.

Conclusion

The ARIC Risk Hub engine has changed massively since its creation back when Featurespace began, and because it has been designed to be adaptable, it will continue to do so. At its core, however, it is still the same engine designed by the same smart people – an innovation that gives our models and business rules the data they need, when they need it, to return a score. And still within a few milliseconds.
