What we didn’t have was an all-purpose system to run these models in real time, processing live transactions. We realized that we could expand this technology into the wider financial and payment ecosystems to support banks and payment processors in detecting and preventing fraud. To achieve this, we needed to extend the models to recognize a wider range of fraud types and patterns.
Our single example in production – the gaming deposits system – processed transactions using custom rules. These rules were kept up to date by Featurespace, but the system was prone to slowdowns and bottlenecks, and needed constant ‘handholding’. We needed an engine that could run our data science models in real time with minimal intervention from the customer or the Featurespace team, and that could be updated by the customer’s own data analysts.
This is what the first version of the ARIC Risk Hub engine was designed to do. It had no preconceived ideas about the models that would run on it, nor what those models actually did. It just gave the data science models exactly what they needed: the right data, in the right order, at the right time.
This structure gave us maximum flexibility to adapt our processing and models to fit the outcomes we needed to achieve as we expanded our deployments, and the types of fraud we were catching.
It was the engine’s responsibility to take the event, find the data the model needed, and deliver it to the model in the right order at the right time. The model could then analyze that data and produce a fraud score, which the engine returned to the customer. All within a few milliseconds.
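The division of labor described above can be sketched in a few lines. This is a hypothetical illustration, not ARIC Risk Hub code: every name here (`ScoringEngine`, `FraudModel`, the feature store) is invented for the example. The point is the separation of responsibilities – the engine owns data lookup and delivery, while the model only sees the features it declares it needs.

```python
# Illustrative sketch of the event-scoring flow: engine finds the data,
# the model scores it. All class and field names are hypothetical.

class FraudModel:
    """A stand-in model: scores an event from its feature values."""
    REQUIRED_FEATURES = ["amount", "txn_count_24h"]

    def score(self, features: dict) -> float:
        # Toy logic only: higher amounts and higher velocity raise the score.
        return min(1.0, features["amount"] / 10_000 + features["txn_count_24h"] / 100)

class ScoringEngine:
    """The engine owns data lookup; the model stays engine-agnostic."""
    def __init__(self, model, feature_store: dict):
        self.model = model
        self.feature_store = feature_store  # entity id -> stored feature dict

    def process_event(self, event: dict) -> float:
        # Assemble exactly the data the model needs, then hand it over.
        stored = self.feature_store.get(event["card_id"], {})
        features = {"amount": event["amount"], **stored}
        return self.model.score(features)

engine = ScoringEngine(FraudModel(), {"card-42": {"txn_count_24h": 7}})
score = engine.process_event({"card_id": "card-42", "amount": 2_500})
```

Because the engine never inspects what the model does with the features, any model satisfying this small interface can be dropped in without engine changes.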
The engine also had to scale. The same engine could be asked to process 10 events per second on a single machine running in the cloud, or 2,000 events per second on a multi-node cluster in a customer’s data center. The ARIC Risk Hub models don’t need to know the size of the system; it’s the engine’s responsibility to scale them appropriately.
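One common way to achieve this kind of size-agnostic scaling – offered here as a generic sketch, not a description of ARIC internals – is to route events to worker nodes by hashing an entity key, so all events for the same entity land on the same node regardless of cluster size:

```python
import hashlib

# Hypothetical entity-keyed routing: deterministic, so the same entity's
# events always reach the same node for a given cluster size. The model
# code never sees num_nodes at all.

def route(entity_id: str, num_nodes: int) -> int:
    digest = hashlib.sha256(entity_id.encode()).hexdigest()
    return int(digest, 16) % num_nodes

node_small = route("card-42", 1)   # single-machine deployment: always node 0
node_large = route("card-42", 16)  # multi-node cluster: some fixed node
```

On a single machine every entity trivially maps to node 0; on a cluster the same function spreads entities across nodes, which is why the models themselves need no changes between deployments.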
As well as the engine, we created our own language – AMDL – designed to be simple enough for non-programmers to use, yet fitting exactly into how the engine worked, so analysts’ definitions scaled with the engine with no changes needed on their side. It could also be updated with no intervention from systems administrators, and no disruption to event processing.
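The “updated with no disruption to event processing” property can be illustrated generically. This sketch is not AMDL and makes no claims about its syntax; it only shows the underlying idea of swapping in a freshly compiled rule set atomically while events continue to flow:

```python
import threading

# Generic hot-swap illustration (not AMDL): analysts' definitions compile
# to a new rule set, which replaces the old one in a single atomic step.
# In-flight events finish against the old rules; new events see the new ones.

class RuleEngine:
    def __init__(self, rules):
        self._rules = rules          # list of (name, predicate) pairs
        self._lock = threading.Lock()

    def update_rules(self, new_rules):
        # Whole-set replacement: no restart, no pause in processing.
        with self._lock:
            self._rules = new_rules

    def evaluate(self, event):
        with self._lock:
            rules = self._rules      # snapshot the current set
        return [name for name, predicate in rules if predicate(event)]

engine = RuleEngine([("high_value", lambda e: e["amount"] > 1000)])
engine.update_rules([("high_value", lambda e: e["amount"] > 5000)])
flagged = engine.evaluate({"amount": 2000})  # no longer flagged under the new threshold
```

Swapping the entire rule set as one reference, rather than mutating rules in place, is what keeps every event’s evaluation internally consistent during an update.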
The very first deployments of the engine were bespoke and fairly inflexible, but they taught us which similarities and differences existed across deployments. The engine has since been designed to be flexible across its core concepts of events, entities, and scores. With each deployment, we gained more and more information about the use cases the models could be applied to, which allowed us to keep refining the engine to meet our customers’ growing needs.
Not long after the initial machine learning models were created, customers could update data science models on the fly. In 2016, we implemented a new way of processing entities that let the engine scale regardless of the entity patterns in incoming events. In 2019, new models could be added to an existing system, with already-matured data, with no disruption to processing.
It’s a testament to the ARIC Risk Hub engine’s architecture that in 2021 we introduced Automated Deep Behavioral Networks, a recurrent neural network-based architecture that dramatically improves fraud detection rates. This latest innovation in machine learning runs on the existing engine with no modifications. To the ARIC Risk Hub engine, it’s just another model.
The ARIC Risk Hub engine has changed massively since its creation back when Featurespace began, and because it’s been designed to be adaptable, it will continue to do so. However, at the same time, it is at its core still the same engine designed by the same smart people – an innovation that gives our models and business rules the data they need, when they need it, to return a score. And still within a few milliseconds.