
Process. Is that really the secret to a successful Data Science team?

15 min read Apr 26, 2022



Lucy Griffin

Director of Analytics

Lucy leads the global Analytics department at Featurespace, which combines cutting-edge machine learning with deep industry expertise in risk strategy to fight fraud and financial crime. Her team decides how best to solve customer problems using Featurespace's technology, owning the end-to-end lifecycle for models and rules. After completing a PhD in Chemistry, Lucy joined Featurespace as a Data Scientist in 2016, where she worked on numerous high-profile projects. She took the helm of the Data Science team in 2019, bringing all customer analytics under her leadership in 2021. Outside of work, Lucy is a keen cyclist who is happiest tearing around on a bike in the sunshine!


Featurespace's mission is to make the world a safer place to transact. To achieve this mission, the remit of Featurespace’s Data Science team is broad. Our work helps organizations identify transactional card fraud on behalf of issuers, processors and acquirers; intercept payment fraud and scams across multiple banking channels; and detect instances of money laundering.

The Journey

The specific use case for each of our machine learning models has its own fraud modi operandi, or risk typologies, and a different set of data to master.

For TSYS, as an example, Featurespace's Data Scientists brought a world-leading risk score, called the 'Foresight Score', to market. This is something TSYS can provide to its own clients to bolster their transaction fraud detection capabilities while reducing false positives.

That same group also:

  • Created the machine learning models that support Worldpay’s FraudSight merchant fraud solution.
  • Built the award-winning solution that NatWest relies on to prevent fraud and scams in non-plastic payments.
  • Introduced machine learning to the AML system in HSBC’s insurance business.

It's uncommon for a Data Science team to deliver machine learning models across so many practice areas; it's even more uncommon for a team to create industry-leading scores in all of them. Yet both are critical to the Featurespace mission.

So, what’s our secret? The answer is going to sound mundane to some people. There are no unique industry insights or modeling techniques. We simply have a strong process.

Each model we deploy is packed with insights. The feature concepts that detect fraudulent activity, the robustness of the models, the consistent performance of those models — these are all the results of incremental gains that many people have made over many years.


Innovation is painstaking and unpredictable. With a carefully constructed process, we can capture the insights we develop so that the knowledge doesn't live only in the minds of individually brilliant people. Process ensures it flows from the pioneers of our team to everyone else in the organization. That knowledge gets folded into the rest of the work we do so it can live on and, importantly, feed back into the next generation of models we deploy.

Lucy Griffin, Director of Analytics at Featurespace

#1 We clearly and prescriptively defined how we do what we do

Enda Ridge's book "Guerrilla Analytics: A Practical Approach to Working with Data" describes seven principles for designing a way of working for your data science team. Applying those principles to the way that we do Data Science at Featurespace has been transformational.

Stripped back, the Guerrilla Analytics principles show Data Science teams how to approach model development and the publishing of analytical results with all the rigor found in modern software development. The principles encourage Data Science teams to bring this consistency not only to the management of code but also to the management of data.

As a group, we maintain unambiguously defined expectations for the way work should be structured: how source data should be received, how data transformations should be tracked, how modelling code should be written, and how analytical results should be stored and tracked. We strive always to maintain ways of working that are prescriptive yet flexible enough to accommodate different customer environments, new technologies, or atypical projects that do not fit perfectly within the prescribed boundaries. Flexibility is extended if work is done in a way that meets the following intentions:

  • Work should be performed in a way that maximizes the likelihood that a colleague will be able to follow what you have done
  • All data processing and analytical results must have clear provenance and be reproducible
  • All modelling code and data code must be version controlled

Working in a common way across the team results in quantifiable increases in the quality and quantity of the Data Science that we produce in four notable ways:

  1. Standardization has created team ownership of our code and results: every Data Scientist can follow the work of others, and we can quickly bring new team members up to speed on any project once they understand the generalized work pattern. As a team we are responsible for modelling code that is expected to run in systems affecting millions of people every day, so ensuring that this code can be supported by Featurespace Data Science as an entity, rather than by the individuals who created it, is critical.
  2. We've freed time for creativity by removing the need for thought or choice around work hygiene. No one enjoys trying to remember whether "results_v1.csv", "results_v1_final.csv" or "results_v1_FINAL.csv" was the set of results that should be reported in a high-profile presentation to a key customer! Work patterns that remember this for you both reduce the risk of error and allow that thought and attention to detail to be reallocated to unsolved problems of much greater value.
  3. Designing in reproducibility has built confidence in our results into the process and prevented waste from rework. It's no fun trying to work out, after the fact, which magic set of hyperparameters got you that incredible TPR; we operate always under the mantra that "if you can't reproduce a result, it's not a result"!
  4. Encouraging rigor in the way that all results and code are produced aligns the whole team behind a common expectation around attention to detail. The ways of working set the bar for the quality we expect from each other, defining what good looks like in an unequivocal way.
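The hygiene and reproducibility points above can be sketched in a few lines. This is a minimal, hypothetical illustration (the function and file names are invented, not Featurespace tooling): seed every source of randomness, and derive the results location from a hash of the hyperparameters so that results never depend on ad-hoc file naming.

```python
import hashlib
import json
import random
from pathlib import Path

def run_experiment(params: dict, results_dir: str = "results") -> Path:
    """Run a stub experiment reproducibly: seed the RNG from the
    hyperparameters and store results under a path derived from a hash
    of those hyperparameters, rather than an ad-hoc file name."""
    # A stable identifier derived from the hyperparameters themselves.
    config = json.dumps(params, sort_keys=True)
    run_id = hashlib.sha256(config.encode()).hexdigest()[:12]

    # Seed every source of randomness (extend for numpy, torch, etc.).
    random.seed(params.get("seed", 0))

    # Placeholder for the real modelling work.
    metrics = {"tpr": round(random.random(), 4)}

    out = Path(results_dir) / run_id
    out.mkdir(parents=True, exist_ok=True)
    (out / "params.json").write_text(config)            # provenance
    (out / "metrics.json").write_text(json.dumps(metrics))
    return out

# The same hyperparameters always map to the same directory and the same
# metrics, so "which run produced this result?" is never a mystery.
first = run_experiment({"seed": 42, "learning_rate": 0.1})
second = run_experiment({"seed": 42, "learning_rate": 0.1})
assert first == second
```

Because the run identifier is a pure function of the configuration, "if you can't reproduce a result, it's not a result" becomes enforceable rather than aspirational.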

Our way of working defines how we do what we do as a team, but it must give us freedom to adapt and evolve so that it continues to meet our needs. By running small experiments on the common ways of working, we forge a leaner and more effective way of achieving the principles that underpin its design. This approach is actively encouraged across the team. During regular retrospectives, changes to the ways of working can be advocated for, and voted on, by the whole team for inclusion in the current working version. We've found that encouraging ownership of incremental improvement keeps engagement and uptake high, allowing the whole team to continue to benefit.


We strive always to maintain ways of working that are prescriptive yet flexible enough to accommodate different customer environments, new technologies, or atypical projects that do not fit perfectly within the prescribed boundaries.

Lucy Griffin, Director of Analytics at Featurespace


#2 Like pilots and surgeons, we’ve grown to love checklists

In the same way that the aviation industry relies on pre-flight and in-flight checklists, we as a team have grown to lean heavily on checklists as we build models. Each project phase is accompanied by a checklist containing all of its key considerations, from the technical tasks to be conducted to the customer communications and agreements required in that phase.

How is something so simple quite so effective? Deploying machine learning models well could easily be described as a dark art. The list of small issues that can catch you out as you try to develop a model from a historical dataset that generalizes well to a test set, and then performs just as reliably on a run-time data stream, seems to grow steadily over time. Each time you think you have mastered it, real-time machine learning has a habit of reminding you that you still have much to learn. Before we had checklists, we relied on a few experienced (read: battle-hardened) Data Scientists trying to recall all their applicable nuggets of knowledge from memory and impart them to a project team. What we sorely lacked was a process that systematically collected that knowledge and allowed it to be distributed in a scalable way.

The checklists now represent the most succinct version of our team's knowledge about successfully deploying machine learning models. Take two simple examples from our checklist for data exploration: 'check for duplicates in the data' and 'ensure that the time zone applied to datetimes across different data sources and fields is consistent'. Neither is a world-beating data exploration insight, and both are reasonably obvious if you've explored transactional data before! Nonetheless, if both are handled well during modelling, you'll notice a small positive impact on the quality of the score produced.
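Both example checks are simple enough to sketch with the standard library alone. The records and field names below are illustrative, not a real Featurespace schema:

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# Toy transaction records with one duplicated key and mixed time zones.
txns = [
    {"txn_id": "a1", "auth_time": datetime(2022, 1, 1, 9, 0, tzinfo=timezone.utc)},
    {"txn_id": "a1", "auth_time": datetime(2022, 1, 1, 9, 0, tzinfo=timezone.utc)},
    {"txn_id": "b2", "auth_time": datetime(2022, 1, 1, 5, 30,
                                           tzinfo=timezone(timedelta(hours=-5)))},
]

def duplicate_keys(records, key):
    """Keys that appear more than once: a classic source of double counting."""
    counts = Counter(r[key] for r in records)
    return [k for k, n in counts.items() if n > 1]

def utc_offsets_used(records, field):
    """Distinct UTC offsets on a datetime field; more than one is a red flag."""
    return {r[field].utcoffset() for r in records}

assert duplicate_keys(txns, "txn_id") == ["a1"]
assert len(utc_offsets_used(txns, "auth_time")) == 2  # mixed offsets: investigate!
```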

We recognize that excellent score performance comes from the accumulation of many small performance gains. What checklists have given us is a system that ensures all these incremental benefits are packaged in every model that is deployed. No Featurespace model is deployed without its project team demonstrating that it passes all the checks. The cynical might frame this as over-prescriptive, but in our experience it's liberating. The checklists provide prompts but are never exact about how a task should be accomplished, giving the Data Scientist freedom to experiment or explore in their own way. For less experienced team members, they provide guidance along a narrow path to a successful outcome while prompting thoughtfulness about how each point applies to the project in question. For those with more experience, the checklists prevent complacency and create the freedom of mind to think about areas not considered by those before them.
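A deployment gate built on such a checklist can be as simple as the following hypothetical sketch; the function and checklist items are invented for illustration:

```python
# Hypothetical deployment gate: a model is only signed off once every
# checklist item has been explicitly marked complete. The items below
# are invented examples, not Featurespace's actual checklist.
def ready_to_deploy(checks: dict[str, bool]) -> tuple[bool, list[str]]:
    """Return whether the gate passes, plus any outstanding items.

    The gate prompts *what* must be done, never *how* to do it."""
    outstanding = [item for item, done in checks.items() if not done]
    return (not outstanding, outstanding)

checklist = {
    "duplicates checked in source data": True,
    "timezones consistent across data sources": True,
    "score reproducible from versioned code and data": False,
}

ok, todo = ready_to_deploy(checklist)
assert not ok
assert todo == ["score reproducible from versioned code and data"]
```

The gate records only that each item was addressed, leaving the Data Scientist free to decide how.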

As with the ways of working, the most important thing about our checklists is that they are not static and are owned by the whole team. Improving the lists as part of every completed project is an unwritten expectation; contribution is encouraged and celebrated, since each item added represents new knowledge that can benefit every future project.

#3 Collaboratively review work at pivotal points in a project

Our third and final process pillar is model review sessions. We conduct these at the end of each important phase of a project, creating an intentional moment to step away from the details and inspect the overall quality and health of the machine learning system we are building for a customer. Experienced Data Scientists who have not worked closely with the project team question the decisions made and ensure that best practices have been followed.

So why is this valuable when there are already rigorous processes around data processing and code quality, plus extensive lists containing the accumulated knowledge of the team? The answer lies in the realization that experience is the ability to apply existing knowledge well to new problems. Some pieces of knowledge can be distilled into checklists and easily applied by all, whilst others require intuition and experience to apply to full effect.

The difference between success and failure when deploying machine learning systems is rarely the micro-decisions about feature engineering or hyper-parameter choice. Misinterpreting the scope, misspecifying or misunderstanding the data, and unsuitable model design are much more existential problems. These are the problems where intuition and experience pay dividends, so we've arranged our model reviews around them. In short, we bring in extra experience in the three places where it's most valuable.

Model scope review: Here we aim to ensure proper and accurate translation of business requirements into a machine learning problem. Are the functional requirements for the model clear? Do we have score performance metrics that accurately reflect what the customer cares most about? Will we have access to data sources suitable to build a machine learning system from? Experience and intuition are extremely valuable when assessing the viability of a machine learning system. For example, optimizing a model to detect gross fraud value when the customer is looking to reduce net fraud losses is an easy mistranslation to make. A more subtle example is overlooking a feedback loop which means that, once deployed, your supervised machine learning model will see a steady degradation in the quality of its training data.

Data understanding review: No machine learning system is better than the data it's built from, and any misunderstanding of the data puts an artificially low ceiling on that quality! During this review we ask our teams to talk through their understanding of their modelling data in the context of the scope of the model they are trying to build. An experienced Data Scientist acting as reviewer cannot change the quality or quantity of the data exploration conducted by the project team, but can assess the quality of understanding and interpretation. It's the act of having to articulate that understanding clearly that brings most value to this session.

Model design review: In this session, we examine the overall health of the model produced and its fitness for deployment. Goals for the reviewer are to gain confidence that the model will generalize well from the development data set to scoring runtime data and that the model has been designed in the most robust and resilient way possible.

The key to running successful model review sessions goes much deeper than the act of scheduling a review. Positive, project-improving outcomes from review sessions are amplified by striving to create a kind and supportive culture, where mistakes can be admitted publicly and where challenges and questions are encouraged irrespective of the experience or job title of the challenger or the challenged. Our culture has created the psychological security for members of the project team to seek guidance from those with experience and ask questions about their areas of concern or uncertainty.

Of course, one valuable element of these sessions is to provide assurance that the Data Science team's ways of working have been followed and the checklists for the appropriate phases covered by the project's Data Scientists. This quality assurance stage of the review can be conducted in advance, since the ways of working and checklists create clear pass/fail criteria, further fostering a sense of psychological security for less experienced team members. Work is 'examined' by those with more experience, but the 'exam questions' are published well in advance, so there are rarely surprises. Moving assurance away from the review session itself liberates the session to become a collaborative forum for discussion, with all participants on an equal footing.

Every team member sees project reviews as a valuable opportunity to learn, improve, and generate fresh ideas, including the Data Scientist conducting the review. We believe that in an environment like this, the quality of the machine learning systems produced gets pushed ever higher.

It’s our process that allows us to strive to be better every day

We've shared the three key elements of our data science process:

  • Lean ways of working create confidence in every result produced,
  • Checklists encapsulate years of insight, and
  • Model review sessions provide independent quality assurance.

These three steps have been our secret to success so far because they’ve created a system which supports bringing together the learnings and ideas from all the brilliant people from our global Data Science team past and present.

While it's only taken a few hundred words to describe the key pillars of our process, it's taken us many years and a lot of work to create what we have today. Our success has always come from, and will continue to come from, the belief that we will never be finished refining the guardrails we offer our teams. We want to nurture innovation and creativity so that our teams flourish, because there is no end to making the world a safer place to transact.

