
AI’s Edge in Anti-Money Laundering: Learning from Transaction Patterns

TLDR: A new AI method uses a transformer neural network and contrastive learning to detect money laundering. It analyzes raw transaction data, learns patterns without needing labels, and then employs a two-threshold system to identify fraudsters while keeping false alarms low, significantly outperforming traditional rule-based and other machine learning approaches.

Money laundering is a significant global problem, estimated to involve 2 to 5% of the world’s GDP annually. It undermines governments’ ability to collect taxes and fight crime, and it damages the stability of financial institutions. Detecting it is extremely difficult because perpetrators often mimic legal financial behavior, and laundering patterns constantly evolve to evade detection.

Financial institutions are required by strict regulations to detect and report suspicious activity under the ‘anti-money laundering’ (AML) framework. To cope with the massive volume of customers and transactions, they often rely on automated ‘rule-based systems’. While these systems offer explainability, they suffer from severe limitations, particularly very low precision: 95% to 98% of the alerts they raise are false positives. This inefficiency stems from their reliance on hard-coded thresholds that are difficult to update as fraud patterns change.

The research introduces a machine learning approach that addresses these challenges while complementing existing rule-based frameworks. Instead of relying on aggregated, summarized features of customer activity, which often lose valuable information and require expert knowledge to design, it processes the entire set of raw transaction time series.

A New Approach: Transformers and Contrastive Learning

The core of this new procedure involves a transformer neural network, a powerful tool initially developed for natural language processing. Just as language can be seen as a ‘time series of words’, financial transactions can be viewed as a ‘time series of events’. The transformer is adept at capturing long-range dependencies within these sequences, which is crucial for identifying money laundering patterns that might unfold over several months.

A key innovation is the use of ‘contrastive learning’ to pre-train the transformer. This is done without any labeled data, which is a major advantage given how scarce and unreliable fraud labels are in real-world scenarios. Contrastive learning works by teaching the model to recognize similarities and differences between observations. It learns to create representations where similar financial activities are mapped closer together in the representation space, while dissimilar ones are pushed further apart. This self-supervised approach helps the transformer learn the underlying ‘semantics’ of financial transactions.

The process involves feeding raw transaction data into the transformer, which generates a compact numerical representation (an ‘embedding’) for each account. A ‘projection head’ then further refines this representation into a lower-dimensional space, making computations more efficient. To enhance the learning process, the system samples ‘positive examples’ (observations similar to a reference) and ‘negative examples’ (dissimilar observations) based on auxiliary data that contains aggregated metrics and customer descriptors. It even adds a small amount of Gaussian noise to these examples in the projection space to explore new patterns and prevent overfitting, especially useful in highly imbalanced datasets where fraudsters are rare.
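
To make the contrastive objective concrete, here is a minimal Python sketch. The article does not say which contrastive loss the paper uses, so the InfoNCE-style loss below is a common stand-in and an assumption, not the authors' formulation. The function names, the temperature, the noise scale and the toy vectors are all hypothetical.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def add_gaussian_noise(v, sigma, rng):
    """Perturb a projected vector with a small amount of Gaussian noise."""
    return [x + rng.gauss(0.0, sigma) for x in v]


def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss for one anchor (assumed loss, not taken from the paper).

    The loss is small when the positive is the candidate closest to the anchor.
    """
    a = normalize(anchor)
    candidates = [normalize(positive)] + [normalize(n) for n in negatives]
    logits = [sum(x * y for x, y in zip(a, c)) / temperature for c in candidates]
    # Numerically stable negative log-softmax of the positive (index 0).
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum_exp - logits[0]


rng = random.Random(0)
anchor = [1.0, 0.2, 0.0]  # toy projection of a reference account
positive = add_gaussian_noise([0.9, 0.3, 0.1], sigma=0.01, rng=rng)
negatives = [add_gaussian_noise(v, sigma=0.01, rng=rng)
             for v in ([-1.0, 0.1, 0.0], [0.0, -1.0, 0.5])]

good = info_nce_loss(anchor, positive, negatives)
# Swapping in a dissimilar account as the "positive" should raise the loss.
bad = info_nce_loss(anchor, negatives[0], [positive, negatives[1]])
print(good < bad)
```

In a real training loop the vectors would come from the transformer and projection head, and the loss would be averaged over a batch and minimized by gradient descent.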

Two-Threshold Classification for Better Detection

Once the transformer has learned these powerful representations, they are used for the downstream task of money laundering detection. This involves a classification step where a simple logistic regression classifier is trained on a small amount of labeled data (accounts identified as fraudsters or non-fraudsters).
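
As an illustration of this step, the sketch below fits a logistic regression on a handful of made-up two-dimensional embeddings and turns each one into a suspicion score. The toy data, the plain gradient-descent training loop and the function names are assumptions for the example. The article does not describe how the classifier was fitted in the paper.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(X, y, lr=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on the mean log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j] / n
            grad_b += err / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def suspicion_score(x, w, b):
    """Predicted probability that an embedding belongs to a fraudster."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Hypothetical embeddings: four non-fraud accounts (0) and two fraud accounts (1).
embeddings = [[0.1, 0.2], [0.0, 0.4], [0.3, 0.1], [0.2, 0.3], [1.5, 1.4], [1.7, 1.2]]
labels = [0, 0, 0, 0, 1, 1]

w, b = train_logistic_regression(embeddings, labels)
scores = [suspicion_score(x, w, b) for x in embeddings]
print([round(s, 3) for s in scores])
```

The classifier stays deliberately simple: if the pre-trained representations already separate fraudsters from other accounts, a linear model on top of them is enough to produce a usable score.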

To tackle the severe class imbalance (where fraudsters are a very small percentage of accounts), the research introduces a ‘two-thresholds’ classification procedure:

  • Low Threshold (Tl): This threshold helps identify and discard the least suspicious observations. Accounts with scores below Tl are confidently declared non-fraudulent, saving analysts time by removing them from further investigation.

  • High Threshold (Th): This threshold targets the most suspicious observations. Accounts with scores above Th are flagged as potential fraudsters for in-depth investigation.
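
A minimal sketch of how the two thresholds partition accounts is shown below. The label strings and the treatment of the middle band are assumptions. The text above only states what happens to scores below Tl and above Th.

```python
def triage(score, t_low, t_high):
    """Assign an account to one of three bins based on its suspicion score."""
    if not t_low < t_high:
        raise ValueError("t_low must be strictly below t_high")
    if score < t_low:
        return "cleared"      # confidently non-fraudulent, no further review
    if score > t_high:
        return "flagged"      # sent for in-depth investigation
    return "unresolved"       # assumption: neither cleared nor flagged


for s in (0.02, 0.50, 0.97):
    print(s, triage(s, t_low=0.10, t_high=0.90))
```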

Crucially, both thresholds are calibrated using the Benjamini-Hochberg (BH) procedure, a statistical method that controls the ‘False Discovery Rate’ (FDR). For flagged accounts, this means the expected proportion of false positives (legitimate accounts wrongly flagged as suspicious) among them is kept below a prescribed level, a significant improvement over traditional rule-based systems, which often suffer from very high false positive rates.
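
The BH step-up rule itself is standard and can be sketched in a few lines of Python. How the paper converts classifier scores into p-values is not described here, so the empirical p-value below, computed against a reference set of scores from known non-fraudulent accounts, is an assumption. All names and numbers are illustrative.

```python
def empirical_p_value(score, null_scores):
    """Share of reference non-fraud scores at least as high as `score`.

    The +1 terms keep the p-value strictly positive (assumed construction).
    """
    exceed = sum(1 for s in null_scores if s >= score)
    return (1 + exceed) / (1 + len(null_scores))


def benjamini_hochberg(p_values, alpha):
    """Return the indices rejected by the BH step-up procedure at level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        # Keep the largest rank whose p-value is under its BH critical value.
        if p_values[i] <= alpha * rank / m:
            k = rank
    return sorted(order[:k])


# Hypothetical reference scores from accounts known to be non-fraudulent.
null_scores = [i / 100 for i in range(1, 20)]
# Hypothetical scores for five accounts under review.
test_scores = [0.95, 0.90, 0.15, 0.05, 0.85]

p_values = [empirical_p_value(s, null_scores) for s in test_scores]
flagged = benjamini_hochberg(p_values, alpha=0.2)
print(flagged)
```

The score of the last rejected account plays the role of the high threshold. A low threshold could be calibrated the same way with the roles reversed, testing against scores from known fraudsters, though that too is an assumption about the paper's construction.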

Experimental Validation

The methodology was tested on a real-life, anonymized dataset of company bank accounts, comprising complex time series of transactions with both quantitative and qualitative features. The dataset reflected the real-world challenge of strong class imbalance, with fraudsters making up only 5% of the test set.

Visualizations of the learned representations showed that the transformer, trained with contrastive learning, was able to create distinct clusters for fraudsters, making them more easily distinguishable compared to representations learned by other methods like LSTM autoencoders or traditional tabular data approaches. The transformer-based approach consistently outperformed competitors in terms of separating the score distributions of fraudsters and non-fraudsters.

When applying the two-thresholds procedure, the transformer-based method (especially with fine-tuning) detected significantly more true fraudsters at a given FDR level than the other models. For instance, at an FDR of 0.40, it detected more than twice as many fraudsters as the LSTM-based approach. Similarly, for the low threshold, it identified a much larger percentage of non-fraudulent accounts, further reducing the investigation workload.

In conclusion, this research presents a robust and adaptive framework for money laundering detection. By combining the power of transformer neural networks with contrastive learning and a controlled two-threshold classification, it offers a promising path to overcome the limitations of traditional systems, leading to more efficient and accurate identification of financial crime. You can read the full research paper here: Representation learning with a transformer by contrastive learning for money laundering detection.

Meera Iyer
https://blogs.edgentiq.com
Meera Iyer is an AI news editor who blends journalistic rigor with storytelling elegance. Formerly a content strategist in a leading tech firm, Meera now tracks the pulse of India's Generative AI scene, from policy updates to academic breakthroughs. She's particularly focused on bringing nuanced, balanced perspectives to the fast-evolving world of AI-powered tools and media. You can reach her at: [email protected]
