TLDR: A new framework called Competitive Isolation PSM-DID, developed by Alibaba Group, provides an unbiased way to measure the platform-level impact of interventions in search-based marketplaces. It addresses challenges like interference and selection bias by using mutual exclusion graph partitioning to isolate competing items, stratified CTCVR matching to find homogeneous comparison groups, and a two-sided sinking mechanism. This approach ensures accurate causal effect estimation, validated by experiments showing reduced cannibalization and precise measurement of GMV and order volume lifts.
In the complex world of online marketplaces, understanding the true impact of changes to a search system is a significant challenge. Imagine a scenario where a platform wants to know if a new pricing strategy or a change in how products are displayed actually increases overall sales or order volume. Traditional methods, like A/B testing, often fall short because of the intricate web of interactions between items and users. This is where a new framework, called Competitive Isolation PSM-DID, steps in to offer a more accurate solution.
Developed by researchers from Alibaba Group, this novel approach addresses the fundamental problem of “interference” in two-sided marketplaces. Interference occurs when the treatment applied to one group (e.g., a price change for certain items) unintentionally affects another group (e.g., other items or users), making it difficult to isolate the true impact of the intervention. For instance, if a discount on one product leads customers to buy it instead of a similar, non-discounted product, that’s a “cannibalization” effect that can skew results.
The Competitive Isolation PSM-DID framework combines several sophisticated techniques to overcome these hurdles. At its core, it integrates Propensity Score Matching (PSM) with a Difference-in-Differences (DID) approach, but with crucial enhancements. The key innovations are:
Mutual Exclusion Graph Partitioning
To prevent items in the “treatment” group from interfering with items in the “control” group, the framework uses mutual exclusion graph partitioning. Think of it like dividing a marketplace into two distinct, non-overlapping sections. The researchers built a “competition graph” in which items are nodes and weighted connections represent how strongly two items compete. Using the Kernighan-Lin algorithm, a heuristic for balanced min-cut bisection, they divided this graph into two balanced subgraphs so that as little competition as possible crosses the boundary. This ensures that changes made in one section don’t significantly affect the other, effectively isolating the competitive channels. This process significantly reduces cannibalization effects, which were a major source of bias in previous methods.
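As a rough illustration of the partitioning step, here is a minimal pure-Python sketch of a Kernighan-Lin style bisection. It is not the authors' implementation: the item IDs, the edge weights, and the helper names (`build_graph`, `cut_weight`, `kl_bisection`) are all hypothetical, and a production system would work on a far larger graph with an optimized library.

```python
from itertools import product


def build_graph(edges):
    """Build a symmetric adjacency map {node: {neighbor: weight}}."""
    graph = {}
    for u, v, weight in edges:
        graph.setdefault(u, {})[v] = weight
        graph.setdefault(v, {})[u] = weight
    return graph


def cut_weight(graph, side_a, side_b):
    """Total competition weight that crosses the partition."""
    return sum(graph[u].get(v, 0.0) for u in side_a for v in side_b)


def kl_bisection(graph, max_passes=10):
    """Kernighan-Lin style refinement of a balanced two-way split."""
    nodes = sorted(graph)
    half = len(nodes) // 2
    side_a, side_b = set(nodes[:half]), set(nodes[half:])

    def w(u, v):
        return graph[u].get(v, 0.0)

    for _ in range(max_passes):
        # d[v] = weight to the other side minus weight to v's own side
        d = {}
        for v in nodes:
            own, other = (side_a, side_b) if v in side_a else (side_b, side_a)
            d[v] = sum(w(v, u) for u in other) - sum(w(v, u) for u in own)
        free_a, free_b = set(side_a), set(side_b)
        swaps, gains = [], []
        while free_a and free_b:
            # Tentatively swap the unlocked pair with the largest cut reduction
            a, b = max(product(sorted(free_a), sorted(free_b)),
                       key=lambda p: d[p[0]] + d[p[1]] - 2 * w(p[0], p[1]))
            gains.append(d[a] + d[b] - 2 * w(a, b))
            swaps.append((a, b))
            free_a.remove(a)
            free_b.remove(b)
            # Update d for unlocked nodes as if a and b had changed sides
            for x in free_a:
                d[x] += 2 * w(x, a) - 2 * w(x, b)
            for y in free_b:
                d[y] += 2 * w(y, b) - 2 * w(y, a)
        # Keep only the prefix of swaps with the best cumulative gain
        best_k, best_gain, running = 0, 0.0, 0.0
        for k, gain in enumerate(gains, start=1):
            running += gain
            if running > best_gain + 1e-12:
                best_k, best_gain = k, running
        if best_k == 0:
            break  # no improving sequence of swaps remains
        for a, b in swaps[:best_k]:
            side_a.remove(a)
            side_b.add(a)
            side_b.remove(b)
            side_a.add(b)
    return side_a, side_b


# Hypothetical competition graph: two tight clusters joined by one weak edge
edges = [
    ("i1", "i3", 5.0), ("i1", "i5", 5.0), ("i3", "i5", 5.0),
    ("i2", "i4", 5.0), ("i2", "i6", 5.0), ("i4", "i6", 5.0),
    ("i3", "i4", 1.0),
]
graph = build_graph(edges)
side_a, side_b = kl_bisection(graph)
crossing = cut_weight(graph, side_a, side_b)
print(sorted(side_a), sorted(side_b), crossing)
```

On this toy graph the initial split mixes the two clusters, and a single swap separates them so that only the weak i3-i4 edge still crosses the boundary. Kernighan-Lin is a local-search heuristic, so on real graphs the result depends on the starting split and is not guaranteed to be the global minimum cut.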
Homogeneous Item Mining
Another critical aspect is ensuring that the items being compared are truly similar before any intervention. This is achieved through “homogeneous item mining” using a method called Stratified CTCVR Matching. This isn’t just about matching items by broad categories; it’s a much more granular process. Items are stratified (grouped) by four key dimensions: category (e.g., Electronics > Laptops > Gaming), exposure level (how many times they’re viewed), transaction level (historical sales volume), and price band. Within these finely tuned groups, items are then ranked by the similarity of their CTCVR (click-through and conversion rate, i.e., how often an impression leads to both a click and a purchase), which captures how users interact with them. This meticulous matching is meant to make the “control” group a credible stand-in for what would have happened to the “treatment” group without the intervention, supporting the “parallel trends” assumption essential for accurate causal inference.
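The matching idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (`category`, `exposure_level`, `txn_level`, `price_band`, `ctcvr`), the pre-bucketed level labels, and the greedy nearest-CTCVR rule are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def stratum_key(item):
    """The four stratification dimensions described above."""
    return (item["category"], item["exposure_level"],
            item["txn_level"], item["price_band"])


def match_items(treated, candidates):
    """Pair each treated item with the unused same-stratum candidate
    whose CTCVR is closest (greedy, one-to-one)."""
    pools = defaultdict(list)
    for item in candidates:
        pools[stratum_key(item)].append(item)
    matches = {}
    for item in sorted(treated, key=lambda it: it["id"]):
        pool = pools.get(stratum_key(item), [])
        if not pool:
            continue  # no homogeneous candidate, so leave unmatched
        best = min(pool, key=lambda c: abs(c["ctcvr"] - item["ctcvr"]))
        pool.remove(best)  # each candidate is used at most once
        matches[item["id"]] = best["id"]
    return matches


def make_item(item_id, category, exposure, txn, price, ctcvr):
    return {"id": item_id, "category": category, "exposure_level": exposure,
            "txn_level": txn, "price_band": price, "ctcvr": ctcvr}


# Hypothetical items; CTCVR here stands for orders per impression
treated = [
    make_item("t1", "laptops", "high", "high", "mid", 0.020),
    make_item("t2", "laptops", "low", "low", "low", 0.004),
    make_item("t3", "phones", "high", "high", "mid", 0.015),
]
candidates = [
    make_item("c1", "laptops", "high", "high", "mid", 0.030),
    make_item("c2", "laptops", "high", "high", "mid", 0.021),
    make_item("c3", "laptops", "low", "low", "low", 0.005),
    make_item("c4", "laptops", "high", "low", "mid", 0.020),
]
matches = match_items(treated, candidates)
print(matches)
```

Here `t1` is paired with `c2` rather than `c1` because `c2`'s CTCVR is closer, and `t3` stays unmatched because no candidate shares its stratum. Dropping unmatched items trades sample size for comparability, which is the usual design tension in matching.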
Two-Sided Sinking Mechanism
To facilitate platform-level causal inference while maintaining market completeness, the framework employs a “two-sided sinking mechanism.” This involves operationally demoting items (e.g., by applying a significant search rank penalty) in either the treatment or control group during the measurement period. This “sinking” helps to suppress competitive interference and allows for a clearer observation of metrics for the isolated groups, ensuring that the overall market dynamics are still considered without direct cross-group competition.
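One way to picture the demotion step is as a score penalty applied at ranking time. The sketch below is an assumption-laden illustration, not the platform's ranking code: the function name `rank_with_sinking`, the group labels, the relevance scores, and the penalty value are all hypothetical.

```python
def rank_with_sinking(results, group_of, sink_group, penalty=1000.0):
    """Rank (item_id, score) pairs by score, demoting every item that
    belongs to sink_group by a large fixed penalty."""
    def adjusted(entry):
        item_id, score = entry
        if group_of.get(item_id) == sink_group:
            return score - penalty
        return score

    ranked = sorted(results, key=adjusted, reverse=True)
    return [item_id for item_id, _ in ranked]


# Hypothetical search results and group assignment
results = [("a", 0.9), ("b", 0.8), ("c", 0.7), ("d", 0.6)]
group_of = {"a": "treatment", "b": "control",
            "c": "treatment", "d": "control"}

control_sunk = rank_with_sinking(results, group_of, sink_group="control")
treatment_sunk = rank_with_sinking(results, group_of, sink_group="treatment")
print(control_sunk)
print(treatment_sunk)
```

In this sketch the sunk items stay in the result list but fall below every item from the other group, so the market remains complete while the surfaced items compete only within their own group.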
The researchers rigorously proved that under conditions of mutual exclusion and parallel trends, their method provides an unbiased estimation of platform-level effects, making it equivalent to a perfect A/B test. This is a significant theoretical guarantee for a method that can be deployed in real-world scenarios where traditional A/B testing is often impractical due to operational constraints like uniform pricing.
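For readers unfamiliar with Difference-in-Differences, the core arithmetic is short. The sketch below uses made-up daily order counts and a hypothetical helper `did_estimate`; it shows only the textbook two-group, two-period estimator, not the paper's full platform-level estimator.

```python
from statistics import mean


def did_estimate(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """Change in the treated group minus change in the control group."""
    treated_change = mean(treat_post) - mean(treat_pre)
    control_change = mean(ctrl_post) - mean(ctrl_pre)
    return treated_change - control_change


# Hypothetical daily orders before and after the intervention
treat_pre, treat_post = [100, 102, 98], [110, 112, 108]
ctrl_pre, ctrl_post = [90, 92, 88], [94, 96, 92]

effect = did_estimate(treat_pre, treat_post, ctrl_pre, ctrl_post)
naive_gap = mean(treat_post) - mean(ctrl_post)
print(effect, naive_gap)
```

With these numbers the treated group rises by 10 orders per day and the control group by 4, so the estimate is 6. A naive comparison of post-period averages would report 16, because it mixes the pre-existing gap between the groups into the effect. The subtraction is only valid if both groups would have moved in parallel without the intervention, which is why the matching and isolation steps matter.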
Extensive experiments, both offline and online, demonstrated the framework’s effectiveness. In offline evaluations, Stratified CTCVR Matching consistently achieved lower order volume gaps between the matched groups than traditional solutions and other variants, reducing the 30-day order volume gap to 1.36% ± 0.51% at 600K daily orders. Online experiments confirmed that the mutually exclusive approach reduced inter-item cannibalization from 2.0% to a negligible 0.1%. This precision allowed platform-level effects to be measured at a fine scale: a 0.01% ± 0.23% GMV lift and a 0.06% ± 0.15% order volume lift over 7 days, estimates that would have been obscured by interference biases in other methods.
This work not only provides a robust framework for platform-level causal estimation but also contributes an open dataset for marketplace interference analysis, fostering further research in this critical area. The ability to accurately measure the impact of interventions at a platform level, rather than just an item level, offers immense value for large-scale marketplaces like those operated by Alibaba, enabling data-driven decisions that can lead to substantial improvements in key business metrics. You can find the full research paper here: Unbiased Platform-Level Causal Estimation for Search Systems.


