Training Dataset for Visual Product Recommendations

Visual product recommendation ("shop the look", "complete the set", "you might also like") is one of the most commercially impactful applications of computer vision in retail. Its training data requirements, however, differ significantly from those of classification or search. This post explains what data you need and how to structure it.

Recommendation vs. Search: Different Data Needs

Visual search asks: find me an identical or very similar product. Visual recommendation asks: find products that go well with this one. Answering the second question requires pairs or sets of items that humans have judged compatible, not just visually similar. A model trained with a contrastive or triplet loss needs two kinds of examples:

Positive pairs — items that are compatible (same outfit, same room style)
Hard negatives — items that are visually similar but NOT compatible (same colour palette, wrong style mix)

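Positive pairs can be mined from curated editorial content (lookbooks, styled room photos). Hard negatives are harder to source and are usually selected by the training loop itself through online hard-negative mining: within each batch, the loop picks the items that sit closest to the anchor in embedding space but are not labelled compatible with it.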
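To make the mining step concrete, here is a minimal sketch of batch-hard negative selection combined with a triplet loss, written in plain Python for readability. The function names, the margin of 0.2, and the toy two-dimensional embeddings are all illustrative assumptions. The sketch also assumes that any item in the batch not labelled compatible with the anchor can be treated as a negative, which will not always hold for real catalogs.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def build_compatibility(positive_pairs):
    """Turn (a, p) pairs into a symmetric lookup: item -> set of compatible items."""
    compatible = {}
    for a, p in positive_pairs:
        compatible.setdefault(a, set()).add(p)
        compatible.setdefault(p, set()).add(a)
    return compatible

def hardest_negative(anchor, embeddings, compatible):
    """Return the non-compatible item closest to the anchor, or None if there is none."""
    candidates = [
        item for item in embeddings
        if item != anchor and item not in compatible.get(anchor, set())
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda item: euclidean(embeddings[anchor], embeddings[item]))

def batch_hard_triplet_loss(embeddings, positive_pairs, margin=0.2):
    """Mean triplet loss over the batch, using the hardest negative for each anchor."""
    compatible = build_compatibility(positive_pairs)
    losses = []
    for anchor, positives in compatible.items():
        negative = hardest_negative(anchor, embeddings, compatible)
        if negative is None:
            continue
        d_neg = euclidean(embeddings[anchor], embeddings[negative])
        for positive in positives:
            d_pos = euclidean(embeddings[anchor], embeddings[positive])
            # Penalise anchors whose positive is not closer than the negative by `margin`.
            losses.append(max(0.0, d_pos - d_neg + margin))
    return sum(losses) / len(losses) if losses else 0.0

# Toy batch: the rug looks like the sofa (close in embedding space) but is not labelled compatible.
batch = {
    "sofa": [0.0, 0.0],
    "lamp": [1.0, 0.0],
    "rug": [0.1, 0.0],
    "chair": [5.0, 5.0],
}
pairs = [("sofa", "lamp")]
loss = batch_hard_triplet_loss(batch, pairs)
```

In a real training loop the same selection would normally run on framework tensors so that gradients flow back into the encoder; the plain lists here only illustrate the selection logic.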

Dataset Scale for Recommendation Models

Rule of thumb: you need at least 100,000 positive pairs to get a recommendation model that outperforms a fallback popularity ranker. For category-specialised models (fashion outfit completion, room furnishing), 50,000 pairs per category is a reliable minimum.

Using an Ecommerce Image Dataset as a Base

You can bootstrap the image supply from a wide-coverage ecommerce image dataset and then layer on compatibility labels from editorial content or historical co-purchase data. The wider the original dataset, the better coverage you get for long-tail categories.
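As one possible way to derive compatibility labels from co-purchase history, the sketch below counts how often two products appear in the same order and keeps pairs seen at least `min_support` times. The function name, the basket format (a list of product IDs per order), and the threshold of 2 are assumptions made for illustration; real co-purchase data is noisy and usually needs further filtering before it is used as a compatibility signal.

```python
from collections import Counter
from itertools import combinations

def pairs_from_baskets(baskets, min_support=2):
    """Return product pairs that co-occur in at least `min_support` baskets.

    Each pair is an alphabetically ordered tuple, so (a, b) and (b, a) count as one pair.
    """
    counts = Counter()
    for basket in baskets:
        # De-duplicate within a basket so quantity does not inflate the count.
        for a, b in combinations(sorted(set(basket)), 2):
            counts[(a, b)] += 1
    return sorted(pair for pair, n in counts.items() if n >= min_support)

# Hypothetical order history.
orders = [
    ["sofa", "rug", "lamp"],
    ["sofa", "rug"],
    ["lamp", "mug"],
]
positive_pairs = pairs_from_baskets(orders, min_support=2)
```

Raising `min_support` trades coverage for precision: fewer pairs survive, but the ones that do are less likely to be coincidental purchases.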

Evaluation

Standard retrieval metrics (Recall@K, NDCG) apply. Additionally, measure:

Category coverage — does the model return recommendations outside the query category? (desired for room/outfit completion)
Style consistency — use a style classifier to verify returned items are consistent in style with the query
Business metrics — CTR and add-to-basket rate on a live A/B test
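The offline metrics above can be sketched in a few lines. The example below computes Recall@K, NDCG@K with binary relevance, and a simple cross-category rate as one possible reading of "category coverage". The function names, the binary-relevance assumption, and the toy catalog are illustrative rather than a fixed evaluation protocol.

```python
import math

def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant items that appear in the top-k results."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)

def ndcg_at_k(ranked, relevant, k):
    """NDCG@k with binary relevance: gain is 1 for a relevant item, 0 otherwise."""
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(ranked[:k])
        if item in relevant
    )
    ideal_hits = min(k, len(relevant))
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0

def cross_category_rate(ranked, category_of, query_category, k):
    """Share of the top-k results whose category differs from the query's category."""
    top = ranked[:k]
    if not top:
        return 0.0
    return sum(1 for item in top if category_of[item] != query_category) / len(top)

# Hypothetical query: a sofa, with a rug and a lamp labelled as compatible.
ranked = ["rug_01", "sofa_02", "lamp_01", "mug_01"]
relevant = {"rug_01", "lamp_01"}
category_of = {"rug_01": "rug", "sofa_02": "sofa", "lamp_01": "lamp", "mug_01": "mug"}

recall = recall_at_k(ranked, relevant, k=3)
ndcg = ndcg_at_k(ranked, relevant, k=3)
coverage = cross_category_rate(ranked, category_of, "sofa", k=3)
```

For completion-style recommendations a high cross-category rate is desirable, so it is worth tracking alongside the retrieval metrics rather than in place of them.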

Next Steps

Explore the visual search dataset guide for encoder training details, or contact us to discuss a custom dataset scoped to your catalog and recommendation use case.
