Retail AI Training Data: How to Prepare Images for Attribute Models

How to structure retail AI training data for attribute extraction models: category hierarchies, multi-label annotations, imbalance handling, and dataset sourcing.

Attribute extraction (automatically predicting colour, material, fit, and style from a product image) is one of the highest-value applications of computer vision in retail. As a rule of thumb, getting the training data right is about 80% of the work.

Why Attribute Models Need Specialised Data

Generic ImageNet-pretrained models know what a "shirt" is. They do not know that this shirt is slim-fit, cotton-blend, and navy-striped. To learn those distinctions, a model needs thousands of labelled examples per attribute per category, which means a much larger and more structured dataset than category classification requires.

Our retail AI dataset is purpose-built for this: each image is tagged with up to 12 attributes drawn from a controlled vocabulary, covering 8 top-level product categories.

Designing a Multi-Label Attribute Schema

Common taxonomies use a two-level hierarchy (category, then attribute group), with a closed list of allowed values under each group:

Category: Tops → Attribute group: Fit → Values: [slim, regular, oversized]
Category: Tops → Attribute group: Neckline → Values: [crew, v-neck, turtleneck]

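Each image carries one value per attribute group, or null if the group does not apply. The image as a whole is therefore multi-label, while each group is a single-choice field. Store annotations as JSONL, one object per image, to avoid the column explosion of flat CSV once the attribute count grows beyond 20.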
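As an illustration, here is a minimal sketch of what one-object-per-image JSONL records and a vocabulary check could look like. The field names (image_id, category, attributes), the SCHEMA dictionary, and the validate_record helper are assumptions made for this example, not a published ImageHub format.

```python
import json

# Hypothetical controlled vocabulary: category -> attribute group -> allowed values.
# Only the two groups shown in the text are included here.
SCHEMA = {
    "Tops": {
        "Fit": ["slim", "regular", "oversized"],
        "Neckline": ["crew", "v-neck", "turtleneck"],
    },
}


def validate_record(record):
    """Return a list of problems found in one annotation record (empty if valid)."""
    groups = SCHEMA.get(record.get("category"))
    if groups is None:
        return [f"unknown category: {record.get('category')!r}"]

    problems = []
    attributes = record.get("attributes", {})
    for group, value in attributes.items():
        if group not in groups:
            problems.append(f"unknown attribute group: {group!r}")
        elif value is not None and value not in groups[group]:
            # None (JSON null) means "not applicable" and is always allowed.
            problems.append(f"invalid value for {group}: {value!r}")
    for group in groups:
        if group not in attributes:
            problems.append(f"missing attribute group: {group!r}")
    return problems


def parse_jsonl(text):
    """Parse JSONL text: one JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Three made-up records; the third uses a value outside the vocabulary.
SAMPLE = "\n".join([
    json.dumps({"image_id": "img_0001", "category": "Tops",
                "attributes": {"Fit": "slim", "Neckline": "crew"}}),
    json.dumps({"image_id": "img_0002", "category": "Tops",
                "attributes": {"Fit": "regular", "Neckline": None}}),
    json.dumps({"image_id": "img_0003", "category": "Tops",
                "attributes": {"Fit": "baggy", "Neckline": "crew"}}),
])

records = parse_jsonl(SAMPLE)
report = {r["image_id"]: validate_record(r) for r in records}
```

Keeping attributes in a nested object means a new attribute group adds a key rather than a column, and null stays distinguishable from a missing annotation.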

Handling Class Imbalance in Attribute Data

Attribute distributions are usually more skewed than category distributions. "Black" typically outnumbers "yellow" by about 15:1 in fashion datasets. Approaches that work:

  • Focal loss during training (no data changes required)
  • Stratified sampling at batch level
  • Targeted data acquisition — request specific attribute/category combinations from ImageHub's custom catalog
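The first two approaches can be sketched in plain Python, assuming a single attribute group treated as a multi-class problem. The focal loss below follows the usual form, the negative of (1 - p_t)^gamma times log(p_t), without the optional class-weighting term. The sampler uses inverse-frequency weights, which is a simpler class-balancing stand-in rather than strict per-batch stratification. All function names are illustrative; a real training loop would use the equivalent operations in its deep learning framework.

```python
import math
import random
from collections import Counter


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0):
    """Focal loss for one example: -(1 - p_t)^gamma * log(p_t).

    With gamma = 0 this reduces to ordinary cross-entropy; larger gamma
    down-weights examples the model already classifies confidently.
    """
    p_t = softmax(logits)[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def inverse_frequency_weights(labels):
    """One sampling weight per example, so each class has equal total weight."""
    counts = Counter(labels)
    return [1.0 / counts[label] for label in labels]


def sample_batch(items, labels, batch_size, rng):
    """Draw a batch with replacement, favouring rare attribute values."""
    weights = inverse_frequency_weights(labels)
    return rng.choices(items, weights=weights, k=batch_size)


# Toy data mirroring a 15:1 black-to-yellow skew.
labels = ["black"] * 15 + ["yellow"]
items = list(range(len(labels)))
batch = sample_batch(items, labels, batch_size=8, rng=random.Random(0))
```

Because sampling is with replacement, rare examples repeat within an epoch, so it is worth pairing this with augmentation and watching for overfitting on the minority values.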

Combining Multiple Data Sources

Mixing data from multiple retailers improves attribute model robustness but introduces domain shift. Mitigate this by:

  1. Normalising the annotation vocabulary across sources (map synonyms to canonical values)
  2. Recording source_domain in metadata and using domain-adaptation fine-tuning if needed
  3. Holding out one retailer's data as a cross-domain validation set
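Steps 1 and 3 above can be sketched as follows. The SYNONYMS map, the record layout, and the retailer names are made up for illustration; only the source_domain field name comes from the text. Unmapped values pass through lowercased here, whereas a production pipeline would more likely log them for review.

```python
# Hypothetical synonym map: raw annotation value -> canonical value.
SYNONYMS = {
    "navy blue": "navy",
    "slim fit": "slim",
    "slim-fit": "slim",
    "crew neck": "crew",
}


def normalise(value, synonyms=SYNONYMS):
    """Map a raw attribute value to its canonical form (None stays None)."""
    if value is None:
        return None
    key = value.strip().lower()
    return synonyms.get(key, key)


def normalise_record(record):
    """Return a copy of the record with every attribute value normalised."""
    cleaned = dict(record)
    cleaned["attributes"] = {
        group: normalise(value) for group, value in record["attributes"].items()
    }
    return cleaned


def split_by_domain(records, held_out_domain):
    """Hold out one retailer's records as a cross-domain validation set."""
    train = [r for r in records if r["source_domain"] != held_out_domain]
    val = [r for r in records if r["source_domain"] == held_out_domain]
    return train, val


# Made-up records from three hypothetical retailers.
raw_records = [
    {"image_id": "a1", "source_domain": "retailer_a",
     "attributes": {"Fit": "Slim Fit", "Colour": "Navy Blue"}},
    {"image_id": "b1", "source_domain": "retailer_b",
     "attributes": {"Fit": "slim", "Colour": "navy"}},
    {"image_id": "c1", "source_domain": "retailer_c",
     "attributes": {"Fit": "slim-fit", "Colour": None}},
]

records = [normalise_record(r) for r in raw_records]
train, val = split_by_domain(records, held_out_domain="retailer_c")
```

Splitting by retailer rather than by random image keeps the validation set free of the photography style and labelling habits seen in training, which is what makes it a cross-domain check.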

Getting Started

Download a free ImageHub sample to inspect our attribute schema and annotation coverage. For production-scale attribute training data, contact us for a custom order.

