About

The M5Product dataset is a large-scale multi-modal pre-training dataset with coarse- and fine-grained annotations for e-commerce products.

• 6 Million multi-modal samples, 5k properties with 24 Million values

• 5 modalities: image, text, table, video, and audio

• 6 Million category annotations with 6k classes

• Wide data source (contributed by 1 Million merchants)

Sampler

The data acquisition page is shown as follows.

Examples

Citation

If you find our dataset useful in your research, please consider citing:

@ARTICLE{2021arXiv210904275D,
    title={M5Product: Self-harmonized Contrastive Learning for E-commercial Multi-modal Pretraining},
    author={Xiao Dong and Xunlin Zhan and Yangxin Wu and Yunchao Wei and Michael C. Kampffmeyer and Xiaoyong Wei and Minlong Lu and Yaowei Wang and Xiaodan Liang},
    year={2021},
    eprint={2109.04275},
    journal={arXiv e-prints},
}

Announcement

2021/09/2 Initial release.

2022/03/12 Update.

Organization

SYSU, BJTU, UiT, PengCheng Lab, Alibaba Group.