Kuaipedia is developed by the Knowledge Engineering Group at Kuaishou (KwaiKEG) in collaboration with HIT and HKUST. It is the world's first large-scale multi-modal short-video encyclopedia, whose primitive units are items, aspects, and short videos.
Please refer to the paper for more details.
Kuaipedia: a Large-scale Multi-modal Short-video Encyclopedia [Manuscript]
We are releasing a subset of Kuaipedia that covers the most popular wiki entries, together with our experimental results. Sample files are in the ./data folder, along with a README.md file that explains each field.
To download the full subset and the experimental results of Kuaipedia, please go to huggingface/dataset/kuaipedia, or use the following link:
link: https://pan.baidu.com/s/1yUB97aL2rBVt-Q0c6sYIcw code: kwyw
The raw video for an entry can be found by appending its video_id to the prefix kuaishou.com/short-video/, e.g. kuaishou.com/short-video/3xwwuqndapzs6nu.
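The URL construction described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the released tooling: the helper name `video_url` is hypothetical, and the `https://` scheme is an assumption, since the README gives the prefix without a scheme.

```python
# Prefix taken from the README; the "https://" scheme is an assumption.
VIDEO_URL_PREFIX = "https://kuaishou.com/short-video"


def video_url(video_id: str) -> str:
    """Build the raw-video URL for a Kuaipedia video_id (hypothetical helper)."""
    cleaned = video_id.strip()
    if not cleaned:
        raise ValueError("video_id must be a non-empty string")
    # Join prefix and id with a single slash, as in the README example.
    return f"{VIDEO_URL_PREFIX}/{cleaned}"


if __name__ == "__main__":
    # video_id copied from the README example.
    print(video_url("3xwwuqndapzs6nu"))
```

Applying the same helper to the video_id field of each record in the data files should give the list of source videos for the subset.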
If you're experiencing any issues with downloading the data file, please don't hesitate to reach out to [email protected] for assistance.
Statistics
Statistic | Full Dump | Subset Dump |
---|---|---|
#Items | > 26 million | 51,702 |
#Aspects | > 2.5 million | 1,074,539 |
#Videos | > 200 million | 769,096 |
Comparison with baseline models (P: precision, R: recall):
Model | Item P | Item R | Item-Aspect P | Item-Aspect R |
---|---|---|---|---|
Random | 87.7 | 49.8 | 36.4 | 49.6 |
LR | 90.4 | 68.3 | 55.1 | 2.7 |
T5-small | 93.7 | 76.1 | 79.3 | 58.5 |
BERT-base | 94.3 | 77.8 | 81.5 | 62.7 |
GPT-3.5 | 90.5 | 86.4 | 41.8 | 95.7 |
Ours | 94.7 | 79.7 | 83.0 | 65.7 |
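Because some rows trade P against R very differently (for example GPT-3.5 on the item-aspect task), a single combined score can make the table easier to compare. The sketch below computes F1, the harmonic mean of P and R, from the numbers in the table above. F1 is not reported in this README; the function names and the `RESULTS` dictionary are illustrative only.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, on the same scale as the inputs."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (Item P, Item R, Item-Aspect P, Item-Aspect R), copied from the table above.
RESULTS = {
    "Random": (87.7, 49.8, 36.4, 49.6),
    "LR": (90.4, 68.3, 55.1, 2.7),
    "T5-small": (93.7, 76.1, 79.3, 58.5),
    "BERT-base": (94.3, 77.8, 81.5, 62.7),
    "GPT-3.5": (90.5, 86.4, 41.8, 95.7),
    "Ours": (94.7, 79.7, 83.0, 65.7),
}


def f1_table(results: dict) -> dict:
    """Map each model to (item F1, item-aspect F1), rounded to one decimal."""
    return {
        model: (round(f1(ip, ir), 1), round(f1(ap, ar), 1))
        for model, (ip, ir, ap, ar) in results.items()
    }


if __name__ == "__main__":
    for model, (item_f1, aspect_f1) in f1_table(RESULTS).items():
        print(f"{model:10s} item F1 = {item_f1:5.1f}   item-aspect F1 = {aspect_f1:5.1f}")
```

These derived values are a reading aid only; the P and R columns remain the reported results.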
Feel free to explore and utilize this valuable dataset for your research and projects.
@article{Kuaipedia22,
author = {Haojie Pan and
Yuzhou Zhang and
Zepeng Zhai and
Ruiji Fu and
Ming Liu and
Yangqiu Song and
Zhongyuan Wang and
Bing Qin
},
title = {{Kuaipedia:} a Large-scale Multi-modal Short-video Encyclopedia},
journal = {CoRR},
volume = {abs/2211.00732},
year = {2022}
}
In addition to the authors of the paper, we appreciate the efforts and help of Jingrun Zhang, Yuelei Li, Lijun Mei, Chunguang Pan, Xing Hu, Lingyu Zou, Yang Li, Dexing Yang, Wenzheng Zhao, Guixin Qiu, Lin Yang, Meijuan Yang, Teng Tu, Xinyi Zheng, Yunhui Guo, and others who contributed to this project.
If you are interested in Kuaipedia and would like to see more cases, please contact us by e-mail at [email protected]