The corpus cmn-cn_web_2019 is a Mandarin Chinese Web text corpus (People’s Republic of China) based on material from 2019.
It contains 2,547,177 sentences and 61,340,150 tokens.
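As a rough sanity check on these figures, the average sentence length can be derived from the two counts. The sketch below is only illustrative: the function name `mean_tokens_per_sentence` is hypothetical and not part of any Leipzig Corpora Collection tooling, and the counts are copied from the description above.

```python
# Counts as stated in the corpus description above.
SENTENCES = 2_547_177
TOKENS = 61_340_150


def mean_tokens_per_sentence(tokens: int, sentences: int) -> float:
    """Return the average number of tokens per sentence.

    Hypothetical helper for illustration; raises ValueError on a
    non-positive sentence count to avoid division by zero.
    """
    if sentences <= 0:
        raise ValueError("sentences must be positive")
    return tokens / sentences


if __name__ == "__main__":
    avg = mean_tokens_per_sentence(TOKENS, SENTENCES)
    print(f"{avg:.2f} tokens per sentence")  # prints "24.08 tokens per sentence"
```

How a "token" is defined depends on the collection's own tokenization for Chinese, so this average describes the published counts rather than characters or words under some other segmentation.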
Leipzig Corpora Collection: Mandarin Chinese Web text corpus (People’s Republic of China) based on material from 2019. Leipzig Corpora Collection. Dataset. https://corpora.uni-leipzig.de?corpusId=cmn-cn_web_2019.
BibTeX
@misc{cmn-cn_web_2019,
author = {Leipzig Corpora Collection},
title = {Mandarin Chinese Web text corpus (People’s Republic of China) based on material from 2019},
howpublished = {https://corpora.uni-leipzig.de?corpusId=cmn-cn_web_2019},
note = {Accessed: 2023-03-27}
}