The corpus mar-in_web_2019 is a Marathi Web text corpus (India) based on material from 2019.
It contains 1,712,155 sentences and 20,715,106 tokens.
Leipzig Corpora Collection: Marathi Web text corpus (India) based on material from 2019. Leipzig Corpora Collection. Dataset. https://corpora.uni-leipzig.de?corpusId=mar-in_web_2019.
BibTeX
@misc{mar-in_web_2019,
author = {Leipzig Corpora Collection},
title = {Marathi Web text corpus (India) based on material from 2019},
howpublished = {https://corpora.uni-leipzig.de?corpusId=mar-in_web_2019},
note = {Accessed: 2025-03-21}
}