Leipzig Corpora Collection

Search in 1018 Corpus-Based Monolingual Dictionaries for 290 Languages.

Selected language: German Newscrawl 2011

More information about: German Newscrawl 2011

The corpus deu_newscrawl_2011 is a German news corpus based on material crawled in 2011. It contains 26,142,898 sentences and 425,703,275 tokens.

DOWNLOADS

Download parts of this corpus.

STATISTICS

More details about this corpus on our corpus and language statistics page.

Description

German news corpus based on material crawled in 2011

Details

Name	deu_newscrawl_2011
Language	German
Genre	Newscrawl
Year	2011
Sentences	26,142,898
Types	5,876,654
Tokens	425,703,275
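
The counts in the details table allow two simple derived figures: the average sentence length in tokens and the type-token ratio. The following is a minimal sketch that only does arithmetic on the published numbers. The function and variable names are illustrative and not part of any Leipzig Corpora Collection tooling.

```python
# Figures copied from the details table for deu_newscrawl_2011.
SENTENCES = 26_142_898
TYPES = 5_876_654
TOKENS = 425_703_275


def tokens_per_sentence(tokens: int, sentences: int) -> float:
    """Average number of tokens per sentence."""
    return tokens / sentences


def type_token_ratio(types: int, tokens: int) -> float:
    """Distinct word forms (types) divided by running words (tokens)."""
    return types / tokens


avg_len = tokens_per_sentence(TOKENS, SENTENCES)
ttr = type_token_ratio(TYPES, TOKENS)

print(f"tokens per sentence: {avg_len:.2f}")
print(f"type-token ratio:    {ttr:.4f}")
```

With these counts the corpus averages roughly 16.3 tokens per sentence, and the type-token ratio is about 0.014. A ratio this low is expected for a corpus of this size, since the ratio falls as a corpus grows.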

Link to the corpus

https://corpora.uni-leipzig.de?corpusId=deu_newscrawl_2011
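
The link above consists of the portal's base address plus a `corpusId` query parameter. As a small sketch, the same link can be assembled from the corpus name. The helper `corpus_url` is hypothetical, and it is an assumption that other corpus identifiers follow the same URL pattern.

```python
from urllib.parse import urlencode

BASE_URL = "https://corpora.uni-leipzig.de"


def corpus_url(corpus_id: str) -> str:
    """Build a corpus link in the form shown on this page (hypothetical helper)."""
    return f"{BASE_URL}?{urlencode({'corpusId': corpus_id})}"


print(corpus_url("deu_newscrawl_2011"))
```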

Annotations

coocSim
GDEX
wordsLevenshteinSim

Cite this corpus

Leipzig Corpora Collection: German news corpus based on material crawled in 2011. Leipzig Corpora Collection. Dataset. https://corpora.uni-leipzig.de?corpusId=deu_newscrawl_2011.

BibTeX

@misc{deu_newscrawl_2011,
    author = {Leipzig Corpora Collection},
    title = {German news corpus based on material crawled in 2011},
    howpublished = {https://corpora.uni-leipzig.de?corpusId=deu_newscrawl_2011},
    note = {Accessed: 2024-11-27}
}