Wortschatz – deu_newscrawl_2011

German news corpus based on material crawled in 2011 with 26,142,898 sentences.

Description

German news corpus based on material crawled in 2011

Details

Name	deu_newscrawl_2011	Sentences	26,142,898
Language	German	Types	5,876,654
Genre	Newscrawl	Tokens	425,703,275
Year	2011

Link to the corpus

https://corpora.uni-leipzig.de?corpusId=deu_newscrawl_2011

Annotations

coocSim
GDEX
wordsLevenshteinSim

Cite this corpus

Leipzig Corpora Collection: German news corpus based on material crawled in 2011. Leipzig Corpora Collection. Dataset. https://corpora.uni-leipzig.de?corpusId=deu_newscrawl_2011.

BibTeX

@misc{deu_newscrawl_2011,
    author = {Leipzig Corpora Collection},
    title = {German news corpus based on material crawled in 2011},
    howpublished = {https://corpora.uni-leipzig.de?corpusId=deu_newscrawl_2011},
    note = {Accessed: 2025-03-13}
}

Word:

Temeswar

Number of occurrences: 22
Rank: 407,173
Frequency class: 19
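
To put these figures in proportion, the sketch below relates the 22 occurrences to the corpus size given in the Details table (425,703,275 tokens). It also illustrates one common way a frequency class can be derived: the floor of the base-2 logarithm of the ratio between the count of the most frequent word and the count of the word in question. That formula is an assumption here, since this page does not state how the class is computed, and the count used for the most frequent word (`HYPOTHETICAL_TOP_COUNT`) is an invented placeholder, not a value from the corpus.

```python
import math

# Values taken from this page.
TOKENS = 425_703_275   # total tokens in deu_newscrawl_2011
OCCURRENCES = 22       # occurrences of "Temeswar"

# Hypothetical count of the corpus's most frequent word.
# This is NOT from the page; it is a placeholder for illustration only.
HYPOTHETICAL_TOP_COUNT = 12_000_000


def per_million(count: int, total_tokens: int) -> float:
    """Relative frequency expressed as occurrences per million tokens."""
    return count / total_tokens * 1_000_000


def frequency_class(count: int, top_count: int) -> int:
    """Assumed definition: floor(log2(top_count / count)).

    Under this definition the most frequent word has class 0, and each
    step up in class means the word is roughly half as frequent.
    """
    return math.floor(math.log2(top_count / count))


if __name__ == "__main__":
    print(f"per million tokens: {per_million(OCCURRENCES, TOKENS):.4f}")
    print("assumed frequency class:",
          frequency_class(OCCURRENCES, HYPOTHETICAL_TOP_COUNT))
```

Under the assumed definition, a class of 19 would mean the most frequent word occurs somewhere between 2^19 and 2^20 times as often as "Temeswar".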

Hyphenation: Te|mes|war
