Data resources for Indonesian NLP
- Leipzig Indonesian sentence collection: news articles, web articles, and Wikipedia data from 2008-2016
- wn-msa.sourceforge.net Wordnet Bahasa
- Quran: Indonesian Quran translations (id.muntakhab, id.jalalayn, id.indonesian)
- Kompas online collection. This corpus contains Kompas online news articles from 2001-2002. See here for more info and citations.
- Tempo online collection. This corpus contains Tempo online news articles from 2000-2002. See here for more info and citations.
- corpus-frog-storytelling: spoken-text storytelling corpus
- TED-Multilingual-Parallel-Corpus Monolingual_data/Indonesian
- Opus Opus NLPL
- Sealang Sealang dataset
Word reference (kemdikbud) link
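- Base entries (Entri Dasar): 48,748 (44.64%)
- Derived words (Kata Turunan): 26,312 (24.09%)
- Compound words (Gabungan Kata): 30,625 (28.04%)
- Proverbs (Peribahasa): 2,040 (1.87%)
- Figurative expressions (Kiasan): 268 (0.25%)
- Idioms (Ungkapan): 1,129 (1.03%)
- Variants (Varian): 91 (0.08%)
- Total entries (Entri Total): 109,213 (100.00%)
- Total senses (Makna Total): 127,775
- Total examples (Contoh Total): 29,495
- Total categories (Kategori Total): 255
- Senses per entry (Makna Per Entri): 1.170
- Examples per sense (Contoh Per Makna): 0.231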
- Word Sastrawi
- Word spaCy : id
- Word name : random-name
- Word Indo name : genderprediction
- Word Indo place : Wilayah-Administratif-Indonesia
- Word Indo place : Indonesia-Postal-Code
- Word Wiktionary : word id
- Word sentiment : analisis-sentimen
- Word sentiment : ID-OpinionWords
- Word sentiment : Analisis-Sentimen-ID
- Word acronyms
- Word : serangkai
- NER : yohanesgultom/nlp-experiments 1700 sentences
- NER : yusufsyaifudin/indonesia-ner 1835 sentences
- POS-TAG : famrashel/idn-tagged-corpus
- POS-TAG : pebbie/pebahasa ~600 sentences
- POS-TAG Parser : UniversalDependencies/UD_Indonesian-GSD ~4477 sentences
- Sentiment : 1506 sentences
- panl10n Pan Localization
- Crawler Indonesian news portal
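
The kemdikbud word-reference statistics listed above can be checked with a few lines of arithmetic. The sketch below is only a sanity check, not part of any listed resource: the counts are copied from the list, the dictionary keys reuse the original Indonesian labels, and the variable names are illustrative assumptions.

```python
# Entry counts per category, copied from the kemdikbud statistics above.
counts = {
    "Entri Dasar": 48748,     # base entries
    "Kata Turunan": 26312,    # derived words
    "Gabungan Kata": 30625,   # compound words
    "Peribahasa": 2040,       # proverbs
    "Kiasan": 268,            # figurative expressions
    "Ungkapan": 1129,         # idioms
    "Varian": 91,             # variants
}
total_entries = 109213   # Entri Total
total_senses = 127775    # Makna Total
total_examples = 29495   # Contoh Total

# The per-category counts should add up to the reported total.
assert sum(counts.values()) == total_entries

# Share of each category as a percentage of all entries (two decimals).
shares = {name: round(100 * n / total_entries, 2) for name, n in counts.items()}

# Average number of senses per entry and examples per sense (three decimals).
senses_per_entry = round(total_senses / total_entries, 3)
examples_per_sense = round(total_examples / total_senses, 3)

for name, share in shares.items():
    print(f"{name}: {counts[name]} ({share:.2f}%)")
print(f"Makna Per Entri: {senses_per_entry:.3f}")
print(f"Contoh Per Makna: {examples_per_sense:.3f}")
```

The computed shares and ratios agree with the figures in the list, which suggests the statistics are internally consistent.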