Package: frog (0.32-2 and others)
tagger and parser for natural languages (runtime)
Memory-Based Learning (MBL) is a machine-learning method applicable to a wide range of tasks in Natural Language Processing (NLP).
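As a rough illustration of the memory-based idea, the sketch below stores labelled examples verbatim and classifies a new instance by looking up the most similar stored ones. It is a simplified nearest-neighbour toy, not the code used by Frog or its libraries: the function names (`overlap_distance`, `classify`), the three-word context features, and the hand-made example memory are all assumptions made for illustration.

```python
from collections import Counter


def overlap_distance(a, b):
    """Count the feature positions where two instances differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def classify(memory, features, k=1):
    """Return the majority label among the k stored instances nearest to `features`."""
    # Rank all stored instances by their distance to the new instance.
    ranked = sorted(memory, key=lambda inst: overlap_distance(inst[0], features))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy memory: (previous word, focus word, next word) -> coarse tag.
memory = [
    (("de", "kat", "slaapt"), "N"),
    (("de", "hond", "blaft"), "N"),
    (("kat", "slaapt", "nu"), "V"),
    (("hond", "blaft", "hard"), "V"),
]

# An unseen focus word is tagged from its context alone.
print(classify(memory, ("de", "vogel", "zingt")))  # prints: N
```

No model is trained here: all generalisation happens at lookup time, which is why the approach is called memory-based. Production systems typically add feature weighting and more careful neighbour selection.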
Frog is a modular system integrating a morphosyntactic tagger, lemmatizer, morphological analyzer, and dependency parser for natural languages. It is based on its predecessor TADPOLE (TAgger, Dependency Parser, and mOrphoLogical analyzEr). Using Memory-Based Learning techniques, Frog tokenizes, tags, lemmatizes, and morphologically segments word tokens in incoming UTF-8 text files, and assigns a dependency graph to each sentence. Frog is particularly targeted at the increasing need for fast, automatic NLP systems applicable to very large (multi-million to billion word) document collections that are becoming available due to the progressive digitization of both new and old textual data. So far, Frog has been tested and used only with Dutch corpora (see the frogdata package for samples).
Frog is a product of the Centre of Language and Speech Technology at Radboud University Nijmegen. It subsumes previous work by the ILK Research Group (Tilburg University, The Netherlands) and the CLiPS Research Centre (University of Antwerp, Belgium). It is currently maintained at the KNAW Humanities Cluster.
If you do scientific research in NLP, Frog will likely be of use to you.
Other packages related to frog
-
- dep: libc6 (>= 2.34)
- GNU C Library: Shared libraries
also a virtual package provided by: libc6-udeb
-
- dep: libfolia19 (>= 2.17)
- Implementation of the FoLiA document format
-
- dep: libfrog3 (>= 0.32)
- tagger and parser for Dutch language (library)
-
- dep: libgcc-s1 (>= 3.5)
- GCC support library
-
- dep: libicu72 (>= 72.1~rc-1~)
- International Components for Unicode
-
- dep: libmbt2 (>= 3.10)
- memory-based tagger-generator and tagger - runtime
-
- dep: libstdc++6 (>= 13.1)
- GNU Standard C++ Library v3
-
- dep: libticcutils9 (>= 0.34)
- utility functions used in the context of Natural Language Processing (library)
-
- dep: libtimbl7 (>= 6.9)
- Tilburg Memory Based Learner - runtime
-
- dep: libucto6 (>= 0.30)
- Unicode Tokenizer - runtime
-
- rec: ucto
- Unicode Tokenizer