POS tagging

Universal Dependencies (https://universaldependencies.org/) provides labelled data for parts of speech, dependencies, and morphology for Hindi, Sanskrit, Marathi, Tamil and Telugu.
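For reference, here is a minimal sketch of how the POS labels could be pulled out of a Universal Dependencies treebank file. It assumes the standard 10-column, tab-separated CoNLL-U layout (ID, FORM, LEMMA, UPOS, ...), with `#` comment lines and blank lines between sentences. The function name `read_conllu` and the two-word sample are made up for illustration and do not come from any treebank.

```python
from typing import Iterable, Iterator, List, Tuple


def read_conllu(lines: Iterable[str]) -> Iterator[List[Tuple[str, str]]]:
    """Yield one sentence at a time as a list of (word form, UPOS tag) pairs."""
    sentence: List[Tuple[str, str]] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            # Sentence-level comments such as "# text = ..." carry no tokens.
            continue
        cols = line.split("\t")
        if len(cols) != 10:
            # Not a well-formed token line; skipped in this sketch.
            continue
        token_id = cols[0]
        if "-" in token_id or "." in token_id:
            # Multiword-token ranges (e.g. "1-2") and empty nodes (e.g. "1.1")
            # do not carry a POS tag of their own for tagging purposes.
            continue
        sentence.append((cols[1], cols[3]))  # FORM, UPOS
    if sentence:
        # The file may end without a trailing blank line.
        yield sentence


# Made-up sample in CoNLL-U layout, only to show the call.
SAMPLE = (
    "# text = Dogs bark\n"
    "1\tDogs\tdog\tNOUN\t_\t_\t2\tnsubj\t_\t_\n"
    "2\tbark\tbark\tVERB\t_\t_\t0\troot\t_\t_\n"
    "\n"
)

for sent in read_conllu(SAMPLE.splitlines()):
    print(sent)
```

With a real treebank, the same function can be fed an open file handle, since it only iterates over lines.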
I plan to use an LM-LSTM-CRF architecture for sequence tagging. However, the language models in iNLTK use SentencePiece tokens. Could anyone guide me on using the existing language models with word-level tokens, or do I need to retrain the embeddings for word tokens?
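To make the mismatch concrete: one option I can think of that would avoid retraining is to pool the subword vectors back into one vector per word before they reach the tagger. The sketch below is only an illustration of that idea, not something iNLTK provides. It assumes the pieces follow the default SentencePiece convention of marking word-initial pieces with "▁" (U+2581), and that a vector per piece is already available from the language model. The function names and the toy vectors are hypothetical.

```python
from typing import List, Sequence

WORD_START = "\u2581"  # "▁", the default SentencePiece word-boundary marker


def group_pieces_by_word(pieces: Sequence[str]) -> List[List[int]]:
    """Group piece indices so that each group covers one whitespace-delimited word."""
    groups: List[List[int]] = []
    for i, piece in enumerate(pieces):
        if piece.startswith(WORD_START) or not groups:
            # A marked piece (or the very first piece) opens a new word.
            groups.append([i])
        else:
            # An unmarked piece continues the current word.
            groups[-1].append(i)
    return groups


def mean_pool(
    vectors: Sequence[Sequence[float]], groups: Sequence[Sequence[int]]
) -> List[List[float]]:
    """Average the piece vectors inside each group to get one vector per word."""
    pooled: List[List[float]] = []
    for idxs in groups:
        dim = len(vectors[idxs[0]])
        pooled.append(
            [sum(vectors[i][d] for i in idxs) / len(idxs) for d in range(dim)]
        )
    return pooled


# Toy pieces and 2-dimensional vectors, invented for the example.
pieces = ["\u2581play", "ing", "\u2581foot", "ball"]
vectors = [[1.0, 0.0], [3.0, 2.0], [0.0, 4.0], [2.0, 0.0]]

groups = group_pieces_by_word(pieces)
word_vectors = mean_pool(vectors, groups)
print(groups)
print(word_vectors)
```

Mean pooling is just one choice here; taking only the first piece of each word is another common alignment, and either way the word-level POS labels then line up one-to-one with the pooled vectors.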

POS tagging #13
