Open
Labels
enhancement: New feature or request
Description
https://universaldependencies.org/ provides labelled data (parts of speech, dependency relations and morphological features) for Hindi, Sanskrit, Marathi, Tamil and Telugu.
I plan to use an LM-LSTM-CRF architecture for sequence tagging. However, the language models in iNLTK operate on SentencePiece tokens, whereas the UD labels are assigned per word. Could anyone guide me through using the existing LMs with word tokens, or do I need to retrain the embeddings on word tokens?
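To make the mismatch concrete, here is a minimal sketch of one common workaround: keep the subword model and pool its per-piece vectors into one vector per word before the tagging layer. It assumes the pieces follow the usual SentencePiece convention of marking the start of a word with "▁" (U+2581). The function names `group_pieces` and `pool_to_words` are hypothetical, and the vectors are toy values, not iNLTK output.

```python
from typing import List, Sequence

# "▁" (U+2581): the marker SentencePiece places at the start of a word.
WORD_START = "\u2581"


def group_pieces(pieces: Sequence[str]) -> List[List[int]]:
    """Return, for each word, the indices of the pieces that form it."""
    groups: List[List[int]] = []
    for i, piece in enumerate(pieces):
        # A piece starting with the marker opens a new word; the very first
        # piece also opens one, even if the marker is missing.
        if piece.startswith(WORD_START) or not groups:
            groups.append([i])
        else:
            groups[-1].append(i)
    return groups


def pool_to_words(
    pieces: Sequence[str], vectors: Sequence[Sequence[float]]
) -> List[List[float]]:
    """Mean-pool per-piece vectors into one vector per word."""
    word_vectors: List[List[float]] = []
    for idx in group_pieces(pieces):
        dim = len(vectors[idx[0]])
        word_vectors.append(
            [sum(vectors[i][d] for i in idx) / len(idx) for d in range(dim)]
        )
    return word_vectors


# Toy example: five pieces forming four words (hypothetical segmentation).
pieces = ["▁राम", "▁घर", "▁जा", "ता", "▁है"]
vectors = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [4.0, 0.0], [1.0, 1.0]]
word_vectors = pool_to_words(pieces, vectors)
```

This only lines up with the UD annotations when whitespace-delimited words match UD word boundaries. Where UD splits a surface token into several syntactic words, an extra alignment step would be needed.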