Subword symbol ## should be an option. #33

@FFengIll

Many BERT models use ## as the subword symbol, but not all of them do.
Some popular BERT-based models define their own subword symbol.

For example, in e5 the symbol is ▁ (U+2581, which is not the ASCII underscore):

>>> a = '▁'
>>> a.encode('utf-8')
b'\xe2\x96\x81'
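
To illustrate the request, here is a minimal sketch of what a configurable subword symbol could look like. The function name `group_words` and the parameters `marker` and `marks_continuation` are hypothetical and are not part of this project's API. The sketch assumes that ## prefixes continuation pieces (WordPiece style), while ▁ prefixes the first piece of a word (as is typical for SentencePiece-style vocabularies), so the option probably needs to cover both the symbol and its meaning.

```python
from typing import List


def group_words(
    tokens: List[str],
    marker: str = "##",
    marks_continuation: bool = True,
) -> List[str]:
    """Merge subword tokens back into words using a configurable marker.

    marker: the subword symbol (hypothetical option, e.g. "##" or "▁").
    marks_continuation: True if the marker prefixes continuation pieces,
        False if it prefixes the first piece of each word.
    """
    words: List[str] = []
    for tok in tokens:
        has_marker = tok.startswith(marker)
        # Drop the marker so only the surface text is kept.
        piece = tok[len(marker):] if has_marker else tok
        # Decide whether this token opens a new word.
        new_word = (not has_marker) if marks_continuation else has_marker
        if new_word or not words:
            words.append(piece)
        else:
            words[-1] += piece
    return words


wordpiece = group_words(["un", "##aff", "##able"])
sentencepiece = group_words(
    ["▁hello", "▁wor", "ld"], marker="▁", marks_continuation=False
)
```

With the default settings the first call merges the ## pieces into one word; the second call passes ▁ explicitly and treats it as a word-start marker.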
