Character count is incorrect for non-ASCII commit messages #1726

@eclairevoyant

Describe the bug
When typing non-ASCII characters in a commit message, the character count is incorrect.
Example: 测试 is counted as 6 characters.
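The reported count of 6 is consistent with measuring the string's UTF-8 byte length, since each of these two characters encodes to 3 bytes. Whether GitUI actually counts this way is an assumption and has not been checked against its source. A minimal sketch using only the Rust standard library (the helper names `byte_len` and `scalar_count` are made up for illustration):

```rust
/// Length in UTF-8 bytes. This is what `str::len` returns.
fn byte_len(s: &str) -> usize {
    s.len()
}

/// Number of Unicode scalar values (`char`s). This is not a grapheme
/// count, but the two agree for a string like "测试".
fn scalar_count(s: &str) -> usize {
    s.chars().count()
}

fn main() {
    let msg = "测试";
    println!("bytes: {}", byte_len(msg));
    println!("chars: {}", scalar_count(msg));
}
```

For pure ASCII input both functions return the same number, which may be why a byte-based count goes unnoticed until non-ASCII text is typed.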

To Reproduce
Create a commit message containing emoji, CJK, or other non-ASCII Unicode characters.

Expected behavior
测试 should be counted as either 2 graphemes or 4 columns.

Emoji width is probably impossible to get right due to variations in presentation schemes, so I don't have strong opinions on how this case should be addressed.

Context (please complete the following information):

  • OS/Distro + Version: Linux
  • GitUI Version 0.23.0
  • Rust version: 1.70.04

Additional context

I'm happy to submit a PR (in fact, I'm working on one now), but I wanted to confirm the expected design first.

Do we want to count graphemes or columns?

To me, columns make more sense: the whole point of the informal 50-character limit (defined in terms of printable ASCII) is to make the message fit visually on the screen. A grapheme count seems nice to know but not useful in practice.
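To make the column option concrete, here is a rough sketch of a column-based check against the 50 limit. It is deliberately simplified and uses only the standard library: `approx_char_width`, `approx_width`, and `fits_limit` are hypothetical helpers, and the table covers only a handful of East Asian wide ranges. It ignores combining marks, control characters, and emoji entirely. A real implementation would more likely rely on a maintained width table, such as the one in the `unicode-width` crate.

```rust
/// Hypothetical helper: approximate terminal columns for one `char`.
/// Only a few wide ranges are listed; everything else is assumed to
/// take one column, which is wrong for combining marks and emoji.
fn approx_char_width(c: char) -> usize {
    match c as u32 {
        0x3040..=0x30FF      // Hiragana and Katakana
        | 0x4E00..=0x9FFF    // CJK Unified Ideographs
        | 0xAC00..=0xD7A3    // Hangul syllables
        | 0xFF01..=0xFF60    // Fullwidth ASCII variants
        => 2,
        _ => 1,
    }
}

/// Approximate display width of a whole string, in columns.
fn approx_width(s: &str) -> usize {
    s.chars().map(approx_char_width).sum()
}

/// True if the text fits within `limit` columns.
fn fits_limit(s: &str, limit: usize) -> bool {
    approx_width(s) <= limit
}

fn main() {
    let summary = "fix: 测试";
    println!("columns: {}", approx_width(summary));
    println!("fits in 50: {}", fits_limit(summary, 50));
}
```

With this approach 测试 comes out as 4 columns, matching the column figure in the expected behavior above, and a line of 26 such characters would already exceed the limit.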

Labels: bug (Something isn't working)
