Commit ffdadc7

[lex] Provide unicode name for all control characters
This commit does not touch the new-line character, which is left to paper P2348. It restricts itself to consistent use of the Unicode character names for space, horizontal tab, and vertical tab. Unlike PR #7359, it deliberately does not touch the grammar, since that would require core review. The intent is to rebase that PR if this one lands.
1 parent bd2412c commit ffdadc7
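
As a quick cross-check of the naming described in the commit message, the following is a minimal C++17 sketch that pairs each whitespace character with its Unicode code point and character name. The names `NamedChar`, `kWhitespace`, and `unicode_name_of` are hypothetical and exist only for this illustration; they are not part of the commit or of the draft sources.

```cpp
#include <string_view>

// Hypothetical helper type for illustration only: one character plus the
// Unicode character name used for it in the wording.
struct NamedChar {
    char32_t value;
    std::string_view unicode_name;
};

// The conventional names (space, horizontal tab, vertical tab, form feed)
// mapped to their Unicode character names.
constexpr NamedChar kWhitespace[] = {
    {U' ',  "space"},                 // U+0020
    {U'\t', "character tabulation"},  // U+0009, "horizontal tab"
    {U'\v', "line tabulation"},       // U+000B, "vertical tab"
    {U'\f', "form feed"},             // U+000C
};

// Returns the Unicode name for a listed character, or an empty view if the
// character is not in the table.
constexpr std::string_view unicode_name_of(char32_t c) {
    for (const NamedChar& entry : kWhitespace) {
        if (entry.value == c) return entry.unicode_name;
    }
    return {};
}

// Compile-time checks that the escape sequences have the expected code points.
static_assert(U' '  == 0x0020);
static_assert(U'\t' == 0x0009);
static_assert(U'\v' == 0x000B);
static_assert(U'\f' == 0x000C);
```

New-line is intentionally absent from the table, mirroring the commit's decision not to touch it.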

1 file changed

+4
-3
lines changed

source/lex.tex

Lines changed: 4 additions & 3 deletions
@@ -140,9 +140,9 @@
 would arise from a source file ending with an unclosed \tcode{/*}
 comment.
 \end{footnote}
-Each comment\iref{lex.comment} is replaced by one space character. New-line characters are
+Each comment\iref{lex.comment} is replaced by one \unicode{0020}{space} character. New-line characters are
 retained. Whether each nonempty sequence of whitespace characters other
-than new-line is retained or replaced by one space character is
+than new-line is retained or replaced by one \unicode{0020}{space} character is
 unspecified.
 As characters from the source file are consumed
 to form the next preprocessing token
@@ -882,7 +882,8 @@
 \end{footnote}
 operators, and other separators.
 \indextext{whitespace}%
-Blanks, horizontal and vertical tabs, newlines, formfeeds, and comments
+Comments and the characters \unicode{0020}{space}, \unicode{0009}{character tabulation},
+\unicode{000b}{line tabulation}, \unicode{000c}{form feed}, and new-line
 (collectively, ``whitespace''), as described below, are ignored except
 as they serve to separate tokens.
 \begin{note}