diff --git a/source/lex.tex b/source/lex.tex
index a26a086d5c..1ced646a79 100644
--- a/source/lex.tex
+++ b/source/lex.tex
@@ -108,10 +108,11 @@
 physical source lines to form logical source lines.
 Only the last backslash on any physical source line
 shall be eligible for being part of such a splice.
-Except for splices reverted in a raw string literal, if a splice results in
-a character sequence that matches the
-syntax of a \grammarterm{universal-character-name}, the behavior is
-undefined. A source file that is not empty and that does not end in a new-line
+\begin{note}
+Line splicing can form
+a \grammarterm{universal-character-name}\iref{lex.charset}.
+\end{note}
+A source file that is not empty and that does not end in a new-line
 character, or that ends in a splice,
 shall be processed as if an additional new-line character were appended
 to the file.
@@ -488,7 +489,7 @@
 operators and punctuators, and
 single non-whitespace characters that do not lexically match the other preprocessing token categories.
 If a \unicode{0027}{apostrophe} or a \unicode{0022}{quotation mark} character
-matches the last category, the behavior is undefined.
+matches the last category, the program is ill-formed.
 If any character not in the basic character set matches the last category,
 the program is ill-formed.
 Preprocessing tokens can be separated by
diff --git a/source/preprocessor.tex b/source/preprocessor.tex
index 0093ee7c8f..828b2c9673 100644
--- a/source/preprocessor.tex
+++ b/source/preprocessor.tex
@@ -1374,11 +1374,9 @@
 of two placemarkers results in a single placemarker preprocessing token, and
 concatenation of a placemarker with a non-placemarker preprocessing token results in the
 non-placemarker preprocessing token.
-If the result begins with a sequence matching the syntax of \grammarterm{universal-character-name},
-the behavior is undefined.
 \begin{note}
-This determination does not consider the replacement of
-\grammarterm{universal-character-name}s in translation phase 3\iref{lex.phases}.
+Concatenation can form
+a \grammarterm{universal-character-name}\iref{lex.charset}.
 \end{note}
 If the result is not a valid preprocessing token,
 the behavior is undefined.