Commit b933f0c

Fix Windows EH IP2State tables (remove +1 bias) (#144745)
This changes how LLVM constructs certain data structures that relate to exception handling (EH) on Windows. Specifically, this changes how IP2State tables for functions are constructed. The purpose of this change is to align LLVM with the requirements of the Windows AMD64 ABI, which requires that IP2State table entries point to the boundaries between instructions.

On most Windows platforms (AMD64, ARM64, ARM32, IA64, but *not* x86-32), exception handling works by looking up instruction pointers in lookup tables. These lookup tables are stored in `.xdata` sections in executables. One element of the lookup tables is the `IP2State` table (Instruction Pointer to State).

If a function has any instructions that require cleanup during exception unwinding, then it will have an IP2State table. Each entry in the IP2State table describes a range of bytes in the function's instruction stream, and associates an "EH state number" with that range of instructions. A value of -1 means "the null state", which does not require any code to execute. A value other than -1 is an index into the State table.

The entries in the IP2State table contain byte offsets within the instruction stream of the function. The Windows ABI requires that these offsets are aligned to instruction boundaries; they are not permitted to point to a byte that is not the first byte of an instruction.

Unfortunately, CALL instructions present a problem during unwinding. CALL instructions push the address of the instruction after the CALL instruction, so that execution can resume after the CALL. If the CALL is the last instruction within an IP2State region, then the return address (on the stack) points to the *next* IP2State region. This means that the unwinder will use the wrong cleanup funclet during unwinding.

To fix this problem, compilers should insert a NOP after a CALL instruction if the CALL instruction is the last instruction within an IP2State region.
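
The lookup problem described above can be modelled with a small sketch. This is a simplified illustration only: `Ip2StateEntry`, `lookupState`, the table contents, and the byte offsets are all hypothetical, and the real unwinder and `.xdata` layout are more involved. It assumes the lookup picks the last entry whose offset is not greater than the instruction pointer.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified model of one IP2State entry (not LLVM's or the
// Windows runtime's actual layout).
struct Ip2StateEntry {
  std::uint32_t offset; // byte offset from function start where the state begins
  std::int32_t state;   // EH state number; -1 is the null state
};

// Returns the state of the last entry whose offset is <= ipOffset.
// Assumes the entries are sorted by offset.
template <std::size_t N>
constexpr std::int32_t lookupState(const Ip2StateEntry (&table)[N],
                                   std::uint32_t ipOffset) {
  std::int32_t state = -1;
  for (std::size_t i = 0; i < N && table[i].offset <= ipOffset; ++i)
    state = table[i].state;
  return state;
}

// Assumed layout: a 5-byte CALL occupies [0x1B, 0x20), so its return
// address is 0x20. Without padding, state 0 ends exactly at 0x20.
constexpr Ip2StateEntry kWithoutNop[] = {{0x00, -1}, {0x10, 0}, {0x20, -1}};
// With a one-byte NOP at 0x20 kept in state 0, the transition moves to 0x21.
constexpr Ip2StateEntry kWithNop[] = {{0x00, -1}, {0x10, 0}, {0x21, -1}};

static_assert(lookupState(kWithoutNop, 0x20) == -1,
              "return address lands in the next region");
static_assert(lookupState(kWithNop, 0x20) == 0,
              "return address stays in the CALL's region");
```

In this model the return address 0x20 resolves to the null state when the CALL ends the region, and to state 0 once a one-byte NOP follows the CALL inside the same region.
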
The NOP is placed within the same IP2State region as the CALL, so that the return address points to the NOP and the unwinder will locate the correct region.

This PR modifies LLVM so that it inserts NOP instructions after CALL instructions when needed. In performance tests, the NOP has no detectable effect. The NOP is rarely inserted, since it is only inserted when the CALL is the last instruction before an IP2State transition or the CALL is the last instruction before the function epilogue.

NOP padding is only necessary on Windows AMD64 targets. On ARM64 and ARM32, instructions have a fixed size, so the unwinder knows how to "back up" by one instruction.

Interaction with Import Call Optimization (ICO):

Import Call Optimization (ICO) is a compiler + OS feature on Windows which improves the performance and security of DLL imports. ICO relies on using a specific CALL idiom that can be replaced by the OS DLL loader. This removes a load and indirect CALL and replaces it with a single direct CALL.

To achieve this, ICO also inserts NOPs after the CALL instruction. If the end of the CALL is aligned with an EH state transition, we *also* insert a single-byte NOP. **Both forms of NOPs must be preserved.** They cannot be combined into a single larger NOP; nor can the second NOP be removed.

This is necessary because, if ICO is active and the call site is modified by the loader, the loader will end up overwriting the NOPs that were inserted for ICO. That means that those NOPs cannot be used for the correct termination of the exception handling region (the IP2State transition), so we still need an additional NOP instruction. The NOPs cannot be combined into a longer NOP (which is ordinarily desirable) because then ICO would split one instruction, producing a malformed instruction after the ICO call.
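
The padding rules described in the message can be summarised as a small decision sketch. Everything here is hypothetical and for illustration: `Arch`, `CallSite`, `Padding`, `needsEhNop`, and `planPadding` are not LLVM names, and the actual implementation decides this while emitting machine code rather than from precomputed flags.

```cpp
// Hypothetical model; none of these names come from LLVM.
enum class Arch { AMD64, ARM64, ARM32 };

struct CallSite {
  Arch arch;
  bool lastBeforeStateTransition; // CALL is the last instruction before an IP2State transition
  bool lastBeforeEpilogue;        // CALL is the last instruction before the function epilogue
  bool usesIco;                   // call site uses the ICO call idiom
};

struct Padding {
  bool icoNops; // NOPs that belong to the ICO idiom (the loader may overwrite them)
  bool ehNop;   // separate single-byte NOP that terminates the EH region
};

// The EH NOP is only needed on AMD64, and only when the CALL ends a region.
constexpr bool needsEhNop(const CallSite &c) {
  return c.arch == Arch::AMD64 &&
         (c.lastBeforeStateTransition || c.lastBeforeEpilogue);
}

// The two kinds of padding are tracked independently because they must be
// emitted as separate instructions and never merged into one longer NOP.
constexpr Padding planPadding(const CallSite &c) {
  return Padding{c.usesIco, needsEhNop(c)};
}

static_assert(planPadding(CallSite{Arch::AMD64, true, false, true}).ehNop,
              "an ICO call site at a state transition still gets the EH NOP");
```

Keeping the two flags separate mirrors the requirement that the ICO NOPs and the EH NOP are both preserved.
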
1 parent 9f733f4 commit b933f0c

29 files changed (+837, -98 lines changed)
Lines changed: 209 additions & 0 deletions
@@ -0,0 +1,209 @@
+// REQUIRES: x86-registered-target
+
+// RUN: %clang_cl -c --target=x86_64-windows-msvc /EHa -O2 /GS- \
+// RUN: -Xclang=-import-call-optimization \
+// RUN: /clang:-S /clang:-o- -- %s 2>&1 \
+// RUN: | FileCheck %s
+
+#ifdef __clang__
+#define NO_TAIL __attribute((disable_tail_calls))
+#else
+#define NO_TAIL
+#endif
+
+void might_throw();
+void other_func(int x);
+
+void does_not_throw() noexcept(true);
+
+extern "C" void __declspec(dllimport) some_dll_import();
+
+class HasDtor {
+  int x;
+  char foo[40];
+
+public:
+  explicit HasDtor(int x);
+  ~HasDtor();
+};
+
+class BadError {
+public:
+  int errorCode;
+};
+
+void normal_has_regions() {
+  // CHECK-LABEL: .def "?normal_has_regions@@YAXXZ"
+  // CHECK: .seh_endprologue
+
+  // <-- state -1 (none)
+  {
+    HasDtor hd{42};
+
+    // <-- state goes from -1 to 0
+    // because state changes, we expect the HasDtor::HasDtor() call to have a NOP
+    // CHECK: call "??0HasDtor@@QEAA@H@Z"
+    // CHECK-NEXT: nop
+
+    might_throw();
+    // CHECK: call "?might_throw@@YAXXZ"
+    // CHECK-NEXT: nop
+
+    // <-- state goes from 0 to -1 because we're about to call HasDtor::~HasDtor()
+    // CHECK: call "??1HasDtor@@QEAA@XZ"
+    // <-- state -1
+  }
+
+  // <-- state -1
+  other_func(10);
+  // CHECK: call "?other_func@@YAXH@Z"
+  // CHECK-NEXT: nop
+  // CHECK: .seh_startepilogue
+
+  // <-- state -1
+}
+
+// This tests a tail call to a destructor.
+void case_dtor_arg_empty_body(HasDtor x)
+{
+  // CHECK-LABEL: .def "?case_dtor_arg_empty_body@@YAXVHasDtor@@@Z"
+  // CHECK: jmp "??1HasDtor@@QEAA@XZ"
+}
+
+int case_dtor_arg_empty_with_ret(HasDtor x)
+{
+  // CHECK-LABEL: .def "?case_dtor_arg_empty_with_ret@@YAHVHasDtor@@@Z"
+  // CHECK: .seh_endprologue
+
+  // CHECK: call "??1HasDtor@@QEAA@XZ"
+  // CHECK-NOT: nop
+
+  // The call to HasDtor::~HasDtor() should NOT have a NOP because the
+  // following "mov eax, 100" instruction is in the same EH state.
+
+  return 100;
+
+  // CHECK: mov eax, 100
+  // CHECK: .seh_startepilogue
+  // CHECK: .seh_endepilogue
+  // CHECK: .seh_endproc
+}
+
+int case_noexcept_dtor(HasDtor x) noexcept(true)
+{
+  // CHECK: .def "?case_noexcept_dtor@@YAHVHasDtor@@@Z"
+  // CHECK: call "??1HasDtor@@QEAA@XZ"
+  // CHECK-NEXT: mov eax, 100
+  // CHECK: .seh_startepilogue
+  return 100;
+}
+
+void case_except_simple_call() NO_TAIL
+{
+  does_not_throw();
+}
+// CHECK-LABEL: .def "?case_except_simple_call@@YAXXZ"
+// CHECK: .seh_endprologue
+// CHECK-NEXT: call "?does_not_throw@@YAXXZ"
+// CHECK-NEXT: nop
+// CHECK-NEXT: .seh_startepilogue
+// CHECK: .seh_endproc
+
+void case_noexcept_simple_call() noexcept(true) NO_TAIL
+{
+  does_not_throw();
+}
+// CHECK-LABEL: .def "?case_noexcept_simple_call@@YAXXZ"
+// CHECK: .seh_endprologue
+// CHECK-NEXT: call "?does_not_throw@@YAXXZ"
+// CHECK-NEXT: nop
+// CHECK-NEXT: .seh_startepilogue
+// CHECK: .seh_endepilogue
+// CHECK-NEXT: ret
+// CHECK-NEXT: .seh_endproc
+
+// This tests that the destructor is called right before SEH_BeginEpilogue,
+// but in a function that has a return value. Loading the return value
+// counts as a real instruction, so there is no need for a NOP after the
+// dtor call.
+int case_dtor_arg_calls_no_throw(HasDtor x)
+{
+  does_not_throw(); // NOP expected after this call (see CHECK-NEXT below)
+  return 100;
+}
+// CHECK-LABEL: .def "?case_dtor_arg_calls_no_throw@@YAHVHasDtor@@@Z"
+// CHECK: .seh_endprologue
+// CHECK: "?does_not_throw@@YAXXZ"
+// CHECK-NEXT: nop
+// CHECK: "??1HasDtor@@QEAA@XZ"
+// CHECK-NEXT: mov eax, 100
+// CHECK: .seh_startepilogue
+// CHECK: .seh_endproc
+
+// Check the behavior of CALLs that are at the end of MBBs. If a CALL is within
+// a non-null EH state (any state other than -1) and is at the end of an MBB,
+// then we expect to find an EH_LABEL after the CALL. This causes us to insert
+// a NOP, which is the desired result.
+void case_dtor_runs_after_join(int x) {
+  // CHECK-LABEL: .def "?case_dtor_runs_after_join@@YAXH@Z"
+  // CHECK: .seh_endprologue
+
+  // <-- EH state -1
+
+  // ctor call gets a NOP: the EH state changes from -1 to 0 right after it
+  HasDtor hd{42};
+  // CHECK: call "??0HasDtor@@QEAA@H@Z"
+  // CHECK-NEXT: nop
+  // CHECK: test
+
+  // <-- EH state transition from -1 to 0
+  if (x) {
+    might_throw(); // <-- NOP expected (at end of BB w/ EH_LABEL)
+    // CHECK: call "?might_throw@@YAXXZ"
+    // CHECK-NEXT: nop
+  } else {
+    other_func(10); // <-- NOP expected (at end of BB w/ EH_LABEL)
+    // CHECK: call "?other_func@@YAXH@Z"
+    // CHECK-NEXT: nop
+  }
+  does_not_throw();
+  // <-- EH state transition 0 to -1
+  // ~HasDtor() runs
+
+  // CHECK: .seh_endproc
+
+  // CHECK: "$ip2state$?case_dtor_runs_after_join@@YAXH@Z":
+  // CHECK-NEXT: .long [[func_begin:.Lfunc_begin([0-9]+)@IMGREL]]
+  // CHECK-NEXT: .long -1
+  // CHECK-NEXT: .long [[tmp1:.Ltmp([0-9]+)]]@IMGREL
+  // CHECK-NEXT: .long 0
+  // CHECK-NEXT: .long [[tmp2:.Ltmp([0-9]+)]]@IMGREL
+  // CHECK-NEXT: .long -1
+}
+
+
+// Check the behavior of NOP padding around tail calls.
+// We do not expect to insert NOPs around tail calls.
+// In this test both calls are emitted as tail calls (JMP),
+// so neither call gets a NOP.
+void case_tail_call_no_eh(bool b) {
+  // tail call; no NOP padding after JMP
+  if (b) {
+    does_not_throw();
+    // <-- no NOP here
+    return;
+  }
+
+  other_func(20);
+  // <-- also emitted as a tail call (JMP), so no NOP here
+}
+// CHECK-LABEL: .def "?case_tail_call_no_eh@@YAX_N@Z"
+// CHECK: test
+// CHECK-NEXT: je .LBB
+// CHECK: jmp "?does_not_throw@@YAXXZ"
+// CHECK-SAME: TAILCALL
+// CHECK-NEXT: .LBB
+// CHECK-NEXT: mov ecx, 20
+// CHECK-NEXT: jmp "?other_func@@YAXH@Z"
+// CHECK-SAME: TAILCALL
+// CHECK-NEXT: # -- End function
Lines changed: 139 additions & 0 deletions
@@ -0,0 +1,139 @@
+// RUN: %clang_cl -c --target=x86_64-windows-msvc /EHs-c- -O2 /GS- \
+// RUN: -Xclang=-import-call-optimization \
+// RUN: /clang:-S /clang:-o- %s 2>&1 \
+// RUN: | FileCheck %s
+
+#ifdef __clang__
+#define NO_TAIL __attribute((disable_tail_calls))
+#else
+#define NO_TAIL
+#endif
+
+void might_throw();
+void other_func(int x);
+
+void does_not_throw() noexcept(true);
+
+extern "C" void __declspec(dllimport) some_dll_import();
+
+class HasDtor {
+  int x;
+  char foo[40];
+
+public:
+  explicit HasDtor(int x);
+  ~HasDtor();
+};
+
+void normal_has_regions() {
+  {
+    HasDtor hd{42};
+
+    // EH is disabled (/EHs-c-), so the HasDtor::HasDtor() call is not expected to have a NOP
+    might_throw();
+  }
+
+  other_func(10);
+}
+// CHECK-LABEL: .def "?normal_has_regions@@YAXXZ"
+// CHECK: .seh_endprologue
+// CHECK: call "??0HasDtor@@QEAA@H@Z"
+// CHECK-NEXT: call "?might_throw@@YAXXZ"
+// CHECK-NEXT: mov
+// CHECK: call "??1HasDtor@@QEAA@XZ"
+// CHECK-NEXT: mov ecx, 10
+// CHECK-NEXT: call "?other_func@@YAXH@Z"
+// CHECK-NEXT: nop
+// CHECK-NEXT: .seh_startepilogue
+// CHECK-NOT: "$ip2state$?normal_has_regions@@YAXXZ"
+
+// This tests a tail call to a destructor.
+void case_dtor_arg_empty_body(HasDtor x)
+{
+}
+// CHECK-LABEL: .def "?case_dtor_arg_empty_body@@YAXVHasDtor@@@Z"
+// CHECK: jmp "??1HasDtor@@QEAA@XZ"
+
+int case_dtor_arg_empty_with_ret(HasDtor x)
+{
+  // The call to HasDtor::~HasDtor() should NOT have a NOP because the
+  // following "mov eax, 100" instruction is in the same EH state.
+  return 100;
+}
+// CHECK-LABEL: .def "?case_dtor_arg_empty_with_ret@@YAHVHasDtor@@@Z"
+// CHECK: .seh_endprologue
+// CHECK: call "??1HasDtor@@QEAA@XZ"
+// CHECK-NOT: nop
+// CHECK: mov eax, 100
+// CHECK: .seh_startepilogue
+// CHECK: .seh_endepilogue
+// CHECK: .seh_endproc
+
+void case_except_simple_call() NO_TAIL
+{
+  does_not_throw();
+}
+
+// This tests that the destructor is called right before SEH_BeginEpilogue,
+// but in a function that has a return value.
+int case_dtor_arg_calls_no_throw(HasDtor x)
+{
+  does_not_throw(); // no NOP expected
+  return 100;
+}
+
+// Check the behavior of CALLs that are at the end of MBBs. If a CALL is within
+// a non-null EH state (any state other than -1) and is at the end of an MBB,
+// then we expect to find an EH_LABEL after the CALL. This causes us to insert
+// a NOP, which is the desired result.
+void case_dtor_runs_after_join(int x) {
+
+  // ctor call does not need a NOP, because it has real instructions after it
+  HasDtor hd{42};
+
+  if (x) {
+    might_throw();
+  } else {
+    other_func(10);
+  }
+  does_not_throw();
+  // ~HasDtor() runs
+}
+
+// CHECK-LABEL: .def "?case_dtor_runs_after_join@@YAXH@Z"
+// CHECK: .seh_endprologue
+// CHECK: call "??0HasDtor@@QEAA@H@Z"
+// CHECK-NEXT: test
+// CHECK: call "?might_throw@@YAXXZ"
+// CHECK-NEXT: jmp
+// CHECK: call "?other_func@@YAXH@Z"
+// CHECK-NEXT: .LBB
+// CHECK: call "?does_not_throw@@YAXXZ"
+// CHECK-NEXT: lea
+// CHECK-NEXT: call "??1HasDtor@@QEAA@XZ"
+// CHECK-NEXT: nop
+// CHECK-NEXT: .seh_startepilogue
+// CHECK-NOT: "$ip2state$?case_dtor_runs_after_join@@YAXH@Z":
+
+
+// Check the behavior of NOP padding around tail calls.
+// We do not expect to insert NOPs around tail calls.
+// However, the first call (to other_func()) does get a NOP
+// because it comes before .seh_startepilogue.
+void case_tail_call_no_eh() {
+  // ordinary call
+  other_func(10);
+
+  // tail call; no NOP padding after JMP
+  does_not_throw();
+}
+
+// CHECK-LABEL: .def "?case_tail_call_no_eh@@YAXXZ"
+// CHECK: .seh_endprologue
+// CHECK: call "?other_func@@YAXH@Z"
+// CHECK-NEXT: nop
+// CHECK-NEXT: .seh_startepilogue
+// CHECK: .seh_endepilogue
+// CHECK: jmp "?does_not_throw@@YAXXZ"
+// CHECK-NOT: nop
+// CHECK: .seh_endproc
