Closed
Labels
I-slow (Issue: Problems and improvements with respect to performance of generated code.)
P-high (High priority)
T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.)
Description
I noticed a significant performance regression since nightly 2018-01-27 when enabling LTO. Here is a simplified example in which the LTO build executes roughly 10x as many instructions.
fn fill<T: Clone + 'static>(init: &'static T) -> Box<Fn(&usize) -> Vec<T>> {
    Box::new(move |&i| {
        let mut vec = Vec::<T>::new();
        vec.resize(i, init.clone());
        vec
    })
}

fn main() {
    let zeroes = fill(&0);
    for _ in 0..1_000_000 {
        zeroes(&1000);
    }
}
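For anyone without perf at hand, here is a rough sketch of the same reproducer with in-process timing. It is not part of the original report: `time_fill` is a hypothetical helper, the return type uses `dyn` so that it compiles on current Rust, and `std::hint::black_box` is added only to keep the optimizer from discarding the result, which may itself change what gets measured.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

// Same shape as the reproducer, spelled with `dyn` for current compilers.
fn fill<T: Clone + 'static>(init: &'static T) -> Box<dyn Fn(&usize) -> Vec<T>> {
    Box::new(move |&i| {
        let mut vec = Vec::<T>::new();
        vec.resize(i, init.clone());
        vec
    })
}

// Hypothetical helper: calls the boxed closure `iters` times with length
// `len` and returns the elapsed wall-clock time.
fn time_fill(iters: usize, len: usize) -> Duration {
    let zeroes = fill(&0i32);
    let start = Instant::now();
    for _ in 0..iters {
        // black_box is an assumption of this sketch, not in the original code.
        black_box(zeroes(black_box(&len)));
    }
    start.elapsed()
}

fn main() {
    let elapsed = time_fill(1_000_000, 1000);
    println!("1_000_000 calls took {:?}", elapsed);
}
```

Building this once per LTO setting and comparing the printed durations should give a coarse version of the comparison below; hardware counters remain the more reliable measurement.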
Without LTO enabled:
$ perf stat -B -e cycles,instructions ./target/release/regr
 Performance counter stats for './target/release/regr':
       335,160,070      cycles
       539,078,439      instructions      #    1.61  insn per cycle
       0.112408607 seconds time elapsed
With LTO enabled:
 Performance counter stats for './target/release/regr':
     1,121,648,955      cycles
     5,233,309,105      instructions      #    4.67  insn per cycle
       0.361163946 seconds time elapsed
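To put the two runs side by side, the slowdown factors can be worked out from the counters above. This is only an arithmetic sketch; the constant names and the `ratio` helper are made up for illustration, and the values are copied from the perf output.

```rust
// Counter values copied from the two perf runs above.
const CYCLES_NO_LTO: u64 = 335_160_070;
const INSNS_NO_LTO: u64 = 539_078_439;
const SECS_NO_LTO: f64 = 0.112408607;

const CYCLES_LTO: u64 = 1_121_648_955;
const INSNS_LTO: u64 = 5_233_309_105;
const SECS_LTO: f64 = 0.361163946;

// Hypothetical helper: how many times larger the LTO figure is.
fn ratio(with_lto: u64, without_lto: u64) -> f64 {
    with_lto as f64 / without_lto as f64
}

fn main() {
    println!("instructions: {:.2}x", ratio(INSNS_LTO, INSNS_NO_LTO));
    println!("cycles:       {:.2}x", ratio(CYCLES_LTO, CYCLES_NO_LTO));
    println!("wall time:    {:.2}x", SECS_LTO / SECS_NO_LTO);
}
```

The instruction count grows by a little under 10x, while cycles and wall time grow by a bit more than 3x, since the LTO build retires more instructions per cycle.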
Tested variants
Bad: -C lto=fat
Good: -C lto=thin
Good: -C lto=fat -C codegen-units=1