-
Notifications
You must be signed in to change notification settings - Fork 14.6k
[llvm-profgen] Extend llvm-profgen to generate vtable profiles with data access events. #148013
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: users/mingmingl-llvm/samplefdo-profile-format
Are you sure you want to change the base?
Conversation
…enerate vtable profiles for non context-sensitive AFDO profiles
@llvm/pr-subscribers-pgo Author: Mingming Liu (mingmingl-llvm) ChangesAn RFC is in https://discourse.llvm.org/t/rfc-vtable-type-profiling-for-samplefdo/87283 This change extends to process perf data with Intel MEM_INST_RETIRED.ALL_LOADS samples and produce sample profiles with vtable information for non context-sensitive SampleFDO profiles.
Patch is 186.37 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/148013.diff 10 Files Affected:
diff --git a/llvm/test/tools/llvm-profgen/Inputs/dap-perf-trace.txt b/llvm/test/tools/llvm-profgen/Inputs/dap-perf-trace.txt
new file mode 100644
index 0000000000000..28f15b1ff199d
--- /dev/null
+++ b/llvm/test/tools/llvm-profgen/Inputs/dap-perf-trace.txt
@@ -0,0 +1,37 @@
+0 0x7b10 [0x88]: PERF_RECORD_MMAP2 3446532/3446532: [0x200000(0x60000) @ 0 08:01 527501 0]: r--p /path/to/dap.perfbin
+0 0x7b98 [0x88]: PERF_RECORD_MMAP2 3446532/3446532: [0x260000(0x153000) @ 0x5f000 08:01 527501 0]: r-xp /path/to/dap.perfbin
+0 0x7c20 [0x88]: PERF_RECORD_MMAP2 3446532/3446532: [0x3b3000(0xc000) @ 0x1b1000 08:01 527501 0]: r--p /path/to/dap.perfbin
+0 0x7ca8 [0x88]: PERF_RECORD_MMAP2 3446532/3446532: [0x3bf000(0x3000) @ 0x1bc000 08:01 527501 0]: rw-p /path/to/dap.perfbin
+1282514021937402 0x8660 [0x60]: PERF_RECORD_SAMPLE(IP, 0x4002): 3446532/3446532: 0x2608a2 period: 233 addr: 0x3b3f70
+1282514022939813 0x87b0 [0x60]: PERF_RECORD_SAMPLE(IP, 0x4002): 3446532/3446532: 0x2608a2 period: 233 addr: 0x3b3fb0
+1282514023932029 0x8a00 [0x60]: PERF_RECORD_SAMPLE(IP, 0x4002): 3446532/3446532: 0x2608a2 period: 233 addr: 0x3b3fb0
+1282514024937981 0x8d48 [0x60]: PERF_RECORD_SAMPLE(IP, 0x4002): 3446532/3446532: 0x2608a2 period: 233 addr: 0x3b3fb0
+1282514028925828 0x94c0 [0x60]: PERF_RECORD_SAMPLE(IP, 0x4002): 3446532/3446532: 0x2608a2 period: 233 addr: 0x3b3fb0
+1282514028934870 0x9678 [0x60]: PERF_RECORD_SAMPLE(IP, 0x4002): 3446532/3446532: 0x2608ac period: 233 addr: 0x3b3fc0
+1282514029934094 0x9830 [0x60]: PERF_RECORD_SAMPLE(IP, 0x4002): 3446532/3446532: 0x2608a2 period: 233 addr: 0x3b3fb0
+1282514040934785 0xb1d0 [0x60]: PERF_RECORD_SAMPLE(IP, 0x4002): 3446532/3446532: 0x2608ac period: 233 addr: 0x3b3fc0
+1282514052924510 0xcbb8 [0x60]: PERF_RECORD_SAMPLE(IP, 0x4002): 3446532/3446532: 0x2608a2 period: 233 addr: 0x3b3f70
+1282514053932406 0xcfb0 [0x60]: PERF_RECORD_SAMPLE(IP, 0x4002): 3446532/3446532: 0x2608ac period: 233 addr: 0x3b3fc0
+1282514063928248 0xe5c8 [0x60]: PERF_RECORD_SAMPLE(IP, 0x4002): 3446532/3446532: 0x2608a2 period: 233 addr: 0x3b3f70
+1282514073928057 0xfd20 [0x60]: PERF_RECORD_SAMPLE(IP, 0x4002): 3446532/3446532: 0x2608a2 period: 233 addr: 0x3b3f70
+1282514081925013 0x10f28 [0x60]: PERF_RECORD_SAMPLE(IP, 0x4002): 3446532/3446532: 0x2608a2 period: 233 addr: 0x3b3f70
+1282514084927335 0x11678 [0x60]: PERF_RECORD_SAMPLE(IP, 0x4002): 3446532/3446532: 0x2608a2 period: 233 addr: 0x3b3f70
+1282514088926926 0x11f90 [0x60]: PERF_RECORD_SAMPLE(IP, 0x4002): 3446532/3446532: 0x2608a2 period: 233 addr: 0x3b3f70
+1282514089929492 0x12270 [0x60]: PERF_RECORD_SAMPLE(IP, 0x4002): 3446532/3446532: 0x2608a2 period: 233 addr: 0x3b3f70
+1282514119919997 0x16610 [0x60]: PERF_RECORD_SAMPLE(IP, 0x4002): 3446532/3446532: 0x2608a2 period: 233 addr: 0x3b3f70
+1282514120924169 0x16920 [0x60]: PERF_RECORD_SAMPLE(IP, 0x4002): 3446532/3446532: 0x2608a2 period: 233 addr: 0x3b3f70
+1282514145923603 0x1a338 [0x60]: PERF_RECORD_SAMPLE(IP, 0x4002): 3446532/3446532: 0x2608a2 period: 233 addr: 0x3b3f70
+1282514146917708 0x1a428 [0x60]: PERF_RECORD_SAMPLE(IP, 0x4002): 3446532/3446532: 0x2608a2 period: 233 addr: 0x3b3f70
+1282514173914003 0x1e1b0 [0x60]: PERF_RECORD_SAMPLE(IP, 0x4002): 3446532/3446532: 0x2608a2 period: 233 addr: 0x3b3f70
+1282514188915199 0x20488 [0x60]: PERF_RECORD_SAMPLE(IP, 0x4002): 3446532/3446532: 0x2608a2 period: 233 addr: 0x3b3f70
+1282514210915866 0x236d8 [0x60]: PERF_RECORD_SAMPLE(IP, 0x4002): 3446532/3446532: 0x2608a2 period: 233 addr: 0x3b3f70
+1282514212908181 0x23a50 [0x60]: PERF_RECORD_SAMPLE(IP, 0x4002): 3446532/3446532: 0x2608a2 period: 233 addr: 0x3b3f70
+1282514480886012 0x4a098 [0x60]: PERF_RECORD_SAMPLE(IP, 0x4002): 3446532/3446532: 0x2608ac period: 233 addr: 0x3b3f80
+1282514840855333 0x7dd48 [0x60]: PERF_RECORD_SAMPLE(IP, 0x4002): 3446532/3446532: 0x2608ac period: 233 addr: 0x3b3f80
+1282514955835364 0x8e380 [0x60]: PERF_RECORD_SAMPLE(IP, 0x4002): 3446532/3446532: 0x2608ac period: 233 addr: 0x3b3f80
+1282514967839429 0x8fef8 [0x60]: PERF_RECORD_SAMPLE(IP, 0x4002): 3446532/3446532: 0x2608ac period: 233 addr: 0x3b3f80
+1282515023830209 0x97f98 [0x60]: PERF_RECORD_SAMPLE(IP, 0x4002): 3446532/3446532: 0x2608ac period: 233 addr: 0x3b3f80
+1282515356804308 0xc7b28 [0x60]: PERF_RECORD_SAMPLE(IP, 0x4002): 3446532/3446532: 0x2608ac period: 233 addr: 0x3b3f80
+1282515410794371 0xcf590 [0x60]: PERF_RECORD_SAMPLE(IP, 0x4002): 3446532/3446532: 0x2608ac period: 233 addr: 0x3b3f80
+1282515541786485 0xe2280 [0x60]: PERF_RECORD_SAMPLE(IP, 0x4002): 3446532/3446532: 0x2608ac period: 233 addr: 0x3b3f80
+1282515703761203 0xf93c0 [0x60]: PERF_RECORD_SAMPLE(IP, 0x4002): 3446532/3446532: 0x2608ac period: 233 addr: 0x3b3f80
diff --git a/llvm/test/tools/llvm-profgen/Inputs/dap.perfbin b/llvm/test/tools/llvm-profgen/Inputs/dap.perfbin
new file mode 100755
index 0000000000000..e69de29bb2d1d
diff --git a/llvm/test/tools/llvm-profgen/Inputs/lbr-perf-for-dap.script b/llvm/test/tools/llvm-profgen/Inputs/lbr-perf-for-dap.script
new file mode 100644
index 0000000000000..34a2f5121b3c6
--- /dev/null
+++ b/llvm/test/tools/llvm-profgen/Inputs/lbr-perf-for-dap.script
@@ -0,0 +1,175 @@
+PERF_RECORD_MMAP2 3446532/3446532: [0x260000(0x153000) @ 0x5f000 08:01 527501 0]: r-xp /path/to/dap.perfbin
+PERF_RECORD_MMAP2 3446532/3446532: [0x7fff5ff28000(0x2000) @ 0 00:00 0 0]: r-xp [vdso]
+PERF_RECORD_MMAP2 3446532/3446532: [0xffffffffff600000(0x1000) @ 0 00:00 0 0]: --xp [vsyscall]
+ 350fd4 0x260832/0x260894/P/-/-/2//- 0x260832/0x260894/P/-/-/1//- 0x26081b/0x26082e/P/-/-/1//- 0x260c52/0x260814/P/-/-/1//- 0x3508da/0x260c4c/P/-/-/5//- 0x350879/0x350887/P/-/-/3//- 0x260c47/0x350850/P/-/-/3//- 0x26080f/0x260c30/P/-/-/1//- 0x26088f/0x260800/P/-/-/3//- 0x26090a/0x260880/P/-/-/1//- 0x26091c/0x260900/P/-/-/1//- 0x2608bb/0x26090f/P/-/-/2//- 0x3510ab/0x2608af/P/-/-/3//- 0x351059/0x351098/P/-/-/7//- 0x350f8c/0x350fb4/P/-/-/4//- 0x260bf4/0x350f40/P/-/-/1//- 0x260be4/0x260bf0/P/-/-/1//- 0x26085a/0x260be0/P/-/-/2//- 0x2608ac/0x260850/P/-/-/1//- 0x26084c/0x2608a4/P/-/-/2//- 0x2608a2/0x260840/P/-/-/2//- 0x260832/0x260894/P/-/-/1//- 0x26081b/0x26082e/P/-/-/1//- 0x260c52/0x260814/P/-/-/1//- 0x3508da/0x260c4c/P/-/-/5//- 0x350879/0x350887/P/-/-/3//- 0x260c47/0x350850/P/-/-/3//- 0x26080f/0x260c30/P/-/-/1//- 0x26088f/0x260800/P/-/-/3//- 0x26090a/0x260880/P/-/-/1//- 0x26091c/0x260900/P/-/-/1//- 0x2608bb/0x26090f/P/-/-/2//-
+ 260884 0x350f8c/0x350fb4/P/-/-/4//- 0x260bf4/0x350f40/P/-/-/1//- 0x260be4/0x260bf0/P/-/-/1//- 0x26085a/0x260be0/P/-/-/2//- 0x2608ac/0x260850/P/-/-/1//- 0x26084c/0x2608a4/P/-/-/2//- 0x2608a2/0x260840/P/-/-/2//- 0x260832/0x260894/P/-/-/1//- 0x26081b/0x26082e/P/-/-/1//- 0x260c52/0x260814/P/-/-/1//- 0x3508da/0x260c4c/P/-/-/5//- 0x350879/0x350887/P/-/-/3//- 0x260c47/0x350850/P/-/-/3//- 0x26080f/0x260c30/P/-/-/1//- 0x26088f/0x260800/P/-/-/3//- 0x26090a/0x260880/P/-/-/1//- 0x26091c/0x260900/P/-/-/1//- 0x2608bb/0x26090f/P/-/-/2//- 0x3510ab/0x2608af/P/-/-/3//- 0x351059/0x351098/P/-/-/7//- 0x350f8c/0x350fb4/P/-/-/4//- 0x260bf4/0x350f40/P/-/-/1//- 0x260be4/0x260bf0/P/-/-/1//- 0x26087a/0x260be0/P/-/-/2//- 0x2608ac/0x260870/P/-/-/1//- 0x2607fc/0x2608a4/P/-/-/2//- 0x2608a2/0x2607f0/P/-/-/2//- 0x260832/0x260894/P/-/-/1//- 0x260c52/0x260827/P/-/-/1//- 0x3508da/0x260c4c/P/-/-/5//- 0x350879/0x350887/P/-/-/3//- 0x260c47/0x350850/P/-/-/2//-
+ 2608a4 0x260c47/0x350850/P/-/-/2//- 0x260822/0x260c30/P/-/-/1//- 0x260808/0x26081d/P/-/-/1//- 0x26088f/0x260800/P/-/-/3//- 0x26090a/0x260880/P/-/-/1//- 0x26091c/0x260900/P/-/-/1//- 0x2608bb/0x26090f/P/-/-/2//- 0x3510ab/0x2608af/P/-/-/3//- 0x351059/0x351098/P/-/-/7//- 0x350f8c/0x350fb4/P/-/-/4//- 0x260bf4/0x350f40/P/-/-/1//- 0x260be4/0x260bf0/P/-/-/1//- 0x26085a/0x260be0/P/-/-/2//- 0x2608ac/0x260850/P/-/-/1//- 0x26084c/0x2608a4/P/-/-/2//- 0x2608a2/0x260840/P/-/-/2//- 0x260832/0x260894/P/-/-/1//- 0x26081b/0x26082e/P/-/-/1//- 0x260c52/0x260814/P/-/-/1//- 0x3508da/0x260c4c/P/-/-/5//- 0x350879/0x350887/P/-/-/3//- 0x260c47/0x350850/P/-/-/3//- 0x26080f/0x260c30/P/-/-/1//- 0x26088f/0x260800/P/-/-/3//- 0x26090a/0x260880/P/-/-/1//- 0x26091c/0x260900/P/-/-/1//- 0x2608bb/0x26090f/P/-/-/2//- 0x3510ab/0x2608af/P/-/-/3//- 0x351059/0x351098/P/-/-/7//- 0x350f8c/0x350fb4/P/-/-/4//- 0x260bf4/0x350f40/P/-/-/1//- 0x260be4/0x260bf0/P/-/-/1//-
+ 35104c 0x2608ac/0x260850/P/-/-/2//- 0x2608ac/0x260850/P/-/-/1//- 0x26084c/0x2608a4/P/-/-/2//- 0x2608a2/0x260840/P/-/-/2//- 0x260832/0x260894/P/-/-/1//- 0x26081b/0x26082e/P/-/-/1//- 0x260c52/0x260814/P/-/-/1//- 0x3508da/0x260c4c/P/-/-/5//- 0x350879/0x350887/P/-/-/3//- 0x260c47/0x350850/P/-/-/3//- 0x26080f/0x260c30/P/-/-/1//- 0x26088f/0x260800/P/-/-/3//- 0x26090a/0x260880/P/-/-/1//- 0x26091c/0x260900/P/-/-/1//- 0x2608bb/0x26090f/P/-/-/2//- 0x3510ab/0x2608af/P/-/-/3//- 0x351059/0x351098/P/-/-/7//- 0x350f8c/0x350fb4/P/-/-/4//- 0x260bf4/0x350f40/P/-/-/1//- 0x260be4/0x260bf0/P/-/-/1//- 0x26085a/0x260be0/P/-/-/2//- 0x2608ac/0x260850/P/-/-/1//- 0x26084c/0x2608a4/P/-/-/2//- 0x2608a2/0x260840/P/-/-/2//- 0x260832/0x260894/P/-/-/1//- 0x26081b/0x26082e/P/-/-/1//- 0x260c52/0x260814/P/-/-/1//- 0x3508da/0x260c4c/P/-/-/5//- 0x350879/0x350887/P/-/-/3//- 0x260c47/0x350850/P/-/-/3//- 0x26080f/0x260c30/P/-/-/1//- 0x26088f/0x260800/P/-/-/3//-
+ 260800 0x350f8c/0x350fb4/P/-/-/4//- 0x260bf4/0x350f40/P/-/-/1//- 0x260be4/0x260bf0/P/-/-/1//- 0x26087a/0x260be0/P/-/-/2//- 0x2608ac/0x260870/P/-/-/1//- 0x2607fc/0x2608a4/P/-/-/2//- 0x2608a2/0x2607f0/P/-/-/2//- 0x260832/0x260894/P/-/-/1//- 0x260c52/0x260827/P/-/-/1//- 0x3508da/0x260c4c/P/-/-/5//- 0x350879/0x350887/P/-/-/3//- 0x260c47/0x350850/P/-/-/2//- 0x260822/0x260c30/P/-/-/1//- 0x260808/0x26081d/P/-/-/1//- 0x26088f/0x260800/P/-/-/3//- 0x26090a/0x260880/P/-/-/1//- 0x26091c/0x260900/P/-/-/1//- 0x2608bb/0x26090f/P/-/-/2//- 0x3510ab/0x2608af/P/-/-/3//- 0x351059/0x351098/P/-/-/7//- 0x350f8c/0x350fb4/P/-/-/4//- 0x260bf4/0x350f40/P/-/-/1//- 0x260be4/0x260bf0/P/-/-/1//- 0x26085a/0x260be0/P/-/-/2//- 0x2608ac/0x260850/P/-/-/1//- 0x26084c/0x2608a4/P/-/-/2//- 0x2608a2/0x260840/P/-/-/2//- 0x260832/0x260894/P/-/-/1//- 0x26081b/0x26082e/P/-/-/1//- 0x260c52/0x260814/P/-/-/1//- 0x3508da/0x260c4c/P/-/-/5//- 0x350879/0x350887/P/-/-/3//-
+ 2608a4 0x260c47/0x350850/P/-/-/3//- 0x26080f/0x260c30/P/-/-/1//- 0x26088f/0x260800/P/-/-/3//- 0x26090a/0x260880/P/-/-/1//- 0x26091c/0x260900/P/-/-/1//- 0x2608bb/0x26090f/P/-/-/2//- 0x3510ab/0x2608af/P/-/-/3//- 0x351059/0x351098/P/-/-/7//- 0x350f8c/0x350fb4/P/-/-/4//- 0x260bf4/0x350f40/P/-/-/1//- 0x260be4/0x260bf0/P/-/-/1//- 0x26085a/0x260be0/P/-/-/2//- 0x2608ac/0x260850/P/-/-/1//- 0x26084c/0x2608a4/P/-/-/2//- 0x2608a2/0x260840/P/-/-/2//- 0x260832/0x260894/P/-/-/1//- 0x26081b/0x26082e/P/-/-/1//- 0x260c52/0x260814/P/-/-/1//- 0x3508da/0x260c4c/P/-/-/5//- 0x350879/0x350887/P/-/-/3//- 0x260c47/0x350850/P/-/-/3//- 0x26080f/0x260c30/P/-/-/1//- 0x26088f/0x260800/P/-/-/3//- 0x26090a/0x260880/P/-/-/1//- 0x26091c/0x260900/P/-/-/1//- 0x2608bb/0x26090f/P/-/-/2//- 0x3510ab/0x2608af/P/-/-/3//- 0x351059/0x351098/P/-/-/7//- 0x350f8c/0x350fb4/P/-/-/4//- 0x260bf4/0x350f40/P/-/-/1//- 0x260be4/0x260bf0/P/-/-/1//- 0x26085a/0x260be0/P/-/-/2//-
+ 35109d 0x26085a/0x260be0/P/-/-/1//- 0x26085a/0x260be0/P/-/-/2//- 0x2608ac/0x260850/P/-/-/1//- 0x26084c/0x2608a4/P/-/-/2//- 0x2608a2/0x260840/P/-/-/2//- 0x260832/0x260894/P/-/-/1//- 0x26081b/0x26082e/P/-/-/1//- 0x260c52/0x260814/P/-/-/1//- 0x3508da/0x260c4c/P/-/-/5//- 0x350879/0x350887/P/-/-/3//- 0x260c47/0x350850/P/-/-/3//- 0x26080f/0x260c30/P/-/-/1//- 0x26088f/0x260800/P/-/-/3//- 0x26090a/0x260880/P/-/-/1//- 0x26091c/0x260900/P/-/-/1//- 0x2608bb/0x26090f/P/-/-/2//- 0x3510ab/0x2608af/P/-/-/3//- 0x351059/0x351098/P/-/-/7//- 0x350f8c/0x350fb4/P/-/-/4//- 0x260bf4/0x350f40/P/-/-/1//- 0x260be4/0x260bf0/P/-/-/1//- 0x26087a/0x260be0/P/-/-/2//- 0x2608ac/0x260870/P/-/-/1//- 0x2607fc/0x2608a4/P/-/-/2//- 0x2608a2/0x2607f0/P/-/-/2//- 0x260832/0x260894/P/-/-/1//- 0x260c52/0x260827/P/-/-/1//- 0x3508da/0x260c4c/P/-/-/5//- 0x350879/0x350887/P/-/-/3//- 0x260c47/0x350850/P/-/-/2//- 0x260822/0x260c30/P/-/-/1//- 0x260808/0x26081d/P/-/-/1//-
+ 260c30 0x350f8c/0x350fb4/P/-/-/4//- 0x260bf4/0x350f40/P/-/-/1//- 0x260be4/0x260bf0/P/-/-/1//- 0x26085a/0x260be0/P/-/-/2//- 0x2608ac/0x260850/P/-/-/1//- 0x26084c/0x2608a4/P/-/-/2//- 0x2608a2/0x260840/P/-/-/2//- 0x260832/0x260894/P/-/-/1//- 0x26081b/0x26082e/P/-/-/1//- 0x260c52/0x260814/P/-/-/1//- 0x3508da/0x260c4c/P/-/-/5//- 0x350879/0x350887/P/-/-/3//- 0x260c47/0x350850/P/-/-/3//- 0x26080f/0x260c30/P/-/-/1//- 0x26088f/0x260800/P/-/-/3//- 0x26090a/0x260880/P/-/-/1//- 0x26091c/0x260900/P/-/-/1//- 0x2608bb/0x26090f/P/-/-/2//- 0x3510ab/0x2608af/P/-/-/3//- 0x351059/0x351098/P/-/-/7//- 0x350f8c/0x350fb4/P/-/-/4//- 0x260bf4/0x350f40/P/-/-/1//- 0x260be4/0x260bf0/P/-/-/1//- 0x26085a/0x260be0/P/-/-/2//- 0x2608ac/0x260850/P/-/-/1//- 0x26084c/0x2608a4/P/-/-/2//- 0x2608a2/0x260840/P/-/-/2//- 0x260832/0x260894/P/-/-/1//- 0x26081b/0x26082e/P/-/-/1//- 0x260c52/0x260814/P/-/-/1//- 0x3508da/0x260c4c/P/-/-/5//- 0x350879/0x350887/P/-/-/3//-
+ 260bf0 0x350879/0x350887/P/-/-/3//- 0x260c47/0x350850/P/-/-/3//- 0x26080f/0x260c30/P/-/-/1//- 0x26088f/0x260800/P/-/-/3//- 0x26090a/0x260880/P/-/-/1//- 0x26091c/0x260900/P/-/-/1//- 0x2608bb/0x26090f/P/-/-/2//- 0x3510ab/0x2608af/P/-/-/3//- 0x351059/0x351098/P/-/-/7//- 0x350f8c/0x350fb4/P/-/-/4//- 0x260bf4/0x350f40/P/-/-/1//- 0x260be4/0x260bf0/P/-/-/1//- 0x26085a/0x260be0/P/-/-/2//- 0x2608ac/0x260850/P/-/-/1//- 0x26084c/0x2608a4/P/-/-/2//- 0x2608a2/0x260840/P/-/-/2//- 0x260832/0x260894/P/-/-/1//- 0x26081b/0x26082e/P/-/-/1//- 0x260c52/0x260814/P/-/-/1//- 0x3508da/0x260c4c/P/-/-/5//- 0x350879/0x350887/P/-/-/3//- 0x260c47/0x350850/P/-/-/3//- 0x26080f/0x260c30/P/-/-/1//- 0x26088f/0x260800/P/-/-/3//- 0x26090a/0x260880/P/-/-/1//- 0x26091c/0x260900/P/-/-/1//- 0x2608bb/0x26090f/P/-/-/2//- 0x3510ab/0x2608af/P/-/-/3//- 0x351059/0x351098/P/-/-/7//- 0x350f8c/0x350fb4/P/-/-/4//- 0x260bf4/0x350f40/P/-/-/1//- 0x260be4/0x260bf0/P/-/-/1//-
+ 2608af 0x260bf4/0x350f40/P/-/-/1//- 0x260be4/0x260bf0/P/-/-/1//- 0x26087a/0x260be0/P/-/-/2//- 0x2608ac/0x260870/P/-/-/1//- 0x2607fc/0x2608a4/P/-/-/2//- 0x2608a2/0x2607f0/P/-/-/2//- 0x260832/0x260894/P/-/-/1//- 0x260c52/0x260827/P/-/-/1//- 0x3508da/0x260c4c/P/-/-/5//- 0x350879/0x350887/P/-/-/3//- 0x260c47/0x350850/P/-/-/2//- 0x260822/0x260c30/P/-/-/1//- 0x260808/0x26081d/P/-/-/1//- 0x26088f/0x260800/P/-/-/3//- 0x26090a/0x260880/P/-/-/1//- 0x26091c/0x260900/P/-/-/1//- 0x2608bb/0x26090f/P/-/-/2//- 0x3510ab/0x2608af/P/-/-/3//- 0x351059/0x351098/P/-/-/7//- 0x350f8c/0x350fb4/P/-/-/4//- 0x260bf4/0x350f40/P/-/-/1//- 0x260be4/0x260bf0/P/-/-/1//- 0x26085a/0x260be0/P/-/-/2//- 0x2608ac/0x260850/P/-/-/1//- 0x26084c/0x2608a4/P/-/-/2//- 0x2608a2/0x260840/P/-/-/2//- 0x260832/0x260894/P/-/-/1//- 0x26081b/0x26082e/P/-/-/1//- 0x260c52/0x260814/P/-/-/1//- 0x3508da/0x260c4c/P/-/-/5//- 0x350879/0x350887/P/-/-/3//- 0x260c47/0x350850/P/-/-/3//-
+ 350866 0x351059/0x351098/P/-/-/3//- 0x351059/0x351098/P/-/-/7//- 0x350f8c/0x350fb4/P/-/-/4//- 0x260bf4/0x350f40/P/-/-/1//- 0x260be4/0x260bf0/P/-/-/1//- 0x26085a/0x260be0/P/-/-/2//- 0x2608ac/0x260850/P/-/-/1//- 0x26084c/0x2608a4/P/-/-/2//- 0x2608a2/0x260840/P/-/-/2//- 0x260832/0x260894/P/-/-/1//- 0x26081b/0x26082e/P/-/-/1//- 0x260c52/0x260814/P/-/-/1//- 0x3508da/0x260c4c/P/-/-/5//- 0x350879/0x350887/P/-/-/3//- 0x260c47/0x350850/P/-/-/3//- 0x26080f/0x260c30/P/-/-/1//- 0x26088f/0x260800/P/-/-/3//- 0x26090a/0x260880/P/-/-/1//- 0x26091c/0x260900/P/-/-/1//- 0x2608bb/0x26090f/P/-/-/2//- 0x3510ab/0x2608af/P/-/-/3//- 0x351059/0x351098/P/-/-/7//- 0x350f8c/0x350fb4/P/-/-/4//- 0x260bf4/0x350f40/P/-/-/1//- 0x260be4/0x260bf0/P/-/-/1//- 0x26085a/0x260be0/P/-/-/2//- 0x2608ac/0x260850/P/-/-/1//- 0x26084c/0x2608a4/P/-/-/2//- 0x2608a2/0x260840/P/-/-/2//- 0x260832/0x260894/P/-/-/1//- 0x26081b/0x26082e/P/-/-/1//- 0x260c52/0x260814/P/-/-/1//-
+ 350f40 0x350879/0x350887/P/-/-/3//- 0x260c47/0x350850/P/-/-/3//- 0x26080f/0x260c30/P/-/-/1//- 0x26088f/0x260800/P/-/-/3//- 0x26090a/0x260880/P/-/-/1//- 0x26091c/0x260900/P/-/-/1//- 0x2608bb/0x26090f/P/-/-/2//- 0x3510ab/0x2608af/P/-/-/3//- 0x351059/0x351098/P/-/-/7//- 0x350f8c/0x350fb4/P/-/-/4//- 0x260bf4/0x350f40/P/-/-/1//- 0x260be4/0x260bf0/P/-/-/1//- 0x26087a/0x260be0/P/-/-/2//- 0x2608ac/0x260870/P/-/-/1//- 0x2607fc/0x2608a4/P/-/-/2//- 0x2608a2/0x2607f0/P/-/-/2//- 0x260832/0x260894/P/-/-/1//- 0x260c52/0x260827/P/-/-/1//- 0x3508da/0x260c4c/P/-/-/5//- 0x350879/0x350887/P/-/-/3//- 0x260c47/0x350850/P/-/-/2//- 0x260822/0x260c30/P/-/-/1//- 0x260808/0x26081d/P/-/-/1//- 0x26088f/0x260800/P/-/-/3//- 0x26090a/0x260880/P/-/-/1//- 0x26091c/0x260900/P/-/-/1//- 0x2608bb/0x26090f/P/-/-/2//- 0x3510ab/0x2608af/P/-/-/3//- 0x351059/0x351098/P/-/-/7//- 0x350f8c/0x350fb4/P/-/-/4//- 0x260bf4/0x350f40/P/-/-/1//- 0x260be4/0x260bf0/P/-/-/1//-
+ 26090f 0x260bf4/0x350f40/P/-/-/1//- 0x260be4/0x260bf0/P/-/-/1//- 0x26085a/0x260be0/P/-/-/2//- 0x2608ac/0x260850/P/-/-/1//- 0x26084c/0x2608a4/P/-/-/2//- 0x2608a2/0x260840/P/-/-/2//- 0x260832/0x260894/P/-/-/1//- 0x26081b/0x26082e/P/-/-/1//- 0x260c52/0x260814/P/-/-/1//- 0x3508da/0x260c4c/P/-/-/5//- 0x350879/0x350887/P/-/-/3//- 0x260c47/0x350850/P/-/-/3//- 0x26080f/0x260c30/P/-/-/1//- 0x26088f/0x260800/P/-/-/3//- 0x26090a/0x260880/P/-/-/1//- 0x26091c/0x260900/P/-/-/1//- 0x2608bb/0x26090f/P/-/-/2//- 0x3510ab/0x2608af/P/-/-/3//- 0x351059/0x351098/P/-/-/7//- 0x350f8c/0x350fb4/P/-/-/4//- 0x260bf4/0x350f40/P/-/-/1//- 0x260be4/0x260bf0/P/-/-/1//- 0x26085a/0x260be0/P/-/-/2//- 0x2608ac/0x260850/P/-/-/1//- 0x26084c/0x2608a4/P/-/-/2//- 0x2608a2/0x260840/P/-/-/2//- 0x260832/0x260894/P/-/-/1//- 0x26081b/0x26082e/P/-/-/1//- 0x260c52/0x260814/P/-/-/1//- 0x3508da/0x260c4c/P/-/-/5//- 0x350879/0x350887/P/-/-/3//- 0x260c47/0x350850/P/-/-/3//-
+ 260814 0x26090a/0x260880/P/-/-/3//- 0x26090a/0x260880/P/-/-/1//- 0x26091c/0x260900/P/-/-/1//- 0x2608bb/0x26090f/P/-/-/2//- 0x3510ab/0x2608af/P/-/-/3//- 0x351059/0x351098/P/-/-/7//- 0x350f8c/0x350fb4/P/-/-/4//- 0x260bf4/0x350f40/P/-/-/1//- 0x260be4/0x260bf0/P/-/-/1//- 0x26085a/0x260be0/P/-/-/2//- 0x2608ac/0x260850/P/-/-/1//- 0x26084c/0x2608a4/P/-/-/2//- 0x2608a2/0x260840/P/-/-/2//- 0x260832/0x260894/P/-/-/1//- 0x26081b/0x26082e/P/-/-/1//- 0x260c52/0x260814/P/-/-/1//- 0x3508da/0x260c4c/P/-/-/5//- 0x350879/0x350887/P/-/-/3//- 0x260c47/0x350850/P/-/-/3//- 0x26080f/0x260c30/P/-/-/1//- 0x26088f/0x260800/P/-/-/3//- 0x26090a/0x260880/P/-/-/1//- 0x26091c/0x260900/P/-/-/1//- 0x2608bb/0x26090f/P/-/-/2//- 0x3510ab/0x2608af/P/-/-/3//- 0x351059/0x351098/P/-/-/7//- 0x350f8c/0x350fb4/P/-/-/4//- 0x260bf4/0x350f40/P/-/-/1//- 0x260be4/0x260bf0/P/-/-/1//- 0x26087a/0x260be0/P/-/-/2//- 0x2608ac/0x260870/P/-/-/1//- 0x2607fc/0x2608a4/P/-/-/2//-
+ 350f57 0x350879/0x350887/P/-/-/5//- 0x350879/0x350887/P/-/-/3//- 0x260c47/0x350850/P/-/-/2//- 0x260822/0x260c30/P/-/-/1//- 0x260808/0x26081d/P/-/-/1//- 0x26088f/0x260800/P/-/-/3//- 0x26090a/0x260880/P/-/-/1//- 0x26091c/0x260900/P/-/-/1//- 0x2608bb/0x26090f/P/-/-/2//- 0x3510ab/0x2608af/P/-/-/3//- 0x351059/0x351098/P/-/-/7//- 0x350f8c/0x350fb4/P/-/-/4//- 0x260bf4/0x350f40/P/-/-/1//- 0x260be4/0x260bf0/P/-/-/1//- 0x26085a/0x260be0/P/-/-/2//- 0x2608ac/0x260850/P/-/-/1//- 0x26084c/0x2608a4/P/-/-/2//- 0x2608a2/0x260840/P/-/-/2//- 0x260832/0x260894/P/-/-/1//- 0x26081b/0x26082e/P/-/-/1//- 0x260c52/0x260814/P/-/-/1//- 0x3508da/0x260c4c/P/-/-/5//- 0x350879/0x350887/P/-/-/3//- 0x260c47/0x350850/P/-/-/3//- 0x26080f/0x260c30/P/-/-/1//- 0x26088f/0x260800/P/-/-/3//- 0x26090a/0x260880/P/-/-/1//- 0x26091c/0x260900/P/-/-/1//- 0x2608bb/0x26090f/P/-/-/2//- 0x3510ab/0x2608af/P/-/-/3//- 0x351059/0x351098/P/-/-/7//- 0x350f8c/0x350fb4/P/-/-/4//-
+ 260880 0x350f8c/0x350fb4/P/-/-/4//- 0x260bf4/0x350f40/P/-/-/1//- 0x260be4/0x260bf0/P/-/-/1//- 0x26085a/0x260be0/P/-/-/2//- 0x2608ac/0x260850/P/-/-/1//- 0x26084c/0x2608a4/P/-/-/2//- 0x2608a2/0x260840/P/-/-/2//- 0x260832/0x260894/P/-/-/1//- 0x26081b/0x26082e/P/-/-/1//- 0x260c52/0x260814/P/-/-/1//- 0x3508da/0x260c4c/P/-/-/5//- 0x350879/0x350...
[truncated]
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly cosmetic comments..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for reviews! PTAL.
} | ||
|
||
// Given a runtime address, canonicalize it to the virtual address in the | ||
// binary. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the comment to mention 'non-text' and added a TODO to consider unifying text vs non-text. I think it should be possible to do some refactoring, but not sure if we can completely get rid of a data-vs-text hint from caller side (e.g. executable's linking option may affect the code/data to segment mapping).
@@ -946,6 +978,14 @@ SampleContextFrameVector ProfiledBinary::symbolize(const InstructionPointer &IP, | |||
return CallStack; | |||
} | |||
|
|||
StringRef ProfiledBinary::symbolizeDataAddress(uint64_t Address) { | |||
DIGlobal DataDIGlobal = unwrapOrError( | |||
Symbolizer->symbolizeData(SymbolizerPath.str(), {Address, 0}), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LLVMSymbolizer's symbolize* interfaces [1] require a struct of object::SectionedAddress
[2] to symbolize an address, like, they don't take an integer of address alone. Chasing down the calls [3], sectionIndex
field isn't used so its value doesn't matter [4].
Upon this question, it's better to use UndefSection
[5] here. Added a static helper function getSectionedAddress
for this purpose, and use it for both code and data inside ProfiledBinary
class.
[2]
llvm-project/llvm/include/llvm/Object/ObjectFile.h
Lines 146 to 151 in 13f7786
struct SectionedAddress { | |
const static uint64_t UndefSection = UINT64_MAX; | |
uint64_t Address = 0; | |
uint64_t SectionIndex = UndefSection; | |
}; |
[3]
llvm-project/llvm/lib/DebugInfo/Symbolize/Symbolize.cpp
Lines 155 to 178 in 13f7786
LLVMSymbolizer::symbolizeDataCommon(const T &ModuleSpecifier, | |
object::SectionedAddress ModuleOffset) { | |
auto InfoOrErr = getOrCreateModuleInfo(ModuleSpecifier); | |
if (!InfoOrErr) | |
return InfoOrErr.takeError(); | |
SymbolizableModule *Info = *InfoOrErr; | |
// A null module means an error has already been reported. Return an empty | |
// result. | |
if (!Info) | |
return DIGlobal(); | |
// If the user is giving us relative addresses, add the preferred base of | |
// the object to the offset before we do the query. It's what DIContext | |
// expects. | |
if (Opts.RelativeAddresses) | |
ModuleOffset.Address += Info->getModulePreferredBase(); | |
DIGlobal Global = Info->symbolizeData(ModuleOffset); | |
if (Opts.Demangle) | |
Global.Name = DemangleName(Global.Name, Info); | |
return Global; | |
} |
[4]
llvm-project/llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
Lines 1753 to 1766 in 13f7786
std::optional<DILineInfo> | |
DWARFContext::getLineInfoForDataAddress(object::SectionedAddress Address) { | |
DILineInfo Result; | |
DWARFCompileUnit *CU = getCompileUnitForDataAddress(Address.Address); | |
if (!CU) | |
return Result; | |
if (DWARFDie Die = CU->getVariableForAddress(Address.Address)) { | |
Result.FileName = Die.getDeclFile(FileLineInfoKind::AbsoluteFilePath); | |
Result.Line = Die.getDeclLine(); | |
} | |
return Result; | |
} |
[5]
llvm-project/llvm/include/llvm/Object/ObjectFile.h
Lines 146 to 147 in 13f7786
struct SectionedAddress { | |
const static uint64_t UndefSection = UINT64_MAX; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of requests regarding testing --
- I think the current dap.bin is non-PIE, can you add a PIE variant test case too?
- Can you trim the lbr-perf-for-dap.script test input to preserve only whats needed for a minimal test. I think a lot of the samples are repeated and we can just test for a lower count.
@@ -603,6 +640,16 @@ class ProfiledBinary { | |||
return ProbeDecoder.getInlinerDescForProbe(Probe); | |||
} | |||
|
|||
void addMMapNonTextEvent(MMapEvent MMap) { | |||
MMapNonTextEvents.push_back(MMap); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible that multiple mmap events can happen on the same segment? If so, would it be better to use a map here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think even if there were mutliple mmap events for the same segment we would still only want the last one. For the things we care about .text* .data*, we shouldn't have multiple copies in the address space.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. Using vector here, I think there is a chance an address could be matched to an old map during canonicalization. If using map with offsets as keys, we only keep the most recent one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding is that it's rare for the virtual address ranges (i.e., the intervals formed by [MMapEvent.Address, MMapEvent.Address + MMap.Size)
) of mmap events to overlap with each other. Assuming this is true, I think there are two implementation options here:
-
Make llvm-profgen to validate mmap events don't have overlapping virtual address (and report 'unimplemented error' otherwise). This way, the implementation can use a map to do efficient address canonicalization and handle the common case. The updated change implements this option.
-
llvm-profgen doesn't validate address interval and cannot assume non-overlapping address ranges. The implementation uses a vector to record the mmap events in the order they are added and reverse iterate the map to do data address canonicalization. While a linear reverse iteration is probably acceptable given the number of non-text mmap events is observed to be 3 in most cases, I think option 1) does the better trade-off.
Let me know your thoughts and I'd be glad to follow up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the update!
Sorry for not making it more clearer. I think this won't be an issue if we only consider the main executable right now. Its sections should just be loaded once and stay the same. I was thinking about if we are going to support dynamic loading/reloading of shared libraries (dlopen/dlclose
), which may make the same DSO appear at different addresses. Maybe this isn't something we need to consider right now. The current map solution can help surface such issue in case we want to go that direction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the clarification @apolloww !
I think this won't be an issue if we only consider the main executable right now. Its sections should just be loaded once and stay the same. I was thinking about if we are going to support dynamic loading/reloading of shared libraries (dlopen/dlclose), which may make the same DSO appear at different addresses. Maybe this isn't something we need to consider right now. The current map solution can help surface such issue in case we want to go that direction.
Acknowledged that dynamic loading of shared libraries can create overlapping intervals and that producing vtable symbols from shared DSOs is not something we want to pursue in this version. Added some comment to call out the considerations behind the current implementation. PTAL.
@@ -185,6 +185,16 @@ class BinarySizeContextTracker { | |||
|
|||
using AddressRange = std::pair<uint64_t, uint64_t>; | |||
|
|||
// The parsed MMap event | |||
struct MMapEvent { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe still keep this in PerfReader with its parsing logic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously this class definition is moved for ProfiledBinary class to use.
Now moved MMapEvent
back to PerfReader.h and kept it outside PerfReaderBase
class, and updated ProfiledBinary class to use subfields of MMapEvent
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I'd still prefer reusing MmapEvent
across the 2 classes, but no strong opinion on that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved MMapEvent
from PerfReader to ProfiledBinary to reuse the classes. PTAL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work. Just adding a few nits.
can you add a PIE variant test case too?
IIUC any PIE/PIC code is not yet supported, possibly due to the mapping done below?
Or is it to ensure tests are deterministic (ie we have materialized dap-perf-trace.txt, lbr-perf-for-dap.script) ?
llvm-project/llvm/tools/llvm-profgen/PerfReader.cpp
Lines 424 to 429 in 870d957
StringRef DataSymbol = Binary->symbolizeDataAddress( | |
Binary->CanonicalizeNonTextAddress(DataAddress)); | |
if (DataSymbol.starts_with("_ZTV")) { | |
uint64_t IP = 0; | |
Fields[2].getAsInteger(0, IP); | |
Counter.recordDataAccessCount(Binary->canonicalizeVirtualAddress(IP), |
@@ -1027,6 +1027,20 @@ class FunctionSamples { | |||
return VirtualCallsiteTypeCounts[mapIRLocToProfileLoc(Loc)]; | |||
} | |||
|
|||
// At location \p Loc, add a type sample for the given \p Type with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: could use triple ///
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
uint64_t DataAddress = 0; | ||
Fields[3].getAsInteger(0, DataAddress); | ||
|
||
StringRef DataSymbol = Binary->symbolizeDataAddress( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it's worth adding a comment that the memory accesses pointing to the vtables (based on the ABI convention) are recorded?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commented that vtable accesses are used and Itanium C++ ABI is assumed to find out vtable accesses.
Co-authored-by: Paschalis Mpeis <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for reviews! PTAL.
@@ -1027,6 +1027,20 @@ class FunctionSamples { | |||
return VirtualCallsiteTypeCounts[mapIRLocToProfileLoc(Loc)]; | |||
} | |||
|
|||
// At location \p Loc, add a type sample for the given \p Type with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
uint64_t DataAddress = 0; | ||
Fields[3].getAsInteger(0, DataAddress); | ||
|
||
StringRef DataSymbol = Binary->symbolizeDataAddress( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commented that vtable accesses are used and Itanium C++ ABI is assumed to find out vtable accesses.
@@ -185,6 +185,16 @@ class BinarySizeContextTracker { | |||
|
|||
using AddressRange = std::pair<uint64_t, uint64_t>; | |||
|
|||
// The parsed MMap event | |||
struct MMapEvent { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously this class definition is moved for ProfiledBinary class to use.
Now moved MMapEvent
back to PerfReader.h and kept it outside PerfReaderBase
class, and updated ProfiledBinary class to use subfields of MMapEvent
.
@@ -603,6 +640,16 @@ class ProfiledBinary { | |||
return ProbeDecoder.getInlinerDescForProbe(Probe); | |||
} | |||
|
|||
void addMMapNonTextEvent(MMapEvent MMap) { | |||
MMapNonTextEvents.push_back(MMap); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding is that it's rare for the virtual address ranges (i.e., the intervals formed by [MMapEvent.Address, MMapEvent.Address + MMap.Size)
) of mmap events to overlap with each other. Assuming this is true, I think there are two implementation options here:
-
Make llvm-profgen to validate mmap events don't have overlapping virtual address (and report 'unimplemented error' otherwise). This way, the implementation can use a map to do efficient address canonicalization and handle the common case. The updated change implements this option.
-
llvm-profgen doesn't validate address interval and cannot assume non-overlapping address ranges. The implementation uses a vector to record the mmap events in the order they are added and reverse iterate the map to do data address canonicalization. While a linear reverse iteration is probably acceptable given the number of non-text mmap events is observed to be 3 in most cases, I think option 1) does the better trade-off.
Let me know your thoughts and I'd be glad to follow up.
done! It took me sometime to find a good time to tune the profiling period for a non-PIE binary without over-heating the machine, and now sampling at 1,000,003 is a good number :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm. It would be good to get an approval from llvm-profgen owners.
cc: @WenleiHe @wlei-llvm
uint64_t Count) { | ||
auto &TypeCounts = getTypeSamplesAt(Loc); | ||
bool Overflowed = false; | ||
TypeCounts[Type] = SaturatingMultiplyAdd(Count, /* Weight= */ (uint64_t)1, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should insert into the map if it overflowed. Can you check the overflow first and then insert?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As clarified offline, the SaturatingMultiplyAdd
will clamp the result if overflow happens, and we insert the return value of SaturatingMultiplyAdd
. From this perspective, counter_overflow is more of informative warning as opposed to a real error that wraps around a large unsigned integer into another value.
I updated the comment to make this more explicit, and will probably prepare a separate change around the warning handling (in SampleProf or InstrProf).
llvm/tools/llvm-profgen/PerfReader.h
Outdated
// mapping between the binary name and its memory layout. | ||
static bool extractMMapEventForBinary(ProfiledBinary *Binary, StringRef Line, | ||
MMapEvent &MMap); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: extra newline? The existing code doesn't have new lines between decls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
@@ -946,6 +978,14 @@ SampleContextFrameVector ProfiledBinary::symbolize(const InstructionPointer &IP, | |||
return CallStack; | |||
} | |||
|
|||
StringRef ProfiledBinary::symbolizeDataAddress(uint64_t Address) { | |||
DIGlobal DataDIGlobal = unwrapOrError( | |||
Symbolizer->symbolizeData(SymbolizerPath.str(), {Address, 0}), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the detailed followup.
|
||
const auto &MMapEvent = MMapIter->second; | ||
|
||
// If the address is within the non-text mmap event, calculates its file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: s/calculates/calculate/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
Sure! also cc @apolloww for (more) reviews and comments! Let me know if there are questions about the work of vtable type profiling (this change or #148002 for SamplePGO, or for Instrumented PGO). |
An RFC is in https://discourse.llvm.org/t/rfc-vtable-type-profiling-for-samplefdo/87283
This change extends to process perf data with Intel MEM_INST_RETIRED.ALL_LOADS samples and produce sample profiles with vtable information for non context-sensitive SampleFDO profiles.