Optimizing Code
By default Emscripten will compile code in a fairly safe way, and without all the possible optimizations. You should generally try this first, to see if things work properly (if not, see the Debugging page). Afterwards, you may want to compile your code so it runs faster. This page explains how.
Note: You can also optimize the source code you are compiling into JavaScript, see Optimizing the source code.
The recommended way to optimize code is with emcc. emcc works like gcc, and basic usage is presented in the Emscripten Tutorial. As mentioned there, you can use emcc to optimize, for example
emcc -O3 file.cpp
To see what the different optimization levels do, run
emcc --help
- Optimization only makes sense when generating JavaScript, not when compiling to intermediate bitcode, and for that reason optimization flags are ignored unless the command will generate JavaScript. For more details see Building Projects.
- The meaning of -O1, -O2, etc. in emcc is not identical to gcc, even though the flags have been chosen to be as familiar as possible. They can't be identical, because optimizing JavaScript is very different from optimizing native code. See emcc --help, as mentioned above, for details.
- If you compile several files into a single JavaScript output, be sure to specify optimization during the last emcc invocation, where you transform the code into JavaScript. See Building Projects for details.
If you run emcc with EMCC_DEBUG=1 (so, something like EMCC_DEBUG=1 emcc), it will output all the intermediate steps after each optimization pass. The output will be in TEMP_DIR/emscripten_temp, where TEMP_DIR is /tmp by default (and can be modified in ~/.emscripten).
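As a rough illustration of the paragraph above, the following Python sketch sets EMCC_DEBUG=1 for a child process and lists whatever was saved in the debug directory. The helper names (debug_env, debug_output_dir, list_debug_outputs) are made up for this example and are not part of Emscripten; the /tmp default is taken from the text and may differ on your setup.

```python
import os


def debug_env(base_env=None):
    """Return a copy of the environment with EMCC_DEBUG=1 set."""
    env = dict(os.environ if base_env is None else base_env)
    env["EMCC_DEBUG"] = "1"
    return env


def debug_output_dir(temp_dir="/tmp"):
    """Directory where the intermediate files are expected to be saved."""
    return os.path.join(temp_dir, "emscripten_temp")


def list_debug_outputs(temp_dir="/tmp"):
    """List saved intermediate files, or [] if the directory does not exist."""
    out_dir = debug_output_dir(temp_dir)
    if not os.path.isdir(out_dir):
        return []
    return sorted(os.listdir(out_dir))


# Typical use (needs emcc on PATH; file.cpp is a placeholder name):
#   subprocess.check_call(["emcc", "-O2", "file.cpp"], env=debug_env())
#   print(list_debug_outputs())
```

Passing the environment explicitly keeps the debug setting local to that one emcc invocation instead of changing it for the whole shell session.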
Sometimes it is useful to add code to the generated JavaScript before optimizations are applied. One reason this is useful is that the added code will then be optimized together with the generated code, which is very important when using the Closure Compiler, for example. Another use case is to modify the generated code in some way, for example to replace a compiled function with a handwritten one.
To do these sorts of things, you can use the --js-transform
flag to emcc. It lets you define a command that is run on the raw compiled JavaScript, before optimizations. The command is run with the filename as a parameter, and you can then read that file, append to it, and overwrite it as needed. For more details, see emcc --help
.
- One dangerous situation here is replacing a generated function with a handwritten one that contains arbitrary JavaScript. The optimizations run on the generated code in later passes assume the code has the form produced by compilation (this allows stronger optimizations, since we know a lot about what kind of code can possibly be generated). To hide handwritten code from the optimization passes, remove the function name from the EMSCRIPTEN_GENERATED_FUNCTIONS array that appears as metadata in a comment at the bottom of the file. Note that adding a function works fine, as only functions in that list are optimized. The only potentially problematic situation is replacing a generated function, because its name will be in that list.
- You can see the transformed code as one of the passes that are saved when using EMCC_DEBUG=1 (see above).
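A transform command of the kind described above could look roughly like the following Python sketch. It reads the file named on the command line, optionally removes replaced function names from the metadata list, appends handwritten code, and overwrites the file. The exact layout of the metadata comment (a single line of the form `// EMSCRIPTEN_GENERATED_FUNCTIONS: [...]` holding a JSON-style array) is an assumption made for this example, so check it against your own output; hide_function and transform are hypothetical helper names.

```python
import json
import re
import sys

# Assumed layout of the metadata comment; verify against your generated file.
GENERATED_RE = re.compile(
    r"^// EMSCRIPTEN_GENERATED_FUNCTIONS: (\[.*\])$", re.MULTILINE
)


def hide_function(js_source, name):
    """Remove `name` from the generated-functions list, if the list is present."""
    def rewrite(match):
        names = [n for n in json.loads(match.group(1)) if n != name]
        return "// EMSCRIPTEN_GENERATED_FUNCTIONS: " + json.dumps(names)
    return GENERATED_RE.sub(rewrite, js_source)


def transform(path, extra_js, replaced=()):
    """Append `extra_js` to the file at `path` and hide any replaced functions."""
    with open(path) as f:
        source = f.read()
    for name in replaced:
        source = hide_function(source, name)
    with open(path, "w") as f:
        f.write(source + "\n" + extra_js + "\n")


if __name__ == "__main__" and len(sys.argv) > 1:
    # The command is run with the filename of the raw compiled JavaScript.
    transform(sys.argv[1], "// handwritten code would go here")
```

If the script only adds new functions, the `replaced` argument can stay empty, since only names already in the list are affected by the later passes.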
The following procedure is how you should normally optimize your code:
- First, run emcc on your code without optimization. The default settings are set to be safe. Check that your code works (if not, see the Debugging page) before continuing.
- Try to optimize with -O3. This aggressively optimizes in various ways, including dangerous ones. If your code works, great! The only additional optimizations potentially worth investigating are memory compression, altering the typed array mode, and disabling LLVM optimizations; see the sections below.
- If not, you should try -O2. That includes most optimizations, and should always work. If this fails, please file a bug.
- If -O2 works, you can stop there; the code is fairly well optimized in most cases. However, -O3 is significantly faster, so you can investigate what stops it from working. See the "Advanced Optimization Issues" sections below. You can probably use -O2 with some of the optimizations from those sections, giving you performance in between -O2 and -O3.
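The procedure above can be sketched as a small Python driver. It is only an outline: builds_and_runs is a hypothetical callable that you would supply yourself, which compiles with the given flag (None meaning no optimization flag at all) and checks that the program still behaves correctly.

```python
def pick_optimization_level(builds_and_runs, levels=("-O3", "-O2")):
    """Return the first level for which `builds_and_runs(level)` is true.

    `builds_and_runs` is a caller-supplied callable; None stands for a build
    without any optimization flag.
    """
    # Step 1: the unoptimized build must work before anything else is tried.
    if not builds_and_runs(None):
        raise RuntimeError("unoptimized build fails; debug that first")
    # Next steps: prefer the most aggressive level that still works.
    for level in levels:
        if builds_and_runs(level):
            return level
    # -O2 is expected to always work, so getting here suggests a bug to report.
    return None
```

Trying -O3 first and falling back mirrors the order given in the list: the aggressive level is the goal, and -O2 is the safe fallback.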
The following sections cover some more complex optimizations.
Typed arrays in JavaScript can be much faster than untyped arrays. However this does not always lead to faster code, so you should check if it does or not. There are also two different typed array modes, see Code Generation Modes.
The default is typed arrays mode 2 (C-like, shared buffers). Mode 1 can be faster in some cases and slower in others - it's worth trying both modes.
A QUANTUM_SIZE of 1 can speed up your code, but is dangerous; see Code Generation Modes. The default is 4, which is the 'normal', safe value (even in -O3, because this optimization is really speculative).
In -O1 and above, LLVM optimizations are applied. However, in some cases they slow down the code, because they are tuned for optimizing native code, not JavaScript. It is worth trying your code without LLVM optimizations as well, with --llvm-opts 0.
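Since the advice here is to try both ways and compare, a tiny comparison helper may be useful. In the sketch below, measure is a hypothetical callable you would write yourself: it builds with the given flags, runs your benchmark, and returns the elapsed seconds. The variant names are arbitrary labels chosen for this example.

```python
def fastest_variant(variants, measure):
    """Measure every flag set and return (name_of_fastest, all_timings)."""
    timings = {name: measure(flags) for name, flags in variants.items()}
    best = min(timings, key=timings.get)
    return best, timings


# Flag sets to compare: -O2 as is, and -O2 with LLVM optimizations turned off.
VARIANTS = {
    "with-llvm-opts": ["-O2"],
    "without-llvm-opts": ["-O2", "--llvm-opts", "0"],
}
```

The same helper can compare other settings from these sections by adding more entries to the dictionary.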
CORRECT_SIGNS, CORRECT_OVERFLOWS and CORRECT_ROUNDINGS are needed in some code. They add a lot of runtime overhead though. If you can, disable them entirely (by compiling with -s OPTION=0 etc., or by editing src/settings.js). Test your code carefully with those options disabled, because it is very possible it will no longer run properly.
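To make the -s OPTION=0 form concrete, here is a minimal Python sketch that turns a dictionary of settings into emcc arguments. settings_args and CORRECTIONS_OFF are names invented for this example.

```python
def settings_args(settings):
    """Turn {"NAME": value} into emcc arguments of the form -s NAME=value."""
    args = []
    for name in sorted(settings):
        args += ["-s", "%s=%s" % (name, settings[name])]
    return args


# All three corrections disabled, as suggested in the text.
CORRECTIONS_OFF = {
    "CORRECT_SIGNS": 0,
    "CORRECT_OVERFLOWS": 0,
    "CORRECT_ROUNDINGS": 0,
}

# Example command line (file.cpp is a placeholder):
#   ["emcc", "file.cpp"] + settings_args(CORRECTIONS_OFF)
```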
If you can't disable them entirely, you can enable corrections for specific lines. Setting the CORRECT_*
option to 2 (see the linespecific
test for more) will correct only those lines.
To automatically find which lines need correction, you can use Emscripten's Profile Guided Optimization (PGO), described in the next section.
In -O3
, no corrections are done, for maximum speed.
To optimize your code with PGO, you should do the following steps:
- Compile your code with PGO=1, CHECK_SIGNS=1, CHECK_OVERFLOWS=1. The generated code is now instrumented to correct everything, and to take note of what needed correction.
- Run your code on a representative workload. The code will run slowly because of the PGO instrumentation. PGO info will be written out when the program stops normally (if it doesn't stop normally, you can call CorrectionsMonitor.print() manually). Save the PGO output.
- Recompile your code with something like
pgo_data = read_pgo_data(pgo_filename)
Settings.CORRECT_SIGNS = 2
Settings.CORRECT_SIGNS_LINES = pgo_data['signs_lines']
Settings.CORRECT_OVERFLOWS = 2
Settings.CORRECT_OVERFLOWS_LINES = pgo_data['overflows_lines']
Here read_pgo_data is a utility function from tools/shared.py. Using it, we take the processed PGO output and apply corrections in mode 2 (correct only the specified lines) to the right lines.
Notes:
- You should compile your source code with -g to see the original source file and line numbers in the generated JavaScript.
- Your code must be run on a representative workload. Corrections will only be done if they were seen to be needed, so if you later run on different input that uses different code paths, things may break.
- There is no CHECK_ROUNDINGS, so PGO can't work on rounding corrections. This correction is rarely needed and has much less runtime overhead though, so just check whether your code only works with CORRECT_ROUNDINGS, and if so, build that way.
- For the first step, you might need CHECK_SIGNED_OVERFLOWS=1 in rare cases, and not just CHECK_OVERFLOWS=1.
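The first PGO step, together with the -g and CHECK_SIGNED_OVERFLOWS notes above, could be assembled into a command line roughly as follows. This is a sketch under the assumption that the settings are passed with -s NAME=value as elsewhere on this page; pgo_instrumented_command is a hypothetical helper name.

```python
def pgo_instrumented_command(source, signed_overflows=False):
    """Build an emcc command line for the instrumented (first) PGO build."""
    settings = ["PGO=1", "CHECK_SIGNS=1", "CHECK_OVERFLOWS=1"]
    if signed_overflows:
        # Only needed in rare cases, per the notes above.
        settings.append("CHECK_SIGNED_OVERFLOWS=1")
    # -g keeps original source file and line numbers in the generated code.
    cmd = ["emcc", "-g"]
    for setting in settings:
        cmd += ["-s", setting]
    return cmd + [source]
```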
(Note for manual tweaking of corrections - you can probably ignore this - you don't necessarily need to recompile each time. You can edit the generated source. unSign
, for example, takes as a third parameter whether to ignore problems, so changing that to true will ignore signing on that line.)
Try changing I64_MODE
and/or DOUBLE_MODE
to 0. This may break code in some cases (see Code Generation Modes for details), but on most code it will work and be much faster.
In -O3
both of these are optimized to 0.
In -O1 and above, exception catching is disabled. This prevents the generation of try-catch blocks, which lets the code run much faster. To re-enable exception catching, run emcc with -s DISABLE_EXCEPTION_CATCHING=0.
The following are some additional topics related to optimization. You probably don't need them.
For additional speed, JavaScript optimizers can be run after Emscripten. The best is probably the Closure Compiler, which both minifies and optimizes the code. The YUI Compressor is also useful (tends to optimize less, but runs faster).
The closure compiler is run automatically by emcc with -O3
.
You can optimize the .ll file that Emscripten receives by running optimizations in the compiler that generates the .ll file, or you can run llvm-opt on an LLVM bitcode file. However, these LLVM optimizations are neither safe nor portable: they are designed with native code in mind, not JavaScript, and it is entirely possible they will break Emscripten. There is a safe subset of them, which is applied if you call emscripten.py with --optimize or if you tell emcc to optimize your code (any setting greater than 0 will use the safe LLVM optimizations).
An additional optimization you can run before Emscripten is the dead function elimination tool, which is in tools/dead_function_eliminator.py in Emscripten. This tool scrubs an .ll file and removes all functions that cannot be run, leaving only those that can be reached (through some chain of calls) from main() or from a global constant. This reduces the size of the .ll file, which leads to both smaller code and faster compilation, but make sure it doesn't remove functions that you want left in (if you are compiling a library, for example).
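The reachability idea behind dead function elimination can be illustrated with a short Python sketch. This is not how tools/dead_function_eliminator.py is implemented; it only shows the general technique on a hand-written call graph, where the roots stand for main() and any functions referenced from global constants.

```python
from collections import deque


def reachable_functions(call_graph, roots):
    """Return every function reachable from `roots` through a chain of calls."""
    roots = list(roots)
    seen = set(roots)
    queue = deque(roots)
    while queue:
        fn = queue.popleft()
        for callee in call_graph.get(fn, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


def eliminate_dead(call_graph, roots):
    """Keep only the functions that can be reached from the roots."""
    live = reachable_functions(call_graph, roots)
    return {fn: callees for fn, callees in call_graph.items() if fn in live}
```

When compiling a library, the exported functions would need to be added to the roots; otherwise they would be removed along with the truly dead code.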