Link time optimization (LTO) for the Linux kernel

This is an experimental feature.

Link Time Optimization allows the compiler to optimize the complete program
instead of just each file.

LTO requires at least gcc 4.8 (but works more efficiently with 4.9+).
LTO requires Linux binutils (the normal FSF releases used in many
distributions do not work at the moment).

The compiler can inline functions between files and do various other global
optimizations, like specializing functions for common parameters,
determining when global variables are clobbered, making functions
pure/const, propagating constants globally, removing unneeded data and
others.

It will also drop unused functions, which can make the kernel image smaller
in some circumstances, in particular for small kernel configurations.

For small monolithic kernels it can throw away unused code very effectively
(especially when modules are disabled) and usually shrinks the code size.

Build time and memory consumption at build time will increase, depending
on the size of the largest binary. Modular kernels are less affected.
With LTO, incremental builds are less incremental, as the whole binary
always needs to be re-optimized (but not re-parsed).

Oopses can be somewhat more difficult to read, due to the more aggressive
inlining.

Normal "reasonable" builds work with less than 4GB of RAM, but very large
configurations like allyesconfig may need more memory. The actual memory
needed depends on the available memory (gcc sizes its garbage collector
pools based on that or on the ulimit -m limits) and the compiler version.

gcc 4.9+ has much better build performance and lower memory consumption.

- A few kernel features are currently incompatible with LTO, in particular
  function tracing, because they require special compiler flags for
  specific files, which is not supported in LTO right now.
- Jobserver control for -j does not work correctly for the final LTO phase
  due to some problems with the kernel's pipe code.
  The makefiles hard-code the -j value for the final LTO phase to work
  around this.

Configuration:
- Enable CONFIG_LTO_MENU and then disable CONFIG_LTO_DISABLE.
  This is mainly to not have allyesconfig default to LTO.
- FUNCTION_TRACER, STACK_TRACER, FUNCTION_GRAPH_TRACER, KALLSYMS_ALL and
  GCOV have to be disabled because they are currently incompatible with LTO.
- MODVERSIONS has to be disabled (may work with 4.9+).

Requirements:
- Enough memory: 4GB for a standard build, more for allyesconfig.
  The peak memory usage happens single threaded (when lto-wpa merges
  types), so dialing back -j options will not help much.

A 32bit compiler is unlikely to work due to the memory requirements.
You can however build a kernel targeted at 32bit on a 64bit host.

Example build procedure:

Simplified procedure for distributions that have gcc 4.8, but not
the Linux binutils (for example openSUSE 13.1 or FC20):

The LTO build requires gcc-nm/gcc-ar. Some distributions ship those in
separate packages, which may need to be explicitly installed.

- Get the latest Linux binutils from
  http://www.kernel.org/pub/linux/devel/binutils/
  and unpack it. We install it in a separate directory to not overwrite
  the system binutils.

        # replace VERSION with respective version numbers
        cd binutils*
        # don't forget the --enable-plugins!
        ./configure --prefix=/opt/binutils-VERSION --enable-plugins
        make -j $(getconf _NPROCESSORS_ONLN) && sudo make install

Fix up the kernel configuration to allow LTO:

        ./source/scripts/config --disable function_tracer \
                        --disable function_graph_tracer \
                        --disable stack_tracer --enable lto_menu \
                        --disable lto_disable \
                        --disable gcov \
                        --disable kallsyms_all \
                        --disable modversions
        make oldconfig

Then you can build with

        # The COMPILER_PATH is needed to let gcc use the new binutils
        # as the LTO plugin linker.
        # If you installed gcc in a separate directory like below, also
        # add it to the PATH line below before the regular $PATH.
        # The COMPILER_PATH setting is only needed if the gcc was not built
        # with --with-plugin-ld pointing to the Linux binutils ld.
        # The AR/NM setting works around a Makefile bug.
        COMPILER_PATH=/opt/binutils-VERSION/bin PATH=$COMPILER_PATH:$PATH \
        make -j$(getconf _NPROCESSORS_ONLN) AR=gcc-ar NM=gcc-nm

If you don't have gcc 4.8+ as system compiler you would also need to
install that compiler. In this case I recommend getting a gcc 4.9+
snapshot from http://gcc.gnu.org (or release when available), as it
builds much faster for LTO than 4.8.

Here's an example build procedure:

Assuming gcc is unpacked in gcc-VERSION

        cd gcc-VERSION
        ./contrib/download_prerequisites
        cd ..
        mkdir obj-gcc
        # Please don't skip this cd. The build will not work correctly in
        # the source dir, you have to use the separate object dir.
        cd obj-gcc
        ../gcc-VERSION/configure --prefix=/opt/gcc-VERSION --enable-lto \
                --with-plugin-ld=/opt/binutils-VERSION/bin/ld --disable-nls \
                --enable-languages=c,c++ --disable-libstdcxx-pch
        make -j$(getconf _NPROCESSORS_ONLN)
        sudo make install-no-fixedincludes

FAQs:

Q: I get a section type attribute conflict
A: Usually because of someone doing
   const __initdata (should be const __initconst) or const __read_mostly
   (should be just const). Check both symbols reported by gcc.

Q: I see lots of undefined symbols for memcmp etc.
A: Usually because NM=gcc-nm AR=gcc-ar are missing.
   The Makefile tries to set those automatically, but it doesn't always
   work. Better to set them manually on the make command line.

Q: It's quite slow / uses too much memory.
A: Consider a gcc 4.9 snapshot/release (not released yet).
   The main problem in 4.8 is the type merging in the single threaded WPA
   pass, which has been improved considerably in 4.9 by running it
   distributed.

Q: It's still slow
A: It'll always be somewhat slower than non-LTO, sorry.

Q: What's up with .XXXXX numeric postfixes
A: This is due to LTO turning (nearly) all symbols static.
   Use gcc 4.9, it avoids them in most cases. They are also filtered out
   in kallsyms.

References:

Presentation on Kernel LTO
(note, performance numbers/details outdated. In particular gcc 4.9 fixed
most of the build time problems):
http://halobates.de/kernel-lto.pdf

Generic gcc LTO:
http://www.ucw.cz/~hubicka/slides/labs2013.pdf
http://www.hipeac.net/system/files/barcelona.pdf

Somewhat outdated too:
http://gcc.gnu.org/projects/lto/lto.pdf
http://gcc.gnu.org/projects/lto/whopr.pdf

Happy Link-Time-Optimizing!

Andi Kleen
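
Appendix: checking the toolchain before a build

The tool requirements above (gcc 4.8 or newer, gcc-ar and gcc-nm
available on the PATH) can be checked before starting a long build.
The following is only a sketch and not part of the kernel tree. The
function names gcc_at_least and check_lto_prereqs are made up for this
example. It assumes a POSIX shell and that "gcc -dumpversion" prints a
dotted version number. It does not check whether the linker was built
with plugin support.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the kernel build system.

# gcc_at_least MAJOR MINOR VERSION
# Succeeds if the dotted VERSION string is at least MAJOR.MINOR.
gcc_at_least() {
    want_major=$1
    want_minor=$2
    have_major=${3%%.*}
    case $3 in
        *.*) rest=${3#*.}; have_minor=${rest%%.*} ;;
        *)   have_minor=0 ;;    # version string without a minor part
    esac
    [ "$have_major" -gt "$want_major" ] ||
        { [ "$have_major" -eq "$want_major" ] &&
          [ "$have_minor" -ge "$want_minor" ]; }
}

# check_lto_prereqs
# Reports missing LTO build prerequisites; returns non-zero if any are
# missing.
check_lto_prereqs() {
    status=0
    ver=$(gcc -dumpversion 2>/dev/null) || ver=
    if [ -z "$ver" ] || ! gcc_at_least 4 8 "$ver"; then
        echo "gcc 4.8 or newer is required (found: ${ver:-none})"
        status=1
    fi
    for tool in gcc-ar gcc-nm; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool not found, it may be in a separate package"
            status=1
        fi
    done
    return $status
}
```

To use it, source the file and run check_lto_prereqs with the same PATH
that will be used for the kernel build, so that a gcc installed under
/opt/gcc-VERSION is the one being tested.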