

Summary

When Amazon released their custom Graviton processor, we knew that ARM needed to be on our radar. Although clearly a first-generation product, the investment required to build such a chip and Amazon’s track record were clear signs that better chips were coming. As Amazon puts better and better chips up for rent, these instances are likely to pick up significant usage. Not wanting to wait for a customer to ask, we explored what we needed to run Sqreen on an ARM platform. While looking into this for our Python agent, we realized that the main roadblock was going to be PyMiniRacer, a V8 wrapper for Python.
V8 is the JavaScript engine used in Chrome, and is available as a standalone library for all your JavaScript needs. Because Chrome is available on 64-bit Android and thus aarch64 (sometimes referred to as ARM64), we thought it would be easy to compile it ourselves. But when we went to repurpose our automated build scripts for the x86 version of PyMiniRacer to run on an ARM-based AWS VM, it turned out not to be.
In the rest of this article, as we talk about building V8 on ARM, please note that we were working with V8 version 6.7.288, and things may have changed in later versions.
As a quick recap, aarch64 and x86 are two Instruction Set Architectures (ISAs): standards that define the instructions a family of chips can execute. In short, an ISA is the way a program encodes a sequence of instructions for the CPU.
In this case, x86 is an ISA created by Intel in 1978 which went on to be extremely popular on PCs and then servers. aarch64 is an ISA developed by ARM and used in the vast majority of phones and tablets with 64-bit chips.
These two ISAs are not compatible, meaning aarch64 code can’t run natively on an x86 chip and vice versa, irrespective of which OS is run. In effect, that means there is no easy way to take x86-compiled Linux software and run it on an aarch64 computer, even if it’s also running Linux.
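You can see the mismatch first-hand from a shell: check the machine architecture, then try to run a foreign binary. The transcript below is illustrative (some_x86_program is a made-up name):
uname -m               # prints "aarch64" on an ARM server, "x86_64" on a classic one
file some_x86_program  # reports "ELF 64-bit LSB executable, x86-64, ..."
./some_x86_program     # fails with "cannot execute binary file: Exec format error"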
x86 binaries used in the build
Without further ado, what happened when we tried to build V8 on an ARM server, considering V8 is already supposed to run on aarch64?

In short, a bunch of weird compilation errors.

Those issues were mostly caused by a peculiarity of the V8 build system. Specifically, V8 uses a couple of x86 binaries embedded in the source tree to orchestrate the compilation process:
  • gn: A high-performance tool that generates the configuration for ninja, which then does the actual building. Because it’s not widely used outside Google, it’s rarely packaged in Linux distributions’ repositories, and so it was included in the source tree so it would always work
  • clang: V8 relies on the latest version of the clang compiler, clang 7. Because it’s so new, it’s not widely available in repositories either, and to keep things simple it is also included in V8’s repo
  • A clang plugin: V8 extends the features of the base clang with an extension, which is essentially a compiled dynamic library that clang loads at startup.
Despite V8 running on aarch64, it was apparently only ever compiled on x86 machines and targeted aarch64 through cross-compilation, as only x86 versions of those files were available in the source tree. This suspicion would later be confirmed as we dove into the build system config, but let’s first focus on those binaries.
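A quick way to confirm the suspicion is to ask file what the bundled tools actually are; they should report as x86-64, not aarch64. An illustrative check (path as used later in this post; output abbreviated):
file $V8_PATH/buildtools/linux64/gn
# gn: ELF 64-bit LSB executable, x86-64, ...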

First binary: gn

On our first try, we ran into a small wall of logs centered on OSError: [Errno 8] Exec format error when running the gn Python script, basically saying that the gn binary it was trying to use was not a valid ARM binary. Because we couldn’t easily find a pre-built binary for ARM, we built it ourselves. However, we then ran into an issue: clang errored out because it didn’t recognize the --icf=all option.
ICF stands for Identical Code Folding, an optimization used to make binaries smaller. As we didn’t care about the size of gn, we could safely remove this flag. We managed to build gn using the following script, and then copied it to $V8_PATH/buildtools/linux64/gn.
# fetch gn's source and strip the unsupported linker flag before building
git clone https://gn.googlesource.com/gn
cd gn
sed -i -e "s/-Wl,--icf=all//" build/gen.py  # drop --icf=all, which our clang rejected
python build/gen.py                         # generate the ninja files
ninja -C out                                # build gn itself
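The copy into the tree is then a one-liner (assuming $V8_PATH points at your V8 checkout; gn’s build drops the binary in out/):
cp out/gn $V8_PATH/buildtools/linux64/gn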

Second binary: clang

We were working on an Ubuntu distribution, which didn’t have clang 7 in its repositories. Thankfully, an aarch64 build was available on LLVM’s website. Once we realized that, the problem was solved: we simply replaced the bundled toolchain with the downloaded binaries in $V8_PATH/third_party/llvm-build/Release+Asserts.
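For reference, the download-and-replace looks something like this (the archive name matches the include path used later in this post; adjust the version to whatever LLVM currently ships):
wget http://releases.llvm.org/7.0.1/clang+llvm-7.0.1-aarch64-linux-gnu.tar.xz
tar xf clang+llvm-7.0.1-aarch64-linux-gnu.tar.xz
cp -r clang+llvm-7.0.1-aarch64-linux-gnu/* $V8_PATH/third_party/llvm-build/Release+Asserts/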

Third binary: a clang plugin

Trying to run the build at this point would appear to start properly, but it then failed when trying to actually compile. This is because V8 calls clang with a plugin that was built for x86 and thus won’t work with the ARM clang. In this case, we found the original repo and, after some stumbling, built the library with the commands that follow.
# grab the plugin sources from the Chromium tree
wget https://chromium.googlesource.com/chromium/src/+archive/lkgr/tools/clang/plugins.tar.gz
mkdir plugin
cd plugin
tar xf ../plugins.tar.gz

# compile against the headers of the aarch64 clang downloaded earlier, then link a shared library
clang++ *.cpp -c -I ../clang+llvm-7.0.1-aarch64-linux-gnu/include/ -fPIC -Wall -std=c++14 -fno-rtti -fno-omit-frame-pointer
clang -shared *.o -o libFindBadConstructs.so
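Then, we simply had to copy it to LLVM’s lib directory. With our layout, that copy was along these lines (the lib/ path is an assumption based on the toolchain tree above):
cp libFindBadConstructs.so $V8_PATH/third_party/llvm-build/Release+Asserts/lib/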
After doing that, every binary summoned during the build process was a native aarch64 one, and we were mostly done.
Build system misconfiguration
If you simply replaced the binaries as I just described, you’d still see your build fail, but with the following log:
Writing """\
is_debug = false
target_cpu = "x64"
v8_target_cpu = "arm64"
""" to /home/ubuntu/PyMiniRacer/py_mini_racer/extension/v8/v8/out.gn/arm64.release/args.gn.
/home/ubuntu/PyMiniRacer/py_mini_racer/extension/v8/v8/buildtools/linux64/gn gen out.gn/arm64.release --check
-> returned 1
ERROR at //snapshot_toolchain.gni:101:1: Assertion failed.
assert(v8_snapshot_toolchain != "",
One thing you’ll notice is target_cpu = "x64", which is… uh… incorrect: we’re building on an aarch64 CPU (which V8 calls arm64).
This is an issue within V8’s build system, and infra/mb/mb_config.pyl specifically. This file configures what the build system is supposed to do for various targets (for instance, the arm64.release we’re trying to build).
In our case, it’s interpreting arm64.release as default_release_arm64, which is then turned into… ['release', 'simulate_arm64']. Finally, simulate_arm64 is turned, as one would expect, into 'gn_args': 'target_cpu="x64" v8_target_cpu="arm64"'.
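Paraphrasing the chain as it appears in infra/mb/mb_config.pyl (a sketch of the relevant entries, not the verbatim file):
'configs': {
  'default_release_arm64': ['release', 'simulate_arm64'],
},
'mixins': {
  'simulate_arm64': {
    'gn_args': 'target_cpu="x64" v8_target_cpu="arm64"',
  },
},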
Now, why is arm64.release turned into simulate_arm64, you may ask? Probably for simplicity’s sake: aarch64 was until very recently an exotic build environment, and you wouldn’t want Android’s CI to run on Raspberry Pis. That being said, we needed to patch this file.
The cleanest way would be to create a new target instead of repurposing simulate_arm64, but should it be pushed upstream, it would likely break Google’s CI. As we were only trying to fix our local build, a simple fix fit in a small sed command:
sed -i -e "s/target_cpu=\"x64\" v8_target_cpu=\"arm64/target_cpu=\"arm64\" v8_target_cpu=\"arm64/" infra/mb/mb_config.pyl
With that, our build could proceed and we were able to get the artefacts we were looking for!
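For completeness, regenerating and building at that point followed the standard V8 workflow, along these lines (a sketch; PyMiniRacer’s build scripts normally drive these steps for you):
python tools/dev/v8gen.py arm64.release  # re-runs mb and rewrites out.gn/arm64.release/args.gn
ninja -C out.gn/arm64.release            # the actual compilation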
Conclusion
We went into this endeavor wanting to evaluate the maturity of the ARM ecosystem on servers. Having been interested in the topic for a while, I had heard that things had gotten much better and everything should work out of the box. Although most issues were fairly easy to fix, the problem is that ARM is still nowhere near mainstream on servers. Because of that, a lot of large software that works on ARM smartphones breaks on ARM servers, due to assumptions made left and right, especially in the build system.
Don’t read too much into our experience building V8 on ARM as far as predicting the future of the ARM platform; these are still early days. However, I think it can be an interesting anecdote on what to expect if you go in today: most things will work, but don’t expect a perfectly smooth ride.

Building V8 on ARM: Insight 10/10

After the fact, we received a response to one of our inquiries from an ARM engineer. In this message, he told us that the V8 build config would actually have enabled us to use GCC instead of clang, and even to simply disable the plugin. This would likely have saved us a couple of hours, and we hope that the post we linked to will save time for future users wanting to build V8 on ARM.
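For the curious, the knobs he was referring to are ordinary gn args. The names below come from Chromium’s build configuration and should be treated as assumptions for this specific V8 version:
is_clang = false                  # build with the system GCC instead of the bundled clang
clang_use_chrome_plugins = false  # don't load the FindBadConstructs plugin at all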
However, this doesn’t change my conclusion that V8’s build system was not ready to build on ARM out of the box. Moreover, I still expect this experience to be fairly common until those projects are fixed, one by one.
This, however, is a good problem to have. It means that everything else is working properly.