Proposal: Using LLInt Asm in major architectures even if JIT is disabled

# Yusuke Suzuki (18 hours ago)

Recently, node-jsc was announced[1]. When I read that project's documentation, I found that it uses the LLInt ASM interpreter instead of CLoop in non-JIT environments. That raised a question: how fast is the LLInt ASM interpreter compared to CLoop?

I set up two builds. One is a CLoop build (-DENABLE_JIT=OFF) and the other is a JIT build of JSC run with JSC_useJIT=false. I ran the Kraken benchmarks with these two builds on an x64 Linux machine. The results are as follows.

Benchmark report for Kraken on sakura-trick.

VMs tested:

"baseline" at /home/yusukesuzuki/dev/WebKit/WebKitBuild/nojit/Release/bin/jsc

"patched" at /home/yusukesuzuki/dev/WebKit/WebKitBuild/nojit-llint/Release/bin/jsc

Collected 10 samples per benchmark/VM, with 10 VM invocations per benchmark. Emitted a call to gc() between sample measurements. Used 1 benchmark iteration per VM invocation for warm-up. Used the jsc-specific preciseTime() function to get microsecond-level timing. Reporting benchmark execution times with 95% confidence intervals in milliseconds.

| Benchmark | baseline | patched | Result |
|---|---|---|---|
| ai-astar | 3619.974+-57.095 ^ | 2014.835+-59.016 ^ | definitely 1.7967x faster |
| audio-beat-detection | 1762.085+-24.853 ^ | 1030.902+-19.743 ^ | definitely 1.7093x faster |
| audio-dft | 1822.426+-28.704 ^ | 909.262+-16.640 ^ | definitely 2.0043x faster |
| audio-fft | 1651.070+-9.994 ^ | 865.203+-7.912 ^ | definitely 1.9083x faster |
| audio-oscillator | 1853.697+-26.539 ^ | 992.406+-12.811 ^ | definitely 1.8679x faster |
| imaging-darkroom | 2118.737+-23.219 ^ | 1303.729+-8.071 ^ | definitely 1.6251x faster |
| imaging-desaturate | 3133.654+-28.545 ^ | 1759.738+-18.182 ^ | definitely 1.7808x faster |
| imaging-gaussian-blur | 16321.090+-154.893 ^ | 7228.017+-58.508 ^ | definitely 2.2580x faster |
| json-parse-financial | 57.256+-2.876 | 56.101+-4.265 | might be 1.0206x faster |
| json-stringify-tinderbox | 38.470+-2.788 ? | 38.771+-0.935 ? | |
| stanford-crypto-aes | 851.341+-7.738 ^ | 485.438+-13.904 ^ | definitely 1.7538x faster |
| stanford-crypto-ccm | 556.133+-6.606 ^ | 264.161+-3.970 ^ | definitely 2.1053x faster |
| stanford-crypto-pbkdf2 | 1945.718+-15.968 ^ | 1075.013+-13.337 ^ | definitely 1.8099x faster |
| stanford-crypto-sha256-iterative | 623.203+-7.604 ^ | 349.782+-12.810 ^ | definitely 1.7817x faster |
| &lt;arithmetic&gt; | 2596.775+-14.857 ^ | 1312.383+-8.840 ^ | definitely 1.9787x faster |
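(Editor's aside, not part of the original message.) The ratios in the report can be reproduced from the mean times alone. The following is a minimal Python sketch, assuming each "Nx faster" figure is baseline time divided by patched time and that the `<arithmetic>` row is the plain mean of the per-benchmark times. The helper names `speedup` and `intervals_overlap` are hypothetical. Treating "definitely" as "the two 95% confidence intervals do not overlap" is an assumption about how the benchmark script labels results, not something stated in the thread.

```python
from statistics import mean

# (baseline_ms, patched_ms) mean times copied from the Kraken report above.
RESULTS = {
    "ai-astar": (3619.974, 2014.835),
    "audio-beat-detection": (1762.085, 1030.902),
    "audio-dft": (1822.426, 909.262),
    "audio-fft": (1651.070, 865.203),
    "audio-oscillator": (1853.697, 992.406),
    "imaging-darkroom": (2118.737, 1303.729),
    "imaging-desaturate": (3133.654, 1759.738),
    "imaging-gaussian-blur": (16321.090, 7228.017),
    "json-parse-financial": (57.256, 56.101),
    "json-stringify-tinderbox": (38.470, 38.771),
    "stanford-crypto-aes": (851.341, 485.438),
    "stanford-crypto-ccm": (556.133, 264.161),
    "stanford-crypto-pbkdf2": (1945.718, 1075.013),
    "stanford-crypto-sha256-iterative": (623.203, 349.782),
}


def speedup(baseline_ms, patched_ms):
    """Hypothetical helper: a ratio above 1 means the patched VM is faster."""
    return baseline_ms / patched_ms


def intervals_overlap(mean_a, half_a, mean_b, half_b):
    """Hypothetical helper: True when the intervals mean +- half share a point.

    Assumption: non-overlapping intervals correspond to "definitely".
    """
    return abs(mean_a - mean_b) <= half_a + half_b


baseline_mean = mean(b for b, _ in RESULTS.values())
patched_mean = mean(p for _, p in RESULTS.values())

for name, (b, p) in RESULTS.items():
    print(f"{name:34s} {speedup(b, p):.4f}x")
print(f"{'<arithmetic>':34s} {speedup(baseline_mean, patched_mean):.4f}x")
```

Under these assumptions the arithmetic means and the overall ratio match the `<arithmetic>` row, and the json-parse-financial intervals overlap while the ai-astar intervals do not, which is consistent with the "might be" and "definitely" labels.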

Surprisingly, the LLInt ASM interpreter is significantly faster than CLoop. I expected it to be faster, but only by around 10%. In reality it is about 2x faster. That difference is large enough that I think we should consider enabling the LLInt ASM interpreter for non-JIT build configurations. As a bonus, the LLInt ASM interpreter offers sampling profiler support even in a non-JIT environment.

So my proposal is: how about enabling the LLInt ASM interpreter in non-JIT configurations on major architectures (x64 and ARM64)?

Best regards, Yusuke Suzuki

[1]: lists.webkit.org/pipermail/webkit-dev/2018-September/030140.html

# Guillaume Emont (14 hours ago)

I have not run benchmarks with CLoop recently, but that matches what I observed on MIPS in the past, and I would expect similar results on Armv7. So I think it would make sense to do this on those platforms too. Obviously, in all cases we would want a CMake option to compile with CLoop, as that can be useful for testing and diagnosing issues.

Best regards,

Guillaume

Quoting Yusuke Suzuki (2018-09-19 08:23:43)

# Saam Barati (9 hours ago)

Did you turn off the RegExp JIT?
