Exploring V8 Engine - II (Control Flow & Memory Structures)

In this post, we start exploring the V8 engine and look under the hood at the call sequences made to execute a simple 'Hello' + ', World!' expression. This post is a follow-up to our Exploring V8 Engine - I post.

Setting up release.sample build (monolithic with debug)

To test out the V8 embedding, you need to compile your sample hello-world.cc file against the V8 source code. Doing this from scratch for each sample would be tedious. V8 lets you build its source as a standalone monolithic library, which can be linked directly when building any sample that embeds V8. This way you don't have to recompile the V8 source each time you make a small change to your sample.

The Embedding V8 guide takes you through the process effortlessly. However, the x64.release.sample configuration it builds does not have debug symbols enabled (which we want). So, below is a slightly tweaked version of the gn args for out.gn/x64.release.sample, which includes all the extra debugging symbols we will require.

is_component_build = false
is_debug = true
target_cpu = "x64"
use_custom_libcxx = false
v8_monolithic = true
v8_use_external_startup_data = false
v8_enable_backtrace = true
v8_optimized_debug = false

Use the above config with gn args out.gn/x64.release.sample to enable all the symbols (similar to a debug build). Finally, while building your hello-world.cc example, add the -g flag so that debug information for your sample's own source is included.

g++ -g -I. -Iinclude samples/hello-world.cc -o hello_world -lv8_monolith -Lout.gn/x64.release.sample/obj/ -pthread -std=c++0x -DV8_COMPRESS_POINTERS

This should generate the hello_world executable, which prints the following when run:

Hello, World!
3 + 4 = 7

Tips on running GDB better

  • When building with the is_debug flag and a monolithic binary, the build process populates v8/out.gn/x64.release.sample/obj with the debugging symbols. Thus, run GDB from the v8/out.gn/x64.release.sample directory (or set your paths accordingly) so that it picks up the debugging symbols
    • Good:
    • Bad:
  • Configure your GDB to pick up the tools/gdbinit file

Now let's move on and hook up our good old friend GDB and see what's what.

Setting up GDB with GEF

I tried a couple of different debuggers:

  • plain GDB
  • plain LLDB
  • llnode (a LLDB extension for node analysis)

While llnode may be regarded as the better tool for V8 analysis, it is maintained to work with the latest LTS release of Node.js, which may not be up to date with V8 master. This can lead to the extension giving you faulty data.

IMO, GDB with GEF, combined with the tools/gdbinit file, is the best way to go.

Now that we are all set up, let's start taking a look at the source code and the control flow.

hello-world.cc under a debugger

From our previous post, we understand the major ideas behind Isolates, isolate_scope(), handle_scope and Contexts. To summarize, this is what the object structures would look like:

With that picture in our head, let's keep going.
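One way to keep that picture in mind is as a nesting of lifetimes: the isolate owns the storage, and a handle scope releases whatever was created inside it when it goes out of scope. The sketch below is a toy model of that idea only, not V8's actual data layout. ToyIsolate, ToyHandleScope, NewLocal and LiveSlotsInsideNestedScope are hypothetical names made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for an isolate: it owns every handle slot created inside it.
// All names here are hypothetical; this is not V8's real data layout.
struct ToyIsolate {
  std::vector<std::string> handle_slots;
};

// Toy stand-in for a handle scope: remembers how many slots existed when it
// was opened and drops everything created after that when it is destroyed.
class ToyHandleScope {
 public:
  explicit ToyHandleScope(ToyIsolate& isolate)
      : isolate_(isolate), slots_at_entry_(isolate.handle_slots.size()) {}

  ~ToyHandleScope() {
    assert(isolate_.handle_slots.size() >= slots_at_entry_);
    isolate_.handle_slots.resize(slots_at_entry_);
  }

  ToyHandleScope(const ToyHandleScope&) = delete;
  ToyHandleScope& operator=(const ToyHandleScope&) = delete;

  // Creates a "local handle": here simply an index into the isolate's slots.
  std::size_t NewLocal(const std::string& value) {
    isolate_.handle_slots.push_back(value);
    return isolate_.handle_slots.size() - 1;
  }

 private:
  ToyIsolate& isolate_;
  std::size_t slots_at_entry_;
};

// Opens a nested scope, creates two locals, and reports how many slots were
// live while the nested scope was still open.
std::size_t LiveSlotsInsideNestedScope(ToyIsolate& isolate) {
  ToyHandleScope scope(isolate);
  scope.NewLocal("'Hello' + ', World!'");
  scope.NewLocal("result");
  return isolate.handle_slots.size();
}
```

If an outer scope already holds one local, calling LiveSlotsInsideNestedScope reports three live slots, and once it returns only the outer local remains. This mirrors why the sample wraps its work in a block scope.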

Tracing under a debugger

Our trace starts at the block scope where we execute a one-line JavaScript snippet:

...
    { 
      // Create a string containing the JavaScript source code.
      v8::Local<v8::String> source =
          v8::String::NewFromUtf8Literal(isolate, "'Hello' + ', World!'");

      // Compile the source code.
      v8::Local<v8::Script> script =
          v8::Script::Compile(context, source).ToLocalChecked();

      // Run the script to get the result.
      v8::Local<v8::Value> result = script->Run(context).ToLocalChecked();

      // Convert the result to an UTF8 string and print it.
      v8::String::Utf8Value utf8(isolate, result);
      printf("%s\n", *utf8);
    }
...

The first statement just turns the provided command into a JavaScript string object, created from a UTF-8 literal and placed inside the target V8 isolate. This will later be consumed by our compiler to generate the object code.

Compiling

v8::Script::Compile(context, source) is where the magic begins. If you trace out the call structure for this function call, the first stop is a constructor in api.cc. Next, compiler.cc is hit, where the function CompileScriptOnMainThread does all the work required to get the script compiled and ready to run.
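To make "compiled and ready to run" a little more concrete, here is a deliberately tiny sketch of what a compile step does conceptually: it turns source text into a list of instructions. This is a toy model and not V8's compiler or bytecode format. ToyOp, ToyInstruction and ToyCompile are hypothetical names, and the "language" only understands single-quoted strings joined by +.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical two-instruction bytecode, for illustration only.
enum class ToyOp { kLoadString, kAdd };

struct ToyInstruction {
  ToyOp op;
  std::string operand;  // Only meaningful for kLoadString.
};

// "Compiles" source of the form 'a' + 'b' + ... into toy bytecode.
// Throws std::invalid_argument for anything else.
std::vector<ToyInstruction> ToyCompile(const std::string& source) {
  std::vector<ToyInstruction> bytecode;
  std::size_t pos = 0;

  auto skip_spaces = [&] {
    while (pos < source.size() && source[pos] == ' ') ++pos;
  };

  // Parses one single-quoted literal and emits a load for it.
  auto parse_literal = [&] {
    skip_spaces();
    if (pos >= source.size() || source[pos] != '\'') {
      throw std::invalid_argument("expected string literal");
    }
    const std::size_t close = source.find('\'', pos + 1);
    if (close == std::string::npos) {
      throw std::invalid_argument("unterminated string literal");
    }
    bytecode.push_back(
        {ToyOp::kLoadString, source.substr(pos + 1, close - pos - 1)});
    pos = close + 1;
    skip_spaces();
  };

  parse_literal();
  while (pos < source.size()) {
    if (source[pos] != '+') throw std::invalid_argument("expected '+'");
    ++pos;
    parse_literal();
    // Emit the add after both of its operands have been loaded.
    bytecode.push_back({ToyOp::kAdd, ""});
  }
  assert(!bytecode.empty());
  return bytecode;
}
```

For the sample's source, 'Hello' + ', World!', this yields three instructions: two loads followed by one add. The real pipeline does far more (parsing, scope analysis, caching), but the input and output have the same shape: text in, executable representation out.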

This includes the flow hitting interpreter.cc, where the script gets turned into interpreter bytecode. One good way to follow this is to use (gdb) rbreak filename.xx:. to break on all the functions of a file. This is what the backtrace looks like at that point in time:

Running

Now that we have the code compiled and the environment mostly set up, the only missing piece of the puzzle is: how does the code execute?

We see the run invocation on line 47. This line invokes v8::Script::Run which is responsible for handling the execution of the code in the provided context.

From v8::Script::Run, Execution::Call is invoked, which is responsible for setting up the remaining structures required to execute the interpreted code and get back the result. Execution::Invoke is eventually invoked, which converts the interpreted intermediate bytecode into platform-specific object code using GenerateCode and triggers the invocation of that code, providing us with the data in the structures we require.
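As a rough mental model of the "run" half, the sketch below executes a small instruction list and returns the result. It is a toy stack machine written for illustration and makes no claim about how V8 executes code internally (V8's interpreter is register-based, for one). ToyOp, ToyInstruction and ToyRun are hypothetical names, and the instruction type is defined inline so the snippet compiles by itself.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical two-instruction bytecode, for illustration only.
enum class ToyOp { kLoadString, kAdd };

struct ToyInstruction {
  ToyOp op;
  std::string operand;  // Only meaningful for kLoadString.
};

// Executes toy bytecode on an operand stack and returns the final value.
std::string ToyRun(const std::vector<ToyInstruction>& bytecode) {
  std::vector<std::string> stack;
  for (const ToyInstruction& instruction : bytecode) {
    switch (instruction.op) {
      case ToyOp::kLoadString:
        stack.push_back(instruction.operand);
        break;
      case ToyOp::kAdd: {
        assert(stack.size() >= 2 && "kAdd needs two operands");
        // Pop the right operand and append it to the left one in place.
        std::string right = std::move(stack.back());
        stack.pop_back();
        stack.back() += right;
        break;
      }
    }
  }
  assert(stack.size() == 1 && "well-formed bytecode leaves one result");
  return stack.back();
}
```

Feeding it a load of "Hello", a load of ", World!" and an add produces the string "Hello, World!", which is the value the sample later converts with Utf8Value and prints.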

Reflecting on this article

The main goal of this post was to explore which main files and functions are touched during the compilation and execution of a simple example. More discussion of which code sections do what is available in this post.

The main idea to take away is that, while tracing, you should have an idea of which file the control flow might hit, and then use (gdb) rbreak filename:. or a similar command to trace at a higher level rather than following the control flow line by line.