Browser Guide

Usage is documented in the Browser's README.md file.

Architecture

The Browser inside /browser/source is strictly DOM-free: its implementation does not contain any document- or window-related code.

The only Web API it uses is the WebSocket API inside the Browser Client, which can be interchanged with the Stealth Client.

The Browser UI inside /browser/design receives all data via events and performs all interaction through the Browser's public methods.

The Browser UI relies on the Web Browser Engine, so it uses HTML5, CSS3 and ES2018 (including DOM APIs and Web APIs) to render itself and to make itself interactive.

This separation makes it possible to use the Browser in a node.js-only context, for example to build a custom Web Scraper that runs only on servers or in terminal environments.
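The split between a DOM-free core and an event-driven UI can be illustrated with a minimal sketch. The MiniBrowser class, its on(), emit() and open() methods and the 'open' event name are hypothetical stand-ins chosen for illustration; they are not the actual API of Browser.mjs.

```javascript
// Hypothetical stand-in for a DOM-free core; NOT the real Browser.mjs API.
class MiniBrowser {

	constructor() {
		this.listeners = {};   // event name -> array of callbacks
		this.tab       = null; // most recently opened tab, if any
	}

	// Subscribe to an event emitted by the core.
	on(event, callback) {
		(this.listeners[event] = this.listeners[event] || []).push(callback);
		return this;
	}

	// Notify all subscribers; nothing here touches document or window.
	emit(event, data) {
		(this.listeners[event] || []).forEach((callback) => callback(data));
	}

	// Public method that a UI (or a scraper script) calls to trigger work.
	open(url) {
		this.tab = { url: url };
		this.emit('open', this.tab);
		return this.tab;
	}

}

// The "UI" is only an event subscriber. In a terminal environment it could
// print or store results instead of rendering Web Components.
const browser = new MiniBrowser();
const seen    = [];

browser.on('open', (tab) => seen.push(tab.url));
browser.open('https://example.com/');
```

Because the core only emits events and exposes methods, the same object can be driven by a Web Component in a Webview or by a plain script in node.js.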

Codebase

The Stealth Service contains a Browser, so it might be confusing which implementations can run in which environment.

Because iframes are separate ECMAScript sandboxes, you'll see custom data types and validation mechanisms across the codebase: checks built on the instanceof and typeof operators cannot be relied on for values that cross a sandbox boundary.
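One common way to validate values independently of the sandbox they were created in is to compare the internal type tag instead of the constructor. The sketch below shows that technique; the helper names isArray, isObject and isString are illustrative assumptions and are not taken from the Base Library.

```javascript
// Hypothetical helpers; the names are illustrative, not the Base Library API.
// Object.prototype.toString reports a type tag that does not depend on which
// sandbox's constructor created the value, unlike instanceof.
const toTag = (value) => Object.prototype.toString.call(value);

const isArray  = (value) => toTag(value) === '[object Array]';
const isObject = (value) => toTag(value) === '[object Object]';
const isString = (value) => toTag(value) === '[object String]';

const results = [
	isArray([ 1, 2, 3 ]),            // array literal
	isObject({ foo: 'bar' }),        // plain object
	isObject(null),                  // null is tagged "[object Null]"
	isString('stealth'),             // primitive string
	isString(new String('stealth'))  // boxed string
];
```

An instanceof check compares against the constructor of the current sandbox, so an Array created inside an iframe is not an instance of the parent window's Array, whereas its type tag stays the same.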

The table below shows which files of the Browser codebase are imported from where. The vision is that building a new (custom) Browser should require only writing a new Client.mjs that can reuse all Network Services in order to communicate with the Stealth Service.

Everything in the Browser UI is nothing more than a remote control for the Stealth Service, whereas the Browser.mjs file is isomorphic and can also be used to create node.js-based scrapers out of the box.

  • Browser, Covert and Stealth each have an extern/base.mjs file that is imported from the Base Library.
  • Browser has only its own Client.mjs and ENVIRONMENT.mjs; everything else in /browser/source is imported from /stealth/source.
  • /browser/extern/base.mjs is imported via browser/make.mjs from /base/build/browser.mjs.
  • /covert/extern/base.mjs is imported via covert/make.mjs from /base/build/node.mjs.
  • /stealth/extern/base.mjs is imported via stealth/make.mjs from /base/build/node.mjs.

| Path                     | Browser | Stealth | Notes                             |
| :----------------------- | :-----: | :-----: | :-------------------------------- |
| extern/base.mjs          |   ->    |   ->    | built via base                    |
| source/client/*.mjs      |   <-    |    x    |                                   |
| source/parser/*.mjs      |   <-    |    x    |                                   |
| source/Browser.mjs       |   <-    |    x    |                                   |
| source/Session.mjs       |   <-    |    x    |                                   |
| source/Tab.mjs           |   <-    |    x    |                                   |
| ------------------------ | ------- | ------- | --------------------------------- |
| design/*.mjs             |    x    |         | requires Browser or Webview       |
| internal/*.mjs           |    x    |         | requires Browser or Webview       |
| source/Client.mjs        |    x    |    x    | always Platform-specific          |
| source/ENVIRONMENT.mjs   |    x    |    x    | always Platform-specific          |

Execution Process

As explained above, the Browser Project reuses most of the implementations from the Stealth Service.

browser/make.mjs builds the Browser and imports all necessary files from the Base Library and the Stealth Service.

browser/browser.mjs starts a preinstalled Browser Engine to open the Browser UI as a Progressive Web App.

The only differences between the Browser codebase and the Stealth codebase are these files:

The Browser UI needs an HTML5/CSS3 environment, as it is implemented using Web Components and requires either a webview or an iframe element to be available.

Supported Webviews

For browser/browser.mjs to start the Webview of a preinstalled native Browser, these are the requirements:

On GNU/Linux one of these:

  • chromium (requires (Ungoogled) Chromium version 70+)
  • electron (requires Electron version 8+)
  • gjs (included with GNOME, requires WebKit2 GTK version 4+)

On macOS one of these:

  • Chromium.app (requires (Ungoogled) Chromium version 70+)
  • Safari.app (requires Safari version 12+)

On Windows:

  • chrome.exe (requires (Ungoogled) Chromium version 70+)
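The per-platform lists above can be expressed as a small lookup table. This is only a sketch: the WEBVIEWS constant and the getCandidates function are hypothetical names, and nothing here reflects how browser/browser.mjs actually detects or starts a Webview. The keys are the values that node.js reports in process.platform.

```javascript
// Hypothetical lookup table built from the lists above.
// Keys match the values of process.platform in node.js.
const WEBVIEWS = {
	linux:  [ 'chromium', 'electron', 'gjs' ],
	darwin: [ 'Chromium.app', 'Safari.app' ],
	win32:  [ 'chrome.exe' ]
};

// Returns the candidates for a platform, or an empty array if the
// platform is not listed in this guide.
const getCandidates = (platform) => {

	if (Object.prototype.hasOwnProperty.call(WEBVIEWS, platform)) {
		return WEBVIEWS[platform];
	}

	return [];

};

const candidates = getCandidates(process.platform);
```

A launcher could then try each candidate in order and use the first one that is installed; version checks (for example Chromium 70+) would still have to be done separately.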

Engine / Webview API Requirements

If you want to build a native Browser Engine, these are the current requirements for the Browser UI and Internal Pages:

  • Render HTML5, DOM Level 3 and CSS3 (to reuse /browser/design).
  • ECMAScript 2016 (with <script type="module"> support).
  • WebSocket API support.
  • iframe support.
  • HTMLIFrameElement support.
  • document.cookie support to transmit Cookie: session=<id>;path=/stealth headers for requests to /stealth/ URLs.
  • window.parent support to access the window.browser property outside the <iframe> element that renders the Internal Pages.
  • (Optionally) Transmission of the tab id as a URL prefix of the form /stealth/:<id>:/ or /stealth/:<id>,[flags]:/.
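The optional tab prefix from the last requirement could be parsed as sketched below. The parseTabPrefix function is hypothetical, and so are two assumptions it makes: that the id contains neither ":" nor ",", and that multiple flags are separated by commas. The "webview" flag in the example is also made up for illustration.

```javascript
// Hypothetical parser for the optional tab prefix described above.
// Assumptions (not confirmed by this guide): the id contains no ":" or ",",
// and multiple flags are separated by commas.
const PREFIX = /^\/stealth\/:([^:,]+)((?:,[^:,]+)*):\/(.*)$/;

const parseTabPrefix = (path) => {

	const match = PREFIX.exec(path);
	if (match === null) {
		return null; // not a prefixed /stealth/ URL
	}

	return {
		id:    match[1],
		flags: match[2].split(',').filter((flag) => flag !== ''),
		rest:  match[3] // everything after the closing ":/"
	};

};

// "webview" is an illustrative flag name, not a documented one.
const plain   = parseTabPrefix('/stealth/:3:/https://example.com/index.html');
const flagged = parseTabPrefix('/stealth/:3,webview:/https://example.com/');
const other   = parseTabPrefix('/browser/index.html');
```

Keeping the id as a string avoids guessing whether tab ids are numeric.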

Site Modes

Site Modes decide which types of content are loaded from a specific URL, and which parts of the loaded HTML content are optimized, i.e. what is displayed and what is not.

Media Types

Media Types and their representations in Stealth are compliant with IANA Assignments.

Media Types are represented by the MIME Object that is returned by the URL Parser's parse(url) method.

A typical MIME Object looks like this:

{
	"ext":    "abw",
	"type":   "other",
	"binary": true,
	"format": "application/x-abiword"
}

The value of a MIME Object's type property determines the loading behaviour (and corresponds to the Site Modes menu bar in the Browser UI).

  • text loads text files.
  • image loads image files.
  • audio loads audio files.
  • video loads video files.
  • other downloads files that cannot be rendered.
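The mapping from the type property to a loading behaviour can be sketched as a simple dispatch. The getLoadingBehaviour function and the strings it returns are hypothetical labels; treating any unlisted type as a download is an assumption of this sketch, not documented behaviour.

```javascript
// Hypothetical dispatcher; the returned strings are illustrative labels only.
const getLoadingBehaviour = (mime) => {

	switch (mime.type) {

		case 'text':
		case 'image':
		case 'audio':
		case 'video':
			return 'load as ' + mime.type;

		// "other" is downloaded; unknown types are treated the same way
		// here, which is an assumption of this sketch.
		default:
			return 'download';

	}

};

// The MIME Object shown above.
const mime = {
	ext:    'abw',
	type:   'other',
	binary: true,
	format: 'application/x-abiword'
};

const behaviour = getLoadingBehaviour(mime);
```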