A brief overview of Nuclei Engine architecture. This document will be kept updated as the engine progresses.
A template is the basic unit of input to the engine. It describes the requests to be made, the matching to be done, the data to extract, and so on.
The template structure is defined in pkg/templates. Template-level attributes are defined there, along with convenience methods to validate, parse and compile templates, creating executers.
Any attributes required for the template, engine or requests to function are also set there.
Workflows are compiled at this stage too, and the templates they reference are loaded and compiled. Any validation of the provided paths is also done there.
The Parse function is the main entry point. It returns a template for a given filePath and executorOptions. It compiles all the requests for the template, all the workflows, and any self-contained requests. It also caches the templates in an in-memory cache.
Preprocessors are also applied here and operate at the template level. They receive the raw template data, which they can alter at runtime. The engine uses this to do random string generation.
Custom preprocessors can be used if they satisfy the following interface.
type Preprocessor interface {
	Process(data []byte) []byte
}
The model package implements the Information structure for Nuclei templates. Info contains all major metadata information for the template. The Classification structure can also be used to provide additional context to vulnerability data.
It also specifies a WorkflowLoader interface that is used during workflow loading in the template compilation stage.
type WorkflowLoader interface {
	GetTemplatePathsByTags(tags []string) []string
	GetTemplatePaths(templatesList []string, noValidate bool) []string
}
The protocols package implements all the request protocols supported by Nuclei. This currently includes http, dns, network, headless and file requests.
It exposes a Request interface that is implemented by all the supported request protocols.
// Request is an interface implemented by any protocol based request generator.
type Request interface {
	Compile(options *ExecuterOptions) error
	Requests() int
	GetID() string
	Match(data map[string]interface{}, matcher *matchers.Matcher) (bool, []string)
	Extract(data map[string]interface{}, matcher *extractors.Extractor) map[string]struct{}
	ExecuteWithResults(input string, dynamicValues, previous output.InternalEvent, callback OutputEventCallback) error
	MakeResultEventItem(wrapped *output.InternalWrappedEvent) *output.ResultEvent
	MakeResultEvent(wrapped *output.InternalWrappedEvent) []*output.ResultEvent
	GetCompiledOperators() []*operators.Operators
}
Many of these methods are similar across protocols, while some are very protocol-specific.
A brief overview of the methods is provided below -
- Compile - Compiles the request with provided options.
- Requests - Returns total requests made.
- GetID - Returns any ID for the request.
- Match - Used to perform matching for patterns using matchers.
- Extract - Used to perform extraction for patterns using extractors.
- ExecuteWithResults - Request execution function for input.
- MakeResultEventItem - Creates a single result event for the intermediate InternalWrappedEvent output structure.
- MakeResultEvent - Returns a slice of results based on an InternalWrappedEvent internal output event.
- GetCompiledOperators - Returns the compiled operators for the request.
The MakeDefaultResultEvent function can be used as a default for the MakeResultEvent function when no protocol-specific features need to be implemented for result generation.
For reference protocol request implementations, see the protocol packages (v2/pkg/protocols/...) in the package list at the end of this document.
All these different request implementations are converted to an Executer, which is also an interface defined in pkg/protocols and is used during the final execution of the template.
// Executer is an interface implemented by any protocol based request executer.
type Executer interface {
	Compile() error
	Requests() int
	Execute(input string) (bool, error)
	ExecuteWithResults(input string, callback OutputEventCallback) error
}
The ExecuteWithResults function accepts a callback, which is provided with results during execution in the form of an *output.InternalWrappedEvent structure.
The default executer is provided in pkg/protocols/common/executer. It takes a list of Requests and the relevant ExecuterOptions, and implements the Executer interface required for template execution. The executer created during the template compilation process comes from this package and is used as-is.
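The relationship between requests, the executer and the result callback can be sketched roughly as follows. This is a deliberately reduced model: the `Event`, `EventCallback`, `Request`, `Executer` and `staticRequest` types here are simplified stand-ins defined only for this example, not the real types from pkg/protocols or pkg/output.

```go
package main

import "fmt"

// Event is a simplified stand-in for *output.InternalWrappedEvent.
type Event struct {
	RequestID string
	Matched   bool
}

// EventCallback receives every event produced during execution.
type EventCallback func(event *Event)

// Request is a reduced version of the Request interface from the text.
type Request interface {
	GetID() string
	ExecuteWithResults(input string, callback EventCallback) error
}

// Executer runs a list of requests against an input, one after another.
type Executer struct {
	requests []Request
}

// ExecuteWithResults forwards each request's events to the callback.
func (e *Executer) ExecuteWithResults(input string, callback EventCallback) error {
	for _, req := range e.requests {
		if err := req.ExecuteWithResults(input, callback); err != nil {
			return fmt.Errorf("request %q failed: %w", req.GetID(), err)
		}
	}
	return nil
}

// Execute reports whether any request produced a matching event.
func (e *Executer) Execute(input string) (bool, error) {
	matched := false
	err := e.ExecuteWithResults(input, func(event *Event) {
		if event.Matched {
			matched = true
		}
	})
	return matched, err
}

// staticRequest is a hypothetical request that always emits one fixed event.
type staticRequest struct {
	id      string
	matched bool
}

func (s staticRequest) GetID() string { return s.id }

func (s staticRequest) ExecuteWithResults(input string, callback EventCallback) error {
	callback(&Event{RequestID: s.id, Matched: s.matched})
	return nil
}

func main() {
	executer := &Executer{requests: []Request{
		staticRequest{id: "first", matched: false},
		staticRequest{id: "second", matched: true},
	}}
	matched, err := executer.Execute("example.com")
	fmt.Println(matched, err)
}
```

In this sketch Execute is built on top of ExecuteWithResults, so both entry points share one execution path.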
A different executer is the Clustered Requests executer, which implements the Nuclei request clustering functionality in pkg/templates. Where multiple templates can be clustered, there is a single HTTP request and multiple operator lists to match/extract. The HTTP request is executed once, while the matchers/extractors of every template are evaluated separately.
For workflow execution, a separate RunWorkflow function is used, which executes the workflow independently of the template execution.
With this basic premise set, we can now start exploring the current runner implementation which will also walk us through the architecture of nuclei.
The first process after all CLI specific initialisation is the loading of template/workflow paths that the user wants to run. This is done by the packages described below.
The pkg/catalog package is used to get paths using mixed syntax. It takes a template directory and resolves template paths against both the provided template directory and the current user directory.
The syntax is very versatile and can include filenames, glob patterns, directories, absolute paths, and relative paths.
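A minimal sketch of this kind of path resolution is shown below. The lookup order (absolute path, then the working directory, then the templates directory), the `resolvePath` name and the injected `exists` function are assumptions for illustration only. Glob patterns and directories are left out, and the example paths assume a Unix-like system.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// resolvePath turns a user-supplied template path into an absolute one.
// exists is injected so the lookup can be demonstrated without touching disk.
func resolvePath(target, templatesDir, workingDir string, exists func(string) bool) (string, error) {
	if filepath.IsAbs(target) {
		if exists(target) {
			return target, nil
		}
		return "", fmt.Errorf("no such path: %s", target)
	}
	// Relative paths are tried against each base directory in order.
	for _, base := range []string{workingDir, templatesDir} {
		candidate := filepath.Join(base, target)
		if exists(candidate) {
			return candidate, nil
		}
	}
	return "", fmt.Errorf("could not resolve path: %s", target)
}

func main() {
	known := map[string]bool{
		"/home/user/nuclei-templates/dns/cname-service-detection.yaml": true,
	}
	exists := func(p string) bool { return known[p] }

	resolved, err := resolvePath("dns/cname-service-detection.yaml",
		"/home/user/nuclei-templates", "/tmp/work", exists)
	fmt.Println(resolved, err)
}
```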
The next step is the initialisation of the reporting modules, which is handled in pkg/reporting.
The reporting module contains exporters and trackers, as well as a module for deduplication and a module for result formatting.
Exporters and Trackers are interfaces defined in pkg/reporting.
// Tracker is an interface implemented by an issue tracker
type Tracker interface {
	CreateIssue(event *output.ResultEvent) error
}
// Exporter is an interface implemented by an issue exporter
type Exporter interface {
	Close() error
	Export(event *output.ResultEvent) error
}
Exporters include Elasticsearch, markdown and sarif. Trackers include GitHub, Gitlab and Jira.
Each exporter and tracker implements its own configuration in YAML format and is very modular in nature, so adding new ones is easy.
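To give a feel for how small a new exporter can be, here is a hedged sketch of an in-memory exporter together with a tiny reporting client that deduplicates events before exporting them. The `ResultEvent` struct is a reduced stand-in for output.ResultEvent, and `memoryExporter`, `client`, `newClient` and the dedupe key are hypothetical names chosen for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// ResultEvent is a reduced stand-in for output.ResultEvent.
type ResultEvent struct {
	TemplateID string
	Host       string
}

// Exporter mirrors the interface from the text, using the local ResultEvent.
type Exporter interface {
	Close() error
	Export(event *ResultEvent) error
}

// memoryExporter is a hypothetical exporter that keeps events in a slice.
type memoryExporter struct {
	events []*ResultEvent
	closed bool
}

func (m *memoryExporter) Export(event *ResultEvent) error {
	if m.closed {
		return errors.New("exporter is closed")
	}
	m.events = append(m.events, event)
	return nil
}

func (m *memoryExporter) Close() error {
	m.closed = true
	return nil
}

// client forwards each unique event to every configured exporter.
type client struct {
	exporters []Exporter
	seen      map[string]struct{}
}

func newClient(exporters ...Exporter) *client {
	return &client{exporters: exporters, seen: make(map[string]struct{})}
}

// Report drops events already seen and exports the rest.
func (c *client) Report(event *ResultEvent) error {
	key := event.TemplateID + "|" + event.Host
	if _, ok := c.seen[key]; ok {
		return nil
	}
	c.seen[key] = struct{}{}
	for _, exporter := range c.exporters {
		if err := exporter.Export(event); err != nil {
			return fmt.Errorf("export failed: %w", err)
		}
	}
	return nil
}

func main() {
	mem := &memoryExporter{}
	c := newClient(mem)
	_ = c.Report(&ResultEvent{TemplateID: "demo", Host: "example.com"})
	_ = c.Report(&ResultEvent{TemplateID: "demo", Host: "example.com"}) // duplicate, skipped
	fmt.Println(len(mem.events))
}
```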
After reading all the inputs from various sources and initialising other miscellaneous options, the next step is output writing, which is done using the pkg/output module.
The output package implements the output writing functionality for Nuclei.
The Output Writer implements the Writer interface, which is called each time a result is found for nuclei.
// Writer is an interface which writes output to somewhere for nuclei events.
type Writer interface {
	Close()
	Colorizer() aurora.Aurora
	Write(*ResultEvent) error
	Request(templateID, url, requestType string, err error)
}
The ResultEvent structure, which contains the entire detail of a found result, is passed to the Nuclei Output Writer. Various intermediary types like InternalWrappedEvent and InternalEvent are used throughout nuclei protocols and matchers to describe results in various stages of execution.
Interactsh is also initialised if it is not explicitly disabled.
Interactsh module is used to provide automatic Out-of-Band vulnerability identification in Nuclei.
It uses two LRU caches, one for storing interactions for request URLs and one for storing requests for interaction URLs. Both caches are used to correlate requests received by the Interactsh OOB server with the Nuclei instance. The Interactsh client package does most of the heavy lifting of this module.
Polling for interactions and server registration only start when a template that uses the interactsh module is executed by nuclei. After that, no further registration is required for the entire run.
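The two-cache correlation idea can be sketched as below. Plain maps are used in place of LRU caches to keep the example short, and the `correlator` type with its methods is a hypothetical construction for illustration, not the module's actual code.

```go
package main

import "fmt"

// correlator pairs out-of-band interactions with the requests that caused them.
type correlator struct {
	// requests maps an interaction ID to the template that sent it.
	requests map[string]string
	// interactions holds interactions that arrived before their request was registered.
	interactions map[string][]string
	// matched collects the correlated results.
	matched []string
}

func newCorrelator() *correlator {
	return &correlator{
		requests:     make(map[string]string),
		interactions: make(map[string][]string),
	}
}

// RegisterRequest records that templateID sent a payload containing id and
// flushes any interactions that were already waiting for it.
func (c *correlator) RegisterRequest(id, templateID string) {
	c.requests[id] = templateID
	for _, interaction := range c.interactions[id] {
		c.report(templateID, interaction)
	}
	delete(c.interactions, id)
}

// HandleInteraction is called for every interaction polled from the server.
func (c *correlator) HandleInteraction(id, interaction string) {
	templateID, ok := c.requests[id]
	if !ok {
		// The request is not known yet, so keep the interaction for later.
		c.interactions[id] = append(c.interactions[id], interaction)
		return
	}
	c.report(templateID, interaction)
}

func (c *correlator) report(templateID, interaction string) {
	c.matched = append(c.matched, templateID+": "+interaction)
}

func main() {
	c := newCorrelator()
	c.RegisterRequest("abc123", "ssrf-template")
	c.HandleInteraction("abc123", "dns query")
	fmt.Println(c.matched)
}
```

Two lookups are needed because an interaction can reach the server before or after the request that triggered it has been recorded locally.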
Next we arrive in the RunEnumeration function of the runner.
The HostErrorsCache is initialised. It is used throughout the Nuclei enumeration run to keep track of errors per host and to skip further requests once the errors exceed the provided threshold. The error tracking cache is defined in hosterrorscache.go and is fairly simple.
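A per-host error counter with a skip threshold could look roughly like the sketch below. The `hostErrors` type and its method names are assumptions made for this example, and the real cache in hosterrorscache.go may normalise hosts or bound its size differently.

```go
package main

import (
	"fmt"
	"sync"
)

// hostErrors counts failures per host and reports when a host should be skipped.
type hostErrors struct {
	mu        sync.Mutex
	maxErrors int
	counts    map[string]int
}

func newHostErrors(maxErrors int) *hostErrors {
	return &hostErrors{maxErrors: maxErrors, counts: make(map[string]int)}
}

// MarkFailed records one more failed request for host.
func (h *hostErrors) MarkFailed(host string) {
	h.mu.Lock()
	defer h.mu.Unlock()
	h.counts[host]++
}

// ShouldSkip reports whether host has failed more often than the threshold allows.
func (h *hostErrors) ShouldSkip(host string) bool {
	h.mu.Lock()
	defer h.mu.Unlock()
	return h.counts[host] > h.maxErrors
}

func main() {
	cache := newHostErrors(2)
	for i := 0; i < 3; i++ {
		cache.MarkFailed("unreachable.example")
	}
	fmt.Println(cache.ShouldSkip("unreachable.example"), cache.ShouldSkip("healthy.example"))
}
```

The mutex is there because requests for many hosts typically run concurrently and update the same map.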
Next the WorkflowLoader is initialised, which is used to load workflows. It exists in v2/pkg/parsers/workflow_loader.go.
The loader is initialised next. It is responsible for using the catalog, passed tags, filters, paths, etc. to return compiled Templates and Workflows.
First, the input passed by the user as paths is normalised to absolute paths by the pkg/catalog module. Next, the path filter module is used to remove the excluded template/workflow paths.
The LoadTemplate and LoadWorkflow functions of the pkg/parsers module are used to check that the templates pass validation and are not excluded via tags/severity/etc. filters. If all checks pass, the template/workflow is parsed and returned in a compiled form by the Parse function of pkg/templates.
The Parse function compiles all the requests in a template and creates Executers from them, returning a runnable Template/Workflow structure.
The clustering module comes in next. Its job is to cluster identical HTTP GET requests together. Many templates perform the same GET requests, so clustering saves a large number of requests on big scans with lots of templates.
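The grouping step can be illustrated with the simplified sketch below. The `template` struct, the `cluster` function and the choice of method plus path as the cluster key are assumptions for this example. The engine's real eligibility checks are likely stricter than comparing two fields.

```go
package main

import "fmt"

// template is a reduced stand-in holding only what this sketch compares.
type template struct {
	ID     string
	Method string
	Path   string
}

// cluster groups templates whose GET request is identical.
// Templates that are not GET requests each stay in a cluster of their own.
func cluster(templates []template) [][]template {
	index := make(map[string]int) // cluster key -> position in clusters
	var clusters [][]template
	for _, t := range templates {
		if t.Method != "GET" {
			clusters = append(clusters, []template{t})
			continue
		}
		key := t.Method + " " + t.Path
		if i, ok := index[key]; ok {
			clusters[i] = append(clusters[i], t)
			continue
		}
		index[key] = len(clusters)
		clusters = append(clusters, []template{t})
	}
	return clusters
}

func main() {
	clusters := cluster([]template{
		{ID: "tech-detect", Method: "GET", Path: "/"},
		{ID: "login-panel", Method: "GET", Path: "/login"},
		{ID: "favicon-hash", Method: "GET", Path: "/"},
		{ID: "upload-check", Method: "POST", Path: "/"},
	})
	for _, c := range clusters {
		fmt.Println(len(c), c[0].Method, c[0].Path)
	}
}
```

Each cluster then needs only one request on the wire, with the operators of every member template evaluated against the shared response.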
Operators package implements all the matching and extracting logic of Nuclei.
// Operators contain the operators that can be applied on protocols
type Operators struct {
	Matchers          []*matchers.Matcher
	Extractors        []*extractors.Extractor
	MatchersCondition string
}
A protocol only needs to embed the operators.Operators type shown above to utilise all the matching/extracting functionality of nuclei.
// MatchFunc performs matching operation for a matcher on model and returns true or false.
type MatchFunc func(data map[string]interface{}, matcher *matchers.Matcher) (bool, []string)
// ExtractFunc performs extracting operation for an extractor on model and returns the set of extracted values.
type ExtractFunc func(data map[string]interface{}, matcher *extractors.Extractor) map[string]struct{}
// Execute executes the operators on data and returns a result structure
func (operators *Operators) Execute(data map[string]interface{}, match MatchFunc, extract ExtractFunc, isDebug bool) (*Result, bool)
The core of this process is the Execute function, which takes an input dictionary as well as a Match and an Extract function and returns a Result structure. The Result is used later during nuclei execution to check for results.
// Result is a result structure created from operators running on data.
type Result struct {
	Matched        bool
	Extracted      bool
	Matches        map[string][]string
	Extracts       map[string][]string
	OutputExtracts []string
	DynamicValues  map[string]interface{}
	PayloadValues  map[string]interface{}
}
The internal logic for matching and extracting things like words, regexes, jq, paths, etc. is specified in pkg/operators/matchers and pkg/operators/extractors. Those packages are the place to look for further detail on the topic.
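As a simplified picture of how a list of matchers and a matchers condition combine into a single verdict, consider the sketch below. It supports only word matchers, and the `wordMatcher` and `operators` types are local stand-ins invented for this example. Treating any condition other than "and" as "or" is also an assumption of the sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// wordMatcher matches when any of its words occurs in the selected part.
type wordMatcher struct {
	Part  string
	Words []string
}

func (m wordMatcher) match(data map[string]interface{}) bool {
	value, ok := data[m.Part].(string)
	if !ok {
		return false
	}
	for _, word := range m.Words {
		if strings.Contains(value, word) {
			return true
		}
	}
	return false
}

// operators is a reduced stand-in with matchers and their condition only.
type operators struct {
	Matchers          []wordMatcher
	MatchersCondition string // "and", anything else is treated as "or"
}

// execute evaluates all matchers on data and combines them per the condition.
func (o operators) execute(data map[string]interface{}) bool {
	if len(o.Matchers) == 0 {
		return false
	}
	for _, m := range o.Matchers {
		matched := m.match(data)
		if o.MatchersCondition == "and" && !matched {
			return false // one failing matcher is enough to fail "and"
		}
		if o.MatchersCondition != "and" && matched {
			return true // one passing matcher is enough to pass "or"
		}
	}
	// Reaching this point means every matcher passed ("and") or none did ("or").
	return o.MatchersCondition == "and"
}

func main() {
	data := map[string]interface{}{
		"body":   "Welcome to nginx!",
		"header": "Server: nginx",
	}
	ops := operators{
		Matchers: []wordMatcher{
			{Part: "body", Words: []string{"nginx"}},
			{Part: "header", Words: []string{"apache"}},
		},
		MatchersCondition: "and",
	}
	fmt.Println(ops.execute(data))
	ops.MatchersCondition = "or"
	fmt.Println(ops.execute(data))
}
```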
pkg/core provides the engine mechanism which runs the templates/workflows on inputs. It exposes an Execute function which performs the execution while also doing template clustering. Clustering can optionally be disabled by the user.
An example of using the core engine is provided below.
engine := core.New(r.options)
engine.SetExecuterOptions(executerOpts)
results := engine.ExecuteWithOpts(finalTemplates, r.hmapInputProvider, true)
An example of using Nuclei from Go code to run templates on targets is provided below.
package main

import (
	"fmt"
	"log"
	"os"
	"path"

	"github.com/logrusorgru/aurora"
	"go.uber.org/ratelimit"

	"github.com/projectdiscovery/goflags"
	"github.com/projectdiscovery/nuclei/v2/pkg/catalog"
	"github.com/projectdiscovery/nuclei/v2/pkg/catalog/config"
	"github.com/projectdiscovery/nuclei/v2/pkg/catalog/loader"
	"github.com/projectdiscovery/nuclei/v2/pkg/core"
	"github.com/projectdiscovery/nuclei/v2/pkg/core/inputs"
	"github.com/projectdiscovery/nuclei/v2/pkg/output"
	"github.com/projectdiscovery/nuclei/v2/pkg/parsers"
	"github.com/projectdiscovery/nuclei/v2/pkg/protocols"
	"github.com/projectdiscovery/nuclei/v2/pkg/protocols/common/hosterrorscache"
	"github.com/projectdiscovery/nuclei/v2/pkg/protocols/common/interactsh"
	"github.com/projectdiscovery/nuclei/v2/pkg/protocols/common/protocolinit"
	"github.com/projectdiscovery/nuclei/v2/pkg/protocols/common/protocolstate"
	"github.com/projectdiscovery/nuclei/v2/pkg/reporting"
	"github.com/projectdiscovery/nuclei/v2/pkg/testutils"
	"github.com/projectdiscovery/nuclei/v2/pkg/types"
)

func main() {
	cache := hosterrorscache.New(30, hosterrorscache.DefaultMaxHostsCount)
	defer cache.Close()

	mockProgress := &testutils.MockProgressClient{}
	reportingClient, _ := reporting.New(&reporting.Options{}, "")
	defer reportingClient.Close()

	outputWriter := testutils.NewMockOutputWriter()
	outputWriter.WriteCallback = func(event *output.ResultEvent) {
		fmt.Printf("Got Result: %v\n", event)
	}

	defaultOpts := types.DefaultOptions()
	protocolstate.Init(defaultOpts)
	protocolinit.Init(defaultOpts)

	defaultOpts.Templates = goflags.FileOriginalNormalizedStringSlice{"dns/cname-service-detection.yaml"}
	defaultOpts.ExcludeTags = config.ReadIgnoreFile().Tags

	interactOpts := interactsh.NewDefaultOptions(outputWriter, reportingClient, mockProgress)
	interactClient, err := interactsh.New(interactOpts)
	if err != nil {
		log.Fatalf("Could not create interact client: %s\n", err)
	}
	defer interactClient.Close()

	home, _ := os.UserHomeDir()
	catalog := catalog.New(path.Join(home, "nuclei-templates"))
	executerOpts := protocols.ExecuterOptions{
		Output:          outputWriter,
		Options:         defaultOpts,
		Progress:        mockProgress,
		Catalog:         catalog,
		IssuesClient:    reportingClient,
		RateLimiter:     ratelimit.New(150),
		Interactsh:      interactClient,
		HostErrorsCache: cache,
		Colorizer:       aurora.NewAurora(true),
		ResumeCfg:       types.NewResumeCfg(),
	}
	engine := core.New(defaultOpts)
	engine.SetExecuterOptions(executerOpts)

	workflowLoader, err := parsers.NewLoader(&executerOpts)
	if err != nil {
		log.Fatalf("Could not create workflow loader: %s\n", err)
	}
	executerOpts.WorkflowLoader = workflowLoader

	configObject, err := config.ReadConfiguration()
	if err != nil {
		log.Fatalf("Could not read config: %s\n", err)
	}
	store, err := loader.New(loader.NewConfig(defaultOpts, configObject, catalog, executerOpts))
	if err != nil {
		log.Fatalf("Could not create loader client: %s\n", err)
	}
	store.Load()

	input := &inputs.SimpleInputProvider{Inputs: []string{"docs.hackerone.com"}}
	_ = engine.Execute(store.Templates(), input)
	engine.WorkPool().Wait() // Wait for the scan to finish
}
Protocols form the core of the Nuclei Engine. All the request types like http, dns, etc. are implemented in the form of protocol requests.
A protocol must implement the Protocol and Request interfaces described above in pkg/protocols. We'll take the example of an existing protocol implementation, websocket, for this short reference around Nuclei internals.
The code for the websocket protocol is contained in pkg/protocols/others/websocket.
Below, a high-level skeleton of the websocket implementation is provided with all the important parts present.
package websocket

// Request is a request for the Websocket protocol
type Request struct {
	// Operators for the current request go here.
	operators.Operators `yaml:",inline,omitempty"`
	CompiledOperators   *operators.Operators `yaml:"-"`

	// description: |
	//   Address contains address for the request
	Address string `yaml:"address,omitempty" jsonschema:"title=address for the websocket request,description=Address contains address for the request"`

	// declarations here
}

// Compile compiles the request generators preparing any requests possible.
func (r *Request) Compile(options *protocols.ExecuterOptions) error {
	r.options = options

	// request compilation here as well as client creation

	if len(r.Matchers) > 0 || len(r.Extractors) > 0 {
		compiled := &r.Operators
		if err := compiled.Compile(); err != nil {
			return errors.Wrap(err, "could not compile operators")
		}
		r.CompiledOperators = compiled
	}
	return nil
}

// Requests returns the total number of requests the rule will perform
func (r *Request) Requests() int {
	if r.generator != nil {
		return r.generator.NewIterator().Total()
	}
	return 1
}

// GetID returns the ID for the request if any.
func (r *Request) GetID() string {
	return ""
}

// ExecuteWithResults executes the protocol requests and returns results instead of writing them.
func (r *Request) ExecuteWithResults(input string, dynamicValues, previous output.InternalEvent, callback protocols.OutputEventCallback) error {
	// payloads init here
	if err := r.executeRequestWithPayloads(input, hostname, value, previous, callback); err != nil {
		return err
	}
	return nil
}

// executeRequestWithPayloads performs the actual request and passes the resulting event to the callback.
func (r *Request) executeRequestWithPayloads(input, hostname string, dynamicValues, previous output.InternalEvent, callback protocols.OutputEventCallback) error {
	header := http.Header{}

	// make the actual request here after setting all options

	event := eventcreator.CreateEventWithAdditionalOptions(r, data, r.options.Options.Debug || r.options.Options.DebugResponse, func(internalWrappedEvent *output.InternalWrappedEvent) {
		internalWrappedEvent.OperatorsResult.PayloadValues = payloadValues
	})
	if r.options.Options.Debug || r.options.Options.DebugResponse {
		responseOutput := responseBuilder.String()
		gologger.Debug().Msgf("[%s] Dumped Websocket response for %s", r.options.TemplateID, input)
		gologger.Print().Msgf("%s", responsehighlighter.Highlight(event.OperatorsResult, responseOutput, r.options.Options.NoColor))
	}

	callback(event)
	return nil
}

// MakeResultEventItem creates a single result event from the internal wrapped event
func (r *Request) MakeResultEventItem(wrapped *output.InternalWrappedEvent) *output.ResultEvent {
	data := &output.ResultEvent{
		TemplateID:   types.ToString(r.options.TemplateID),
		TemplatePath: types.ToString(r.options.TemplatePath),
		// ... setting more values for result event
	}
	return data
}

// Match performs matching operation for a matcher on model and returns:
// true and a list of matched snippets if the matcher type supports it
// otherwise false and an empty string slice
func (r *Request) Match(data map[string]interface{}, matcher *matchers.Matcher) (bool, []string) {
	return protocols.MakeDefaultMatchFunc(data, matcher)
}

// Extract performs extracting operation for an extractor on model and returns the set of extracted values.
func (r *Request) Extract(data map[string]interface{}, matcher *extractors.Extractor) map[string]struct{} {
	return protocols.MakeDefaultExtractFunc(data, matcher)
}

// MakeResultEvent creates a result event from internal wrapped event
func (r *Request) MakeResultEvent(wrapped *output.InternalWrappedEvent) []*output.ResultEvent {
	return protocols.MakeDefaultResultEvent(r, wrapped)
}

// GetCompiledOperators returns a list of the compiled operators
func (r *Request) GetCompiledOperators() []*operators.Operators {
	return []*operators.Operators{r.CompiledOperators}
}

// Type returns the type of the protocol request
func (r *Request) Type() templateTypes.ProtocolType {
	return templateTypes.WebsocketProtocol
}
Almost all of these protocols have boilerplate functions for which default implementations have been provided in the protocols package. Examples are the implementations of Match, Extract, MakeResultEvent, GetCompiledOperators, etc., which are almost the same throughout the Nuclei protocols code. It is enough to copy-paste them unless customization is required.
The eventcreator package offers the CreateEventWithAdditionalOptions function, which can be used to create result events after request execution.
Step by step description of how to add a new protocol to Nuclei -
- Add the protocol implementation in the pkg/protocols directory. If it's a small protocol with fewer options, consider adding it to the pkg/protocols/others directory. Add the enum for the new protocol to v2/pkg/templates/types/types.go.
- Add the protocol request structure to the Template structure fields. This is done in pkg/templates/templates.go with the corresponding import line.
import (
	...
	"github.com/projectdiscovery/nuclei/v2/pkg/protocols/others/websocket"
)

// Template is a YAML input file which defines all the requests and
// other metadata for a template.
type Template struct {
	...
	// description: |
	//   Websocket contains the Websocket request to make in the template.
	RequestsWebsocket []*websocket.Request `yaml:"websocket,omitempty" json:"websocket,omitempty" jsonschema:"title=websocket requests to make,description=Websocket requests to make for the template"`
	...
}
Also add the protocol case to the Type function as well as the TemplateTypes array in the same templates.go file.
// TemplateTypes is a list of accepted template types
var TemplateTypes = []string{
	...
	"websocket",
}

// Type returns the type of the template
func (t *Template) Type() templateTypes.ProtocolType {
	...
	case len(t.RequestsWebsocket) > 0:
		return templateTypes.WebsocketProtocol
	default:
		return ""
	}
}
- Add the protocol request to the Requests function and the compileProtocolRequests function in the compile.go file in the same directory.
// Requests return the total request count for the template
func (template *Template) Requests() int {
	return len(template.RequestsDNS) +
		...
		len(template.RequestsSSL) +
		len(template.RequestsWebsocket)
}

// compileProtocolRequests compiles all the protocol requests for the template
func (template *Template) compileProtocolRequests(options protocols.ExecuterOptions) error {
	...
	case len(template.RequestsWebsocket) > 0:
		requests = template.convertRequestToProtocolsRequest(template.RequestsWebsocket)
	}
	template.Executer = executer.NewExecuter(requests, &options)
	return nil
}
That's it, you've added a new protocol to Nuclei. A good next step is to write integration tests, which are described in the integration-tests and cmd/integration-tests directories.
To enable dumping of memory profiling data, the -profile-mem flag can be used along with the path to a file. This writes a pprof-formatted file which can be used to investigate resource usage with the pprof tool.
$ nuclei -t nuclei-templates/ -u https://example.com -profile-mem mem.pprof
To view profile data in pprof, first install pprof. Then run the below command -
$ go tool pprof mem.pprof
To open a web UI on a port to visualize debug data, the below command can be used.
$ go tool pprof -http=:8081 mem.pprof
- v2/pkg/reporting - Reporting modules for nuclei.
- v2/pkg/reporting/exporters/sarif - Sarif Result Exporter
- v2/pkg/reporting/exporters/markdown - Markdown Result Exporter
- v2/pkg/reporting/exporters/es - Elasticsearch Result Exporter
- v2/pkg/reporting/dedupe - Dedupe module for Results
- v2/pkg/reporting/trackers/gitlab - Gitlab Issue Tracker Exporter
- v2/pkg/reporting/trackers/jira - Jira Issue Tracker Exporter
- v2/pkg/reporting/trackers/github - GitHub Issue Tracker Exporter
- v2/pkg/reporting/format - Result Formatting Functions
- v2/pkg/parsers - Implements template as well as workflow loader for initial template discovery, validation and loading.
- v2/pkg/types - Contains CLI options as well as misc helper functions.
- v2/pkg/progress - Progress tracking
- v2/pkg/operators - Operators for Nuclei
- v2/pkg/operators/common/dsl - DSL functions for Nuclei YAML Syntax
- v2/pkg/operators/matchers - Matchers implementation
- v2/pkg/operators/extractors - Extractors implementation
- v2/pkg/catalog - Template loading from disk helpers
- v2/pkg/catalog/config - Internal configuration management
- v2/pkg/catalog/loader - Implements loading and validation of templates and workflows.
- v2/pkg/catalog/loader/filter - Filter filters templates based on tags and paths
- v2/pkg/output - Output module for nuclei
- v2/pkg/workflows - Workflow execution logic + declarations
- v2/pkg/utils - Utility functions
- v2/pkg/model - Template Info + misc
- v2/pkg/templates - Templates core starting point
- v2/pkg/templates/cache - Templates cache
- v2/pkg/protocols - Protocol Specification
- v2/pkg/protocols/file - File protocol
- v2/pkg/protocols/network - Network protocol
- v2/pkg/protocols/common/expressions - Expression evaluation + Templating Support
- v2/pkg/protocols/common/interactsh - Interactsh integration
- v2/pkg/protocols/common/generators - Payload support for Requests (Sniper, etc.)
- v2/pkg/protocols/common/executer - Default Template Executer
- v2/pkg/protocols/common/replacer - Template replacement helpers
- v2/pkg/protocols/common/helpers/eventcreator - Result event creator
- v2/pkg/protocols/common/helpers/responsehighlighter - Debug response highlighter
- v2/pkg/protocols/common/helpers/deserialization - Deserialization helper functions
- v2/pkg/protocols/common/hosterrorscache - Host errors cache for tracking erroring hosts
- v2/pkg/protocols/offlinehttp - Offline http protocol
- v2/pkg/protocols/http - HTTP protocol
- v2/pkg/protocols/http/race - HTTP Race Module
- v2/pkg/protocols/http/raw - HTTP Raw Request Support
- v2/pkg/protocols/headless - Headless Module
- v2/pkg/protocols/headless/engine - Internal Headless implementation
- v2/pkg/protocols/dns - DNS protocol
- v2/pkg/projectfile - Project File Implementation
- The matching as well as interim output functionality is a bit complex; we should simplify it a bit as well.