-
Notifications
You must be signed in to change notification settings - Fork 315
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP - New DialerGroupConfig interface for modules to given greater control over connection (L4/TLS/etc) establishment #506
base: master
Are you sure you want to change the base?
Conversation
Tangent/question: This probably could just be accomplished with the multiple module, and defining order appropriately, bu maybe this would also make it easier to sub in my own dialergroup to accomplish that? |
@@ -165,39 +175,27 @@ func (scanner *Scanner) Protocol() string { | |||
return "amqp091" | |||
} | |||
|
|||
func (scanner *Scanner) Scan(target zgrab2.ScanTarget) (zgrab2.ScanStatus, any, error) { | |||
conn, err := target.Open(&scanner.config.BaseFlags) | |||
func (scanner *Scanner) GetDialerGroupConfig() *zgrab2.DialerGroupConfig { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be identical in all the classes. Is it really necessary?
Alternately, can we maybe simplify with struct embedding to clean up all the repetition?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hear you with avoiding repetition, but other than the function header, each module will have a unique implementation of the DialerGroupConfig
. So I'd argue it's not too much more repetition.
Additionally, we'd lose the assertion that each module will provide this functionality, rather than expecting some struct to be embedded at the implementation level. This approach gives us compile-time checks that a module author has provided what the Framework requires.
modules/fox/scanner.go
Outdated
scanner.dialerGroupConfig = &zgrab2.DialerGroupConfig{ | ||
L4TransportProtocol: zgrab2.TransportTCP, | ||
BaseFlags: &f.BaseFlags, | ||
TLSEnabled: true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why force tls vs predicate on UseTLS?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, I didn't intend to drop this feature from the module. Let me go back and double-check the other modules too.
modules/banner/scanner.go
Outdated
s.dialerGroupConfig = &zgrab2.DialerGroupConfig{ | ||
L4TransportProtocol: zgrab2.TransportTCP, | ||
BaseFlags: &f.BaseFlags, | ||
TLSEnabled: f.UseTLS, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is banner the only one where TLSEnabled is based on a config arg?
See my other comment for why is TLSEnabled true in a lot of modules where something may or may not be over TLS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'd figured that in cases where we have a separate L4 Dialer it was fine to set TLSEnabled
to true
since this just populates the TLSWrapper
and then it's up to Scan()
to use it or not depending on the TLSEnabled
flag.
That being said, I think my approach was more prone to misconfiguration so I've moved the other modules to be similar to Banner
.
modules/amqp091/scanner.go
Outdated
} | ||
if f.UseTLS { | ||
scanner.dialerGroupConfig.TLSFlags = &f.TLSFlags | ||
scanner.dialerGroupConfig.TLSEnabled = true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is another example where we are setting TLSEnabled slightly differently than banner.
I wonder if we should just have some default "TLS Optional if enabled" dialer group for consistency, and then most of the configuration is in cases where NeedSeparateL4Dialer is true.
The |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incredible work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for tackling this, excited for this functionality! I was going to jump into some actual e2e testing, but I had some design thoughts before I do. Apologies in advance if I misunderstood some design elements.
if err != nil { | ||
logrus.Errorf("could not close connection to %v: %v", conn.RemoteAddr(), err) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious about what led to the need for CloseConnAndHandleError
; did you see errors in testing? My concern is this will be very noisy with sufficient scanning volume (e.g. broken connections prior to Close
, etc.), and I'm not sure we care to know if the Close
failed. Maybe logrus.Debugf
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I figured it was best practice to log any error that occurred since if we silently errored when closing connections, it'd be a lot easier to run out of sockets and just seems generally like a state we'd want to know about. And since all modules need to close connections, thought I'd just write a wrapper for it to make it easier to change 😄.
That being said, I don't particularly have an issue with making it Debug
so if someone notices they're running out of sockets they at least have a mechanism to learn why. I had figured tho that closing the connection more so just freed the socket and flushed any buffers, which if we couldn't close would be something I'd want to know about. WDYT?
defer func() { | ||
err := sshClient.Close() | ||
if err != nil { | ||
log.Errorf("error closing SSH client for target %s: %v", t.String(), err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My intuition is this log.Error will be too noisy; do we care if we get an error on Close()?
TLSEnabled bool | ||
TLSFlags *TLSFlags // must be non-nil if TLSEnabled is true | ||
UDPFlags *UDPFlags // must be non-nil if L4TransportProtocol is TransportUDP | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand the design correctly:
Just as scan modules registered themselves, they now also provide a way to get a module-scoped DialerGroupConfig
for when a framework registers a module (or at least, prior to scan). So a framework will in effect:
- allocate a module
- call
GetDialerGroupConfig
so it knows what type of connection is needed - call
Scan
to perform that scan
Here are some design thoughts:
With a static DialGroupConfig
scoped to the scan module level, there are some things that become rigidly defined (the TransportAgnosticDialer
is fixed to whatever transport the static config contains, for instance), while others (L4Dialer
) allow per-target configuration of transport protocol.
I worry this could be an issue, because a module may want to opportunistically scan udp or tcp, for instance. Or, to have the BaseFlags.Timeout
apply to the total scan time of all connections in aggregate that comprise a protocol scan (so individual Dial connections within a scanner may want different timeouts)
It could be we just say this behavior is explicitly not supported. However, it may be cheap to achieve.
For each Scan
, the framework passes in a zgrab2.ScanTarget
struct, which is then also passed in to all the Dial
implementations.
A lot of the DialerGroupConfig
options actually feel like ScanTarget
scoped fields to me: which transport protocol to use, if a compat separate dialer is required, udp flags, tls flags. I think that is why some elements like TLSEnable and sometimes-static sometimes-not transport protocols feels a bit off to me, as well as the need to pass TLSWrapper around in some modules. I think this may indicate that the config is attached at the wrong level.
if zgrab2.ScanTarget
was not a struct, but rather an interface, then we could rely on a framework to implement zgrab2.ScanTarget
:
e.g.:
type ScanTarget interface {
IP() net.IP
Port() *uint
// ... etc.
Dial(zgrab2.DialerGroupConfig) // ...
L4Dialer(zgrab2.DialerGroupConfig) // ...
TLSWrapper(zgrab2.DialerGroupConfig, net.Conn) // ...
}
Then, Scan
can accept anything that fulfills ScanTarget
, and the scan module can make connections by just calling Dial
, L4Dialer
, TLSWrapper
with the appropriate DialerGroupConfig
. The framework that implements ScanTarget
would then be responsible for each of the methods to do the "right thing" given the DialerGroupConfig
. The DialerGroupConfig
as part of registration (or pre-Scan call) becomes unnecessary then.
Does this makes sense, or did I overlook some design intended functionality that would make this approach less optimal than the current implementation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the thoughtful response here. It's a very fair point that the current design means a module can't really scan a target on both TCP and UDP and that's probably something we don't want to write-off.
I think I have a solution though for this, which lets a module provide a network
and get either TCP or UDP.
dialerGroup.L4Dialer = func(scanTarget *ScanTarget) func(ctx context.Context, network, addr string) (net.Conn, error) {
return func(ctx context.Context, network, addr string) (net.Conn, error) {
switch network {
case "udp", "udp4", "udp6":
return GetDefaultUDPDialer(config.BaseFlags, config.UDPFlags)(ctx, scanTarget)
case "tcp", "tcp4", "tcp6":
return GetDefaultTCPDialer(config.BaseFlags)(ctx, scanTarget)
default:
return nil, fmt.Errorf("unsupported network type: %s", network)
}
}
}
This way, a module can do both by toggling the network
type when they dial
and we can provide both a UDP dialer and TCP one.
Back to your suggested interface, I wonder what the difference is in passing dialers around within the ScanTarget
(whether it's a struct or interface) vs. what we have now of passing in dialers in a call to Scan()
with a ScanTarget
? I think it maps better to my mental model of a ZGrab scan target where it's the "thing" you're scanning, not the connection mechanism by which you scan.
Additionally, to your point about a module scanning both UDP
and tcp
, this would be possible with a module just using the L4 dialer
and selecting network = tcp
or network = udp
. In my head, I think a module should request a TransportAgnosticDialer
if they don't need any further configuration or use of the dialer. They don't need to re-send a query over a different L4 protocol, they just do the protocol over whatever connection is provided.
Secondly, to your point about timeouts, having more fine-grained control over timeouts seems like a good idea, but I'm not sure how this design precludes that future addition. We could still add these as some ScanTargetTimeout
and ConnectionTimeout
to differentiate between timeout for an entire ScanTarget
vs. a specific connection (that might be re-tried with TCP/UDP).
…ork-makes-dialer-groups
…at can handle both UDP and TCP
…ork-makes-dialer-groups
…ork-makes-dialer-groups
Description and Motivation
Some users of ZGrab modules, especially those who are doing scanning of L7 hosts as part of a larger scanning pipeline, desire having more control over the connections ZGrab establishes/uses in order to scan.
This PR intends to modify the interface between protocol modules and the ZGrab framework in order to give users the ability to have full control over ZGrab's connections. This could include:
In sum, we want to give the user the ability to take full control of establishing a connection to a scanned host. We also want all existing functionality of ZGrab as a CLI tool to remain unaffected.
Protocol Differences in needs for a Connection
First, let's delineate between 2 classes of protocol.
Transport-Agnostic
The majority of protocols are transport-agnostic and can run on any L4 connection or a TLS-wrapped connection. There may be standard norms (HTTPS on TCP port 443, NTP on port 123 over UDP, etc) but the protocol will work over any if both sides agree.
Non-Transport-Agnostic
Some L7 protocols have certain demands of the underlying transport. For example, HTTPS needs to be able to follow re-directs to HTTP sites and vice-versa. The HTTP module will usually require both a L4 connection and a TLS-wrapped connection to be able to follow both of these types of redirects.
There are other protocols as well, some (SMTP for ex.) want to establish an L4 connection, send a
STARTTLS
packet, and then proceed with a TLS handshake. We could not just hand this module a pre-existing, post-handshake TLS connection and expect the protocol to work.Dialer Groups
In order to accomplish this in a clean abstraction, we've settled on the idea of a
DialerGroup
that is taken as input in every call toScan()
. Modules will provide a "default" dialer group setting the dialers to be as they'd commonly expect. Transport-agnostic modules will set theTransportAgnosticDialer
to their default L4/TLS settings, minding any flags the user sets.For example,
NTP
would set the dialer to be a UDP dialer connecting to port 123, by default.From
module.go
:Protocols like HTTP will instead use the
L4Dialer
andTLSWrapper
to specify the two types of connections they require.Users that require full control over connections can set these dialers to whichever they wish and the
Scan()
function will utilize them as specified.The new
Scan()
function looks as follows:DialerGroupConfig
Rather than having module authors define dialer groups, we'll have modules define a
DialerGroupConfig
that encapsulates what a module needs from its dialer.With this information, the framework can build a default dialer group to be used with standard CLI operation.
How it all comes together
The overall flow will look as follows:
Init(scanFlags)
, a scan module will use the user-provided scan flags (entries like what dst port to scan on, TLS flags to control aspects of the TLS connection, etc) to set aDialerGroupConfig
, providing the framework with all the info it needs to build the dialers.GetDialerGroupConfig
to get a module's dialer group config for "default" operation.Scan(ctx context.Context, dialerGroup *DialerGroup, t *ScanTarget)
.Scan()
to connect to the target.This let's users of the modules themselves inject their own dialers into
Scan()
in Step 5.In conclusion, this interface let's modules define a certain "default behavior" as each L7 protocol has differing norms on what connections it usually uses, while also giving developers who use these modules in larger scanning systems the ability to fully control all aspects of a connection.
Testing
WIP