Challenged requests analysis #5

Open
wants to merge 3 commits into
base: main
4 changes: 4 additions & 0 deletions README.md
@@ -71,6 +71,10 @@ With the query in Example 6 , you have a list of tokens utilized by different I

[This query](sql/traffic_byrbruri.sql) shows the traffic blocked by your Rate Based Rules that use a URI as an AGGREGATION KEY. It calculates the blocked traffic in 5-minute intervals during the specified time slot. You can add or change AGGREGATION KEYS to match your WAF rules and validate that the rules work as required.
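
To check the 5-minute bucketing on a handful of records outside Athena, a minimal Python sketch might look like the following. It is not a port of the linked query: the flattened record shape (`timestamp` in epoch milliseconds, `action`, `uri`) and the helper names are assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime, timezone

BUCKET_SECONDS = 5 * 60  # 5-minute intervals, as described above


def bucket_start(timestamp_ms: int) -> str:
    """Floor an epoch-millisecond timestamp to the start of its 5-minute bucket (UTC)."""
    seconds = timestamp_ms // 1000
    floored = seconds - (seconds % BUCKET_SECONDS)
    return datetime.fromtimestamp(floored, tz=timezone.utc).strftime("%Y-%m-%d %H:%M")


def blocked_by_uri(records: list[dict]) -> Counter:
    """Count BLOCK actions per (5-minute bucket, URI)."""
    counts: Counter = Counter()
    for rec in records:
        if rec["action"] == "BLOCK":
            counts[(bucket_start(rec["timestamp"]), rec["uri"])] += 1
    return counts


# Hypothetical sample records (flattened for the sketch; real WAF logs nest the URI under httprequest).
sample = [
    {"timestamp": 1_699_999_810_000, "action": "BLOCK", "uri": "/login"},
    {"timestamp": 1_700_000_090_000, "action": "BLOCK", "uri": "/login"},
    {"timestamp": 1_700_000_110_000, "action": "BLOCK", "uri": "/login"},
    {"timestamp": 1_700_000_050_000, "action": "ALLOW", "uri": "/login"},
]
result = blocked_by_uri(sample)
```

The first two blocked requests fall into the same bucket and the third into the next one, so `result` holds two keys for `/login`. Adding another aggregation key amounts to extending the tuple used as the counter key.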

#### Example 10: Understanding IPs/clients with Challenged requests for a given set of days

[This query](sql/challenges_and_tokens_byip.sql) gives details about the IPs/clients that were Challenged by WAF because their [WAF token](https://docs.aws.amazon.com/waf/latest/developerguide/waf-tokens.html) was not accepted. It can also uncover issues with [client application integrations](https://docs.aws.amazon.com/waf/latest/developerguide/waf-application-integration.html) that leverage AWS WAF tokens. You can extend this analysis with [this query](sql/persistent_challenges_byip.sql) to count the IPs/clients that are persistently Challenged, given a defined threshold (e.g. 5 challenged requests).
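
The threshold idea can be sketched in Python on a few in-memory records. This is an illustration under stated assumptions, not the linked SQL: the record shape (`clientip`, `action`, `labels` as a list of label names) and the function name are hypothetical, every request for an IP is counted whether or not it carries a token id label, and the per-day grouping is left out for brevity.

```python
from collections import defaultdict

TOKEN_ID_PREFIX = "awswaf:managed:token:id:"


def persistent_challenges(records: list[dict], threshold: int = 5) -> tuple[int, int, int]:
    """Return (total_ips, ips_at_or_above_threshold, those_ips_that_also_presented_a_token_id)."""
    challenged = defaultdict(int)  # clientip -> number of CHALLENGE actions
    tokens = defaultdict(set)      # clientip -> distinct token id labels seen
    for rec in records:
        ip = rec["clientip"]
        challenged[ip] += 1 if rec["action"] == "CHALLENGE" else 0
        tokens[ip].update(name for name in rec["labels"] if name.startswith(TOKEN_ID_PREFIX))
    persistent = {ip for ip, count in challenged.items() if count >= threshold}
    with_token = {ip for ip in persistent if tokens[ip]}
    return len(challenged), len(persistent), len(with_token)


# Hypothetical sample: one IP that eventually presents a token id, one that never does,
# and one that stays below the threshold.
sample = (
    [{"clientip": "198.51.100.1", "action": "CHALLENGE", "labels": []}] * 5
    + [{"clientip": "198.51.100.1", "action": "ALLOW", "labels": ["awswaf:managed:token:id:example-1"]}]
    + [{"clientip": "198.51.100.2", "action": "CHALLENGE", "labels": ["awswaf:managed:token:absent"]}] * 6
    + [{"clientip": "198.51.100.3", "action": "CHALLENGE", "labels": []}]
)
summary = persistent_challenges(sample, threshold=5)
```

An IP that reaches the threshold without ever presenting a token id, like the second one in the sample, is the kind of client worth a closer look: it may be automation that cannot solve the challenge or a broken client integration.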

### Tips to make Athena queries faster
To improve query performance, refer to the [Athena performance tuning post](https://docs.aws.amazon.com/athena/latest/ug/performance-tuning.html). It is important to reduce the amount of data being queried. Here are some additional tips to help improve your Athena queries.

23 changes: 21 additions & 2 deletions sql/bot.sql
@@ -27,12 +27,26 @@
SUM(CASE WHEN label_items.name LIKE '%:bot-control:bot:%' THEN 1 ELSE 0 END) is_bot_common,
SUM(CASE WHEN label_items.name LIKE '%:bot-control:signal:automated_browser%' THEN 1 ELSE 0 END) is_automated_browser,
SUM(CASE WHEN label_items.name LIKE '%:signal:known_bot_data_center%' THEN 1 ELSE 0 END) is_known_bot_data_center,
SUM(CASE WHEN label_items.name LIKE '%:signal:cloud_service_provider%' THEN 1 ELSE 0 END) is_cloud_service_provider,
SUM(CASE WHEN label_items.name LIKE '%:signal:non_browser_user_agent%' THEN 1 ELSE 0 END) is_non_browser_user_agent,
SUM(CASE WHEN label_items.name LIKE '%:targeted:aggregate:volumetric:ip:token_absent%' THEN 1 ELSE 0 END) is_targeted_token_absent,
SUM(CASE WHEN label_items.name LIKE '%:signal:non_browser_header%' THEN 1 ELSE 0 END) is_non_browser_header,
SUM(CASE WHEN label_items.name LIKE '%:targeted:aggregate:volumetric:ip:token_absent%' THEN 1 ELSE 0 END) is_targeted_volumetric_token_absent,
SUM(CASE WHEN label_items.name LIKE '%:targeted:aggregate:volumetric:session:low%' THEN 1 ELSE 0 END) is_targeted_session_low,
SUM(CASE WHEN label_items.name LIKE '%:targeted:aggregate:volumetric:session:medium%' THEN 1 ELSE 0 END) is_targeted_session_medium,
SUM(CASE WHEN label_items.name LIKE '%:targeted:aggregate:volumetric:session:high%' THEN 1 ELSE 0 END) is_targeted_session_high,
SUM(CASE WHEN label_items.name LIKE '%:targeted:aggregate:volumetric:session:maximum%' THEN 1 ELSE 0 END) is_targeted_session_maximum,
SUM(CASE WHEN label_items.name LIKE '%:targeted:signal:automated_browser%' THEN 1 ELSE 0 END) is_targeted_automated_browser,
SUM(CASE WHEN label_items.name LIKE '%:targeted:signal:browser_inconsistency%' THEN 1 ELSE 0 END) is_targeted_browser_inconsistency,
SUM(CASE WHEN label_items.name LIKE '%:targeted:aggregate:volumetric:session:token_reuse:ip%' THEN 1 ELSE 0 END) is_targeted_token_reuse_by_ips, -- Indicates the use of a single token among more than 5 distinct IP addresses
SUM(CASE WHEN label_items.name LIKE '%:targeted:signal:browser_automation_extension%' THEN 1 ELSE 0 END) is_targeted_browser_automation_extension,
SUM(CASE WHEN label_items.name LIKE '%:targeted:aggregate:volumetric:session:token_reuse:ip:low%' THEN 1 ELSE 0 END) is_targeted_token_reuse_by_ips_low,
SUM(CASE WHEN label_items.name LIKE '%:targeted:aggregate:volumetric:session:token_reuse:ip:medium%' THEN 1 ELSE 0 END) is_targeted_token_reuse_by_ips_medium,
SUM(CASE WHEN label_items.name LIKE '%:targeted:aggregate:volumetric:session:token_reuse:ip:high%' THEN 1 ELSE 0 END) is_targeted_token_reuse_by_ips_high,
SUM(CASE WHEN label_items.name LIKE '%:targeted:aggregate:volumetric:session:token_reuse:asn:low%' THEN 1 ELSE 0 END) is_targeted_token_reuse_by_asn_low,
SUM(CASE WHEN label_items.name LIKE '%:targeted:aggregate:volumetric:session:token_reuse:asn:medium%' THEN 1 ELSE 0 END) is_targeted_token_reuse_by_asn_medium,
SUM(CASE WHEN label_items.name LIKE '%:targeted:aggregate:volumetric:session:token_reuse:asn:high%' THEN 1 ELSE 0 END) is_targeted_token_reuse_by_asn_high,
SUM(CASE WHEN label_items.name LIKE '%:targeted:aggregate:volumetric:session:token_reuse:country:low%' THEN 1 ELSE 0 END) is_targeted_token_reuse_by_country_low,
SUM(CASE WHEN label_items.name LIKE '%:targeted:aggregate:volumetric:session:token_reuse:country:medium%' THEN 1 ELSE 0 END) is_targeted_token_reuse_by_country_medium,
SUM(CASE WHEN label_items.name LIKE '%:targeted:aggregate:volumetric:session:token_reuse:country:high%' THEN 1 ELSE 0 END) is_targeted_token_reuse_by_country_high,
SUM(CASE WHEN label_items.name LIKE '%:targeted:aggregate:coordinated_activity:low%' THEN 1 ELSE 0 END) is_coordinated_activity_low,
SUM(CASE WHEN label_items.name LIKE '%:targeted:aggregate:coordinated_activity:medium%' THEN 1 ELSE 0 END) is_coordinated_activity_medium,
SUM(CASE WHEN label_items.name LIKE '%:targeted:aggregate:coordinated_activity:high%' THEN 1 ELSE 0 END) is_coordinated_activity_high,
@@ -47,6 +61,10 @@
SUM(CASE WHEN label_items.name = 'awswaf:managed:token:accepted' THEN 1 ELSE 0 END) token_valid,
SUM(CASE WHEN label_items.name = 'awswaf:managed:token:rejected' THEN 1 ELSE 0 END) token_rejected,
SUM(CASE WHEN label_items.name = 'awswaf:managed:token:absent' THEN 1 ELSE 0 END) token_absent,
SUM(CASE WHEN label_items.name = 'awswaf:managed:token:rejected:not_solved' THEN 1 ELSE 0 END) token_rejected_not_solved,
SUM(CASE WHEN label_items.name = 'awswaf:managed:token:rejected:expired' THEN 1 ELSE 0 END) token_rejected_expired,
SUM(CASE WHEN label_items.name = 'awswaf:managed:token:rejected:domain_mismatch' THEN 1 ELSE 0 END) token_rejected_domain_mismatch,
SUM(CASE WHEN label_items.name = 'awswaf:managed:token:rejected:invalid' THEN 1 ELSE 0 END) token_rejected_invalid,


-- Static Assets
@@ -62,6 +80,7 @@
-- Bot Categories
SUM(CASE WHEN label_items.name LIKE '%:bot-control:bot:category:advertising%' THEN 1 ELSE 0 END) advertising,
SUM(CASE WHEN label_items.name LIKE '%:bot-control:bot:category:archiver%' THEN 1 ELSE 0 END) archiver,
SUM(CASE WHEN label_items.name LIKE '%:bot-control:bot:category:ai%' THEN 1 ELSE 0 END) ai,
SUM(CASE WHEN label_items.name LIKE '%:bot-control:bot:category:content_fetcher%' THEN 1 ELSE 0 END) content_fetcher,
SUM(CASE WHEN label_items.name LIKE '%:bot-control:bot:category:email_client%' THEN 1 ELSE 0 END) email_client,
SUM(CASE WHEN label_items.name LIKE '%:bot-control:bot:category:link_checker%' THEN 1 ELSE 0 END) link_checker,
101 changes: 101 additions & 0 deletions sql/challenges_and_tokens_byip.sql
@@ -0,0 +1,101 @@
with challenges_and_tokens as (
select
date,
httprequest.clientip AS clientip,
COUNT(distinct label_item.name) as unique_tokens,
SUM(
CASE
WHEN action = 'CHALLENGE' THEN 1
ELSE 0
END
) as challenged_requests,
SUM(
CASE
WHEN ARRAY_JOIN(transform(labels, l -> l.name), ', ') like '%awswaf:managed:token:rejected%' THEN 1
ELSE 0
END
) token_rejected,
SUM(
CASE
WHEN ARRAY_JOIN(transform(labels, l -> l.name), ', ') like '%awswaf:managed:token:absent%' THEN 1
ELSE 0
END
) token_absent,
SUM(
CASE
WHEN ARRAY_JOIN(transform(labels, l -> l.name), ', ') like '%:bot-control:TGT_TokenAbsent%' THEN 1
ELSE 0
END
) bot_control_token_absent,
SUM(
CASE
WHEN ARRAY_JOIN(transform(labels, l -> l.name), ', ') like '%:targeted:aggregate:volumetric:ip:token_absent%' THEN 1
ELSE 0
END
) bot_control_volumetric_token_absent
FROM
"waf_logs",
UNNEST(
CASE
WHEN cardinality(labels) >= 1 THEN labels
ELSE array [ cast(row('NOLABEL') as row(name varchar)) ]
END
) AS t(label_item)
WHERE
date >= date_format(current_date - interval '7' day, '%Y/%m/%d')
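-- Only unnested label rows holding a token id pass this filter, so requests
-- without a token id label are excluded from every count in this CTE
AND label_item.name LIKE 'awswaf:managed:token:id:%'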
group by
1,
2
)
select
date,
COUNT(distinct clientip) as total_ips,
COUNT(
distinct case
when challenged_requests > 0 then clientip
end
) as ips_challenged,
COUNT(
distinct case
when unique_tokens > 0 then clientip
end
) as ips_with_tokens,
COUNT(
distinct case
when (token_rejected > 0)
and (challenged_requests > 0) then clientip
end
) as ips_challenged_with_token_rejected,
COUNT(
distinct case
when (token_absent > 0)
and (challenged_requests > 0) then clientip
end
) as ips_challenged_with_token_absent,
COUNT(
distinct case
when (bot_control_token_absent > 0)
and (challenged_requests > 0) then clientip
end
) as ips_challenged_with_bot_control_token_absent,
COUNT(
distinct case
when (bot_control_volumetric_token_absent > 0)
and (challenged_requests > 0) then clientip
end
) as ips_challenged_with_bot_control_volumetric_token_absent,
COUNT(
distinct case
when (unique_tokens = 0)
and (
token_rejected > 0
OR token_absent > 0
OR bot_control_token_absent > 0
OR bot_control_volumetric_token_absent > 0
) then clientip
end
) as ips_with_no_tokens_ever
from
challenges_and_tokens
group by
1
47 changes: 47 additions & 0 deletions sql/persistent_challenges_byip.sql
@@ -0,0 +1,47 @@
WITH challenges_and_tokens AS (
select
date,
httprequest.clientip AS clientip,
count(distinct label_item.name) as unique_tokens,
SUM(
CASE
WHEN action = 'CHALLENGE' THEN 1
ELSE 0
END
) as challenged_requests
FROM
"waf_logs",
UNNEST(
CASE
WHEN cardinality(labels) >= 1 THEN labels
ELSE array [ cast(row('NOLABEL') as row(name varchar)) ]
END
) AS t(label_item)
WHERE
date >= date_format(current_date - interval '7' day, '%Y/%m/%d')
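-- Only unnested label rows holding a token id pass this filter, so requests
-- without a token id label are excluded from every count in this CTE
AND label_item.name LIKE 'awswaf:managed:token:id:%'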
group by
1,
2
)
select
date,
count(distinct clientip) as total_ips,
count(
distinct case
when challenged_requests >= 5 then clientip
end
) as clientip_with_5_challenges,
count(
distinct case
when (
challenged_requests >= 5
and unique_tokens > 0
) then clientip
end
) as clientip_with_5_challenges_solved
from
challenges_and_tokens
group by
1
order by
1