feat: add detect sensitive info rule #1300

Merged
merged 57 commits into from
Aug 26, 2024
Changes from 48 commits
Commits
ca24159
feat: add detect sensitive info rule
e-moran Aug 7, 2024
a1efc0a
move conversions into their own file
e-moran Aug 14, 2024
c3d5924
update to use detect function
e-moran Aug 15, 2024
efaec6a
remove generated files
e-moran Aug 15, 2024
e0ae73d
rework sensitiveInfo type signature to restrict custom types (#1344)
blaine-arcjet Aug 16, 2024
2753dd7
improve code structure
e-moran Aug 16, 2024
5ad53af
remove convert from gitignore
e-moran Aug 16, 2024
9b62117
cleanup
e-moran Aug 16, 2024
b8b24ba
restructure arcjet next request
e-moran Aug 16, 2024
fb55e2a
add guard
e-moran Aug 16, 2024
5864ae4
remove line
e-moran Aug 16, 2024
baa3a97
fix priority
e-moran Aug 16, 2024
eebbefb
add incoming message parsing and express example
e-moran Aug 19, 2024
38eb3d7
fix lint
e-moran Aug 19, 2024
cafc667
update directory
e-moran Aug 19, 2024
436a840
better stringify
e-moran Aug 19, 2024
1ef28e8
send sensitive info rule state to api
e-moran Aug 20, 2024
6a9d0dc
improve custom detect
e-moran Aug 20, 2024
a7370a5
remove extra any
e-moran Aug 20, 2024
0c64717
Merge branch 'main' into eoin/add-redact-rule
e-moran Aug 21, 2024
5279e89
pr suggestions
e-moran Aug 22, 2024
0681f51
readd express example
e-moran Aug 22, 2024
6fd3901
undo changes
e-moran Aug 22, 2024
51f97c8
Update middleware.ts
e-moran Aug 22, 2024
408d9b8
alphabetical and diff
blaine-arcjet Aug 22, 2024
d1159ec
fixup missed stuff in analyze duplication
blaine-arcjet Aug 22, 2024
090819d
get errorMessage for body error case
blaine-arcjet Aug 22, 2024
2856eb8
next impl cleanup
blaine-arcjet Aug 22, 2024
c5e15b5
unknown instead of any
blaine-arcjet Aug 22, 2024
d39a182
define clone type differently
blaine-arcjet Aug 22, 2024
7ca0944
return body if string
blaine-arcjet Aug 22, 2024
54de9c8
rename getBody in body package and remove unneeded code
blaine-arcjet Aug 22, 2024
a1a4617
rename and rework some of the node impl
blaine-arcjet Aug 22, 2024
66f0327
remove await return
blaine-arcjet Aug 22, 2024
91ac948
error message for sveltekit
blaine-arcjet Aug 22, 2024
0fb93cb
await all body to catch error and return undefined, add comment expla…
blaine-arcjet Aug 22, 2024
3bbe7e2
streamline body impl and tests
blaine-arcjet Aug 22, 2024
3f83640
handle some cases in arcjet core
blaine-arcjet Aug 22, 2024
d70d67b
undo diff changes
blaine-arcjet Aug 22, 2024
5506d8a
undo diff changes
blaine-arcjet Aug 22, 2024
c96154c
Merge remote-tracking branch 'origin/main' into eoin/add-redact-rule
blaine-arcjet Aug 22, 2024
ea0971f
update rollup to match versions in project
blaine-arcjet Aug 22, 2024
93a1491
lint
blaine-arcjet Aug 22, 2024
bcd731f
fmt
blaine-arcjet Aug 22, 2024
a096f70
pr changes
e-moran Aug 23, 2024
65473cb
unused import
e-moran Aug 23, 2024
6566026
unused import
e-moran Aug 23, 2024
1408e0d
lint
e-moran Aug 23, 2024
e351f82
update github actions
e-moran Aug 23, 2024
ebd5300
improve body tests based on eoins new tests
blaine-arcjet Aug 23, 2024
2518bdf
reset example
blaine-arcjet Aug 23, 2024
f962f05
pr suggestions
e-moran Aug 23, 2024
484d92e
Merge branch 'eoin/add-redact-rule' of github.com:arcjet/arcjet-js in…
e-moran Aug 23, 2024
db37eb3
regen example lockfile
blaine-arcjet Aug 23, 2024
98ec3c3
add reference
blaine-arcjet Aug 23, 2024
83a12a7
Update examples/express-sensitive-info/README.md
blaine-arcjet Aug 23, 2024
602cb4f
Update examples/nextjs-14-sensitive-info/app/api/arcjet/route.ts
blaine-arcjet Aug 23, 2024
1 change: 1 addition & 0 deletions .github/.release-please-manifest.json
@@ -6,6 +6,7 @@
   "arcjet-next": "1.0.0-alpha.21",
   "arcjet-node": "1.0.0-alpha.21",
   "arcjet-sveltekit": "1.0.0-alpha.21",
+  "body": "1.0.0-alpha.21",
   "decorate": "1.0.0-alpha.21",
   "duration": "1.0.0-alpha.21",
   "env": "1.0.0-alpha.21",
42 changes: 42 additions & 0 deletions .github/dependabot.yml
@@ -109,6 +109,31 @@ updates:
       - dependency-name: eslint
         versions: [">=9"]

+  - package-ecosystem: npm
+    directory: /examples/nextjs-14-sensitive-info
+    schedule:
+      # Our dependencies should be checked daily
+      interval: daily
+    assignees:
+      - blaine-arcjet
+    reviewers:
+      - blaine-arcjet
+    commit-message:
+      prefix: deps(example)
+      prefix-development: deps(example)
+    groups:
+      dependencies:
+        patterns:
+          - "*"
+    ignore:
+      # Ignore updates to the @types/node package due to conflict between
+      # Headers in DOM.
+      - dependency-name: "@types/node"
+        versions: [">18.18"]
+      # TODO(#539): Upgrade to eslint 9
+      - dependency-name: eslint
+        versions: [">=9"]
+
   - package-ecosystem: npm
     directory: /examples/nextjs-14-app-dir-validate-email
     schedule:
@@ -405,6 +430,23 @@ updates:
         patterns:
           - "*"

+  - package-ecosystem: npm
+    directory: /examples/express-sensitive-info
+    schedule:
+      # Our dependencies should be checked daily
+      interval: daily
+    assignees:
+      - blaine-arcjet
+    reviewers:
+      - blaine-arcjet
+    commit-message:
+      prefix: deps(example)
+      prefix-development: deps(example)
+    groups:
+      dependencies:
+        patterns:
+          - "*"
+
   - package-ecosystem: npm
     directory: /examples/nodejs-express-launchdarkly
     schedule:
5 changes: 5 additions & 0 deletions .github/release-please-config.json
@@ -52,6 +52,10 @@
       "component": "@arcjet/sveltekit",
       "skip-github-release": true
     },
+    "body": {
+      "component": "@arcjet/body",
+      "skip-github-release": true
+    },
     "decorate": {
       "component": "@arcjet/decorate",
       "skip-github-release": true
@@ -127,6 +131,7 @@
         "@arcjet/eslint-config",
         "@arcjet/headers",
         "@arcjet/ip",
+        "@arcjet/body",
         "@arcjet/logger",
         "@arcjet/protocol",
         "@arcjet/rollup-config",
53 changes: 50 additions & 3 deletions analyze/edge-light.ts
@@ -1,13 +1,18 @@
 import type { ArcjetLogger, ArcjetRequestDetails } from "@arcjet/protocol";

-import * as core from "./wasm/arcjet_analyze_js_req.component.js";
+import { instantiate } from "./wasm/arcjet_analyze_js_req.component.js";
 import type {
   ImportObject,
   EmailValidationConfig,
   BotDetectionResult,
   BotType,
   EmailValidationResult,
+  DetectedSensitiveInfoEntity,
+  SensitiveInfoEntities,
+  SensitiveInfoEntity,
+  SensitiveInfoResult,
 } from "./wasm/arcjet_analyze_js_req.component.js";
+import type { ArcjetJsReqSensitiveInformationIdentifier } from "./wasm/interfaces/arcjet-js-req-sensitive-information-identifier.js";

 import componentCoreWasm from "./wasm/arcjet_analyze_js_req.component.core.wasm?module";
 import componentCore2Wasm from "./wasm/arcjet_analyze_js_req.component.core2.wasm?module";
@@ -26,6 +31,9 @@ interface AnalyzeContext {
   characteristics: string[];
 }

+type DetectSensitiveInfoFunction =
+  typeof ArcjetJsReqSensitiveInformationIdentifier.detect;
+
 async function moduleFromPath(path: string): Promise<WebAssembly.Module> {
   if (path === "arcjet_analyze_js_req.component.core.wasm") {
     return componentCoreWasm;
@@ -40,9 +48,20 @@ async function moduleFromPath(path: string): Promise<WebAssembly.Module> {
   throw new Error(`Unknown path: ${path}`);
 }

-async function init(context: AnalyzeContext) {
+function noOpDetect(): SensitiveInfoEntity[] {
+  return [];
+}
+
+async function init(
+  context: AnalyzeContext,
+  detectSensitiveInfo?: DetectSensitiveInfoFunction,
+) {
   const { log } = context;

+  if (typeof detectSensitiveInfo !== "function") {
+    detectSensitiveInfo = noOpDetect;
+  }
+
   const coreImports: ImportObject = {
     "arcjet:js-req/logger": {
       debug(msg) {
@@ -69,10 +88,13 @@ async function init(context: AnalyzeContext) {
         return "unknown";
       },
     },
+    "arcjet:js-req/sensitive-information-identifier": {
+      detect: detectSensitiveInfo,
+    },
   };

   try {
-    return core.instantiate(moduleFromPath, coreImports);
+    return instantiate(moduleFromPath, coreImports);
   } catch {
     log.debug("WebAssembly is not supported in this runtime");
   }
@@ -94,6 +116,9 @@ export {
    * almost certain this request was not a bot.
    */
   type BotDetectionResult,
+  type DetectedSensitiveInfoEntity,
+  type SensitiveInfoEntity,
+  type DetectSensitiveInfoFunction,
 };

 /**
@@ -161,3 +186,25 @@ export async function detectBot(
     };
   }
 }
+export async function detectSensitiveInfo(
+  context: AnalyzeContext,
+  candidate: string,
+  entities: SensitiveInfoEntities,
+  contextWindowSize: number,
+  detect?: DetectSensitiveInfoFunction,
+): Promise<SensitiveInfoResult> {
+  const analyze = await init(context, detect);
+
+  if (typeof analyze !== "undefined") {
+    const skipCustomDetect = typeof detect !== "function";
+    return analyze.detectSensitiveInfo(candidate, {
+      entities,
+      contextWindowSize,
+      skipCustomDetect,
+    });
+  } else {
+    throw new Error(
+      "SENSITIVE_INFO rule failed to run because Wasm is not supported in this environment.",
+    );
+  }
+}
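The `init` and `detectSensitiveInfo` changes above combine two ideas: a missing custom detector is replaced by a no-op so the import object always holds a callable, and a separate `skipCustomDetect` flag records whether the caller actually supplied one. A minimal sketch of that pattern is below. It is not the package's code: `Entity`, `DetectFn`, `Prepared`, and `prepare` are hypothetical names, and the detector signature (tokens in, entities out) is an assumption, since the diff does not show the generated `detect` declaration.

```typescript
// Hypothetical stand-ins for the generated Wasm binding types (assumptions).
type Entity = string;
type DetectFn = (tokens: string[]) => Entity[];

// Mirrors noOpDetect in the diff: a detector that never reports anything.
function noOpDetect(): Entity[] {
  return [];
}

interface Prepared {
  detect: DetectFn;
  skipCustomDetect: boolean;
}

// Combines the guard from init() with the flag computed in
// detectSensitiveInfo(): the returned detector is always callable, while the
// flag records whether the caller supplied one.
function prepare(detect?: DetectFn): Prepared {
  const skipCustomDetect = typeof detect !== "function";
  return {
    detect: typeof detect === "function" ? detect : noOpDetect,
    skipCustomDetect,
  };
}

const withoutCustom = prepare();
const withCustom = prepare((tokens) => tokens.filter((t) => t === "secret"));

console.log(withoutCustom.skipCustomDetect, withoutCustom.detect(["secret"]));
console.log(withCustom.skipCustomDetect, withCustom.detect(["a", "secret"]));
```

Keeping the flag separate from the fallback means the consumer never has to null-check the callback but can still tell a real detector from the placeholder.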
54 changes: 51 additions & 3 deletions analyze/index.ts
@@ -1,13 +1,18 @@
 import type { ArcjetLogger, ArcjetRequestDetails } from "@arcjet/protocol";

-import * as core from "./wasm/arcjet_analyze_js_req.component.js";
+import { instantiate } from "./wasm/arcjet_analyze_js_req.component.js";
 import type {
   ImportObject,
   EmailValidationConfig,
   BotDetectionResult,
   BotType,
   EmailValidationResult,
+  DetectedSensitiveInfoEntity,
+  SensitiveInfoEntities,
+  SensitiveInfoEntity,
+  SensitiveInfoResult,
 } from "./wasm/arcjet_analyze_js_req.component.js";
+import type { ArcjetJsReqSensitiveInformationIdentifier } from "./wasm/interfaces/arcjet-js-req-sensitive-information-identifier.js";

 import { wasm as componentCoreWasm } from "./wasm/arcjet_analyze_js_req.component.core.wasm?js";
 import { wasm as componentCore2Wasm } from "./wasm/arcjet_analyze_js_req.component.core2.wasm?js";
@@ -26,6 +31,9 @@ interface AnalyzeContext {
   characteristics: string[];
 }

+type DetectSensitiveInfoFunction =
+  typeof ArcjetJsReqSensitiveInformationIdentifier.detect;
+
 // TODO: Do we actually need this wasmCache or does `import` cache correctly?
 const wasmCache = new Map<string, WebAssembly.Module>();

@@ -54,9 +62,20 @@ async function moduleFromPath(path: string): Promise<WebAssembly.Module> {
   throw new Error(`Unknown path: ${path}`);
 }

-async function init(context: AnalyzeContext) {
+function noOpDetect(): SensitiveInfoEntity[] {
+  return [];
+}
+
+async function init(
+  context: AnalyzeContext,
+  detectSensitiveInfo?: DetectSensitiveInfoFunction,
+) {
   const { log } = context;

+  if (typeof detectSensitiveInfo !== "function") {
+    detectSensitiveInfo = noOpDetect;
+  }
+
   const coreImports: ImportObject = {
     "arcjet:js-req/logger": {
       debug(msg) {
@@ -83,10 +102,13 @@ async function init(context: AnalyzeContext) {
         return "unknown";
       },
     },
+    "arcjet:js-req/sensitive-information-identifier": {
+      detect: detectSensitiveInfo,
+    },
   };

   try {
-    return core.instantiate(moduleFromPath, coreImports);
+    return instantiate(moduleFromPath, coreImports);
   } catch {
     log.debug("WebAssembly is not supported in this runtime");
   }
@@ -108,6 +130,9 @@ export {
    * almost certain this request was not a bot.
    */
   type BotDetectionResult,
+  type DetectedSensitiveInfoEntity,
+  type SensitiveInfoEntity,
+  type DetectSensitiveInfoFunction,
 };

 /**
@@ -175,3 +200,26 @@ export async function detectBot(
     };
   }
 }
+
+export async function detectSensitiveInfo(
+  context: AnalyzeContext,
+  candidate: string,
+  entities: SensitiveInfoEntities,
+  contextWindowSize: number,
+  detect?: DetectSensitiveInfoFunction,
+): Promise<SensitiveInfoResult> {
+  const analyze = await init(context, detect);
+
+  if (typeof analyze !== "undefined") {
+    const skipCustomDetect = typeof detect !== "function";
+    return analyze.detectSensitiveInfo(candidate, {
+      entities,
+      contextWindowSize,
+      skipCustomDetect,
+    });
+  } else {
+    throw new Error(
+      "SENSITIVE_INFO rule failed to run because Wasm is not supported in this environment.",
+    );
+  }
+}
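The `entities` argument passed to `detectSensitiveInfo` is a tagged union with an `allow` and a `deny` variant, per the generated type declarations in this PR. The sketch below shows one plausible reading of the two modes. It rests on stated assumptions: `SensitiveInfoEntity` is simplified to a plain string because its real definition is not shown in this diff, `isDenied` is a hypothetical helper, and the allow/deny semantics are a guess, since the actual decision is made inside the Wasm component.

```typescript
// Simplified stand-in (assumption): the real SensitiveInfoEntity comes from
// the generated Wasm bindings and is not shown in this diff.
type SensitiveInfoEntity = string;

// Same shape as SensitiveInfoEntitiesAllow | SensitiveInfoEntitiesDeny in the
// generated declarations.
type SensitiveInfoEntities =
  | { tag: "allow"; val: SensitiveInfoEntity[] }
  | { tag: "deny"; val: SensitiveInfoEntity[] };

// ASSUMPTION: in "allow" mode only the listed entities are permitted, and in
// "deny" mode only the listed entities are blocked.
function isDenied(
  entities: SensitiveInfoEntities,
  found: SensitiveInfoEntity,
): boolean {
  const listed = entities.val.some((entity) => entity === found);
  switch (entities.tag) {
    case "allow":
      return !listed;
    case "deny":
      return listed;
  }
}

const allowEmailOnly: SensitiveInfoEntities = { tag: "allow", val: ["email"] };
const denyEmailOnly: SensitiveInfoEntities = { tag: "deny", val: ["email"] };

console.log(isDenied(allowEmailOnly, "email")); // false
console.log(isDenied(allowEmailOnly, "phone-number")); // true
console.log(isDenied(denyEmailOnly, "email")); // true
console.log(isDenied(denyEmailOnly, "phone-number")); // false
```

Switching on `tag` lets the compiler check that both variants are handled, which is the main benefit of modelling the option as a discriminated union rather than two optional arrays.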
Binary file modified analyze/wasm/arcjet_analyze_js_req.component.core.wasm
Binary file not shown.
Binary file modified analyze/wasm/arcjet_analyze_js_req.component.core2.wasm
Binary file not shown.
Binary file modified analyze/wasm/arcjet_analyze_js_req.component.core3.wasm
Binary file not shown.
28 changes: 28 additions & 0 deletions analyze/wasm/arcjet_analyze_js_req.component.d.ts
@@ -1,3 +1,5 @@
+import type { SensitiveInfoEntity } from './interfaces/arcjet-js-req-sensitive-information-identifier.js';
+export { SensitiveInfoEntity };
 /**
  * # Variants
  *
@@ -35,16 +37,42 @@ export interface EmailValidationConfig {
   allowDomainLiteral: boolean,
   blockedEmails: Array<string>,
 }
+export type SensitiveInfoEntities = SensitiveInfoEntitiesAllow | SensitiveInfoEntitiesDeny;
+export interface SensitiveInfoEntitiesAllow {
+  tag: 'allow',
+  val: Array<SensitiveInfoEntity>,
+}
+export interface SensitiveInfoEntitiesDeny {
+  tag: 'deny',
+  val: Array<SensitiveInfoEntity>,
+}
+export interface SensitiveInfoConfig {
+  entities: SensitiveInfoEntities,
+  contextWindowSize?: number,
+  skipCustomDetect: boolean,
+}
+export interface DetectedSensitiveInfoEntity {
+  start: number,
+  end: number,
+  identifiedType: SensitiveInfoEntity,
+}
+export interface SensitiveInfoResult {
+  allowed: Array<DetectedSensitiveInfoEntity>,
+  denied: Array<DetectedSensitiveInfoEntity>,
+}
 import { ArcjetJsReqEmailValidatorOverrides } from './interfaces/arcjet-js-req-email-validator-overrides.js';
 import { ArcjetJsReqLogger } from './interfaces/arcjet-js-req-logger.js';
+import { ArcjetJsReqSensitiveInformationIdentifier } from './interfaces/arcjet-js-req-sensitive-information-identifier.js';
 export interface ImportObject {
   'arcjet:js-req/email-validator-overrides': typeof ArcjetJsReqEmailValidatorOverrides,
   'arcjet:js-req/logger': typeof ArcjetJsReqLogger,
+  'arcjet:js-req/sensitive-information-identifier': typeof ArcjetJsReqSensitiveInformationIdentifier,
 }
 export interface Root {
   detectBot(headers: string, patternsAdd: string, patternsRemove: string): BotDetectionResult,
   generateFingerprint(request: string, characteristics: Array<string>): string,
   isValidEmail(candidate: string, options: EmailValidationConfig): EmailValidationResult,
+  detectSensitiveInfo(content: string, options: SensitiveInfoConfig): SensitiveInfoResult,
 }

 /**
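The `SensitiveInfoResult` declared above reports each match as a `start`/`end` pair plus an `identifiedType`. As an illustration of how a consumer might use those offsets, the sketch below masks the denied spans in the original string. This is not something the diff itself does: `redactDenied` is a hypothetical helper, `SensitiveInfoEntity` is simplified to a string, and the offsets are assumed to be JavaScript string indices with `start` inclusive and `end` exclusive, which the declarations do not confirm.

```typescript
// Simplified stand-in (assumption): the real entity type is not shown here.
type SensitiveInfoEntity = string;

// Same field names as the generated declarations.
interface DetectedSensitiveInfoEntity {
  start: number;
  end: number;
  identifiedType: SensitiveInfoEntity;
}

interface SensitiveInfoResult {
  allowed: DetectedSensitiveInfoEntity[];
  denied: DetectedSensitiveInfoEntity[];
}

// ASSUMPTION: start is inclusive, end is exclusive, and both index into the
// JavaScript string that was analyzed.
function redactDenied(content: string, result: SensitiveInfoResult): string {
  // Work from the last span to the first so earlier offsets stay valid.
  const spans = result.denied.slice().sort((a, b) => b.start - a.start);
  let out = content;
  for (const span of spans) {
    out = out.slice(0, span.start) + `<${span.identifiedType}>` + out.slice(span.end);
  }
  return out;
}

const content = "mail me at a@b.co or call 555-0100";
const emailStart = content.indexOf("a@b.co");
const phoneStart = content.indexOf("555-0100");

// Hand-built result for illustration; a real one would come from the analyzer.
const sample: SensitiveInfoResult = {
  denied: [
    { start: emailStart, end: emailStart + "a@b.co".length, identifiedType: "email" },
  ],
  allowed: [
    { start: phoneStart, end: phoneStart + "555-0100".length, identifiedType: "phone-number" },
  ],
};

const redacted = redactDenied(content, sample);
console.log(redacted);
```

Replacing spans back to front avoids recomputing offsets after each substitution, since a replacement only shifts the text that follows it.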