Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FormAnalyzer] scoring password hints #720

Open
wants to merge 8 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
40 changes: 31 additions & 9 deletions dist/autofill-debug.js

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

40 changes: 31 additions & 9 deletions dist/autofill.js

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

33 changes: 26 additions & 7 deletions src/Form/FormAnalyzer.js
Original file line number Diff line number Diff line change
Expand Up @@ -40,9 +40,6 @@ class FormAnalyzer {
*/
this.signals = [];

// Analyse the input that was passed. This is pretty arbitrary, but historically it's been working nicely.
this.evaluateElAttributes(input, 1, true);

// If we have a meaningful container (a form), check that, otherwise check the whole page
if (form !== input) {
this.evaluateForm();
Expand All @@ -53,15 +50,18 @@ class FormAnalyzer {
return this;
}

areLoginOrSignupSignalsWeak() {
return Math.abs(this.autofillSignal) < 10;
}

/**
* Hybrid forms can be used for both login and signup
* @returns {boolean}
*/
get isHybrid() {
// When marking for hybrid we also want to ensure other signals are weak
const areOtherSignalsWeak = Math.abs(this.autofillSignal) < 10;

return this.hybridSignal > 0 && areOtherSignalsWeak;
return this.hybridSignal > 0 && this.areLoginOrSignupSignalsWeak();
}

get isLogin() {
Expand Down Expand Up @@ -146,7 +146,7 @@ class FormAnalyzer {
}

const signupRegexToUse = this.matching.getDDGMatcherRegex(shouldBeConservative ? 'conservativeSignupRegex' : 'signupRegex');
const matchesSignup = safeRegexTest(/new.?password/i, string) || safeRegexTest(signupRegexToUse, string);
const matchesSignup = safeRegexTest(/new.?(password|username)/i, string) || safeRegexTest(signupRegexToUse, string);

// In some cases a login match means the login is somewhere else, i.e. when a link points outside
if (shouldFlip) {
Expand Down Expand Up @@ -229,6 +229,15 @@ class FormAnalyzer {
});
}

evaluatePasswordHints() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looping through divs and spans is complex and potentially expensive. What if instead we looked at the whole form textContent and run the regex once per form? We'd need to increase the TEXT_LENGTH_CUTOFF, or have a carveout for this situation, like making safeRegexTest accept a parameter and then have another higher cutoff constant we can use here.

if (this.form.textContent) {
const hasPasswordHints = safeRegexTest(this.matching.getDDGMatcherRegex('passwordHintsRegex'), this.form.textContent, 200);
if (hasPasswordHints) {
this.increaseSignalBy(5, 'Password hints');
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have given a much stronger score to hints, as they seem to be a very clear indicator.

}
}
}

/**
* Function that checks if the element is an external link or a custom web element that
* encapsulates a link.
Expand Down Expand Up @@ -312,6 +321,11 @@ class FormAnalyzer {
// Check page title
this.evaluatePageTitle();

// Evaluate attributes of form's input elements
this.form.querySelectorAll(this.matching.cssSelector('formInputsSelector')).forEach((input) => {
this.evaluateElAttributes(input, 1, true);
});

// Check form attributes
this.evaluateElAttributes(this.form);

Expand All @@ -331,11 +345,16 @@ class FormAnalyzer {
}

// A form with many fields is unlikely to be a login form
const relevantFields = this.form.querySelectorAll(this.matching.cssSelector('genericTextField'));
const relevantFields = this.form.querySelectorAll(this.matching.cssSelector('genericTextInputField'));
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a rename!

if (relevantFields.length >= 4) {
this.increaseSignalBy(relevantFields.length * 1.5, 'many fields: it is probably not a login');
}

// If we can't decide at this point, try reading password hints
if (this.areLoginOrSignupSignalsWeak()) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we see more examples in triages, I might just remove this check. But for now feels safer.

this.evaluatePasswordHints();
}

// If we can't decide at this point, try reading page headings
if (this.autofillSignal === 0) {
this.evaluatePageHeadings();
Expand Down
Loading
Loading