-
Notifications
You must be signed in to change notification settings - Fork 202
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal: reduce the default allowed tags #350
Comments
I'd rather not mess with the default set for backward compatibility reasons. But perhaps it would be a good idea to provide static readymade sets for different use cases like https://github.com/rgrove/sanitize#configuration (restricted, basic, relaxed). What do you think? |
It could affect backward compatibility so it would have to be done for a major version, such as HtmlSanitizer v8.0. But I do think it should eventually be done, because in my opinion the defaults are too liberal to the extent that it would be silly to use the defaults. Static predefined sets could be an approach, another approach would be just to offer predefined sets as suggestions in the wiki. Either way, I do feel the defaults ought to be changed for an upcoming major version where breaking changes are to be expected. |
I agree in that we could change the defaults and up the major version. I'm unsure though what exactly we should use as guidance when selecting the default set. The original motivation was to eliminate only those elements that carry XSS potential. For example, https://github.com/rgrove/sanitize allows |
Right now it is possible to do some undesirable things such as using I think HtmlSanatizer is commonly used for comments, forum posts, blog posts and to author content in a CMS. I am not even sure there ought to be a default set at all. If a default set should exist, I would think it should be heavily restricted to perhaps just semantic elements for user-submitted comment; this would include |
I think any decision about changing the default needs to be based on deciding what is in scope for HtmlSanitizer and what isn't. Arbitrarily deciding what constitutes valid HTML feels out of scope, to me; as @mganss says, the main motivator for HtmlSanitizer is, well, sanitizing HTML. I do like the idea of having static predefined sets that can be passed to the various options. |
Good point @tiesont! Maybe move the default to a static class called var sanitizer = new HtmlSanitizer();
sanitizer.AllowedTags = Presets.AllXssSafeTags;
sanitizer.AllowedAttributes = Presets.AllXssSafeAttributes;
sanitizer.AllowedCssProperties = Presets.AllXssSafeCssProperties;
sanitizer.AllowedAtRules = Presets.AllXssSafeAtRules;
sanitizer.AllowedSchemes = Presets.AllXssSafeSchemes;
sanitizer.UriAttributes = Presets.AllXssSafeUriAttributes;
var html = "<p>Hello world!</p>";
var sanitized = sanitizer.Sanitize(html, "https://www.example.com"); Though this would kind of break the expectation that a class when newed up should be fully instantiated and ready to be used. But maybe it is fully initialized since the |
You'd probably want to follow the pattern used by a lot of .NET Core utilities - rather than pass in multiple parameters, pass in a configuration/options object (maybe even allow a factory method). The default constructor would just use the existing defaults. |
So something like: var options = new HtmlSanitizerOptions
{
AllowedTags = Presets.AllXssSafeTags;
AllowedAttributes = Presets.AllXssSafeAttributes;
AllowedCssProperties = Presets.AllXssSafeCssProperties;
AllowedAtRules = Presets.AllXssSafeAtRules;
AllowedSchemes = Presets.AllXssSafeSchemes;
UriAttributes = Presets.AllXssSafeUriAttributes;
};
var sanitizer = new HtmlSanitizer(options); |
Exactly. |
@mganss Thoughts? |
@vanillajonathan I like the idea. Would you like to implement this? Also I think it would be nice if you had readymade instances of the var s = new HtmlSanitizer(Presets.Default);
var s2 = new HtmlSanitizer(Presets.Relaxed);
var s3 = new HtmlSanitizer(Presets.Strict); |
Then you could pass in the presets to the var options = new HtmlSanitizerOptions(Presets.Strict);
var sanitizer = new HtmlSanitizer(options); This is the way used by JsonSerializerOptions. See #359. You can merge it already, it is not a breaking change, it just adds a constructor while leaving the existing constructor in place. The existing constructor could be decorated with the |
Would it make any sense to also provide a constructor which accepts a You might use it like so: var sanitizer = new HtmlSanitizer(() => {
var options = new HtmlSanitizerOptions(Presets.Strict);
// do some other stuff with options as necessary
// --> assuming actions that can't be called with initializer syntax
return options;
}); This particular example doesn't really have any benefit over just instantiating the options object and then passing it as a parameter, but I'd assume it would allow a factory delegate of some sort to be passed, too? |
Sounds reasonable. It would be an I've been hesitating on the presets. It would be singular not plural because plural are used for bitflags enum while singular for normal enums. Right now we only have the default preset so the enum would just have one variant. It could be called With pull request #370 the constructor taking arguments is removed and a new parameterless constructor is introduced which configures the If taking in an enum for presets as input parameter then the class would grow with a new if condition for every variant of the enum. Another approach would to use a factory to keep all that |
@vanillajonathan I think I'm a little confused - what are you referring to, with respect to enums? |
In the mentioned The Microsoft .NET Framework Design Guidelines guidelines state:
So this enum would be named something like Currently HtmlSanitizer doesn't have any presets, only a default configuration, so it would look something like: public enum Preset
{
Default
} But I am not sure we even should offer any enum member called public enum Preset
{
Relaxed
} If the public class HtmlSanitizerOptions
{
public HtmlSanitizerOptions { }
public HtmlSanitizerOptions(Preset preset)
{
if (preset == Preset.Relaxed)
{
AllowedTags = HtmlSanitizerDefaults.DefaultAllowedTags;
AllowedAttributes = HtmlSanitizerDefaults.DefaultAllowedAttributes;
AllowedCssProperties = HtmlSanitizerDefaults.DefaultAllowedCssProperties;
AllowedAtRules = HtmlSanitizerDefaults.DefaultAllowedAtRules;
AllowedSchemes = HtmlSanitizerDefaults.DefaultAllowedSchemes;
UriAttributes = HtmlSanitizerDefaults.DefaultUriAttributes;
}
}
} If more enum members were added to the Another approach could be a factory, such as: public static class HtmlSanitizerOptionsFactory
{
public static HtmlSanitizerOptions CreateRelaxed()
{
return new HtmlSanitizerOptions
{
AllowedTags = HtmlSanitizerDefaults.DefaultAllowedTags;
AllowedAttributes = HtmlSanitizerDefaults.DefaultAllowedAttributes;
AllowedCssProperties = HtmlSanitizerDefaults.DefaultAllowedCssProperties;
AllowedAtRules = HtmlSanitizerDefaults.DefaultAllowedAtRules;
AllowedSchemes = HtmlSanitizerDefaults.DefaultAllowedSchemes;
UriAttributes = HtmlSanitizerDefaults.DefaultUriAttributes;
}
}
} |
@vanillajonathan Ah, the enum is for the options class. That makes sense. I was trying to figure out where the HtmlSanitizer class would use an enum. I'd think your enums, a switch, and your factory proposal should keep the constructor relatively concise. |
@tiesont Re the @vanillajonathan I favor the second approach for the reasons you have mentioned. Why not have static members like so? public class HtmlSanitizerOptions
{
public static readonly HtmlSanitizerOptions Relaxed = CreateRelaxed();
private static HtmlSanitizerOptions CreateRelaxed()
{
// ...
}
} |
Well the enum could be on the But I was thinking of ditching the enum altogether and instead using a separate factory class with a factory method for each preset. This would avoid a long method with many if/else statements or switch. @mganss Yeah static members on the |
Perhaps the presets should be |
Remove all legacy/deprecated tags such as
acronym
,big
,center
,dir
,keygen
,menuitem
andstrike
.Remove non-semantic presentational HTML4 tags such as
b
,i
,u
andtt
.Remove tags dealing with forms such as
button
,datalist
,fieldset
,form
,input
,keygen
,label
,legend
,optgroup
,option
,output
andselect
. Remove UI tags such asmeter
,progress
.Remove the
main
tag just as thehtml
andbody
tags aren't allowed.The text was updated successfully, but these errors were encountered: