Skip to content
This repository has been archived by the owner on Apr 30, 2021. It is now read-only.

This commit adds a check to limit the amount of html code you will evaluate #43

Closed
wants to merge 1 commit into from
Closed

This commit adds a check to limit the amount of html code you will evaluate #43

wants to merge 1 commit into from

Conversation

tassoman
Copy link
Contributor

@tassoman tassoman commented Feb 7, 2014

This pull request is related to issue #42

// Overridable configuration,
if(false == defined('AC_MAX_VALIDATION_SIZE'))
{
define('AC_MAX_VALIDATION_SIZE', 256*1024);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder how you come up with this maximum size? Every server has its exclusive available memory size at different moments. Could 256*1024 be too large or small in different cases?

This is the chunk of code I put in before for memory calculation: https://github.com/atutor/AChecker/blob/master/checker/index.php#L196-L200. Apparently, this calculation didn't cover all scenarios since we still run into the memory issue sometimes. :) But this code illustrates my thoughts: due to the variability of the available memory, rather than relying on a human to blindly guess how much space is needed, I did some monitoring which ended up with a formula that 1K of input html roughly requires 160K memory to process it (https://github.com/atutor/AChecker/blob/master/include/classes/Utility.class.php#L198). However, as what the reality proves, this is not true all the time since the number of html tags in this 1K and how simple html dom library handles those tags are also factors for needed memory.

Your thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a first time I wondered the error could be surpassing the max_post_size ini variable. But.. Usually it's abt 8M so, you could easily expect up to 8Mbs of html stuff into the text area.
I bet any dom parser is unable to parse so much code. So I ended up with an arbitrary decision (256KB) because >90% of web pages should be smaller.
I'm aware of your try to measure available memory during all the process but as I feel it's run at an higher level and will not work if stuffs broke before so, although measurement is correct, i feel it's run too late in the flow and doesn't triggers $has_enough_memory.

As you know, generally, limiting and not trusting user input is always a good practice for integrity of the system.
That's why I wouldn't parse unconditionally any size of input they throw but set a reasonable limit that could fit majority of web pages.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The input html could be restricted to be less than 256K but it has nothing to do with the memory problem. The message for this checking of course should not use "NO_ENOUGH_MEMOEY".

Btw, the memory checking code with hasEnoughMemory() is at the start before any validation starts. Curious why would it be too late in the flow?

@tassoman
Copy link
Contributor Author

tassoman commented Feb 7, 2014

I agree with you that 'NO_ENOUGH_MEMORY' could be not exactly but doesn't needs any new SQL statement that would need the translation of another new string.

Forgive me when I said "late" but in the case of checker/index.php that is procedural code, it's really run 'before' than validate() method. So I would mean 'sooner'.

You can see here in the code: you check for memory availability before, then run validation() method, that eats all the 🍰 memory that in case of too much input by the user it broke.

@cindyli
Copy link
Contributor

cindyli commented Jun 19, 2018

Closed due to inactivity.

@cindyli cindyli closed this Jun 19, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants