This repository has been archived by the owner on Jan 17, 2023. It is now read-only.
Retrieve and parse web page (remote resource in general) incrementally #19
Labels: optimization (makes the software use fewer resources or run faster)
We only want to extract some information about the URL, and we accept that this information won't be perfect: we'll need to make assumptions and use heuristics to decide where to pull it from.

Because of this, and to avoid retrieving and parsing huge documents, we should ideally retrieve and parse the remote resource incrementally, stopping as soon as we have enough information about it. For example, generated links will always have a maximum length, so if we are asked to generate a link for a resource storing the complete works of Shakespeare, we only need to fetch the first 4K at most and then we are done. A lot of CPU time and network traffic can be saved this way.
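A minimal sketch of the idea in Python, using the stdlib's incremental `html.parser.HTMLParser`. The names (`TitleParser`, `extract_title`) and the 4K cap are illustrative, not an agreed design; the chunk iterator could come from any streaming HTTP client (e.g. iterating over a response body in fixed-size pieces) so that we stop reading from the network once we have what we need:

```python
from html.parser import HTMLParser

MAX_BYTES = 4096  # assumed cap, mirroring the 4K figure above


class TitleParser(HTMLParser):
    """Incremental parser that records the first <title> text it sees."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = None

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_data(self, data):
        # Only keep the first piece of title text we encounter.
        if self.in_title and self.title is None:
            self.title = data.strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False


def extract_title(chunks, max_bytes=MAX_BYTES, encoding="utf-8"):
    """Feed byte chunks to the parser incrementally, stopping as soon as
    a title is found or max_bytes have been consumed. Chunks beyond the
    stopping point are never read, so network traffic is saved too when
    `chunks` is a lazy stream."""
    parser = TitleParser()
    consumed = 0
    for chunk in chunks:
        parser.feed(chunk.decode(encoding, errors="replace"))
        consumed += len(chunk)
        if parser.title is not None or consumed >= max_bytes:
            break
    return parser.title
```

With a streaming HTTP response, passing the body's chunk iterator to `extract_title` means a multi-megabyte page costs at most one read of ~4K before the generator is abandoned and the connection can be closed.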