TODO.mkd

  • in the downloading code, handle 404 errors more gracefully. Record that the given file was not available because of a broken link.
  • add more unit tests for different scenarios of the algorithm and tree state, e.g.:
    • tree with the root only
    • a large tree
    • tree with no missing nodes and without errors
    • root's children are leaves
  • in the unit tests, when mechanize.Browser opens the "/root/2011-07-02" webpage, an "HTTP Error 404: File not found" error sometimes shows up. The log entry looks as follows: WARNING thread=="Thread-95", exception in node "/root/2011-07-12": "Moving to child failed". Detailed message: "HTTP Error 404: File not found".
    • probably a fault of the HTTP file server?
  • ? change the order of nodes saved in the *.xml file: order by name, not by node state?
  • add a command-line option to revisit the ERROR nodes of a tree read from a file (this might be helpful because, for a web crawler, pages that were unavailable before might be available now)
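
A minimal sketch of the first item above (gentler 404 handling), under stated assumptions: it uses the standard library's urllib rather than the project's mechanize-based downloader, and the names `download`, `STATE_OK`, `STATE_BROKEN_LINK` and `STATE_ERROR` are hypothetical, not taken from the code base. The idea is only that a 404 gets recorded as a broken link instead of being treated like any other failure.

```python
import urllib.error
import urllib.request

# Hypothetical node states; the project's real state names may differ.
STATE_OK = "OK"
STATE_BROKEN_LINK = "BROKEN_LINK"
STATE_ERROR = "ERROR"


def download(url, opener=urllib.request.urlopen):
    """Fetch url and return a (state, payload_or_message) pair.

    A 404 is reported as a broken link rather than as a generic error.
    The opener argument exists so tests can pass a fake instead of
    touching the network.
    """
    try:
        with opener(url) as response:
            return STATE_OK, response.read()
    except urllib.error.HTTPError as exc:
        if exc.code == 404:
            # Note that the file was unavailable because the link is broken.
            return STATE_BROKEN_LINK, "file not available (broken link): %s" % url
        return STATE_ERROR, "HTTP Error %d: %s" % (exc.code, exc.reason)
    except urllib.error.URLError as exc:
        # No HTTP response at all (DNS failure, refused connection, ...).
        return STATE_ERROR, "connection problem: %s" % exc.reason
```

Keeping the broken-link case separate from other errors would also let a later "revisit ERROR nodes" run skip links that are known to be dead, if that turns out to be desirable.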

Documentation

  • ? generate source docs with Sphinx instead of epydoc?

  • In the docs/algorithm/index.tex document:
    • make a distinction between the tree as a data structure shared between the threads and the explored tree
    • describe more precisely the example of the upward update of the state (the last figure).