planning.md

model building for the batch process.

what do we have?

  • from phishtank, we have a gzip-compressed json file (json.gz). each record in the json contains the phish_id and the url. from the url, we can extract the domain and the path.
  • from phishing.database, we have a txt file inside a tar.gz archive. it contains only urls (no ids). from each url, we can extract the domain and the path.
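
a minimal sketch of reading both sources and splitting each url into domain and path. the function names are made up for illustration. it assumes the phishtank json is a top-level list of objects with `phish_id` and `url` keys, that the phishing.database archive holds one url per line in members ending in `.txt`, and that urls include a scheme (without one, `urlsplit` cannot find the domain). none of this has been checked against the real dumps.

```python
import gzip
import io
import json
import tarfile
from urllib.parse import urlsplit


def split_url(url):
    """Return (domain, path) for a url string.

    The domain is lowercased by urlsplit; it is "" when the url has no scheme.
    The query string and fragment are not part of the returned path.
    """
    parts = urlsplit(url.strip())
    return parts.hostname or "", parts.path


def read_phishtank(path):
    """Yield (phish_id, url) pairs from a gzip-compressed json file.

    Assumption: the json is a top-level list of records.
    """
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        for record in json.load(fh):
            yield record["phish_id"], record["url"]


def read_phishing_database(path):
    """Yield urls from every .txt member of a tar.gz archive.

    Assumption: one url per line; blank lines are skipped.
    """
    with tarfile.open(path, "r:gz") as archive:
        for member in archive.getmembers():
            if not (member.isfile() and member.name.endswith(".txt")):
                continue
            raw = archive.extractfile(member)
            text = io.TextIOWrapper(raw, encoding="utf-8", errors="replace")
            for line in text:
                url = line.strip()
                if url:
                    yield url
```

both readers are generators, so a batch job can stream records and call `split_url` on each url without loading everything into memory first (the phishtank reader still parses the whole json document at once).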

what do we need?