Skippia/twin-scanner-cli

Twin scanner CLI

Description | Technical Stack | Features | DX features | Documentation | Pre-requisites | Quick start

Description

  • Finds duplicate files across one or more folders by scanning .txt and/or .torrent files. Depending on the selected mode (readonly: true | false), it either only reports the duplicates or also extracts them into new folders.
  • The repo is implemented with an emphasis on the functional programming paradigm (where possible and appropriate). It has undergone several major refactorings, with only minor changes to functionality. To follow these refactorings, check the branches below, listed in order of increasing functional style:
    1. original: imperative code
    2. refactor/functional-eslint: code that satisfies eslint-plugin-functional
    3. refactor/fp-ts: a large refactoring that rewrites the main codebase with fp-ts

Technical Stack

Features

  • Nested scanning of one or multiple folders to collect information about files
  • Interactive CLI with step-by-step configuration and autocomplete for path selection
  • Scanning of only .txt files, only .torrent files, or both formats at once to find duplicates across multiple folders
  • Readonly mode for listing information about duplicates without extracting them
  • Removal of duplicates not only across folders but also within the same folder, e.g.:
    • For .torrent files: ["cat.torrent", "cat (1).torrent", "cat (19).torrent"] => ["cat.torrent"]
    • For .txt files: identical lines and duplicate lines (by analogy with the torrent file name logic) are removed from the .txt file
  • A custom mapper can be defined between a torrent file name ([rutracker.org].3021606.torrent) and the URL of that torrent stored in a .txt file (https://rutracker.org/forum/viewtopic.php?t=3021606). The underlying assumptions are:
    • text files contain only links from which torrent files can be downloaded;
    • there is a way to map a torrent file name to its download link;
    • a link in a text file is redundant if the torrent file it points to is already stored in one of the user's folders. Such a link is treated as a duplicate and is deleted from the original text file when the program runs in readonly: false mode.
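
The same-folder rule for .torrent names can be illustrated with a small sketch. The helper names `stripCopySuffix` and `dedupeFileNames` are hypothetical and are not taken from the repository. The sketch assumes that duplicates differ only by a trailing " (N)" copy suffix, and it works on names alone without deciding which file on disk to keep.

```typescript
// Hypothetical helper: removes a trailing " (N)" copy suffix placed
// right before the file extension, e.g. "cat (19).torrent" -> "cat.torrent".
const stripCopySuffix = (fileName: string): string =>
  fileName.replace(/ \(\d+\)(?=\.[^.]+$)/, "");

// Hypothetical helper: maps every name to its canonical form and keeps
// only the first occurrence of each canonical name.
const dedupeFileNames = (fileNames: readonly string[]): string[] => {
  const canonical = fileNames.map(stripCopySuffix);
  return canonical.filter((name, index) => canonical.indexOf(name) === index);
};

// Evaluates to ["cat.torrent"], matching the example in the feature list.
const uniqueNames = dedupeFileNames([
  "cat.torrent",
  "cat (1).torrent",
  "cat (19).torrent",
]);
console.log(uniqueNames);
```

The same idea could be applied to the lines of a .txt file by treating each line as a name, although the project's actual line-matching logic may differ.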

DX features

Documentation

  • By default, the initial root folder for searching target folder(s) consists of the first two path segments of the folder that contains the cloned repository. For example, if the repo is cloned to /home/username/projects/twin-scanner-cli, the initial root folder is set to /home/username.
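
As a rough illustration of this rule, the sketch below derives the initial root folder from an absolute path. The function name `initialRootFolder` and its `depth` parameter are hypothetical. The sketch assumes POSIX-style paths, in line with the Linux-based prerequisite, and is not the repository's own code.

```typescript
// Hypothetical helper: keeps the first `depth` segments of an absolute
// POSIX path. The default depth of 2 mirrors the rule described above.
const initialRootFolder = (repoDir: string, depth: number = 2): string => {
  const separator = "/";
  // Splitting an absolute path yields a leading empty string; drop empties.
  const segments = repoDir.split(separator).filter((s) => s.length > 0);
  return separator + segments.slice(0, depth).join(separator);
};

// Evaluates to "/home/username" for the path used in the example above.
console.log(initialRootFolder("/home/username/projects/twin-scanner-cli"));
```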

Demo

  1. Configuration via the CLI and its output: Demo

  2. Result (the image shows the file structure before and after running the CLI): Demo

Example of manual configuration

  • Setting VITE_APP_TORRENT_URL=https://rutracker.org/forum/viewtopic.php means that:

    • the line https://rutracker.org/forum/viewtopic.php?t=3021606 in a .txt file and the torrent file [rutracker.org].3021606.torrent are considered the same during deduplication
  • To override the default mapper between a torrent file and its URL, change the extractTorrentFileNameFromURL and convertTorrentFilenameToURL functions and rebuild the app.
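
To make the mapping concrete, here is a minimal sketch of what such a pair of mapper functions might look like. Only the two function names come from this README. Their signatures, their bodies, the `TORRENT_URL` constant (standing in for VITE_APP_TORRENT_URL) and the `removeDuplicateLinks` helper are assumptions made for illustration and may differ from the repository's implementation.

```typescript
// Assumption: stands in for the VITE_APP_TORRENT_URL value from .env.
const TORRENT_URL = "https://rutracker.org/forum/viewtopic.php";

// Assumed body: "https://rutracker.org/forum/viewtopic.php?t=3021606"
// becomes "[rutracker.org].3021606.torrent"; other strings give undefined.
const extractTorrentFileNameFromURL = (url: string): string | undefined => {
  const match = /^https?:\/\/([^/]+)\/.*[?&]t=(\d+)/.exec(url);
  return match ? "[" + match[1] + "]." + match[2] + ".torrent" : undefined;
};

// Assumed body: the inverse mapping, from file name back to the link.
const convertTorrentFilenameToURL = (
  fileName: string,
  baseUrl: string = TORRENT_URL
): string | undefined => {
  const match = /^\[[^\]]+\]\.(\d+)\.torrent$/.exec(fileName);
  return match ? baseUrl + "?t=" + match[1] : undefined;
};

// Hypothetical helper: drops .txt lines whose torrent file already exists
// in a scanned folder (the behaviour described for readonly: false mode).
const removeDuplicateLinks = (
  lines: readonly string[],
  torrentFiles: readonly string[]
): string[] =>
  lines.filter((line) => {
    const name = extractTorrentFileNameFromURL(line.trim());
    return name === undefined || torrentFiles.indexOf(name) === -1;
  });

console.log(
  extractTorrentFileNameFromURL(TORRENT_URL + "?t=3021606"),
  convertTorrentFilenameToURL("[rutracker.org].3021606.torrent")
);
```

A tracker with a different naming scheme would need different regular expressions in both functions, which is why the mapper is left open to customisation.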

Dependency graphs

  • Top-level
    • SVG
  • All code
    • All code

Pre-requisites

  • Linux-based OS
  • Node.js (tested on v20.15.1)
  • pnpm

Quick start

  1. Clone the repository onto the same disk as the folder(s) containing the duplicate files. Use one of the two commands: the first clones the full history, the second only the latest commit.
     git clone https://github.com/Skippia/twin-scanner-cli.git
     git clone --depth 1 https://github.com/Skippia/twin-scanner-cli.git
  2. Install dependencies:
     cd ./twin-scanner-cli && pnpm i
  3. Set the env variable (URL) used to map torrent names to torrent URLs in .txt files:
    • Rename .env.example to .env
    • Update the env variable(s)
  4. Build the project:
     npm run build
  5. Run the project:
     npm run start:prod
