Skip to content

aumkarrb/Image_Deduplication_Algorithm

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

7 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Image_Deduplication_Algorithm

Uses perceptual hashing to identify similar images, even if they've been resized, slightly modified, or saved in different formats (Uses std_image.h header library). Here's how it functions:

1. Perceptual Hashing - Unlike cryptographic hashes (MD5, SHA), perceptual hashes identify visually similar content:

  • Reduce each image to 8x8 grayscale
  • Calculate the average pixel value
  • Create a 64-bit hash where each bit represents if a pixel is above/below average

2. Image Comparison - Images are compared using Hamming distance (number of different bits) between their hashes

Features:

  • Processes common image formats (JPG, PNG, BMP, GIF, TIFF)
  • Configurable similarity threshold
  • Reports duplicate sets with similarity scores

About

Image_Deduplication_Algorithm using Perceptual Hashing and Hamming Distance

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published