@ma11hew28
Last active January 25, 2022 18:48

Revisions

  1. ma11hew28 revised this gist Dec 15, 2016. 1 changed file with 11 additions and 16 deletions.
    27 changes: 11 additions & 16 deletions find-duplicate-files.rb
    @@ -1,23 +1,18 @@
    +# This Ruby script (regardless of where it's located on the file system) recur-
    +# sively lists all duplicate files in the directory in which it's executed.
    +
     require 'digest/md5'
     
     hash = {}
     
    -Dir.glob("**/*", File::FNM_DOTMATCH).each do |filename|
    -  next if File.directory?(filename)
    -  # puts 'Checking ' + filename
    -
    -  key = Digest::MD5.hexdigest(IO.read(filename)).to_sym
    -  if hash.has_key? key
    -    # puts "same file #{filename}"
    -    hash[key].push filename
    -  else
    -    hash[key] = [filename]
    -  end
    +Dir.glob('**/*', File::FNM_DOTMATCH).each do |f|
    +  next if File.directory?(f)
    +  key = Digest::MD5.hexdigest(IO.read(f)).to_sym
    +  if hash.has_key?(key) then hash[key].push(f) else hash[key] = [f] end
     end
     
    -hash.each_value do |filename_array|
    -  if filename_array.length > 1
    -    puts "=== Identical Files ===\n"
    -    filename_array.each { |filename| puts ' '+filename }
    -  end
    +hash.each_value do |a|
    +  next if a.length == 1
    +  puts '=== Identical Files ==='
    +  a.each { |f| puts "\t" + f }
     end
  2. ma11hew28 created this gist Sep 9, 2010.
    23 changes: 23 additions & 0 deletions find-duplicate-files.rb
    @@ -0,0 +1,23 @@
    +require 'digest/md5'
    +
    +hash = {}
    +
    +Dir.glob("**/*", File::FNM_DOTMATCH).each do |filename|
    +  next if File.directory?(filename)
    +  # puts 'Checking ' + filename
    +
    +  key = Digest::MD5.hexdigest(IO.read(filename)).to_sym
    +  if hash.has_key? key
    +    # puts "same file #{filename}"
    +    hash[key].push filename
    +  else
    +    hash[key] = [filename]
    +  end
    +end
    +
    +hash.each_value do |filename_array|
    +  if filename_array.length > 1
    +    puts "=== Identical Files ===\n"
    +    filename_array.each { |filename| puts ' '+filename }
    +  end
    +end
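
Both revisions read every file fully into memory with IO.read before hashing it. The following is a minimal sketch of one possible variation, not the author's code: it groups files by size first, so only files that share a size with another file are hashed, and it uses Digest::MD5.file from Ruby's standard digest library to hash in chunks. The helper name find_duplicates and its root parameter are hypothetical, and the sketch assumes root contains no glob metacharacters.

```ruby
require 'digest/md5'

# Hypothetical helper (not part of the gist): returns an array of groups,
# each group being an array of paths under `root` with identical contents
# according to their MD5 digests.
def find_duplicates(root = '.')
  # FNM_DOTMATCH makes the glob include dotfiles; File.file? drops directories.
  files = Dir.glob(File.join(root, '**', '*'), File::FNM_DOTMATCH)
             .select { |path| File.file?(path) }

  # Files of different sizes cannot be identical, so keep only files that
  # share their size with at least one other file.
  candidates = files.group_by { |path| File.size(path) }
                    .values
                    .select { |group| group.length > 1 }
                    .flatten

  # Digest::MD5.file reads the file in chunks rather than loading it whole.
  candidates.group_by { |path| Digest::MD5.file(path).hexdigest }
            .values
            .select { |group| group.length > 1 }
end

# Example usage, printing in the same format as the gist:
#   find_duplicates.each do |group|
#     puts '=== Identical Files ==='
#     group.each { |path| puts "\t" + path }
#   end
```

As in the gist, a matching MD5 digest is treated as proof that two files are identical; a byte-by-byte comparison of each group would be needed to rule out collisions.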