I have a folder that I've been throwing stuff into for a looooong time. It's full of cruft and duplicated data. Some of that data also lives in my current setup, which is well organized. But there are still some pearls in there that I don't want to delete and need to fish out of the mess of millions of loose files.
This program will take directory A (the old rusty backup) and directory B (the sparkling current setup where everything is neat and organized). It will list all the files that are in A and also in B. After some rigorous testing, I hope to actually let it delete the duplicates from dir A.
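For the curious, the core idea looks roughly like this Python sketch (not DelADupe's actual code, and not necessarily its language): index dir B by content hash, then walk dir A and flag every file whose content also exists somewhere in B. The function names and the `old-backup`/`current-setup` paths are made up for illustration, and SHA-256 hashing is just my assumption about how "same file" gets decided.

```python
import hashlib
from pathlib import Path

def file_hash(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file's contents in chunks so huge files don't blow up memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def duplicates_in_a(dir_a: Path, dir_b: Path) -> list[Path]:
    """Return files under dir_a whose contents also exist somewhere under dir_b."""
    hashes_in_b = {file_hash(p) for p in dir_b.rglob("*") if p.is_file()}
    return [p for p in dir_a.rglob("*") if p.is_file() and file_hash(p) in hashes_in_b]

if __name__ == "__main__":
    for dup in duplicates_in_a(Path("old-backup"), Path("current-setup")):
        print(dup)  # dry run: just list the duplicates, never delete anything
```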
DelADupe takes 2 lists of folders:
keep - the folders that will be considered for deduplication, but no files will be deleted from them.
purge - the folders that files will be deleted from.
If a file exists multiple times, but only inside keep folders, all its instances will be kept.
If a file exists multiple times, but only inside purge folders, all instances but one will be deleted (ordering has not been considered as of the time of writing, so which one survives should be considered random).
If a file exists multiple times in both keep and purge folders, then all the ones in keep will be kept, and all the ones in purge will be... well... purged.
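In code, those three rules boil down to something like the sketch below. It assumes the files have already been grouped by identical content (e.g. by hash, as in the earlier sketch) and that all paths are absolute; the names `files_to_delete` and `under` are mine, not DelADupe's.

```python
from pathlib import Path

def files_to_delete(groups: list[list[Path]],
                    keep_roots: list[Path],
                    purge_roots: list[Path]) -> list[Path]:
    """Apply the keep/purge rules to groups of byte-identical files."""
    def under(path: Path, roots: list[Path]) -> bool:
        # assumes every path has already been resolved to absolute form
        return any(root in path.parents for root in roots)

    doomed: list[Path] = []
    for group in groups:                 # each group holds identical files
        keep = [p for p in group if under(p, keep_roots)]
        purge = [p for p in group if under(p, purge_roots)]
        if keep:
            doomed.extend(purge)         # a keep copy exists: every purge copy goes
        elif len(purge) > 1:
            doomed.extend(purge[1:])     # purge-only: keep one arbitrary copy
        # keep-only groups (or a lone purge copy): nothing gets deleted
    return doomed
```

The purge-only branch keeps whichever copy happens to come first in the group, which matches the "should be considered random" caveat above.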
Fdupes is pretty cool, but it deletes all duplicates (unless you want to confirm them one by one), and it is far from parallelized. I don't want to delete any duplicates in dir B, though. I already ran fdupes on the messy directory, and that reduced the number of files and the space used by a lot, but there are still a lot of files in there that are unique within that directory yet also exist in my current setup.
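Since hashing the files is where most of the time goes, "parallelized" in my head mostly means hashing many files at once. A rough sketch of that idea with a process pool follows; this is an assumption for illustration, not how DelADupe or fdupes actually does it.

```python
import hashlib
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path

def file_hash(path: Path) -> tuple[Path, str]:
    """Hash one file; returns (path, digest) so results can be collected into a dict."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return path, h.hexdigest()

def hash_tree(root: Path) -> dict[Path, str]:
    """Hash every file under root, spreading the work across CPU cores."""
    files = [p for p in root.rglob("*") if p.is_file()]
    with ProcessPoolExecutor() as pool:   # defaults to one worker per core
        return dict(pool.map(file_hash, files))
```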
No. The objective is to eventually be able to comb through all the cruft that has been accumulating in the source directory and get rid of it. Or at least make it manageable.