# this howto was tested to work with
cat /etc/os-release | grep PRETTY
PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
duplicate files are a waste of disk space.
every system suffers catastrophic failures, slowdowns and program crashes when RAM or disk space runs out 😀
BUT: under no circumstances shall a program be designed in a way that allows accidents that delete ALL files 😀 (without safeguards like: “you will delete all files under this folder?” “are you sure?” “are you really sure?”)
sometimes files are stored in a certain folder for a reason.
there are programs that allow:
1. finding duplicate files
2. then deleting one copy
3. then creating a link to the remaining copy
= disk space is saved, and all files are still accessible via their folders
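steps 2 and 3 above can be sketched with standard tools. this is a minimal sketch, not how any of the programs below work internally: the function name replace_with_hardlink is made up for illustration, and it assumes bash plus GNU coreutils (cmp, ln). it refuses to do anything unless both files are byte-identical, which is the kind of safeguard asked for above.

```shell
#!/bin/bash
# replace_with_hardlink KEEP DUP
# hypothetical helper (not part of any tool mentioned in this howto):
# turns DUP into a hardlink to KEEP, but only if both have identical content.
replace_with_hardlink() {
    local keep=$1 dup=$2
    # both must be existing regular files
    [ -f "$keep" ] && [ -f "$dup" ] || return 1
    # already two names for the same inode: nothing to do
    if [ "$keep" -ef "$dup" ]; then return 0; fi
    # safeguard: refuse to touch files whose content differs
    cmp -s -- "$keep" "$dup" || return 1
    # steps 2 + 3: ln -f replaces the name DUP with a link to KEEP
    ln -f -- "$keep" "$dup"
}
```

usage would look like `replace_with_hardlink ./a/file.iso ./b/file.iso`. hardlinks only work within one filesystem, and after linking, editing one name changes the content seen under the other name too.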
hardlink is a tool which replaces copies of a file with hardlinks, therefore saving space.
# start a test dry-run on the current directory
hardlink -v --dry-run .
by default it also searches all subdirectories of the current directory:
# do it for real
hardlink -v .
# checking the result, nice :D it works! :D
hardlink -v --dry-run .
Mode: dry-run
Files: 14
Linked: 0 files
Compared: 0 xattrs
Compared: 0 files
Saved: 0 bytes
Duration: 0.00 seconds
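another way to check the result is to look at link counts and inode numbers directly. a small sketch, assuming GNU find and GNU stat (as shipped with Debian); the function name show_links is made up for illustration:

```shell
#!/bin/bash
# show_links DIR
# prints "linkcount inode name" for every regular file below DIR that has
# more than one name (link count > 1), sorted by inode number so that
# names sharing the same data end up next to each other.
show_links() {
    find "$1" -type f -links +1 -exec stat -c '%h %i %n' {} + | sort -k2,2n
}
```

files that show the same inode number are the same data on disk and only use the space once.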
WARNING: jdupes IS NOT a drop-in compatible replacement for fdupes!
identify and delete or link duplicate files
jdupes -m .
Scanning: 7 files, 1 items (in 1 specified)
6 duplicate files (in 1 sets), occupying 6 MB
-L --linkhard replace all duplicate files with hardlinks to the first file in each set of duplicates
fdupes: identifies duplicate files within the given directories (fdupes.manpage.txt)
su - root; apt update; apt install fdupes;
-H --hardlinks normally, when two or more files point to the same disk area they are treated as non-duplicates; this option will change this behavior
fdupes -r -m .
8 duplicate files (in 1 sets), occupying 8.4 megabytes
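to get a feeling for what such a duplicate finder does, the idea can be approximated with standard tools: checksum every file, sort by checksum, and print only the lines whose checksum occurs more than once. this is a rough sketch and not the actual fdupes implementation; it assumes GNU coreutils (sha256sum, uniq with -w and --all-repeated), and the function name list_duplicates is made up for illustration.

```shell
#!/bin/bash
# list_duplicates DIR
# prints "sha256  filename" for every file below DIR whose content also
# exists under another name; groups are separated by an empty line.
list_duplicates() {
    find "$1" -type f -exec sha256sum {} + \
        | sort \
        | uniq -w64 --all-repeated=separate   # compare only the 64 hex digits of the digest
}
```

this only lists files and deletes nothing. it reads every file completely, so it is slower than tools that compare file sizes first, and hardlinked names show up as duplicates of each other.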
su - root; apt update; apt install rdfind;
# dry run (no file is removed)
rdfind -dryrun true ./search/in/this/folder
# WARNING! THIS REMOVES FILES! MAKE BACKUP!
rdfind -deleteduplicates true ./search/in/this/folder
su - root; apt-get update; apt-get install duff; # install duff
Shows normal output, with a header before each cluster of duplicate files, in this case using recursive search (-r) in the folder comics:
duff -r comics
2 files in cluster 1 (43935 bytes, digest ea1a856854c166ebfc95ff96735ae3d03dd551a2)
comics/Nemi/n102.png
comics/Nemi/n58.png
3 files in cluster 2 (32846 bytes, digest 00c819053a711a2f216a94f2a11a202e5bc604aa)
comics/Nemi/n386.png
comics/Nemi/n491.png
comics/Nemi/n512.png
2 files in cluster 3 (26596 bytes, digest b26a8fd15102adbb697cfc6d92ae57893afe1393)
comics/Nemi/n389.png
comics/Nemi/n465.png
2 files in cluster 4 (30332 bytes, digest 11ff80677c85005a5ff3e12199c010bfe3dc2608)
comics/Nemi/n380.png
comics/Nemi/n451.png
The header can be customized (with the -f flag), for example outputting only the number of files that follow:
duff -r -f '%n' comics
2
comics/Nemi/n102.png
comics/Nemi/n58.png
3
comics/Nemi/n386.png
comics/Nemi/n491.png
comics/Nemi/n512.png
2
comics/Nemi/n389.png
comics/Nemi/n465.png
2
comics/Nemi/n380.png
comics/Nemi/n451.png
Duff can report all but one file from each cluster of duplicates (with the -e flag). This can be used in combination with, for example, rm to remove duplicates, but should only be done if you don’t care which duplicates are removed.
duff -re comics
comics/Nemi/n58.png
comics/Nemi/n491.png
comics/Nemi/n512.png
comics/Nemi/n465.png
comics/Nemi/n451.png
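instead of piping such a list straight into rm, a small wrapper with a dry-run mode makes accidents less likely. a minimal sketch, assuming bash; the function name remove_listed and its --dry-run switch are made up for illustration and are not part of duff:

```shell
#!/bin/bash
# remove_listed [--dry-run]
# reads one file name per line from stdin and deletes each listed file;
# with --dry-run it only prints what would be deleted.
# assumption: file names contain no newline characters.
remove_listed() {
    local dry_run=0 file
    if [ "${1:-}" = "--dry-run" ]; then dry_run=1; fi
    while IFS= read -r file; do
        [ -f "$file" ] || continue          # skip empty lines and missing files
        if [ "$dry_run" -eq 1 ]; then
            printf 'would remove: %s\n' "$file"
        else
            rm -- "$file"
        fi
    done
}
```

a cautious run would be `duff -re comics | remove_listed --dry-run` first, then the same command without --dry-run once the list looks right (and a backup exists).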
czkawka is the successor to fslint, rewritten in Rust (fslint is no longer in the default Debian repo, for whatever reason)
what is neat about czkawka:
- it searches a directory for duplicate files and lists the biggest files first
- the terminal version is sufficient (imho)
- on Debian 11: was unable to install the gui (yet)
- plz prepare for a lengthy install that involves downloading a lot of software and compiling it
# as default user
curl --proto '=https' --tlsv1.2 https://sh.rustup.rs -sSf | sh
# check rust is installed
rustc --version
rustc 1.64.0 (a55dd71d5 2022-09-19)
# warning! THIS WILL DOWNLOAD AND COMPILE A LOT!
cargo install czkawka_cli
# run it
czkawka_cli dup --directories /where/to/search/for/duplicates | less
# if the gui was required
# become root
su - root
apt update
apt install software-properties-common ffmpeg
apt install libgdk-pixbuf-2.0-dev libghc-pango-dev libgraphene-1.0-dev librust-pango-sys-dev libglib2.0-dev cairo-dev libcairo2-dev librust-pango-sys-dev
# Ctrl+D (logoff root)
cargo install cairo-dev
# can try to install gui, but won't work
cargo install czkawka_gui
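if compiling czkawka is not an option, the "biggest files first" view can be roughly approximated with standard tools. this sketch only sorts files by size and does not detect duplicates; it assumes GNU find (-printf), and the function name largest_files is made up for illustration:

```shell
#!/bin/bash
# largest_files DIR [COUNT]
# prints "size-in-bytes<TAB>path" for the COUNT biggest regular files
# below DIR, biggest first (COUNT defaults to 10).
largest_files() {
    find "$1" -type f -printf '%s\t%p\n' | sort -rn | head -n "${2:-10}"
}
```

big files are where duplicates waste the most space, so this list is a reasonable place to start looking.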