26 Apr 2020 @ 2:51 PM 

I had to delete ~1,700,000 files in a Magento var/session/ folder

  • the Plesk interface crashed after a few minutes because it didn’t have the memory to list the folder’s contents
  • after SSH-ing into the machine, the classic rm -rf var/session/* also failed after a few minutes with the error: `-sh: /usr/bin/rm: Argument list too long`
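Side note: the error doesn’t come from rm itself but from the shell, which expands var/session/* into one argument per file and then can’t exec rm because the argument list exceeds the kernel limit. A quick way to see that limit on your own machine:

# Print the kernel's limit (in bytes) on the combined size of the arguments
# and environment that can be passed to a single command
getconf ARG_MAX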

So I started looking for other solutions on the wild web and came across this Kinamo post: Efficiently delete a million files on Linux servers, which compared 4 variants:

  • `-sh-4.2$ rm -rf var/session/*` -> -sh: /usr/bin/rm: Argument list too long
  • `find /yourmagicmap/* -type f -mtime +3 -exec rm -f {} \;`
  • `find /yourmagicmap/* -type f -mtime +3 -delete`
  • `-sh-4.2$ rsync -a --delete /tmp/empty/ var/session/`

Details on those variants:

  • rm: deleting millions of files is a no-can-do!
  • find -exec: an option, but slower!
  • find -delete: a fast and easy way to remove loads of files.
  • rsync --delete: without a doubt the quickest! (see the sketch below)
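Putting the winning variant together: a minimal sketch of the rsync trick, using the same /tmp/empty scratch directory shown above (you have to create it first):

# Create an empty directory to use as the sync source
mkdir -p /tmp/empty

# Mirror it onto the target: --delete removes everything in var/session/
# that isn't in /tmp/empty/ (i.e. every file), without ever building
# a giant argument list
rsync -a --delete /tmp/empty/ var/session/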

Besides that post, I found another solution proposed in a Unix StackExchange thread: Faster way to delete large number of files [duplicate], whose answers pointed to another one: Efficiently delete large directory containing thousands of files. The solution was a script that deletes files in batches of 5,000:

#!/bin/bash

# Path to the folder with many files
FOLDER="/path/to/folder/with/many/files"

# Temporary file to store the current batch of file names
FILE_FILENAMES="/tmp/filenames"

# Safety check: an empty FOLDER would make rm target "/"
if [ -z "$FOLDER" ]; then
    echo "Prevented you from deleting everything! Correct your FOLDER variable!"
    exit 1
fi

while true; do
    # ls -f skips sorting, which keeps the listing fast on huge directories
    FILES=$(ls -f1 "$FOLDER" | wc -l)
    if [ "$FILES" -gt 10000 ]; then
        printf "[%s] %s files found, going on with removing\n" "$(date)" "$FILES"
        # Grab lines 3..5002: ls -f also lists . and .., so skip the first two entries
        ls -f1 "$FOLDER" | head -n 5002 | tail -n 5000 > "$FILE_FILENAMES"

        if [ -s "$FILE_FILENAMES" ]; then
            while IFS= read -r FILE; do
                rm "$FOLDER/$FILE"
                sleep 0.005
            done < "$FILE_FILENAMES"
        fi
    else
        printf "[%s] script has finished, almost all files have been deleted\n" "$(date)"
        break
    fi
    sleep 5
done
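If you go the script route, something like this keeps it alive after you log out (the batch-delete.sh name is just an example):

chmod +x batch-delete.sh
# Detach from the terminal so a dropped SSH session doesn't kill a multi-hour delete
nohup ./batch-delete.sh > batch-delete.log 2>&1 &
# Watch the progress messages
tail -f batch-delete.log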

 

Stats for my test:

  • batch-delete script: 5,000 files / 43 seconds -> ~116 files/s
  • rsync: 50,000 files / 6 seconds -> ~8,300 files/s !!

Obviously, rsync is the fastest solution for deleting a huge number of files!
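To count files in a directory this big (for stats like the above), skip plain ls, which sorts the entire listing in memory (likely what made Plesk choke), and use the unsorted variant:

# -f disables sorting, so ls streams the directory instead of buffering
# every name; note the count includes the . and .. entries
ls -f var/session | wc -l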

 


PS. There was another solution on one of the StackExchange threads above, which claimed that a Perl one-liner would be even faster:

perl -e 'for(<*>){((stat)[9]<(unlink))}'

(It globs every entry in the current directory with <*> and unlinks each one; I didn’t get the chance to test it though, because I had already deleted the files.)

Posted By: Teodor Muraru
Last Edit: 26 Apr 2020 @ 03:08 PM
Categories: Linux, Technology