How to Display Filenames Sorted by Number of Characters in File in the Shell

Today I had to cut down the number of characters in a set of template files. The most efficient way to do this is to start with the files that contain the most characters.

Working that out manually would be a real pain. Luckily, this kind of thing is trivial in the shell. After a bit of Googling I came up with this:

for file in path/to/txt/files/*.txt; do charcount=$(cat $file | wc -c); echo "$file $charcount"; done | sort -k 2 -n -r | column -t

This will output something like:

path/to/txt/files/some-file-1.txt                                    665
path/to/txt/files/some-other-file-20.txt                             386
path/to/txt/files/yet-another-file-0.txt                             334
path/to/txt/files/some-file-3.txt                                    154
path/to/txt/files/some-other-file-44.txt                             0
path/to/txt/files/yet-another-file-400.txt                           0

This one-liner is easy to drop into things like Makefiles and package.json files (in a Makefile recipe, write each $ as $$ so that make passes it through to the shell), but you can break it apart as needed.
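
As a rough sketch of what breaking it apart might look like, the same pipeline can be wrapped in a small bash function. The name charcounts and its two arguments are hypothetical and not part of the original command; wc -c < "$file" stands in for cat $file | wc -c, and -n is added so that sort compares the counts as numbers. Alignment with column -t is left to the caller:

```shell
#!/usr/bin/env bash
# Hypothetical helper (name and arguments are assumptions for illustration).
# Usage: charcounts DIR [EXT]
# Prints "<file> <count>" for every DIR/*.EXT, largest count first.
charcounts() {
  local dir=$1 ext=${2:-txt} file count
  for file in "$dir"/*."$ext"; do
    # Skip the literal pattern when no file matches it.
    [ -f "$file" ] || continue
    # Reading from stdin keeps the file name out of wc's output.
    count=$(wc -c < "$file")
    echo "$file $count"
  done | sort -k 2 -n -r   # second column, numeric, descending
}
```

It could then be called as charcounts path/to/txt/files | column -t to get the aligned output shown earlier.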

Here is what each part of the one-liner does:

  • for file in path/to/txt/files/*.txt;
    • This will loop through all *.txt files in the path path/to/txt/files
    • You can change the extension to whatever you want
    • file is the name of the loop variable. You can call it anything you want
  • do charcount=$(cat $file | wc -c)
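    • do marks the start of the loop body and must come before the first command in a for loop
    • charcount=$(cat $file | wc -c): Here we assign the character count of a file to a variable
    • Note that wc -c counts bytes, which equals the character count only for single-byte text such as ASCII. wc -m counts characters according to the current locale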
  • echo "$file $charcount"
    • Here we output the file name ($file), a space, and then the character count of that file ($charcount)
    • If you put this in a file that wraps the command in double quotes, such as package.json, you will need to escape the inner quotes: echo \"$file $charcount\"
  • done: This is the way to terminate a bash for loop
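  • | sort -k 2 -r
    • We pipe the complete output of the for loop to sort
    • -k 2 tells sort to use the second column, which is our character count
    • -r reverses the order, i.e. descending, so the highest count is at the top
    • By default sort compares the key as text, which would place 99 above 665. The -n flag (sort -k 2 -n -r) makes it compare the counts as numbers instead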
  • | column -t
    • This aligns the output into columns so the file names and character counts are evenly spaced and easier to read.
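
One caveat with the pipeline described above: a file name containing spaces pushes the count out of the second column, and wc -c counts bytes rather than characters. Here is a minimal sketch of one way around both, assuming bash. The function name count_first is hypothetical. It prints the count first, separated from the name by a tab, and uses wc -m, which counts characters according to the current locale:

```shell
#!/usr/bin/env bash
# Hypothetical variant (name and arguments are assumptions for illustration).
# Usage: count_first DIR [EXT]
# Prints "<count><TAB><file>" for every DIR/*.EXT, largest count first.
count_first() {
  local dir=$1 ext=${2:-txt} file count
  for file in "$dir"/*."$ext"; do
    # Skip the literal pattern when no file matches it.
    [ -f "$file" ] || continue
    # Some wc implementations pad the number with spaces; strip them.
    count=$(wc -m < "$file" | tr -d '[:space:]')
    printf '%s\t%s\n' "$count" "$file"
  done | sort -n -r   # the count leads each line, so no -k is needed
}
```

Because the count comes first, sort -n -r works on the start of each line and spaces in file names no longer matter. Calling count_first path/to/txt/files prints one line per file, largest first.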
