@Demeter
Forked from fweez/ben.py
Created January 10, 2012 08:37

Revisions

  1. @fweez fweez created this gist Jun 27, 2011.
    32 changes: 32 additions & 0 deletions ben.py
    @@ -0,0 +1,32 @@
    #!/usr/bin/env python

    import os
    import subprocess

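    # First, build up the results file. Each line holds a byte count
    # (from `wc -c`) followed by a file name, for example:
    # 110 vidioc-g-dv-preset.xml
    # `-type f` keeps directories out of the results, and `\\;` avoids an
    # invalid escape sequence in the Python string.
    linecount_cmd = "rm -f results; for i in *; do find $i -type f -execdir wc -c '{}' \\; " + \
    ">> results; done;"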
    os.system(linecount_cmd)

    # Then figure out how many are in each bucket (the byte count's initial digit).
    # `jot 9` prints 1 through 9 on BSD/macOS; on GNU/Linux use `seq 9` instead.
    bucket_cmd = 'for i in `jot 9`; do egrep "^[ ]*$i" results| ' + \
    'wc -l; done > counts'
    os.system(bucket_cmd)

    # And how many total files there are...
    total_cmd = 'wc -l results'
    p = subprocess.Popen(total_cmd, stdout=subprocess.PIPE, shell=True)
    (total, _) = p.communicate()

    total = int(total.split()[0])

    # Print each bucket's share of the total as a percentage.
    with open('counts', 'r') as counts_file:
        for i, count in enumerate(counts_file):
            print('"%d": %0.5f,' % (i + 1, 100 * (float(count) / total)))

    print("Record count: %d" % total)

    os.system('echo "biggest:" && sort --general-numeric-sort -b results '
    '| tail -n 1')
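
The script above shells out to `find`, `wc`, `egrep` and `jot`, so it depends on which versions of those tools are installed. As a rough sketch only, the same leading-digit tally could be done in pure Python. The function name `leading_digit_percentages` is hypothetical and not part of the original gist. It assumes, as the script does, that zero-byte files count toward the total but fall into no bucket.

```python
import os
from collections import Counter


def leading_digit_percentages(root):
    """Return ({digit: percent}, record_count) for file sizes under root.

    Hypothetical helper, not from the original gist: it walks `root`
    recursively and buckets each regular file by the first digit of its
    size in bytes.
    """
    counts = Counter()
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path):
                continue  # skip broken symlinks and other non-regular entries
            size = os.path.getsize(path)
            total += 1  # empty files still count toward the total
            if size > 0:
                counts[int(str(size)[0])] += 1
    if total == 0:
        return {}, 0
    return {d: 100.0 * counts[d] / total for d in range(1, 10)}, total
```

Calling `leading_digit_percentages(".")` and printing each entry with the `'"%d": %0.5f,'` format would give output in the same shape as the script's, without the intermediate `results` and `counts` files.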