• Posts
  • RSS
  • ◂◂RSS
  • Contact

  • Sorting mixed lists of numbers and strings

    June 26th, 2009
    programming, tech
    Imagine you have this list:
    fname_0006.v0_word 2
    fname_0007.v0_word 12
    fname_0001.v0_word 15
    fname_0002.v0_word 23
    fname_0003.v0_word 5
    fname_0003.v0_word 7
    fname_0005.v0_word 8
    fname_0006.v0_word 9
    fname_0007.v0_word 11
    fname_0005.v0_word 24
    
    Imagine further that you want to sort it. Unfortunately, I can't get GNU sort to let me specify which fields are numeric and which are not. That is, I can do:
    $ cat file.txt | sort
    fname_0001.v0_word 15
    fname_0002.v0_word 23
    fname_0003.v0_word 5
    fname_0003.v0_word 7
    fname_0005.v0_word 24
    fname_0005.v0_word 8
    fname_0006.v0_word 2
    fname_0006.v0_word 9
    fname_0007.v0_word 11
    fname_0007.v0_word 12
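
    The 24-before-8 ordering above comes from comparing the second field as text: "2" sorts before "8", so "24" does too. A minimal Python sketch of the same effect, using a few of the counts from the list (the variable names are illustrative only):

```python
# Counts taken from the second column of the example list, kept as text.
counts = ["24", "8", "2", "9", "11", "12"]

# Text comparison goes character by character, so "24" lands before "8".
as_text = sorted(counts)

# Converting each item to int for the comparison orders them by value.
as_numbers = sorted(counts, key=int)

print(as_text)
print(as_numbers)
```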
    
    Or I can do:
    $ cat file.txt | sort -n
    fname_0001.v0_word 15
    fname_0002.v0_word 23
    fname_0003.v0_word 5
    fname_0003.v0_word 7
    fname_0005.v0_word 24
    fname_0005.v0_word 8
    fname_0006.v0_word 2
    fname_0006.v0_word 9
    fname_0007.v0_word 11
    fname_0007.v0_word 12
    
    Or I can do:
    $ cat file.txt | sort -n -k1,1 -k2,2
    fname_0006.v0_word 2
    fname_0003.v0_word 5
    fname_0003.v0_word 7
    fname_0005.v0_word 8
    fname_0006.v0_word 9
    fname_0007.v0_word 11
    fname_0007.v0_word 12
    fname_0001.v0_word 15
    fname_0002.v0_word 23
    fname_0005.v0_word 24
    
    You might think this would work:
    $ cat file.txt | sort -k1,1 -kn2,2
    fname_0006.v0_word 2
    fname_0003.v0_word 5
    fname_0003.v0_word 7
    fname_0005.v0_word 8
    fname_0006.v0_word 9
    fname_0007.v0_word 11
    fname_0007.v0_word 12
    fname_0001.v0_word 15
    fname_0002.v0_word 23
    fname_0005.v0_word 24
    
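    But nothing seems to make it do the right thing. So I abandoned sort for Python:
    $ cat simple_sorter.py
    import fileinput
    
    def tidy(x):
        # Treat a field as a number when it parses as one; otherwise keep the string.
        try:
            return int(x)
        except ValueError:
            return x
    
    line_bits = []
    
    # Read from the files named on the command line, or from stdin if there are none.
    for line in fileinput.input():
        line_bits.append([tidy(field) for field in line.split()])
    
    # Lists compare element by element, so each column sorts by its own type.
    for bits in sorted(line_bits):
        print(" ".join(str(bit) for bit in bits))
    
    $ cat file.txt | python3 simple_sorter.py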
    fname_0001.v0_word 15
    fname_0002.v0_word 23
    fname_0003.v0_word 5
    fname_0003.v0_word 7
    fname_0005.v0_word 8
    fname_0005.v0_word 24
    fname_0006.v0_word 2
    fname_0006.v0_word 9
    fname_0007.v0_word 11
    fname_0007.v0_word 12
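
    The script works because Python compares lists element by element: the file name decides the order, and the number only breaks ties. The same two-part key can be written out explicitly. This is a sketch that assumes every line has exactly two whitespace-separated fields; `sort_key` and `sort_lines` are names made up for the example.

```python
def sort_key(line):
    # First field is compared as text, second field as an integer.
    name, count = line.split()
    return (name, int(count))


def sort_lines(lines):
    # sorted() returns a new list and leaves the input untouched.
    return sorted(lines, key=sort_key)


# The list from the top of the post, in its original order.
sample = [
    "fname_0006.v0_word 2",
    "fname_0007.v0_word 12",
    "fname_0001.v0_word 15",
    "fname_0002.v0_word 23",
    "fname_0003.v0_word 5",
    "fname_0003.v0_word 7",
    "fname_0005.v0_word 8",
    "fname_0006.v0_word 9",
    "fname_0007.v0_word 11",
    "fname_0005.v0_word 24",
]

for line in sort_lines(sample):
    print(line)
```

    For comparison, POSIX sort attaches a per-key type modifier after the field position, so the same two-part key would be spelled `sort -k1,1 -k2,2n`.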
    

    Update 2013-08-22: Thinking now, if I had to do it on the terminal I would do:

    $ cat file.txt | awk '{print $1, $2+1000}' | sort | awk '{print $1, $2-1000}'
    fname_0001.v0_word 15
    fname_0002.v0_word 23
    fname_0003.v0_word 5
    fname_0003.v0_word 7
    fname_0005.v0_word 8
    fname_0005.v0_word 24
    fname_0006.v0_word 2
    fname_0006.v0_word 9
    fname_0007.v0_word 11
    fname_0007.v0_word 12
    
    Adding 1000 (or any power of ten bigger than your biggest number) gives every number the same width, so a plain text sort puts them in numeric order. It's basically decorate-sort-undecorate.
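
    The three stages of that pipeline can be mirrored in Python to make them visible. This is only a sketch: `OFFSET`, `decorate`, and `undecorate` are names chosen here, and it assumes two fields per line with non-negative counts below 1000.

```python
OFFSET = 1000  # assumption: a power of ten larger than every count


def decorate(line):
    # Pad the count to a fixed width by adding the offset.
    name, count = line.split()
    return f"{name} {int(count) + OFFSET}"


def undecorate(line):
    # Remove the offset again to recover the original count.
    name, count = line.split()
    return f"{name} {int(count) - OFFSET}"


lines = [
    "fname_0005.v0_word 24",
    "fname_0005.v0_word 8",
    "fname_0006.v0_word 9",
    "fname_0006.v0_word 2",
]

decorated = [decorate(line) for line in lines]
ordered = sorted(decorated)  # plain text comparison, like a bare `sort`
result = [undecorate(line) for line in ordered]

print("\n".join(result))
```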


