  • Sorting mixed lists of numbers and strings

    June 26th, 2009
    programming
    Imagine you have this list:
    fname_0006.v0_word 2
    fname_0007.v0_word 12
    fname_0001.v0_word 15
    fname_0002.v0_word 23
    fname_0003.v0_word 5
    fname_0003.v0_word 7
    fname_0005.v0_word 8
    fname_0006.v0_word 9
    fname_0007.v0_word 11
    fname_0005.v0_word 24
    
    Imagine further that you want to sort it, by the first field as text and then by the second field as a number. Unfortunately, I can't get GNU sort to let me specify which fields are numeric and which are not. That is, I can do:
    $ cat file.txt | sort
    fname_0001.v0_word 15
    fname_0002.v0_word 23
    fname_0003.v0_word 5
    fname_0003.v0_word 7
    fname_0005.v0_word 24
    fname_0005.v0_word 8
    fname_0006.v0_word 2
    fname_0006.v0_word 9
    fname_0007.v0_word 11
    fname_0007.v0_word 12
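
The `24` before `8` in that output comes from comparing the lines as text, one character at a time: `2` sorts before `8`, so `24` lands before `8`. A minimal Python sketch of the same effect, using two lines from the list above (`name_then_number` is a name made up for this illustration):

```python
# Two lines from the list, differing only in the numeric field.
rows = ["fname_0005.v0_word 8", "fname_0005.v0_word 24"]

# As plain strings, "2" < "8", so the "24" line comes first.
as_text = sorted(rows)

def name_then_number(row):
    # Hypothetical helper: keep the name as text, compare the count as an int.
    name, number = row.split()
    return (name, int(number))

# With the second field converted to int, 8 sorts before 24.
as_numbers = sorted(rows, key=name_then_number)

print(as_text)
print(as_numbers)
```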
    
    Or I can do:
    $ cat file.txt | sort -n
    fname_0001.v0_word 15
    fname_0002.v0_word 23
    fname_0003.v0_word 5
    fname_0003.v0_word 7
    fname_0005.v0_word 24
    fname_0005.v0_word 8
    fname_0006.v0_word 2
    fname_0006.v0_word 9
    fname_0007.v0_word 11
    fname_0007.v0_word 12
    
    Or I can do:
    $ cat file.txt | sort -n -k1,1 -k2,2
    fname_0006.v0_word 2
    fname_0003.v0_word 5
    fname_0003.v0_word 7
    fname_0005.v0_word 8
    fname_0006.v0_word 9
    fname_0007.v0_word 11
    fname_0007.v0_word 12
    fname_0001.v0_word 15
    fname_0002.v0_word 23
    fname_0005.v0_word 24
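
My understanding of why this orders by the second field alone: a global `-n` applies to both keys, and a field like `fname_0006.v0_word` does not start with a number, so it counts as 0 and every first field ties. A rough Python sketch of that behavior, under that assumption (`leading_number` and `numeric_keys` are hypothetical helpers, not anything sort exposes):

```python
import re

def leading_number(field):
    """Return the integer a field starts with, or 0 if it starts with none."""
    match = re.match(r"-?\d+", field)
    return int(match.group()) if match else 0

def numeric_keys(row):
    # Treat both fields numerically, as a global -n would.
    first, second = row.split()
    return (leading_number(first), leading_number(second))

rows = [
    "fname_0006.v0_word 2",
    "fname_0007.v0_word 12",
    "fname_0001.v0_word 15",
    "fname_0003.v0_word 5",
]

# Every first field maps to 0, so only the second field decides the order.
by_numeric_keys = sorted(rows, key=numeric_keys)
print(by_numeric_keys)
```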
    
    You might think this would work:
    $ cat file.txt | sort -k1,1 -kn2,2
    fname_0006.v0_word 2
    fname_0003.v0_word 5
    fname_0003.v0_word 7
    fname_0005.v0_word 8
    fname_0006.v0_word 9
    fname_0007.v0_word 11
    fname_0007.v0_word 12
    fname_0001.v0_word 15
    fname_0002.v0_word 23
    fname_0005.v0_word 24
    
    But nothing I tried made it do the right thing, so I abandoned sort for Python:
    $ cat simple_sorter.py
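    import fileinput
    
    def tidy(x):
        """Convert a field to int when it looks like one; otherwise keep the text."""
        try:
            return int(x)
        except ValueError:
            return x
    
    line_bits = []
    
    # Read stdin (or any files named on the command line), splitting on whitespace.
    for line in fileinput.input():
        line_bits.append([tidy(field) for field in line.split()])
    
    # Lists compare field by field, so numeric fields sort as numbers.
    for bits in sorted(line_bits):
        print(" ".join(str(bit) for bit in bits))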
    
    $ cat file.txt | python simple_sorter.py
    fname_0001.v0_word 15
    fname_0002.v0_word 23
    fname_0003.v0_word 5
    fname_0003.v0_word 7
    fname_0005.v0_word 8
    fname_0005.v0_word 24
    fname_0006.v0_word 2
    fname_0006.v0_word 9
    fname_0007.v0_word 11
    fname_0007.v0_word 12
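
One caveat: under Python 3, comparing an int with a str raises TypeError, so sorting lists of converted fields only works while each column is consistently numeric or consistently text. A minimal sketch of a sort key that tolerates mixed columns, assuming numbers should sort before text (`sort_key` is a hypothetical name, not part of the script above):

```python
def sort_key(field):
    """Tag each field so ints and strs are never compared directly."""
    try:
        # Numbers get tag 0 and compare by value.
        return (0, int(field), "")
    except ValueError:
        # Text gets tag 1 and compares as a string.
        return (1, 0, field)

rows = ["b 10", "b 9", "a x", "a 3"]

# Each line becomes a list of tagged fields, compared field by field.
ordered = sorted(rows, key=lambda row: [sort_key(f) for f in row.split()])
print(ordered)
```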
    

    Update 2013-08-22: Thinking now, if I had to do it on the terminal I would do:

    $ cat file.txt | awk '{print $1, $2+1000}' | sort | awk '{print $1, $2-1000}'
    fname_0001.v0_word 15
    fname_0002.v0_word 23
    fname_0003.v0_word 5
    fname_0003.v0_word 7
    fname_0005.v0_word 8
    fname_0005.v0_word 24
    fname_0006.v0_word 2
    fname_0006.v0_word 9
    fname_0007.v0_word 11
    fname_0007.v0_word 12
    
    Adding 1000 (or any power of ten larger than your biggest number) gives every number the same count of digits, so plain text sorting puts them in numeric order. It's basically decorate-sort-undecorate.
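
The same decorate-sort-undecorate idea can be sketched in Python. This version swaps the add-1000 trick for zero-padding, which has the same effect of making every number the same width; it assumes two whitespace-separated fields and non-negative integers, and `decorate` and `undecorate` are names made up for the sketch:

```python
rows = [
    "fname_0005.v0_word 24",
    "fname_0006.v0_word 9",
    "fname_0005.v0_word 8",
    "fname_0006.v0_word 2",
]

# Width of the longest number, so every padded number has the same length.
width = max(len(row.split()[1]) for row in rows)

def decorate(row):
    # Zero-pad the number so text order matches numeric order.
    name, number = row.split()
    return name + " " + number.zfill(width)

def undecorate(row):
    # Drop the padding again by round-tripping through int.
    name, padded = row.split()
    return name + " " + str(int(padded))

result = [undecorate(row) for row in sorted(decorate(row) for row in rows)]
print(result)
```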
