
I have some files that need to be sorted by name. Unfortunately, a regular sort won't work, because I also want the numbers in the names to be sorted numerically. After some research, I found that what I am looking for is called natural sorting.

I tried the solution given here and it worked perfectly.

However, strings like PresserInc-1_10.jpg and PresserInc-1_11.jpg cause that specific natural key algorithm to fail, because it only matches the first integer, which in this case is 1 for both names, and that throws off the sorting. What I think might help is to match all the numbers in the string and group them together, so that for PresserInc-1_11.jpg the algorithm gives me 111 back. Is this possible?

Here's a list of filenames:

files = ['PresserInc-1.jpg', 'PresserInc-1_10.jpg', 'PresserInc-1_11.jpg', 'PresserInc-10.jpg', 'PresserInc-2.jpg', 'PresserInc-3.jpg', 'PresserInc-4.jpg', 'PresserInc-5.jpg', 'PresserInc-6.jpg', 'PresserInc-11.jpg']
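To illustrate the problem described above, here is a minimal sketch. The linked key function is not reproduced here, so `first_int_key` is a hypothetical stand-in that assumes the key used only the first run of digits. `joined_digits_key` is likewise a hypothetical rendering of the "group all numbers together" idea.

```python
import re

def first_int_key(name):
    # Hypothetical stand-in: use only the first run of digits as the sort key.
    match = re.search(r'\d+', name)
    return int(match.group()) if match else -1

def joined_digits_key(name):
    # The idea from the question: join every digit into one integer.
    digits = ''.join(re.findall(r'\d', name))
    return int(digits) if digits else -1

a, b = 'PresserInc-1_10.jpg', 'PresserInc-1_11.jpg'
print(first_int_key(a), first_int_key(b))          # both keys are 1, so the names tie
print(joined_digits_key(a), joined_digits_key(b))  # 110 and 111
print(joined_digits_key('PresserInc-11.jpg'))      # 11, which sorts before 110
```

Under these assumptions, joining the digits separates the two names, but it would also place PresserInc-1_10.jpg after PresserInc-11.jpg, which is probably not the order wanted.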

    I don't understand your question. Please post clearer input and the expected output. Commented Jun 22, 2012 at 4:46

2 Answers


Google: Python natural sorting.

Result 1: The page you linked to.

But don't stop there!

Result 2: Jeff Atwood's blog that explains how to do it properly.

Result 3: An answer I posted based on Jeff Atwood's blog.

Here's the code from that answer:

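import re

def natural_sort(l):
    # Digit runs become ints (compared numerically); other text is lowercased.
    convert = lambda text: int(text) if text.isdigit() else text.lower()
    # The capturing group makes re.split keep the digit runs in the result.
    alphanum_key = lambda key: [convert(c) for c in re.split(r'([0-9]+)', key)]
    return sorted(l, key=alphanum_key)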

Results for your data:

PresserInc-1.jpg
PresserInc-1_10.jpg
PresserInc-1_11.jpg
PresserInc-2.jpg
PresserInc-3.jpg
etc...

See it working online: ideone
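To see why this handles names with more than one number, it may help to look at the keys themselves. The sketch below rewrites the key as a standalone function (the name `alphanum_key` follows the code above; the short `names` list is just an illustrative subset of the question's data) and prints what each filename is split into.

```python
import re

def alphanum_key(name):
    """Split a name into lowercased text chunks and integer chunks."""
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r'([0-9]+)', name)]

names = ['PresserInc-1_11.jpg', 'PresserInc-10.jpg',
         'PresserInc-1_10.jpg', 'PresserInc-2.jpg']

for name in names:
    # e.g. 'PresserInc-1_11.jpg' becomes ['presserinc-', 1, '_', 11, '.jpg']
    print(name, alphanum_key(name))

# Lists compare element by element, so a tie on the first number
# is broken by whatever follows it, including the second number.
ordered = sorted(names, key=alphanum_key)
print(ordered)
```

Because every number in the name becomes its own list element, there is no need to merge the digits into a single value such as 111.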


No spaces in key=alphanum_key please
Thanks, you're right, I shouldn't have stopped there :) I'm a bit tired, though. Thanks :)
Result 1: This Stack Overflow page.

If you don't mind third-party libraries, you can use natsort to achieve this.

>>> import natsort
>>> files = ['PresserInc-1.jpg', 'PresserInc-1_10.jpg', 'PresserInc-1_11.jpg', 'PresserInc-10.jpg', 'PresserInc-2.jpg', 'PresserInc-3.jpg', 'PresserInc-4.jpg', 'PresserInc-5.jpg', 'PresserInc-6.jpg', 'PresserInc-11.jpg']
>>> natsort.natsorted(files)
['PresserInc-1.jpg',
 'PresserInc-1_10.jpg',
 'PresserInc-1_11.jpg',
 'PresserInc-2.jpg',
 'PresserInc-3.jpg',
 'PresserInc-4.jpg',
 'PresserInc-5.jpg',
 'PresserInc-6.jpg',
 'PresserInc-10.jpg',
 'PresserInc-11.jpg']

Full disclosure, I am the package's author.

Please disclose authorship of said library, thanks!
