I recently saw that .NET 10 adds a numeric comparer, allowing you to sort string elements that may contain numeric values at the end of a string. Think movie sequels or number versions of an operating system. To my surprise, I could not believe it wasn’t already included in .NET, but then I sat down to try to write my own, and I now “get it”. The edge cases alone can drive you mad. Numeric ordering is subjective. Should the numbers come before Roman numerals? Should Roman numerals be parsed as numbers? What about decimals?
Anyway, I’ve added my numeric comparer implementation in this post, which uses some of the latest .NET features. Enjoy!
A List of Numbered Things
A number at the end of a text is typical for movies, books, video games, and products. These numbers denote a newer and better iteration of an item. For example, I have a combination of movies, Windows versions, and some decimal values.
var numberedThings = new List<string>
{
"Godfather", "Godfather 3", "Scream",
"Scream 2", "Scream 3", "Scream 1",
"Windows 10", "Windows 7",
"Windows 11", "Rocky 5",
"Rocky 2", "Rocky 4", "Rocky 3", "Rock",
"1.2", "1.3", "1.1", "Rocky", "Windows XP",
"Godfather 2", "1.11", "1.10", "10.0", "1.0"
};
var numericOrderer = new NumericOrderer();
var sorted = numberedThings
.OrderBy(x => x, numericOrderer)
.ToList();
foreach (var item in sorted)
{
Console.WriteLine(item);
}
The goal is to have the output result in alphanumeric ordering. For example, Godfather
should come before
Godfather 2
. Also, when it comes to strings made of numbers, I want each part of a value to be treated as a whole number. For example,
1.1
, 1.10
, and 1.11
would follow that order.
We expect the following output from the previous code based on the input.
1.0
1.1
1.2
1.3
1.10
1.11
10.0
Godfather
Godfather 2
Godfather 3
Rock
Rocky
Rocky 2
Rocky 3
Rocky 4
Rocky 5
Scream
Scream 1
Scream 2
Scream 3
Windows 7
Windows 10
Windows 11
Windows XP
Let’s look at my implementation.
Writing a NumericOrderer with Spans
Here is my NumericOrderer
based on the collection of data previously mentioned.
public sealed class NumericOrderer : IComparer<string>
{
public int Compare(string? x, string? y)
{
if (x == null && y == null) return 0;
if (x == null) return -1;
if (y == null) return 1;
var xSpan = x.AsSpan();
var ySpan = y.AsSpan();
var commonPrefixLength = xSpan.CommonPrefixLength(ySpan);
while (commonPrefixLength > 0)
{
xSpan = xSpan[commonPrefixLength..];
ySpan = ySpan[commonPrefixLength..];
commonPrefixLength = xSpan.CommonPrefixLength(ySpan);
}
if (int.TryParse(xSpan, out var xNumber) &&
int.TryParse(ySpan, out var yNumber))
{
return xNumber.CompareTo(yNumber);
}
return xSpan.CompareTo(ySpan, StringComparison.Ordinal);
}
}
There are three configuration points you should consider in this implementation.
-
CommonPrefixLength
can also take aComparer
argument, which might help ignore casing. - The
CompareTo
method currently takes aStringComparison.Ordinal
value, which is case sensitive. I suggest changing this based on your needs. - The use of
int.TryParse
compares whole numbers. If your values contain decimals, then you may want to trydouble.TryParse
.
Are there edge cases here? Very likely, but when ordering anything with “opinions”, you’ll likely have to tweak it to your liking. As I mentioned previously, Roman numerals are commonly used in place of Arabic numerals, so you may want to tweak this to support those, or you may want
Windows XP
to appear before Windows 11
in a list. You’re the sorting wizard in your magical journey.
Conclusion
This exercise made me realize that building a general-purpose numeric orderer is likely a herculean task. Instead, I’d probably implement a sorter based on your needs for that specific use case. Also, I learned that the
Span
APIs are very good and capable of breaking apart strings without creating new instances in memory. Finally, while this comparison works well for a single-point dataset, you’re likely dealing with more complex data models with other sortable fields that can be more accurately sorted (think numbers and dates). I’d likely lean more on predictably sortable values in a production setting than leaving it all to a general-purpose numeric comparer.
At the very least, I hope this implementation gives you a starting point for writing and tweaking your implementation. As always, thanks for reading, and cheers.