Flatten Strings with Regex.Replace

Ah, strings, the cause of and solution to all our programming problems. As developers, we’re always balancing our codebase’s readability with the efficient use of string literals. Let’s see how we can work with SQL, a database query language, in our code without sacrificing readability. The approach should work with other query syntax or multiline strings.

The Example

Let’s assume we’re working with data service and that we need to construct a query.

var value = @"select * from people where firstname = 'khalid';";

The query is reasonably acceptable, but as queries evolve and changes occur, we need to alter the string:

Additional where clauses
Instead of using the * we need to select columns explicitly
We need to parameterize inputs

The list of changes goes on, and while not earth-shatteringly difficult, the process of changing strings like this is annoying.

We could use the @ operator in C# to allow for multiline values.

var value = @"
    select *
    from people
    where firstname = 'khalid';
    ";

When we want to use this string, an issue arises because the @ operator respects all newline characters and whitespace. Let’s see a screenshot from our console when we write the value.

console output of string in Rider

As we can see, our variable maintained the whitespace and newline characters. While technically correct, it’s likely not our intention to keep those around in our final result.

Replacing Whitespace and Newlines

Now that we’ve seen the issue, let’s write some code to Flatten a string to a single line with all single spaces. Note, I’m using C# 9 syntax and a target-typed new constructor for Regex.

public static class StringExtensions
{
    private static readonly Regex Whitespace =
        new (@"\s+");
    
    public static string Flatten(this string value)
    {
        if (value is null or "")
            return value;
        
        return Whitespace.Replace(
            value.Trim(), 
            " "
        );
    }
}

Let’s use our new extension method.

var value = @"
    select *
    from people
    where firstname = 'khalid';
    "
    .Flatten();

As we expect, the output of value is flat.

select * from people where firstname = 'khalid';

If we’re not fans of extension methods, we can also create a specific Sql method that wraps our query string.

// public static string Sql(string value)
// {
//     return Flatten(value);
// }
var other = Sql(@"
    select *
    from people
    where firstname = 'khalid';
    ");

Here is the sample in its entirety, using C# 9 and top-level statements.

using System;
using System.Text.RegularExpressions;
using static StringExtensions;

var value = @"
    select *
    from people
    where firstname = 'khalid';
    "
    .Flatten();

var other = Sql(@"
    select *
    from people
    where firstname = 'khalid';
    ");

Console.WriteLine(value);
Console.WriteLine(other);

public static class StringExtensions
{
    private static readonly Regex Whitespace =
        new(@"\s+");

    public static string Flatten(this string value)
    {
        if (value is null or "")
            return value;

        return Whitespace.Replace(
            value.Trim(),
            " "
        );
    }

    public static string Sql(string value)
    {
        return Flatten(value);
    }
}

Conclusion

Finding a balance between readability during development is a struggle. The way we represent our queries in our codebase helps us alter them as requirements change. How we represent those same strings in logs and over network calls can help us diagnose problems and communicate more efficiently. This particular approach helps us adopt both, with minimal overhead when flattening our strings. I hope you found this post useful, and let me know in the comments if there’s a better way to do this.

Flatten Strings with Regex.Replace

The Example

Replacing Whitespace and Newlines

Conclusion

About Khalid Abuhakmeh

Read Next

Base64 Encoding With C#

Find Prime Numbers With C# 9