Presentation is an essential part of any language, and of any software application. How we use text can make the difference between an enjoyable user experience and an incomprehensible mess. As developers, we likely spend more time formatting strings than we’d like to admit, and I hope to save you some of that time. In this post, we’ll see how we can convert any string to a language-native title casing, all with classes already found within the .NET Framework.
What Is TextInfo?
Most display and translation behavior in .NET is not universal but is instead driven by culture-specific instances of CultureInfo. CultureInfo instances define the styles of dates, currency, and more. Nested within each CultureInfo is a TextInfo instance that defines the specific behaviors attributed to a culture’s writing system. As the comment on the TextInfo class states:
A writing system is the collection of scripts and orthographic rules required to represent a language as text. – .NET Framework
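To make that relationship concrete, here is a minimal sketch of reaching a culture’s TextInfo through its CultureInfo. It assumes the en-GB culture is available on the machine, and the variable names are only illustrative.

```csharp
using System;
using System.Globalization;

// Look up a cached, read-only CultureInfo by name (assumes en-GB is available).
var culture = CultureInfo.GetCultureInfo("en-GB");

// Every CultureInfo exposes the TextInfo for its writing system.
TextInfo textInfo = culture.TextInfo;

// Casing operations live on TextInfo rather than on CultureInfo itself.
var upper = textInfo.ToUpper("windsor");
var lower = textInfo.ToLower("WINDSOR");

Console.WriteLine(upper); // WINDSOR
Console.WriteLine(lower); // windsor
```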
What Is Title Casing?
Title casing pertains to the style of titles for books, posts, and essays. In other words, the practice of title casing is the capitalization of each word’s first letter in a string. For example, the string literal “khalid abuhakmeh is the best” would be title-cased to “Khalid Abuhakmeh Is The Best”.
The act of title-casing might seem straightforward to many, but each culture should apply its own rules to accurately title-case a target string. Each culture provides local conventions that .NET needs to account for when converting values. Looking at the comments left on the TextInfo type, we can see some clear examples.
For English readers, “The Merry Wives of Windsor” is the appropriate title casing, with the word “of” not being uppercased. For German readers, the translated title would be “Die lustigen Weiber von Windsor,” and neither “lustigen” nor “von” would be uppercased. Readers of Dutch will know that .NET must capitalize both letters of “IJ” at the beginning of a word when present. Who knew that capitalizing words could be so tricky?
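A quick way to compare these documented conventions with what .NET actually returns is to run a few inputs through different cultures. The following is only a sketch: it assumes the de-DE, nl-NL, and en-GB cultures are available on the machine, and TitleCaseFor is a hypothetical helper, not part of .NET.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: title-case a value using the named culture's TextInfo.
static string TitleCaseFor(string cultureName, string value) =>
    CultureInfo.GetCultureInfo(cultureName).TextInfo.ToTitleCase(value);

var german = TitleCaseFor("de-DE", "die lustigen weiber von windsor");
var dutch = TitleCaseFor("nl-NL", "ijsland");
var english = TitleCaseFor("en-GB", "ijsland");

Console.WriteLine(german);  // Die Lustigen Weiber Von Windsor
Console.WriteLine(dutch);   // IJsland
Console.WriteLine(english); // Ijsland
```

The same input produces a different result only for Dutch, where the leading “ij” is capitalized as a pair.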
While title casing can have culture-specific rules, my testing shows that, other than the unique edge case around Dutch, .NET simply capitalizes the first letter of each separate word. In the case of “the merry wives of windsor”, we get “The Merry Wives Of Windsor”. Note the capitalization of the first letter of every word, including “Of”. The capitalized letters are an acceptable outcome. Let’s take a look at the code that generated this outcome.

using static System.Console;
using static System.Globalization.CultureInfo;

var cultureCode = "en-gb";
CurrentCulture = GetCultureInfo(cultureCode);

var info = CurrentCulture.TextInfo;
var name = "the merry wives of windsor";

// prints "The Merry Wives Of Windsor"
WriteLine(info.ToTitleCase(name));
In the example above, we set CurrentCulture to the British English variant (en-GB). If we look at the constructor of TextInfo, we’ll see it accepts a CultureData parameter, presumably to understand title casing rules.
internal TextInfo(CultureData cultureData)
{
    // This is our primary data source, we don't need most of the rest of this
    _cultureData = cultureData;
    _cultureName = _cultureData.CultureName;
    _textInfoName = _cultureData.TextInfoName;
    _sortHandle = CompareInfo.NlsGetSortHandle(_textInfoName);
}
We can also look at the CultureInfo class to see how it creates the TextInfo instance.
public virtual TextInfo TextInfo
{
    get
    {
        if (_textInfo == null)
        {
            // Make a new textInfo
            TextInfo tempTextInfo = new TextInfo(_cultureData);
            _textInfo = tempTextInfo;
        }

        return _textInfo;
    }
}
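Because the property only creates a TextInfo when its backing field is null, repeated reads should return the same instance. A small sketch to check that from the outside, assuming the en-GB culture is available (the variable names are illustrative):

```csharp
using System;
using System.Globalization;

var culture = CultureInfo.GetCultureInfo("en-GB");

// Read the property twice...
var first = culture.TextInfo;
var second = culture.TextInfo;

// ...and compare references: both reads return the cached instance.
var sameInstance = object.ReferenceEquals(first, second);

Console.WriteLine(sameInstance); // True
```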
Strangely, the XML documentation for ToTitleCase makes a big deal of culture-specific capitalization rules. For English, the documentation explicitly points out the lowercasing of “of”, yet in practice the word is uppercased. While the CultureInfo instance passes CultureData to the newly created TextInfo, it doesn’t seem to have much of an effect on title casing. That said, the upper-casing of articles, adjectives, and conjunctions is not likely to deter me from using ToTitleCase. I highly recommend this approach for anyone looking to uppercase the first letter of every word within a string.