Build .NET Search Experiences With Lunr-Core

Search can be the critical difference between a good app and a great app. Although search technologies like Elasticsearch, Solr, RediSearch, and others have become readily available, they still require a non-trivial amount of resources to operate and maintain. The web community has a lightweight search solution in Lunr, and luckily the .NET OSS community has ported the library to a NuGet package.

This post will explore what Lunr is and how we can use Lunr-Core to provide simple yet powerful search experiences for our users.

What Is Lunr?

Lunr draws inspiration from Solr, a Java-based search engine platform built on Apache Lucene. Admittedly, Lunr is not a replacement for Solr; instead, Lunr's creators designed the library to be small, lightweight, and free of external dependencies. This design philosophy lets developers use Lunr in scenarios where the more robust solutions are impractical.

Lunr boasts standard search functionality such as indexing fields, tokenizers, stop words, scoring, and a document processing pipeline that allows for features like:

Full-text search support for 14 languages
Boost terms at query time or boost entire documents at index time
Scope searches to specific fields
Fuzzy term matching with wildcards or edit distance

While this all may sound complicated, the API is straightforward. Let’s take a look at the JavaScript implementation first.

var idx = lunr(function () { 
    this.field('title') 
    this.field('body')
    this.add({ 
        "title": "Twelfth-Night",
        "body": "If music be the food of love, play on: Give me excess of it…",
        "author": "William Shakespeare",
        "id": "1" 
    }) 
})

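This index has two searchable fields, title and body. The author value is not declared as a field, so it is not indexed, and id is the reference Lunr hands back in results. Once we've built the index, we can search for values using the idx instance.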

idx.search("love")

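Lunr returns the search results as an array, shown here as JSON. Each result carries the document reference, a relevance score, and metadata about which terms matched in which fields.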

[ 
  { 
    "ref": "1",
    "score": 0.3535533905932737,
    "matchData": {
        "metadata": {
            "love": {
                "body": {} 
            } 
        }
    }
  } 
]

JavaScript is great and all, but we’re .NET developers! What about .NET?!

Enter Lunr-Core For .NET Search

Lunr-Core is a port of Lunr for use in .NET applications and has the fantastic benefit of being 100% compatible with Lunr. That means we can use indexes built with either the JavaScript implementation or the .NET implementation. To get started with Lunr-Core, we’ll need to install the NuGet package.

dotnet add package LunrCore

The C# API is very similar to its JavaScript counterpart. Let’s look at indexing a document.

var index = await Index.Build(async builder =>
{
    builder
        .AddField("title")
        .AddField("body");

    await builder.Add(new Document
    {
        { "title", "Twelfth-Night" },
        { "body", "If music be the food of love, play on: Give me excess of it…" },
        { "author", "William Shakespeare" },
        { "id", "1" },
    });
});

Once we build our index, we can use the index variable to perform searches.

await foreach (Result result in index.Search("love"))
{
    // do something with that result
}
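
The features listed earlier (field scoping, wildcards, fuzzy matching, and boosts) are expressed through Lunr's query string syntax. The sketch below runs a few such queries through the same Search method, assuming Lunr-Core parses them the same way Lunr does, which its compatibility claim suggests. The shortened body text and the particular queries are illustrative only.

```csharp
using System;
using Lunr;

// Build the same one-document index used above.
// Lunr.Index is fully qualified because `using System;` also brings System.Index into scope.
var index = await Lunr.Index.Build(async builder =>
{
    builder
        .AddField("title")
        .AddField("body");

    await builder.Add(new Document
    {
        { "title", "Twelfth-Night" },
        { "body", "If music be the food of love, play on" },
        { "id", "1" },
    });
});

// Query strings written in Lunr's documented search syntax.
string[] queries =
{
    "body:love",     // scope the term to a single field
    "lov*",          // trailing wildcard
    "love~1",        // fuzzy match within an edit distance of 1
    "music^10 love"  // boost one term relative to the others
};

foreach (var query in queries)
{
    Console.WriteLine($"Query: {query}");
    await foreach (var result in index.Search(query))
    {
        Console.WriteLine($"  ref={result.DocumentReference} score={result.Score:F3}");
    }
}
```

Scores depend on the contents of the index, so treat the printed numbers as relative rankings rather than fixed values.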

Let’s consider the limitations of Lunr before jumping to a complete .NET sample.

Considerations When Using Lunr

Lunr is a full-featured search engine library, but there are drawbacks .NET developers should consider before using it.

The first drawback is that Lunr lacks incremental index builds. Adding, updating, or removing even a single document means rebuilding the entire index from all of its documents.

Lunr indexes are now immutable. Once they have been built, it is not possible to add, update or remove any documents in the index. –Lunr Docs

If our data is changing rapidly, then Lunr might not be the best choice of search technology.
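
For data that changes only occasionally, one workaround is to keep the source documents around and rebuild the whole index on each change. Below is a minimal sketch of that idea. RebuildingSearchIndex is a hypothetical helper, not part of Lunr-Core, and it assumes every document carries an id entry.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Lunr;

// Hypothetical helper (not part of Lunr-Core): keeps the source documents
// and rebuilds the whole index whenever one of them changes.
public class RebuildingSearchIndex
{
    private readonly Dictionary<string, Document> _documents = new();
    private readonly string[] _fields;

    public RebuildingSearchIndex(params string[] fields) => _fields = fields;

    // The most recently built index; null until the first rebuild.
    public Lunr.Index Current { get; private set; }

    // Adds or replaces a document, then rebuilds the index from scratch.
    // The document is assumed to contain an "id" entry matching the id argument.
    public async Task Upsert(string id, Document document)
    {
        _documents[id] = document;
        await Rebuild();
    }

    // Re-indexes every stored document, because an existing index cannot be modified.
    public async Task Rebuild()
    {
        Current = await Lunr.Index.Build(async builder =>
        {
            foreach (var field in _fields)
                builder.AddField(field);

            foreach (var document in _documents.Values)
                await builder.Add(document);
        });
    }
}
```

Every Upsert pays the full indexing cost, which is why this approach only suits small or slowly changing datasets.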

Another drawback is that Lunr writes indexes to JSON, an unoptimized disk format that can be considerably larger than the original data. For example, in the upcoming sample a 2.6 MB CSV produces an index file that is 17.3 MB on disk. If reading from and writing to disk is an expensive operation, Lunr might not be the right choice for our use case.
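
One way to soften the disk cost is to compress the JSON as it is written. The sketch below wraps the file in a GZipStream and reuses the SaveToJsonStream and LoadFromJson calls that appear in the sample further down. CompressedIndexFile is a hypothetical helper, and how much space this saves for a given index is untested here.

```csharp
using System.IO;
using System.IO.Compression;
using System.Threading.Tasks;

// Hypothetical helper (not part of Lunr-Core) that stores an index as gzipped JSON.
public static class CompressedIndexFile
{
    // Serializes the index to JSON and compresses it on the way to disk.
    public static async Task Save(Lunr.Index index, string path)
    {
        await using var file = File.Create(path);
        await using var gzip = new GZipStream(file, CompressionLevel.Optimal);
        await index.SaveToJsonStream(gzip);
    }

    // Decompresses the file back into JSON text and loads the index from it.
    public static async Task<Lunr.Index> Load(string path)
    {
        await using var file = File.OpenRead(path);
        await using var gzip = new GZipStream(file, CompressionMode.Decompress);
        using var reader = new StreamReader(gzip);
        var json = await reader.ReadToEndAsync();
        return Lunr.Index.LoadFromJson(json);
    }
}
```

The trade-off is extra CPU time on every save and load, so it is worth measuring both file size and startup time before adopting it.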

A Lunr-Core C# Sample

So if you’re still interested in using Lunr for your search experience, then I’ve provided a sample below. We’ll be reading U.S. cities from a CSV and indexing them. We’ll also be writing our index to disk to eliminate the cost of indexing our documents at startup.

For folks who want to run this sample locally, I've provided all the source code in my GitHub repository.

using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;
using CsvHelper;
using CsvHelper.Configuration;
using CsvHelper.Configuration.Attributes;
using Lunr;
using Spectre.Console;

// our database
var cities = new Dictionary<string, City>();
var status = AnsiConsole
    .Status()
    .Spinner(Spinner.Known.Earth)
    .AutoRefresh(true);

// let's build our search index
Lunr.Index index = null;
const string indexName = "local.index.json";
await status.StartAsync("Thinking...", async ctx =>
{
    ctx.Status("[green]loading cities...[/]");
    var config = new CsvConfiguration(CultureInfo.InvariantCulture) {
        Delimiter = "|",
        HasHeaderRecord = true,
        MissingFieldFound = null
    };
    using var reader = new StreamReader("us_cities.csv");
    using var csv = new CsvReader(reader, config);
    
    // Select (here) and Take (below) over IAsyncEnumerable<T> come from the System.Linq.Async package
    await foreach (var city in csv.GetRecordsAsync<City>().Select((city, id) => city.WithId(id)))
    {
        cities.Add(city.Id, city);
    }

    if (File.Exists(indexName))
    {
        ctx.Status("[green]loading index from disk...[/]");
        var json = await File.ReadAllTextAsync(indexName);
        index = Lunr.Index.LoadFromJson(json);
    }
    else
    {
        ctx.Status("[green]building index...[/]");
        
        index = await Lunr.Index.Build(async builder =>
        {
            foreach (var field in City.Fields)
                builder.AddField(field);

            foreach (var (_, city) in cities)
            {
                await builder.Add(city.ToDocument());
            }
        });

        // File.Create truncates any existing file, so a stale index can never leave trailing bytes behind
        await using var file = File.Create(indexName);
        await index.SaveToJsonStream(file);
    }
});
var running = true;
Console.CancelKeyPress += (_, _) => running = false;
while (running)
{
    Console.Write("Search : ");
    var search = Console.ReadLine();

    var table = new Table()
        .Title($":magnifying_glass_tilted_left: Search Results for \"{search}\"")
        .BorderStyle(new Style(foreground: Color.NavajoWhite1, decoration: Decoration.Italic))
        .AddColumn("name")
        .AddColumn("county")
        .AddColumn("state")
        .AddColumn("alias")
        .AddColumn("score");

    await foreach (var result in index.Search(search ?? string.Empty).Take(10))
    {
        var city = cities[result.DocumentReference];
        table.AddRow(
            city.Name,
            city.County,
            city.StateAbbreviation,
            city.Alias,
            $"{result.Score:F3}"
        );
    }

    // AnsiConsole.Render was deprecated in later Spectre.Console releases in favor of Write
    AnsiConsole.Write(table);
}

public class City
{
    // Headers:
    // City|State short|State full|County|City alias
    [Ignore] public string Id { get; set; }
    [Name("City")] public string Name { get; set; }
    [Name("State full")] public string StateName { get; set; }
    [Name("State short")] public string StateAbbreviation { get; set; }
    [Name("County")] public string County { get; set; }
    [Name("City alias")] public string Alias { get; set; }
    
    public City WithId(int id)
    {
        Id = id.ToString();
        return this;
    }

    public Document ToDocument()
    {
        return new(new Dictionary<string, object>
        {
            {"id", Id},
            {nameof(Name), Name},
            {nameof(StateName), StateName},
            {nameof(StateAbbreviation), StateAbbreviation},
            {nameof(County), County},
            {nameof(Alias), Alias}
        });
    }

    public static IEnumerable<string> Fields => new[]
    {
        nameof(Name),
        nameof(StateName),
        nameof(StateAbbreviation),
        nameof(County),
        nameof(Alias)
    }.ToList().AsReadOnly();
}

Conclusion

Lunr and, by extension, Lunr-Core are excellent for providing search experiences over mostly static datasets. They're also a good option for folks building client experiences, especially as WebAssembly brings .NET to the browser. As you saw in this post, it doesn't take much to start providing a compelling search experience to your users. Lunr is also a great starting point before upgrading to one of the more robust solutions mentioned earlier.

I hope you found this post helpful, and thank you again for reading.