One of the most powerful features of relational database engines is the ability for tables to be self-referential. Having rows reference other rows within the same table allows us to create complex hierarchies and tree structures with minimal effort.

In this short post, we’ll explore how to model a self-referencing entity with EF Core and then create a helper method that will allow us to get all downstream data starting at a particular level. To follow along, developers will need a running instance of Microsoft SQL Server.

The Recursive Database Model

We’ll start with the most familiar recursive model everyone is aware of, and that’s the organizational chart for a company. With Entity Framework Core, all we need to do is define our navigation properties. If you need a guide on defining relationships, If you read my previous blog post on modeling all relationships with EF Core.

We know that employees have managers and that managers have reporting employees.

public class Employee
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Title { get; set; }
    public Employee Manager { get; set; }
    public int? ManagerId { get; set; }
    public List<Employee> Reports { get; set; }
}

Let’s seed some data in our EF Core DbContext using the organizational chart of Springfield Nuclear Power Plant.

protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    modelBuilder.Entity<Employee>()
        .HasData(new List<Employee>()
        {
            // org chart
            // https://simpsons.fandom.com/wiki/Springfield_Nuclear_Power_Plant
            new() {Id = 1, Name = "Charles Montgomery Burns", Title = "Owner"},
            new() {Id = 2, Name = "Waylon Smithers, Jr.", Title = "Assistant", ManagerId = 1},
            new() {Id = 3, Name = "Lenny Leonard", Title = "Technical Supervisor", ManagerId = 2},
            new() {Id = 4, Name = "Carl Carlson", Title = "Safety Operations Supervisor", ManagerId = 2},
            new() {Id = 5, Name = "Inanimate Carbon Rod", Title = "Rod", ManagerId = 4},
            new() {Id = 6, Name = "Homer Simpson", Title = "Safety Inspector", ManagerId = 5}
        });
}

Now that we’re ready, let’s talk about the query we’d like to execute against our dataset.

Given an employee’s identifier number (id), we want to see all subsequent reports, regardless of the management chain.

If we were to query using our DbContext, we would only ever get one level of reports. The default approach would be expensive and wasteful.

var employee = db.Employees
.Include(e => e.Reports)
.Where(e => e.Id = id)
.ToList();

There must be a better way, right?! Of course, there is! The solution is to utilize Common Table Expressions.

Common Table Expressions (CTEs)

Common Table Expressions is a mechanism available in most relational database engines, where developers are enabled to define temporarily named result sets and reference them within the execution scope of a Select, Insert, Update, or Delete statement. They are popular amongst database developers as they can allow for complex recursive queries, which we’ll use in this post.

Most modern database engines support CTEs, with variations in SQL syntax: Oracle, SQL Server, MySQL, PostgreSQL, SQLite, and MariaDB.

Developers can find an excellent in-depth article on CTE with many examples at DatabaseStar written by Ben Brumm.

Using CTEs With EF Core

The most straightforward way to use CTEs with EF Core is to use the FromSqlRaw extension method. The FromSqlRaw extension method is available on all DbSet properties, with the caveat, our query results will satisfy all properties of our EF entity’s shape. In a previous section, we saw the definition for Employee. Let’s write a CTE that when given an employee Id, the query will return that individual’s information and downline reports.

DECLARE @id int;
SET @id = 1;
WITH organization (id, name, title, managerid, below) AS (
    SELECT id, name, title, managerid, 0
    FROM dbo.Employees    
    WHERE Employees.Id = @id         
    UNION ALL
    SELECT e.id, e.name, e.title, e.managerid, o.below + 1
    FROM dbo.Employees e    
    INNER JOIN organization o 
        ON o.Id = e.ManagerId
)
SELECT * FROM organization

There are a few critical elements in the SQL Query:

  • organization is our CTE
  • We define the shape of our CTE within the parenthesis: (id, name, title, managerid, and below)
  • the column value below is an accumulator used to denote the level at which we are currently in the recursive call.
  • The parameter @id is used to start our recursive call at a particular row, limiting our result set.

Now, let’s move this query into our DbContext using a helper function.

public Task<List<Employee>> AllReports(int id) =>
    Employees.FromSqlRaw(
    @"WITH organization (id, name, title, managerid, below) AS (
        SELECT id, name, title, managerid, 0
        FROM dbo.Employees    
        WHERE Employees.Id = {0}         
        UNION ALL
        SELECT e.id, e.name, e.title, e.managerid, o.below + 1
        FROM dbo.Employees e    
        INNER JOIN organization o 
            ON o.Id = e.ManagerId
    )
    SELECT * FROM organization", id)
    .AsNoTrackingWithIdentityResolution()
    .ToListAsync()
    ;

We take the employee’s id we need the downline reports for and then return the results. To use this method now, we can call it like any other method.

var org = await db.AllReports(id);

Let’s see this code in action in a simple console application.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;
using Spectre.Console;

namespace EntityFrameworkCoreRecursiveCTE
{
    static class Program
    {
        static async Task Main(string[] args)
        {
            var db = new Database();
            
            void AddBranch(Tree tree, TreeNode node, Employee employee)
            {
                var current = node == null
                    ? tree.AddNode($"[yellow]{employee.Name} ({employee.Title})[/]")
                    : node.AddNode($"[blue]{employee.Name}({employee.Title})[/]");

                if (employee.Reports == null)
                    return;

                foreach (var report in employee.Reports)
                {
                    AddBranch(tree, current, report);
                }
            }

            while (true)
            {
                AnsiConsole.Write("Please enter Employee #: ");
                var value = Console.ReadLine();

                if (!int.TryParse(value, out var id)) {
                    Console.WriteLine($"Employee {value} not found.");
                    continue;
                }

                var org = await db.AllReports(id);
                var tree = new Tree("Springfield Nuclear Power Plant");

                if (org.Any())
                {
                    AddBranch(tree, null, org.First());
                    AnsiConsole.Render(tree);
                }
                else
                {
                    Console.WriteLine($"Employee {value} not found.");
                }

            }
        }
    }

    public class Database : DbContext
    {
        public DbSet<Employee> Employees { get; set; }

        public Task<List<Employee>> AllReports(int id) =>
            Employees.FromSqlRaw(
            @"WITH organization (id, name, title, managerid, below) AS (
                SELECT id, name, title, managerid, 0
                FROM dbo.Employees    
                WHERE Employees.Id = {0}         
                UNION ALL
                SELECT e.id, e.name, e.title, e.managerid, o.below + 1
                FROM dbo.Employees e    
                INNER JOIN organization o 
                    ON o.Id = e.ManagerId
            )
            SELECT * FROM organization", id)
            .AsNoTrackingWithIdentityResolution()
            .ToListAsync()
            ;

        protected override void OnModelCreating(ModelBuilder modelBuilder)
        {
            modelBuilder.Entity<Employee>()
                .HasData(new List<Employee>()
                {
                    // org chart
                    // https://simpsons.fandom.com/wiki/Springfield_Nuclear_Power_Plant
                    new() {Id = 1, Name = "Charles Montgomery Burns", Title = "Owner"},
                    new() {Id = 2, Name = "Waylon Smithers, Jr.", Title = "Assistant", ManagerId = 1},
                    new() {Id = 3, Name = "Lenny Leonard", Title = "Technical Supervisor", ManagerId = 2},
                    new() {Id = 4, Name = "Carl Carlson", Title = "Safety Operations Supervisor", ManagerId = 2},
                    new() {Id = 5, Name = "Inanimate Carbon Rod", Title = "Rod", ManagerId = 4},
                    new() {Id = 6, Name = "Homer Simpson", Title = "Safety Inspector", ManagerId = 5}
                });
        }

        protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        {
            optionsBuilder
                .UseSqlServer("server=localhost,11433;database=efcore_cte;user=sa;password=Pass123!;");
        }
    }

    public class Employee
    {
        public int Id { get; set; }
        public string Name { get; set; }
        public string Title { get; set; }
        public Employee Manager { get; set; }
        public int? ManagerId { get; set; }
        public List<Employee> Reports { get; set; }
    }
}

Running this program with an employee Id of 1 will result in the following output.

Please enter Employee #: 1
Springfield Nuclear Power Plant
└── Charles Montgomery Burns (Owner)
    └── Waylon Smithers, Jr.(Assistant)
        ├── Lenny Leonard(Technical Supervisor)
        └── Carl Carlson(Safety Operations Supervisor)
            └── Inanimate Carbon Rod(Rod)
                └── Homer Simpson(Safety Inspector)

We can rerun our application with the Id to 5 results in the following output.

Please enter Employee #: 5
Springfield Nuclear Power Plant
└── Inanimate Carbon Rod (Rod)
    └── Homer Simpson(Safety Inspector)

Problems With CTEs and FromSqlRaw

Generally, FromSqlRaw would allow developers to wrap additional criteria around the initial query. That is not the case for CTEs. Attempting to add any LINQ criteria will result in an InvalidOperationException with the following message.

System.InvalidOperationException: 'FromSqlRaw' or 'FromSqlInterpolated' was called with non-composable SQL and with a query composing over it. Consider calling 'AsEnumerable' after the method to perform the composition on the client side.

The workaround for this issue is to place our CTE SQL within a stored procedure or SQL View. After defining our stored procedure, we can call the same helper method, which we redefined to support the change.

public IQueryable<Employee> AllReports(int id) =>
    Employees.FromSqlRaw("Exec dbo.AllReports {0}", id);

With this change, we can begin to add back LINQ criteria to our C# code.

var result = await db
    .AllReports(1)
    .Where(e => e.Name.Contains("Homer"))
    .ToListAsync();

Conclusion

CTEs are a powerful way to explore recursive data within a SQL-based relational database engine. EF Core supports saving recursive entities, and as we’ve seen in this post, it’s relatively straightforward to model. There are caveats to what FromSqlRaw can accomplish, so if developers need additional LINQ capability, they would be better adding any CTE implementations into a SQL view or stored procedure.

Hopefully, you found this post helpful and fun. As always, thanks for reading, and please share my content if you found it useful.