Entity Framework Core 7 introduced developers to JSON column support. I even wrote about it on the JetBrains .NET blog, and it has been generally well-received. However, recently a user tried my sample project, only to find that they could not filter on a collection of JSON values. Given the current Microsoft SQL Server provider, the following LINQ statement does not work.

var results = await db.Posts
    .Where(x => x.Tags.Any(t => t.Value == "CSharp"))
    .ToListAsync();

Entity Framework Core will throw the following exception.

The LINQ expression 'DbSet<Post>()
    .Where(p => EF.Property<List<Tag>>(p, "Tags")
        .AsQueryable()
        .Any(o => o.Value == "CSharp"))' 
could not be translated.
Either rewrite the query in a form that can be translated, 
or switch to client evaluation explicitly by inserting a 
call to 'AsEnumerable', 'AsAsyncEnumerable', 'ToList', 
or 'ToListAsync'. See https://go.microsoft.com/fwlink/?linkid=2101038 
for more information.

If you’re experiencing this issue and need to filter on a JSON array, I have several SQL queries to help you work around this issue. Let’s get started.

SQL to the Rescue

Luckily, in later versions of Entity Framework, we have access to the FromSql methods that can allow us to seed the From clause with a sub-query. The advantage of this approach is we can still use many of the features of LINQ from a starting result set. But before we get to the C# code, let’s look at some queries against our data model.

var blogPost = new Post
{
    Title = "Writing about .NET",
    Author =
    {
        Name = "Khalid Abuhakmeh",
        ImageUrl = "/khalidabuhakmeh.jpg",
        SocialMediaUrl = "@khalidabuhakmeh@mastodon.social"
    },
    Tags =
    {
        new Tag { Value = "dotnet" },
        new Tag { Value = "drop" }, 
        new Tag { Value = "dot" }
    }
};

Tags are stored as a JSON Column in the following form.

[{"Value":"dotnet"},{"Value":"drop"},{"Value":"dot"}]

So how do you filter a row based on whether a tag exists in your collection of tag values? Well, there are a few ways. I’ve included them below with the Total Cost according to running them on a dataset of 500 rows. Of course, a lower value is better, but check it against your dataset and consult your local DBA.

-- Total Cost: 0.00527086
SELECT *
FROM Posts p
CROSS APPLY OPENJSON(p.Tags)
WITH (Tag VARCHAR(300) '$.Value') as Tag
Where Tag = 'dotnet'

-- Total Cost: 0.0651177
SELECT *
From Posts posts
WHERE (SELECT Tag as tag
       FROM OPENJSON(Tags) WITH (Tag VARCHAR(300) '$.Value') AS Tag
       WHERE Tag = 'dotnet') = 'dotnet'

-- Total Cost: 0.0304298
SELECT *
From Posts posts
WHERE EXISTS 
  (SELECT Tag as tag
   FROM OPENJSON(Tags) WITH (Tag VARCHAR(300) '$.Value') AS Tag
   WHERE Tag = 'dotnet')

-- Total Cost: 0.0297937
SELECT *
From Posts posts
WHERE EXISTS 
 (SELECT Value
  FROM OPENJSON(Tags) WITH (Value VARCHAR(300) '$.Value') AS Value
  WHERE Value in ('dotnet', 'drop'))

I’ve included variants that support multiple tag filtering or several values. Use the one for your use case.

Now that we have a SQL query, how do we use it in Entity Framework Core?

Well, I had mentioned the use of FromSql, and we can do just that.

var values = new object[] { "drop", "dot" };
var placeholders = string.Join(",", values.Select((_, i) => $"{{{i}}}").ToArray());

var list = await db.Posts.FromSqlRaw(
        $"SELECT * From Posts posts WHERE EXISTS (SELECT Value FROM OPENJSON(Tags) WITH (Value VARCHAR(300) '$.Value') AS Value WHERE Value in ({placeholders}))", values)
    .ToListAsync();

I decided to use the Exists versions of my SQL queries, as they are the most straightforward to reason about, and it supports multiple tags. I also needed to parameterize the user input, which should keep the query safe against SQL injection attacks.

You can add additional filters to the base query, like the following.

var list = await db.Posts.FromSqlRaw("<SQL>")
    .Where(p => p.Id == 2)
    .ToListAsync();

The generated SQL query will appear in the console’s output.

info: 5/8/2023 14:21:20.802 RelationalEventId.CommandExecuted[20101] (Microsoft.EntityFrameworkCore.Database.Command) 
      Executed DbCommand (91ms) [Parameters=[p0='drop' (Size = 4000), p1='dot' (Size = 4000)], CommandType='Text', CommandTimeout='30']
      SELECT [c].[Id], [c].[CreateAt], [c].[Permalink], [c].[Title], [c].[UpdatedAt], JSON_QUERY([c].[Author],'$'), JSON_QUERY([c].[Tags],'$')
      FROM (
          SELECT * From Posts posts WHERE EXISTS (SELECT Value FROM OPENJSON(Tags) WITH (Value VARCHAR(300) '$.Value') AS Value WHERE Value in (@p0,@p1))
      ) AS [c]
      WHERE [c].[Id] = 1

Awesome! Well, I have one more option for you, but I do think it’s the inferior option. That said, for the sake of completeness, here it is.

Using Likes To Filter Rows

The solution presented above will use the JSON features of Microsoft SQL Server to accurately filter rows based on the values in a JSON object. If you’re dealing with simpler data, as I am in this post, a LIKE query might be good enough.

var value = "%\"dotnet\"%";
var one = await db.Posts
     .FromSqlRaw("SELECT * FROM Posts WHERE Tags LIKE {0}", value)
     .FirstOrDefaultAsync();

This approach certainly works, but has the disadvantage of using double-sided wildcards, which is known for being less-than optimal when it comes to performance. You also have to deal with creating the filter value with the use of % and " characters.

It’s an option if you need it, but I really do recommend the previous approach.

Conclusion

While Entity Framework Core introduced JSON support, as I wrote about in my JetBrains blog post, it sadly is missing collection filtering. However, using the FromSql variant methods, you can get it working while retaining the strengths of LINQ and query filtering. In addition, you can take the extra step to add much of the functionality mentioned here into an extension method. The one drawback to using FromSql is that it becomes the foundation for all future filtering, which may make it challenging to optimize your LINQ queries without adding more to the original SQL statement. In conclusion, I’m pretty happy with this approach, but looking forward to the day that Entity Framework Core supports this functionality out of the box.

I hope you enjoyed this post, and thank you for reading and sharing my content with others.