New features for unit testing your Entity Framework Core 5 code

This article is about unit testing applications that use Entity Framework Core (EF Core), with the focus on the new approaches you can use in EF Core 5. I start out with an overview of the different ways of unit testing in EF Core and then highlight some improvements brought in EF Core 5, plus some improvements I added to the EfCore.TestSupport version 5 library that is designed to help unit testing of EF Core 5.

NOTE: I am using xUnit in my unit tests. xUnit is well supported in .NET and widely used (EF Core uses xUnit for its testing).

TL;DR – summary

NOTE: In this article I use the term entity class (or entity instance) to refer to a class that is mapped to the database by EF Core.

Setting the scene – why, and how, I unit test my EF Core code

I have used unit testing since I came back to being a developer (I had a long stint as a tech manager) and I consider unit testing one of the most useful tools for producing correct code. And when it comes to database code, which I write a lot of, I started with a repository pattern, which is easy to unit test, but I soon moved away from the repository pattern to using Query Objects. At that point I had to work out how to unit test my code that uses the database. This was critical to me as I wanted my tests to cover as much of my code as possible.

In EF6.x I used a library called Effort which mocks the database, but when EF Core came out, I had to find another way. I tried EF Core’s In-Memory database provider, but the EF Core SQLite database provider using an in-memory database was MUCH better. The SQLite in-memory database (I cover this later) is very easy to use for unit tests, but it has some limitations, so sometimes I have to use a real database.

The reason why I unit test my EF Core code is to make sure it works as I expect. Typical things that I am trying to catch are:

  • Bad LINQ code: EF Core throws an exception if it can’t translate my LINQ query into database code. That will make the unit test fail.
  • Database writes that didn’t work: Sometimes my EF Core create, update or delete didn’t work the way I expected. Maybe I left out an Include or forgot to call SaveChanges. By testing the database after the write, the unit test catches these problems.

While some people don’t like using unit tests on a database, my approach has caught many, many errors which would have been hard to catch in the application. Also, a unit test gives me immediate feedback when I’m writing code and continues to check that code as I extend and refactor it.

The three ways to unit test your EF Core code

Before I start on the new features, here is a quick look at the three ways you can unit test your EF Core code. The figure (which comes from chapter 17 of my book “Entity Framework Core in Action, 2nd edition”) compares three ways to unit test your EF Core code, with the pros and cons of each.

As the figure says, using the same type of database in your unit tests as your application uses is the safest approach – the unit test database accesses will respond just the same as your production system. In that case, why would I also list using an SQLite in-memory database in unit tests? While there are some limitations/differences from other databases, it does have many positives:

  1. The database schema is always up to date.
  2. The database is empty, which is a good starting point for a unit test.
  3. Running your unit tests in parallel works because each database is held locally in each test.
  4. Your unit tests will run successfully in the Test part of a DevOps pipeline without any other settings.
  5. Your unit tests are faster.

NOTE: Item 3, “running your unit tests in parallel”, is an xUnit feature which makes running unit tests much quicker. It does mean you need separate databases for each unit test class if you are using a non in-memory database. The library EfCore.TestSupport has features to obtain unique database names for SQL Server databases.

The last option, mocking the database, really relies on you using some form of repository pattern. I use this for really complex business logic where I build a specific repository pattern for the business logic, which allows me to intercept and mock the database access – see this article for this approach.

The new features in EF Core 5 that help with unit testing

I am going to cover four features that have changed in the EfCore.TestSupport library, either because of new features in EF Core 5 or because of improvements added to the library.

  • Creating a unique, empty test database with the correct schema
  • How to make sure your EF Core accesses match the real-world usage
  • Improved SQLite in-memory options to dispose the database
  • How to check the SQL created by your EF Core code

Creating a unique, empty test database with the correct schema

To use a database in an xUnit unit test it needs to be:

  • Unique to the test class: that’s needed to allow for parallel running of your unit tests
  • Its schema must match the current Model of your application’s DbContext: if the database schema is different to what EF Core thinks it is, then your unit tests aren’t working on the correct database.
  • The database’s data should be a known state, otherwise your unit tests won’t know what to expect when reading the database. An empty database is the best choice.

Here are three ways to make sure a database fulfils these three requirements.

  1. Use an SQLite in-memory database, which is created every time.
  2. Use a unique database name, plus calling EnsureDeleted, and then EnsureCreated.
  3. Use a unique database name, plus calling the EnsureClean method (only works on SQL Server).

1. Use an SQLite in-memory database, which is created every time

The SQLite database has an in-memory mode, which is applied by setting the connection string to “Filename=:memory:”. The database is then held in the connection, which makes it unique to its unit test, and its database hasn’t been created yet. This is quick and easy, but if your production application uses another type of database then it might not work for you.

The EF Core documentation on unit testing shows one way to set up an in-memory SQLite database, but I use the EfCore.TestSupport library’s static method SqliteInMemory.CreateOptions<TContext>, which sets up the options for creating an SQLite in-memory database, as shown in the code below.

[Fact]
public void TestSqliteInMemoryOk()
{
    //SETUP
    var options = SqliteInMemory.CreateOptions<BookContext>();
    using var context = new BookContext(options);

    context.Database.EnsureCreated();

    //Rest of unit test is left out
}

The database hasn’t been created at that point, so you need to call EF Core’s EnsureCreated method at the start. This means you get a database that matches the current Model of your application’s DbContext.

2. Unique database name, plus calling EnsureDeleted, then EnsureCreated

If you are using a normal (non in-memory) database, then you need to make sure the database has a unique name for each test class (xUnit runs test classes in parallel, but methods in a test class are run serially). To get a unique database name the EfCore.TestSupport library has methods that take a base SQL Server connection string from an appsettings.json file and add the class name to the end of the database name (see code after next paragraph, and these docs).

That solves the unique database name, and we solve the “matching schema” and “empty database” requirements by calling the EnsureDeleted method, then the EnsureCreated method. These two methods delete the existing database and create a new database whose schema matches EF Core’s current Model of the database. The EnsureDeleted / EnsureCreated approach works for all databases but is shown with SQL Server here.

[Fact]
public void TestEnsureDeletedEnsureCreatedOk()
{
    //SETUP
    var options = this.CreateUniqueClassOptions<BookContext>();
    using var context = new BookContext(options);
    
    context.Database.EnsureDeleted();
    context.Database.EnsureCreated();

    //Rest of unit test is left out
}

The EnsureDeleted / EnsureCreated approach used to be very slow (~10 seconds) for a SQL Server database, but since the new SqlClient came out in .NET 5 this is much quicker (~1.5 seconds), which makes a big difference to how long a unit test takes to run when using this EnsureDeleted + EnsureCreated approach.

3. Unique database name, plus calling the EnsureClean method (only works on SQL Server)

While I was asking some questions on the EF Core GitHub, Arthur Vickers described a method that could wipe the schema of a SQL Server database. This clever method removes the current schema of the database by deleting all the SQL indexes, constraints, tables, sequences, UDFs and so on in the database. It then, by default, calls the EnsureCreated method to return a database with the correct schema and empty of data.

The EnsureClean method is deep inside EF Core’s unit tests, but I extracted that code and built the other parts needed to make it useful, and it is available in version 5 of the EfCore.TestSupport library. The following listing shows how you use this method in your unit test.

[Fact]
public void TestSqlDatabaseEnsureCleanOk()
{
    //SETUP
    var options = this.CreateUniqueClassOptions<BookContext>();
    using var context = new BookContext(options);
    
    context.Database.EnsureClean();

    //Rest of unit test is left out
}

The EnsureClean approach is faster, maybe twice as fast as the EnsureDeleted + EnsureCreated version, which could make a big difference to how long your unit tests take to run. It is also better in situations where your database server won’t allow you to delete or create new databases but does allow you to read/write a database, for instance if your test databases are on a SQL Server where you don’t have admin privileges.

How to make sure your EF Core accesses match the real-world usage

Each unit test is a single method that has to a) set up the database ready for testing, b) run the code you are testing, and c) check that the results of the code you are testing are correct. And the middle part, running the code, must reproduce the situation in which the code you are testing is normally used. But because all three parts are in one method it can be difficult to create the same state that the tested code is normally used in.

The issue of “reproducing the same state the tested code is normally used in” is common to unit testing, but when testing EF Core code this is made more complicated by the EF Core feature called Identity Resolution. Identity Resolution is critical in your normal code as it makes sure you only have one entity instance of a class type that has a specific primary key (see this example). The problem is that Identity Resolution can make your unit test pass even when there is a bug in your code.

Here is a unit test that passes because of Identity Resolution. The test of the Price at the end of the unit test should fail, because SaveChanges wasn’t called (see the comment in the code below). The reason it passes is that the entity instance in the variable called verifyBook should have been read from the database, but because a tracked entity instance with the same key was found inside the DbContext, that was returned instead of reading from the database.

[Fact]
public void ExampleIdentityResolutionBad()
{
    //SETUP
    var options = SqliteInMemory
        .CreateOptions<EfCoreContext>();
    using var context = new EfCoreContext(options);

    context.Database.EnsureCreated();
    context.SeedDatabaseFourBooks();

    //ATTEMPT
    var book = context.Books.First();
    book.Price = 123;
    // Should call context.SaveChanges()

    //VERIFY
    var verifyBook = context.Books.First();
    //!!! THIS IS WRONG !!! THIS IS WRONG
    verifyBook.Price.ShouldEqual(123);
}

In the past we fixed this with multiple instances of the DbContext, as shown in the following code:

public void UsingThreeInstancesOfTheDbcontext()
{
    //SETUP
    var options = SqliteInMemory         
        .CreateOptions<EfCoreContext>(); 
    options.StopNextDispose();
    using (var context = new EfCoreContext(options)) 
    {
        //SETUP instance
    }
    options.StopNextDispose();   
    using (var context = new EfCoreContext(options)) 
    {
        //ATTEMPT instance
    }
    using (var context = new EfCoreContext(options)) 
    {
        //VERIFY instance
    }
}

But there is a better way to do this with EF Core 5’s ChangeTracker.Clear method. This method quickly removes all entity instances the DbContext is currently tracking. This means you can use one instance of the DbContext, but each stage, SETUP, ATTEMPT and VERIFY, is isolated, which stops Identity Resolution from giving you data from another stage. In the code below there are two potential errors that would slip through if you didn’t add calls to the ChangeTracker.Clear method (or used multiple DbContexts).

  • The Include: if the Include(b => b.Reviews) was left out, the unit test would still pass (because the Reviews collection was set up in the SETUP stage).
  • The SaveChanges: if the call to SaveChanges was left out, the unit test would still pass (because the VERIFY read of the database would have been given the book entity from the ATTEMPT stage).

public void UsingChangeTrackerClear()
{
    //SETUP
    var options = SqliteInMemory
        .CreateOptions<EfCoreContext>();
    using var context = new EfCoreContext(options);

    context.Database.EnsureCreated();             
    var setupBooks = context.SeedDatabaseFourBooks();              

    context.ChangeTracker.Clear();                

    //ATTEMPT
    var book = context.Books                      
        .Include(b => b.Reviews)
        .Single(b => b.BookId == setupBooks.Last().BookId);
    book.Reviews.Add(new Review { NumStars = 5 });

    context.SaveChanges();                        

    //VERIFY
    context.ChangeTracker.Clear();                

    context.Books.Include(b => b.Reviews)         
        .Single(b => b.BookId == setupBooks.Last().BookId)
        .Reviews.Count.ShouldEqual(3);            
}

This is much better than the three separate DbContext instances because:

  1. You don’t have to create the three DbContext scopes (saves typing, shorter unit test).
  2. You can use using var context = …, so no indents (nicer to write, easier to read).
  3. You can still refer to previous parts, say to get an entity’s primary key (see the use of setupBooks in the ATTEMPT and VERIFY stages).
  4. It works better with the improved SQLite in-memory disposable options (see next section)

Improved SQLite in-memory options to dispose the database

You have already seen the SqliteInMemory.CreateOptions<TContext> method earlier, but in version 5 of the EfCore.TestSupport library I have updated it to dispose the SQLite connection when the DbContext is disposed. The SQLite connection holds the in-memory database, so disposing it makes sure that the memory used to hold the database is released.

NOTE: In previous versions of the EfCore.TestSupport library I didn’t do that, and I haven’t had any memory problems. But the EF Core docs say you should dispose the connection, so I updated the SqliteInMemory options methods to implement the IDisposable interface.

It turns out that disposing the DbContext instance disposes the options instance, which in turn disposes the SQLite connection. See the comments at the end of the code.

[Fact]
public void TestSqliteInMemoryOk()
{
    //SETUP
    var options = SqliteInMemory.CreateOptions<BookContext>();
    using var context = new BookContext(options);

    //Rest of unit test is left out
} // context is disposed at end of the using var scope,
  // which disposes the options that was used to create it, 
  // which in turn disposes the SQLite connection

NOTE: If you use multiple instances of the DbContext based on the same options instance, then you need to use one of these approaches to delay the dispose of the options until the end of the unit test.
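For instance, one of those approaches is the StopNextDispose method you saw earlier. Here is a minimal sketch of how that might look with two DbContext instances sharing one options instance (the test bodies are left out):

[Fact]
public void TestTwoContextsOneOptionsOk()
{
    //SETUP
    var options = SqliteInMemory.CreateOptions<BookContext>();
    options.StopNextDispose(); //stops the first context's dispose from disposing the options
    using (var context = new BookContext(options))
    {
        context.Database.EnsureCreated();
        //First stage of the unit test is left out
    }
    using (var context = new BookContext(options))
    {
        //Second stage of the unit test is left out
    } //this dispose also disposes the options, which disposes the SQLite connection
}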

How to check the SQL created by your EF Core code

If I’m interested in the performance of some part of the code I am working on, it is often easier to look at the SQL commands in a unit test than in the actual application. EF Core 5 provides two ways to do this:

  1. The ToQueryString method to use on database queries
  2. Capturing EF Core’s logging output using the LogTo method

1. The ToQueryString method to use on database queries

The ToQueryString method will turn an IQueryable variable built using EF Core’s DbContext into a string containing the database commands for that query. Typically, I output this string to xUnit’s window so I can check it (see the _output.WriteLine call in the code below). The code below also contains a test of the created SQL code – I don’t normally do that, but I put it in so you can see the type of output you get.

public class TestToQueryString
{
    private readonly ITestOutputHelper _output;

    public TestToQueryString(ITestOutputHelper output)
    {
        _output = output;
    }

    [Fact]
    public void TestToQueryStringOnLinqQuery()
    {
        //SETUP
        var options = SqliteInMemory.CreateOptions<BookDbContext>();
        using var context = new BookDbContext(options);
        context.Database.EnsureCreated();
        context.SeedDatabaseFourBooks();

        //ATTEMPT 
        var query = context.Books.Select(x => x.BookId); 
        var bookIds = query.ToArray();                   

        //VERIFY
        _output.WriteLine(query.ToQueryString());        
        query.ToQueryString().ShouldEqual(               
            "SELECT \"b\".\"BookId\"\r\n" +              
            "FROM \"Books\" AS \"b\"\r\n" +              
            "WHERE NOT (\"b\".\"SoftDeleted\")");        
        bookIds.ShouldEqual(new []{1,2,3,4});            
    }
}

2. Capturing EF Core’s logging output using the LogTo method

EF Core 5 makes it much easier to capture the logging that EF Core outputs (before EF Core 5 you needed to create an ILoggerProvider class and register it with EF Core). Now you can add the LogTo method to your options, and it will return a string output for every log. The code below shows how to do this, with the logs output to xUnit’s window.

[Fact]
public void TestLogToDemoToConsole()
{
    //SETUP
    var connectionString = 
        this.GetUniqueDatabaseConnectionString();
    var builder =                                   
        new DbContextOptionsBuilder<BookDbContext>()
        .UseSqlServer(connectionString)             
        .EnableSensitiveDataLogging()
        .LogTo(_output.WriteLine);
    //the builder's Options are used to create the DbContext
    using var context = new BookDbContext(builder.Options);

    // Rest of unit test is left out
}

The LogTo method has lots of different ways to filter and format the output (see the EF Core docs here). I created versions of the EfCore.TestSupport’s SQLite in-memory and SQL Server option builders that use LogTo, and I decided to use a class, called LogToOptions, to manage all the filters/formats for LogTo (LogTo requires calls to different methods). This allowed me to define better defaults for the logging output (defaults to: a) Information log level, not Debug, b) does not include datetime in the output) and makes it easier for the filters/formats to be changed.

I also added a feature that I use a lot: the ability to turn the log output on or off. The code below shows the SQLite in-memory option builder with the LogTo output, plus the ShowLog feature. This only starts outputting logs once the database has been created and seeded – see the logToOptions.ShowLog = true line in the ATTEMPT stage.

[Fact]
public void TestEfCoreLoggingCheckSqlOutputShowLog()
{
    //SETUP
    var logToOptions = new LogToOptions
    {
        ShowLog = false
    };
    var options = SqliteInMemory
         .CreateOptionsWithLogTo<BookContext>(
             _output.WriteLine, logToOptions);
    using var context = new BookContext(options);
    context.Database.EnsureCreated();
    context.SeedDatabaseFourBooks();

    //ATTEMPT 
    logToOptions.ShowLog = true;
    var book = context.Books.Single(x => x.Reviews.Count() > 1);

    //Rest of unit test left out
}

Information for existing users of the EfCore.TestSupport library

It’s no longer possible to detect the EF Core version via netstandard, so now it is done via the first number in the library’s version. For instance, EfCore.TestSupport version 5.?.? works with EF Core 5.?.?. At the same time the library was getting hard to keep up to date, especially with EfSchemaCompare in it, so I took the opportunity to clean up the library.

BUT that clean-up includes BREAKING CHANGES, mainly around the SQLite in-memory option builders. If you use SqliteInMemory.CreateOptions you MUST read this document to decide whether you want to upgrade or not.

NOTE: You may not be aware, but the NuGet packages in your test projects override the same NuGet packages installed in the EfCore.TestSupport library. So, as long as you add the newest versions of the EF Core NuGet libraries, the EfCore.TestSupport library will use those. The only part that won’t run is EfSchemaCompare, but that has now got its own library so you can use it directly.

Conclusion

I just read a great article on the Stack Overflow blog which said:

“Which of these bugs would be easier for you to find, understand, repro, and fix: a bug in the code you know you wrote earlier today or a bug in the code someone on your team probably wrote last year? It’s not even close! You will never, ever again debug or understand this code as well as you do right now, with your original intent fresh in your brain, your understanding of the problem and its solution space rich and fresh.”

I totally agree, and that’s why I love unit tests – I get feedback as I am developing (I also like that unit tests will tell me if I broke some old code). So, I couldn’t work without unit tests, but at the same time I know that unit tests can take a lot of development time. How do I solve that dilemma?

My answer to the extra development time isn’t to write fewer unit tests, but to build a library and develop patterns that make me really quick at unit testing. I also try to make my unit tests fast to run – that’s why I worry about how long it takes to set up the database, and why I like xUnit’s parallel running of unit tests.

The changes to the EfCore.TestSupport library and my unit test patterns due to EF Core 5 are fairly small, but each one reduces the number of lines I have to write for each unit test or makes the unit test easier to read. I think the ChangeTracker.Clear method is the best improvement because it does both – it’s easier to write and easier to read my unit tests.

Happy coding.


How to update a database’s schema without using EF Core’s migrate feature

This article is aimed at developers that want to use EF Core to access the database but want complete control over their database schema. I decided to write this article after seeing the EF Core Community standup covering the EF Core 6.0 Survey Results. In that video there was a page looking at the ways people deploy changes to production (link to video at that point), and quite a few respondents said they use SQL scripts to update their production database.

It’s not clear whether people create those SQL scripts themselves or use EF Core’s Migrate SQL scripts, but Marcus Christensen commented during the Community standup (link to video at that point): “A lot of projects that I have worked on during the years has held the db as the one truth for everything, so the switch to code first is not that easy to sell.”

To put that in context he was saying that some developers want to retain control of the database’s schema and have EF Core match the given database. EF Core can definitely do that, but in practice it gets a bit more complex (I know because I have done this on real-world applications).

TL;DR – summary

  • If you have a database that isn’t going to change much, then EF Core’s reverse engineering tool can create classes to map to the database and the correct DbContext and configurations.
  • If you are changing the database’s schema as the project progresses, then reverse engineering on its own isn’t such a good idea. I cover three approaches to handle this:
    • Option 0: Have a look at EF Core 5’s improved migration feature to check whether it works for you – it will save you time if it can work for your project.
    • Option 1: Use the Visual Studio extension called EF Core Power Tools. This is reverse engineering on steroids and is designed for repeated database schema changes.
    • Option 2: Use the EfCore.SchemaCompare library. This lets you write EF Core code and update the database schema manually, and it tells you where they differ.

Setting the scene – what are the issues of updating your database schema yourself?

If you have a database that isn’t changing, then EF Core’s reverse engineering tool is a great fit. This reads your SQL database and creates the classes to map to the database (I call these classes entity classes) and a class you use to access the database, with EF Core configurations/attributes to define things in EF Core to match your database.

That’s fine for a fixed database, as you can take the code the reverse engineering tool outputs and edit it to work the way you want. You can (carefully) alter the code that the reverse engineering tool produces to get the best out of EF Core’s features, like Value Converters, Owned Types, Query Filters and so on.

The problems come if you are enhancing the database as the project progresses, because while EF Core’s reverse engineering works, some things aren’t so good:

  1. The reverse engineering tool has no way to detect useful EF Core features, like Owned Types, Value Converters, Query Filters, Table-per-Hierarchy, Table-per-Type, table splitting, and concurrency tokens, which means you need to edit the entity classes and the EF Core configurations.
  2. You can’t edit the entity classes or the DbContext class, because they will be replaced the next time you reverse engineer the database. One way around this is to extend an entity class with another class of the same name – that works because the entity classes are marked as partial (see the sketch after this list).
  3. The entity classes have all the possible navigational relationships added, which can be confusing if some navigational relationships would typically not be added because of certain business rules. Also, you can’t change the entity classes to follow a Domain-Driven Design approach.
  4. A minor point: you need to type in the reverse engineering command, which can be long, every time. I only mention this because Option 1 solves it for you.
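To illustrate item 2, here is a minimal sketch of extending a reverse-engineered entity class via a second partial class (the Book class and its members are hypothetical):

//File generated by the reverse engineering tool
//This file will be overwritten the next time you reverse engineer
public partial class Book
{
    public int BookId { get; set; }
    public string Title { get; set; }
}

//Your own file, in the same namespace
//This survives re-running the reverse engineering tool
public partial class Book
{
    public string TitleWithId => $"{Title} ({BookId})";
}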

So, if you want to have complete control over your database you have a few options, one of which I created (see Option 2). I start with a non-obvious approach considering the title of this article – that is, using EF Core to create a migration and tweaking it. I think it’s worth a quick look at this to make sure you’re not taking on more work than you need to – simply skip Option 0 if you are sure you don’t want to use EF Core migrations.

Option 0: Use EF Core’s Migrate feature

I have just finished updating my book Entity Framework Core in Action to EF Core 5 and I completely rewrote many of the chapters from scratch because there was so much change in EF Core (or my understanding of EF Core) – one of those complete rewrites was the chapter on handling database migrations.

I have to say I wasn’t a fan of EF Core’s migration feature, but after writing the migration chapter I’m coming around to using it. Partly that is because I have more experience of real-world EF Core applications, but also some of the new features, like MigrationBuilder.Sql(), give me more control over what the migration does.
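As an example of that control, here is a sketch of a migration’s Up method that mixes EF Core’s normal operations with hand-written SQL via MigrationBuilder.Sql() (the table/column names are hypothetical, and the Down method is left out):

public partial class AddBookSoftDelete : Migration
{
    protected override void Up(MigrationBuilder migrationBuilder)
    {
        //Normal migration operation created by EF Core
        migrationBuilder.AddColumn<bool>(
            name: "SoftDeleted",
            table: "Books",
            nullable: false,
            defaultValue: false);

        //Hand-written SQL added to the migration to fix up existing data
        migrationBuilder.Sql(
            "UPDATE Books SET SoftDeleted = 1 WHERE Title IS NULL");
    }
}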

The EF Core team want you to at least review, and possibly alter, a migration. Their approach is that the migration is rather like a scaffolded Razor Page (ASP.NET Core example), where it’s a good start, but you might want to alter it. There is a great video on EF Core 5 updated migrations and there was a discussion about this (link to video at the start of that discussion).

NOTE: If you decide to use the migration feature and manually alter the migration, you might find Option 2 useful to double check that your migration changes still match EF Core’s Model of the database.

So, you might like to have a look at the EF Core migration feature to see if it might work for you. You don’t have to change much in the way you apply the SQL migration scripts, as the EF Core team recommends applying migrations via scripts anyway.

Option 1: Use Visual Studio’s extension called EF Core Power Tools

In my opinion, if you want to reverse engineer multiple times, then you should use Erik Ejlskov Jensen’s (known as @ErikEJ on GitHub and Twitter) Visual Studio extension called EF Core Power Tools. This allows you to run EF Core’s reverse engineering service via a friendly user interface, but that’s just the start. It provides a ton of options, some not even in EF Core’s reverse engineering, like reverse engineering SQL stored procs. All your options are stored, which makes subsequent reverse engineering just select and click. The extension also has lots of other features, like creating a diagram of your database based on your entity classes and EF Core configuration.

I’m not going to detail all the features in the EF Core Power Tools extension because Erik has done that already. Here are two videos as a starting point, plus a link to the EF Core Power Tools documentation.

So, if you are happy with the general type of output that reverse engineering produces, then the EF Core Power Tools extension is a very helpful tool with extra features over the EF Core reverse engineering tool. EF Core Power Tools is also specifically designed for continuous changes to the database, and Erik used it that way in the company he was working for.

NOTE: I talked to Erik and he said they use a SQL Server database project (.sqlproj) to keep the SQL Server schema under source control, and the resulting SQL Server .dacpac files to update the database and EF Core Power Tools to update the code. See this article for how Erik does this.

Option 2: Use the EfCore.SchemaCompare library

The problem with any reverse engineering approach is that you aren’t fully in control of the entity classes and the EF Core features. Just as developers want complete control over the database, I also want complete control of my entity classes and what EF Core features I can use. As Jeremy Likness said on the EF Core 6.0 survey video when database-first etc. were being talked about: “I want to model the domain properly and model the database properly and then use (EF Core) fluent API to map the two together in the right way” (link to video at that point).

I feel the same, and I built a feature I refer to as EfSchemaCompare – the latest version of this (I have versions going back to EF6!) is in the repo https://github.com/JonPSmith/EfCore.SchemaCompare. This library compares EF Core’s view of the database, based on the entity classes and the EF Core configuration, against any relational database that EF Core supports. That works because, like EF Core Power Tools, I use EF Core’s reverse engineering service to read the database, so there is no extra coding for me to do.

This library allows me (and you) to create my own SQL scripts to update the database while using any EF Core feature I need in my code. I can then run the EfSchemaCompare tool and it tells me if my EF Core code matches the database. If they don’t match, it gives me detailed errors so that I can fix either the database or the EF Core code. Here is a simplified diagram of how EfSchemaCompare works.

The plus side of this is I can write my entity classes any way I like, and normally I use a DDD pattern for my entity classes. I can also use many of the great EF Core features like Owned Types, Value Converters, Query Filters, Table-per-Hierarchy, table splitting, and concurrency tokens in the way I want to. Also, I control the database schema – in the past I have created SQL scripts and applied them to the database using DbUp.

The downside is I have to do more work. The reverse engineering tool or the EF Core migrate feature could do part of the work, but I have decided I want complete control over the entity classes and the EF Core features I use. As I said before, I think the migration feature (and documentation) in EF Core 5 is really good now, but for complex applications, say working with a database that has non-EF Core applications accessing it, then the EfSchemaCompare tool is my go-to solution.

The README file in the EfCore.SchemaCompare repo contains all the documentation on what the library checks and how to call it. I typically create a unit test to check a database – there are lots of options to allow you to provide the connection string of the database you want to test against the entity classes and EF Core configuration provided by your application’s DbContext class.
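Based on the patterns in the README, such a unit test looks something like this sketch (the options setup is left out):

[Fact]
public void CompareDatabaseViaDbContext()
{
    //SETUP
    using var context = new BookContext(_options);
    var comparer = new CompareEfSql();

    //ATTEMPT
    //This compares EF Core's Model of the database with the actual database
    var hasErrors = comparer.CompareEfWithDb(context);

    //VERIFY
    //If there are differences, GetAllErrors lists every one of them
    hasErrors.ShouldBeFalse(comparer.GetAllErrors);
}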

NOTE: The EfCore.SchemaCompare library only works with EF Core 5. There is a version in the EfCore.TestSupport library version 3.2.0 that works with EF Core 2.1 and EF Core 3.?, and the documentation for that can be found in the Old-Docs page. This older version has more limitations than the latest EfCore.SchemaCompare version.

Conclusion

So, if you, like Marcus Christensen, consider “the database as the one truth for everything”, then I have described two (maybe three) options you could use. Taking charge of your database schema/updates is a good idea, but it does mean you have to do more work.

Using EF Core’s migration tool, with the ability to alter the migration, is the simplest, but some people don’t like that. The reverse engineering/EF Core Power Tools approach is the next easiest, as it will write the EF Core code for you. But if you want to really tap into EF Core’s features and/or DDD, then these approaches don’t cut it. That’s why I created the many versions of the EfSchemaCompare library.

I have used the EfSchemaCompare library on real-world applications, and I have also worked on client projects that used EF Core’s migration feature. The migration feature is much simpler, but sometimes it’s too easy, which means you don’t think enough about what the best schema for your database would be. But that’s not the fault of the migration feature, it’s our desire/need to quickly move on – you can change any migration EF Core produces if you want to.

I hope this article was useful to you on your usage of EF Core. Let me know your thoughts on this in the comments.

Happy coding.

My experience of using modular monolith and DDD architectures

This article is about my experience of using a Modular Monolith architecture and Domain-Driven Design (DDD) approach on a small, but complex application using ASP.NET Core and EF Core. This isn’t a primer on modular monolith or DDD (but there is a good summary of each with links) but gives my views of the good and bad aspects of each approach.

  1. My experience of using modular monolith and DDD architectures (this article).
  2. My experience of using the Clean Code architecture with a Modular Monolith.

I’m also interested in architecture approaches that help me to build applications with a good structure, even when I’m under some time pressure to finish – I want to have rules and patterns that help me to “do the right thing” even when I am under pressure. I got to explore this aspect because I was privileged (?!) to be working on a project that was late 😊.

TL;DR – summary

  • The Modular Monolith architecture breaks up the code into independent modules (C# projects) for each of the features needed in your application. Each module only links to other modules that specifically provide services it needs.
  • Domain-Driven Design (DDD) is a holistic approach to understanding, designing and building software applications.
  • I built an application using ASP.NET Core and EF Core using both of these architectural approaches. After the application was finished, I analysed how each approach had worked under time pressure.
  • It was the first time I had used the Modular Monolith architecture, but it came out with flying colours. The code structure consists of 22 projects, and each one (other than the ASP.NET Core front-end) is focused on one specific part of the application’s features.
  • I have used DDD a lot and, as I expected, it worked really well in the application. The classes (called entities by DDD) have meaningfully named methods to update the data, so the code is easy to understand.
  • I also give you my views of the good, bad and possible “cracks” that each approach has.

The architectural approaches covered in this article

At the start of building my application I knew the application would go through three major changes. I therefore wanted architectural approaches that made it easier to enhance the application. Here are the main approaches I will be talking about:

  1. Modular Monolith: building independent modules for each feature.
  2. DDD: Better separation and control of business logic

NOTE: I used a number of other architectures and patterns that I am not going to describe. They are layered architecture, domain events / integration events, and CQRS database pattern to name but a few.

1. Modular Monolith

A Modular Monolith is an approach where you build and deploy a single application (that’s the “Monolith” part), but you build the application in a way that breaks up the code into independent modules for each of the features needed in your application. This approach reduces the dependencies of a module in such a way that you can enhance/change a module without it affecting other modules. The figure below shows 9 projects that implement two different features in the system. Notice they have only one common (small) project.

The benefits of a Modular Monolith approach over a normal Monolith are:

  • Reduced complexity: Each module only links to code that it specifically needs.
  • Easier to refactor: Changing a module has less or no effect on other modules.
  • Better for teams: Easier for developers to work on different parts of the code.

Links to more detailed information on modular monoliths

NOTE: An alternative to a monolith is to go to a Microservice architecture. A Microservice architecture allows lots of developers to work in parallel, but there are lots of issues around communications between each Microservice – read this Microsoft document comparing a Monolith vs. Microservice architecture. The good news is that a Modular Monolith architecture is much easier to convert to a Microservice architecture because it is already modular.

2. Domain-Driven Design (DDD)

DDD is a holistic approach to understanding, designing and building software applications. It comes from the book Domain-Driven Design by Eric Evans. Because DDD covers so many areas, I’m only going to talk about the DDD-styled classes mapped to the database (referred to as entity classes in this article) and give a quick coverage of bounded contexts.

DDD says your entity classes must be in control of the data they contain; therefore, all the properties are read-only, with constructors / methods used to create or change the data in an entity class. That way the entity class’s constructors / methods can ensure a create / update follows the business rules for that entity.

NOTE: The above paragraph is a super-simplification of what DDD says about entity classes, and there is so much more to say. If you are new to DDD then google “DDD entity” and your software language, e.g., C#.

Another DDD term is bounded contexts, which is about separating your application into separate parts with very controlled interaction between the different bounded contexts. For example, my application is an e-commerce web site selling books, and I could see that the display of books was different from the customer ordering some books. There is shared data (like the product number and the price the book is sold for) but other than that they are separate.

The figure below shows how I separated the displaying of books and the ordering of books at the database level.

Using DDD’s bounded context technique, the Books bounded context can change its data (other than the product code and sale price) without it affecting the Orders bounded context, and the Orders code can’t change anything in the Books bounded context.

The benefits of DDD over a non-DDD approach are:

  • Protected business rules: The entity classes’ methods contain most of the business logic.
  • More obvious: entity classes contain methods with meaningful names to call.
  • Reduced complexity: Bounded contexts break an app into separate parts.

Links to more detailed information on DDD

Setting the scene – the application and the time pressure

In 2020 I was updating my book “Entity Framework Core in Action” and my example application was an ASP.NET Core application that sells books, called the Book App. In the early chapters the Book App is very simple, as I am describing the basics of EF Core. But in the last section I build a much more complex Book App that I progressively performance tuned, starting with 700 books, then 100,000 books and finally ½ million books. For the Book App to perform well it went through three significant enhancement stages. Here is an example of the Book App’s features and display, with four different ways to display the books to compare their performance.

At the same time, I was falling behind on getting the book finished. I had planned to finish all the chapters by the end of November 2020 when EF Core 5 was due out. But I only started the enhanced Book App in August 2020 so with 6 chapters to write I was NOT going to finish the book in November. So, the push was on to get things done! (In the end I finished writing the book just before Christmas 2020).

NOTE: The ASP.NET Core application I talk about in this article is available on GitHub. It is in branch Part3 of the repo https://github.com/JonPSmith/EfCoreinAction-SecondEdition and can be run locally.

My experience of each architectural approach

As well as experiencing each architectural approach while upgrading the application (what Neal Ford calls evolutionary architecture), I also had the extra experience of building the Book App under serious time pressure. This type of situation is great for learning whether the approaches worked or not, which is what I am now going to describe. But first here is a summary for you:

  1. Modular Monolith = Brilliant. First time I had used it and I will use it again!
  2. DDD = Love it: I have used it for years and it’s really good.

Here are more details on each of these two approaches, where I point out:

  1. What was good?
  2. What was bad?
  3. How did it fare under time pressure?

1. Modular Monolith

I was already aware of the modular monolith approach, but I hadn’t used it in an application before. My experience was that it worked really well: in fact, it was much better than I thought it would be. I would certainly use this approach again on any medium to large application.

1a. Modular Monolith – what was good?

The first good thing is how the modular monolith compared with the traditional “N-Layer” architecture (shortened to layered architecture). I have used the layered architecture approach many times, usually with four projects, but the modular monolith application has 22 projects – see the figure to the right.

This means I know the code in each project is doing one job, and there are only links to other projects that are relevant to a project. That makes it easier to find and understand the code.

Also, I’m much more inclined to refactor the old code as I’m less likely to break something else. In contrast, the layered architecture on its own has lots of different features in one project, and I can’t be sure what it is linked to, which makes me more reluctant to refactor code as it might affect other code.

1b. Modular Monolith – what was bad?

Nothing was really wrong with the modular monolith, but working out the project naming convention took some time, and it was super important (see the list of projects in the figure above). With the right naming convention, the name told me a) where in the layered architecture the project sat, and b) what the project does (from the end of the name). If you are going to try using a modular monolith approach, I recommend you think carefully about your naming convention.

I didn’t change names too often because of a development tool issue: Visual Studio could rename the project, but the underlying folder name wasn’t changed, which made the GitHub/folder display look wrong. The few renames I did do required me to rename the project, then rename the folder outside Visual Studio, and then hand-edit the solution file, which is horrible!

NOTE: I have since found a really nice tool that will do a project/folder rename and updates the solution/csproj files on applications that use Git for source control.

Also, I learnt not to end a project name with the name of a class, e.g. the Book class, as that caused problems if you referred to the Book class in that project. That’s why you see projects ending with “s”, e.g. “…Books” and “…Orders”.

1c. Modular Monolith – how did it fare under time pressure?

It certainly didn’t slow me down (other than the time spent deliberating over the project naming convention!) and it might have made me a bit faster than the layered architecture because I knew where to go to find the code. If I came back to work on the app after a few months, it would be much quicker to find the code I am looking for, and it would be easier to change without affecting other features.

I did find myself breaking the rules at one point because I was racing to finish the book. The code ended up with 6 different ways to query the database for the book display, and there were a few common parts, like some of the constants used in the sort/filter display and one DTO. Instead of creating a BookApp.ServiceLayer.Common.Books project I just referred to the first project. That’s my bad, but it shows that while the modular monolith approach really helps separate the code, it does rely on the developers following the rules.

NOTE: I had to go back to the Book App to add a feature to two of the book display queries, so I took the opportunity to create a project called BookApp.ServiceLayer.DisplayCommon.Books which holds all the common code. That has removed the linking between each query feature and made it clear what code is shared.

2. Domain-Driven Design

I have used DDD for many years and I find it an excellent approach which focuses on the business (domain) issues rather than the technical aspects of the application. Because I have used DDD so much it is second nature to me, but I will try to define what worked, what didn’t work, etc. in the Book App, with some feedback from working on clients’ applications.

2a. Domain-Driven Design – what was good?

DDD is so massive that I am only going to talk about one of the key aspects I use every day – that is, the entity class. DDD says your entity classes should be in complete control over the data inside them, and their direct relationships. I therefore make all the business classes and the classes mapped to the database have read-only properties and use constructors and methods to create or update the data inside.

Making the entity’s properties read-only means your business logic/validation must be in the entity too. This means:

  • You know exactly where to look for the business logic/validation – it’s in the entity with its data.
  • You change an entity by calling an appropriately named method in the entity, e.g. AddReview(…). That makes it crystal clear what you are doing and what parameters it needs.
  • The read-only aspect means you can ONLY update the data via the entity’s methods. This is DRY, but more importantly it’s obvious where to find the code.
  • Your entity can never be in an invalid state because the constructors / methods will check the create/update and return an error if it would make the entity’s data invalid.

Overall, this means using a DDD entity class is much clearer to create and update than changing a few properties in a normal class. I love the clarity that the named methods provide – it’s obvious what I am doing.
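Here is a minimal sketch of what that looks like – a simplified Book entity, where the Review class and its constructor are hypothetical:

public class Book
{
    private readonly List<Review> _reviews = new List<Review>();

    public int BookId { get; private set; }
    public string Title { get; private set; }

    //Read-only access - the only way to add a review is via AddReview
    public IReadOnlyCollection<Review> Reviews => _reviews.AsReadOnly();

    public void AddReview(int numStars, string comment, string voterName)
    {
        //The business rule lives with the data it protects
        if (numStars < 0 || numStars > 5)
            throw new ArgumentOutOfRangeException(
                nameof(numStars), "The stars must be between 0 and 5.");
        _reviews.Add(new Review(numStars, comment, voterName));
    }
}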

I already gave you an example of the Books and Orders bounded contexts. That worked really well and was easy to implement once I understood how to use EF Core to map a class to a table as if it was a SQL view.
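For those interested, a sketch of one way to do that mapping, assuming a read-only BookView class in the Orders bounded context:

public class OrderDbContext : DbContext
{
    //... constructor and the Orders DbSets are left out

    public DbSet<BookView> BookViews { get; set; }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        //Maps BookView onto the Books table as if it was a SQL view:
        //it is excluded from migrations and the Orders code treats it as read-only
        modelBuilder.Entity<BookView>(b =>
        {
            b.ToView("Books");
            b.HasKey(x => x.BookId);
        });
    }
}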

2b. Domain-Driven Design – what was bad?

One of the downsides of DDD is that you have to write more code. The figure below shows this by comparing a normal (non-DDD) approach on the left against a DDD approach on the right in an ASP.NET Core/EF Core application.

It’s not hard to see that you need to write more code, as the non-DDD version on the left is shorter. It might only be five or ten lines, but that mounts up pretty quickly when you have a real application with lots of entity classes and methods.

Having used DDD for a long time, I have built a library called EfCore.GenericServices which reduces the amount of code you have to write. It does this by replacing the repository and the method call with a service. You still have to write the method in the entity class, but the library reduces the rest of the code to a DTO / ViewModel and a service call. The code below shows how you would use the EfCore.GenericServices library in an ASP.NET Core action method.

[HttpPost]
[ValidateAntiForgeryToken]
public async Task<IActionResult> AddBookReview(AddReviewDto dto, 
    [FromServices] ICrudServicesAsync<BookDbContext> service)
{
    if (!ModelState.IsValid)
    {
        return View(dto);
    }
    await service.UpdateAndSaveAsync(dto);

    if (service.IsValid)
        //Success path
        return View("BookUpdated", service.Message);

    //Error path
    service.CopyErrorsToModelState(ModelState, dto);
    return View(dto);
}

Over the years the EfCore.GenericServices library has saved me a LOT of development time. It can’t do everything, but it’s great at handling all the simple Create, Update and Delete (known as CUD) operations, leaving me to work on more complex parts of the application.

2c. Domain-Driven Design – how did it fare under time pressure?

Yes, DDD does take a bit more time, but it’s (almost) impossible to bypass the way DDD works because of the design. In the Book App it worked really well, but I did use an extra approach known as domain events (see this article about that approach) which made some of the business logic easier to implement.

I didn’t break any DDD rules in the Book App, but on two projects I worked on, the clients found that calling DDD methods for every update didn’t work for their front-end. For instance, one client wanted to use JSON Patch to speed up the front-end (Angular) development.

To handle this I came up with a hybrid DDD style where non-business properties are read-write, but data with business logic/validation has to go through method calls. The hybrid DDD style is a bit of a compromise over DDD, but it certainly speeded up the development for my clients. In retrospect both projects worked well, but I do worry that the hybrid style allows a developer in a hurry to make a property read-write when they shouldn’t. If every entity class is locked down, then no one can break the rules.
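As a sketch, a hybrid DDD-styled entity might look like this (the class and its rules are hypothetical):

public class Booking
{
    public int BookingId { get; private set; }

    //Non-business data: read-write, so the front-end (e.g. JSON Patch) can set it directly
    public string CustomerNotes { get; set; }

    //Business data: read-only, can only be changed via the method below
    public DateTime StartDate { get; private set; }

    public string ChangeStartDate(DateTime newStart)
    {
        if (newStart < DateTime.UtcNow)
            return "The start date cannot be in the past.";
        StartDate = newStart;
        return null; //null means success
    }
}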

Conclusion

I always like to analyse any application I work on after the project is finished. That’s because when I am still working on a project there is often pressure (often from myself) to “get it done”. By analysing a project after it’s finished, or a key milestone is met, I can look back more dispassionately and see if there is anything to learn from the project.

My main take-away from building the Book App is that the Modular Monolith approach was very good, and I would use it again. The modular monolith approach provides small, focused projects. That means I know all the code in the project is doing one job, and there are only links to other projects that are relevant to a project. That makes it easier to understand, and I’m much more inclined to refactor the old code as I’m less likely to break something else.

I would say that the hardest part of using the Modular Monolith approach was working out the naming convention of the projects, which only goes to prove the quote “There are only two hard things in Computer Science: cache invalidation and naming things” is still true😊.

DDD is an old friend to me, and while it needs a few more lines of code to be written, the result is a rock-solid design where every entity makes sure its data is in a valid state. Quite a lot of the validation and simple business logic can live inside the entity class, but business logic that cuts across multiple entities or bounded contexts can be a challenge. I have an approach to handle that, and I also used a new feature I learnt from a client’s project about a year ago that helped too.

I hope this article helps you consider these two architectural approaches, or if you are using these approaches, they might spark some ideas. I’m sure many people have their own ideas or experiences so please do leave comments as I’m sure there is more to say on this subject.

Happy coding.

My experience of using the Clean Code architecture with a Modular Monolith

In this article I look at my use of a Clean Code architecture with the modular monolith architecture covered in the first article. Like the first article, this isn’t a primer on Clean Code and modular monoliths but is more about how I adapted the Clean Code architecture to provide the vertical separation of the features in the modular monolith application.

  1. My experience of using modular monolith and DDD architectures.
  2. My experience of using the Clean Code architecture with a Modular Monolith (this article).

Like the first article, I’m going to give you my impression of the good and bad parts of the Clean Code architecture, plus a look at whether the time pressure of the project (it was about 5 weeks late) made me “break” any rules.

TL;DR – summary

  • The Clean Code architecture is like the traditional layered architecture, but with a series of rules that improve the layering.
  • I built an application using ASP.NET Core and EF Core using the Clean Code architecture with the modular monolith approach. After this application was finished, I analysed how each approach had worked under time pressure.
  • I had used the Clean Code architecture once before on a client’s project, but not with the modular monolith approach.
  • While the modular monolith approach had the biggest effect on the application’s structure, without the Clean Code layers the code would not be so good.
  • I give you my views of the good, bad and possible “cracks under time pressure” for the Clean Code architecture.
  • Overall I think the Clean Code architecture adds some useful rules to the traditional layered architecture, but I had to break one of those rules to make it work with EF Core.

A summary of the Clean Code architecture

NOTE: I don’t describe the modular monolith in this article because I did that in the first article. Here is a link to the modular monolith intro in the first article.

The Clean Code approach (also called the Hexagonal Architecture and Onion Architecture) is a development of the traditional “N-Layer” architecture (shortened to layered architecture). The Clean Code approach talks about “onion layers” wrapped around each other and has the following main rules:

  1. The business classes (typically the classes mapped to a database) are in the inner-most layer of the “onion”.
  2. The inner-most layer of the onion should have no significant external code, e.g. NuGet packages, added to it. This is designed to keep the business logic as clean and simple as possible.
  3. Only the outer layer can access anything outside of the code. That means:
    1. The code that users access, e.g. ASP.NET Core, is in the outer layer.
    2. Any external services, like the database, email sending etc., are in the outer layer.
  4. Code in inner layers can’t reference any outer layers.

The combination of rules 3 and 4 could cause lots of issues, as lower layers need to access external services. This is handled by adding interfaces to the inner-most layer of the onion and registering the external services using dependency injection (DI).
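Here is a minimal sketch of how that works, using a hypothetical email-sending service:

//Inner-most layer: only the interface lives here,
//so inner code can send email without referencing any outer layer
public interface IEmailSender
{
    void SendEmail(string to, string subject, string body);
}

//Outer layer: the implementation that accesses the real external service
public class SmtpEmailSender : IEmailSender
{
    public void SendEmail(string to, string subject, string body)
    {
        //... code that talks to the SMTP server
    }
}

//ASP.NET Core startup: DI links the outer implementation to the inner interface
services.AddTransient<IEmailSender, SmtpEmailSender>();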

The figure below shows how I applied Clean Code to my application, which is an e-commerce web site selling books, called the Book App.

NOTE: I detail the modification that I make to Clean Code approach around the persistence (database) layer later in the article.

Links to more detailed information on clean code (unmodified)

Setting the scene – the application and the time pressure

In 2020, while updating my book "Entity Framework Core in Action", I built an ASP.NET Core application that sells books called the Book App. In the early chapters it is very simple, as I am describing the basics of EF Core, but in the last section I build a much more complex Book App that I progressively performance tuned, starting with 700 books, then 100,000 books and finally ½ million books. For the Book App to perform well it went through three significant enhancement stages. Here is an example of the Book App's features and display, with four different ways to display the books to compare their performance.

At the same time, I was falling behind on getting the book finished. I had planned to finish all the chapters by the end of November 2020 when EF Core 5 was due out. But I only started the enhanced Book App in August 2020 so with 6 chapters to write I was NOT going to finish the book in November. So, the push was on to get things done! (In the end I finished writing the book just before Christmas 2020).

My experience of using Clean Code with a Modular Monolith

I had used a simpler Clean Code architecture on a client's project I worked on, so I had some ideas of what I would do. Clean Code was useful, but it's just another layered architecture with more rules, and I had to break one of its key rules to make it work with EF Core. Overall I think I would use my modified Clean Code architecture again in a larger application.

A. What was good about Clean Code?

To explain how Clean Code helps we need to talk about the main architecture's – the modular monolith's – goals. The modular monolith focuses on features (Kamil Grzybek called them modules). One way to work would be to have one project per feature, but that has some problems.

  • The project would be more complex, as it has everything inside it.
  • You could end up with duplicating some code.

The Separation of Concerns (SoC) principle says breaking a feature up into parts that each focus on one aspect of the feature is a better way to go. So, the combination of modular monolith and using layers provides a better solution. The figure below shows two modular monolith features running vertically, and the five Clean Code layers running horizontally. The figure has a lot in it, but it's there to show:

  • Reduce complexity: A feature can be split up into projects spread across the Clean Code layers, thus making the feature easier to understand, test and refactor.
  • Removing duplication: Breaking up the features into layers stops duplication – feature 1 and 2 share the Domain and Persistence layers.

The importance of the Service Layer in the Clean Code layers

Many years ago, I was introduced to the concept of the Service Layer. There are many definitions of the Service Layer (try this definition), but for me it's a layer that knows about the lower / inner layer data structures and about the front-end data structures, and can adapt between the two (see LHS of the diagram above). So, the Service Layer isolates the lower layers from having to know how the front-end works.

For me a Service Layer is a very important layer.

  • It holds all the business logic and database accesses that the front-end needs, normally provided as services. This makes it much easier to unit test these services.
  • It takes on the job of adapting data to / from the front end. This means it is the only layer that has to care about the two different data structures.

NOTE: Some of my libraries, like EfCore.GenericServices and EfCore.GenericBizRunner, are designed to work as Service Layer type services, i.e. both libraries adapt between the lower / inner layer data structures and the front-end data structures.

Thus, the infrastructure layer, which is just below the Service Layer, contains services that still work with the entity class view. In the Book App these projects contained code to seed the database, handle logging and provide event handling. Services in the Service Layer, on the other hand, work with both the lower / inner layer data structures and the front-end data structures.

To end the "good" part of Clean Code I should say that a layered architecture could also provide the layering that the Clean Code defines. It's just that the Clean Code has some more rules, most of which are useful.

B. Clean Code – what was bad?

The main problem was fitting the EF Core DbContext into the Clean Code architecture. Clean Code says that the database should be on the outer ring, with interfaces for the access. The problem is there is no simple interface that you can use for the application's DbContext. Even if you use a repository pattern (which I don't, and here is why) you have the problem that the application's DbContext has to be defined deep in the onion.

My solution was to put the EF Core DbContext in a layer right next to the inner circle (named Domain) holding the entity classes – I called that layer Persistence, as that's what DDD calls it. That breaks one of the key rules of the Clean Code, but other than that it works fine. For other external services, such as an external email service, I would follow the Clean Code rules and add an interface in the inner (Domain) circle and register the service using DI.

Clean Code – how did it fare under time pressure?

Applying the Clean Code and Modular Monolith architectures together took a little more time to think through (I covered this in this section of the first article), but the end result was very good (I explain that in this section of the first article). The Clean Code layers broke a modular monolith feature into different parts, thus making the feature easier to understand and removing duplicate code.

The one small part of the clean architecture approach I didn't like, but I stuck to, is that the Domain layer shouldn't have any significant external packages, for instance a NuGet library, added to it. Overall, I applaud this rule as it keeps the Domain entities clean, but it did mean I had to do more work when configuring the EF Core code, e.g. I couldn't use EF Core's [Owned] attribute on entity classes (see the sketch below for the alternative). In a larger application I might break that rule.
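
To show what that extra work looks like, here is a minimal sketch of configuring an owned type via EF Core's Fluent API instead of the [Owned] attribute (the Address class and the PublisherAddress property are illustrative, not the Book App's real code):

// Domain layer – a plain class with no EF Core attributes
public class Address
{
    public string Street { get; set; }
    public string City { get; set; }
}

// Persistence layer – the EF Core configuration lives with the DbContext
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    //Same effect as putting [Owned] on the Address class,
    //but the Domain layer stays free of EF Core references
    modelBuilder.Entity<Book>()
        .OwnsOne(book => book.PublisherAddress);
}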

So, I didn't break any Clean Code rules because of the time pressure. The only rules I changed were to make it work with EF Core, but I might break the "Domain layer has no significant external packages" rule in the future.

Conclusion

I don't think the Clean Code approach had as big an effect on the structure as the modular monolith did (read the first article), but Clean Code certainly added to the structure by breaking modular monolith features into smaller, focused projects. The combination of the two approaches gave a really good structure.

My question is: does the Clean Code architecture provide good improvements over a traditional layered architecture, especially as I had to break one of its key rules to make it work with EF Core? My answer is that using the Clean Code approach has made me a bit more aware of how I organise my layers, for instance I now have an infrastructure layer that I didn't have before, and I appreciate that.

Please feel free to comment on what I have written about. I’m sure there are lots of people who have more experience with the Clean Code architecture than me, so you can give your experience too.

Happy coding.

Five levels of performance tuning for an EF Core query

$
0
0

This is a companion article to the EF Core Community Standup called “Performance tuning an EF Core app” where I apply a series of performance enhancements to a demo ASP.NET Core e-commerce book selling site called the Book App. I start with 700 books, then 100,000 books and finally ½ million books.

This article, plus the EF Core Community Standup video, pulls information from chapters 14 to 16 from my book “Entity Framework Core in Action, 2nd edition” and uses code in the associated GitHub repo https://github.com/JonPSmith/EfCoreinAction-SecondEdition.

NOTE: You can download the code and run the application described in this article/video via the https://github.com/JonPSmith/EfCoreinAction-SecondEdition GitHub repo. Select the Part3 branch and run the project called BookApp.UI. The home page of the Book App has information on how to change the Book App’s settings for chapter 15 (four SQL versions) and chapter 16 (Cosmos DB).

Other articles that are relevant to the performance tuning shown in this article

TL;DR – summary

  • The demo e-commerce book selling site displays books with the various sort, filter and paging features that you might expect to need. One of the hardest of the queries is sorting the books by their average votes (think Amazon's star ratings).
  • At 700 books a well-designed LINQ query is all you need.
  • At 100,000 books (and ½ million reviews) LINQ on its own isn't good enough. I add three new ways to handle the book display, each one improving performance, but each also takes more development effort.
  • At ½ million books (and 2.7 million reviews) SQL on its own has some serious problems, so I swap to a Command Query Responsibility Segregation (CQRS) architecture, with the read-side using a Cosmos DB database (Cosmos DB is a NoSQL database).
  • The use of Cosmos DB with EF Core highlights
    • How Cosmos DB is different from a relational (SQL) database
    • The limitations in EF Core’s Cosmos DB database provider
  • At the end I give my view of performance gain against development time.

The Book App and its features

The Book App is a demo e-commerce site that sells books. In my book "Entity Framework Core in Action, 2nd edition" I use this Book App as an example of using various EF Core features. It starts out with about 50 books in it, but in Part3 of the book I spend three chapters on performance tuning and take the number of books up to 100,000 and then to ½ million. Here is a screenshot of the Book App running in "Chapter 15" mode, where it shows four different modes of querying a SQL Server database.

The Book App query which I improve has the following Sort, Filter and Page features

  • Sort: Price, Publication Date, Average votes, and primary key (default)
  • Filter: By Votes (1+, 2+, 3+, 4+), By year published, By tag, (defaults to no filter)
  • Paging: Num books shown (default 100) and page num

Note that a book can be soft deleted, which means there is always an extra filter on the books shown.

The book part of the database (the part of the database that handles orders isn’t shown) looks like this.

First level of performance tuning – Good LINQ

One way to load a Book with its relationships is by using Includes (see code below)

var books = context.Books
    .Include(book => book.AuthorsLink              //loads the book-author linking rows…
        .OrderBy(bookAuthor => bookAuthor.Order)) 
            .ThenInclude(bookAuthor => bookAuthor.Author) //…and from them the Authors
    .Include(book => book.Reviews)                 //loads all the reviews for each book
    .Include(book => book.Tags)                    //loads the tags linked to each book
    .ToList();

But that isn't the best way to load books if you want good performance. That's because a) you are loading a lot of data that you don't need and b) you would need to do the sorting and filtering in software, which is slow. So here are my five rules for building fast, read-only queries.

  1. Don't load data you don't need, e.g. use the Select method to pick out only what is needed.
    See lines 18 to 24 of my MapBookToDto class.
  2. Don’t Include relationships but pick out what you need from the relationships.
    See lines 25 to 30 of my MapBookToDto class.
  3. If possible, move calculations into the database.
    See lines 13 to 34 of my MapBookToDto class.
  4. Add SQL indexes to any property you sort or filter on.
    See the configuration of the Book entity.
  5. Add the AsNoTracking method to your query (or don't load any entity classes).
    See line 29 in the ListBookService class

NOTE: Rule 3 is the hardest to get right. Just remember that some SQL commands, like Average (SQL AVG), can return null if there are no entries, which needs a cast to a nullable type to make it work.
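
Here is a minimal sketch of that nullable cast inside a Select projection (the property names are illustrative):

var booksWithVotes = context.Books
    .Select(book => new
    {
        book.Title,
        //cast to double? because SQL AVG returns null
        //when a book has no reviews
        AverageVotes = book.Reviews
            .Select(review => (double?)review.NumStars)
            .Average()
    })
    .ToList();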

So, combining the Select, Sort, Filter and paging my code looks like this.

public async Task<IQueryable<BookListDto>> SortFilterPageAsync
    (SortFilterPageOptions options)
{
    var booksQuery = _context.Books 
        .AsNoTracking() 
        .MapBookToDto() 
        .OrderBooksBy(options.OrderByOptions) 
        .FilterBooksBy(options.FilterBy, options.FilterValue); 

    await options.SetupRestOfDtoAsync(booksQuery); 

    return booksQuery.Page(options.PageNum - 1, 
        options.PageSize); 
}

Using these rules will start you off with a good LINQ query, which is a great starting point. The next sections cover what to do if that doesn't give you the performance you want.

When the five rules aren’t enough

The query above is going to work well when there aren't many books, but in chapter 15 I create a database containing 100,000 books with 540,000 reviews. At this point the "five rules" version has some performance problems, so I create three new approaches, each of which a) improves performance and b) takes more development effort. Here is a list of the four approaches, with the Good LINQ version as our base performance version.

  1. Good LINQ: This uses the "five rules" approach. We compare all the other versions to this query.
  2. SQL (+UDFs): This combines LINQ with SQL UDFs (user-defined functions) to move the concatenations of the Author's names and Tags into the database (see the sketch after this list).
  3. SQL (Dapper): This creates the required SQL commands and then uses the Micro-ORM Dapper to execute that SQL to read the data.
  4. SQL (+caching): This pre-calculates some of the costly query parts, like the averages of the Review’s NumStars (referred to as votes).
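
To give you a feel for the SQL (+UDFs) technique, here is a minimal sketch of mapping a scalar UDF to a static method so it can be used in a LINQ query. It assumes a UDF called AuthorsStringUdf already exists in the database – this is my illustration, not the Book App's actual code.

public static class UdfMappings
{
    //Maps to a SQL scalar UDF – the body is never run in .NET
    public static string AuthorsStringUdf(int bookId)
        => throw new NotSupportedException();
}

// In your DbContext's OnModelCreating method
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    modelBuilder.HasDbFunction(
        () => UdfMappings.AuthorsStringUdf(default));
}

// EF Core now translates this into a call to the UDF in the database
var books = context.Books
    .Select(book => new
    {
        book.Title,
        AuthorsString = UdfMappings.AuthorsStringUdf(book.BookId)
    })
    .ToList();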

In the video I describe how I built each of these queries and show the performance for the hardest query, which is sort by review votes.

NOTE: The SQL (+caching) version is very complex, and I skipped over how I built it, but I have an article called “A technique for building high-performance databases with EF Core” which describes how I did this. Also, chapter 15 on my book “Entity Framework Core in Action, 2nd edition” covers this too.

Here is a chart I showed in the video which provides performance timings for three queries, from the hardest (sort by votes) down to a simple query (sort by date).

The other chart I showed was a breakdown of the parts of the simple query, sort by date. I wanted to show this to point out that Dapper (which is a micro-ORM) is only significantly faster than EF Core if you can write better SQL than EF Core produces.

Once you have a performance problem just taking a few milliseconds off isn't going to be enough – typically you need to cut its time by at least 33% and often more. Therefore, using Dapper to shave a few milliseconds off EF Core isn't worth the development time. So, my advice is to study the SQL that EF Core creates and, if you know a way to improve the SQL, then Dapper is a good solution.
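
For anyone who hasn't used Dapper, here is a minimal sketch of the pattern (it uses the Dapper and Microsoft.Data.SqlClient packages; the SQL and the DTO shape are illustrative, not the Book App's actual query):

using var connection = new SqlConnection(connectionString);

//Dapper runs the given SQL and maps each row onto a BookListDto
var books = await connection.QueryAsync<BookListDto>(
    @"SELECT TOP 100 b.BookId, b.Title, b.Price
      FROM Books b
      ORDER BY b.PublishedOn DESC");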

Going bigger – how to handle ½ million or more books

In chapter 16 I build what is called a Command Query Responsibility Segregation (CQRS) architecture. The CQRS architecture acknowledges that the read side of an application is different from the write side. Reads are often complicated, drawing in data from multiple places, whereas in many applications (but not all) the write side can be simpler, and less onerous. This is true in the Book App.

To build my CQRS system I decided to make the read-side live in a different database to the write-side of the CQRS architecture, which allowed me to use a Cosmos DB for my read-side database. I did this because Cosmos DB is designed for performance (speed of queries) and scalability (how many requests it can handle). The figure below shows this two-database CQRS system.

The key point is that the data saved in the Cosmos DB has as many of the calculations as possible pre-calculated, rather like the SQL (+caching) version – that's what the projection stage does when a Book or its associated relationships are updated.

If you want to find out how to build a two-database CQRS system using Cosmos DB then my article Building a robust CQRS database with EF Core and Cosmos DB describes one way, while chapter 16 of my book provides another way using events.

Limitations using Cosmos DB with EF Core

It was very interesting to work with Cosmos DB via EF Core as there were two parts to deal with:

  • Cosmos DB is a NoSQL database and works differently to a SQL database (read this Microsoft article for one view)
  • The EF Core 5 Cosmos DB database provider has many limitations.

I had already looked at these two parts back in 2019 and written an article, which I have updated to EF Core 5 and renamed to "An in-depth study of Cosmos DB and the EF Core 3 to 5 database provider".

Some of the issues I encountered, listed with the issues that made the biggest change to my Book App first, are:

  • EF Core 5 limitation: Counting the number of books in Cosmos DB is SLOW!
  • EF Core 5 limitation: EF Core 5 cannot do subqueries on a Cosmos DB database.
  • EF Core 5 limitation: No relationships or joins.
  • Cosmos difference: Complex queries might need breaking up
  • EF Core 5 limitation: Many database functions not implemented.
  • Cosmos difference: Skip is slow and expensive.
  • Cosmos difference: By default, all properties are indexed.

I'm not going to go through all of these – the "An in-depth study of Cosmos DB and the EF Core 3 to 5 database provider" article covers most of them.

Because of the EF Core limitation on counting books, I changed the way that paging works. Instead of picking which page you want, you have a Next/Prev approach, like Amazon uses (see the figure after the list of query approaches). And to allow a balanced performance comparison between the SQL versions and the Cosmos DB version I added the best two SQL approaches, but turned off counting too (SQL is slow on that).

It also turns out that Cosmos DB can count very fast, so I built another way to query Cosmos DB using its NET (pseudo) SQL API. With this the Book App had four query approaches.

  1. Cosmos (EF): This accesses the Cosmos DB database using EF Core (with some parts using the SQL database where EF Core didn't have the features to implement parts of the query).
  2. Cosmos (Direct): This uses Cosmos DB's NET SQL API and I wrote raw commands – a bit like using Dapper for SQL (see the sketch after this list).
  3. SQL (+cacheNC): This uses the SQL cache approach using the 100,000 books version, but with counting turned off to compare with Cosmos (EF).
  4. SQL (DapperNC): This uses Dapper, which has the best SQL performance, but with counting turned off to compare with Cosmos (EF).
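
As a flavour of the Cosmos (Direct) approach, here is a minimal sketch using the Microsoft.Azure.Cosmos package (the database / container names and the SQL are illustrative, not the Book App's actual code):

var container = cosmosClient.GetContainer("BookAppDb", "Books");

//Run a raw Cosmos SQL query and stream back the results
var iterator = container.GetItemQueryIterator<BookListDto>(
    new QueryDefinition("SELECT TOP 100 * FROM c ORDER BY c.PublishedOn DESC"));
while (iterator.HasMoreResults)
{
    foreach (var book in await iterator.ReadNextAsync())
    {
        //... use the book data
    }
}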

The following figure shows the Book App in CQRS/Cosmos DB mode with the four query approaches, and the Prev/Next paging approach.

Performance of the CQRS/Cosmos DB version

To test the performance, I used an Azure SQL Server and Cosmos DB service from a local Azure site in London. To compare the SQL performance and the Cosmos DB performance I used databases with a similar cost (and low enough it didn’t cost me too much!). The table below shows what I used.

Database type       Azure service name    Performance units       Price/month
Azure SQL Server    Standard              20 DTUs                 $37
Cosmos DB           Pay-as-you-go         manual scale, 800 RUs   $47

I did performance tests on the Cosmos DB queries while I was adding books to the database to see if the size of the database affected the performance. It's hard to get a good test of this as there is quite a bit of variation in the timings.

The chart below compares EF Core calling Cosmos DB, referred to as Cosmos (EF), against using direct Cosmos DB commands via its NET SQL API – referred to as Cosmos (Direct).

This chart (and other timing I took) tells me two things:

  • The increase in the number of books in the database doesn't have much effect on the performance (the Cosmos (Direct) 250,000 result is well within the variation)
  • Counting the Books costs ~25 ms, which is much better than the SQL count, which added about 150 ms.

The important performance test was to look at Cosmos DB against the best of our SQL accesses. I picked a cross-section of sorting and filtering queries and ran them on all four query approaches – see the chart below.

From the timings in the figure above, here are some conclusions.

  1. Even the best SQL version, SQL (DapperNC), doesn’t work in this application because any sort or filter on the Reviews took so long that the connection timed out at 30 seconds.
  2. The SQL (+cacheNC) version was at parity or better with Cosmos DB (EF) on the first two queries, but as the query got more complex it fell behind in performance.
  3. The Cosmos DB (Direct) version, with its book count, was ~25% slower than the Cosmos DB (EF) version with no count, but is twice as fast as the SQL count versions.

Of course, there are some downsides of the CQRS/Cosmos DB approach.

  • The add and update of a book to the Cosmos DB takes a bit longer: this is because the CQRS requires four database accesses (two to update the SQL database and two to update the Cosmos database) – that adds up to about 110 ms, which is more than double the time a single SQL database would take. There are ways around this (see this part of my article about CQRS/Cosmos DB) but it takes more work.
  • Cosmos DB takes longer and costs more if you skip items in its database. This shouldn’t be a problem with the Book App as many people would give up after a few pages, but if your application needs deep skipping through data, then Cosmos DB is not a good fit.

Even with the downsides I still think CQRS/Cosmos DB is a good solution, especially when I add in the fact that implementing this CQRS was easier and quicker than building the original SQL (+cache) version. Also, the Cosmos concurrency handling is easier than the SQL (+cache) version.

NOTE: What I didn't test is Cosmos DB's scalability or the ability to have multiple versions of the Cosmos DB around the world. Mainly because it's hard to do and it costs (more) money.

Performance against development effort

In the end it's a trade-off of a) performance gain and b) development time. I have tried to summarise this in the following table, giving a number from 1 to 9 for difficulty (Diff? in the table) and performance (Perf? in the table).

The other thing to consider is how much more complexity your performance tuning adds to your application. Badly implemented performance tuning can make an application harder to enhance and extend. That is one reason why I like the event approach I used on the SQL (+cache) and CQRS / Cosmos DB approaches, because it makes the least changes to the existing code.

Conclusion

As a freelance developer/architect I have had to performance tune many queries, and sometimes writes, on real applications. That's not because EF Core is bad at performance, but because real-world applications have a lot of data and lots of relationships (often hierarchical) and it takes some extra work to get the performance the client needs.

I have already used a variation of the SQL (+cache) approach on a client's app to improve the performance of their "has the warehouse got all the parts for this job?" feature. And I wish Cosmos DB had been around when I built a multi-tenant service that needed to cover the whole of the USA.

Hopefully something in this article and video will be useful if (when!) you need to performance tune your application.

NOTE: You might like to look at the article “My experience of using modular monolith and DDD architectures” and its companion article to look at the architectural approaches I used on the Part3 Book App. I found the Modular Monolith architectural approach really nice.

Happy coding.

Using PostgreSQL in development: Part 1 – Installing PostgreSQL on Windows

$
0
0

I started work on a library that uses PostgreSQL, which means I need a PostgreSQL database to test it against. The best solution is to run a PostgreSQL server on my Windows development PC, as that gives me fast, free access to a PostgreSQL database.

There are three good ways to run a PostgreSQL server on Windows: running a Windows version of PostgreSQL, using docker, or installing PostgreSQL in the Windows Subsystem for Linux, known as WSL. I used the WSL approach because it's supposed to be quicker than the Windows version, and this article describes what I had to do to make PostgreSQL run properly.

I didn't think installing PostgreSQL would be simple, and it wasn't; I had to do a lot of Googling to find the information I needed. The information was scattered, and some of it was out of date, which is why I decided to write an article listing the steps to get PostgreSQL working on Windows. The second article will then cover how to access a PostgreSQL database with Entity Framework Core and unit test against a PostgreSQL database.

  • Using PostgreSQL in dev: Part 1 – Installing PostgreSQL on Windows (this article)
  • Using PostgreSQL in dev: Part 2 – Testing against a PostgreSQL database (coming soon)

TL;DR; – The steps to installing PostgreSQL on Windows 10

  1. Installing WSL2 on Windows, which provides a Linux subsystem running on Windows. This was easy and painless.
  2. Installing PostgreSQL in the Linux subsystem. This was easy too.
  3. Starting / stopping the PostgreSQL software on Linux. You just have to remember about five Linux commands.
  4. Getting the Windows pgAdmin app up and running. This is where I had a problem and it took quite a while to work out what was wrong and how to fix it.
  5. Creating a PostgreSQL server via pgAdmin. Had to be careful to get the correct values set up, but once I had done that once it remembers it for next time.
  6. See the second article on how to access a PostgreSQL database from a .NET application running on Windows, and some suggestion on how to unit test against a real PostgreSQL database.

Setting the scene – why am I installing PostgreSQL

I’m a software developer, mainly working on ASP.NET Core and EF Core, and I use Windows as my development system. Up until now I have only used PostgreSQL via Azure, but I wanted a local PostgreSQL server to unit test for my library AuthPermissions.AspNetCore which will support SQL Server or PostgreSQL.

I haven't used Linux before and I don't really want to learn Linux unless I need to, so I wanted to find ways to use Windows commands as much as possible. Using WSL does need me to learn a few Linux commands, such as how to start / stop the PostgreSQL code, and there is a PostgreSQL Linux app called psql which uses terminal commands to manage the PostgreSQL server etc. But I found an application called pgAdmin that is like Microsoft's SSMS (SQL Server Management Studio) application, with a nice GUI front-end, and it even has a Windows version. That means I only need to learn a few Linux commands and can manage PostgreSQL from Windows, which is great.

Here are the steps I took to install PostgreSQL using Windows Subsystem for Linux.

1. Installing WSL2 Linux (Ubuntu) on Windows 10/11

Installing WSL was easy and painless for me, and the Microsoft WSL documentation was excellent. You must be running Windows 10 version 2004 and higher (Build 19041 and higher) or Windows 11; then run the command wsl --install via PowerShell or Windows Command Prompt run as administrator.

You then need to restart your PC, and after a long time the Linux subsystem will ask for a username and password. You will need the password every time you start the Linux system, so make sure you remember it.

A couple of extra points:

2. Installing PostgreSQL on Linux

Again, I found a great Microsoft document on how to install PostgreSQL (and other databases) in your Linux Subsystem. The list below shows the things I did to install PostgreSQL.

  1. Update your Ubuntu packages: sudo apt update
  2. Install PostgreSQL (and the -contrib package which has some helpful utilities) with: sudo apt install postgresql postgresql-contrib
  3. Set the PostgreSQL password: sudo passwd postgres.
    The nice thing about that is you can change the password any time if you forget it. This password is used whenever you access PostgreSQL’s servers and databases.

TIP: Linux doesn't use ctrl-c / ctrl-v for copy/paste. I found that I could use the normal ctrl-c to copy on Windows, and then if I right-clicked in Linux it would paste – that allowed me to copy commands from the docs into Linux.

3. Starting / stopping PostgreSQL on Linux

The previous Microsoft document on how to install PostgreSQL also provides the three commands you need once PostgreSQL is installed:

  • Checking the status of your database: sudo service postgresql status
    It says online if running, or down if stopped.
  • Start running your database: sudo service postgresql start
  • Stop running your database: sudo service postgresql stop

TIP: You don’t need to have the Linux subsystem window open to access PostgreSQL once you have started – the Linux subsystem will stay running in the background.

4. Getting the Windows pgAdmin app up and running

I found the Windows version of pgAdmin here and installed it. The pgAdmin app started OK, but I immediately had a problem connecting to the PostgreSQL server on the Linux subsystem. When I tried to create a PostgreSQL server, I got the error "password authentication failed for user "postgres"" – see below

Copyright: Stack Overflow – see  this stack overflow question.

This was frustrating, and it took quite a while to work out what the problem was. I tried a number of things, but in the end it turns out that the authentication mode of PostgreSQL doesn't work with the connection from Windows (see this stack overflow answer) and you need to edit PostgreSQL's pg_hba.conf file to change what is known as the auth-method (see the PostgreSQL docs on the pg_hba.conf file) from md5 to trust for the IP local connections. See the code below, which shows the new version – NOTE you need to scroll to the right to see the change from md5 to trust.

# TYPE  DATABASE        USER            ADDRESS                 METHOD
# IPv4 local connections:
host    all             all             127.0.0.1/32            trust
# IPv6 local connections:
host    all             all             ::1/128                 trust

The problem with the stack overflow answers I found is they were written before WSL was around, so they assume you are running PostgreSQL directly from Windows. To get to the files in the Linux subsystem I needed to use the \\wsl$ prefix to access the WSL files. In the end I found PostgreSQL's pg_hba.conf at \\wsl$\Ubuntu\etc\postgresql\12\main (note that the \12\ part refers to the version of the PostgreSQL application).

NOTE: Editing the pg_hba.conf file needs a high level of access, but I found VSCode could edit it, or you can edit the file via NotePad run as administrator.

5. Creating a PostgreSQL server

Once the password trust problem was fixed, I tried again to create a PostgreSQL server and it worked. When the pgAdmin application is started it asks for a password, which can be anything you like (and you can change it easily). This password lets you save PostgreSQL passwords, which is useful.

To create a PostgreSQL server, I clicked the Quick Link -> Add New Server link on the main page, which then showed a Create – Server popup with various tabs. I had to fill in the following tabs/fields.

  1. On the General tab you need to fill the Name field with your chosen name. I used PostgreServer
  2. On the Connection tab you need to set
    • Host name/address: 127.0.0.1
    • Password: the same password as given in step 2 when setting the PostgreSQL password
    • Save password: worth turning on

You now have a PostgreSQL server, but no databases other than the system database called postgres. You can create databases using pgAdmin, but in the next article I show how your application and unit tests can create databases.

NOTE: I found this article by Chloe Sun which has pictures of the pgAdmin pages, which might be useful.
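
As a taste of what the next article covers, here is a minimal sketch of connecting to that PostgreSQL server from .NET using the Npgsql EF Core provider (the MyDbContext class, database name and password are placeholders, and it assumes the Npgsql.EntityFrameworkCore.PostgreSQL NuGet package is installed):

var connectionString =
    "Host=127.0.0.1;Database=MyTestDb;Username=postgres;Password=your-password";

var options = new DbContextOptionsBuilder<MyDbContext>()
    .UseNpgsql(connectionString)
    .Options;

using var context = new MyDbContext(options);
context.Database.EnsureCreated(); //creates the database if it doesn't exist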

Conclusion

Most of the steps were fairly easy until I hit step 4, at which point I had to do a lot of digging to find the solution. The Microsoft documents were great, but they didn't provide all the information I needed. Thanks to all the people that answered Stack Overflow questions and the people who wrote articles about this, especially Chloe Sun, who wrote a similar article to this one.

The next article shows you how to access a PostgreSQL database using EF Core and describes the unit test methods I added to my EfCore.TestSupport 6.0.0-preview001. These unit test helpers allow you to create a unique PostgreSQL database for each xUnit test class.

Happy coding.

How to safely apply an EF Core migrate on ASP.NET Core startup

$
0
0

There are many ways to migrate a database using EF Core, and one of the most automatic approaches is to call EF Core's MigrateAsync method on startup of your ASP.NET Core application – that way you don't forget it. But the "migrate on startup" approach has a big problem if your ASP.NET Core app is running multiple instances (known as Scale Out in Azure and Horizontal Scaling in Amazon). That's because trying to apply multiple migrations at the same time doesn't work – either it will fail or, worse, it might cause data corruption in your database.

This article describes a library designed to update global resources, e.g. a database, on startup of your application, and it handles applications that have multiple instances running. Migration is one thing it can do, but as you will see there are other uses, such as adding an admin user when you first deploy your application.

This open-source library is available as the NuGet package called Net.RunMethodsSequentially and the code is available on https://github.com/JonPSmith/RunStartupMethodsSequentially. I use the shorter name of RunSequentially to refer to this library.

TL;DR; – Summary of this article

  • The RunSequentially library manages the execution of the code you run on the startup of your ASP.NET Core application. You only need this library when you have multiple instances running on your web host, because it will make sure your startup code, which runs in every instance, isn't run at the same time.
  • The library uses a lock on a global resource (i.e. a resource all the app instances can access) so that your startup code is executed serially, and never in parallel. This means you can migrate and/or seed your database without nasty things happening, e.g. running multiple EF Core Migrations at the same time could (will!) cause problems.
  • But be aware, each instance of your application will run your startup code; the library just guarantees they won't run at the same time. That means if you are seeding a database you need to check the data hasn't already been added.
  • To use the RunSequentially library you have to do three things:
    • You write the code you want to run within a global lock – these are referred to as startup services.
    • Select a global resource you can lock on – a database is a good fit, but if it's not yet created you can fall back to locking on a FileSystem Directory, like ASP.NET Core's wwwRoot directory.
    • You add code to the ASP.NET Core  configuration code in the Program class (net6) to register and configure the RunSequentially setup.
  • Because the RunSequentially library uses ASP.NET Core's HostedService it's run before the main host starts, which is perfect. But a HostedService doesn't give good feedback if you have an exception in the RunSequentially code. To help with this I have included a tester class in the library which allows you to check your code / configuration works before you publish your application to production.

Setting the scene – why did I create this library?

Back in December 2018 I wrote an article called “A better way to handle authorization in ASP.NET Core” that described several additions I added to ASP.NET Core to provide management of users in a large SaaS (Software as a Service) I created for a client. This article is still the top article on my blog three years on, and many people have used this approach.

Many developers asked me to create a library containing the "better ways" features, but there were a few parts that I didn't know how to do in a generic library that anyone could use. One of those parts was adding setup data to the database when multiple instances of ASP.NET Core were running. But in March 2021 GitHub user @zejji provided a solution I hadn't known about using the DistributedLock library.

The DistributedLock library made it possible to create a "better ways" library, which is called AuthPermissions.AspNetCore (shortened to AuthP). The AuthP library is very complex, so I released version 1 in August 2021, which only works with single instances of ASP.NET Core, knowing I had a plan to handle multiple instances. I am currently working on version 2, which uses the RunSequentially library.

Rather than put the RunSequentially code directly into the AuthP library I created its own library, which means other developers can use the RunSequentially library to use the "migrate on startup" approach in applications that have multiple instances.

How the RunSequentially library works, and what to watch out for

The library allows you to create services, which I refer to as startup services, that are run sequentially on startup of your application. These startup services are run within a DistributedLock global lock, which means your startup services can't all run at the same time but are run sequentially. This stops the problem of multiple instances of the application trying to update one common resource at the same time.

But be aware, every startup service will be run on every application instance, for example if your application is running three instances then your startup service will be run three times. This means your startup services should check if the database has already been updated, e.g. if your service adds an admin user to the authentication database it should first check that the admin user hasn't already been added (NOTE: EF Core's Migrate method checks if the database needs to be updated, which stops your database being migrated multiple times). The sketch below shows this check-first pattern.
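
Here is a minimal sketch of an idempotent seeding startup service, using the library's IStartupServiceToRunSequentially interface which is covered later in this article (the AppDbContext, User class and admin email are my own illustrative examples):

public class AddAdminUserService : IStartupServiceToRunSequentially
{
    public int OrderNum { get; }

    public async ValueTask ApplyYourChangeAsync(
        IServiceProvider scopedServices)
    {
        var context = scopedServices.GetRequiredService<AppDbContext>();

        //Every instance runs this code, so check before adding
        if (!await context.Users.AnyAsync(
                user => user.Email == "admin@myapp.com"))
        {
            context.Users.Add(new User { Email = "admin@myapp.com" });
            await context.SaveChangesAsync();
        }
    }
}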

Another warning is that a database migration that changes the database such that the old version of the application's software won't work (these types of migrations are known as breaking changes) must NOT be applied to an application that is running multiple instances. That's because Azure etc. replace each running instance one by one, which means a migration with breaking changes applied by the first instance will "break" the other running instances that haven't yet had their software updated.

This breaking change issue isn't specific to this library, but because this library is all about running applications that have multiple instances, you MUST ensure you are not applying a migration that has a breaking change (see the five-stage app update in this article for how you should handle breaking changes to your database).

Using the RunSequentially library in your ASP.NET Core application

This breaks down into three stages:

  1. Adding the RunSequentially library to your application
  2. Create your startup services, e.g. one to migrate your database and, say, another to add an admin user.
  3. Register the RegisterRunMethodsSequentially extension method to your dependency injection provider, and select:
    1. What global resource(s) you want to lock on.
    2. Register the startup service(s) you want to run on startup.

Let’s look at these in turn.

1. Adding the RunSequentially library to your application

This is simple, you add the NuGet package called Net.RunMethodsSequentially into your application. This library uses the Net6 framework.

2. Create your startup services

You need to create what the RunSequentially library calls startup services, which contain the code you want to apply to a global resource. Typically, it’s a database but it could be a common file store, azure blob etc.

To create a startup service that will be run within a lock you need to create a class that implements the IStartupServiceToRunSequentially interface. This interface defines a method, ValueTask ApplyYourChangeAsync(IServiceProvider scopedServices), which is the method where you put your code to update a shared resource. The example code below is a RunSequentially startup service, i.e. it implements the IStartupServiceToRunSequentially interface, and shows how you would migrate a database using EF Core's MigrateAsync method.

public class MigrateDbContextService : 
    IStartupServiceToRunSequentially
{
    public int OrderNum { get; }

    public async ValueTask ApplyYourChangeAsync(
        IServiceProvider scopedServices)
    {
        var context = scopedServices
            .GetRequiredService<TestDbContext>();

        //NOTE: The Migrate method will only update 
        //the database if there any new migrations to add
        await context.Database.MigrateAsync();
    }
}

The ApplyYourChangeAsync method has a parameter holding a scoped service provider, which is a copy of the normal services used in the main application. From this you can get the services you need to apply your changes to the shared resource, in this case a database.

NOTE: You can use the normal constructor DI injection approach, but as a scoped service provider is already set up to run the RunSequentially code you might like to use that service provider instead.

The OrderNum property is a way to define the order in which your startup services are run. If startup services have the same OrderNum value (default == zero), the startup services are run in the order they were registered.

This means in most cases you just register them in the order you want them to run, but in complex situations the OrderNum value can be useful to define the exact order your startup services are run in, via an OrderBy(service => service.OrderNum) inside the library. For example, in my AuthP library some startup services are optionally registered by the library and other startup services are provided by the developer: in this case the OrderNum value is used to make sure the startup services are run in the correct order.
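
As a small sketch (the service name is illustrative), a startup service can override OrderNum so it always runs after the default (zero) services, whatever the registration order:

public class SeedDatabaseService : IStartupServiceToRunSequentially
{
    //Services with a lower OrderNum run first; zero is the default
    public int OrderNum => 10;

    public ValueTask ApplyYourChangeAsync(IServiceProvider scopedServices)
    {
        //... seeding code goes here, after the lower OrderNum services have run
        return ValueTask.CompletedTask;
    }
}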

3. Register the RegisterRunMethodsSequentially and your startup services

You need to register the RunSequentially code and your startup services with your dependency injection (DI) provider. For ASP.NET Core the configuration of the application happens in the Program class (net6) (previously in the Startup class before net6). This has three parts:

  1. Selecting/registering the global locks.
  2. Registering the RunSequentially code with your DI provider
  3. Register the startup services

Before I describe each of these parts, the annotated code below shows the process.

NOTE: The above was taken from the Program class in a test ASP.NET Core project in the RunSequentially repo.

And now the detail of the three parts.

3.1 Selecting/registering the global locks

The RunSequentially library relies on creating a lock on a resource that all the web application’s instances can access, and it uses the DistributedLock library to manage these global locks. Typical global resources are databases (DistributedLock supports 6 database types), but the DistributedLock library also supports a FileSystem Directory and Windows WaitHandler.

However, the RunSequentially library only natively supports the three main global lock types, known as TryLockVersions, plus a no-locking version, which is useful in testing:

  • SQL Server database, AddSqlServerLockAndRunMethods(string connectionString)
  • PostgreSQL database, AddPostgreSqlLockAndRunMethods(string connectionString)
  • FileSystem Directory, AddFileSystemLockAndRunMethods(string directoryFilePath)
  • No Locking (useful for testing): AddRunMethodsWithoutLock()

NOTE: If you want to use another lock type in the DistributedLock library you can easily create your own RunSequentially TryLockVersion. They aren't that hard to write, and you have three versions in the RunSequentially library to copy from.

GitHub user @zejji pointed out there is a problem if the database isn't created yet, so I created a two-step approach to gaining a lock:

  1. Check if the resource exists. If it doesn't exist, then try the next TryLockVersion
  2. If the resource exists, then lock that resource and run the startup services

This explains why I use two TryLockVersions in my setup: the first tries to access a database (which is normally already created), but if the database doesn't exist, then it uses the FileSystem Directory lock (NOTE: the FileSystem Directory lock can be slower than a database lock – see this comment in the FileSystem docs – so I use the database lock first).

So, putting this all together the code in your Program class would look something like this.

var builder = WebApplication.CreateBuilder(args);

// … other parts of the configuration left out.

var connectionString = builder.Configuration
     .GetConnectionString("DefaultConnection");
var lockFolder = builder.Environment.WebRootPath;

builder.Services.RegisterRunMethodsSequentially(options =>
    {
        options.AddSqlServerLockAndRunMethods(connectionString);
        options.AddFileSystemLockAndRunMethods(lockFolder);
    })…
// … other parts of the configuration left out.

The key parts of the code are:

  • The first two statements get the connection string for the database, and the FilePath of the application's wwwroot directory.
  • The two options.Add… calls register two TryLockVersions: the first will try to create a global lock on the database, but if the database hasn't been created yet the second will try to create a global lock on the wwwroot directory.

3.2 Registering the RunSequentially code with your DI provider

The RunSequentially library needs to register the GetLockAndThenRunServices class as an ASP.NET Core IHostedService, which is run after the configuration, but before the main web host is run (see Andrew Lock's useful article that explains all about IHostedService). This is done via the RegisterRunMethodsSequentially extension method.

NOTE: For non-ASP.NET Core applications, and for unit testing, you can set the RegisterAsHostedService property in the options to false. This will register the GetLockAndThenRunServices class as a normal (transient) service.

See the code shown in the last section to see that the RegisterRunMethodsSequentially extension method takes in the builder.Services so that it can register itself and the other RunSequentially parts.

3.3 Register the startup services

The final step is to register your startup services, which you do using the RegisterServiceToRunInJob<YourStartupServiceClass> method. There are a couple of rules designed to catch error situations. The library will throw a RunSequentiallyException if:

  • You didn’t register any startup services
  • You registered the same class multiple times

As explained in the "2. Create your startup services" section, by default your startup services are run in the order they are registered. For instance, you should register your startup service that migrates your database first, followed by registering a startup service that accesses that database.

Using the "run startup services in the order they were registered" approach works in most cases, but in my AuthP library I needed a way to define the run order, which is why in version 1.2.0 I added the OrderNum property as described in the "2. Create your startup services" section.

Checking that your use of RunSequentially will work

I have automated tests for the RunSequentially library, but before I revealed this library to the world, I wanted to be absolutely sure that an ASP.NET Core application using the RunSequentially library worked on Azure when scaled out to multiple instances. I therefore created a test ASP.NET Core application where I could try this out on Azure.

NOTE: You can try this out yourself by cloning the RunStartupMethodsSequentially repo and running the WebSiteRunSequentially ASP.NET Core project, either on Azure with scale out, or on your own development PC using the approaches shown in this stack overflow answer.

The WebSiteRunSequentially web app didn't work at first because I hadn't set the Azure Database Firewall properly, which caused an exception. The library relies on ASP.NET Core's IHostedService to run things before the main host starts, but if you have an exception you don't get any useful feedback! So, I decided to add a tester that allows you to copy the RunSequentially code in your Program class and run it in an automated test.

The code below is from one of my xUnit test projects, where I copied the setup code from the WebSiteRunSequentially Program class and put it in a test. I did have to make two changes: the test gets a connection string from the test setup, and the tester class provides a local directory instead of the ASP.NET Core wwwRoot directory. Other than those two changes it's identical to the Program class code.

[Fact]
public async Task ExampleTester()
{
    //SETUP
    var dbOptions = this.CreateUniqueClassOptions<WebSiteDbContext>();
    using var context = new WebSiteDbContext(dbOptions);
    context.Database.EnsureClean();

    var builder = new RegisterRunMethodsSequentiallyTester();

    //ATTEMPT
    //Copy your setup in your Program here 
    //---------------------------------------------------------------
    var connectionString = context.Database.GetConnectionString();  //CHANGED
    var lockFolder = builder.LockFolderPath;                        //CHANGED

    builder.Services.AddDbContext<WebSiteDbContext>(options =>
        options.UseSqlServer(connectionString));

    builder.Services.RegisterRunMethodsSequentially(options =>
    {
        options.AddSqlServerLockAndRunMethods(connectionString);
        options.AddFileSystemLockAndRunMethods(lockFolder);
    })
        .RegisterServiceToRunInJob<StartupServiceEnsureCreated>()
        .RegisterServiceToRunInJob<StartupServiceSeedDatabase>();
    //----------------------------------------------------------------

    //VERIFY
    await builder.RunHostStartupCodeAsync();
    context.CommonNameDateTimes.Single().DateTimeUtc
        .ShouldBeInRange(DateTime.UtcNow.AddSeconds(-1), DateTime.UtcNow);
}

The RegisterRunMethodsSequentiallyTester class is in the RunSequentially library, so it's easy to access. But be aware: while the code is the same, you might not be using the same database type as your production system, so this type of test might still miss a problem.

If you want a much more complex test, then look at the last test in the AuthP's TestExamplesStartup test class. This shows you might have a bit of work to do to register the other parts of the system to make it work.

NOTE: You might find my EfCore.TestSupport library helpful as it has methods to set up test databases etc. That’s what I used at the start of the test shown above.

Having fixed the Firewall, I published the WebSiteRunSequentially web app to an Azure App Service plan with scale out manually set to three instances of the application. Its job is to update a single common entity and also add logs of what was done in another entity. The screenshot shows the results.

I restarted the Azure App Service, as that seems to reset all the running instances, and the logs tell you what happens:

  1. The first instance of the application runs the "update startup service" and finds that the common entity was updated a while ago (I set a time of 5 minutes), so it assumes it's the first service to be run in the set of three and sets the Common entity's Stage to 1.
  2. When the second instance of the application is allowed to start up it runs the same "update startup service", but now it finds that the common entity was updated four seconds ago, so it assumes that another instance had just updated it and increments the Stage by 1, which in this case makes Stage equal 2.
  3. When the third (and final) instance of the application starts up, the "update startup service" finds that the common entity was updated about 2 minutes ago, so it too assumes that another instance had just updated it and increments the Stage by 1, which in this case makes Stage equal 3.

That process and its results show me that the RunSequentially library works as expected, and at the same time I learnt a few things to watch out for. For instance, I hadn't set the Azure Database Firewall properly, which caused an exception before the ASP.NET Core host was up. This meant I only got a generic "it's broken" message with no information, which is why I added the tester class to the RunSequentially library.

Conclusion

The RunSequentially library is designed to run code at startup that won't interfere with the same code running in other instances of your ASP.NET Core application. The RunSequentially library is small, but it still provides an excellent solution to handling updates to a global resource, with "migrate on startup" being one of the main uses. It's also quite quick: the SQL Server check-and-lock code only took 1 ms on a local SQL Server database.

I'm also using the RunSequentially library in my AuthPermissions.AspNetCore library in quite a complex setup. This can have something like six different startup services available for migrating / seeding various parts of the database: some registered by the AuthP code and some added by the developer. Making that work properly and efficiently required a couple of iterations of the RunSequentially library (currently at version 1.3.0) and only now am I recommending RunSequentially for real use, although there have been nearly 500 downloads of the NuGet package already.

I'm interested in what else other developers might use this library for. I have an idea of managing ASP.NET Core's BackgroundServices across applications with multiple instances. At the moment the same BackgroundService will run in each instance, which is either wasteful or actually causes problems. I could envision a system using a unique GUID in each instance of the application and a database entity like the one in my test ASP.NET Core app to make sure certain BackgroundServices would only run in one of the application's instances. This would mean I don't have to fiddle with WebJobs anymore, which would be nice 😊.

If you come up with a use for this library that is outside the migrate / seed situations, then please leave a comment on this article. I, and other readers, might find that useful.

Building ASP.NET Core and EF Core multi-tenant apps – Part1: the database

$
0
0

Multi-tenant applications are everywhere – your online banking is one example and GitHub is another. They consist of one set of code that is used by every user, but each user’s data is private to that user. For instance, my online banking app is used by millions of users, but only I (and the bank) can see my accounts.

This first article focuses on how you can keep each user’s data private from each other when using EF Core. The other articles in “Building ASP.NET Core and EF Core multi-tenant apps” series are:

  1. The database: Using a DataKey to only show data for users in their tenant (this article)
  2. Administration: different ways to add and control tenants and users
  3. Versioning your app: Creating different versions to maximise your profits
  4. Handling customer problems: customer support, data logging, and soft delete
  5. Hierarchical multi-tenant: Handling tenants that have sub-tenants

Also read the original article that introduced the AuthPermissions.AspNetCore library (shortened to AuthP in these articles), which provides pre-built (and tested) code to help you build multi-tenant apps using ASP.NET Core and EF Core.

TL;DR; – Summary of this article

  • This article focuses on setting up a database to use in a multi-tenant application so that each tenant’s data is separate from other tenants. The approach described uses ASP.NET Core, EF Core and the AuthP library.
  • The AuthP services and example code used in this article covers:
    • Create, update and delete an AuthP tenant
    • Assign an AuthP user to a tenant, which causes a DataKey Claim to be added to a logged-in user.
    • Provides services and code to set up an EF Core global query filter to only return data that matches the logged-in user’s DataKey claim.
  • You provide:
    • Access to a database using EF Core.
    • An ASP.NET Core front-end application to display and manage the data
  • There is an example ASP.NET Core multi-tenant application in the AuthP repo, called Example3 which uses ASP.NET Core’s individual accounts authentication provider. The Example3 web site will migrate/seed the database on startup to load demo data to provide you with a working multi-tenant application to try out.
  • Here are links to various useful information and code.

Defining the terms in this article

We start with defining key terms / concepts around multi-tenant applications. This will help you as you read these articles.

The first term is multi-tenant, which is the term used for an application that provides the same services to multiple customers – known as tenants. Users are assigned to a tenant and when a user logs in, they only see the data linked to their tenant.

The figure below shows a diagram of the Example3 application in the AuthPermissions.AspNetCore repo that provides invoice management. The application’s single database is divided between three tenants, shown in brown, blue and green. By assigning a unique DataKey to each tenant the application can filter the data in the database and only show invoice rows that have the same DataKey as the user.

The positive of the multi-tenant design is that it can host many customers on one application / database, which reduces the overall costs of your hardware / cloud infrastructure. The downside is more complex code to keep each tenant’s data private, but as you will see EF Core has some great features to help with keeping data private.

NOTE: The broader term Software as a Service (SaaS) can cover multi-tenant applications, but SaaS also covers a single instance of the application for each user.

There are two main ways to separate each tenant’s data:

  • All tenants’ data in one database, with software filtering of each tenant’s data access, which is what this article uses.
  • Each tenant has its own database, and software directs each tenant to their database. This is known as sharding.

There are pros and cons for both approaches (you can read this document on the pros/cons of a software filtering approach). The AuthP library implements the software filtering approach, with specific code to ensure that each tenant is isolated from the others.

How to implement a single multi-tenant application with software filtering

There are eight steps to implementing an ASP.NET Core multi-tenant application using EF Core and the AuthP library. They are:

  1. Register the AuthP library in ASP.NET Core: You need to register the AuthP library in your ASP.NET Core Program class (net6), or Startup class for pre-6 code.
  2. Create a tenant: You need a tenant class that holds the information about that tenant, especially the tenant value, referred to as the DataKey, that is used to filter the data for this tenant.
  3. Assign User to Tenant: You then need a way to link a user to a tenant, so that the tenant’s DataKey is assigned to a user’s claims when they log in.
  4. Inject the user’s DataKey into your DbContext: You create a service that can extract the user’s DataKey claim value from a logged-in user and inject that service into your DbContext.
  5. Mark filtered data: You need to add a property to each class mapped to the database (referred to as entity classes in this article) that holds the DataKey.
  6. Add EF Core’s query filter: In the application’s DbContext you set up a global query filter for each entity class that is private to the tenant.
  7. Mark new data with user’s DataKey: When new entity classes are added you need to provide a way to set the DataKey with the correct value for the current tenant.
  8. Sync AuthP and your application’s DbContexts: To ensure the AuthP DbContext and your application’s DbContext are in sync, the two DbContexts must use one database. This also saves you the cost of using two databases.

This article uses the AuthP library, so some of these steps are provided by that library, with you writing code to link AuthP’s features to the code that provides the specific features for your customers.

1. Register the AuthP library in ASP.NET Core

To use the AuthP library you need to do three things:

  • Add the AuthPermissions.AspNetCore NuGet package in your ASP.NET Core project, and any other projects that need to access the AuthP library.
  • You need to create the Permissions enum that controls what features of your application your users can access – see this section of the original article in the AuthP library, or the Permissions explained documentation.
  • Register AuthP to the ASP.NET Core dependency injection (DI) provider, which when using net6 is in the ASP.NET Core program file.

I’m not going to go through all of AuthP’s registering options because there are lots of them, but below I show a simplified version from the Example3 multi-tenant application. For full information on the setup of the AuthP library go to the AuthP startup code documentation page, which covers all the possible options and gives a more detailed coverage of the SetupAspNetCoreAndDatabase method that manages the migration of the various databases.

services.RegisterAuthPermissions<Example3Permissions>(options =>
    {
        options.TenantType = TenantTypes.SingleLevel;
        options.AppConnectionString = connectionString;
        options.PathToFolderToLock = _env.WebRootPath;
    })
    .UsingEfCoreSqlServer(connectionString)
    .IndividualAccountsAuthentication()
    .RegisterTenantChangeService<InvoiceTenantChangeService>()
    .RegisterFindUserInfoService<IndividualAccountUserLookup>()
    .RegisterAuthenticationProviderReader<SyncIndividualAccountUsers>()
    .SetupAspNetCoreAndDatabase(options =>
    {
        //Migrate individual account database
        options.RegisterServiceToRunInJob<StartupServiceMigrateAnyDbContext<ApplicationDbContext>>();

        //Migrate the application part of the database
        options.RegisterServiceToRunInJob<StartupServiceMigrateAnyDbContext<InvoicesDbContext>>();
    });

The various lines to look at are:

  • The options.TenantType = TenantTypes.SingleLevel line tells AuthP that your application uses a normal (single-level) multi-tenant setup.
  • The options.AppConnectionString is used in step 8 when you want to create, update or delete a tenant.
  • The options.PathToFolderToLock: to handle migrating (and seeding) databases on startup AuthP needs a global resource to create a lock on. In cases where the database doesn’t exist yet, the backup is to lock a global directory, so I provide a path to the WebRootPath – see step 8.

2. Create a tenant

The AuthP library contains the IAuthTenantAdminService service, whose AddSingleTenantAsync method creates a new tenant using the tenant name you provide (which must be unique). Once the new tenant is created, its DataKey can be accessed using the tenant’s GetTenantDataKey method, which returns a string containing the tenant primary key and a full stop, e.g. “123.”, as the sketch below illustrates.
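As a concrete illustration, the DataKey format described above amounts to something like this sketch. It is based on the description, not AuthP’s actual source, which may differ.

//A sketch of the DataKey format - AuthP's actual implementation may differ
public string GetTenantDataKey()
{
    //A tenant with primary key 123 produces the DataKey "123."
    return $"{TenantId}.";
}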

3. Assign an AuthP User to a Tenant

Again, the AuthP library has its own set of user information which includes an (optional) link to a Tenant and the user’s Roles / Permissions (covered in the first AuthP article). To add a new user and link them to a tenant, you must provide the tenant they should be linked to. There are various ways to add a user and I cover that in the Part 2 article, which is about administration and adding users, including an approach that lets an admin user send an invite to join the tenant via the person’s email.

Once a user is linked to a tenant, then when that user logs in a DataKey claim is added to their claims.

4. Inject the user’s DataKey into your DbContext

You want to transfer the DataKey claim into your application’s DbContext so that only database rows that have the same DataKey as the user’s DataKey claim are readable (see step 6).

In ASP.NET Core the way to do this is via the IHttpContextAccessor service. The code below, which is provided by the AuthP library, extracts the DataKey of the current user.

public class GetDataKeyFromUser : IGetDataKeyFromUser
{
    /// <summary>
    /// This will return the AuthP's DataKey claim.
    /// If no user, or no claim then returns null
    /// </summary>
    /// <param name="accessor"></param>
    public GetDataKeyFromUser(IHttpContextAccessor accessor)
    {
        DataKey = accessor.HttpContext?.User
              .GetAuthDataKeyFromUser();
    }

    /// <summary>
    /// The AuthP's DataKey; can be null.
    /// </summary>
    public string DataKey { get; }
}

Then, in your application’s DbContext constructor you add a parameter of type IGetDataKeyFromUser; the .NET dependency injection (DI) provider will resolve this to the GetDataKeyFromUser class shown above, which was registered with the DI provider by the AuthP library.

The InvoicesDbContext code below comes from Example3 in the AuthP repo and shows the start of a DbContext that has a second parameter to get the user’s DataKey.

public class InvoicesDbContext : DbContext
{
    public string DataKey { get; }

    public InvoicesDbContext(
        DbContextOptions<InvoicesDbContext> options,
        IGetDataKeyFromUser dataKeyFilter)
        : base(options)
    {
        DataKey = dataKeyFilter?.DataKey ?? "Impossible DataKey";
    }
    //… other code left out
}

The DataKey could be null if no one is logged in or the user hasn’t got an assigned tenant (or other situations, such as on startup or background services accessing the database), so you should provide a string that won’t match any of the possible DataKeys. As AuthP only uses numbers and a full stop, any string containing alphabetical characters won’t match any DataKey.

5. Mark filtered data

When creating a multi-tenant application, you need to add a DataKey to each class that is used in the tenant database (referred to as tenant entities). I recommend adding an interface that contains a string DataKey to each tenant entity – as you will see in the next step, that helps. The code below is taken from the Invoice class in the Example3.InvoiceCode project.

public class Invoice : IDataKeyFilterReadWrite
{
    public int InvoiceId { get; set; }

    public string DataKey { get; set; }
    
    //other properties left out
}

NOTE: The IDataKeyFilterReadWrite interface says that you can read and write the DataKey. If you are using DDD entities, it would need a different interface.

6. Add EF Core’s query filter

Having an interface on each tenant entity allows you to automate the setting up of EF Core’s query filter, and its database type, size and index. This automated approach makes sure you don’t forget to apply the DataKey query filter in your application, because a missed DataKey query filter creates a big hole in the security of your multi-tenant application.

The code below is taken from Example3.InvoiceCode’s InvoicesDbContext class and shows a way to automate the query filter.

public class InvoicesDbContext : DbContext, IDataKeyFilterReadOnly
{
    public string DataKey { get; }

    public InvoicesDbContext(DbContextOptions<InvoicesDbContext> options, 
        IGetDataKeyFromUser dataKeyFilter)
        : base(options)
    {
        DataKey = dataKeyFilter?.DataKey ?? "Impossible DataKey";
    }

    //other code removed…

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // … other configurations removed  

        foreach (var entityType in modelBuilder.Model.GetEntityTypes())
        {
            if (typeof(IDataKeyFilterReadWrite)
                 .IsAssignableFrom(entityType.ClrType))
            {
                entityType.AddSingleTenantReadWriteQueryFilter(this);
            }
            else
            {
                throw new Exception(
                    $"You missed the entity {entityType.ClrType.Name}");
            }
        }
    }
}

NOTE: The foreach loop in OnModelCreating sets up the DataKey query filter on your entities, and the else branch throws an exception if any entity doesn’t implement the IDataKeyFilterReadWrite interface. This ensures that you don’t forget to add the IDataKeyFilterReadWrite to every entity class in your application’s database.

The AddSingleTenantReadWriteQueryFilter extension method is part of the AuthP library and does the following (see the sketch after this list):

  • Adds a query filter on the DataKey for an exact match to the DataKey provided by the IGetDataKeyFromUser service described in step 4.
  • Sets the string type to varchar(12), which makes a (small) improvement to performance.
  • Adds an SQL index to the DataKey column to improve performance.
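To show what such an extension method involves, here is a hedged sketch of how the same three things could be done with EF Core’s model-building API. The real AuthP implementation may differ in detail.

public static void AddSingleTenantReadWriteQueryFilter(
    this IMutableEntityType entityType,
    IDataKeyFilterReadOnly dataKeyProvider)
{
    //Builds the filter: entity => entity.DataKey == dataKeyProvider.DataKey
    //Referencing the provider (the DbContext) means the current DataKey
    //is read on every query
    var entityParam = Expression.Parameter(entityType.ClrType, "entity");
    var filter = Expression.Lambda(
        Expression.Equal(
            Expression.Property(entityParam,
                nameof(IDataKeyFilterReadWrite.DataKey)),
            Expression.Property(Expression.Constant(dataKeyProvider),
                nameof(IDataKeyFilterReadOnly.DataKey))),
        entityParam);
    entityType.SetQueryFilter(filter);

    //Sets the column to a small non-unicode string, i.e. varchar(12)
    var dataKeyProp = entityType
        .FindProperty(nameof(IDataKeyFilterReadWrite.DataKey));
    dataKeyProp.SetIsUnicode(false);
    dataKeyProp.SetMaxLength(12);

    //Adds an index to the DataKey column, as every query filters on it
    entityType.AddIndex(dataKeyProp);
}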

7. Mark new data with user’s DataKey

When a tenant user adds a new row to the tenant database, the code MUST set the DataKey to that of the current user saving to the database. If you don’t, the data will be saved, but no one will see it. The typical approach is to override the SaveChanges(bool) and SaveChangesAsync(bool, CancellationToken) versions of SaveChanges / SaveChangesAsync, as shown below (but you could also use EF Core’s SaveChanges interceptor).

public override int SaveChanges(
    bool acceptAllChangesOnSuccess)
{
    this.MarkWithDataKeyIfNeeded(DataKey);
    return base.SaveChanges(acceptAllChangesOnSuccess);
}

public override async Task<int> SaveChangesAsync(
    bool acceptAllChangesOnSuccess,
    CancellationToken cancellationToken = default(CancellationToken))
{
    this.MarkWithDataKeyIfNeeded(DataKey);
    return await base.SaveChangesAsync(
        acceptAllChangesOnSuccess, cancellationToken);
}

The MarkWithDataKeyIfNeeded extension method in Example3 sets the DataKey property of any new entities with the current user’s DataKey string. The code below shows how that is done.

public static void MarkWithDataKeyIfNeeded(
    this DbContext context, string accessKey)
{
    foreach (var entityEntry in context.ChangeTracker.Entries()
        .Where(e => e.State == EntityState.Added))
    {
        var hasDataKey = entityEntry.Entity 
             as IDataKeyFilterReadWrite;
        if (hasDataKey != null && hasDataKey.DataKey == null)
            // The DataKey is only updated if it's null.
            // This allows other code to define the DataKey itself
            hasDataKey.DataKey = accessKey;
    }
}

8. Make sure AuthP and your application’s DbContexts are in sync

It is very important that the AuthP database and your application’s database are kept in sync. For instance, if AuthP creates a new tenant and then your application’s DbContext has a problem creating that tenant, then you have a mismatch – AuthP thinks that the tenant data is OK, but when you try to access that data, you will get an error (or worse, you don’t get an error, but the data is wrong).

For this reason, AuthP requires that:

  1. You must create an ITenantChangeService service, which you register at startup with AuthP via the RegisterTenantChangeService<T>() method, where T is your implementation of ITenantChangeService. This has methods to handle the create, update and delete of your application’s DbContext. If you want an example, have a look at the InvoiceTenantChangeService class from Example3.
  2. Your application’s DbContext and the AuthP’s DbContext must be on the same database. This allows the AuthP tenant code to use a transaction that contains both the AuthP database changes and the ITenantChangeService changes applied to your application’s DbContext. If the changes to your application’s database fail, then the AuthP changes are rolled back.

The figure below shows the steps that AuthP takes when you create, update or delete an AuthP tenant.

Linking two DbContexts to one database requires you to define a unique migration history table for each. By doing that you can migrate each DbContext separately. The code below, taken from Example3, shows how to register a DbContext and define the migration history table name for that DbContext.

services.AddDbContext<InvoicesDbContext>(options =>
    options.UseSqlServer(
        configuration.GetConnectionString("DefaultConnection"), 
           dbOptions => dbOptions
               .MigrationsHistoryTable(InvoicesDbContextHistoryName)));

NOTE: The AuthP’s DbContext uses the name “__AuthPermissionsMigrationHistory” for its migration history table.

When it comes to migrating the AuthP’s DbContext and your application’s DbContext (and the Individual Accounts database, if you use that) you have a few approaches. You can apply migrations in your CI/CD pipeline using EF Core 6’s migrate.exe approach, but with the release of version 2 of the AuthP library you can also migrate databases on startup within a global lock (please read this article).

AuthP version 2 uses the Net.RunMethodsSequentially library so that you can migrate EF Core databases even if your application is running multiple instances. Here is the Example3 code that registers the AuthP library in its Startup class, and this documentation page covers migrations in detail.

services.RegisterAuthPermissions<Example3Permissions>(options =>
    {
        options.TenantType = TenantTypes.SingleLevel;
        options.AppConnectionString = connectionString;
        options.PathToFolderToLock = _env.WebRootPath;
    })
    .UsingEfCoreSqlServer(connectionString)
    //Other registration methods left out for clarity
    .RegisterTenantChangeService<InvoiceTenantChangeService>()
    .SetupAspNetCoreAndDatabase(options =>
    {
        //Migrate individual account database
        options.RegisterServiceToRunInJob<StartupServiceMigrateAnyDbContext<ApplicationDbContext>>();
        //Migrate the application part of the database
        options.RegisterServiceToRunInJob<StartupServiceMigrateAnyDbContext<InvoicesDbContext>>();
    });

NOTE: The RegisterTenantChangeService<InvoiceTenantChangeService>() line registers your ITenantChangeService, which keeps your application’s tenant data in sync with AuthP’s tenant definitions.

This migrates:

  • The AuthP database
  • The individual account database (Example3 uses ASP.NET Core’s individual account provider)
  • Example3’s application database

In fact there is other seeding of databases too, but I’m not showing it, as some of it only adds example data so that if you run the example it will look like a real application with users – look at the Startup class for the full detail of what it does.

NOTE: The same rule applies to EF Core 6’s migrate.exe approach and AuthP’s Net.RunMethodsSequentially approach: if there is an instance of the application running, then you must not include a migration that will not work with both the previous and the new version of your application.

Conclusion

This is a long article because it steps through all the things you need to do to build a multi-tenant application. Many stages are handled by the AuthP library or EF Core, and where that isn’t the case there are links to example code from the Example3 multi-tenant application in the AuthP’s repo. Also, there is a lot of documentation (and a few videos) for the AuthP library and plenty of runnable examples in the AuthP repo.

The AuthP library isn’t small and simple, but that’s because real multi-tenant applications aren’t small or simple. I know, because I was asked by a client to design a multi-tenant application to handle thousands of users, and you need a lot of things to manage and support that many tenants and users.

So, on top of the documentation and examples, I have started this “Building ASP.NET Core and EF Core multi-tenant apps” series that takes you through various parts of a multi-tenant application, including some of the features I think you will need in a real multi-tenant application.

Do have a look at the AuthP Roadmap, which gives you an idea of what has been done and what might change in the future. Also, please do give me feedback on what you want more information about, or features you need, either via comments on these articles or by adding an issue to the AuthP’s repo.

Happy coding.


Building ASP.NET Core and EF Core multi-tenant apps – Part2: Administration


In Part 1 of this series, you learnt how to set up the multi-tenant application database so that each tenant had its own set of data, which is private to them. This article explores the administration of a multi-tenant application and the different ways to add new users to a tenant.

The other articles in “Building ASP.NET Core and EF Core multi-tenant apps” series are:

  1. The database: Using a DataKey to only show data for users in their tenant
  2. Administration: different ways to add and control tenants and users (this article)
  3. Versioning your app: Creating different versions to maximise your profits
  4. Handling customer problems: customer support, data logging, and soft delete
  5. Hierarchical multi-tenant: Handling tenants that have sub-tenants

Also read the original article that introduced the library called AuthPermissions.AspNetCore (shortened to AuthP in these articles), which provides pre-built (and tested) code to help you build multi-tenant apps using ASP.NET Core and EF Core.

TL;DR; – Summary of this article

  • Any real application has a lot of code dedicated to administering the application and its users, plus staff that use that administration code to help your users.
  • Multi-tenant applications most likely have lots of users in many groups (called tenants), which makes it important to make the admin quicker by automating the key pinch-points, like new companies wanting a new tenant and adding more users to their tenant.
  • The AuthP library is designed to spread some of the administration, such as adding new users to a tenant, down to a user in the tenant. This means a company (tenant) can self-manage their users.
  • The AuthP library relies on ASP.NET Core for registering a new user and securely logging a user in. AuthP uses the userId (a unique string) to add extra data to manage the tenants and what features in your application a user can use.
  • I describe four ways to automate the linking of a logged-in user to a tenant. Some are good for users that sign up to your service at any time, and other approaches work for a company where users are registered to a common login service, like Azure Active Directory.
  • All the examples shown in this article come from the AuthP repo, especially from Example3, which is a multi-tenant app, and Example5, which shows how to work with Azure Active Directory.
  • Here are links to various useful information and code.

Introduction to administration features

If you are building a multi-tenant application to provide a service to many companies, assuming you are charging for your service, you want lots of tenants and users. Even if you are building an application for a single company, then it must have lots of divisions to require a multi-tenant design. In both cases there is going to be a lot of administration to be done, such as setting up new tenants, adding users to tenants, and altering what users can do as your application is improved.

If you don’t think about the admin parts right at the start, you are going to have problems when lots of users join. I should know, because I was asked by a client to design a multi-tenant application to handle thousands of users, and the admin code was hard work. That’s why I built into the AuthP library ways to automate admin where I could, or pass some admin jobs down to the owners of each tenant. But in the end, the features that can’t be automated have to be manually executed by your staff.

This article contains both my approach to handling admin in a multi-tenant application and shows AuthP’s admin features and code in the AuthP repo that can automate some of the user admin features.

Administration use in an AuthP’s multi-tenant application

In an AuthP’s multi-tenant application you need to manage:

  • AuthP Roles, which define a group of features, e.g. creating, editing a Role.
  • AuthP Tenants, which controls the database filtering, e.g. creating a new tenant.
  • AuthP users, which define the Roles and Tenant that each user has, e.g. adding a new user linked to a specific tenant.

NOTE: Read the original article, which explains what a Role is and how they control what features in your application that certain users can access.

A few of these features can be automated, but you need some person / people to manage the tenants, users and what they can access. I refer to these users as admin users and they have Roles that allow them to access features that the normal users aren’t allowed to access, such as AuthP admin services.

My experience on a very large multi-tenant application suggests that some tenant owners would appreciate the ability to manage their own users, especially adding and removing users in their tenant, which also relieves the load on your admin team. Therefore, the AuthP library supports multiple types of admin users. The list shows these admin users ordered with the most useful first:

  • App admin: They have access to the admin services across all of the users and all of the tenants. NOTE: an app-level user is not linked to a tenant.
  • Tenant admin: They have limited access to the admin services and can only see users in their tenant.
  • Customer support: In bigger applications it’s worth having some customer support users (also known as help desk users). Their focus is on helping tenant users with problems.
  • Super admin: When you first deploy your application you need a user to start adding more users, called the Super admin. The AuthP library provides a way to add that user.

But to make this work, with no possibility of the tenant admin accessing advanced features, you need a clear view of what each admin user can do. The next sections give you a guide to setting up each admin type.

NOTE: Part 3 in this series covers versioning your multi-tenant application which introduces the tenant Roles feature which provides another control over the Roles available in a tenant.

App Admin – the people who manage what users can do

App Admin users can potentially access every admin feature in your application, which means they need to understand what each admin function does and know the consequences of each function.

Having said that, their day-to-day job is to manage:

  • Create, update and delete AuthP’s Roles. Only app admin users are allowed to manage the Roles because Roles control what features in your application a user can access.
  • Manage removing users / tenants when they have stopped using your application.
  • They might have to manually create / update tenants, but as you will see later that can be automated.

App admins are the people that can see every Role, Tenant and User – something that the Tenant admin user can’t do. Here is a list of users that an admin user would see.

Tenant Admin – taking over some admin of users in their tenant

The tenant admin is there to manage the users in their tenant. This helps the tenant, because they don’t have to wait for the app admin to manage their users, and it helps reduce the amount of work on the app admin users too. Typical things a tenant admin user can do are:

  • List all the users in their tenant
  • Change the Roles of the users in their tenant
  • Invite a user to join the tenant (see the invite a new user section later in this article).

AuthP’s method for listing users requires the DataKey of the current user to be provided (see code below).

var dataKey = User.GetAuthDataKeyFromUser();
var userQuery = _authUsersAdmin.QueryAuthUsers(dataKey);
var usersToShow = await AuthUserDisplay
    .TurnIntoDisplayFormat(userQuery.OrderBy(x => x.Email))
    .ToListAsync();

If the DataKey isn’t null, then it only returns users with that DataKey, which means a tenant admin user will only see the users in their tenant. The screenshot below shows the tenant admin version of the user list. If you compare the app admin list of users (above) with the tenant admin’s version, you will see:

  • Only the users in the 4UInc tenant are shown
  • The tenant admin isn’t shown in the list
  • Instead of Edit & Delete the only thing they can do is change a user’s Roles.

NOTE: Some Roles contain powerful admin services, so in version 2 of the AuthP library these more powerful Roles are hidden from the tenant user. This means we can allow the tenant admin user to manage the tenant user’s Roles because any Roles they shouldn’t have access to are filtered out. This is described in Part 3.

Customer support – helping solve user’s problems

The other type of admin user you might have are customer support users (also known as help desk users). Customer support users are there to help the tenant users when they have problems. I cover this in more detail in the Part 4 article, which covers handling customer problems, but here is an overview of what the Part 4 article covers.

There are some things you can put into your application, like soft delete instead of deleting directly from the database, that allow you to undo some things, or at least find out who changed something they shouldn’t have. The AuthP library also has a feature that allows a customer support user to access the tenant’s data in the same way as a tenant user can. This allows the customer support user to see the problem from the user’s point of view.

See the Part 4 article for more information.

SuperAdmin – solving the first deployment of your application

This admin user type solves the problem that when you deploy your application for the first time there aren’t any users, because the database is empty. This means you can’t log in to add users – a catch-22 problem.

The AuthP library has a solution to this using its bulk load feature to add a Super Admin user who has access to all the application’s features that are protected by the HasPermission attribute / method. This documentation details the process to create a Super Admin user when deploying an application for the first time.

Managing your users in your tenants

We started with the higher-level administration features, like setting up Roles and deleting tenants, but the day-to-day admin job is new users creating or joining a tenant. If possible, you want to automate as much as you can to a) make it easier for a company to sign up to your application, and b) minimise what the app admin has to do. In this section I describe different approaches for adding tenant users, with the first two automating the process.

But before we look at the different ways to add tenant users, we need to talk about how ASP.NET Core manages users. ASP.NET Core breaks up what a user can do into two parts:

  1. Authentication features, which handle the registration of a new user and a secure login of an existing user.
  2. Authorisation features, which controls what features the logged-in user can use in your application.

ASP.NET Core’s authentication offers a few authentication providers, like the individual user accounts service, but ASP.NET Core also supports OAuth 2.0 and OpenID Connect, which are industry-standard protocols for authorization, supporting Azure Active Directory, Google login and many, many more. This provides a large set of ways to authenticate a user when they are logging in.

And the AuthP library provides the authorization part of a multi-tenant application (see the first article which explains this). AuthP stores each user’s authorization data, such as the user’s link to a tenant, in its own DbContext, and AuthP has admin code to manage that authorization data. The link between the authenticated logged-in user and the AuthP authorization data is via the authenticated user’s Id, referred to as the userId.

This gives you the best of both worlds. You want Microsoft, Google etc. to handle the authentication of the user, with all the problems of hacking etc., while the AuthP library provides the extra authorization features to work with more complex situations, like a multi-tenant application.

How an authenticated user is linked to AuthP authorization data depends on whether the list of authenticated users can be read or not. For instance, if you use Azure Active Directory (Azure AD) you can arrange access to the list of users, while if you are using Google as the source of authenticated users you can’t read the list of all the users. Here are four ways to handle adding users.

  1. Take over the registering of a new user, for instance code that adds the new user to the individual user accounts authentication provider and then creates the AuthP user too. This is a good approach when you want your users to sign up to your multi-tenant app.
  2. Have the tenant admin user invite a new user to join a tenant. This approach lets a tenant admin user send a url to a user which, when clicked, registers the user (authentication) and sets up their tenant and Roles (authorization).
  3. Syncing AuthP’s users against the authenticated list of users, for instance if you use Azure Active Directory (Azure AD). This needs a source of authenticated users you can read and works best when you have a fairly static list of users, say within an application used by a company.
  4. Handle external login providers, for instance if you allow users to log in via Google. This is a good approach if you want to allow your users to log in using their Facebook, Twitter, Google etc. logins.

The next four sections explain each approach.

1. Take over the registering of a new user

This works if you have a private authentication provider, such as ASP.NET Core’s individual user accounts authentication provider, Azure AD, Auth0, etc. It also works well if you want new users to register / pay for the use of your application.

The steps in implementing the “register a new user” approach are:

  1. Ask the user for the information you need to register with your private authentication provider, e.g. email and password.
  2. Register this user as a new user in your private authentication provider and obtain the new user’s userId.
  3. Now add an AuthP user, using the email / name and the userId of the registered authentication user.
  4. In a multi-tenant application, you could create a new tenant too, as the user has paid for the use of your application. Also, you can make this user the tenant admin for the new tenant, which allows them to invite further users (if that feature is enabled).

Here is the code from the AddUserAndNewTenantAsync method in the UserRegisterInviteService class. NOTE: I made some simplifications to make the code easier to understand.

//Add a new user, in this case using individual users account
var userStatus = await GetIndividualAccountUserAndCheckNotAuthUser(
    dto.Email, dto.Password);
if (status.CombineStatuses(userStatus).HasErrors)
    return status;

//Now we can create the tenant
var tenantStatus = await _tenantAdminService.AddSingleTenantAsync(
     dto.TenantName);
if (status.CombineStatuses(tenantStatus).HasErrors)
    return status;

//This creates an AuthP user
status.CombineStatuses(await _authUsersAdmin.AddNewUserAsync(
    userStatus.Result.Id, dto.Email, null, roles, dto.TenantName));

The screenshot below shows a new user wanting to sign up to the Invoice Manager. There are three versions to choose from, with different features and prices. Once the user picks one of the versions (NOTE: versioning your application is covered in Part 3 of this series) they provide their email and password, plus the name of the company to use as the tenant name. Then steps 2 to 4 are run and the user logs in to their new tenant.

Example3 in the AuthP repo provides an example of this approach. A new user can pick which version of the application they want and, after “paying” for the version, then the steps shown above are executed. The code can be found in the AddUserAndNewTenantAsync method in the UserRegisterInviteService class.

NOTE: You can try this yourself. Clone the AuthP repo and run the Example3 ASP.NET Core project. On the home page there is a “Sign up now!” tab in the NavBar which will take you to the “Welcome to the Invoice Manager” page, where you can select which version you want and then give a valid email and password, plus the name you want to have for your tenant.

2. Have the admin user invite a new user to join a tenant

Just like the last approach, this works if you have a private authentication provider, such as ASP.NET Core’s individual user accounts authentication provider, Azure AD, Auth0, etc. This allows a tenant admin (or an App admin) to send an invite to a new user to join a tenant.

The steps in implementing the invitation of a new user to a tenant are:

  1. The tenant admin user creates a url to send to a user to join the admin’s tenant. The URL contains an encrypted parameter of the user’s email and the primary key of the tenant.
  2. The tenant admin emails this url to the user that they want to join the tenant.
  3. When the new user clicks the url they are directed to a page which asks for their email and password, plus the hidden encrypted parameter from the url.
  4. The code then:
    1. Checks that the encrypted parameter contains the same email as the tenant admin user provided in step 1, so that the link hasn’t been copied / hijacked.
    2. Registers a new user in your private authentication provider using the data provided by the user in step 3. From this you can obtain the new user’s userId.
    3. Adds an AuthP user, using the email / name and the userId of the registered authentication user, with a link to the tenant defined in the invite.

The code for this is also in the UserRegisterInviteService class.

Step 1: creating the encrypted parameter

The code in the UserRegisterInviteService uses AuthP’s IEncryptDecryptService to create the parameter (see below). Encrypting the invite means that only the person with the email can use the invite, and information about the tenant isn’t leaked if the email is intercepted.

public string InviteUserToJoinTenantAsync(
    int tenantId, string emailOfJoiner)
{
    var verify = _encryptorService.Encrypt(
        $"{tenantId},{emailOfJoiner.Trim()}");
    return Base64UrlEncoder.Encode(verify);
}

Then, in the controller, I use the code below to create the url to go into the email sent to the user that you want to invite.

//Thanks to https://stackoverflow.com/questions/30755827/getting-absolute-urls-using-asp-net-core
public string AbsoluteAction(IUrlHelper url,
    string actionName,
    string controllerName,
    object routeValues = null)
{
    string scheme = HttpContext.Request.Scheme;
    return url.Action(actionName, controllerName, 
           routeValues, scheme);
}

The figure below shows the pages for the first two steps in this approach.

Step 4: adding the new user to the tenant

The AcceptUserJoiningATenantAsync method shown below finishes step 4. NOTE: the code has been heavily simplified to make it easier to understand.

public async Task<IStatusGeneric<IdentityUser>>
    AcceptUserJoiningATenantAsync(
    string email, string password, string inviteParam)
{
    var status = new StatusGenericHandler<IdentityUser>();

    // …decrypt of encrypted inviteParam and various checks left out 
    
    // find the tenant from the encrypted tenantId parameter
    var tenant = await _tenantAdminService.QueryTenants()
        .SingleOrDefaultAsync(x => x.TenantId == tenantId);

    //Add a new individual users account user
    var userStatus = await GetIndividualAccountUserAndCheckNotAuthUser(
        email, password);
    if (status.CombineStatuses(userStatus).HasErrors)
        return status;

    //This creates an AuthP user
    status.CombineStatuses(await _authUsersAdmin.AddNewUserAsync(
        userStatus.Result.Id, email, null,
        roles, tenant.TenantFullName));

    //… final checks left out
}
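The decrypt of the encrypted inviteParam, elided in the code above, amounts to something like this sketch. It assumes the IEncryptDecryptService has a Decrypt method matching the Encrypt call shown in step 1; the helper name and tuple return are my own.

//A sketch of reversing the step 1 encoding - the real code has more checks
private (int tenantId, string email) DecodeInviteParam(string inviteParam)
{
    var decrypted = _encryptorService.Decrypt(
        Base64UrlEncoder.Decode(inviteParam));
    var parts = decrypted.Split(',');
    return (int.Parse(parts[0]), parts[1]);
}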

NOTE: You can try this by running the Example3 ASP.NET Core project and logging in using one of the seeded tenant admins, e.g. admin@4uInc.com (password = email), then clicking the “Invite User” link on the NavBar to create an invite to send to a new user.

3. Syncing AuthP’s users against the authenticated list of users

This needs a source of authenticated users you can read and works best when you have a fairly static list of users, say within an application used by a company. Also, only an app admin can use this approach, as you need to set up which tenant a new user should be linked to.

To use the sync approach, you have to:

3.a Implement / register AuthP’s ISyncAuthenticationUsers service

The ISyncAuthenticationUsers interface requires you to create a service that returns a list of the authenticated users with their UserId, Email, and UserName. The code below shows the sync service for the individual user accounts authentication provider.

public class SyncIndividualAccountUsers : ISyncAuthenticationUsers
{
    private readonly UserManager<IdentityUser> _userManager;

    public SyncIndividualAccountUsers(UserManager<IdentityUser> userManager)
    {
        _userManager = userManager;
    }

    /// <summary>
    /// This returns the userId, email and UserName of all the users
    /// </summary>
    /// <returns>collection of SyncAuthenticationUser</returns>
    public async Task<IEnumerable<SyncAuthenticationUser>> 
        GetAllActiveUserInfoAsync()
    {
        return await _userManager.Users
            .Select(x => new SyncAuthenticationUser(x.Id, x.Email, x.UserName))
            .ToListAsync();
    }
} 

NOTE: If you want to use Azure AD for authenticating users, then see the SyncAzureAdUsers service and the video “AuthPermissions Library – Using Azure Active Directory for login” for how to get permission to read the Azure AD users.

3.b Add a “Sync users” frontend feature

AuthP’s user admin code contains a method called SyncAndShowChangesAsync, which returns the differences (new, changed and deleted) between the list of authenticated users and AuthP’s list of users. Typically, you would show these changes before you apply them to the AuthP users. Here is an example of the type of display you might see.

To apply all the changes, the admin user clicks the “update all” button, which calls the ApplySyncChangesAsync method (a sketch of the two controller actions is shown below). At that point the list of AuthP users is in sync with the authenticated users, but any new users might need their Roles and / or tenant setting up.
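As an illustration, the pair of controller actions behind such a page might look like the sketch below. Only the two admin method names come from the text; the action names and the SyncChange type are my assumptions.

//A sketch of a "Sync users" page - the admin method names come from the text,
//the rest (action names, SyncChange type) are assumptions
public async Task<IActionResult> SyncUsers()
{
    //Shows the differences without applying anything yet
    var changes = await _authUsersAdmin.SyncAndShowChangesAsync();
    return View(changes);
}

[HttpPost]
public async Task<IActionResult> SyncUsers(List<SyncChange> changes)
{
    //The "update all" button posts the changes back to be applied
    await _authUsersAdmin.ApplySyncChangesAsync(changes);
    return RedirectToAction(nameof(SyncUsers));
}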

NOTE: Read the documentation on Synchronizing the AuthUsers which gives more information on how to do this, plus links to example code for showing / altering the changes.

4. Handling external login providers

The final approach allows you to use external authentication providers such as Google, Facebook, Twitter etc. This might be more attractive to your users, who don’t have to remember another password.

You should take over the external authentication provider login so that you can check if an AuthP user with that email exists. If an AuthP user doesn’t exist, you need to take the user through a process that links them to a tenant (maybe paying for the service too).

Once an AuthP user exists and is set up, you need to intercept the OAuth 2.0 authentication to add the AuthP claims. You do this via ASP.NET Core’s OAuth 2.0 OnCreatingTicket event, where you get the IClaimsCalculator service. This service contains the GetClaimsForAuthUserAsync method, which takes in the user’s userId and returns the claims to be added to the logged-in user, as the sketch below shows.
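Here is a hedged sketch of wiring this up for a Google login. The OnCreatingTicket event and the IClaimsCalculator.GetClaimsForAuthUserAsync method come from the text; the lookup of the userId and the rest of the wiring are my assumptions.

services.AddAuthentication()
    .AddGoogle(options =>
    {
        options.ClientId = configuration["Google:ClientId"];
        options.ClientSecret = configuration["Google:ClientSecret"];
        options.Events.OnCreatingTicket = async context =>
        {
            //Assumption: the external login has already been linked
            //to an AuthP user with this userId
            var userId = context.Principal
                .FindFirst(ClaimTypes.NameIdentifier)?.Value;
            var claimsCalculator = context.HttpContext.RequestServices
                .GetRequiredService<IClaimsCalculator>();
            var authPClaims = await claimsCalculator
                .GetClaimsForAuthUserAsync(userId);
            //Add the AuthP claims (DataKey, Permissions etc.) to the ticket
            context.Identity.AddClaims(authPClaims);
        };
    });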

Useful links on this approach are:

Conclusion

Adding proper administration features to an application is always important. But when your application is a multi-tenant application you want your tenants to find your application easy to use. And on the other side you need to control the admin features such that a tenant can’t break or bypass the intended features of the tenant.

And in a multi-tenant application it’s not just about logging in, it’s also about adding a new user to a tenant. That’s why I explained the “take over the registering of a new user” and the “have the tenant admin user invite a new user to join a tenant” approaches first – they take a bit more code, but they automate the adding of a new user, which removes the load on your admin people.

In the next article you will learn about AuthP’s tenant Roles, which give you more control over what a tenant user can do. In particular you can offer different versions of your code, e.g. Free, Pro, Enterprise, which might help you reach more people, some of whom will pay extra for the extra features in your higher levels.

I hope this and the other articles in this series help you in understanding how you might build a multi-tenant application. The AuthP library and the AuthP repo contain a large body of admin features, useful example applications, plus lots of documentation.

Happy coding.

Multi-tenant apps with different versions can increase your profits


This article explores ways you can make each tenant in your multi-tenant application have different features – for instance, Visual Studio has three versions: Community, Pro, and Enterprise. This is often called versioning, and it has four benefits:

  • Having different versions / prices will increase your potential user base.
  • The users that need the application’s advanced feature will pay more.
  • Having different versions makes your really useful features stand out.
  • People are more likely to try your service if there is a free version to try.

In this article I show how the features in the AuthPermissions.AspNetCore library (shortened to AuthP in these articles) allow you to add versioning to your multi-tenant application. The other articles in the “Building ASP.NET Core and EF Core multi-tenant apps” series are:

  1. The database: Using a DataKey to only show data for users in their tenant
  2. Administration: different ways to add and control tenants and users
  3. Versioning your app: Creating different versions to maximise your profits (this article)
  4. Hierarchical multi-tenant: Handling tenants that have sub-tenants

TL;DR; – Summary of this article

  • The AuthP version 2 release allows you to create different versions of your multi-tenant application: you can now add extra Roles to a tenant depending on what the customer selected and paid for – this is known as versioning.
  • The AuthP version 2 release also adds a Role filter to hide Roles that allow access to advanced features from tenant users, which means a tenant user can safely manage users in their tenant.
  • The AuthP repo contains an ASP.NET Core MVC multi-tenant application which can be found in the Example3 project. This application is called the “Invoice Manager” and has three versions: Free, Pro, and Enterprise. This provides a runnable example of how to build a versioned multi-tenant application using AuthP.
  • The article covers
    • How it hides advanced-feature Roles from tenant users so that a tenant person can take over the job of managing the users in their tenant.
    • The three types of Roles that let you version different types of multi-tenant application.
    • How to add the versioning Roles to your application.
    • How to add Roles to a tenant to turn on different versions of tenants.
    • How the admin user inside a tenant can get the correct roles for their tenant.
    • How your admin staff can manually add Roles to a tenant.

Introduction to AuthP’s v2 improved Roles

The AuthP library controls what a logged-in user can access via Roles and Permissions (read the original AuthP article to learn about Roles and Permissions). In version 1 of the AuthP library all the tenants could access the same set of Roles. This meant you a) couldn’t have different versions of a tenant and b) all the administration of AuthP users had to be done by top-level users of your multi-tenant application – referred to as app admin users.

In the Part 2 article, which covers administration, I suggested that having an admin user at the tenant level (referred to as a tenant admin user) is useful to reduce the work your admin staff has to do (I also think the tenants would like that, as they can change things immediately). This is only possible because of an improvement in version 2 of the AuthP library, which added a RoleType to the Roles. The purpose of adding the RoleType was to:

  1. Allow a Role that contains access to advanced features, such as deleting a tenant, to be hidden from tenant users. This means a tenant admin user can manage their users’ Roles safely, as the advanced admin Roles aren’t available to a tenant admin user.
  2. Add two RoleTypes that are registered to a Tenant instead of directly to an AuthP user – these are known as Tenant Roles. This allows you to add extra Roles to a tenant, which gives the users in that tenant extra features over the base (normal) Roles. This is the key to creating different versions of your application.

NOTE: There is also a default RoleType, known as a normal RoleType, which any user can have. These Roles define the basic Roles that everyone can use.

The next sections describe how these two changes help when building multi-tenant applications.

  1. Hiding Roles that contain access to advanced features. This describes how a tenant admin can safely manage their users’ Roles, because Roles that a tenant user isn’t allowed to have are filtered out.
  2. Versioning your application. This describes how each tenant can have its own Roles, called tenant Roles. These tenant Roles allow you to add extra features to a tenant, thus creating the versioning of your application.

1. Hiding Roles that contain access to advanced features

The AuthP library controls access to your application’s features via Roles and Permissions, but some of the advanced features, such as deleting a tenant, should only be accessible to your admin staff. The Permissions that control access to these powerful features are called advanced Permissions (see this section in the Permission document about advanced Permissions).

When a Role is created or updated, the Permissions in the Role are scanned, and if any of the Permissions are marked as advanced Permissions, then the Role’s RoleType will be set to “HiddenFromTenants”. You can also set the RoleType manually (shown later in this article), but if advanced Permissions are found, then it will always be set to “HiddenFromTenants”.

This change allows a tenant admin user to manage the Roles that their users have, as the Roles containing access to advanced (dangerous) features aren’t available to the tenant users. This feature is fundamental to allowing a tenant admin user to manage their users.

2. Versioning your application

As I said at the start, versioning your multi-tenant application can increase your potential user base and your profits. The multi-tenant application I designed for a client required different versions to cover the different features their retail customers needed, and of course the versions with more features had a higher price.

An analysis of the versioning needs comes up with three types of Roles to add versioning:

  • Base Roles (also known as normal Roles): These are Roles that every valid user can access, but a non-logged-in user shouldn’t be able to access. This is the base version of your application.
  • Tenant auto-add Roles: These are extra Roles in the higher versions of a tenant. If an auto-add Role is turned on in a tenant, then all logged-in users in that tenant can access the auto-add Role. In the example later I use an auto-add Role to provide an extra analytics feature in the “Invoice Manager” application.
  • Tenant admin-add Roles: These are extra Roles in the higher versions of a tenant. If an admin-add Role is turned on, it’s available for users in the tenant, but it’s not automatically added to every user in the tenant. This is useful in more complex applications where users within a tenant have different access. In the example later I use an admin-add Role to promote a user to be the tenant admin in the “Invoice Manager” application, which allows that user to invite new users to the tenant.

These three types of Roles allow you to build different versions into your multi-tenant applications. For SaaS (Software as a Service) applications your versions can offer customers different features / prices to choose from, while a multi-tenant application for a big multi-national company can not only separate the data for each division in the company, but also control what features users in a division can access.

The next section describes an example multi-tenant application called the “Invoice Manager” which uses the three types of Roles defined above to create an application with three versions.

Building the multi-tenant “Invoice Manager” application

The rest of the article shows how I built a versioned multi-tenant application that I call the “Invoice Manager”. The Invoice Manager application is a multi-tenant ASP.NET Core MVC application and can be found in the Example3 project in the AuthP repo. It has three versions:

  • Free: This version only allows one user in the tenant. This works for one-person companies, but also acts as a trial version where larger companies can try out the Invoice Manager’s features.
  • Pro: This allows multiple users within the tenant, which works for companies with many employees. This uses a tenant admin-add Role to promote the person that signed up for the Pro (or Enterprise) version to a tenant admin, which allows that user to invite new users to join their tenant.
  • Enterprise: This gives access to an invoice analysis feature to all the users in this tenant. This uses a tenant auto-add Role so that all the users in an Enterprise version tenant have access to the invoice analysis features. Companies that want this feature will pay extra to get it.

The Invoice Manager application allows a customer to sign up for this service, where the customer picks what version they want / can afford. You have seen some of the Invoice Manager’s features in the Part 2, administration article, but I didn’t cover how I managed the Roles and the versions. This article focuses on the code / Roles that make the versioning work. The parts covered are:

  1. Adding the Role types needed for the three versions
  2. Creating a tenant with the Roles defined by the user’s selected version
  3. Allowing the tenant admin user to change a user’s Roles
  4. Manually adding Roles to a tenant

2a. Adding the Role types needed for the three versions

In this simple example I only need three Roles, one of each type.

  • Tenant user: This Role allows a user to access the Invoice Manager’s features, e.g. listing and adding invoices. This holds the base set of the Invoice Manager’s features and is a Normal RoleType; every valid tenant user should have this Role.
  • Tenant admin: This Role allows the user to a) send invites to users to join their tenant, and b) alter the Roles of users in their tenant. This Role’s RoleType is admin-add, which means it has to be added to a user manually or by your sign-up code, and it applies to Pro or Enterprise tenants, not the Free version.
  • Enterprise: This Role unlocks an analytics feature in the Invoice Manager code. This Role’s RoleType is auto-add, which means the Role will be automatically applied to every user within an Enterprise tenant.

The table below summarises the Roles in each version of the tenants.

| Type | Free | Pro | Enterprise | What |
|---|---|---|---|---|
| Normal | Tenant user (everyone has) | Tenant user (everyone has) | Tenant user (everyone has) | Gives access to basic features |
| Admin-add | – | Tenant admin (given to a user) | Tenant admin (given to a user) | Can upgrade a user to Tenant admin |
| Auto-add | – | – | Enterprise (all users get this) | Gives every user in the tenant access to this feature(s) |

To create these Roles you need to be an app admin (see the definitions of an app admin and tenant admin in the Part 2 article) and you manually set up each Role using AuthP’s AuthRoleAdminService’s CreateRoleToPermissionsAsync method. The screenshot below comes from the Example3 ASP.NET Core MVC application and allows you to manually create a Role and set the RoleType.

The tenant Roles (tenant auto-add and tenant admin-add) can be added to a tenant. A Normal Role can be added to any AuthP user, while the HiddenFromTenant RoleType can only be added to non-tenant users. The AuthP library checks these rules and will give you helpful errors if you try to assign a Role in the wrong place.

2b. Creating a tenant with the user’s selected version

If you have read the Part 2, administration article, then you will have seen a section called “Take over the registering of a new user”, which sends a new user to the “Sign up now!” page to pick which version of the Invoice Manager they want. The Part 2 section was focusing on the registration of the user, but in this article we are focusing on the different Roles that are applied to the tenant and the user.

Here is a screenshot of the “Sign up now!” page to remind you what the sign up page offers.

NOTE: Please do try this example. Clone the AuthP repo and run the Example3 ASP.NET Core project. On the home page there is a “Sign up now!” tab in the NavBar which will take you to the “Welcome to the Invoice Manager” page where you can select which version you want.

I have split up the “Sign up now!” code into sections to show the parts about Roles and versioning. The first part is the two dictionaries, keyed on the enum TenantVersionTypes (Free, Pro, Enterprise), that hold the user’s Roles and the tenant Roles. Using a dictionary makes it very clear what Roles go where, and it’s easy to change / expand in the future.

NOTE: You can find “Sign up now!” code in the AddUserAndNewTenantAsync method inside UserRegisterInviteService class.

private readonly Dictionary<TenantVersionTypes, List<string>> 
     _rolesToAddUserForVersions = new()
{
    { TenantVersionTypes.Free, new List<string> { "Tenant User" } },
    { TenantVersionTypes.Pro, new List<string> { "Tenant User", "Tenant Admin" } },
    { TenantVersionTypes.Enterprise, new List<string> { "Tenant User", "Tenant Admin" } }
};

private readonly Dictionary<TenantVersionTypes, List<string>> 
    _rolesToAddTenantForVersion = new()
{
    { TenantVersionTypes.Free, null },
    { TenantVersionTypes.Pro, new List<string> { "Tenant Admin" } },
    { TenantVersionTypes.Enterprise, new List<string> { "Tenant Admin", "Enterprise" } },
};

Notice that:

  • The first dictionary defines what Normal RoleTypes are added to the user setting up the tenant. In the Free version there is no "Tenant Admin" Role.
  • The second dictionary defines what tenant Roles (Admin-add and Auto-add) are added to the tenant. These add the extra Roles that the tenant's users can / will use.

The actual code starts with some checks (check the version is set, the requested tenant name is available, etc.) which I left out to make the main part easier to read. Notice the references to a dictionary to get the Roles for the Tenant and the AuthP user.

//Add a new user, in this case using individual users account
var userStatus = await GetIndividualAccountUserAndCheckNotAuthUser(
    dto.Email, dto.Password);
if (status.CombineStatuses(userStatus).HasErrors)
    return status;

//Now we can create the tenant, with the correct tenant roles
var tenantStatus = await _tenantAdminService.AddSingleTenantAsync(
    dto.TenantName, _rolesToAddTenantForVersion[tenantVersion]);
if (status.CombineStatuses(tenantStatus).HasErrors)
    return status;

//This creates a user, with the roles suitable for the version of the app
status.CombineStatuses(await _authUsersAdmin.AddNewUserAsync(
    userStatus.Result.Id, dto.Email, null,
    _rolesToAddUserForVersions[dto.GetTenantVersionType()], 
    dto.TenantName));

Notice that:

  • The AddSingleTenantAsync call creates the tenant, with the correct tenant Roles for the user's selected version.
  • The AddNewUserAsync call creates the AuthP user, with the Normal Roles defined for the user's selected version.

The result is that a customer can sign up to the Invoice Manager version they need / can pay for. The Invoice Manager code uses an ASP.NET Core authentication provider to register the new user, and then uses AuthP's admin services to set up the Tenant and the AuthP user with the correct Roles for the version the customer has selected.

2c. Allowing the tenant admin user to change a user’s Roles

In the Pro and Enterprise versions the user that signed up to the Invoice Manager service is promoted to being a tenant admin. This allows this user to:

  • Send invites to users to join their tenant
  • Alter the Roles of users in their tenant

What the Part 2 article didn't cover is what is going on inside, especially when the tenant admin changes the Roles of a user in their tenant. In this case there are two types of Roles that the tenant admin can change:

  • The normal (base) Roles that anyone can have
  • The tenant admin-add Roles that are in the Tenant’s list of Roles.

NOTE: The tenant auto-add RoleTypes assigned to a tenant are automatically added to every logged-in user in that tenant.

When a tenant admin is editing a user's Roles the page must list the normal and the tenant admin-add Roles from the user's Tenant. The AuthP user admin service contains a method called GetRoleNamesForUsersAsync(string userId) which looks for the two types of Roles to create a dropdown list of Roles that the user can have. This method takes the userId of the user being updated, so that it can add any admin-add Roles in the user's tenant.
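The sketch below shows how that method might be used to build the dropdown in an MVC action. The SelectListItem mapping is my own illustration, not code from the library, and I'm assuming the method returns a collection of Role names.

//A sketch of building the Roles dropdown for the edit-user page.
//Assumes an injected IAuthUsersAdminService called _authUsersAdmin.
var roleNames = await _authUsersAdmin.GetRoleNamesForUsersAsync(userId);
var dropdown = roleNames
    .Select(name => new SelectListItem(name, name))
    .ToList();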

The screenshot below shows the user user1@4uInc.com, who is in the tenant called "4U Inc.", whose version is Enterprise. The dropdown list of Roles shows the normal Role called "Tenant User" and a tenant admin-add Role called "Tenant Admin". Only the "Tenant User" Role is currently selected, but if the tenant admin selected the "Tenant Admin" Role too, then user1@4uInc.com would also be a tenant admin.

2d. Manually adding Roles to a tenant

In the "creating a tenant with the user's selected version" section the code added the correct auto-add or admin-add type Roles to a tenant when it was created. You may also add some code to allow the user to upgrade to a higher version. But in some applications, you might want to simply leave it to an admin user to do this.

Only an app-level admin user (i.e. a user that isn't linked to a tenant and has permission to use the add / update tenants feature) can change the Roles in a tenant. That's because the Roles that a tenant has define what Roles the tenant users can have, and you don't want to allow a tenant user to bypass your versioning.

The screenshot below shows three tenants:

  • 4U Inc. is an Enterprise version, with both the "Tenant Admin" (admin-add) Role and the "Enterprise" (auto-add) Role (see the black tooltip with the two Roles' names when I hover over the Tenant Role? column).
  • Big Rocks Inc. is a Free version, and as such doesn't have any tenant Roles.
  • Pets Ltd. is a Pro version, with only the "Tenant Admin" (admin-add) Role.

Conclusion

Many multi-tenant / SaaS applications use versioning, e.g. GitHub, Salesforce, Facebook, YouTube, and even Twitter now has a paid "Blue" version (in some countries). Many applications offer free versions – GitHub for instance has a free version, but it has limitations which make some companies pay for GitHub's higher versions.

Versioning your multi-tenant application can increase your potential user base and profits. The downside is it makes your application more complicated. The AuthP library will help, but it still takes careful work to create, market and manage different versions of your application.

The Invoice Manager application in the Example3 project in the AuthP repo is a great example to look at, and it's designed to run using the localdb SQL Server that Visual Studio installs (or you can change the database connection string in the appsettings.json file if you want a different SQL Server). It automatically loads some demo data so that you can try the various features described in the first three articles.

Happy coding.

Building ASP.NET Core and EF Core hierarchical multi-tenant apps


This article looks at a specific type of multi-tenant application where a tenant can have sub-tenants. For instance, a large company called XYZ with different businesses could have a top-level tenant called XYZ, with sub-tenants for each business it has. This multi-tenant type is known as a hierarchical multi-tenant application.

The big advantage of a hierarchical multi-tenant application is that a higher-level tenant can see all the data in the levels below it. For instance, the top-level XYZ tenant could see all the data in the businesses below, but each individual business can't see the data of the other businesses – the "Setting the scene" section gives more examples of where a hierarchical multi-tenant approach can be useful.

The other articles in “Building ASP.NET Core and EF Core multi-tenant apps” series are:

  1. The database: Using a DataKey to only show data for users in their tenant
  2. Administration: different ways to add and control tenants and users
  3. Versioning your app: Creating different versions to maximise your profits
  4. Hierarchical multi-tenant: Handling tenants that have sub-tenants (this article)

Also read the original article that introduced the library called AuthPermissions.AspNetCore (shortened to AuthP in these articles), which provides pre-built (and tested) code to help you build multi-tenant apps using ASP.NET Core and EF Core.

TL;DR; – Summary of this article

  • This article describes what a hierarchical multi-tenant application is and provides a couple of examples of where a hierarchical approach can help.
  • The article lists the three changes to the single level multi-tenant setup listed in the Part 1 article to create a hierarchical multi-tenant application.
  • Because a hierarchical multi-tenant application has sub-tenants, the name and DataKey of a tenant are a combination of the names / DataKeys of the parent tenants.
  • Many of the features / administration of a hierarchical multi-tenant are the same as a single level multi-tenant application, but the create and update of a tenant requires a parent tenant to define the tenant’s place in the hierarchy. There is also an extra feature which allows you to move a tenant to a different place / level in the hierarchy.

Setting the scene – when are hierarchical multi-tenant applications useful?

A hierarchical multi-tenant approach is useful when the data, or users, are managed in sections. Typically, the sub-tenants are created per business grouping or geographic area, or both. Each sub-tenant is separate from the others, and a user linked to a sub-tenant can only see the data in their tenant, while a user linked to a top-level tenant can see all the sub-tenants' data. Here is a real example to make this clearer.

I was asked to design a hierarchical multi-tenant application for a company that manages the stocking and sales for many chains of retail outlets. Companies would sign up to their service, and some of the companies had hundreds of outlets all across the USA and beyond, which were managed locally. This meant the multi-tenant design had to handle multiple layers, and the diagram below gives you an idea of what that might look like.

Using the diagram above, the hierarchical multi-tenant design allows:

  • The “4U Inc.” tenant to see all the data from all their shops worldwide
  • The “West Coast” tenant to only see the shops in their West Coast region
  • The “San Fran” tenant to see the shops in their San Fran region
  • Each retail outlet can only see their data.

The data contains stock and sales data, including information on the person who made the sale. This allows business analytics, restock scheduling and even comparing the performance of shops and their staff across the different hierarchical levels.

So, to answer the question “when are hierarchical multi-tenant applications useful?” it’s when there are multiple groups of data and users, but there is a business advantage for a management team to have access to these multiple groups of data. If you have a business case like that, then you should consider a hierarchical multi-tenant approach.

How AuthP library manages a hierarchical multi-tenant

The part 1 article, which is about setting up the multi-tenant database, gave you eight stages (see this section) to register the AuthP library and set up the code to split the data into normal (single-level) tenants. Splitting the data into tenants uses a string, known as the DataKey, that contains the primary key (e.g. "123.") of the tenant, and EF Core's global query filter uses an exact match to filter each tenant e.g.

modelBuilder.Entity<YourEntity>().HasQueryFilter(
   entity => entity.DataKey == dataKey);

The big change when using a hierarchical multi-tenant is that AuthP creates the DataKey from a combination of the tenant primary keys. Then, by changing EF Core's global query filter to a StartsWith filter (see below), the data can be managed as a hierarchical multi-tenant.

modelBuilder.Entity<YourEntity>().HasQueryFilter(
   entity => entity.DataKey.StartsWith(dataKey));

So, for a hierarchical multi-tenant AuthP creates the DataKey by combining the DataKeys of the higher levels, so "4U Inc." might be "1.", while "4U Inc., West Coast" might be "1.3." and so on. The diagram shows the layers again, but now with the DataKeys added. From this you can see a DataKey of 1.3.7. would access the two shops in LA, but the people within each shop can only see the stock / sales for their shop.
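To make the StartsWith filtering concrete, here is a small stand-alone illustration. The DataKey values follow the diagram above, but the collection and names are purely for demonstration.

//Illustration of hierarchical DataKey filtering with example DataKeys
var allEntities = new List<(string DataKey, string Name)>
{
    ("1.", "4U Inc. head office data"),
    ("1.3.", "West Coast data"),
    ("1.3.7.", "LA data"),
    ("1.3.7.1.", "LA shop 1 data"),
    ("1.3.7.2.", "LA shop 2 data"),
    ("1.3.8.", "SanFran data"),
};
var dataKey = "1.3.7."; //the DataKey of a user in the LA tenant
var visibleToUser = allEntities
    .Where(e => e.DataKey.StartsWith(dataKey))
    .ToList(); //returns the LA data plus the two LA shops' data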

This simple change to the EF Core’s global query filter in your application (and some extra AuthP code) changes your application from being a normal (single level) multi-tenant to a hierarchical multi-tenant.

The next section gives you changes to the eight steps to set up your database shown in the part 1 article.

Setting up a hierarchical multi-tenant application

It turns out that setting up a hierarchical multi-tenant application is almost the same as setting up a single-level multi-tenant application. The Part 1 article covers setting up the database for a single-level multi-tenant, and a hierarchical multi-tenant only changes three of those steps: 1, 6, and an addition to step 8.

You can see the full list of all the eight steps in Part 1 so I have listed just the changed steps, but you MUST apply all the steps in the Part 1 article – it’s just the following steps are different.

1. Register the AuthP library in ASP.NET Core

You need to add the AuthP NuGet package to your application and set up your Permissions (see this section in the Part 1 article). Registering AuthP to the ASP.NET Core dependency injection (DI) provider is also similar, but there is one key difference – the setting of the TenantType in the options.

The code below is taken from the Example4 project (with some test data removed) in the AuthP repo, with the TenantType setup highlighted.

services.RegisterAuthPermissions<Example4Permissions>(options =>
    {
        options.TenantType = TenantTypes.HierarchicalTenant;
        options.AppConnectionString = connectionString;
        options.PathToFolderToLock = _env.WebRootPath;
    })
    .UsingEfCoreSqlServer(connectionString)
    .IndividualAccountsAuthentication()
    .RegisterTenantChangeService<RetailTenantChangeService>()
    .RegisterFindUserInfoService<IndividualAccountUserLookup>()
    .RegisterAuthenticationProviderReader<SyncIndividualAccountUsers>()
    .SetupAspNetCoreAndDatabase(options =>
    {
        //Migrate individual account database
        options.RegisterServiceToRunInJob<StartupServiceMigrateAnyDbContext<ApplicationDbContext>>();

        //Migrate the application part of the database
        options.RegisterServiceToRunInJob<StartupServiceMigrateAnyDbContext<RetailDbContext>>();
    });

NOTE: Look at the documentation about the AuthP startup setting / methods for more information and also have a look at the Example4 ASP.NET Core project.

6. Add EF Core’s query filter

This stage is exactly the same as in the step 6 in the Part 1 article, apart from one change – the method you call to set up the global query filter, which is highlighted in the code.

public class RetailDbContext : DbContext, IDataKeyFilterReadOnly
    {
        public string DataKey { get; }

        public RetailDbContext(DbContextOptions<RetailDbContext> options, IGetDataKeyFromUser dataKeyFilter)
            : base(options)
        {
            DataKey = dataKeyFilter?.DataKey ?? "Impossible DataKey"; 
        }

        //other code removed…

        protected override void OnModelCreating(ModelBuilder modelBuilder)
        {
            //other configuration code removed…

            foreach (var entityType in modelBuilder.Model.GetEntityTypes())
            {
                if (typeof(IDataKeyFilterReadOnly)
                    .IsAssignableFrom(entityType.ClrType))
                {
                    entityType.AddHierarchicalTenantReadOnlyQueryFilter(this);
                }
                else
                {
                    throw new Exception(
                        $"You missed the entity {entityType.ClrType.Name}");
                }
            }
        }
    }

This code automates the setting up of EF Core's query filter, and its database type, size and index. Automating the query filter makes sure you don't forget to apply the DataKey query filter in your application, because a missed DataKey query filter creates a big hole in the security of your multi-tenant application.

The AddHierarchicalTenantReadOnlyQueryFilter extension method is part of the AuthP library and does the following:

  • Adds a query filter to the entity's DataKey, which must start with the DataKey provided by the service IGetDataKeyFromUser described in step 3.
  • Sets the string type to varchar(250). That makes a (small) improvement to performance.
  • Adds an SQL index to the DataKey column to improve performance.

NOTE: a size of 250 characters for the DataKey allows a depth of 25 sub-tenants, which should be enough for most needs.
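For illustration, the configuration that the extension method applies to each entity is roughly equivalent to the manual setup below. This is a sketch, not the library's code – RetailSale is a hypothetical entity class, and the library applies this via the model's metadata rather than per-entity calls. The DataKey property referenced is the DbContext's property shown in the code above.

//Roughly what AddHierarchicalTenantReadOnlyQueryFilter sets up per entity
modelBuilder.Entity<RetailSale>(b =>
{
    //hierarchical filter: the entity's DataKey must start with the user's DataKey
    b.HasQueryFilter(e => e.DataKey.StartsWith(DataKey));
    //varchar(250) is smaller / faster than the default nvarchar(max)
    b.Property(e => e.DataKey).HasColumnType("varchar(250)");
    //an index on the DataKey column improves the filter's performance
    b.HasIndex(e => e.DataKey);
});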

8. Make sure AuthP and your application’s DbContexts are in sync

A hierarchical multi-tenant application still needs to sync changes between the AuthP database and your application's database, which means creating an ITenantChangeService service and registering that service via the AuthP library registration. I'm not going to detail those steps because they are already described in stage 8 in the Part 1 article.

What is different is that a tenant's rename and delete are much more complicated, because there may be multiple tenants to update or delete. The AuthP code manages this for you, but you need to understand that it will call your ITenantChangeService code multiple times, and some of the calls will include higher layers.

Also, you must implement the MoveHierarchicalTenantDataAsync method in your ITenantChangeService code, because AuthP allows you to move a tenant, with any sub-tenants, to another place in the hierarchical tenants – see the next section for a diagram of what a move can do.

NOTE: I recommend you look at Example4's RetailTenantChangeService class for an example of how you would build your ITenantChangeService code for hierarchical tenants.

AuthP’s tenant admin service

The AuthP's ITenantAdminService handles both a normal (single-level) and a hierarchical multi-tenant. The big change is that a tenant can have sub-tenants, so a list of tenants shows the full tenant name, which is a combination of all the tenant names with a | between them – see the screenshot from Example4 for an example of hierarchical tenants.

Note that hierarchical tenants have an extra feature called Move that the normal (single level) multi-tenant doesn’t have. This allows you to move a tenant, with all of its sub-tenants, to another level in your tenants. The diagram below shows an example of a move, where a new “California” tenant is added and then the “LA” and “SanFran” tenants are moved into the “California” tenant. The Move code recursively goes through the tenant + sub-tenants updating the parent and recalculating the DataKey for each tenant. This also calls your ITenantChangeService code at each stage so that you can update the DataKey of your application’s data.

Other general multi-tenant features such as administration features (see the part 2 article) and versioning (see the part 3 article) also apply to a hierarchical multi-tenant approach.

Conclusion

When I was asked to design and build a hierarchical multi-tenant application for a client, I found the project a challenge and I really enjoyed working on it. Over the years I have thought about the client’s project, and I have found ways to improve the first design.

Two years after the client project I wrote an article called "A better way to handle authorization in ASP.NET Core", which improved on some parts of the client project. Another two years later I came up with the AuthP library, which improves the code again and makes it much easier for a developer to create a hierarchical multi-tenant (and other things too).

My first-hand experience of designing and building a real hierarchical multi-tenant application for a client means I had a good idea of what is really needed to create a proper application. For instance, the part 2 article has a lot of administration features which I know are needed, and the part 3 article adds the important feature of offering different versions of your multi-tenant application to users.

Happy coding.

Advanced techniques around ASP.NET Core Users and their claims


This article describes some advanced techniques around adding or updating claims of users when building ASP.NET Core applications. These advanced techniques are listed below with examples taken from the AuthPermissions.AspNetCore library / repo.

This article is part of the AuthPermissions.AspNetCore (shortened to AuthP in this article) documentation, which contains links to the other articles about the AuthP library.

TL;DR; – Summary of this article

  • A logged-in user in an ASP.NET Core application has a HttpContext.User that has information (stored in claims) about the logged-in user and, optionally, data on what pages / features the user can access (known as authorization).
  • The user and their claims are calculated on login and then stored in a Cookie or a JWT Bearer Token. This makes subsequent accesses to the application by that user fast, as the user's claims are stored and can be read in quickly.
  • You can add extra claims when a user logs in, either to add extra authorization, or value(s) that take a long time to calculate but change infrequently. 
  • The way you add extra claims on login depends on which authentication handler's API you are using and which way you store the claims.
  • In certain situations, you might want to recalculate the claims of an already logged-in user, and this article describes two refresh claims approaches to do this – a periodic approach and an event-driven approach.

Setting the scene – What the AuthP library adds to ASP.NET Core?

The AuthP library provides three main extra authorization features to an ASP.NET Core application. They are:

  • An improved Role authorization system where the features a Role can access can be changed by an admin user (i.e. no need to edit and redeploy your application when a Role changes).
  • Provides features to create a multi-tenant database system, either using single-level tenants or multi-level (hierarchical) tenants.
  • Implements a JWT refresh token feature to improve the security of using JWT Token in your application.

The rest of this article describes various ways to add or change the user’s claims in an ASP.NET Core application. Each one either improves the performance of your application or solves an issue around logged-in user claims being out of date.

1. Adding extra claims on to authentication handlers on login

ASP.NET Core's authentication handlers are there to ensure that the person who is logging in is verified as a known user. On a successful login they add information, in the form of claims, about the user and what they can access – known as authorization. Together these form a ClaimsPrincipal class.

This ClaimsPrincipal class is stored in a form that allows the user to access the application without logging in again. In ASP.NET Core there are two main ways: in a Cookie or in a JWT Bearer Token (shortened to JWT Token). These work in different ways, but they do the same thing – that is, provide a secure version of the user's log-in data that will create a HttpContext.User on every HTTP request the user makes.

The brilliant part of storing the claims in a Cookie or JWT Token is they are super-fast – the claims are calculated on login, which takes some time, but for subsequent HTTP requests the claims are just read in from the Cookie or JWT Token.

The AuthP library adds extra authorization claims to the ClaimsPrincipal class on login. It does this by intercepting a login event if the claims are stored in a Cookie, or for a JWT Token it adds code within the building of the Token. AuthP relies on a unique string that the authentication handler creates for each user, referred to as the userId. The library has a class called ClaimsCalculator which, when called with the user's userId, returns the extra AuthP claims needed for that user.

Currently, the AuthP library has built-in connectors for individual user accounts and Azure Active Directory (Azure AD) when using a Cookie to store the logged-in user's information. But if you are using a JWT Bearer Token to store the logged-in user's information then it's easy to connect to AuthP's claims, which is described later.

ASP.NET Core has lots of authentication handlers, and if you want to link into an authentication handler's login you need to know how to tap into its login event. Thankfully, most of the main authentication providers you might want to use rely on two external services APIs, OAuth2 and OpenID Connect. The next two sections look at how to link into these two APIs to add AuthP claims.

1a. OAuth2: Used by Google, Facebook, Twitter, etc.

The OAuth2 API is an industry-standard protocol for authorization, released in 2012, and is used by lots of external authentication providers. The ASP.NET Core documentation uses OAuth2 for social logins like Google, Facebook, Twitter, but you can use OpenID Connect for these too (see this article about using OpenID Connect to use Google social login).

To add extra claims on login, you need to link to the OnCreatingTicket event of the ASP.NET Core authentication handler. The event call to your method provides an OAuthCreatingTicketContext parameter which provides the current tokens / claims (at the OAuth2 level they are represented by the AuthenticationToken class). See the code in this section on adding extra tokens / claims on login.

NOTE: The OAuthCreatingTicketContext contains a HttpContext class, which allows you to use dependency injection to get an instance of AuthP's ClaimsCalculator to get the AuthP claims, e.g., ctx.HttpContext.RequestServices.GetRequiredService<IClaimsCalculator>();
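As an illustration, the sketch below shows the shape of linking into the OnCreatingTicket event, using the Google handler as an example. The client id / secret values are placeholders, and the claims-adding lines are my assumption of how you would call AuthP's ClaimsCalculator here, not the library's actual connector code.

//A sketch of adding AuthP's claims in the OAuth2 OnCreatingTicket event
services.AddAuthentication()
    .AddGoogle(options =>
    {
        options.ClientId = "...from your configuration...";
        options.ClientSecret = "...from your configuration...";
        options.Events.OnCreatingTicket = async ctx =>
        {
            //use the HttpContext to get AuthP's ClaimsCalculator
            var claimsCalculator = ctx.HttpContext.RequestServices
                .GetRequiredService<IClaimsCalculator>();
            //assumes the userId is in the NameIdentifier claim
            var userId = ctx.Principal
                .FindFirst(ClaimTypes.NameIdentifier)?.Value;
            var extraClaims = await claimsCalculator
                .GetClaimsForAuthUserAsync(userId);
            ctx.Identity.AddClaims(extraClaims);
        };
    });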

1b. OpenID Connect: Used by Microsoft accounts / Azure AD, etc.

OpenID Connect is based on OAuth2 and came out in 2014 to fix some issues found in OAuth2. Its design is more secure and it standardises the authentication steps. The ASP.NET Core documentation uses OpenID Connect to use a Microsoft account / Azure AD as an external authentication source.

NOTE: Andrew Lock has written an excellent article about OpenID Connect in ASP.NET Core which explains how OpenID Connect works and how it differs from OAuth2.

As I said earlier, the AuthP library has a connector to Azure AD via a method called SetupOpenAzureAdOpenId, which is an excellent example of how to build an OpenID Connect connector. This method links code to the OnTokenValidated event, which allows you to add / change the claims of the logging-in user. Just like the OAuth2 event, the call's parameter contains the HttpContext class, so you can use manual dependency injection to get access to AuthP's ClaimsCalculator.
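Here is a hedged sketch of the same idea for OpenID Connect, configuring the OnTokenValidated event. This is not the library's SetupOpenAzureAdOpenId code, just the general shape of such a connector under the same assumptions as the OAuth2 sketch.

//A sketch of adding AuthP's claims in OpenID Connect's OnTokenValidated event
services.Configure<OpenIdConnectOptions>(
    OpenIdConnectDefaults.AuthenticationScheme, options =>
    {
        options.Events.OnTokenValidated = async ctx =>
        {
            var claimsCalculator = ctx.HttpContext.RequestServices
                .GetRequiredService<IClaimsCalculator>();
            var userId = ctx.Principal
                .FindFirst(ClaimTypes.NameIdentifier)?.Value;
            var extraClaims = await claimsCalculator
                .GetClaimsForAuthUserAsync(userId);
            (ctx.Principal.Identity as ClaimsIdentity)?.AddClaims(extraClaims);
        };
    });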

1c. Telling AuthP you have set up a custom authentication connector

The last thing you need to do is to add the ManualSetupOfAuthentication method to the registering of the AuthP library in your Program class (or Startup class in NET 3.1). 

2. Adding an extra Claim(s) to a user via AuthP

In the AuthP repo, the Example3 project is a single-level multi-tenant web application called the Invoice Manager. When a company signs up to use the Invoice Manager, the banner changes to show the company's name. Styling the application to the customer like that improves the user's experience, but the downside is that every HTTP request needs two database accesses to get the company's name (one for redirecting to the correct page and another to add the header). That hits the performance of the overall application, so how could I improve that?

As already explained, claims are calculated when a user logs in and then stored in a Cookie or a JWT Bearer Token. This means that if I added the company's name as a claim, then I would remove two database queries for every HTTP request.

Version 2.3.0 of the AuthP library has a RegisterAddClaimToUser<TClaimsAdder> method that allows you to add extra claims when a user logs in. The class you register with this method must implement the IClaimsAdder interface and is then registered with the ASP.NET Core dependency injection. When AuthP's ClaimsCalculator is called it will add AuthP's default claims plus all of the claims you added via the RegisterAddClaimToUser method.
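As a sketch, the registration looks something like the code below. Exactly where the method goes in the AuthP setup chain, and the Example3Permissions name, are my assumptions, so treat this as an illustration rather than the definitive setup.

//A sketch of registering a claims adder in the AuthP setup chain
services.RegisterAuthPermissions<Example3Permissions>(options =>
    {
        //... your normal AuthP options
    })
    //... other AuthP setup methods left out
    .RegisterAddClaimToUser<AddTenantNameClaim>();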

The code below shows the code I used to find the tenant company name. NOTE: if the user isn’t a tenant user, then no claim is added.

public class AddTenantNameClaim : IClaimsAdder
{
    public const string TenantNameClaimType = "TenantName";

    private readonly IAuthUsersAdminService _userAdmin;

    public AddTenantNameClaim(IAuthUsersAdminService userAdmin)
    {
        _userAdmin = userAdmin;
    }

    public async Task<Claim> AddClaimToUserAsync(string userId)
    {
        var user = (await _userAdmin.FindAuthUserByUserIdAsync(userId)).Result;

        return user?.UserTenant?.TenantFullName == null
            ? null
            : new Claim(TenantNameClaimType, user.UserTenant.TenantFullName);
    }
}
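Once the claim is in the Cookie, reading the cached company name is a simple claims lookup. The sketch below assumes you are in a controller or Razor page where the User property holds the current ClaimsPrincipal.

//Reading the cached tenant name from the logged-in user's claims
var tenantName = User.Claims.SingleOrDefault(
    c => c.Type == AddTenantNameClaim.TenantNameClaimType)?.Value;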

A simple test of displaying the company home page many times gave me the following timings:

  1. Before adding the tenantName claim: 20 to 25 ms
  2. After adding the tenantName claim: 19 to 20 ms

That's a ~3 ms / ~10% performance improvement for a small amount of extra code, so I think that's a win. But what happens if the tenant's (company's) name is changed? Read the next two approaches for ways to overcome this.

NOTE: You can see this approach in the Example3 ASP.NET Core project. If you clone the AuthPermissions.AspNetCore repo and run the Example3.MvcWebApp project then it will seed the database with demo data on its first run and you can try this out.

3. Periodically refreshing the logged-in user’s claims

There are a few times when you want the user's claims to be updated. The main one is if you change some of the user's authorization parts, like Roles, or you change the Permissions in an AuthP Role – these only get changed when the user logs out and logs back in again. This can have security issues, as you might have a logged-in user whose authorization you want to revoke in some way.

My answer to this issue is to recalculate the user's claims every so often – how often you update the claims of a logged-in user is a compromise between performance and claims being 'correct'. If you recalculate every second, then applications with lots of users will become slow, as recalculating the AuthP claims needs two database accesses to set up the Permissions and the DataKey. Alternatively, updating the company name isn't a security issue, so you might want to wait say 10 minutes or more between recalculating the claims.

How you do this depends on whether you are using a Cookie or a JWT Bearer Token to hold the calculated user’s claims. The two subsections show you how to do this:

3a. Refreshing the claims in a Cookie

Refreshing a user's authentication Cookie is available via the cookie OnValidatePrincipal event. Lots of authentication handlers use an authentication Cookie by default, but it's sometimes hard to find the Cookie events. The individual user accounts authentication handler is easy to set up – see the code below for how you configure it, with the setup of the event highlighted.

services.AddDefaultIdentity<IdentityUser>(options =>
        options.SignIn.RequireConfirmedAccount = false)
    .AddEntityFrameworkStores<ApplicationDbContext>();
services.ConfigureApplicationCookie(options =>
{
    options.Events.OnValidatePrincipal = 
        PeriodicCookieEvent.PeriodicRefreshUsersClaims;
});

Finding the way to link to the cookie events when using OpenID Connect with an external Azure AD was much harder, but the code below shows how to set up the PeriodicRefreshUsersClaims method.

    .AddMicrosoftIdentityWebApp(identityOptions =>
    {
        var section = _configuration.GetSection("AzureAd");
        identityOptions.Instance = section["Instance"];
        identityOptions.TenantId = section["TenantId"];
        identityOptions.ClientId = section["ClientId"];
        identityOptions.CallbackPath = section["CallbackPath"];
        identityOptions.ClientSecret = section["ClientSecret"];
    }, cookieOptions =>
        cookieOptions.Events.OnValidatePrincipal =
            PeriodicCookieEvent.PeriodicRefreshUsersClaims);

You also need to create a claim holding the time when the user should be updated. Here is some code that adds a claim called TimeToRefreshUserClaim, which contains a time one minute in the future.

public class AddRefreshEveryMinuteClaim : IClaimsAdder
{
    public Task<Claim> AddClaimToUserAsync(string userId)
    {
        var claimValue = DateTime.UtcNow.AddMinutes(1).ToString("O");
        var claim = new Claim(TimeToRefreshUserClaimType, claimValue);
        return Task.FromResult(claim);
    }
}

This claim will be used by the method linked to the Cookie's OnValidatePrincipal event in the PeriodicCookieEvent class, which is shown below. The highlighted lines compare the time in the TimeToRefreshUserClaim claim with the current time (UTC) and update the user's claims (including the TimeToRefreshUserClaim) if the user's time is older than DateTime.UtcNow.

NOTE: I'm not recommending a refresh time of 1 minute; I only used a short time because it's easier to check it's working.

public static async Task PeriodicRefreshUsersClaims
    (CookieValidatePrincipalContext context)
{
    var originalClaims = context.Principal.Claims.ToList();

    if (originalClaims.GetClaimDateTimeUtcValue
        (TimeToRefreshUserClaimType) < DateTime.UtcNow)
    {
        //Need to refresh the user's claims 
        var userId = originalClaims.GetUserIdFromClaims();
        if (userId == null)
            //this shouldn't happen, but best to return
            return;

        var claimsCalculator = context.HttpContext.RequestServices
            .GetRequiredService<IClaimsCalculator>();
        var newClaims = await claimsCalculator
              .GetClaimsForAuthUserAsync(userId);
        newClaims.AddRange(originalClaims
            .RemoveUpdatedClaimsFromOriginalClaims(newClaims));

        var identity = new ClaimsIdentity(newClaims, "Cookie");
        var newPrincipal = new ClaimsPrincipal(identity);
        context.ReplacePrincipal(newPrincipal);
        context.ShouldRenew = true;
    }
}
private static IEnumerable<Claim> RemoveUpdatedClaimsFromOriginalClaims
    (this List<Claim> originalClaims, List<Claim> newClaims)
{
    var newClaimTypes = newClaims.Select(x => x.Type);
    return originalClaims.Where(x => !newClaimTypes.Contains(x.Type));
}

This method is very quick if the user doesn't need refreshing – it parses a string from an already loaded claim into a DateTime and compares it to a time, which takes a few microseconds. That's important because it will be called on every HTTP request. If the claims do need refreshing there are a few database accesses which take milliseconds, which is why there is a compromise between performance and claims being 'correct'.
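The claim helper methods used above aren't shown in the snippet; below is a minimal sketch of what they might look like. These are assumed implementations for illustration, not the library's actual code.

//Assumed sketch of the claim helper methods used by the event code
public static class ClaimHelperSketch
{
    public static DateTime GetClaimDateTimeUtcValue(
        this List<Claim> claims, string claimType)
    {
        var value = claims
            .SingleOrDefault(c => c.Type == claimType)?.Value;
        //DateTime.MinValue makes a missing claim trigger a refresh
        return value == null
            ? DateTime.MinValue
            : DateTime.ParseExact(value, "O",
                CultureInfo.InvariantCulture, DateTimeStyles.RoundtripKind);
    }

    public static string GetUserIdFromClaims(this List<Claim> claims)
        => claims.SingleOrDefault(
            c => c.Type == ClaimTypes.NameIdentifier)?.Value;
}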

NOTE: See the line context.ShouldRenew = true at the end of the PeriodicRefreshUsersClaims method. This makes sure that the Cookie is updated. If you don't add this line, then the new claims work in this HTTP request, but the claims aren't changed for the next HTTP request.

NOTE: You can see this approach in the Example3 ASP.NET Core project. If you clone the AuthPermissions.AspNetCore repo and run the Example3.MvcWebApp project then it will seed the database with demo data on its first run and you can try this out.

3b. Refreshing the claims in a JWT Bearer Token

A basic JWT Bearer Token (shortened to JWT Token) can't be updated – once it is created you can't change it until it times out, maybe 8 hours later. But this long life of the JWT Token creates a security issue: if a hacker can get a copy of the JWT Token they can access the application too. Also, there is no way to log out a JWT Token, which is another security issue.

The solution to these JWT Token security issues is adding a refresh token, and the AuthP library contains an implementation of the refresh token in ASP.NET Core. This implementation can also refresh the user's claims, but first let's look at how the refresh token works.

The diagram below shows that the JWT Token times out quickly, in this case every five minutes, but the refresh token provides the authority to create a new JWT Token.

In this scheme the JWT Token can still be copied, but it's only valid for a short time. The refresh token also provides extra security because it can only be used once, and it can be revoked, which will log out the user. The short life of the JWT Token and the one-time use of the refresh token make the use of a JWT Token much more secure. It also provides a way to periodically update the user's claims.

The AuthP library provides a TokenBuilder class that can create just a JWT Token or a JWT Token with an associated refresh token. When creating the JWT Token it uses AuthP’s ClaimsCalculator class to add the AuthP’s claims. This means if you are using the JWT Token with refresh, then the user’s claims are updated on every refresh.

NOTE: For more information on JWT Token with refresh you can look at a video I created explaining how the JWT Token with refresh works,  or look at the AuthP’s JWT Token refresh explained documentation.

4. Refreshing the logged-in user’s claims on an event

The next challenge is when we need to immediately refresh the claims of a logged-in user in a way that doesn't make every HTTP request slow. This "immediate refresh" requirement is needed when using the hierarchical multi-tenant "Move Tenant" feature. This feature changes the DataKey of the moved tenants, which means the DataKey claim of users linked to a tenant that was moved needs an immediate update.

Implementing the immediate refresh of the claims of a logged-in user requires three parts:

  1. A way to know if a logged-in user’s claims need updating.
  2. A way to tell ASP.NET Core to immediately refresh the logged-in user's DataKey claim
  3. A way to detect when a DataKey is changed

NOTE: This only works with Cookie authentication because you can't force an immediate update of a JWT Token.

NOTE: You can see this approach in the Example4 ASP.NET Core project. Make sure to look at the RefreshUsersClaims folder in the Example4.ShopCode project for the various code shown in this approach.

4a. A way to know if a logged-in user’s claims need updating.

In the periodic refresh of the logged-in user's claims a claim contained the time when it should be updated. With the refresh on an event, we need the opposite – the time when the logged-in user's claims were last updated. The code below adds a claim that contains the time the claims were last created / updated.

public class AddGlobalChangeTimeClaim : IClaimsAdder
{
    public Task<Claim> AddClaimToUserAsync(string userId)
    {
        var claimValue = DateTime.UtcNow.ToString("O");
        var claim = new Claim(EntityChangeClaimType, claimValue);
        return Task.FromResult(claim);
    }
}

4b. Telling ASP.NET Core to immediately refresh user’s claims

We use the same cookie OnValidatePrincipal event as the periodic update of claims, but now we need the last time a DataKey was changed. If you only have one instance of your application you could use a static variable, but if you have multiple instances of your application running (Azure calls this Scale Out), then a static variable won't work – you need a global resource that all the instances can see.

For the "multiple instances" case you could use the database, but that would need a database access on every HTTP request, which would hit performance. An in-memory distributed cache like Redis would be quicker, but I used a simpler global resource – the application's FileStore. The GlobalFileStoreManager class in the AuthP repo implements a global FileStore, storing the value in a text file in the ASP.NET Core wwwroot directory. The GlobalFileStoreManager is quicker than a database, only taking 0.02 ms (on my dev machine) to read.
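A minimal sketch of such a file-backed change-time service is shown below. This is my own simplified version with assumed names behind the IGlobalChangeTimeService interface used later in this section – it is not the library's GlobalFileStoreManager, which does more.

//A simplified sketch of a file-backed global change-time service
public class GlobalChangeTimeService : IGlobalChangeTimeService
{
    private readonly string _filePath;

    public GlobalChangeTimeService(IWebHostEnvironment env)
    {
        //store the value in a text file in the wwwroot directory
        _filePath = Path.Combine(env.WebRootPath, "GlobalChangeTime.txt");
    }

    public void SetGlobalChangeTimeToNowUtc()
        => File.WriteAllText(_filePath, DateTime.UtcNow.ToString("O"));

    public DateTime GetGlobalChangeTimeUtc()
        => File.Exists(_filePath)
            ? DateTime.ParseExact(File.ReadAllText(_filePath), "O",
                CultureInfo.InvariantCulture, DateTimeStyles.RoundtripKind)
            : DateTime.MinValue; //no change recorded yet
}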

So, the OnValidatePrincipal event can use the GlobalFileStoreManager to read the last time a DataKey was updated, and it can compare that to the last time the logged-in user's claims were updated and refresh the claims if they are older.

The code below shows the event method in the TenantChangeCookieEvent class. This is almost the same as the periodic refresh event, with the changed lines highlighted.

public static async Task UpdateIfGlobalTimeChangedAsync
    (CookieValidatePrincipalContext context)
{
    var originalClaims = context.Principal.Claims.ToList();
    var globalTimeService = context.HttpContext.RequestServices
        .GetRequiredService<IGlobalChangeTimeService>();
    var lastUpdateUtc = globalTimeService.GetGlobalChangeTimeUtc();

    if (originalClaims.GetClaimDateTimeUtcValue
           (EntityChangeClaimType) < lastUpdateUtc)
    {
        //Need to refresh the user's claims 
        var userId = originalClaims.GetUserIdFromClaims();
        if (userId == null)
            return;

        var claimsCalculator = context.HttpContext.RequestServices
            .GetRequiredService<IClaimsCalculator>();
        var newClaims = await claimsCalculator
            .GetClaimsForAuthUserAsync(userId);
        newClaims.AddRange(originalClaims
            .RemoveUpdatedClaimsFromOriginalClaims(newClaims)); 

        var identity = new ClaimsIdentity(newClaims, "Cookie");
        var newPrincipal = new ClaimsPrincipal(identity);
        context.ReplacePrincipal(newPrincipal);
        context.ShouldRenew = true;
    }
}

4c. A way to detect when a DataKey is changed

The version 2.3.0 of the AuthP library introduces an IRegisterStateChangeEvent interface, which allows you to use EF Core 5’s events to detect a series of events. This lets you trigger code when different events happen within the AuthP’s DbContext.

The code below shows the code to register the specific event(s) you want to use. In this case I want to detect a change to the ParentDataKey property in the Tenant entity class, so I add a private event handler called RegisterDataKeyChange to the ChangeTracker.StateChanged event. When that event fires, the handler uses a service, which in turn uses the GlobalFileStoreManager, to set the global change time.

public class RegisterTenantDataKeyChangeService 
    : IRegisterStateChangeEvent
{
    private readonly IGlobalChangeTimeService _globalAccessor;
    public RegisterTenantDataKeyChangeService(
        IGlobalChangeTimeService globalAccessor)
    {
        _globalAccessor = globalAccessor;
    }

    public void RegisterEventHandlers(AuthPermissionsDbContext context)
    {
        context.ChangeTracker.StateChanged += 
            RegisterDataKeyChange;
    }

    private void RegisterDataKeyChange(object sender, 
        EntityStateChangedEventArgs e)
    {
        if (e.Entry.Entity is Tenant
            && e.NewState == EntityState.Modified
            && e.Entry.OriginalValues[nameof(Tenant.ParentDataKey)] !=
            e.Entry.CurrentValues[nameof(Tenant.ParentDataKey)])
        {
            //A tenant DataKey updated, so set the global value 
            _globalAccessor.SetGlobalChangeTimeToNowUtc();
        }
    }
} 

This is very efficient because it’s only a DataKey change that causes the immediate update of the logged-in user’s claims.

NOTE: There is a very small possibility that a logged-in user could start a database access before the DataKey is changed, and the database access is delayed by the "Move Tenant" transaction. In this case the user's database access will use the old DataKey. If you want to ensure this cannot happen you could put your application into a mode where the user is diverted to a "please wait" page for a few seconds prior to the "Move Tenant". I have an old article about how to make your application "down for maintenance" in ASP.NET MVC5, but the same approach would work with ASP.NET Core.

Conclusion

Most ASP.NET Core authentication handlers provide the basic claims but little or no authorization claims. A few, like individual user accounts with Roles-based authorization, provide a full authentication / authorization solution. The AuthP library adds authorization claims, similar to the Roles-based authorization, for accessing features / pages and also the database (i.e. multi-tenant applications).

As you have learnt in this article, claims are the key part of authentication and authorization. They are also fast, because they are calculated on login, and on every HTTP request after that the user's claims are read in from a Cookie or a JWT Token. If you have a value that takes time to create and is used in every HTTP request, then consider turning that value into a claim on login.

The downside of the claims being stored in a Cookie or a JWT Token is that they are fixed throughout the time the user is logged in. In a few cases this is a problem, and this article gives two ways to overcome this problem while not causing a slowdown of your application.

Most of the techniques in this article are advanced, but each one has a valid use in real-world applications. You most likely won’t use these techniques very often, but if you need something like this then you now have some example code to use.

Happy coding.

Part6: Using sharding to build multi-tenant apps using ASP.NET Core and EF Core


This article describes how to use EF Core and ASP.NET Core to create a multi-tenant application where each different group (known as a tenant) has its own database – this is known as sharding. A second section describes how to build a hybrid multi-tenant design which supports both multiple tenants in one database and tenants that have their own database, i.e. sharding.

NOTE: Multi-tenant applications are also referred to as SaaS (Software as a Service). In these articles I use the term multi-tenant as SaaS can also cover having one application + database for each company / tenant.

This article is part of the series that covers .NET multi-tenant applications in general. Also, the designs shown in this article come from the library called AuthPermissions.AspNetCore (shortened to AuthP in these articles), which provides pre-built (and tested) code to help you build multi-tenant apps using ASP.NET Core and EF Core. The other articles in the "Building ASP.NET Core and EF Core multi-tenant apps" series are:

  1. The database: Using a DataKey to only show data for users in their tenant
  2. Administration: different ways to add and control tenants and users
  3. Versioning your app: Creating different versions to maximise your profits
  4. Hierarchical multi-tenant: Handling tenants that have sub-tenants
  5. Advanced techniques around ASP.NET Core Users and their claims
  6. Using sharding to build multi-tenant apps using EF Core and ASP.NET Core (this article)

TL;DR; – Summary of this article

  • Multi-tenant applications provide a service to many tenants. Each tenant has their own set of data that is private to them.
  • Sharding is the name given to multi-tenant applications where each different tenant has its own database. The other approach is to put all the tenants in one database where each tenant has a unique key which makes sure only data marked with the tenant’s key is returned (known as shared-database).
  • The good parts of the sharding approach are that it's faster and the data is more secure than the shared-database approach, but sharding comes with the price of having many databases.
  • There is a third, hybrid approach which supports both shared-database and sharding at the same time – this allows you to manage the cost / performance by putting a group of tenants with small data / usage into one database while tenants with high data / usage can have their own database.
  • I detail 7 steps to create a sharding multi-tenant application using EF Core and ASP.NET Core. The steps are a mixture of ASP.NET Core code and EF Core’s code.
  • Then I detail 8 steps (3 of which are the same as the sharding approach) that implement the hybrid approach. This implementation has been added in version 3 of the AuthP library and there is an example called Example6.SingleLevelSharding which you can run.

Setting the scene – what is sharding and why it is useful?

Wikipedia says of database sharding: "A database shard, or simply a shard, is a horizontal partition of data in a database or search engine. Each shard is held on a separate database server instance, to spread load". I emphasized the last sentence because that's the key part – a multi-tenant / SaaS application will have a database for each separate tenant. The alternative to using sharding is to store all the data in one database where the tenants' data are differentiated by a unique key for each tenant – I will refer to this as shared-database, while the dedicated-database approach I will call sharding, which is shorter.

There are a number of pros / cons to each approach, but the biggest is the cost versus performance issue. A sharding approach should be quicker than the shared-database approach, but sharding's performance comes from having lots of databases, which costs more money. The other pro for the sharding approach is that each tenant's data is more isolated, as each tenant has its own database.

NOTE: This Microsoft document describes some other differences between sharding and shared-database, plus a comparison of three ways to provide a service to many tenants.

There is a third, hybrid approach that allows you to balance the cost / performance. This design uses sharding for tenants that have a lot of data / demand, while tenants with less data / demand use the shared-database approach. The benefit of this approach is you can offer smaller tenants a lower price by putting them in a shared database, while tenants that have higher demands will pay for a dedicated database.

Some years ago, I was asked to design and build a multi-tenant application with thousands of tenants, with tenants ranging from less than a hundred users to a few large tenants with thousands of users. My client wanted a hybrid approach to cover this wide range of tenant types, which is why I have added both sharding and the hybrid approach to my AuthP library.

Here is a diagram to show all three approaches with a summary of their pros and cons.

Finally, I should add that Azure has a useful feature called SQL Server Elastic Pools which can help with the cost / performance by providing an overall level of database performance which is shared across all the databases in the pool. I will talk more about that in the next article.

How I implemented sharding in version 3 of my AuthP library

In one of my tweets about building multi-tenant applications a number of people said they used sharding. AuthP version 2 only supports the shared-database approach, but this feedback made me focus the version 3 release of the library on implementing sharding for multi-tenant applications.

In addition, the AuthP sharding feature is designed to support the hybrid approach as well, which means you can use the shared-database approach and / or dedicated-database (sharding) approach. As I have already explained, this allows you to balance the cost / performance for each tenant if you want to.

I have split the description of the EF Core / ASP.NET Core code into two parts:

  1. Implement a sharding-only multi-tenant application.
  2. Implement a hybrid multi-tenant application.

1. Implement a sharding-only multi-tenant application

The figure below shows what the sharding-only design would look like, with a database containing information about the users and tenants (top left) and a database for each tenant (bottom).

Here are the steps to implement sharding for a multi-tenant application:

  1. Decide on where the database connection strings should be stored
  2. Hold information of the tenant and its users in admin database
  3. When a user logs in, then add a ConnectionName claim to their claims
  4. Provide a service to convert the user’s ConnectionName claim to a database connection
  5. Provide the connection string to the application’s DbContext
  6. Use EF Core’s SetConnectionString method to set the database connection
  7. Migrate the tenant database if not used before

1. Decide on how the database connection strings are managed

You need a list of connection strings, one for each tenant database, and the most obvious place to hold connection strings is in the ASP.NET Core appsettings.json file. Another reason for using the appsettings file is security: connection strings can contain sensitive data such as username and password, but ASP.NET Core / Azure offer lots of ways to hide / replace the connection strings.

The second decision is how we obtain the connection string when a tenant user accesses the application. The best-performance approach is to add a claim with the connection string when a tenant user logs in, but if you are using a JWT Bearer Token that would expose the connection string with its username / password sensitive data.

In the end I used the names of the connection strings from the "ConnectionStrings" section in the appsettings file – see below:

{
  "ConnectionStrings": {
    "DefaultConnection": "Server=… admin database connection string.",
    "Shard1": "Server=… tenant database connection string.",
    "Shard2": "Server=… tenant database connection string.",
    …etc.
  },
//… other parts left out
}

By referring to the database by the connection string name, e.g. "DefaultConnection", "Shard1", the sensitive data is hidden and the admin user has a simple / obvious name to assign to a tenant.

NOTE: In the next article I will describe how you can add extra connection strings to the appsettings file while the application is running.

2. Hold information of the tenant and its users in admin database

The way a multi-tenant application works is that there are tenants, with many users linked to each tenant. Typically, the tenant will have a unique key, often a primary key provided by the admin database, and a name, e.g. "Company XYZ" – and in this case it would also contain the name of the connection string.

ASP.NET Core handles the authentication of a user and provides a unique id, often a string, for each user. You need to add extra data to link a user's id to a tenant – one simple way would be to add a collection of user ids to the tenant using a one-to-many relationship.
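A minimal sketch of what those admin-database classes might look like is below. These are assumed shapes for illustration, not AuthP's actual classes (see the NOTE after the sketch for AuthP's built-in versions).

//Assumed sketch of admin-database classes linking users to a tenant
public class Tenant
{
    public int TenantId { get; set; }           //unique key for the tenant
    public string TenantName { get; set; }      //e.g. "Company XYZ"
    public string ConnectionName { get; set; }  //connection string name, e.g. "Shard1"
    public List<TenantUser> Users { get; set; } //one-to-many link to the tenant's users
}

public class TenantUser
{
    public string UserId { get; set; } //the unique id from the authentication handler
    public int TenantId { get; set; }  //foreign key back to the Tenant
}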

NOTE: The AuthP library has built-in AuthUser and Tenant classes, with admin code to manage these and link to the ASP.NET Core authentication handler. See the AuthP documentation page called "Multi-tenant explained" for how that works.

3. When a user logs in, then add a ConnectionName claim to their claims

When a user logs in you need to detect if they are linked to a tenant, and if so you need to add a claim containing the connection string name. This requires you to intercept the login process and use the user's id to obtain the connection string name held in the tenant admin class.

Intercepting the login process depends on the ASP.NET Core authentication handler you are using – the article called “Advanced techniques around ASP.NET Core Users and their claims” provides information on the main authentication handlers.

NOTE: The AuthP library automatically adds the ConnectionName claim if sharding is turned on.
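If you are wiring this up yourself, here is a hedged sketch using the IClaimsAdder pattern described in the previous article. The ConnectionName property on the user's tenant is an assumption; AuthP's actual code differs.

//A sketch of adding a ConnectionName claim on login via the IClaimsAdder pattern
public class AddConnectionNameClaim : IClaimsAdder
{
    private readonly IAuthUsersAdminService _userAdmin;

    public AddConnectionNameClaim(IAuthUsersAdminService userAdmin)
    {
        _userAdmin = userAdmin;
    }

    public async Task<Claim> AddClaimToUserAsync(string userId)
    {
        var user = (await _userAdmin.FindAuthUserByUserIdAsync(userId)).Result;
        var connectionName = user?.UserTenant?.ConnectionName; //assumed property

        return connectionName == null
            ? null //not a tenant user, so no claim is added
            : new Claim(PermissionConstants.ConnectionNameType, connectionName);
    }
}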

4. Provide a service to convert the user’s ConnectionName claim to a database connection

Having decided to use the connection string name in the claim, you need a way to access the "ConnectionStrings" object in the appsettings file. At the same time, I want to be able to add new connection strings while the application is running. This means I can't use the normal IOptions<T> option, but have to use the IOptionsSnapshot<T> option, which reads the current data in the appsettings file.

I built a service called ShardingConnections which uses IOptionsSnapshot<T> to get the latest "ConnectionStrings". The code below shows the specific parts of this service that get the connection string from a connection name (thanks to ferarias' answer to this stack overflow question). This service should be set up as a scoped service.

public class ConnectionStringsOption : Dictionary<string, string> { }
public class ShardingConnections : IShardingConnections
{
    private readonly ConnectionStringsOption _connectionDict;

    public ShardingConnections(
        IOptionsSnapshot<ConnectionStringsOption> optionsAccessor)
    {
        _connectionDict = optionsAccessor.Value;
    }

    public string GetNamedConnectionString(string connectionName)
    {
        return _connectionDict.ContainsKey(connectionName) 
        ? _connectionDict[connectionName]
        : null;
    }
    //Other methods not shown
}

NOTE: The full service contains extra methods useful for the admin when assigning a connection name to a tenant.
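For this to work, the ConnectionStringsOption class must be bound to the "ConnectionStrings" section and the service registered as scoped. A minimal sketch of that registration, assuming the usual Program / Startup conventions, is:

//Bind the options class to the ConnectionStrings section so that
//IOptionsSnapshot<ConnectionStringsOption> rereads the appsettings data
services.Configure<ConnectionStringsOption>(
    _configuration.GetSection("ConnectionStrings"));
//register the sharding service as scoped, to match IOptionsSnapshot<T>
services.AddScoped<IShardingConnections, ShardingConnections>();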

5. Provide the connection string to the application’s DbContext

You need to inject a service into the tenant application’s DbContext which contains the connection string for the current user. To do that we need two parts:

  • Get the ConnectionName claim from the current user
  • Use the ShardingConnections service to get the connection string

The code below shows a scoped service that uses the IHttpContextAccessor to get the logged-in user (if present) with their claims. From this it can obtain the ConnectionName claim and pass the connection string name to ShardingConnections' GetNamedConnectionString method, which returns the connection string that the DbContext needs.

public class GetShardingData : IGetShardingDataFromUser
{
    public GetShardingData(IHttpContextAccessor accessor,
        IShardingConnections connectionService)
    {
        var connectionStringName = accessor.HttpContext?
             .User?.Claims.SingleOrDefault(x =>
                   x.Type == PermissionConstants.ConnectionNameType)?.Value;
        if (connectionStringName != null)
            ConnectionString = connectionService
                .GetNamedConnectionString(connectionStringName);
    }

    public string ConnectionString { get; }
}

6. Use EF Core’s SetConnectionString method to set the database connection

The tenant application’s DbContext needs to link to the database that has been provided by the GetShardingData service in the last section. EF Core’s SetConnectionString method (added in EF Core 5) sets the connection string to be used for this instance of the DbContext. The code below shows the constructor on the DbContext that handles the tenant’s data.

public class ShardingSingleDbContext : DbContext
{
    public ShardingSingleDbContext(
        DbContextOptions<ShardingSingleDbContext> options,
        IGetShardingDataFromUser shardingData)
        : base(options)
    {
        Database.SetConnectionString
           (shardingData.ConnectionString);
    }
    //… other parts left out
}

NOTE: The EF Core team says using SetConnectionString doesn’t have much of an overhead so sharding shouldn’t be slowed down by changing databases. You may also be interested how you would use DbContext pooling when building multi-tenant applications.
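For completeness, here is a hedged sketch of how these parts might be registered at startup – the placeholder connection string is an assumption, as the constructor above overwrites it with the tenant’s connection string on every request.

builder.Services.AddHttpContextAccessor();
builder.Services.AddScoped<IGetShardingDataFromUser, GetShardingData>();
builder.Services.AddDbContext<ShardingSingleDbContext>(options =>
    //any valid connection string works here, because the DbContext’s
    //constructor replaces it via SetConnectionString
    options.UseSqlServer(@"Server=(localdb)\mssqllocaldb;Database=Placeholder"));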

7. Migrate a database if not used before

At some point you need to migrate a database that hasn’t been used before. In the AuthP library the creation of a new tenant causes a call to a method in a service, written by the developer, that follows the ITenantChangeService interface. When working with sharding I added the following code to check the database exists and migrate it if it has no tables in it. It returns an error string if the database isn’t found, or null if it finished successfully.

NOTE: This stack overflow question has lots of useful ways to detect if a database exists and so on.

private static async Task<string> CheckDatabaseExistsAndMigrateIfNew(
     ShardingSingleDbContext context, Tenant tenant,
     bool migrateEvenIfNoDb)
{
    if (!await context.Database.CanConnectAsync())
    {
        //The database doesn't exist
        if (migrateEvenIfNoDb)
            await context.Database.MigrateAsync();
        else
        {
            return $"The database defined by the connection string"+ 
                "'{tenant.ConnectionName}' doesn't exist.";
        }
    }
    else if (!await context.Database
        .GetService<IRelationalDatabaseCreator>()
        .HasTablesAsync())
        //The database exists but needs migrating
        await context.Database.MigrateAsync();

    return null;
}

The migrateEvenIfNoDb parameter is there because EF Core’s Migrate can create a database if you have the authority, e.g. when in development mode and using a local SQL Server. But if you don’t have the authority, e.g. when in production and using Azure SQL Server, then the code must return an error if there isn’t a database.
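As a hedged example of calling this method from your ITenantChangeService implementation (the _env environment check is an assumption that matches the point above):

//inside the create-tenant code of your ITenantChangeService implementation
//(_env is an injected IWebHostEnvironment)
var error = await CheckDatabaseExistsAndMigrateIfNew(context, tenant,
    migrateEvenIfNoDb: _env.IsDevelopment()); //only let Migrate create a database in development
if (error != null)
    return error; //pass the "database doesn't exist" message back to the admin code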

That’s the end of the code needed to implement the sharding-only approach to multi-tenant applications. The next section shows the extra steps to support the hybrid approach that allows you to use shared-database approach and / or dedicated-database (sharding) approach at the same time.

2. Implement a hybrid multi-tenant application.

The figure below shows the hybrid design for multi-tenants, where each database can either have many tenants in one database (see left and right databases) or only one tenant in a database (see the middle two databases).

NOTE: You don’t have to use both the shared-database approach and the dedicated-database (sharding) approach. You can just use sharding if you want.

There is a runnable example of a hybrid multi-tenant in the AuthP repo. Simply clone the repo and set the Example6.MvcWebApp.Sharding project as the startup project. The example assumes you have a localdb SQL Server and seeds the DefaultConnection database with three, non-sharding tenants. The home page shows you how you can move one of the tenants to another database and make it a sharding tenant.

Here is a screenshot just after a non-sharding tenant called “Pets Ltd.” has been moved into its own database and the tenant is now using sharding.

Here are the steps for the hybrid approach, with changes from the sharding approach shown in bold.

  1. Decide on where the database connection strings should be stored
  2. Hold extra information of the tenant and its users in admin database
  3. When a user logs in, then add a ConnectionName and DataKey claim to their claims
  4. Provide a service to convert the user’s ConnectionName claim to a database connection
  5. Provide the connection string and DataKey to the application’s DbContext
  6. Use EF Core’s SetConnectionString method to set the database connection and DataKey
    1. “Turn off” the query filter
    2. Stop setting the DataKey on entites
  7. Migrate the tenant database if not used before
  8. Extra features available in a hybrid design.

1. Decide on where the database connection strings should be stored

Same as sharding-only approach – see this section.

2. Hold extra information of the tenant and its users in admin database

The hybrid approach needs an additional way to handle the shared-database tenants, as they need some form of filter when accessing one tenant out of all the tenants in that database. In the AuthP library the Tenant creates a unique string for each tenant called the DataKey. This DataKey is injected into the tenant application’s DbContext and used in EF Core’s global query filter to only return the data linked to the Tenant. This is explained in the article called “Building ASP.NET Core and EF Core multi-tenant apps – Part1: the database”.

In addition, the AuthP Tenant contains a HasOwnDb boolean property, which is true if the tenant is using sharding. This HasOwnDb property is used in a few ways, for instance to remove the query filter on sharding tenants, and to return an error if someone tries to add another tenant into a database that has already got a sharding tenant in it – more on that later.

3. When a user logs in, then add a ConnectionName and DataKey claim to their claims

The hybrid approach needs both the ConnectionName claim and DataKey claim to handle the two types of database arrangement: the ConnectionName is used by every tenant to get the correct database and the DataKey is needed for the shared-database tenants.

However, you have one tenant DbContext to handle both shared-database tenants and sharding tenants, and you can’t change the query filter per tenant. This means you would be running a query filter on a sharding tenant which doesn’t need it – see section 6.1 for how I “turn off” the query filter on sharding tenants.

4. Provide a service to convert the user’s ConnectionName claim to a database connection

Same as sharding-only approach – see this section.

5. Provide the connection string and DataKey to the application’s DbContext

For the hybrid approach both the connection string and a DataKey property are added to the scoped service that uses the IHttpContextAccessor to get the logged-in user (if present) with its claims. The updated service (see GetShardingDataUserNormal class) provides both the ConnectionString and the DataKey to the tenant application’s DbContext.

Obtaining the DataKey is much easier than the connection string because the DataKey was already calculated when the claim was added, so it is just about copying the DataKey claim’s Value into a DataKey property in the service – see the sketch below.
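Here is a hedged sketch of that copy inside the updated service’s constructor – the DataKeyType constant name is an assumption for illustration:

//inside the GetShardingDataUserNormal constructor, after the connection string code
DataKey = accessor.HttpContext?
    .User?.Claims.SingleOrDefault(x =>
          x.Type == PermissionConstants.DataKeyType)?.Value;

//...plus a matching property alongside ConnectionString
public string DataKey { get; }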

6. Use EF Core’s SetConnectionString method to set the database connection and DataKey

In a hybrid approach a tenant can be using the shared-database approach or the sharding approach. Therefore, you have to add extra code to every tenant (including sharding) to handle the shared-database DataKey. This extra code includes adding a DataKey property / column to the tenant data classes, and the tenant DbContext must have a global query filter configured on all of the tenant data classes – this is covered in detail in the Part1 article in sections 6 and 7.

However, we don’t want a sharding tenant to take a performance hit because of the (unnecessary) global query filter, so how do we handle that? The solution is to use the Tenant’s HasOwnDb property to alter the DataKey.

In the AuthP library if the Tenant’s HasOwnDb property is true (and the tenant type is single-level), then the GetTenantDataKey method doesn’t return the normal DataKey, but returns the “NoQueryFilter” string. This allows two things to happen:

6.1. The query filter is “turned off”

AuthP contains an extension method called SetupSingleTenantShardingQueryFilter which adds a global query filter with a query that can be forced to true if the special DataKey string of “NoQueryFilter” is used – the code below shows a manual setup of what the extension method does (NOTE that the recommended automatic approach uses EF Core’s metadata methods).

modelBuilder.Entity<Invoice>().HasQueryFilter(
    x => DataKey == "NoQueryFilter" ||
    x.DataKey == DataKey);
modelBuilder.Entity<Invoice>().HasIndex(x => x.DataKey);
modelBuilder.Entity<Invoice>().Property(x => x.DataKey).IsUnicode(false);
modelBuilder.Entity<Invoice>().Property(x =>
    x.DataKey).HasMaxLength("NoQueryFilter".Length);

The important line is line 2, where the DataKey from the claim is compared with the “NoQueryFilter” string. If that part of the query filter is true, then there is no need to filter on a DataKey. The SQL Server’s execution planner will see that the WHERE clause is always true and will remove the WHERE clause from the execution.

NOTE: The AuthP library also supports a hierarchical multi-tenant type (see this article about a hierarchical multi-tenant) and in that case you still need a DataKey to access the various levels in the hierarchical data. Therefore, AuthP won’t turn off the query filter for a hierarchical multi-tenant even if you give it its own database.

6.2. It stops the setting of the DataKey

The other part of using a DataKey is to set the DataKey in any newly created entity. In a hybrid design, if the DataKey is “NoQueryFilter” then the method returns immediately, thus removing the compute time to detect and update the entities that need a DataKey. See the code below for the updated MarkWithDataKeyIfNeeded method.

public static void MarkWithDataKeyIfNeeded(this DbContext context, string accessKey)
{
    if (accessKey == MultiTenantExtensions.DataKeyNoQueryFilter)
        //Not using query filter so don't take the time to update the entities
        return;

    foreach (var entityEntry in context.ChangeTracker.Entries()
                 .Where(e => e.State == EntityState.Added))
    {
        var hasDataKey = entityEntry.Entity as IDataKeyFilterReadWrite;
        if (hasDataKey != null && hasDataKey.DataKey == null)
            hasDataKey.DataKey = accessKey;
    }
}

This change doesn’t save much compute time, but I still think it’s worth doing.
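For context, here is a hedged sketch of how this extension method is typically wired up – calling it from the tenant DbContext’s SaveChanges override, using the DataKey that was injected into the DbContext:

//inside the tenant application’s DbContext
public override int SaveChanges(bool acceptAllChangesOnSuccess)
{
    this.MarkWithDataKeyIfNeeded(DataKey); //sets the DataKey on any added entities
    return base.SaveChanges(acceptAllChangesOnSuccess);
}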

7. Migrate the tenant database if not used before

Same as sharding-only approach – see this section.

8. Extra features available in a hybrid design.

To make this work in a real application you need some extra code, for instance to add a new connection string to the appsettings file while the application is running. There are also maintenance issues, such as converting a tenant from a shared-database to a dedicated-database tenant (or the other way around) while the application is running.

I will cover these issues and using Azure SQL elastic pools for sharding in a future article.

Conclusion

Multi-tenant applications allow you to serve many tenants from one software source. Having different levels of your service is also a good idea, as it allows your tenants to choose what level of service they want to pay for. And that payment needs to cover your costs for all the cloud service and databases you need to provide your multi-tenant application.

Within the features of your multi-tenant application is its performance, that is the speed (how quickly a query runs) and scalability (Wikipedia defines scalability as the property of a system to handle a growing amount of work by adding resources to the system). The more tenants you have, the more work it takes to provide good performance.

Providing good performance requires lots of different parts: running multiple instances of the ASP.NET Core application, upgrading your web server / database server, caching and so on. Sharding is a good approach for handling tenants’ data, but like the other performance improving options it increases the costs.

The client I talked about wanted something working as soon as possible (no surprise there!) so I built a shared-database tenant approach first. But they knew their big tenants would want their own database, not just for the performance but for the security. The hybrid approach handles that, and that’s why the AuthP library supports it.

So, if you are building a multi-tenant application for a company you might consider using my open source AuthP library to help you. The library contains all the setup and admin code, which is a lot to look through, but that’s because building a multi-tenant application isn’t simple to do. There is a lot of documentation, articles and a few videos, and if something isn’t clear, then raise an issue and I will try to update the documentation.

Using custom databases with the AuthP library – Part1: normal apps


The AuthPermissions.AspNetCore library, referred to as the AuthP library, provides enhanced Roles authentication and multi-tenant services to a .NET application. The AuthP library needs to store information in a database, and the previous version (4.2.0) supported SQL Server and PostgreSQL, but with the release of AuthP version 5.0.0 you can use any of the main database providers that EF Core supports. This feature is called “custom databases” and allows you to use database providers other than the built-in SqlServer or PostgreSQL database providers.

This article explains the steps you need to use a different database provider with a normal (i.e. not sharding / hybrid) multi-tenant application. A second article called “Use custom databases with the AuthP library – Part2: sharding / hybrid apps” will cover using the “custom databases” feature with sharding / hybrid multi-tenant applications.

This article is part of the series that covers AuthP’s multi-tenant applications in general. The other articles in “Building ASP.NET Core and EF Core multi-tenant apps” series are:

  1. The database: Using a DataKey to only show data for users in their tenant
  2. Administration: different ways to add and control tenants and users
  3. Versioning your app: Creating different versions to maximise your profits
  4. Hierarchical multi-tenant: Handling tenants that have sub-tenants
  5. Advanced techniques around ASP.NET Core Users and their claims
  6. Using sharding to build multi-tenant apps using EF Core and ASP.NET Core
  7. Three ways to securely add new users to an application using the AuthP library
  8. How to take an ASP.NET Core web site “Down for maintenance”
  9. Three ways to refresh the claims of a logged-in user
  10. Use custom databases with the AuthP library – Part1: normal apps (this article)
  11. Use custom databases with the AuthP library – Part2: sharding / hybrid apps (coming soon)

TL;DR; – Summary of this article

  • The AuthP library is designed to make building .NET multi-tenant applications easier by providing the backend design and admin features to get your application built more quickly.
  • The new AuthP version 5.0.0 contains the “custom databases” feature (plus other features – see next section), where you can now use any of the main EF Core database providers with AuthP, focusing on normal (i.e. not sharding / hybrid) multi-tenant applications.
  • To use the “custom databases” feature with a normal multi-tenant application you need to go through three stages:
    • Create an EF Core migration for your selected database provider
    • Create an extension method to register your custom database
    • Various minor changes to your tenant data to work with your custom database
  • There is a working example of a normal multi-tenant application using Sqlite as the custom database. You can find this in the AuthP.CustomDatabaseExamples repo – look at projects with a name starting with “CustomDatabase1.”
  • This article compares a multi-tenant application using a built-in database and the same multi-tenant application using Sqlite as the custom database.

Summary of the new features in AuthP version 5.0.0

Before getting to the details of the new “custom databases” feature in AuthP version 5.0.0, here is an overall list of all the new features in this release:

  • BREAKING CHANGE in AuthP’s sharding / hybrid multi-tenant feature: If you are using AuthP’s sharding /hybrid features, then look at the UpdateToVersion5.md file for what you need to do.
  • This new release makes it easier to use, and change, the sharding / hybrid multi-tenant feature. The two items below cover the ease of use / change:
    • Easier to use: there is a new extension method called SetupMultiTenantSharding that sets up the sharding / hybrid multi-tenant feature. This makes it easier to set up this feature.
    • Easier to change: You can change the internal services / setting of the sharding / hybrid feature, e.g. one developer wants to store sharding data in a database, instead of in a json file, which this release allows. This is done by creating a new extension method containing the code in the SetupMultiTenantSharding extension method with your changes.
  • New feature: You can now use a range of database providers (see list later) with the AuthP library. The table below shows the database providers AuthP 5.0.0 supports:
Supported database providers in V5.0.0 | Comments
Microsoft.EntityFrameworkCore.SqlServer | Built-in – already in AuthP library
Npgsql.EntityFrameworkCore.PostgreSQL | Built-in – already in AuthP library
Microsoft.EntityFrameworkCore.Sqlite |
Microsoft.EntityFrameworkCore.Cosmos |
Pomelo.EntityFrameworkCore.MySql | Pomelo Foundation Project
MySql.EntityFrameworkCore | MySQL project (Oracle)
Oracle.EntityFrameworkCore | Oracle

NOTE: The AuthP library uses Giorgi Dalakishvili’s EntityFramework.Exceptions library to detect concurrency and unique errors, and Giorgi only supports the main EF Core database providers, i.e. SqlServer, PostgreSQL, SQLite, MySQL, MySQL.Pomelo, and Oracle.

The rest of this article covers using a “custom database” with a single tenant database, and the Part2 article will show how to use “custom databases” with tenants in many databases (i.e. the sharding / hybrid approach).

Setting the scene – how AuthP’s multi-tenant features uses databases  

The AuthP’s multi-tenant feature provides nearly all the backend services / admin needed to build a multi-tenant application. I started building the AuthP library using one / two databases, which works for small / medium sized multi-tenant apps. Then in version 3.0.0 I added a “one db per tenant” approach (known as sharding) plus a hybrid design, which gives you more flexibility to handle both small and large tenants in the same multi-tenant app. The diagram below shows the four ways AuthP can handle databases to provide the right cost / performance for your multi-tenant project.

NOTE: If you aren’t creating a multi-tenant project, then you follow the first, “All in one Db” approach with AuthP data and any project data in one database.

Up to this point AuthP only supported the SQL Server and Postgres database types, but a few developers asked if I could support other database providers. So, when I found time, I created a new version of AuthP with the “custom database” feature, which allows any database provider that EF Core supports.

Introducing the AuthP.CustomDatabaseExamples repo

To help you, and to make sure the custom database feature works, I created a repo called AuthP.CustomDatabaseExamples which contains two examples: this article covers the example of a multi-tenant application that keeps all tenant data in one database (see 1 and 2 in the Four Ways diagram).

To use a custom database you must change various code and migrations, which this article will explain. I chose Sqlite for my custom database type because the Individual User Accounts authentication supports Sqlite.

NOTE: Using Sqlite for the custom database examples requires extra code over other database providers. That’s because the Sqlite’s connection string doesn’t have a Server / host part but has a filepath in the data source part. Other databases, such as MySql, should be slightly easier as the connection string fully defines the server and database name.

The example works with AuthP’s “all tenant data in one database” approach and has three projects, all starting with CustomDatabase1.

Steps in building a normal AuthP application with a custom database

We start with an application using AuthP which uses one / two databases, i.e. 1 and 2 in the previous “Four Ways” diagram. The steps to use a custom database are listed below and detailed later.

  1. Create a migration of the AuthPermissionsDbContext for your custom database.
  2. Create an extension method to register your custom database.
  3. Other, non-AuthP things you need to think about

These stages are explained below.

1. Create a migration of the AuthPermissionsDbContext for your custom database

To create or update the AuthPermissionsDbContext you need to create an EF Core migration, and each database provider needs its own migration based on your chosen custom database type.

If you are using AuthP’s built-in database providers, then the AuthP NuGet package contains migrations for SQL Server and Postgres. But when using the custom database feature, you need to create the migration for your custom database type. There are many ways to create an EF migration, but personally I use an IDesignTimeDbContextFactory<TContext>. The code below comes from the CustomDatabase1’s AuthPermissionsDbContextSqlite class.

public class AuthPermissionsDbContextSqlite :
     IDesignTimeDbContextFactory<AuthPermissionsDbContext>
{
    // The connection string must be valid, but the  
    // connection string isn’t used when adding a migration.
    private const string connectionString = 
        "Data source=PrimaryDatabase.sqlite";

    public AuthPermissionsDbContext CreateDbContext(string[] args)
    {
        var optionsBuilder =
            new DbContextOptionsBuilder<AuthPermissionsDbContext>();
        optionsBuilder.UseSqlite(connectionString);

        return new AuthPermissionsDbContext(optionsBuilder.Options, 
            null, new SqliteDbConfig());
    }
}

The following lines are the ones you need to change for your custom database provider:

  • Lines 6 and 7: You must have a connection string in the correct format for your custom database type, but if you are creating a migration then the database won’t be accessed.
  • Line 13: you need to use the correct “Use???” method for your custom database provider.
  • Line 16: The third parameter of the AuthPermissionsDbContext class allows you to add a database-specific set of EF Core model commands to be run at the start of the AuthPermissionsDbContext’s OnModelCreating method. The main use is to set up concurrency tokens to capture concurrent changes to the same table. See the SqliteDbConfig class for the Sqlite concurrent change commands.

NOTE: Setting up the Sqlite concurrent change in the EF Core migration is a bit harder than for other database types. That’s because you need to add extra trigger code – see this article on what Sqlite needs and see the …Version5.cs file in the SqliteCustomParts Migrations folder.

2. Extension method to register your custom database

You need to create an extension method to register your custom database. You do this by copying one of the existing extension methods already in the AuthP code, such as UsingEfCoreSqlServer, and altering six parts:

  1. You set the AuthPDatabaseType to the enum AuthPDatabaseTypes.CustomDatabase
  2. Change the AuthPermissionsDbContext to your custom database provider.
  3. Link to your assembly containing the AuthPermissionsDbContext migration.
  4. Update the EntityFramework.Exceptions to your custom database provider.
  5. Add new code to register custom database configuration code.
  6. Optional: Update the RunMethodsSequentially code to provide a global lock

The diagram below shows how you would take the UsingEfCoreSqlServer extension method and alter it to become your custom database extension method (NOTE click to get a bigger version of the diagram).

The AuthP.CustomDatabaseExamples repo has a UsingEfCoreSqlite extension method in the SqliteSetupExtensions class which sets up a Sqlite database as the custom database. This has gone through the six steps shown in the diagram above.

3. Other, non-AuthP things you need to think about

There are a few other things that use your custom database outside the AuthP library. The main one is the tenant part, which provides some sort of application that users will use. In the AuthP library there is one that mimics an application where you can enter invoices. Another example is managing shops’ sales/stock. If you want to use your custom database in your tenant data, then you need to set that up too.

The first two options in the “Four Ways” diagram show that you have two ways to handle the tenant data outside of sharding.

  1. “All in one Db”: your tenant data is within the same database, so it must use the same custom database.
  2. “Separate AuthP / tenant data”: In this case your tenant data doesn’t have to use the same custom database that AuthP uses. 

Comparing two AuthP multi-tenant examples to see the changes

I created an example multi-tenant application using Sqlite as the custom database, by copying an existing multi-tenant application that used the built-in SqlServer – see Example3 in the AuthP repo. This allows me to compare the two versions to show what has changed.

I created a new repo called AuthP.CustomDatabaseExamples and copied / updated an example multi-tenant application, using three projects whose names all start with “CustomDatabase1.”. The three projects are:

Projects | What they contain
CustomDatabase1.InvoiceCode | Code for the tenant data / features
CustomDatabase1.SqliteCustomParts | Contains Sqlite code / migration
CustomDatabase1.WebApp | ASP.NET Core providing a runnable web app

3a. Example3.InvoiceCode -> CustomDatabase1.InvoiceCode

The main changes are to use Sqlite for the invoice code. Here are the changes:

  • …AppStart.RegisterInvoiceServices – changed to use Sqlite
  • InvoicesDesignTimeContextFactory (IDesignTimeDbContextFactory) to create a migration for the Invoice DbContext to Sqlite (the AuthP Example3 Invoice DbContext used the built-in SqlServer). See my comments at the end of this class, which provide one way to create the migration.
  • You need to create a Sqlite Migration for the Invoice DbContext using the InvoicesDesignTimeContextFactory detail above.

3b. New CustomDatabase1.SqliteCustomParts project

This project contains the Sqlite extras that you need to use Sqlite with AuthP.

3.c. Example3.MvcWebApp.IndividualAccounts -> CustomDatabase1.WebApp

The big change in the ASP.NET Core project is in the Program class. In the Program class I have added #regions to show what has changed.

The other change is in the appsettings.json file where you need to provide the Sqlite connection string, which is quite different from other database connection strings.

Conclusion

I haven’t had many requests for the “custom database” feature for AuthP, but like the support for multiple languages in AuthP, many will find it useful once it’s there.

Creating the “custom database” feature for normal (non-sharding) applications was fairly quick, but doing the same for the sharding / hybrid applications turned out to be quite a bit more complex. I decided to release AuthP version 5.0.0 without the sharding / hybrid “custom database” feature because this release contains other improvements that people have asked for.

Watch this space for Part2 of the “custom databases” article to see how to use the “custom database” feature when building sharding / hybrid multi-tenant applications. Happy coding.

The (long) journey to a better sharding multi-tenant application


This article focuses on the sharding approach of building a .NET multi-tenant application using the AuthPermissions.AspNetCore library (shortened to AuthP). After working exclusively on a sharding-only multi-tenant application I found various issues that make building and using such an application difficult. This article describes the issues I found and the changes I made in AuthP’s version 6 release to improve sharding multi-tenant applications.

This article is part of the series that covers AuthP’s multi-tenant applications in general. The other articles in “Building ASP.NET Core and EF Core multi-tenant apps” series are:

  1. The database: Using a DataKey to only show data for users in their tenant
  2. Administration: different ways to add and control tenants and users
  3. Versioning your app: Creating different versions to maximise your profits
  4. Hierarchical multi-tenant: Handling tenants that have sub-tenants
  5. Advanced techniques around ASP.NET Core Users and their claims
  6. Using sharding to build multi-tenant apps using EF Core and ASP.NET Core
  7. Three ways to securely add new users to an application using the AuthP library
  8. How to take an ASP.NET Core web site “Down for maintenance”
  9. Three ways to refresh the claims of a logged-in user
  10. Use custom databases with the AuthP library – Part1: normal apps
  11. Making .NET sharding multi-tenant apps easier to use with AuthP 5.1 (this article)
  12. Use custom databases with the AuthP library – Part2: sharding / hybrid apps (coming soon)

TL;DR; – Summary of this article

  • This article is about a sharding multi-tenant application. AuthP has two types of sharding:
    • Hybrid, which supports both tenants that share a database with other tenants and tenants that have their own database.
    • Sharding-only, where every tenant has their own database.
  • Part 1 covers two types of multi-tenant applications: one where admin is done by the app-maker and one where specific user(s) in a tenant handle the admin of their tenant.
  • Part 2 describes the changes to AuthP’s sharding-only multi-tenant support, which are:
    • Allow certain features to work, e.g. the “Sign up now” feature
    • To improve the administration of a sharding-only multi-tenant application.
    • Add a service to make creating / removing sharding-only tenants easier in code and human time.
  • Part 3 covers how you can use the new features in AuthP version 6. They are
    • Where is each tenant’s data stored when using AuthP’s sharding?
    • A service that makes it simpler to create / delete a sharding-only tenant.
    • How the AuthP’s 6 “Sign up now” feature handles sharding-only tenants.
  • At the end this article provides links to relevant documentation, the file containing steps to update a sharding multi-tenant app built with an older version, and a new example app.

Part 1: Two types of multi-tenant applications

Recently I have been working on the multi-tenant features in the AuthP library, and I started to see that there are different types of multi-tenant applications. This insight came while building a new sharding-only app and it helped me to see areas where I needed to improve AuthP’s sharding features. The two main types of multi-tenant are based on who administers the users and tenants – see the description of the two types below.

1. Multi-tenant app controlled by the app-maker

This type of multi-tenant app is controlled by the organisation who owns the app. Normally there is some form of admin team (which I refer to as the app admin) in the organisation who manages the users and their data. Some examples of this type of multi-tenant app are:

  • A bank app allowing its users to manage their bank accounts.
  • An in-house app that controls the inventory in the organisation.

The pro of this style is that the organisation has good control over who can access the app, but the downside is you need an admin team to manage the users, tenants etc., e.g. add / remove users, change what they can do, etc.

2. Multi-tenant app controlled by tenant user

This type of multi-tenant app moves much of the admin features out to a tenant user. AuthP has the concept of a tenant admin, allowing a user to create their own tenant for their organisation (normally with a payment) and then manage the users in their tenant, such as adding new users, controlling what they can do, etc. Some examples of this type of multi-tenant app are:

  • GitHub, which allows users to create their own space and manage it.
  • Facebook / Meta, where you post comments, share photos and have friends.

The pros of this style are that the tenant user can quickly make changes and it reduces the size of the app admin team you need to support your multi-tenant application. The downside is the multi-tenant administration code is more complex, but AuthP has built-in solutions to most of these admin features.

NOTE: I was asked to design a large multi-tenant application for a client, and I personally know that there is a LOT of admin code. Also, the app admin team were very busy when a large organisation joined their multi-tenant application. This is why I put a lot of work into the tenant admin features to reduce the amount of work the app admin team must deal with.

The AuthP library supports both multi-tenant types, but you will read how I needed to improve the AuthP’s features for sharding-only databases.

Part 2: The problems I found and what I did to improve sharding features

The AuthP’s implementation of sharding uses a “sharding entry” approach that links a tenant to a database. The diagram below shows the implementation for the two types: “controlled by the app-maker” and “controlled by tenant users”.

What I found was that both types could be improved, but the “controlled by tenant users” type had one feature that didn’t work as it should, and the admin code wasn’t that easy to use, especially around creating / deleting sharding-only tenants. The list below details the various problems and what I did about them:

  1. The “Sign up for a new tenant” feature doesn’t handle sharding tenants
  2. Solved a limitation in ASP.NET Core’s IOptions feature (breaking changes)
  3. Add a service to make creating / removing sharding-only tenants easier.

2.1. The “Sign up for a new tenant” feature doesn’t handle sharding-only tenants

The AuthP’s “Sign up for a new tenant” feature allows a user to create a new tenant, which can significantly reduce the amount of work for your admin people. However, the version 5.0.0 “Sign up” feature doesn’t contain code to set up a new sharding entry for a sharding-only tenant.

But it was then that I hit a limitation of ASP.NET Core’s IOptions features – see the next section for what the problem is and how I got around it.

2.2. Solved a limitation in ASP.NET Core’s IOptions feature (breaking changes)

The current sharding entries are held in a json file and accessed via ASP.NET Core’s IOptions feature (similar to how ASP.NET Core uses an appsettings.json file). What I found was that when I added a new sharding entry to the json file, IOptionsMonitor wouldn’t see the changes until a new HTTP request started – I assume it looks for updates at the end of every HTTP request. This caused me a problem, as there are cases where a new sharding entry is added and another service needs to retrieve that sharding entry within the same HTTP request.

There are a few ways to get around this, but in the end I changed to using my Net.DistributedFileStoreCache library (shortened to FileStoreCache), which uses the FileSystemWatcher so updates are seen immediately. The upside of using the FileStoreCache is that it has a similarly very fast (~25 ns) read-time to the IOptions approach. The downside is this change creates a breaking change for applications that were created with older versions of AuthP.

NOTE: I have created a detailed list of the changes in the UpdateToVersion6 file and a console application to help with updating your app’s sharding entries to the AuthP 6 format.

2.3. Add a service to make creating / removing sharding-only tenants

When you are adding a new sharding-only tenant there are four steps:

  1. Add a sharding entry to define the new database that the tenant will use.
  2. Create the tenant using AuthP’s AuthTenantAdminService which will…
  3. Run your ITenantChangeService code to set up the database.

…And if you are using a cloud database service, then you need to create a new database.

NOTE: If you need to create / delete a cloud database, then you can do that within your implementation of the ITenantChangeService – create your cloud database in the create tenant method and delete your cloud database in the delete tenant method.

You could do this yourself with AuthP version 5 in two manual steps, but in AuthP version 6 there is a new service called IShardingOnlyTenantAddRemove which runs the three steps for creating and removing a tenant. This service makes it easier, and clearer, for an admin person to add or remove sharding-only tenants.

Part 3 – How you can use these new features

So, having told you what has changed in AuthP version 6, let’s see how you would use these updated features. I will cover the following topics, and I will point out the things that have changed:

  1. Where is each tenant’s data stored when using AuthP’s sharding?
  2. A service that makes it simpler to create / delete a sharding-only tenant
  3. How the AuthP’s 6 “Sign up now” feature handles sharding-only tenants

3.1 Where is each tenant’s data stored when using AuthP’s sharding?

A multi-tenant application provides a service that an organization / person finds useful, but instead of deploying a single web app / database for each tenant there is one web app, and each tenant’s data is segregated. AuthP provides three ways to segregate the tenant’s data, as shown in the table below.

Types | Description | How to filter only the tenant’s data?
1. One database | All the tenants’ data in one database | Filter data by unique key
2. Hybrid | Some tenants have their own database, and some tenants share a database | Select the correct database + filter data by unique key
3. Sharding-Only | Each tenant has their own database for data | Select the correct database

AuthP has supported all these ways to separate each tenant’s data since version 3, but when I ran a twitter poll on what type of approach developers wanted, the “Tenants have their own db” option (4 in the diagram above) won by a large margin. That is what this article (and the new release of the AuthP library) is all about.

AuthP version 3 defined a way that a tenant can be linked to a database. That database may have data from many tenants in it (shared) or only one tenant’s data (sharding-only). In this article we are mainly looking at sharding, and in particular at the “Select the correct database” part, which applies to both the hybrid and sharding-only approaches. The diagram shows the four stages that set up the tenant’s data DbContext to the specific database for the tenant that the logged-in user is linked to.

AuthP version 6 still has these four steps, but the service that implements stage 2 had to change due to the limitation in the ASP.NET Core IOptionsMonitor feature. This created breaking changes in the version 6 release, and I created the UpdateToVersion6.md file which shows how to update your existing sharding multi-tenant app built using a previous release of the AuthP library.

NOTE: When fixing the ASP.NET Core IOptionsMonitor limitation I also found the existing code rather hard to understand, partly because the code was complex and partly because the names of some of the classes and methods didn’t really say what they were doing. Therefore I wanted to make the code easier to understand, but this created more breaking changes. Thankfully these changes are obvious, i.e. there will be a compile error as the names have changed.

3.2. A service that makes it simpler to create / delete a sharding-only tenant

Section 2.3 talks about a new service called IShardingOnlyTenantAddRemove that combines the creation of the tenant and its sharding entry at the same time. This service means that the admin user only has one page to fill in and the code sets up the rest. The service does the following:

  • Gathers the information to create the tenant and sharding entry needed:
    • Tenant name – from the admin user
    • The connection string holding the database server (code or human – see diagram below)
    • The database provider used to create the tenant database (e.g. SqlServer) – set by your code
  • Then, after various checks, it creates the sharding entry and Tenant in one go.

The diagram below shows the two web pages that the app admin user (think “controlled by app-maker” type) uses to create a new tenant in one go. The two pages show the two types of app: the left is where the application has a single database server, while the right is for applications needing multiple database servers.

NOTE: The Example7’s TenantController in the AuthP repo has an example of the multiple database servers option.

NOTE: AuthP’s sharding multi-tenant feature uses connection strings to define the database server, authentication, and other options, but AuthP will add the name of the database. That’s why the multiple database servers option above uses the name “Servers” for the connection strings’ names.

The service has a companion class called ShardingOnlyTenantAddDto which makes setting up the data very simple. In fact the code below will work for both create-tenant options, as it detects if you only have one valid connection string and sets the ConnectionStringName automatically.

public IActionResult Create([FromServices] 
    IGetSetShardingEntries shardingService)
{
    var dto = new ShardingOnlyTenantAddDto
    {
        DbProviderShortName = AuthPDatabaseTypes.SqlServer.ToString()
    };
    dto.SetConnectionStringNames(
        shardingService.GetConnectionStringNames());
    return View(dto);
}

NOTE: The name of the database and the sharding entry are created by the ShardingOnlyTenantAddDto’s FormDatabaseInformation method, which uses a timestamp as the database name, e.g. 20230808101149-892. If you want to use the tenant name instead, then the FormDatabaseInformation method can be overridden by you.

3.3 How the AuthP’s 6 “Sign up now” feature handles sharding-only tenants

Earlier I said that the IOptionsMonitor limitation made it impossible to use the “Sign up now” feature to create a new sharding-only tenant (think “controlled by tenant user” type). Once the IOptionsMonitor limitation was overcome it wasn’t that hard to create a service that handles the tenant and its sharding entry.

The AuthP repo contains a DemoShardOnlyGetDatabaseForNewTenant service which handles the sharding-only tenant and the required sharding entry. The Example7 app in the repo has the “Sign up now” feature and uses the DemoShardOnlyGetDatabaseForNewTenant. The screenshot below shows this working – note that it allows the user to decide where they want their data located.

Lots of new / updated documentation for AuthP version 6

When I came back to programming after a 21-year break (I went to the dark side, e.g. I was a manager) I picked Microsoft because their documentation was so good. Therefore I know that documentation is (nearly) as important as the code.

So, with the release of version 6 of the AuthP I have added / updated the documentation about sharding. Here is a list.

Document | What it’s about
Multi-tenant explained | Start here if new to AuthP
Sharding explained | Describes the two sharding modes (new)
How AuthP handles sharding | Describes the internal code (new)
Configuring sharding | How to set up a sharding app (new)
Extra help for sharding-only apps | Docs on the new features (new)
Managing sharding databases | Showing / changing tenants (updated)
UpdateToVersion6 | How to update an app to AuthP 6

The documents give you the overview of what to do, but I also take the time to add comments to all the code. That’s because sometimes it’s easier to read the code to really understand what it does.

Also, the AuthP repo now has a new example called Example7 which implements a sharding-only multi-tenant application. The Example7 app uses all the new / updated features included in the new AuthP version 6 release.

Conclusion

When I started to look at improving AuthP’s sharding-only tenants I kept finding more and more issues, and I wasn’t sure it was worth doing the update, so I ran a twitter poll to find out what the users of AuthP wanted. That poll clearly showed that developers preferred the sharding-only approach nearly twice as much as a shared-database approach. This feedback made me spend more time to make sharding-only work really well.

I’m very happy about the AuthP version 6 update because it fixed the limitations in sharding-only multi-tenant applications and improved the admin of creating / deleting tenants. The only problem is that AuthP version 6 contains breaking changes, which means you have to alter your sharding multi-tenant application when updating to AuthP version 6 – see UpdateToVersion6 for the details.

There were always going to be some breaking changes, but I found my previous sharding code was hard to understand, so I tidied up that code at the same time. This adds extra breaking changes, but it turned out to be fairly simple. For instance, the update of Example6 (hybrid multi-tenant) from version 5 to version 6 wasn’t hard – usually it was just changing the names of the services and methods, with the complex changes only in the internal code.

I hope AuthP’s users will find version 6 helps them, or that anyone building a sharding-only multi-tenant application can get some ideas for their project. Do give me feedback on version 6 and I am happy to answer your questions via AuthP’s GitHub issues pages.

Happy coding.


A detailed look at EF Core’s JSON Columns feature


Some developers used EF Core JSON Columns in their DbContext when using my EfCore.SchemaCompare library, and then they had a problem. That’s because I hadn’t added support for JSON Columns to my EfCore.SchemaCompare library (this library compares EF Core’s Model of the database against a database’s schema). When I added support for the JSON Columns feature to the library, I found it had a rather complex internal design. In fact Shay Rojansky talks about this complex design in the EF Core 9 release EF Core community standup (click this to hear what he says) and they want to improve it in EF Core 10.

So the combination of a complex design, not much documentation, and my dementia (See the ENDNOTE dementia and programming section at the end of this article) made it pretty hard for me to work out what to do. Thankfully the EF Core team responded to my issue, which helped me a lot.

So, I decided to create this article to provide more documentation to help other developers when using JSON Columns. The article compares the EF Core normal approach of using SQL Tables, columns, Indexes, etc., (shortened to “Tables/Indexes approach” in this article) against the JSON Columns approach so that you can compare / contrast the two approaches.

TL;DR; – Summary of this article

  • EF Core added a feature called JSON Columns in .NET 7, which stores data in a JSON string, and it was also updated in EF Core’s .NET 8 release.
  • This article compares the JSON Columns approach against the normal Tables/Indexes approach.
  • I use a “list of books” app using both the JSON Columns approach and the Tables/Indexes approach
  • The way relationships are configured and stored is different – the JSON Columns’ relationships are held inside its JSON string, while the Tables/Indexes approach has multiple tables, linked by indexes.
  • The create and read of both approaches are (nearly) the same, while selecting particular entries is different in the two approaches.
  • There is a large section looking at the performance of eight tasks with different numbers of the entities to give you an idea where JSON Columns would work in your application.
  • At the end I give you my, and the EF Core team’s, view of what sort of data works with JSON Columns: the data must be “coherent”, i.e. there are no links outside the data. Logging data is one example where JSON Columns would work well.

Introduction of storing data in a database using JSON

JSON (full name: JavaScript Object Notation) is a lightweight format for storing and transporting data in strings. JSON’s format has a beautifully simple design, which makes it easy to understand and use. JSON is used everywhere, for instance in Web Applications to send data from the server to the client and to send back inputs from the client.

I also used JSON in my early use of EF Core by using the JsonSerializer.Serialize and JsonSerializer.Deserialize methods to store data in a database. But the .NET 7 release of EF Core added dedicated code to convert data into a JSON string (this is known as “JSON Columns” in EF Core). The .NET 8 release of EF Core improved its performance and added JSON Columns to SQLite. And Shay Rojansky, who is part of the EF Core team, said they will improve JSON Columns in the .NET 10 release of EF Core.

The JSON Columns feature allows you to put some code in your EF Core’s DbContext’s OnModelCreating method to automatically convert a class’s properties into a JSON string. It also allows multiple classes, including relationships, to be stored in one JSON string.

The .NET 7 what’s new JSON Columns section says “JSON columns allow relational databases to take on some of the characteristics of document databases, creating a useful hybrid between the two.” JSON Columns can contain any basic types, e.g., string, int, bool, DateTime, and relationships, although JSON Columns’ relationships are handled differently to the normal Tables / Index approach.

Creating an example application

I wanted an example application that covers relationships, configuration, read/write and performance. I also wanted an application that both the Tables/Indexes approach and the JSON Columns approach can use. I chose a “list of books” app (think a very simple Amazon book app).

I have to say that the “list of books” app is NOT something that I would build using JSON Columns, but it does show all the JSON Columns’ relationships. Also, the “list of books” app provides a good way to compare the JSON Columns’ performance against the Tables/Indexes approach.

NOTE: I used a “list of books” app in my Entity Framework Core in Action (2nd. Edition). In chapter 2 I used the “list of books” app to show the main types of EF Core’s relations, and in chapter 15 looked at performance and chapter 16 using Cosmos DB.

I created a GitHub repo called ExploringJsonColumns which holds the code to check that the code was correct, and also to run performance tests to compare the two approaches. The sections in this article are:

  1. An overview of the “list of books” data
  2. The relationships and how they are configured
  3. Writing into the database
  4. Selecting data from the database
  5. Reading data from the database
  6. Performance comparisons
  7. My thoughts on where JSON Columns are a good fit

1. An overview of the “list of books” data

My ExploringJsonColumns solution holds two versions of the “list of books” app code: one using the normal Tables/Indexes approach and the other the JSON Columns approach. Both provide the same output via a common class called BookListDto (the BookListDto class contains all the data to show a book’s details, e.g., book title, author(s), price, star ratings, etc.). This ensures that we can compare the two approaches’ performance because both return the same data. The screenshot below shows one (fictional) book’s data.

2. The relationships and how they are configured

The “list of books” app has the three types of relationships, and the diagram below shows the Tables/Indexes approach on the left and the JSON Columns approach on the right.

NOTE: click on the picture below to get a bigger picture.

See Tables/Indexes classes in this folder.See JSON Columns classes in this folder.

The JSON Columns part of the diagram above isn’t as clear as the Tables/Indexes approach, so the screenshot below shows the JSON in the BookData’s nvarchar(max) string.

The configuring of JSON Columns is added in your DbContext’s OnModelCreating method – see the diagram below, which shows the BookJsonContext class with comments about where the JSON Columns’ OwnsOne and OwnsMany methods are used.
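To give you a feel for it, here is a hedged sketch of that kind of configuration – the BookTop and BookData names come from this article, but the exact property names (Authors, Reviews) are assumptions, not the repo’s exact code.

protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    //OwnsOne maps the owned BookData class, and ToJson tells EF Core to
    //store it (and everything nested inside it) as one JSON string column
    modelBuilder.Entity<BookTop>().OwnsOne(b => b.BookData, ownedBuilder =>
    {
        ownedBuilder.ToJson();
        //OwnsMany handles the “many” relationships held inside the JSON
        ownedBuilder.OwnsMany(d => d.Authors);
        ownedBuilder.OwnsMany(d => d.Reviews);
    });
}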

Here is a link to another DbContext, the JsonCustomerContext, which I used when testing the JSON Columns code added to my EfCore.SchemaCompare library. This shows how you can configure multiple layers of relationships (e.g., grandfather, father, son) and handle multiple JSON Columns’ entities.

The differences between Tables/Indexes and JSON Columns version’s “many” relationships

When showing a book with multiple authors you must ensure the authors’ names are in the correct order. In the Tables/Indexes approach the BookAuthor class has a byte “Order” column to make sure that the authors’ names are shown in the correct order.

But in the JSON Columns approach the JsonAuthor class doesn’t need an “Order” property, because the List<JsonAuthor> holds the authors’ names in the order they were added.

3. Writing into the database

From the developer’s perspective the writing of data to the database uses the same process for the Table / Index approach and JSON Columns – you put data into a class and EF Core will store it in the database. EF Core will store Table / Index data in tables and columns, while with JSON Columns EF Core will store the data in a JSON string.

4. Selecting data from the database

In the first version (EF Core 7) of JSON Columns the only way to select data is using the Table / Index approach, i.e. by using column(s) in the top entity; for instance, in the “list of books” you could use its Id (index) in the BookTop class.

But in EF Core 8 there was a new “Enhancements to JSON column mapping” feature which enables you to filter each entry (i.e. book) using data point(s) in the JSON string. I use this feature in my performance tasks to find all the books that have a specific author’s name. You can find that code here.
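As a hedged sketch of what that kind of query looks like (the BookData, Authors and Name property names are assumptions for illustration):

//EF Core 8+ can translate a filter on a data point inside the JSON string
var booksByAuthor = await context.Books
    .Where(b => b.BookData.Authors
        .Any(a => a.Name == authorName))
    .ToListAsync();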

5. Reading data from the database

The reading of JSON Columns’ data is very easy – you simply access the classes / properties just like the Table / Index approach. The only small difference is that any relationship in the JSON Columns is immediately available, so you don’t need to add code to load relationships (e.g., context.Books.Include(book => book.Reviews)… isn’t needed).
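For example, here is a hedged sketch of a read (again, the property names are assumptions):

//the owned JSON data comes back with the top entity – no Include needed
var book = await context.Books.SingleAsync(b => b.BookId == bookId);
var reviews = book.BookData.Reviews; //already loaded from the JSON string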

NOTE: There are some rare situations when you need to a) read in many JSON Columns entities and b) combine data, or you only need a subset of the data points; then using an IEnumerable<> input and an IEnumerable<> output can make the process (a bit) quicker (see MapJsonBooks for an example). I found this when reading all the books ordered by each book’s star rating, because I had two ToLists – using IEnumerable<> in/out removes one of the ToList calls.

6. Performance comparisons

I have a project called ConsoleBenchmark in my ExploringJsonColumns solution which uses BenchmarkDotNet to get the performance of the two approaches, i.e., the normal Tables/Indexes approach and the JSON Columns approach. There are four tasks to compare, with two types of data:

  • Simple data: This has the least relationships, with each book having one Author’s Name and no reviews or promotions.
  • Complex data: This has many relationships, with each book having two Author’s Name and many reviews and a few promotions.
Task name | What it does
1. Add | Add N sample data to an empty SQL database
2. Read | Read N sample data from the SQL database
3. Order | Read N sample data and order the books using their star rating
4. Author | Find all the books that have a specific author’s name in the N sample data

The first two tasks (Add, Read) don’t favour either approach, but the last two tasks (Order, Author) should favour the Tables/Indexes approach because it can use indexes to get to the information more easily. NOTE that the “Order” and “Author” tasks don’t work with the Simple data type.

Below are the benchmark results for the two setups, Simple and Complex, with a performance summary after the BenchmarkDotNet results.

From the performance of this “list of books” code you can say:

  • Add tests: The JSON Columns version is as good as, or better than, the Table / Index version, and with 1,000 books the JSON Columns version takes nearly half the time of the Table / Index code. I suspect the number of relationships has some effect on the Table / Index version’s performance.
  • Read tests: The JSON Columns performance is near to the Table / Index version when there aren’t many books, i.e., 10, but with 100 books it takes 8 times longer, and with 1,000 books it takes 10 times longer.
  • Order tests (complex only): I thought the Table / Index version would win every time because it could use the LINQ Average method on all the reviews’ star ratings, but with ten books the JSON Columns version works well. However, compared against the Table / Index version, the 100-book JSON Columns run takes 7 times longer, and the 1,000-book run also takes 7 times longer.
  • Author tests (complex only): This task finds all the books that have a specific author’s name. The JSON Columns approach was better than I expected, but it’s always slower than the Table/Index approach. The worst was the 1,000-book JSON Columns version, which was 8 times slower than the Table/Index approach.

5. My thoughts on where JSON Columns are a good fit

I used the “list of books” code because it covers the JSON Columns’ relationships, configuration, types of reading, etc., which makes it a good example. As I said earlier, I was sure that the normal Table / Index approach would have the best performance on the “list of books” tests, and I was right. But I was surprised that the JSON Columns did so well!

So, what types of data are a good fit for using JSON Columns? My take is:

  • The data has many parts.
  • Each data point can be held in a built-in type, e.g. int, bool, string…
  • The data points are “coherent”, i.e. there are no links outside this data.

This means the JSON Columns approach is good if you have lots of data about one thing that doesn’t need links to another entity. The .NET 7 release of the EF Core documentation gives examples that work well with JSON Columns, for instance:

  • Contact Details: name, phones, address, etc.
  • Logging: what, where, when, who…

The other thing to consider is where the JSON Columns approach performs better than the Table / Index approach. The Performance comparisons section shows that:

  • Adding a lot of JSON Columns data is normally faster than the Table / Index approach (I think this is because the Table / Index approach of the “list of books” app has to create more indexes and relationships when adding).
  • If you need to read a lot of entities, then JSON Columns can be slow. For instance, in the “list of books”:
    • JSON Columns: 10 books take about the same time as the Table / Index approach, but…
    • JSON Columns: 100 books take 2 times as long as the Table / Index approach
    • JSON Columns: 1,000 books take 8 times as long as the Table / Index approach
  • If there are lookups (e.g., finding all the books for a specific author) or ordering, then the Table / Index approach, with its indexed links, is likely to perform better than the JSON Columns approach. For instance, in the Author task the JSON Columns approach has to read every AuthorName in each entity, while the Table / Index approach only has to follow the links via the BookAuthor class, of which there are only 10.

Conclusion

I hope this article helps you to understand how to set up and use EF Core’s JSON Columns feature. Also, the performance figures give you some idea of how JSON Columns will do in add, read, and filter tasks.

Just to end: if your data fits a JSON Columns approach, but performance is critical, then I would recommend Cosmos DB. I used Cosmos DB with the CQRS pattern in chapter 16 of my book Entity Framework Core in Action (2nd edition) and I found it very fast. For instance, the equivalent of the Order task in this article was way faster than the normal Tables / Indexes approach.

I wrote an article at the end of May 2024 which had an endnote about my dementia, talking about my diagnosis of Alzheimer’s disease in January 2024. Alzheimer’s disease is a progressive disease that worsens over time; this endnote talks about how I am in January 2025.

I am very happy to say my programming skills are better, and that’s what this endnote talks about. However, I have to say that there are a few negative symptoms as well, like something I call “sensory overload”: if I do lots of different tasks one after another (or am around lots of people or sounds), then I feel overloaded. But I’m going to focus on how my programming skills got better, because developers will be reading this.

In my first endnote about my dementia I talked about using lists more to cover the problem of memory loss, and using my own version of the Cognitive Stimulation Therapy (CST) approach. The typical CST suggestions are to do crossword puzzles, Sudoku, a Rubik’s Cube, etc. I do some of these, but for me the best CST work is programming! I had to change my approach to programming to counter the dementia’s symptoms – and for me it works.

It turns out that in the last three months I have done a lot of programming:

  • Added JSON Columns support to my EfCore.SchemaCompare library (very difficult!)
  • Fixed a tricky problem in my EfCore.SchemaCompare library (complex)
  • Updated five of my libraries to .NET 9 (mostly easy)
  • Built the ExploringJsonColumns solution for this article (complex)

Doing all this has helped me to regain some of my programming skills, like getting my “holding all the parts of the code in my head” skill back (I talked about this in the first endnote). This skill gave me a detailed understanding of the design of any project I was working on, which allowed me to think about additions or improvements “in my head”, often after work without the code in front of me. I thought everyone could do this, and when I lost this gift I was devastated, but I also realised that my “in my head” skill might be rare. Now that I have seen I can get my “in my head” skill back, it makes programming easier!

However, dementia affects many parts of the brain, and the big problems for me are remembering words and spelling. Here are two that I have, with examples from my programming:

  • Correct name: When coding I usually know that I can do a task, but I can’t remember the name of the method I need. For instance, I knew I could select a subset of a list, but didn’t remember its name (it turns out it is LINQ’s Take method). This happened many times on my small ExploringJsonColumns project.
  • Correct spelling: The coding is bad enough, but the real problem is remembering how to spell a word, even when using Word to help me! I can’t count the number of times I typed a word and it wasn’t correct, and then I had to try different spellings or look it up on the web (I found that Google Search is better than Word at figuring out what my incorrect version of a word should be).

The other things I do when coding are:

  • Unit tests – indispensable: I have always used unit tests, but now I have dementia I need them even more!
  • Task list – now used: Before dementia I could handle many things in my head. Now I must create a task list so I don’t miss something. For this article I wrote the section names first, and then created the code for each section.
  • Use two monitors: If I am changing some code it’s useful to remember what the original code was, but my dementia reduces my ability to remember things (short-term memory). Therefore I often have the original code on my second monitor while I update the code on my main monitor.

END
