June 2016

Volume 31 Number 6

[ASP.NET]

Use Custom Middleware to Detect and Fix 404s in ASP.NET Core Apps

By Steve Smith

If you’ve ever lost something at a school or an amusement park, you may have had the good fortune of getting it back by checking the location’s Lost and Found. In Web applications, users frequently make requests for paths that aren’t handled by the server, resulting in 404 Not Found response codes (and occasionally humorous pages explaining the problem to the user). Typically, it’s up to the user to find what they’re looking for on their own, either through repeated guesses or perhaps using a search engine. However, with a bit of middleware, you can add a “lost and found” to your ASP.NET Core app that will help users find the resources they’re looking for.

What Is Middleware?

The ASP.NET Core documentation defines middleware as “components that are assembled into an application pipeline to handle requests and responses.” At its simplest, middleware is a request delegate, which can be represented as a lambda expression, like this one:

app.Run(async context => {
  await context.Response.WriteAsync("Hello world");
});

If your application consists of just this one bit of middleware, it will return “Hello world” to every request. Because it doesn’t refer to the next piece of middleware, this particular example is said to terminate the pipeline—nothing defined after it will be executed. However, just because it’s the end of the pipeline doesn’t mean you can’t “wrap” it in additional middleware. For instance, you could add some middleware that adds a header to the previous response:

app.Use(async (context, next) =>
{
  context.Response.Headers.Add("Author", "Steve Smith");
  await next.Invoke();
});
app.Run(async context =>
{
  await context.Response.WriteAsync("Hello world");
});

The call to app.Use wraps the call to app.Run, calling into it using next.Invoke. When you write your own middleware, you can choose whether you want it to perform operations before, after, or both before and after the next middleware in the pipeline. You can also short-circuit the pipeline by choosing not to call next. I’ll show how this can help you create your 404-fixing middleware.
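To make the wrapping behavior concrete, here is a minimal, framework-free sketch of how a pipeline like this could be composed. It is not the ASP.NET Core implementation: SketchRequestDelegate, SketchPipelineBuilder and SketchPipelineDemo are hypothetical names, and a plain list of strings stands in for HttpContext. It only illustrates why the first middleware registered ends up outermost, and how a component that never calls next terminates the pipeline.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for RequestDelegate; the "context" is just a log.
public delegate Task SketchRequestDelegate(List<string> context);

public class SketchPipelineBuilder
{
    private readonly List<Func<SketchRequestDelegate, SketchRequestDelegate>> _components =
        new List<Func<SketchRequestDelegate, SketchRequestDelegate>>();

    // Register middleware that receives the context and a "next" callback.
    public SketchPipelineBuilder Use(Func<List<string>, Func<Task>, Task> middleware)
    {
        _components.Add(next => context => middleware(context, () => next(context)));
        return this;
    }

    // Compose in reverse so the first registered middleware is the outermost.
    public SketchRequestDelegate Build()
    {
        SketchRequestDelegate app = context => Task.CompletedTask; // end of the line
        foreach (var component in Enumerable.Reverse(_components))
        {
            app = component(app);
        }
        return app;
    }
}

public static class SketchPipelineDemo
{
    public static List<string> Run()
    {
        var log = new List<string>();
        var app = new SketchPipelineBuilder()
            .Use(async (context, next) =>
            {
                context.Add("outer: before next");
                await next();
                context.Add("outer: after next");
            })
            .Use((context, next) =>
            {
                context.Add("terminal: handled");
                return Task.CompletedTask; // never calls next, so the pipeline ends here
            })
            .Build();
        app(log).GetAwaiter().GetResult();
        return log;
    }
}
```

Calling SketchPipelineDemo.Run() produces the entries "outer: before next", "terminal: handled" and "outer: after next", in that order, which mirrors how the header-adding middleware above runs code on both sides of the call to next.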

If you’re using the default MVC Core template, you won’t find such low-level delegate-based middleware code in your initial Startup file. It’s recommended that you encapsulate middleware in its own classes, and provide extension methods (named UseMiddlewareName) that can be called from Startup. The built-in ASP.NET middleware follows this convention, as these calls demonstrate:

if (env.IsDevelopment())
{
  app.UseDeveloperExceptionPage();
}
app.UseStaticFiles();
app.UseMvc();

The order of your middleware is important. In the preceding code, the call to UseDeveloperExceptionPage (which is only configured when the app is running in a development environment) should be wrapped around (thus added before) any other middleware that might produce an error.

In a Class of Its Own

I don’t want to clutter my Startup class with all of the lambda expressions and detailed implementation of my middleware. Just as with the built-in middleware, I want my middleware to be added to the pipeline with one line of code. I also anticipate that my middleware will need services injected using dependency injection (DI), which is easily achieved once the middleware is refactored into its own class. (See my May article at msdn.com/magazine/mt703433 for more on DI in ASP.NET Core.)

Because I’m using Visual Studio, I can add middleware by using Add New Item and choosing the Middleware Class template. Figure 1 shows the default content this template produces, including an extension method for adding the middleware to the pipeline via UseMiddleware.

Figure 1 Middleware Class Template

public class MyMiddleware
{
  private readonly RequestDelegate _next;
  public MyMiddleware(RequestDelegate next)
  {
    _next = next;
  }
  public Task Invoke(HttpContext httpContext)
  {
    return _next(httpContext);
  }
}
// Extension method used to add the middleware to the HTTP request pipeline.
public static class MyMiddlewareExtensions
{
  public static IApplicationBuilder UseMyMiddleware(this IApplicationBuilder builder)
  {
    return builder.UseMiddleware<MyMiddleware>();
  }
}

Typically, I’ll add async to the Invoke method signature, and then change its body to:

await _next(httpContext);

This makes the invocation asynchronous.

Once I’ve created a separate middleware class, I move my delegate logic into the Invoke method. Then I replace the call in Configure with a call to the UseMyMiddleware extension method. Running the app at this point should verify that the middleware still behaves as it did before, and the Configure method is much easier to follow when it consists of a series of UseSomeMiddleware statements.

Detecting and Recording 404 Not Found Responses

In an ASP.NET application, if a request is made that doesn’t match any handlers, the response will include a StatusCode set to 404. I can create a bit of middleware that will check for this response code (after calling _next) and take some action to record the details of the request:

await _next(httpContext);
if (httpContext.Response.StatusCode == 404)
{
  _requestTracker.Record(httpContext.Request.Path);
}

I want to be able to keep track of how many 404s a particular path has had, so I can fix the most common ones and get the most out of my corrective actions. To do that, I create a service called RequestTracker that records instances of 404 requests based on their path. RequestTracker is passed into my middleware through DI, as shown in Figure 2.

Figure 2 Dependency Injection Passing RequestTracker into Middleware

public class NotFoundMiddleware
{
  private readonly RequestDelegate _next;
  private readonly RequestTracker _requestTracker;
  private readonly ILogger _logger;
  public NotFoundMiddleware(RequestDelegate next,
    ILoggerFactory loggerFactory,
    RequestTracker requestTracker)
  {
    _next = next;
    _requestTracker = requestTracker;
    _logger = loggerFactory.CreateLogger<NotFoundMiddleware>();
  }
}

To add NotFoundMiddleware to my pipeline, I call the UseNotFoundMiddleware extension method. However, because it now depends on a custom service being configured in the services container, I also need to ensure the service is registered. I create an extension method on IServiceCollection called AddNotFoundMiddleware and call this method in ConfigureServices in Startup:

public static IServiceCollection AddNotFoundMiddleware(
  this IServiceCollection services)
{
  services.AddSingleton<INotFoundRequestRepository,
    InMemoryNotFoundRequestRepository>();
  return services.AddSingleton<RequestTracker>();
}

In this case, my AddNotFoundMiddleware method ensures an instance of my RequestTracker is configured as a Singleton in the services container, so it will be available to inject into the NotFoundMiddleware when it’s created. It also wires up an in-memory implementation of the INotFoundRequestRepository, which the RequestTracker uses to persist its data.

Because many simultaneous requests might come in for the same missing path, the code in Figure 3 uses a simple lock to ensure duplicate instances of NotFoundRequest aren’t added, and the counts are incremented properly.

Figure 3 RequestTracker

public class RequestTracker
{
  private readonly INotFoundRequestRepository _repo;
  private static object _lock = new object();
  public RequestTracker(INotFoundRequestRepository repo)
  {
    _repo = repo;
  }
  public void Record(string path)
  {
    lock(_lock)
    {
      var request = _repo.GetByPath(path);
      if (request != null)
      {
        request.IncrementCount();
      }
      else
      {
        request = new NotFoundRequest(path);
        request.IncrementCount();
        _repo.Add(request);
      }
    }
  }
  public IEnumerable<NotFoundRequest> ListRequests()
  {
    return _repo.List();
  }
  // Other methods
}
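The article doesn't list NotFoundRequest or the in-memory repository, so the following is only a sketch of what they might look like, inferred from the calls made in Figure 3 and elsewhere. The Count property name, the exact interface members, the case-insensitive path comparison and the InMemoryRepositoryDemo helper are all assumptions, not the author's code.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;

public class NotFoundRequest
{
    public NotFoundRequest(string path) { Path = path; }
    public string Path { get; private set; }
    public int Count { get; private set; }            // property name is assumed
    public string CorrectedPath { get; private set; }
    public void IncrementCount() { Count++; }
    public void SetCorrectedPath(string path) { CorrectedPath = path; }
}

// Members inferred from how the article's code uses the repository.
public interface INotFoundRequestRepository
{
    NotFoundRequest GetByPath(string path);
    IEnumerable<NotFoundRequest> List();
    void Add(NotFoundRequest request);
    void Update(NotFoundRequest request);
}

public class InMemoryNotFoundRequestRepository : INotFoundRequestRepository
{
    // Keyed by path; treating paths as case-insensitive is an assumption.
    private readonly ConcurrentDictionary<string, NotFoundRequest> _requests =
        new ConcurrentDictionary<string, NotFoundRequest>(StringComparer.OrdinalIgnoreCase);

    public NotFoundRequest GetByPath(string path)
    {
        NotFoundRequest request;
        return _requests.TryGetValue(path, out request) ? request : null;
    }

    // Most frequently requested paths first.
    public IEnumerable<NotFoundRequest> List()
    {
        return _requests.Values.OrderByDescending(r => r.Count).ToList();
    }

    public void Add(NotFoundRequest request) { _requests[request.Path] = request; }

    public void Update(NotFoundRequest request) { _requests[request.Path] = request; }
}

public static class InMemoryRepositoryDemo
{
    // Records three 404s (two for the same path) and returns the top entry.
    public static NotFoundRequest MostFrequent()
    {
        var repo = new InMemoryNotFoundRequestRepository();
        foreach (var path in new[] { "/abuot", "/contcat", "/abuot" })
        {
            var request = repo.GetByPath(path);
            if (request == null)
            {
                request = new NotFoundRequest(path);
                repo.Add(request);
            }
            request.IncrementCount();
        }
        return repo.List().First();
    }
}
```

In this sketch the repository itself tolerates concurrent reads, but the check-then-add sequence is still a race without the lock that RequestTracker takes in Record.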

Displaying Not Found Requests

Now that I have a way to record 404s, I need a way to view this data. To do this, I’m going to create another small middleware component that will display a page showing all of the recorded NotFoundRequests, ordered by how many times they’ve occurred. This middleware will check to see if the current request matches a particular path, and will ignore (and pass through) any requests that don’t match the path. For matching paths, the middleware will return a page with a table containing the NotFound requests, ordered by their frequency. From there, the user will be able to assign individual requests a corrected path, which will be used by future requests instead of returning a 404.

Figure 4 demonstrates how simple it is to have the NotFoundPageMiddleware check for a certain path, and to make updates based on querystring values using that same path. For security reasons, access to the NotFoundPageMiddleware path should be restricted to admin users.

Figure 4 NotFoundPageMiddleware

public async Task Invoke(HttpContext httpContext)
{
  if (!httpContext.Request.Path.StartsWithSegments("/fix404s"))
  {
    await _next(httpContext);
    return;
  }
  if (httpContext.Request.Query.Keys.Contains("path") &&
      httpContext.Request.Query.Keys.Contains("fixedpath"))
  {
    var request = _requestTracker.GetRequest(httpContext.Request.Query["path"]);
    if (request != null) // Ignore paths that were never recorded
    {
      request.SetCorrectedPath(httpContext.Request.Query["fixedpath"]);
      _requestTracker.UpdateRequest(request);
    }
  }
  Render404List(httpContext);
}

As written, the middleware is hardcoded to listen to the path /fix404s. It’s a good idea to make that configurable, so that different apps can specify whatever path they want. The rendered list of requests shows all requests ordered by how many 404s they’ve recorded, regardless of whether a corrected path has been set up. It wouldn’t be difficult to enhance the middleware to offer some filtering. Other interesting features might be recording more detailed information, so you could see which redirects were most popular, or which 404s were most common in the last seven days, but these are left as an exercise for the reader (or the open source community).

Figure 5 shows an example of what the rendered page looks like.

Figure 5 The Fix 404s Page

Adding Options

I’d like to be able to specify a different path for the Fix 404s page within different apps. The best way to do so is to create an Options class and pass it into the middleware using DI. For this middleware, I create a class, NotFoundMiddlewareOptions, which includes a property called Path with a value that defaults to /fix404s. I can pass this into the NotFoundPageMiddleware using the IOptions<T> interface, and then set a local field to the Value property of this type. Then my magic string reference to /fix404s can be updated:

if (!httpContext.Request.Path.StartsWithSegments(_options.Path))

Fixing 404s

When a request comes in that matches a NotFoundRequest that has a CorrectedPath, the NotFoundMiddleware should modify the request to use the CorrectedPath. This can be done by just updating the Path property of the request:

string path = httpContext.Request.Path;
string correctedPath = _requestTracker.GetRequest(path)?.CorrectedPath;
if(correctedPath != null)
{
  httpContext.Request.Path = correctedPath; // Rewrite the path
}
await _next(httpContext);

With this implementation, the request pipeline continues using the rewritten path, so any corrected URL works just as if the request had been made directly to the corrected path. This may or may not be the desired behavior; for one thing, search engine listings can suffer from having duplicate content indexed on multiple URLs. This approach could result in dozens of URLs all mapping to the same underlying application path. For this reason, it’s often preferable to fix 404s using a permanent redirect (status code 301).

If I modify the middleware to send a redirect, I can have the middleware short circuit in that case, because there’s no need to run any of the rest of the pipeline if I’ve decided I’m just going to return a 301:

if(correctedPath != null)
{
  httpContext.Response.Redirect(httpContext.Request.PathBase + correctedPath +
    httpContext.Request.QueryString, permanent: true);
  return;
}
await _next(httpContext);

Take care not to set corrected paths that result in infinite redirect loops.
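One way to guard against such loops is to check a proposed correction against the existing ones before saving it. The sketch below is an illustration only: RedirectLoopGuard and WouldCreateLoop are hypothetical names that are not part of the sample, and the corrections are assumed to be available as a simple dictionary from requested path to corrected path.

```csharp
using System;
using System.Collections.Generic;

public static class RedirectLoopGuard
{
    // Returns true if mapping "path" to "correctedPath" would make a request
    // chain revisit a path it has already been redirected from.
    public static bool WouldCreateLoop(
        IDictionary<string, string> corrections, string path, string correctedPath)
    {
        // Case-insensitive comparison is an assumption about how paths are matched.
        var visited = new HashSet<string>(StringComparer.OrdinalIgnoreCase) { path };
        var current = correctedPath;
        while (current != null)
        {
            if (!visited.Add(current))
            {
                return true; // already seen this path, so the chain loops
            }
            string nextPath;
            current = corrections.TryGetValue(current, out nextPath) ? nextPath : null;
        }
        return false; // the chain ends at a path that has no correction
    }
}
```

For example, if /b is already corrected to /a, then WouldCreateLoop(corrections, "/a", "/b") returns true, while proposing /a to /c returns false. The Fix 404s page could call a check like this before accepting a fixedpath value.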

Ideally, the NotFoundMiddleware should support both path rewriting and permanent redirects. I can implement this using my NotFoundMiddlewareOptions, allowing one or the other to be set for all requests, or I can modify CorrectedPath on the NotFoundRequest so it includes both the path and the mechanism to use. For now I’ll just update the options class to support the behavior, and pass IOptions<NotFoundMiddlewareOptions> into the NotFoundMiddleware just as I’m already doing for the NotFoundPageMiddleware. With the options in place, the redirect/rewrite logic becomes:

if(correctedPath != null)
{
  if (_options.FixPathBehavior == FixPathBehavior.Redirect)
  {
    httpContext.Response.Redirect(correctedPath, permanent: true);
    return;
  }
  if(_options.FixPathBehavior == FixPathBehavior.Rewrite)
  {
    httpContext.Request.Path = correctedPath; // Rewrite the path
  }
}

At this point, the NotFoundMiddlewareOptions class has two properties, one of which is an enum:

public enum FixPathBehavior
{
  Redirect,
  Rewrite
}
public class NotFoundMiddlewareOptions
{
  public string Path { get; set; } = "/fix404s";
  public FixPathBehavior FixPathBehavior { get; set; } 
    = FixPathBehavior.Redirect;
}

Configuring Middleware

Once you have Options set up for your middleware, you pass an instance of these options into your middleware when you configure them in Startup. Alternatively, you can bind the options to your configuration. ASP.NET configuration is very flexible, and can be bound to environment variables, settings files or built up programmatically. Regardless of where the configuration is set, the Options can be bound to the configuration with a single line of code:

services.Configure<NotFoundMiddlewareOptions>(
  Configuration.GetSection("NotFoundMiddleware"));

With this in place, I can configure my NotFoundMiddleware behavior by updating appsettings.json (the configuration I’m using in this instance):

"NotFoundMiddleware": {
  "FixPathBehavior": "Redirect",
  "Path": "/fix404s"
}

Note that converting from string-based JSON values in the settings file to the enum for FixPathBehavior is done automatically by the framework.

Persistence

So far, everything is working great, but unfortunately my list of 404s and their corrected paths are stored in an in-memory collection. This means that every time my application restarts, all of this data is lost. It might be fine for my app to periodically reset its count of 404s, so I can get a sense of which ones are currently the most common, but I certainly don’t want to lose the corrected paths I’ve set.

Fortunately, because I configured the RequestTracker to rely on an abstraction for its persistence (INotFoundRequestRepository), it’s fairly easy to add support for storing the results in a database using Entity Framework Core (EF). What’s more, I can make it easy for individual apps to choose whether they want to use EF or the in-memory configuration (great for testing things out) by providing separate helper methods.

The first thing I need in order to use EF to save and retrieve NotFoundRequests is a DbContext. I don’t want to rely on one that the app may have configured, so I create one just for the NotFoundMiddleware:

public class NotFoundMiddlewareDbContext : DbContext
{
  public DbSet<NotFoundRequest> NotFoundRequests { get; set; }
  protected override void OnModelCreating(ModelBuilder modelBuilder)
  {
    base.OnModelCreating(modelBuilder);
    modelBuilder.Entity<NotFoundRequest>().HasKey(r => r.Path);
  }
}

Once I have the DbContext, I need to implement the repository interface. I create an EfNotFoundRequestRepository, which requests an instance of the NotFoundMiddlewareDbContext in its constructor, and assigns it to a private field, _dbContext. Implementing the individual methods is straightforward, for example:

public IEnumerable<NotFoundRequest> List()
{
  return _dbContext.NotFoundRequests.AsEnumerable();
}
public void Update(NotFoundRequest notFoundRequest)
{
  _dbContext.Entry(notFoundRequest).State = EntityState.Modified;
  _dbContext.SaveChanges();
}

At this point, all that remains is to wire up the DbContext and EF repository in the app’s services container. This is done in a new extension method (and I rename the original extension method to indicate it’s for the InMemory version):

public static IServiceCollection AddNotFoundMiddlewareEntityFramework(
  this IServiceCollection services, string connectionString)
{
  services.AddEntityFramework()
    .AddSqlServer()
    .AddDbContext<NotFoundMiddlewareDbContext>(options =>
      options.UseSqlServer(connectionString));
  services.AddSingleton<INotFoundRequestRepository,
    EfNotFoundRequestRepository>();
  return services.AddSingleton<RequestTracker>();
}

I chose to have the connection string passed in, rather than stored in NotFoundMiddlewareOptions, because most ASP.NET apps that are using EF will already be providing a connection string to it in the ConfigureServices method. If desired, the same variable can be used when calling services.AddNotFoundMiddlewareEntityFramework(connectionString).

The last thing a new app needs to do before it can use the EF version of this middleware is run the migrations to ensure the database table structure is properly configured. I need to specify the middleware’s DbContext when I do so, because the app (in my case) already has its own DbContext. The command, run from the root of the project, is:

dotnet ef database update --context NotFoundMiddlewareDbContext

If you get an error about a database provider, make sure you’re calling services.AddNotFoundMiddlewareEntityFramework in Startup.

Next Steps

The sample I’ve shown here works fine, and includes both an in-memory implementation and one that uses EF to store Not Found request counts and fixed paths in a database. The list of 404s and the ability to add corrected paths should be secured so that only administrators can access it. Finally, the current EF implementation doesn’t include any caching logic, resulting in a database query being made with every request to the app. For performance reasons, I would add caching using the CachedRepository pattern.
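As a rough idea of what that caching could look like, here is a sketch of a decorator that wraps another repository and caches lookups by path. It is an assumption-laden illustration rather than the sample's code: CachedNotFoundRequestRepository and CountingRepository are hypothetical names, the interface is trimmed to the members the sketch needs, and NotFoundRequest is a minimal stand-in.

```csharp
using System;
using System.Collections.Concurrent;

// Minimal stand-ins so the sketch compiles on its own; member names are assumed.
public class NotFoundRequest
{
    public NotFoundRequest(string path) { Path = path; }
    public string Path { get; private set; }
    public string CorrectedPath { get; private set; }
    public void SetCorrectedPath(string path) { CorrectedPath = path; }
}

public interface INotFoundRequestRepository
{
    NotFoundRequest GetByPath(string path);
    void Add(NotFoundRequest request);
    void Update(NotFoundRequest request);
}

// Decorator: same interface as the wrapped repository, with lookups cached.
public class CachedNotFoundRequestRepository : INotFoundRequestRepository
{
    private readonly INotFoundRequestRepository _inner;
    private readonly ConcurrentDictionary<string, NotFoundRequest> _cache =
        new ConcurrentDictionary<string, NotFoundRequest>(StringComparer.OrdinalIgnoreCase);

    public CachedNotFoundRequestRepository(INotFoundRequestRepository inner)
    {
        _inner = inner;
    }

    public NotFoundRequest GetByPath(string path)
    {
        // Misses (null) are cached as well, because most requests have no record.
        // A real implementation would need expiry or a size limit, since this
        // dictionary grows with every distinct path requested.
        return _cache.GetOrAdd(path, p => _inner.GetByPath(p));
    }

    public void Add(NotFoundRequest request)
    {
        _inner.Add(request);
        Invalidate(request.Path);
    }

    public void Update(NotFoundRequest request)
    {
        _inner.Update(request);
        Invalidate(request.Path);
    }

    // Drop the cached entry so the next lookup goes back to the inner repository.
    private void Invalidate(string path)
    {
        NotFoundRequest ignored;
        _cache.TryRemove(path, out ignored);
    }
}

// Fake inner repository that only counts lookups, to show the cache working.
public class CountingRepository : INotFoundRequestRepository
{
    public int Lookups { get; private set; }
    public NotFoundRequest GetByPath(string path) { Lookups++; return null; }
    public void Add(NotFoundRequest request) { }
    public void Update(NotFoundRequest request) { }
}
```

Wrapping a CountingRepository and calling GetByPath twice for the same path reaches the inner repository once; after an Add or Update for that path, the next lookup reaches it again. Because the decorator implements the same interface, it could be registered in place of the EF repository without changes to RequestTracker.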

The updated source code for this sample is available at bit.ly/1VUcY0J.


Steve Smith is an independent trainer, mentor and consultant, as well as an ASP.NET MVP. He has contributed dozens of articles to the official ASP.NET Core documentation (docs.asp.net), and helps teams quickly get up to speed with ASP.NET Core. Contact him at ardalis.com.

Thanks to the following Microsoft technical expert for reviewing this article: Chris Ross
Chris Ross is a developer working on the ASP.NET team at Microsoft. At this point his brain is made out of middleware.

