Andrew Lock

~11 min read

Creating a custom ConfigurationProvider in ASP.NET Core to parse YAML

In the previous incarnation of ASP.NET, configuration was primarily handled by the ConfigurationManager in System.Configuration, which obtained its values from web.config. In ASP.NET Core there is a new, lightweight configuration system that is designed to be highly extensible. It lets you aggregate configuration values from multiple different sources, and then access those in a strongly typed fashion using the new Options pattern.

Microsoft have written a number of packages for loading configuration from a variety of sources. Currently, using packages in the Microsoft.Extensions.Configuration namespace, you can read values from:

  • Console command line arguments
  • Environment variables
  • User Secrets stored using the Secrets Manager
  • In memory collections
  • JSON files
  • XML files
  • INI files

I recently wanted to use a YAML file as a configuration source, so I decided to write my own provider to support it. In this article I'm going to describe the process of creating a custom configuration provider. I will outline the provider I created, but you could easily adapt it to read any other sort of structured file you need to.

If you are just looking for the YAML provider itself, rather than how to create your own custom provider, you can find the code on GitHub and on NuGet.

Introduction to the ASP.NET Core configuration system

For those unfamiliar with it, the code below shows a somewhat typical File - New Project configuration for an ASP.NET Core application. It shows the constructor for the Startup class which is called when your app is just starting up.

public Startup(IHostingEnvironment env)
{
    var builder = new ConfigurationBuilder()
        .SetBasePath(env.ContentRootPath)
        .AddJsonFile("appsettings.json", optional: true, reloadOnChange: true)
        .AddJsonFile($"appsettings.{env.EnvironmentName}.json", optional: true)
        .AddEnvironmentVariables();
    Configuration = builder.Build();
}

public IConfigurationRoot Configuration { get; }

This version was scaffolded by the Yeoman generator, so it may differ slightly from the Visual Studio template, but the two are similar. Configuration is performed using a ConfigurationBuilder, which aggregates settings from various sources. Before adding any file-based sources, you should call SetBasePath with the ContentRootPath, so the builder knows where to look for your files.

We are then adding two JSON files - the appsettings.json file (which is typically where you would store settings you previously stored in web.config), and an environment-specific JSON file (when in development, it would look for an appsettings.development.json file). Any settings with the same key in the latter file will overwrite settings read from the first.

Finally, the environment variables are added to the settings collection, again overwriting any identical values, and the configuration is built into an IConfigurationRoot, which essentially exposes a key-value store of setting keys and values.

Under the hood

There are a few important points to note in this setup.

  1. Settings discovered later in the pipeline overwrite any settings found previously.
  2. The setting keys are case insensitive.
  3. Setting keys are a string representation of the whole context of a setting, with a context delimited by the : character.

Hopefully the first two points make sense but what about that third one? Essentially we need to 'flatten' all our configuration files so that they have a single string key for every value. Taking a simple JSON example:

{
  "Outer" : { 
    "Middle" : { 
      "Inner": "value1",
      "HasValue": true
    }
  }
}

This example contains nested objects, but only two values that are actually being exposed as settings. The JsonConfigurationProvider takes this representation and ultimately converts it into an IDictionary<string, string> with the following values:

new Dictionary<string, string> {
  {"Outer:Middle:Inner", "value1"},
  {"Outer:Middle:HasValue", "true"}
}
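
The three rules above can be checked with a small experiment. The sketch below is not part of the original setup: it assumes the Microsoft.Extensions.Configuration package is referenced, and it uses the in-memory provider as a stand-in for two file sources. The class name ConfigurationRulesDemo and the sample values are hypothetical.

```csharp
using System.Collections.Generic;
using Microsoft.Extensions.Configuration;

public static class ConfigurationRulesDemo
{
    public static IConfigurationRoot BuildConfiguration()
    {
        // Stand-in for an early source such as appsettings.json,
        // already flattened into colon-delimited keys.
        var defaults = new Dictionary<string, string>
        {
            {"Outer:Middle:Inner", "value1"},
            {"Outer:Middle:HasValue", "true"}
        };

        // Stand-in for a later source. The key uses different casing on
        // purpose: it still refers to the same setting.
        var overrides = new Dictionary<string, string>
        {
            {"outer:middle:inner", "overridden"}
        };

        // Sources added later take precedence over sources added earlier.
        return new ConfigurationBuilder()
            .AddInMemoryCollection(defaults)
            .AddInMemoryCollection(overrides)
            .Build();
    }
}
```

With this configuration, reading config["Outer:Middle:Inner"] should return the value from the second source, while config["OUTER:MIDDLE:HASVALUE"] should still find the entry from the first source, because key comparison ignores case.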

YAML basics

YAML stands for "YAML Ain't Markup Language", and according to the official YAML website:

YAML is a human friendly data serialization standard for all programming languages.

It is a popular format for configuration files as it is easy to read and write, and it is used by continuous integration tools like AppVeyor and Travis. For example, an appveyor.yml file might look something like the following:

version: '{build}'
pull_requests:
  do_not_increment_build_number: true
branches:
  only:
  - master
nuget:
  disable_publish_on_pr: true
build_script:
- ps: .\Build.ps1
test: off
artifacts:
- path: .\artifacts\**\*.nupkg
  name: NuGet
deploy:
- provider: NuGet
  server: https://www.myget.org/F/andrewlock-ci/api/v2/package
  skip_symbols: true
  on:
    branch: master
- provider: NuGet
  name: production
  on:
    branch: master
    appveyor_repo_tag: true

Whitespace and case are important in YAML, so the indents all have meaning. If you are used to working with JSON, it may help to think of an indented YAML section as being surrounded by {}.

There are essentially 3 primary structures in YAML, which correspond quite nicely to JSON equivalents. I'll go over these briefly as we will need to understand how each should be converted to produce the key-value pairs we need for the configuration system.

YAML Scalar

A scalar is just a value - this might be the property key on the left, or the property value on the right. All of the identifiers in the snippet below are scalars.

key1: value
key2: 23
key3: false

The scalar corresponds fairly obviously with the simple types in JavaScript (number, string, boolean etc. - not arrays or objects), whether they are used as keys or values.

YAML Mapping

The YAML mapping structure is essentially a dictionary, with a unique identifier and a value. It corresponds to an object in JSON. Within a mapping, all the keys must be unique, and because YAML is case sensitive, keys that differ only by case count as different keys. The example below shows a simple mapping structure, and two nested mappings:

mapping1: 
  prop1: val1
  prop2: val2
mapping2:
  mapping3:
    prop1: otherval1
    prop2: otherval2
  mapping4: 
    prop1: finalval

YAML Sequence

Finally, we have the sequence, which is equivalent to a JSON array. Again, nested sequences are possible - the example shows a sequence of mappings, equivalent to a JSON array of objects:

sequence1: 
- map1:
   prop1: value1
- map2:
   prop2: value2

Creating a custom configuration provider

Now that we have an understanding of what we are working with, we can dive into the fun bit: creating our configuration provider!

In order to create a custom provider, you only need to implement two interfaces from the Microsoft.Extensions.Configuration.Abstractions package - IConfigurationProvider and IConfigurationSource.

In reality, it's unlikely you will need to implement these directly - there are a number of base classes you can use which contain partial implementations to get you started.

The ConfigurationSource

The first interface to implement is the IConfigurationSource. This has a single method that needs implementing, but there is also a base FileConfigurationSource which is more appropriate for our purposes:

public class YamlConfigurationSource : FileConfigurationSource
{
    public override IConfigurationProvider Build(IConfigurationBuilder builder)
    {
        FileProvider = FileProvider ?? builder.GetFileProvider();
        return new YamlConfigurationProvider(this);
    }
}

If not already set, this calls the extension method GetFileProvider on IConfigurationBuilder to obtain an IFileProvider which is used later to load files from disk. It then creates a new instance of a YamlConfigurationProvider (described next), and returns it to the caller.

The ConfigurationProvider

There are a couple of possibilities for implementing IConfigurationProvider, but we will derive from the base class FileConfigurationProvider. This base class handles the additional requirements of loading files for us: handling missing files, reloads, setting key management, etc. All that is required is to override a single Load method. The YamlConfigurationProvider (elided for brevity) is shown below:

using System;
using System.IO;
using Microsoft.Extensions.Configuration;

public class YamlConfigurationProvider : FileConfigurationProvider
{
    public YamlConfigurationProvider(YamlConfigurationSource source) : base(source) { }

    public override void Load(Stream stream)
    {
        var parser = new YamlConfigurationFileParser();
       
        Data = parser.Parse(stream);
    }
}

Easy, we're all done! We just create an instance of the YamlConfigurationFileParser, parse the stream, and assign the resulting string dictionary to the Data property.

Ok, so we're not quite there. While we have implemented the only required interfaces, we have a couple of support classes we need to set up.

The FileParser

The YamlConfigurationProvider above didn't really do much - it's our YamlConfigurationFileParser that contains the meat of our provider, converting the stream of characters provided to it into a string dictionary.

In order to parse the stream, I turned to YamlDotNet, a great open source library for parsing YAML files into a representational format. I also took a peek at the source code behind the JsonConfigurationFileParser in the aspnet/Configuration project on GitHub. In fact, given how close the YAML and JSON formats are, most of the code I wrote was inspired either by the Microsoft source code, or examples from YamlDotNet.

The parser we create must take a stream input from a file, and convert it into an IDictionary<string, string>. To do this, we make use of the visitor pattern, visiting each of the YAML nodes we discover in turn. I'll break down the basic outline of the YamlConfigurationFileParser below:

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Microsoft.Extensions.Configuration;
using YamlDotNet.RepresentationModel;

internal class YamlConfigurationFileParser
{
    private readonly IDictionary<string, string> _data = 
        new SortedDictionary<string, string>(StringComparer.OrdinalIgnoreCase);
    private readonly Stack<string> _context = new Stack<string>();
    private string _currentPath;

    public IDictionary<string, string> Parse(Stream input)
    {
        _data.Clear();
        _context.Clear();

        var yaml = new YamlStream();
        yaml.Load(new StreamReader(input));

        // Examine the stream and fetch the top level node
        var mapping = (YamlMappingNode)yaml.Documents[0].RootNode;

        // The document node is a mapping node
        VisitYamlMappingNode(mapping);

        return _data;
    }

    // Implementation details elided for brevity
    private void VisitYamlMappingNode(YamlMappingNode node) { }

    private void VisitYamlMappingNode(YamlScalarNode yamlKey, YamlMappingNode yamlValue) { }

    private void VisitYamlNodePair(KeyValuePair<YamlNode, YamlNode> yamlNodePair) { }

    private void VisitYamlSequenceNode(YamlScalarNode yamlKey, YamlSequenceNode yamlValue) { }

    private void VisitYamlSequenceNode(YamlSequenceNode node) { }

    private void EnterContext(string context) { }

    private void ExitContext() { }

    // Final 'leaf' call for each tree which records the setting's value 
    private void VisitYamlScalarNode(YamlScalarNode yamlKey, YamlScalarNode yamlValue)
    {
        EnterContext(yamlKey.Value);
        var currentKey = _currentPath;

        if (_data.ContainsKey(currentKey))
        {
            throw new FormatException(Resources.FormatError_KeyIsDuplicated(currentKey));
        }

        _data[currentKey] = yamlValue.Value;
        ExitContext();
    }

}

I've hidden most of the visitor functions as they're really just implementation details, but if you're interested you can find the full YamlConfigurationFileParser code on GitHub.

First, we have our private fields: IDictionary<string, string> _data (backed by a SortedDictionary), which will contain all our settings once parsing is complete; Stack<string> _context, which keeps track of the level of nesting we have; and string _currentPath, which is set to the current setting key whenever _context changes. Note that the dictionary is created with StringComparer.OrdinalIgnoreCase (remember we said setting keys are case insensitive).
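
One consequence of the case-insensitive comparer is worth spelling out: two keys that differ only by case are distinct in YAML, but they collide once they are flattened into setting keys. The following is a minimal sketch of that behaviour using only the base class library. SettingStore and its members are hypothetical names for illustration, not part of the provider.

```csharp
using System;
using System.Collections.Generic;

public static class SettingStore
{
    // Uses the same comparer as the parser's _data field, so
    // "mapping4:prop1" and "mapping4:Prop1" are treated as one key.
    public static IDictionary<string, string> Create()
    {
        return new SortedDictionary<string, string>(StringComparer.OrdinalIgnoreCase);
    }

    // Mirrors the duplicate check in VisitYamlScalarNode, but reports a
    // collision through the return value instead of throwing.
    public static bool TryAdd(IDictionary<string, string> data, string key, string value)
    {
        if (data.ContainsKey(key))
        {
            return false;
        }

        data[key] = value;
        return true;
    }
}
```

Adding "mapping4:prop1" to a fresh store succeeds, while a later attempt to add "mapping4:Prop1" to the same store is rejected as a duplicate, which is the situation in which the real parser throws its FormatException.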

The processing is started by calling Parse(stream) with the open file stream. We clear any previous data or context we have, create an instance of YamlStream, and load our provided stream into it. We then retrieve the document level RootNode which you can think of as sitting just outside the YAML document, pointing to the document contents.

YAML root node mapping structure

Now we have a reference to the document structures, we can visit each of these in sequence, looping over all of the children until we have visited every node. For each node, we call the appropriate 'visit' method depending on the node type.

I have only shown the body of the VisitYamlScalarNode(keyNode, valueNode) for brevity but the other 'visit' methods are relatively simple. For every level you go into a mapping structure, the mapping 'key' node gets pushed onto the context stack. For a sequence structure, the 0 based index of the item is pushed on to the stack before it is processed.

Every visitation context will ultimately terminate in a call to VisitYamlScalarNode. This method adds the final key to the context, and fetches the combined setting key path in _currentPath. It checks that the key has not been previously added (in this file), and then saves the setting key and final scalar value into the dictionary.
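
The bodies of EnterContext and ExitContext were elided above, so here is a rough sketch of how such a context stack could work. It is an assumption rather than the exact code from the repository: ContextTracker is a hypothetical standalone class, and it joins segments with a literal ":" where the real parser could use the ConfigurationPath helpers from Microsoft.Extensions.Configuration instead.

```csharp
using System.Collections.Generic;
using System.Linq;

public class ContextTracker
{
    private readonly Stack<string> _context = new Stack<string>();

    // The full setting key for the node currently being visited.
    public string CurrentPath { get; private set; } = string.Empty;

    // Called when descending into a mapping key or a sequence index.
    public void EnterContext(string context)
    {
        _context.Push(context);
        CurrentPath = BuildPath();
    }

    // Called when leaving the node again, restoring the parent's path.
    public void ExitContext()
    {
        _context.Pop();
        CurrentPath = BuildPath();
    }

    // A Stack<T> enumerates from the most recently pushed item, so the
    // segments are reversed to get the outermost key first.
    private string BuildPath()
    {
        return string.Join(":", _context.Reverse());
    }
}
```

Entering "mapping1", then "mapping2b", then "0" gives a CurrentPath of "mapping1:mapping2b:0", and a single ExitContext call brings it back to "mapping1:mapping2b".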

Once all the nodes have been visited, the final Dictionary is returned, and we're done! To give a concrete example, consider the following YAML file:

key1: value1
mapping1: 
  mapping2a: 
    inside: value2
  mapping2b:
  - seq1
  - seq2
a_sequence: 
- a_mapping: 
    inner: value3

Once every node has been visited, we would have a dictionary with the following entries:

new Dictionary<string, string> {
  {"key1", "value1"},
  {"mapping1:mapping2a:inside", "value2"},
  {"mapping1:mapping2b:0", "seq1"},
  {"mapping1:mapping2b:1", "seq2"},
  {"a_sequence:0:a_mapping:inner", "value3"},
}
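
To make the traversal concrete without pulling in YamlDotNet, here is a simplified sketch that walks an equivalent tree of plain .NET collections. It is a stand-in under stated assumptions, not the real parser: mappings are modelled as Dictionary<string, object>, sequences as List<object>, and scalars as strings, and recursion with an accumulated path replaces the explicit context stack. NestedFlattener and BuildExample are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class NestedFlattener
{
    public static IDictionary<string, string> Flatten(IDictionary<string, object> root)
    {
        var data = new SortedDictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (var pair in root)
        {
            Visit(pair.Key, pair.Value, data);
        }
        return data;
    }

    private static void Visit(string path, object node, IDictionary<string, string> data)
    {
        switch (node)
        {
            case IDictionary<string, object> mapping:
                // Each mapping key becomes another segment of the path.
                foreach (var pair in mapping)
                {
                    Visit(path + ":" + pair.Key, pair.Value, data);
                }
                break;
            case IList<object> sequence:
                // Sequence items are keyed by their zero-based index.
                for (int i = 0; i < sequence.Count; i++)
                {
                    Visit(path + ":" + i, sequence[i], data);
                }
                break;
            default:
                // A scalar is a leaf: record it under the accumulated path.
                if (data.ContainsKey(path))
                {
                    throw new FormatException("The key '" + path + "' is duplicated.");
                }
                data[path] = Convert.ToString(node);
                break;
        }
    }

    // Builds the same structure as the YAML example above.
    public static IDictionary<string, object> BuildExample()
    {
        return new Dictionary<string, object>
        {
            ["key1"] = "value1",
            ["mapping1"] = new Dictionary<string, object>
            {
                ["mapping2a"] = new Dictionary<string, object> { ["inside"] = "value2" },
                ["mapping2b"] = new List<object> { "seq1", "seq2" }
            },
            ["a_sequence"] = new List<object>
            {
                new Dictionary<string, object>
                {
                    ["a_mapping"] = new Dictionary<string, object> { ["inner"] = "value3" }
                }
            }
        };
    }
}
```

Calling NestedFlattener.Flatten(NestedFlattener.BuildExample()) should produce the same five entries as the dictionary listed above.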

The builder extension methods

We now have all the pieces that are required to load and provide configuration values from a YAML file. However, the new configuration system makes heavy use of extension methods to enable a fluent configuration experience. In keeping with this, we will add a few extension methods to IConfigurationBuilder to allow you to easily add a YAML source.


using System;
using System.IO;
using Microsoft.Extensions.FileProviders;
using Microsoft.Extensions.Configuration;

public static class YamlConfigurationExtensions
{
    public static IConfigurationBuilder AddYamlFile(this IConfigurationBuilder builder, string path)
    {
        return AddYamlFile(builder, provider: null, path: path, optional: false, reloadOnChange: false);
    }

    public static IConfigurationBuilder AddYamlFile(this IConfigurationBuilder builder, string path, bool optional)
    {
        return AddYamlFile(builder, provider: null, path: path, optional: optional, reloadOnChange: false);
    }

    public static IConfigurationBuilder AddYamlFile(this IConfigurationBuilder builder, string path, bool optional, bool reloadOnChange)
    {
        return AddYamlFile(builder, provider: null, path: path, optional: optional, reloadOnChange: reloadOnChange);
    }

    public static IConfigurationBuilder AddYamlFile(this IConfigurationBuilder builder, IFileProvider provider, string path, bool optional, bool reloadOnChange)
    {
        if (provider == null && Path.IsPathRooted(path))
        {
            provider = new PhysicalFileProvider(Path.GetDirectoryName(path));
            path = Path.GetFileName(path);
        }
        var source = new YamlConfigurationSource
        {
            FileProvider = provider,
            Path = path,
            Optional = optional,
            ReloadOnChange = reloadOnChange
        };
        builder.Add(source);
        return builder;
    }
}

These overloads all mirror the AddJsonFile equivalents you will likely have already used. The first three overloads of AddYamlFile simply delegate to the final overload, passing in default values for the optional parameters. In the final overload, if no IFileProvider was supplied and the path is rooted (absolute), we create a PhysicalFileProvider for the file's directory and reduce the path to just the file name. We then set up our YamlConfigurationSource with the provided options, add it to the collection of IConfigurationSource in IConfigurationBuilder, and return the builder itself to allow the fluent configuration style.

Putting it all together

We now have all the pieces required to load application settings from YAML files! If you have created your own custom configuration provider in a class library, you need to include a reference to it in the project.json of your web application. If you just want to use the public YamlConfigurationProvider described here, you can pull it from NuGet using:

{
  "dependencies": {
    "NetEscapades.Configuration.Yaml": "1.0.3"
  }
}

Finally, use the extension method in your Startup configuration!

public Startup(IHostingEnvironment env)
{
    var builder = new ConfigurationBuilder()
        .SetBasePath(env.ContentRootPath)
        .AddYamlFile("my_required_settings.yml", optional: false)
        .AddJsonFile("appsettings.json", optional: true, reloadOnChange: true)
        .AddJsonFile($"appsettings.{env.EnvironmentName}.json", optional: true)
        .AddEnvironmentVariables();
    Configuration = builder.Build();
}

public IConfigurationRoot Configuration { get; }

In the configuration above, you can see we have added a YAML file to the start of our configuration pipeline, in which we load a required my_required_settings.yml file. This can be used to give us default setting values which can then be overwritten by our JSON files if required.

As mentioned before, all the code for this setup is on GitHub and NuGet, so feel free to check it out. If you find any bugs or issues, please do let me know.

Happy coding!
