Tag Archives: Docker

.NET C# CI/CD in Docker

Works on my machine-as-a-service

When building software in the modern workplace, you want to automatically test and statically analyse your code before pushing it to production. This means that rather than tens of test environments and an army of manual testers, you have a bunch of automation that runs as close as possible to when the code is written. Tests are run, the percentage of code not covered by automated tests is calculated, test results are published to the build server user interface (so that in the event that – heaven forbid – tests are broken, the developer gets as much detail as possible to resolve the problem), and static analysis of the built piece of software is performed to make sure no known problematic code has been introduced by ourselves, while also verifying that the dependencies we pull in are free from known vulnerabilities.

The classic Dockerfile you get when an ASP.NET Core Web API project is created features a multi-stage build layout where an initial layer includes the full .NET SDK, and this is where the code is built and published. The next layer is based on the lightweight ASP.NET Core runtime; the output directory from the build layer is copied here and the entrypoint is configured so that the website starts when you run the finished Docker image.

Even tried multi

Multi-stage builds were a huge deal when they were introduced. You get one Docker image that only contains the things you need; any source code is safely binned off in other layers that – sure – are cached, but don't exist outside the local Docker host on the build agent. If you then push the finished image to a repository, none of the source will come along. In the before times you had to solve this with multiple Dockerfiles, which is quite undesirable. You want high cohesion but low coupling, and fiddling with multiple Dockerfiles when doing things like upgrading versions does not give you a premium experience and invites errors to an unnecessary degree.

Where is the evidence?

Now, when you go to Azure DevOps, GitHub Actions or CircleCI to find out what went wrong with your build, the test results are available because the test runner has produced output in a format that the build service can understand. If your test runner is not forthcoming with that information, all you will know is “computer says no” and you will have to trawl through console output – if there is any – and that is not the way to improve your day.

So – what do we need? Well, we need the formatted test output. Luckily dotnet test will give it to us if we ask it nicely.
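
Asking nicely looks something like this when run outside of Docker – the project path mirrors the Dockerfile below, and the output paths and file names are otherwise illustrative:

dotnet test test/classlib.tests/classlib.tests.csproj \
  --results-directory ./testresults \
  --logger "trx;LogFileName=test_results.xml"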

The only problem is that those files will stay on the image that we are binning – you know multistage builds and all that – since we don’t want these files to show up in the finished supposedly slim article.

Old world Docker

When a Docker image is built, every relevant change will create a new layer, and eventually a final image will be created and published that is an amalgamation of all the constituent layers. In the olden days, the legacy builder would cache all of the intermediate layers and publish a hash in the output so that you could refer back to intermediate layers should you so choose.

This seems like the perfect way of forensically finding the test result files we need. Let’s add a LABEL so that we can find the correct layer after the fact, copy the test data output and push it to the build server.

FROM mcr.microsoft.com/dotnet/aspnet:7.0-bullseye-slim AS base
WORKDIR /app
FROM mcr.microsoft.com/dotnet/sdk:7.0-bullseye-slim AS build
WORKDIR /
COPY ["src/webapp/webapp.csproj", "/src/webapp/"]
COPY ["src/classlib/classlib.csproj", "/src/classlib/"]
COPY ["test/classlib.tests/classlib.tests.csproj", "/test/classlib.tests/"]
# restore for all projects
RUN dotnet restore src/webapp/webapp.csproj
RUN dotnet restore src/classlib/classlib.csproj
RUN dotnet restore test/classlib.tests/classlib.tests.csproj
COPY . .
# test
# install the report generator tool
RUN dotnet tool install dotnet-reportgenerator-globaltool --version 5.1.20 --tool-path /tools
RUN dotnet test --results-directory /testresults --logger "trx;LogFileName=test_results.xml" /p:CollectCoverage=true /p:CoverletOutputFormat=cobertura /p:CoverletOutput=/testresults/coverage/ /test/classlib.tests/classlib.tests.csproj
LABEL test=true
# generate html reports using report generator tool
RUN /tools/reportgenerator "-reports:/testresults/coverage/coverage.cobertura.xml" "-targetdir:/testresults/coverage/reports" "-reporttypes:HTMLInline;HTMLChart"
RUN ls -la /testresults/coverage/reports
 
ARG BUILD_TYPE="Release" 
RUN dotnet publish src/webapp/webapp.csproj -c $BUILD_TYPE -o /app/publish
# Package the published code as a zip file, perhaps? Push it to a SAST?
# Bottom line is, anything you want to extract forensically from this build
# process is done in the build layer.
FROM base AS final
WORKDIR /app
COPY --from=build /app/publish .
ENTRYPOINT ["dotnet", "webapp.dll"]

The way you would leverage this test output is by fishing the temporary layer out of the cache and creating a container from it, from which you can do plain file operations.

# docker images --filter "label=test=true"
REPOSITORY   TAG       IMAGE ID       CREATED          SIZE
<none>       <none>    0d90f1a9ad32   40 minutes ago   3.16GB
# export id=$(docker images --filter "label=test=true" -q | head -1)
# docker create --name testcontainer $id
# docker cp testcontainer:/testresults ./testresults
# docker rm testcontainer

All our problems are solved. Wrap this in a script and you’re done. I did, I mean they did, I stole this from another blog.
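
A minimal sketch of such a script – assuming the legacy builder this section is about, with the image tag and Dockerfile path being illustrative – might look like this:

#!/usr/bin/env bash
set -euo pipefail

# Build with the legacy builder so the intermediate layers stay around in the local cache
DOCKER_BUILDKIT=0 docker build -t webapp:latest -f src/webapp/Dockerfile .

# Find the most recent intermediate image carrying the test label
id=$(docker images --filter "label=test=true" -q | head -1)

# Create a throwaway container from it and copy the test results out
docker create --name testcontainer "$id"
docker cp testcontainer:/testresults ./testresults
docker rm testcontainer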

Unfortunately keeping an endless archive of temporary, orphaned layers became a performance and storage bottleneck for docker, so – sadly – the Modern Era began with some optimisations that rendered this method impossible.

The Modern Era of BuildKit

Since intermediate layers are mostly useless, letting them fall by the wayside and focusing on the actual output was deemed much more efficient by the powers that be. Using multi-stage Dockerfiles to additionally produce test output was not recommended or recognised as a valid use case.

So what to do? Well – there is a newer command called docker buildx bake that lets you run docker build for multiple images, or – most importantly – build multiple targets from the same Dockerfile.

This means you can run one build all the way through to produce the final lightweight image and also have a second run that saves the intermediate image full of test results. Obviously the Docker cache will make sure nothing is actually built twice; the second run is just about picking the layer out of the cache and making it accessible.

The Correct way of using bake is to write a bake file in HCL format:

group "default" {
  targets = [ "webapp", "webapp-test" ]
}
target "webapp" {
  output = [ "type=docker" ]
  dockerfile = "src/webapp/Dockerfile"
}
target "webapp-test" {
  output = [ "type=image" ]
  dockerfile = "src/webapp/Dockerfile"
  target = "build"
} 

If you run docker buildx bake -f docker-bake.hcl, you will be able to fish out the intermediate test layer using the method described above.

Conclusion

So – using this mechanism you get a minimal number of Dockerfiles, you get all the build gubbins happening inside Docker – giving you freedom from whatever limitations plague your build agent – yet the bloated mess that is the build process is automagically discarded and forgotten as you march on into your bright future with a lightweight finished image.

You can have nice things

I have come across a few things that are legitimately pleasant to use, so I thought I should collate them here to aid my aging memory. Dear reader, I am not attempting to copy Scott Hanselman’s tools list, I am stealing the concept.

Github Actions

Yeah, not some revolutionary thing I just uncovered that you have never heard of before, but still. It’s pretty great. Out of all the yet-another-yet-another-markup-language-configuration-file-to-configure-a-thing tools that exist to help you orchestrate builds, I personally find GitHub Actions the least weirdly magical and the easiest to live with, but then I’ve only tried CircleCI, Azure DevOps/TFS and TeamCity.

Pulumi – Infrastructure as code

Write your infrastructure code in C# using Pulumi. It supports Azure, AWS, Google Cloud and Kubernetes, but – as I’ve ranted about before – this shouldn’t be taken as a way to do multi-cloud; the object hierarchy is still very bespoke to each cloud provider. That said, you can mix and match providers in a stack: let’s say you have your DNS hosted in DNSimple but your cloud compute bits in Azure. You would otherwise be stuck doing a lot of bash scripting to glue that together, but Pulumi lets you write one C# program that describes all of your infra, mostly.
You will recognise the feel of using it from Chef: you write code that describes the infrastructure, but the actual construction does not happen as the code runs – first the description is made, then the desired state is compared to the actual running state, and adjustments are made. Under the hood many of its providers are bridged from Terraform providers, but it does what it says on the tin.
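
To give a flavour, a minimal Pulumi stack in C# might look roughly like the sketch below – it assumes the classic Pulumi.Azure provider package, and the resource names and region are made up:

using System.Threading.Tasks;
using Pulumi;
using Pulumi.Azure.Core;
using Pulumi.Azure.Storage;

class MyStack : Stack
{
    public MyStack()
    {
        // Declare the desired state; Pulumi works out what needs creating or changing
        var resourceGroup = new ResourceGroup("my-rg", new ResourceGroupArgs
        {
            Location = "WestEurope",
        });

        var storage = new Account("mystorage", new AccountArgs
        {
            ResourceGroupName = resourceGroup.Name,
            Location = resourceGroup.Location,
            AccountTier = "Standard",
            AccountReplicationType = "LRS",
        });

        // Outputs are only resolved once the deployment has actually run
        ConnectionString = storage.PrimaryConnectionString;
    }

    [Output]
    public Output<string> ConnectionString { get; set; }
}

class Program
{
    static Task<int> Main() => Deployment.RunAsync<MyStack>();
}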

MinVer – automagic versioning for .NET Core

At some point you will have written some build chain hack to populate attributes on your assembly, stamping a brand on the binary so you can display a version on your site that you can track back to a specific commit. The simplest way of doing this, without needing to change branching strategy or write custom code, is MinVer.

It literally browses through your commit history to find your version tags and then increments that version by the number of commits since the tagged commit. It is what I dreamed would be out there when I started looking. It is genius.

A couple of gotchas: it relies – duh – on having access to the git history, so you need to remember to remove .git from your .dockerignore file, or else your dotnet publish inside docker build will fail to locate any version information. Obviously, unless you intend to release all versions of your source code in the Docker image, make sure you have a staged Docker build – this is the default in recent Visual Studio templates – but still. I encourage you in any case to open a shell in your finished Docker image using docker run -it --entrypoint sh imagename:tag to check that it contains what you expect.

Also, in your GitHub Actions workflow you will need to allow for a deeper fetch depth so that MinVer has enough history to calculate the version number, but that is mentioned in the documentation. I already used a tag prefix ‘v’ for my versions, so I had to add that to my project files. No problems, it just worked. Very impressed.
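
For reference, the wiring amounts to roughly this in the project file – the package version here is illustrative – plus fetch-depth: 0 on the actions/checkout step so that the whole history and its tags are available to MinVer:

<ItemGroup>
  <PackageReference Include="MinVer" Version="4.3.0" PrivateAssets="All" />
</ItemGroup>
<PropertyGroup>
  <MinVerTagPrefix>v</MinVerTagPrefix>
</PropertyGroup>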

WSL 2 in anger

I have previously written about the Windows Subsystem for Linux. As a recap, it comes in two flavours: one built on the concept of pico processes, marshalling the Linux ABI into Win32 API calls (WSL1), and an actual Linux kernel hosted in a lightweight Hyper-V installation (WSL2). Both types have file system integration and a fairly transparent command line interface for running Linux commands from Windows and Windows executables from the Linux command line. But, beyond the headline stuff, how does it work in real life?

Of course with WSL1 there are compatibility issues, but the biggest problem is horrifyingly slow Linux file system performance, because it is Windows NTFS pretending to be ext4. Since NTFS is slow on small files, you can imagine that an operating system whose main feature is being an immense collection of small files working together would run slowly on top of a filesystem with those characteristics.

With WSL2, kernel compatibility is obviously 100% – it is, after all, a real Linux kernel – and the Linux file system stuff Just Works, as the file system is managed natively (albeit over the hypervisor). Ironically, though, the /mnt filesystem with the Windows drives mounted is prohibitively slow. This has been said to be a bug, and it has allegedly been fixed, but given that we are – at the end of the day – talking about accessing local PCIe gen 4 NVMe storage, managing to make file I/O this slow betrays plenty of room for improvement. To summarise: if you want to do Linuxy things in Linux under Windows, use WSL2; if you want to do Windowsy things in Linux under Windows, use WSL1. Do with that what you will. WSL2 being based on a proper VM means that, despite huge efforts, the networking story is not super smooth; no proper mechanism exists to make things easier for you, and no hits on Google will actually address the fundamental problem.

That is to say, I can run a website I have built in Docker in WSL2, but I need to do a lot of digging to figure out what IP the site got, and a lot of firewall stuff to be able to reach it. Also, running X Window with the excellent X410 server requires a lot of bespoke scripting because there is no way of setting up the networking to just work on start-up. You would seriously think that a sensible bridging default could have been brought in to make things a lot more palatable. After all, all I want to do is road test my .NET Core APIs and apps in Docker before pushing them. That doesn’t seem too extreme a use case.

To clarify – running or debugging a .NET Core Linux website from Visual Studio Code (with the WSL2 backend) works seamlessly, absolutely seamlessly. My only gripe is that because of the networking issue, I cannot really verify Docker things in WSL2, which I had surmised was the whole point of WSL2 over WSL1.

Database Integration Testing

Testing your SQL queries is as important as testing any other piece of logic. Unless you only do trivial reads and writes, presumably some of your logic will be implemented at least partly in the form of a query, and you would like to validate that logic just like any other.

Overview

For this you need database integration tests. There are multiple strategies for this (in-memory databases, additional abstractions and mocks, or creating a temporary but real database, just to name a few), but in this post I will discuss running a Linux SQL Server Docker image, applying all migrations to it from scratch and then running tests on top of it.

Technology choice is beyond the scope of this text. I use .NET Core 3.1, xUnit and legacy C# because I know it already, and because my F# is not idiomatic enough for me not to go off on tangents and end up writing a monad tutorial instead. I have used MySQL/MariaDB before and I will never use it for anything I care about. I have tried Postgres and I like it – it is a proper database system – but again, I am not familiar enough with it for my purposes this time. To reiterate, this post is based on using C# on .NET Core 3.1 over MSSQL Server, and the tests will be run on push using GitHub Actions.

My development machine is really trying, OK, so let us cut it some slack. Anyway, I have Windows 10 insider something, with WSL2 and Docker Desktop for WSL2 on it. I run Ubuntu 20.04 in WSL2, dist-upgraded from 18.04. I develop the code in VS2019 Community on Windows, obviously.

Problem

This is simple: when a commit is made to the part of the repository that contains DbUp SQL scripts, related production code or these tests, I want to trigger tests that verify that my SQL migrations are valid, and when SQL queries change, I want those changes verified against a real database server.

I do not like docker, especially docker-compose. It seems to me it has been designed by people that don’t know what they are on about. Statistically that cannot be the case, since there are tens of thousands of docker-compose users that do magical things, but I have wasted enough time, so like Seymour Skinner I proclaim, “no, it is the children that are wrong!”, and I thus need to find another way of running an ad hoc SQL Server.

All CI stuff and production hosting of this system is Linux based, but Visual Studio is stuck in Windows, so I need a way to be able to trigger these tests in a cross platform way.

Clues

I found an article by Jeremy D Miller that describes how to use a .NET client for the Docker API to automatically run an MSSQL database server. I made some hacky mods:

internal class SqlServerContainer : IDockerServer
{
    public SqlServerContainer() : base("microsoft/mssql-server-linux:latest", "dev-mssql")
    {
        // My production code uses some custom types that Dapper needs
        // handlers for. Registering them here seems to work
        SqlMapper.AddTypeHandler(typeof(CustomType), CustomTypeHandler.Default);
    }

    public static readonly string ConnectionString = "Data Source=127.0.0.1,1436;User Id=sa;Password=AJ!JA!J1aj!JA!J;Timeout=5";

    // Gotta wait until the database is really available
    // or you'll get oddball test failures;)
    protected override async Task<bool> isReady()
    {
        try
        {
            using (var conn =
                new SqlConnection(ConnectionString))
            {
                await conn.OpenAsync();

                return true;
            }
        }
        catch (Exception)
        {
            return false;
        }
    }

    // Watch the port mapping here to avoid port
    // contention w/ other Sql Server installations
    public override HostConfig ToHostConfig()
    {
        return new HostConfig
        {
            PortBindings = new Dictionary<string, IList<PortBinding>>
            {
                {
                    "1433/tcp",
                    new List<PortBinding>
                    {
                        new PortBinding
                        {
                            HostPort = $"1436",
                            HostIP = "127.0.0.1"
                        }

                    }
                }
            },

        };
    }

    public override Config ToConfig()
    {
        return new Config
        {
            Env = new List<string> { "ACCEPT_EULA=Y", "SA_PASSWORD=AJ!JA!J1aj!JA!J", "MSSQL_PID=Developer" }
        };
    }

    public async static Task RebuildSchema(IDatabaseSchemaEnforcer enforcer, string databaseName)
    {
        using (var conn = new SqlConnection($"{ConnectionString};Initial Catalog=master"))
        {
            await conn.ExecuteAsync($@"
                IF DB_ID('{databaseName}') IS NOT NULL
                BEGIN
                    DROP DATABASE {databaseName}
                END
            ");
        }
        await enforcer.EnsureSchema($"{ConnectionString};Initial Catalog={databaseName}");
    }
}

I then cheated by reading the documentation for DbUp and combined the SQL Server schema creation with the docker image starting code to produce the witch’s brew below.

internal class APISchemaEnforcer : IDatabaseSchemaEnforcer
{
    private readonly IMessageSink _diagnosticMessageSink;

    public APISchemaEnforcer(IMessageSink diagnosticMessageSink)
    {
        _diagnosticMessageSink = diagnosticMessageSink;
    }

    public Task EnsureSchema(string connectionString)
    {
        EnsureDatabase.For.SqlDatabase(connectionString);
        var upgrader =
            DeployChanges.To
                .SqlDatabase(connectionString)
                .WithScriptsEmbeddedInAssembly(Assembly.GetAssembly(typeof(API.DbUp.Program)))
                .JournalTo(new NullJournal())
                .LogTo(new DiagnosticSinkLogger(_diagnosticMessageSink))
                .Build();
        var result = upgrader.PerformUpgrade();
        if (!result.Successful)
        {
            // Surface a failed migration immediately instead of letting tests fail obscurely later
            throw result.Error;
        }
        return Task.CompletedTask;
    }
}

When DbUp runs, it will output the names of all the scripts it executes to the console, so we need to make sure this type of information actually ends up being logged, despite it being diagnostic output. There are two parts to that: first we need to use an IMessageSink to write the diagnostic logs from DbUp so that xUnit becomes aware of the information, and secondly we must add a configuration file to the integration test project so that xUnit chooses to print the messages to the console.

Our message sink diagnostic logger is plumbed into DbUp as you can see in the previous example, and here is the implementation:

internal class DiagnosticSinkLogger : IUpgradeLog
{
    private IMessageSink _diagnosticMessageSink;

    public DiagnosticSinkLogger(IMessageSink diagnosticMessageSink)
    {
        _diagnosticMessageSink = diagnosticMessageSink;
    }

    public void WriteError(string format, params object[] args)
    {
        var message = new DiagnosticMessage(format, args);
        _diagnosticMessageSink.OnMessage(message);
    }

    public void WriteInformation(string format, params object[] args)
    {
        var message = new DiagnosticMessage(format, args);
        _diagnosticMessageSink.OnMessage(message);
    }

    public void WriteWarning(string format, params object[] args)
    {
        var message = new DiagnosticMessage(format, args);
        _diagnosticMessageSink.OnMessage(message);
    }
}

Telling XUnit to print diagnostic information is done through a file in the root of the integration test project called xunit.runner.json, and it needs to look like this:

{
  "$schema": "https://xunit.net/schema/current/xunit.runner.schema.json",
  "diagnosticMessages": true
}

If you started out with Jeremy’s example and have followed along, applying my tiny changes, you may or may not be up and running by now. I had an additional problem – developing on Windows while running CI on Linux. I solved this with another well-judged hack:

public abstract class IntegrationFixture : IAsyncLifetime
{
    private readonly IDockerClient _client;
    private readonly SqlServerContainer _container;

    public IntegrationFixture()
    {
        _client = new DockerClientConfiguration(GetEndpoint()).CreateClient();
        _container = new SqlServerContainer();
    }

    private Uri GetEndpoint()
    {
        return RuntimeInformation.IsOSPlatform(OSPlatform.Windows)
            ? new Uri("tcp://localhost:2375")
            : new Uri("unix:///var/run/docker.sock");
    }

    public async Task DisposeAsync()
    {
        await _container.Stop(_client);
    }

    protected string GetConnectionString() => $"{SqlServerContainer.ConnectionString};Initial Catalog={DatabaseName}";
        
    protected abstract IDatabaseSchemaEnforcer SchemaEnforcer { get; }
    protected abstract string DatabaseName { get; }

    public async Task InitializeAsync()
    {
        await _container.Start(_client);
        await SqlServerContainer.RebuildSchema(SchemaEnforcer, DatabaseName);
    }

    public SqlConnection GetConnection() => new SqlConnection(GetConnectionString());
}

The point is basically: if you are executing on Linux, use the Unix socket, but if you are stuck on Windows, try TCP (which requires Docker Desktop to expose the daemon on localhost:2375).
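
To show how the fixture is meant to be consumed, a concrete subclass and a test class wired up through xUnit’s IClassFixture might look roughly like this – the class names, the database name and the trivial query are made up for illustration:

using System.Threading.Tasks;
using Dapper;
using Xunit;
using Xunit.Abstractions;

public class ApiIntegrationFixture : IntegrationFixture
{
    private readonly IMessageSink _diagnosticMessageSink;

    // xUnit can inject the diagnostic IMessageSink into fixture constructors,
    // which is exactly what APISchemaEnforcer needs for its DbUp logging
    public ApiIntegrationFixture(IMessageSink diagnosticMessageSink)
    {
        _diagnosticMessageSink = diagnosticMessageSink;
    }

    protected override IDatabaseSchemaEnforcer SchemaEnforcer => new APISchemaEnforcer(_diagnosticMessageSink);
    protected override string DatabaseName => "api_tests";
}

public class ApiDatabaseTests : IClassFixture<ApiIntegrationFixture>
{
    private readonly ApiIntegrationFixture _fixture;

    public ApiDatabaseTests(ApiIntegrationFixture fixture)
    {
        _fixture = fixture;
    }

    [Fact]
    public async Task Database_is_reachable_after_migrations()
    {
        // The fixture has already started the container and run all migrations
        using var conn = _fixture.GetConnection();
        var result = await conn.ExecuteScalarAsync<int>("SELECT 1");
        Assert.Equal(1, result);
    }
}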

Github Action

After having a single test – to my surprise – actually pass locally, having created the entire database from scratch, I thought it was time to think about the CI portion of this adventure. I had no idea whether GitHub Actions would allow me to just pull down Docker images, but I thought “probably not”. Still, I created the yaml, because nobody likes a coward:

# This is a basic workflow to help you get started with Actions

name: API Database tests

# Controls when the action will run. Triggers the workflow on push or pull request
# events but only for the master branch
on:
  push:
    branches: [ master ]
    paths: 
      - '.github/workflows/thisaction.yml'
      - 'test/API.DbUp.Tests/*'
      - 'src/API.DbUp/*'
      - 'src/API/*'
  pull_request:
    branches: [ master ]
    paths: 
      - '.github/workflows/thisaction.yml'
      - 'test/API.DbUp.Tests/*'
      - 'src/API.DbUp/*'
      - 'src/API/*'

# A workflow run is made up of one or more jobs that can run sequentially or in parallel
jobs:
  # This workflow contains a single job called "test"
  test:
    # The type of runner that the job will run on
    runs-on: ubuntu-latest

    # Steps represent a sequence of tasks that will be executed as part of the job
    steps:
    # Checks-out your repository under $GITHUB_WORKSPACE, so your job can access it
    - uses: actions/checkout@v2

    # Runs a single command using the runners shell
    - name: Run .NET Core CLI tests
      run: |
        echo Run tests based on docker. Bet u twenty quid this will fail
        dotnet test test/API.DbUp.Tests/API.DbUp.Tests.csproj

You can determine, based on the echo line above (“Bet u twenty quid this will fail”), the level of surprise and elation I felt when, after I committed and pushed, GitHub chugged through, downloaded the mssql Docker image, recreated my schema, ran the test and returned a success message. I am still in shock.

So what now?

Like Jeremy discusses in his post, the problem with database integration tests is that you want to get a lot of assertions out of each time you create your database, due to how expensive that is. In order to do so, and to procrastinate a little, I created a nifty little piece of code to keep track of the test data I create in each function, so that I can run tests independently of each other and clean up almost automatically using a Stack<T>.

I created little helper functions that would create domain objects when setting up tests. Each test would at the beginning create a Stack<RevertAction> and pass it into each helper function while setting up the tests, and each helper function would push a new RevertAction($"DELETE FROM ThingA WHERE id = {IDofThingAIJustCreated}") onto that stack. At the end of each test, I would invoke the Revert extension method on the stack and pass it some context so that it can access the test database and output test logging if necessary.

public class RevertAction
{
    string _sqlCommandText;

    public RevertAction(string sqlCommandText)
    {
        _sqlCommandText = sqlCommandText;
    }

    public async Task Execute(IntegrationFixture fixture, ITestOutputHelper output)
    {
        using var conn = fixture.GetConnection();
        try
        {
            await conn.ExecuteAsync(_sqlCommandText);
        }
        catch(Exception ex)
        {
            output.WriteLine($"Revert action failed: {_sqlCommandText}");
            output.WriteLine($"Exception: {ex.Message}");
            output.WriteLine($"{ex.ToString()}");
            throw;
        }

    }
}

The revert method is extremely simple:

public static class StackExtensions
{
    public static async Task Revert(this Stack<RevertAction> actions, IntegrationFixture fixture, ITestOutputHelper output)
    {
        while (actions.Any())
        {
            var action = actions.Pop();
            await action.Execute(fixture, output);
        }
    }
}
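
Putting it together, a test using the stack might look something like the sketch below, reusing the hypothetical ApiIntegrationFixture from earlier – the InsertThingA helper and the ThingA table are made up for the example:

using System.Collections.Generic;
using System.Threading.Tasks;
using Dapper;
using Xunit;
using Xunit.Abstractions;

public class ThingATests : IClassFixture<ApiIntegrationFixture>
{
    private readonly ApiIntegrationFixture _fixture;
    private readonly ITestOutputHelper _output;

    public ThingATests(ApiIntegrationFixture fixture, ITestOutputHelper output)
    {
        _fixture = fixture;
        _output = output;
    }

    // Hypothetical helper: creates a row and pushes the matching clean-up action
    private async Task<int> InsertThingA(Stack<RevertAction> revertActions, string name)
    {
        using var conn = _fixture.GetConnection();
        var id = await conn.ExecuteScalarAsync<int>(
            "INSERT INTO ThingA (Name) OUTPUT INSERTED.Id VALUES (@name)", new { name });
        revertActions.Push(new RevertAction($"DELETE FROM ThingA WHERE id = {id}"));
        return id;
    }

    [Fact]
    public async Task Can_read_back_a_thing()
    {
        var revertActions = new Stack<RevertAction>();
        var id = await InsertThingA(revertActions, "test thing");

        using (var conn = _fixture.GetConnection())
        {
            var name = await conn.ExecuteScalarAsync<string>(
                "SELECT Name FROM ThingA WHERE id = @id", new { id });
            Assert.Equal("test thing", name);
        }

        // Clean up everything this test created, in reverse order
        await revertActions.Revert(_fixture, _output);
    }
}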

So – that was it. The things I put in this blog post were the hardest for me to figure out; the rest is just a question of maintaining database integration tests, and that is very implementation-specific, so I leave that up to you.