Works on my machine-as-a-service
When building software in the modern workplace, you want to automatically test and statically analyse your code before pushing code to production. This means that rather than tens of test environments and an army of manual testers you have a bunch of automation that runs as close to when the code is written. Tests are run, the rate of how much code is not covered by automated tests is calculated, test results are published to the build server user interface (so that in the event that -heaven forbid – tests are broken, the developer gets as much detail as possible to resolve the problem) and static analysis of the built piece of software is performed to make sure no known problematic code has been introduced by ourselves, and also verifying that dependencies included are free from known vulnerabilities.
The classic dockerfile added by C# when an ASP.NET Core Web Api project is started features a multi stage build layout where an initial layer includes the full C# SDK, and this is where the code is built and published. The next layer is based on the lightweight .NET Core runtime, and the output directory from the build layer is copied here and the entrypoint is configured so that the website starts when you run the finished docker image.
Even tried multi
Multistage builds were a huge deal when they were introduced. You get one docker image that only contains the things you need, any source code is safely binned off in other layers that – sure – are cached, but don’t exist outside this local docker host on the build agent. If you then push the finished image to a repository, none of the source will come along. In the before times you had to solve this with multiple Dockerfiles, which is quite undesirable. You want to have high cohesion but low coupling, and fiddling with multiple Dockerfiles when doing things like upgrading versions does not give you a premium experience and invites errors to an unnecessesary degree.
Where is the evidence?
Now, when you go to Azure DevOps, GitHub Actions or CircleCI to find what went wrong with your build, the test results are available because the test runner has produced and provided output that can be understood by that particular test runner. If your test runner is not forthcoming with the information, all you will know is “computer says no” and you will have to trawl through console data – if that – and that is not the way to improve your day.
So – what – what do we need? Well we need the formatted test output. Luckily dotnet test
will give it to us if we ask it nicely.
The only problem is that those files will stay on the image that we are binning – you know multistage builds and all that – since we don’t want these files to show up in the finished supposedly slim article.
Old world Docker
When a docker image is built, every relevant change will create a new layer, and eventually a final image will be created and published that is an amalgamation of all consistuent layers. In the olden days, the legacy builder would cache all of the intermediate layers and publish a hash in the output so that you could refer back to intermediate layers should you so choose.
This seems like the perfect way of forensically finding the test result files we need. Let’s add a LABEL so that we can find the correct layer after the fact, copy the test data output and push it to the build server.
FROM mcr.microsoft.com/dotnet/aspnet:7.0-bullseye-slim AS base
WORKDIR /app
FROM mcr.microsoft.com/dotnet/sdk:7.0-bullseye-slim AS build
WORKDIR /
COPY ["src/webapp/webapp.csproj", "/src/webapp/"]
COPY ["src/classlib/classlib.csproj", "/src/classlib/"]
COPY ["test/classlib.tests/classlib.tests.csproj", "/test/classlib.tests/"]
# restore for all projects
RUN dotnet restore src/webapp/webapp.csproj
RUN dotnet restore src/classlib/classlib.csproj
RUN dotnet restore test/classlib.tests/classlib.tests.csproj
COPY . .
# test
# install the report generator tool
RUN dotnet tool install dotnet-reportgenerator-globaltool --version 5.1.20 --tool-path /tools
RUN dotnet test --results-directory /testresults --logger "trx;LogFileName=test_results.xml" /p:CollectCoverage=true /p:CoverletOutputFormat=cobertura /p:CoverletOutput=/testresults/coverage/ /test/classlib.tests/classlib.tests.csproj
LABEL test=true
# generate html reports using report generator tool
RUN /tools/reportgenerator "-reports:/testresults/coverage/coverage.cobertura.xml" "-targetdir:/testresults/coverage/reports" "-reporttypes:HTMLInline;HTMLChart"
RUN ls -la /testresults/coverage/reports
ARG BUILD_TYPE="Release"
RUN dotnet publish src/webapp/webapp.csproj -c $BUILD_TYPE -o /app/publish
# Package the published code as a zip file, perhaps? Push it to a SAST?
# Bottom line is, anything you want to extract forensically from this build
# process is done in the build layer.
FROM base AS final
WORKDIR /app
COPY --from=build /app/publish .
ENTRYPOINT ["dotnet", "webapp.dll"]
The way you would leverage this test output is by fishing out the remporary layer from the cache and assign it to a new image from which you can do plain file operations.
# docker images --filter "label=test=true"
REPOSITORY TAG IMAGE ID CREATED SIZE
<none> <none> 0d90f1a9ad32 40 minutes ago 3.16GB
# export id=$(docker images --filter "label=test=true" -q | head -1)
# docker create --name testcontainer $id
# docker cp testcontainer:/testresults ./testresults
# docker rm testcontainer
All our problems are solved. Wrap this in a script and you’re done. I did, I mean they did, I stole this from another blog.
Unfortunately keeping an endless archive of temporary, orphaned layers became a performance and storage bottleneck for docker, so – sadly – the Modern Era began with some optimisations that rendered this method impossible.
The Modern Era of BuildKit
Since intermediate layers are mostly useless, just letting them fall by the wayside and focus on actual output was much more efficient according to the forces that be. The use of multistage Dockerfiles to additionally produce test data output was not recommended or recognised as a valid use case.
So what to do? Well – there is a new command called docker bake that lets you do docker build on multiple docker images, or – most importantly – built targetting multiple targets on the same Dockerfile.
This means you can run one build all the way through to produce the final lightweight image and also have a second run that saves the intermediary image full of test results. Obviously the docker cache will make sure nothing is actually run twice, the second run is just about picking out the layer from the cache and making it accessible.
The Correct way of using bake is to format a bake file in HCL format:
group "default" {
targets = [ "webapp", "webapp-test" ]
}
target "webapp" {
output = [ "type=docker" ]
dockerfile = "src/webapp/Dockerfile"
}
target "webapp-test" {
output = [ "type=image" ]
dockerfile = "src/webapp/Dockerfile"
target = "build"
}
If you run this command line with docker buildx bake -f docker-bake.hcl
, you will be able to fish out the historic intermediary layer using the method described above.
Conclusion
So – using this mechanism you get a minimal number of dockerfiles, you get all the build guffins happening inside docker, giving you freedom from whatever limitations plague your build agent yet the bloated mess that is the build process will be automagically discarded and forgotten as you march on into your bright future with a lightweight finished image.