@Depechie
Last active January 26, 2024 09:32

Add observability to any microservices environment

Disclaimer, copyright and code

This workshop is licensed under CC BY-NC-SA 4.0 and should not be used commercially without permission.

Of course, the concepts we discuss can and should be used to improve the quality of your application's production environment; as such, all code samples are licensed under MIT.

Labs

Lab 0 - Environment setup

Install Visual Studio Code; this will be our file editor: https://code.visualstudio.com/

Install Docker Desktop; this will give us the docker CLI and also docker compose: https://docs.docker.com/desktop/

Note that there is a pricing model attached to Docker Desktop depending on where you use it! Details can be found at https://www.docker.com/pricing/

To verify that docker is installed correctly, open a terminal and run the following commands:

docker --version
docker run hello-world

Install the .NET 8 SDK; this will give us the dotnet CLI: https://dotnet.microsoft.com/en-us/download/dotnet/8.0

To verify that .NET is installed correctly, open a terminal and run the following command:

dotnet --version

Lab 1 - Monitoring

Services

Before we can actually start monitoring, we first need to create some services that will represent our simulated production environment. The current workshop will use .NET 8 WebAPIs, but the concepts can be applied to any technology stack.

To create a new WebAPI project, we will be using the .NET cli that we previously installed. Open a terminal and run the following command:

dotnet new webapi --name Catalog.API --use-program-main --use-controllers

This will scaffold a new WebAPI project in a folder called Catalog.API. It comes preconfigured with a weather controller that returns a weather forecast; we will change this code later during the workshop.
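For reference, the scaffolded folder should look roughly like this (the exact file list may vary slightly between SDK versions):

```
Catalog.API/
├── Controllers/
│   └── WeatherForecastController.cs
├── Properties/
│   └── launchSettings.json
├── appsettings.Development.json
├── appsettings.json
├── Catalog.API.csproj
├── Program.cs
└── WeatherForecast.cs
```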

Note: more info on the .NET CLI and the dotnet new command can be found at https://learn.microsoft.com/en-us/dotnet/core/tools/dotnet-new

When the scaffolding is done, open the project in Visual Studio Code. Start Visual Studio Code and select the option Open Folder and navigate to the Catalog.API folder.

Once the project is loaded, press F5 to start debugging. You can then navigate to https://localhost:<port-number>/weatherforecast to see the result of the weather forecast HTTP GET REST action.

When we want to try out the service during the workshop, to generate some traffic, we will use the Swagger UI. This can be found at the following URL:

https://localhost:<port-number>/swagger

Repeat the same process for the Basket.API project. So again open a terminal window and run the following command:

dotnet new webapi --name Basket.API --use-program-main --use-controllers

And verify that it is functioning correctly by navigating to its Swagger UI.

Now that the boilerplate code for the services is available, we will change the code to simulate a real world scenario. For both projects, delete the WeatherForecast.cs file in the root of the project and also the WeatherForecastController.cs file in the Controllers folder.

Let's continue with the Catalog.API, inside the Controllers folder, create a new file called CatalogController.cs and add the following code:

using Catalog.API.Models;
using Microsoft.AspNetCore.Mvc;

namespace Catalog.API.Controllers
{
    [Route("api/v1/[controller]")]
    [ApiController]
    public class CatalogController : ControllerBase
    {
        private List<Product> _products = new List<Product>();

        public CatalogController()
        {
            _products.Add(new Product("prod001", "Modern chair")
            {
                ImageURL = "assets/img/product-img/product1.jpg",
                HoverImageURL = "assets/img/product-img/product2.jpg",
                ThumbnailImageURL = "assets/img/bg-img/cart1.jpg",
                Price = 180,
                Rating = 4
            });

            _products.Add(new Product("prod002", "Vintage desk")
            {
                ImageURL = "assets/img/product-img/product4.jpg",
                HoverImageURL = "assets/img/product-img/product3.jpg",
                ThumbnailImageURL = "assets/img/bg-img/cart2.jpg",
                Price = 250,
                Rating = 4
            });

            _products.Add(new Product("prod003", "Lounge chair")
            {
                ImageURL = "assets/img/product-img/product5.jpg",
                HoverImageURL = "assets/img/product-img/product6.jpg",
                ThumbnailImageURL = "assets/img/bg-img/cart3.jpg",
                Price = 300,
                Rating = 4
            });
        }

        [HttpGet]
        [Route("items")]
        public IActionResult Items()
        {
            return Ok(_products);
        }

        [HttpGet]
        [Route("items/{id}")]
        public IActionResult GetItem(string id)
        {
            return Ok(_products.FirstOrDefault(i => string.Equals(i.Id, id, StringComparison.InvariantCultureIgnoreCase)));
        }
    }
}

In the project root, create a new folder called Models and add a new file called Product.cs with the following code:

namespace Catalog.API.Models
{
    public class Product
    {
        public string Id { get; set; }
        public string Name { get; set; }
        public string? ImageURL { get; set; }
        public string? HoverImageURL { get; set; }
        public string? ThumbnailImageURL { get; set; }
        public decimal? Price { get; set; }
        public int Rating { get; set; } = 0;

        public Product(string id, string name)
        {
            Id = id;
            Name = name;
        }
    }
}

The last part is wiring up the API; for that we need to edit the Program.cs file in the project root folder. Replace the current content with the following code:

using Microsoft.AspNetCore.Diagnostics.HealthChecks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

namespace Catalog.API;

public class Program
{
    public static ConfigurationManager Configuration { get; private set; }

    public static void Main(string[] args)
    {
        var builder = WebApplication.CreateBuilder(args);

        Configuration = builder.Configuration;

        builder.Services.AddControllers();
        builder.Services.AddEndpointsApiExplorer();
        builder.Services.AddSwaggerGen();
        builder.Services.AddHealthChecks();
        builder.Services.AddRouting(options => options.LowercaseUrls = true);

        var app = builder.Build();

        if (app.Environment.IsDevelopment())
        {
            app.UseSwagger();
            app.UseSwaggerUI();
        }

        //app.UseHttpsRedirection(); // For local debugging during this workshop, we will not use HTTPS
        app.UseRouting();
        app.UseAuthorization();

        app.MapHealthChecks("/health", new HealthCheckOptions
        {
            AllowCachingResponses = false,
            ResultStatusCodes =
            {
                [HealthStatus.Healthy] = StatusCodes.Status200OK,
                [HealthStatus.Degraded] = StatusCodes.Status200OK,
                [HealthStatus.Unhealthy] = StatusCodes.Status503ServiceUnavailable
            }
        });
        app.MapControllers();

        app.Run();
    }
}

If everything is done correctly, you should be able to run the API and see the Swagger UI at the following url:

https://localhost:<port-number>/swagger

Showing the 2 newly created endpoints of our CatalogController.

Swagger UI

One extra element we now have to verify is whether our API is healthy! This can be done by navigating to the following URL:

https://localhost:<port-number>/health

It should return a single word: Healthy.
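The HealthCheckOptions.ResultStatusCodes mapping configured in Program.cs determines which HTTP status code the /health endpoint returns for each health status. As a quick reference, it behaves like this sketch (illustration only; the real mapping is performed by the ASP.NET Core health check middleware):

```shell
# Mirrors the ResultStatusCodes mapping from Program.cs.
status_code_for() {
  case "$1" in
    Healthy|Degraded) echo 200 ;;
    Unhealthy)        echo 503 ;;
  esac
}

status_code_for Healthy    # prints 200
status_code_for Unhealthy  # prints 503
```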

Now we can continue with the Basket.API project. In the Controllers folder, add a new file called BasketController.cs with the following code:

using Basket.API.Models;
using Basket.API.Services.Interfaces;
using Microsoft.AspNetCore.Mvc;

namespace Basket.API.Controllers
{
    [Route("api/v1/[controller]")]
    [ApiController]
    public class BasketController : ControllerBase
    {
        private readonly ICatalogService _catalogService;

        public BasketController(ICatalogService catalogService)
        {
            _catalogService = catalogService;
        }

        [HttpPost]
        public async Task<IActionResult> AddBasketItem([FromBody] BasketItem item)
        {
            var product = await _catalogService.GetProduct(item.ProductId);
            return Ok(item);
        }
    }
}

In the project root, create a new folder called Models and add a new file called BasketItem.cs with the following code:

namespace Basket.API.Models
{
    public class BasketItem
    {
        public string ProductId { get; set; }
        public string BasketId { get; set; }
        public int Quantity { get; set; }

        public BasketItem(string productId, string basketId, int quantity)
        {
            ProductId = productId;
            BasketId = basketId;
            Quantity = quantity;
        }
    }
}

And the last file we need to add in the Models folder is called Product.cs with the following code:

namespace Basket.API.Models
{
    public class Product
    {
        public string Id { get; set; }
        public string Name { get; set; }
        public string? ImageURL { get; set; }
        public string? HoverImageURL { get; set; }
        public string? ThumbnailImageURL { get; set; }
        public decimal? Price { get; set; }
        public int Rating { get; set; } = 0;

        public Product(string id, string name)
        {
            Id = id;
            Name = name;
        }
    }
}

In the project root, create a new folder called Services and add a new file called CatalogService.cs with the following code:

using System.Text.Json;
using Basket.API.Services.Interfaces;
using Basket.API.Models;

namespace Basket.API.Services
{
    public class CatalogService : ICatalogService
    {
        private readonly HttpClient _httpClient;
        private readonly JsonSerializerOptions _options;

        public CatalogService(HttpClient httpClient)
        {
            _httpClient = httpClient;

            _options = new JsonSerializerOptions
            {
                PropertyNameCaseInsensitive = true
            };
        }

        public async Task<Product> GetProduct(string productId)
        {
            var json = await _httpClient.GetStringAsync($"http://localhost:<port-number>/api/v1/catalog/items/{productId}");
            return JsonSerializer.Deserialize<Product>(json, _options);
        }
    }
}

Note Be sure to change the port number to the one you are using for the Catalog.API project.

Inside that same Services folder, create a new subfolder Interfaces and add a new file called ICatalogService.cs with the following code:

using Basket.API.Models;

namespace Basket.API.Services.Interfaces
{
    public interface ICatalogService
    {
        Task<Product> GetProduct(string productId);
    }
}

The last part is wiring up the API; for that we need to edit the Program.cs file in the project root folder. Replace the current content with the following code:

using Basket.API.Services;
using Basket.API.Services.Interfaces;
using Microsoft.AspNetCore.Diagnostics.HealthChecks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

namespace Basket.API;

public class Program
{
    public static ConfigurationManager Configuration { get; private set; }

    public static void Main(string[] args)
    {
        var builder = WebApplication.CreateBuilder(args);

        Configuration = builder.Configuration;

        builder.Services.AddControllers();
        builder.Services.AddEndpointsApiExplorer();
        builder.Services.AddSwaggerGen();
        builder.Services.AddHealthChecks();
        builder.Services.AddRouting(options => options.LowercaseUrls = true);

        builder.Services.AddHttpClient<ICatalogService, CatalogService>();

        var app = builder.Build();

        if (app.Environment.IsDevelopment())
        {
            app.UseSwagger();
            app.UseSwaggerUI();
        }

        //app.UseHttpsRedirection();
        app.UseRouting();
        app.UseAuthorization();

        app.MapHealthChecks("/health", new HealthCheckOptions
        {
            AllowCachingResponses = false,
            ResultStatusCodes =
            {
                [HealthStatus.Healthy] = StatusCodes.Status200OK,
                [HealthStatus.Degraded] = StatusCodes.Status200OK,
                [HealthStatus.Unhealthy] = StatusCodes.Status503ServiceUnavailable
            }
        });
        app.MapControllers();

        app.Run();
    }
}

If everything is done correctly, you should be able to run the API and see the Swagger UI at the following url:

https://localhost:<port-number>/swagger

Showing the newly created endpoint of our BasketController.

Swagger UI

One extra element we now have to verify is whether our API is healthy! This can be done by navigating to the following URL:

https://localhost:<port-number>/health

It should return a single word: Healthy.

Note: If you get an exception stating that the SSL certificate is from an untrustworthy source, perform the following two commands:

dotnet dev-certs https --clean
dotnet dev-certs https --trust

More details at https://learn.microsoft.com/en-us/aspnet/core/security/enforcing-ssl?view=aspnetcore-7.0&tabs=visual-studio#trust-the-aspnet-core-https-development-certificate-on-windows-and-macos.

Docker environment

The actual monitoring will be done through third-party products; in this workshop we will work with Prometheus together with the Blackbox exporter. Both products are available as Docker images, so we will need to create a docker-compose.yml file to run them.

Prometheus will use Blackbox to monitor our APIs and will store the metric results in its internal database. In other words, the Prometheus config file will contain the URLs of the APIs we want to monitor.

First create a new folder called otel, this will be our root folder where we will store all the docker files needed for this workshop. Inside the otel folder, create a new file called docker-compose.yml. We will fill the content of this file later on.

Blackbox docker container

Inside the same otel folder, create a new file called blackbox.yml with the following content:

modules:
  http_2xx:
    prober: http
    timeout: 5s
    http:
      method: GET
      preferred_ip_protocol: "ip4"
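The http_2xx module above treats any 2xx response as success. As an illustration only (not needed for this workshop), a module can be made stricter; this sketch assumes you only want to accept an explicit HTTP 200:

```yaml
modules:
  http_200_only:
    prober: http
    timeout: 5s
    http:
      method: GET
      preferred_ip_protocol: "ip4"
      # Restrict success to an explicit list of status codes.
      valid_status_codes: [200]
```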

Open the docker-compose.yml file and add the following content:

For windows

services:
  blackbox:
    # https://github.com/prometheus/blackbox_exporter/releases
    image: prom/blackbox-exporter:v0.24.0
    container_name: blackbox
    restart: unless-stopped
    ports:
      - 9115:9115
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - ./blackbox.yml:/etc/blackbox/blackbox.yml
    command: --config.file=/etc/blackbox/blackbox.yml
    networks:
      - otel

networks:
  otel:

For macos

services:
  blackbox:
    # https://github.com/prometheus/blackbox_exporter/releases
    image: prom/blackbox-exporter:v0.24.0
    container_name: blackbox
    restart: unless-stopped
    ports:
      - 9115:9115
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - ${PWD}/blackbox.yml:/etc/blackbox/blackbox.yml
    command: --config.file=/etc/blackbox/blackbox.yml
    networks:
      otel:

networks:
  otel:

Prometheus docker container

We will first prepare the Prometheus config file; it is here that we specify which endpoints to check and how often this should be done. The actual settings are grouped in Prometheus jobs, and our config will contain only one job for the moment.

In the root otel folder, create a new file called prometheus.yml with the following content:

scrape_configs:
  - job_name: 'blackbox'
    scrape_interval: 5s
    metrics_path: /probe
    params:
      module: [http_2xx]  # Look for a HTTP 200 response.
    static_configs:
      - targets:
        - http://host.docker.internal:<port-number>/health #Catalog API
        - http://host.docker.internal:<port-number>/health #Basket API
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: blackbox:9115

Open the docker-compose.yml file and add the following content right above the networks section:

For windows

  prometheus:
    # https://github.com/prometheus/prometheus/releases
    image: prom/prometheus:v2.48.0
    container_name: prometheus
    hostname: prometheus
    restart: unless-stopped
    ports:
      - 9090:9090
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus:/prometheus
    depends_on:
      - blackbox
    networks:
      - otel

For macos

  prometheus:
    # https://github.com/prometheus/prometheus/releases
    image: prom/prometheus:v2.48.0
    container_name: prometheus
    hostname: prometheus
    restart: unless-stopped
    ports:
      - 9090:9090
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - ${PWD}/prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus:/prometheus
    depends_on:
      - blackbox
    networks:
      otel:

Underneath the networks section, add the following content:

volumes:
  prometheus:

Now that both containers are configured, we can start them by running the following command:

docker-compose up -d
or
docker compose up -d

If everything was done correctly, you should now be able to navigate to the following URLs: the Blackbox UI at http://localhost:9115 and the Prometheus UI at http://localhost:9090.

Note: Prometheus may keep restarting; this is due to wrong permission settings on the prometheus folder inside the WSL environment. To fix this, use the following procedure!

In a terminal (for Windows, just the Windows cmd or Terminal app) run the following command:

docker run -it --rm -v /var/lib/docker/volumes/otel/prometheus:/prometheus ubuntu:22.04

This will start an Ubuntu environment; at its command prompt enter the following:

chown -R 65534:65534 prometheus

This will set the needed permissions on the correct folder. Afterwards, just enter exit to get out of the Ubuntu environment, then stop and start Prometheus with the following commands:

docker stop prometheus
docker start prometheus

If you look at the current result screen of Blackbox, you should see that it is already trying to probe the APIs, but each probe results in a failure. This is because the APIs are not yet running!

Blackbox UI

To validate that the probing does actually work, manually start the Catalog and Basket APIs and refresh the Blackbox UI. After some time you should see successful probe results. A similar result can be seen in the Prometheus UI: in the Expression field enter probe_success or probe_http_status_code and click Execute.
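For reference, a few starter expressions for the Blackbox metrics (PromQL; the avg_over_time example is an extra illustration not used elsewhere in this workshop):

```
probe_success                      # 1 if the last probe succeeded, 0 otherwise
probe_http_status_code             # HTTP status code returned to the probe
avg_over_time(probe_success[5m])   # availability over the last 5 minutes
```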

Prometheus UI

Although this works great, monitoring through either one of these UIs is not very practical. We will therefore use Grafana to visualize the results.

Grafana docker container

Grafana is an open source data visualization tool created by Grafana Labs. As a user you have the ability to create dashboards that unify several panels showing your data through charts, graphs, tables, gauges and many more options. It also allows you to query your data sources and even create alerts based on the results.

But without any data, Grafana is not very useful, so we need to link it with our Prometheus instance. This is done by adding Prometheus as a data source. Data sources can be added through the Grafana UI, but it is also possible to add them to your Docker environment through provisioning! This means they will be added automatically when the container is started. Dashboards and alerts can also be added through provisioning.

Inside the root folder otel, create a new folder called grafana, and inside it another folder called provisioning.
Inside the provisioning folder, create a new folder called dashboards, and inside that folder create a new file called dashboards.yml with the following content:

apiVersion: 1

providers:
  - name: dashboards
    type: file
    updateIntervalSeconds: 30
    options:
      path: /etc/grafana/provisioning/dashboards
      foldersFromFilesStructure: true

In the same dashboards folder, create another file called prometheusblackboxdashboard.json with the following content:

{
  "annotations": {
    "list": [
      {
        "builtIn": 1,
        "datasource": "-- Grafana --",
        "enable": true,
        "hide": true,
        "iconColor": "rgba(0, 211, 255, 1)",
        "name": "Annotations & Alerts",
        "target": {
          "limit": 100,
          "matchAny": false,
          "tags": [],
          "type": "dashboard"
        },
        "type": "dashboard"
      }
    ]
  },
  "description": "Prometheus Blackbox Exporter Overview",
  "editable": true,
  "fiscalYearStartMonth": 0,
  "gnetId": 7587,
  "graphTooltip": 0,
  "id": 1,
  "iteration": 1647618346176,
  "links": [],
  "liveNow": false,
  "panels": [
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 8,
        "w": 24,
        "x": 0,
        "y": 0
      },
      "hiddenSeries": false,
      "id": 138,
      "legend": {
        "avg": false,
        "current": false,
        "max": false,
        "min": false,
        "show": true,
        "total": false,
        "values": false
      },
      "lines": true,
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
      "options": {
        "alertThreshold": true
      },
      "percentage": false,
      "pluginVersion": "8.4.3",
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "expr": "probe_duration_seconds{instance=~\"$target\"}",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "{{ instance }}",
          "refId": "A"
        }
      ],
      "thresholds": [],
      "timeRegions": [],
      "title": "Global Probe Duration",
      "tooltip": {
        "shared": true,
        "sort": 1,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "s",
          "logBase": 1,
          "show": true
        },
        {
          "format": "short",
          "logBase": 1,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "collapsed": false,
      "gridPos": {
        "h": 1,
        "w": 24,
        "x": 0,
        "y": 8
      },
      "id": 15,
      "panels": [],
      "repeat": "target",
      "title": "$target status",
      "type": "row"
    },
    {
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "thresholds"
          },
          "mappings": [
            {
              "options": {
                "0": {
                  "text": "DOWN"
                },
                "1": {
                  "text": "UP"
                }
              },
              "type": "value"
            },
            {
              "options": {
                "match": "null",
                "result": {
                  "text": "N/A"
                }
              },
              "type": "special"
            }
          ],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "#d44a3a",
                "value": null
              },
              {
                "color": "rgba(237, 129, 40, 0.89)",
                "value": 1
              },
              {
                "color": "#299c46",
                "value": 1
              }
            ]
          },
          "unit": "none"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 2,
        "w": 4,
        "x": 0,
        "y": 9
      },
      "id": 2,
      "links": [],
      "maxDataPoints": 100,
      "options": {
        "colorMode": "background",
        "graphMode": "none",
        "justifyMode": "auto",
        "orientation": "horizontal",
        "reduceOptions": {
          "calcs": [
            "lastNotNull"
          ],
          "fields": "",
          "values": false
        },
        "textMode": "auto"
      },
      "pluginVersion": "8.4.3",
      "repeatDirection": "v",
      "targets": [
        {
          "expr": "probe_success{instance=~\"$target\"}",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "refId": "A"
        }
      ],
      "title": "Status",
      "type": "stat"
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 6,
        "w": 10,
        "x": 4,
        "y": 9
      },
      "hiddenSeries": false,
      "id": 25,
      "legend": {
        "avg": false,
        "current": false,
        "max": false,
        "min": false,
        "show": true,
        "total": false,
        "values": false
      },
      "lines": true,
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
      "options": {
        "alertThreshold": true
      },
      "percentage": false,
      "pluginVersion": "8.4.3",
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "expr": "probe_http_duration_seconds{instance=~\"$target\"}",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "{{ phase }}",
          "refId": "B"
        }
      ],
      "thresholds": [],
      "timeRegions": [],
      "title": "HTTP Duration",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "s",
          "logBase": 1,
          "show": true
        },
        {
          "format": "short",
          "logBase": 1,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "aliasColors": {},
      "bars": false,
      "dashLength": 10,
      "dashes": false,
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "fill": 1,
      "fillGradient": 0,
      "gridPos": {
        "h": 6,
        "w": 10,
        "x": 14,
        "y": 9
      },
      "hiddenSeries": false,
      "id": 17,
      "legend": {
        "avg": false,
        "current": false,
        "max": false,
        "min": false,
        "show": true,
        "total": false,
        "values": false
      },
      "lines": true,
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
      "options": {
        "alertThreshold": true
      },
      "percentage": false,
      "pluginVersion": "8.4.3",
      "pointradius": 5,
      "points": false,
      "renderer": "flot",
      "seriesOverrides": [],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
          "expr": "probe_duration_seconds{instance=~\"$target\"}",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "legendFormat": "seconds",
          "refId": "A"
        }
      ],
      "thresholds": [],
      "timeRegions": [],
      "title": "Probe Duration",
      "tooltip": {
        "shared": true,
        "sort": 0,
        "value_type": "individual"
      },
      "type": "graph",
      "xaxis": {
        "mode": "time",
        "show": true,
        "values": []
      },
      "yaxes": [
        {
          "format": "s",
          "logBase": 1,
          "show": true
        },
        {
          "format": "short",
          "logBase": 1,
          "show": true
        }
      ],
      "yaxis": {
        "align": false
      }
    },
    {
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "thresholds"
          },
          "decimals": 0,
          "mappings": [
            {
              "options": {
                "0": {
                  "text": "N/A"
                },
                "1": {
                  "text": "YES"
                }
              },
              "type": "value"
            },
            {
              "options": {
                "match": "null",
                "result": {
                  "text": "N/A"
                }
              },
              "type": "special"
            }
          ],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "#299c46",
                "value": null
              },
              {
                "color": "rgba(237, 129, 40, 0.89)",
                "value": 201
              },
              {
                "color": "#d44a3a",
                "value": 399
              }
            ]
          },
          "unit": "none"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 2,
        "w": 4,
        "x": 0,
        "y": 11
      },
      "id": 20,
      "links": [],
      "maxDataPoints": 100,
      "options": {
        "colorMode": "none",
        "graphMode": "none",
        "justifyMode": "auto",
        "orientation": "horizontal",
        "reduceOptions": {
          "calcs": [
            "lastNotNull"
          ],
          "fields": "",
          "values": false
        },
        "textMode": "auto"
      },
      "pluginVersion": "8.4.3",
      "repeatDirection": "h",
      "targets": [
        {
          "expr": "probe_http_status_code{instance=~\"$target\"}",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "refId": "A"
        }
      ],
      "title": "HTTP Status Code",
      "type": "stat"
    },
    {
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "thresholds"
          },
          "mappings": [
            {
              "options": {
                "match": "null",
                "result": {
                  "text": "N/A"
                }
              },
              "type": "special"
            }
          ],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green",
                "value": null
              },
              {
                "color": "red",
                "value": 80
              }
            ]
          },
          "unit": "none"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 2,
        "w": 4,
        "x": 0,
        "y": 13
      },
      "id": 27,
      "links": [],
      "maxDataPoints": 100,
      "options": {
        "colorMode": "none",
        "graphMode": "none",
        "justifyMode": "auto",
        "orientation": "horizontal",
        "reduceOptions": {
          "calcs": [
            "lastNotNull"
          ],
          "fields": "",
          "values": false
        },
        "textMode": "auto"
      },
      "pluginVersion": "8.4.3",
      "targets": [
        {
          "expr": "probe_http_version{instance=~\"$target\"}",
          "format": "time_series",
          "intervalFactor": 1,
          "refId": "A"
        }
      ],
      "title": "HTTP Version",
      "type": "stat"
    },
    {
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "thresholds"
          },
          "mappings": [
            {
              "options": {
                "0": {
                  "text": "NO"
                },
                "1": {
                  "text": "YES"
                }
              },
              "type": "value"
            },
            {
              "options": {
                "match": "null",
                "result": {
                  "text": "N/A"
                }
              },
              "type": "special"
            }
          ],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "#d44a3a",
                "value": null
              },
              {
                "color": "rgba(237, 129, 40, 0.89)",
                "value": 0
              },
              {
                "color": "#299c46",
                "value": 1
              }
            ]
          },
          "unit": "none"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 2,
        "w": 4,
        "x": 0,
        "y": 15
      },
      "id": 18,
      "links": [],
      "maxDataPoints": 100,
      "options": {
        "colorMode": "value",
        "graphMode": "none",
        "justifyMode": "auto",
        "orientation": "horizontal",
        "reduceOptions": {
          "calcs": [
            "lastNotNull"
          ],
          "fields": "",
          "values": false
        },
        "textMode": "auto"
      },
      "pluginVersion": "8.4.3",
      "repeatDirection": "v",
      "targets": [
        {
          "expr": "probe_http_ssl{instance=~\"$target\"}",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "refId": "A"
        }
      ],
      "title": "SSL",
      "type": "stat"
    },
    {
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "thresholds"
          },
          "decimals": 2,
          "mappings": [
            {
              "options": {
                "0": {
                  "text": "NO"
                },
                "1": {
                  "text": "YES"
                }
              },
              "type": "value"
            },
            {
              "options": {
                "match": "null",
                "result": {
                  "text": "N/A"
                }
              },
              "type": "special"
            }
          ],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "#d44a3a",
                "value": null
              },
              {
                "color": "rgba(237, 129, 40, 0.89)",
                "value": 0
              },
              {
                "color": "#299c46",
                "value": 1209600
              }
            ]
          },
          "unit": "dtdurations"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 2,
        "w": 10,
        "x": 4,
        "y": 15
      },
      "id": 19,
      "links": [],
      "maxDataPoints": 100,
      "options": {
        "colorMode": "value",
        "graphMode": "none",
        "justifyMode": "auto",
        "orientation": "horizontal",
        "reduceOptions": {
          "calcs": [
            "lastNotNull"
          ],
          "fields": "",
          "values": false
        },
        "textMode": "auto"
      },
      "pluginVersion": "8.4.3",
      "repeatDirection": "h",
      "targets": [
        {
          "expr": "probe_ssl_earliest_cert_expiry{instance=~\"$target\"} - time()",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "refId": "A"
        }
      ],
      "title": "SSL Expiry",
      "type": "stat"
    },
    {
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "thresholds"
          },
          "mappings": [
            {
              "options": {
                "match": "null",
                "result": {
                  "text": "N/A"
                }
              },
              "type": "special"
            }
          ],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green",
                "value": null
              },
              {
                "color": "red",
                "value": 80
              }
            ]
          },
          "unit": "s"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 2,
        "w": 5,
        "x": 14,
        "y": 15
      },
      "id": 23,
      "links": [],
      "maxDataPoints": 100,
      "options": {
        "colorMode": "none",
        "graphMode": "none",
        "justifyMode": "auto",
        "orientation": "horizontal",
        "reduceOptions": {
          "calcs": [
            "lastNotNull"
          ],
          "fields": "",
          "values": false
        },
        "textMode": "auto"
      },
      "pluginVersion": "8.4.3",
      "targets": [
        {
          "expr": "avg(probe_duration_seconds{instance=~\"$target\"})",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "refId": "A"
        }
      ],
      "title": "Average Probe Duration",
      "type": "stat"
    },
    {
      "datasource": {
        "type": "prometheus",
        "uid": "prometheus"
      },
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "thresholds"
          },
          "mappings": [
            {
              "options": {
                "match": "null",
                "result": {
                  "text": "N/A"
                }
              },
              "type": "special"
            }
          ],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green",
                "value": null
              },
              {
                "color": "red",
                "value": 80
              }
            ]
          },
          "unit": "s"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 2,
        "w": 5,
        "x": 19,
        "y": 15
      },
      "id": 24,
      "links": [],
      "maxDataPoints": 100,
      "options": {
        "colorMode": "none",
        "graphMode": "none",
        "justifyMode": "auto",
        "orientation": "horizontal",
        "reduceOptions": {
          "calcs": [
            "lastNotNull"
          ],
          "fields": "",
          "values": false
        },
        "textMode": "auto"
      },
      "pluginVersion": "8.4.3",
      "repeatDirection": "h",
      "targets": [
        {
          "expr": "avg(probe_dns_lookup_time_seconds{instance=~\"$target\"})",
          "format": "time_series",
          "interval": "$interval",
          "intervalFactor": 1,
          "refId": "A"
        }
      ],
      "title": "Average DNS Lookup",
      "type": "stat"
    }
  ],
  "refresh": "10s",
  "schemaVersion": 35,
  "style": "dark",
  "tags": [
    "blackbox",
    "prometheus"
  ],
  "templating": {
    "list": [
      {
        "auto": true,
        "auto_count": 10,
        "auto_min": "10s",
        "current": {
          "selected": false,
          "text": "10s",
          "value": "10s"
        },
        "hide": 0,
        "label": "Interval",
        "name": "interval",
        "options": [
          {
            "selected": false,
            "text": "auto",
            "value": "$__auto_interval_interval"
          },
          {
            "selected": false,
            "text": "5s",
            "value": "5s"
          },
          {
            "selected": true,
            "text": "10s",
            "value": "10s"
          },
          {
            "selected": false,
            "text": "30s",
            "value": "30s"
          },
          {
            "selected": false,
            "text": "1m",
            "value": "1m"
          },
          {
            "selected": false,
            "text": "10m",
            "value": "10m"
          },
          {
            "selected": false,
            "text": "30m",
            "value": "30m"
          },
          {
            "selected": false,
            "text": "1h",
            "value": "1h"
          },
          {
            "selected": false,
            "text": "6h",
            "value": "6h"
          },
          {
            "selected": false,
            "text": "12h",
            "value": "12h"
          },
          {
            "selected": false,
            "text": "1d",
            "value": "1d"
          },
          {
            "selected": false,
            "text": "7d",
            "value": "7d"
          },
          {
            "selected": false,
            "text": "14d",
            "value": "14d"
          },
          {
            "selected": false,
            "text": "30d",
            "value": "30d"
          }
        ],
        "query": "5s,10s,30s,1m,10m,30m,1h,6h,12h,1d,7d,14d,30d",
        "refresh": 2,
        "skipUrlSync": false,
        "type": "interval"
      },
      {
        "current": {
          "selected": false,
          "text": "All",
          "value": "$__all"
        },
        "datasource": {
          "type": "prometheus",
          "uid": "prometheus"
        },
        "definition": "",
        "hide": 0,
        "includeAll": true,
        "multi": true,
        "name": "target",
        "options": [],
        "query": {
          "query": "label_values(probe_success, instance)",
          "refId": "Prometheus-target-Variable-Query"
        },
        "refresh": 1,
        "regex": "",
        "skipUrlSync": false,
        "sort": 0,
        "tagValuesQuery": "",
        "tagsQuery": "",
        "type": "query",
        "useTags": false
      }
    ]
  },
  "time": {
    "from": "now-1h",
    "to": "now"
  },
  "timepicker": {
    "refresh_intervals": [
      "5s",
      "10s",
      "30s",
      "1m",
      "5m",
      "15m",
      "30m",
      "1h",
      "2h",
      "1d"
    ],
    "time_options": [
      "5m",
      "15m",
      "1h",
      "6h",
      "12h",
      "24h",
      "2d",
      "7d",
      "30d"
    ]
  },
  "timezone": "",
  "title": "Prometheus Blackbox Exporter",
  "uid": "xtkCtBkiz",
  "version": 1,
  "weekStart": ""
}

Inside the provisioning folder, create a second folder called datasources and inside that create a file called datasources.yml with the following contents:

apiVersion: 1

datasources:
- name: Prometheus
  type: prometheus
  uid: prometheus
  access: proxy
  url: http://prometheus:9090
  jsonData:
    timeInterval: 10s
  basicAuth: false
  isDefault: true
  version: 1
  editable: false

Open the docker-compose.yml file and add the following content right above the networks section:

For windows

  grafana:
    # https://github.com/grafana/grafana/releases
    image: grafana/grafana:10.2.2
    container_name: grafana
    environment:
      - GF_AUTH_ANONYMOUS_ENABLED=true
      - GF_AUTH_ANONYMOUS_ORG_ROLE=Admin
      - GF_AUTH_DISABLE_LOGIN_FORM=true
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - ./grafana/provisioning:/etc/grafana/provisioning
      - grafana-data:/var/lib/grafana
    restart: unless-stopped
    user: root
    ports:
      - 3001:3000
    depends_on:
      - prometheus
    networks:
      otel:

For macos

  grafana:
    # https://github.com/grafana/grafana/releases
    image: grafana/grafana:10.2.2
    container_name: grafana
    #command:
    environment:
      - GF_AUTH_ANONYMOUS_ENABLED=true
      - GF_AUTH_ANONYMOUS_ORG_ROLE=Admin
      - GF_AUTH_DISABLE_LOGIN_FORM=true
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - grafana-data:/var/lib/grafana
      - ${PWD}/grafana/provisioning:/etc/grafana/provisioning
    restart: unless-stopped
    ports:
      - 3001:3000
    depends_on:
      - prometheus
    networks:
      otel:

Change the volumes section to the following content:

volumes:
  prometheus:
  grafana-data:

After adjusting the docker-compose.yml file, run the docker compose command again to start the containers:

docker-compose up -d
or
docker compose up -d

If everything went well, you should be able to access Grafana at http://localhost:3001 and see the following screen:

Grafana UI

Because we already provisioned the Prometheus datasource, it should appear in the list of datasources, and the Prometheus Blackbox Exporter dashboard should be available in the list of dashboards.

Select the dashboard; it should now show the metrics from the Blackbox Exporter in a great-looking overview.

Grafana UI

Lab 2 - Alerting

Grafana environment

Now that we are all set and able to monitor our services, we can start to think about how we can be alerted when something goes wrong. In this lab we will be using Grafana to create alerts, but there are also other options available, such as Prometheus Alertmanager.
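For comparison, the same kind of check can be expressed as a Prometheus alerting rule, evaluated by Prometheus itself and routed through Alertmanager. A sketch only; the group name, severity label, and the wiring of this file into prometheus.yml are assumptions, not part of this workshop:

```yaml
groups:
  - name: service-health
    rules:
      - alert: ServiceDown
        # Fires when a blackbox probe reports the target as down.
        expr: probe_success < 1
        # Comparable to Grafana's pending period: the condition must
        # hold for 30s before the alert actually fires.
        for: 30s
        labels:
          severity: critical
        annotations:
          summary: "Service {{ $labels.instance }} is down"
```

In this lab we stay with Grafana-managed alerts, which evaluate the same probe_success metric.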

You can provision alerts, like we did for the dashboard, but for this lab we will create one manually.

Creating an alert

In the Grafana UI, navigate to Alerting, click on the Alert rules section and click on New alert rule.

Grafana UI

Give the alert a meaningful name: Service health.

In the A section, select the Prometheus data source. For the Metric field, select probe_success, because this is the metric on which we will base our alert. You can verify that the alert is able to get the necessary data by clicking the Run queries button; it should show a graph with the actual probe_success metric. In the image preview below, the value for both services is 0, meaning the services are not available!

Grafana UI

In the B section, we need to define an operation that reduces the query to a single meaningful value on which to base our alert condition. Select the Reduce operation with Last as the reducer, A as the input, and Strict as the mode. This will give us the last value of the metric.

In the C section, we will define the alert condition. As input, select B, and for the expression enter IS BELOW 1.
This will check whether the last value of the metric is less than 1. If it is, the service is not available and we want the alert to fire.

Grafana UI

Now that our alert condition is set, we need to define how frequently Grafana should evaluate it. This is done in the Set evaluation behavior part.
First we need to select a folder to store our rule; press the New folder button and create a new folder called Alerts.

Grafana UI

The alert evaluation interval is defined by the evaluation group. Press the New evaluation group button, set the Evaluation interval value to 10s, and give the group a name; I also called it Alerts.

Grafana UI

The last element to fill in is the Pending period, which defines how long the alert condition must be breached before the alert actually fires. In our case, we will set it to 30s.

Grafana UI

In the Add annotations part you can add extra information that will be shown when the alert fires. As an example, I'm using the $labels.instance variable so that the name of the service that is down is shown in the alert.

Lastly, you can add extra custom labels; these are handy for routing alerts, because depending on those labels, different contact points can be selected automatically.

Grafana UI

When everything is done correctly, your alert will be shown in the Alerts folder we just created, and if the services are down, the alert will fire once the pending period has elapsed.

Grafana UI

Contact points

Alerts that fire trigger a message, and this message needs to be sent to a contact point. There are several options available in the contact point type dropdown. If you want to add one for Microsoft Teams, you will need to provide a webhook URL. You can find this URL in the Incoming Webhook section of your Microsoft Teams channel; for more info on how to create a webhook URL, check out the Microsoft Teams documentation.

Once you have defined several contact points, you can route your alerts with notification policies: depending on the labels you set on an alert, a different contact point can be chosen.

Lab 3 - Logging

Even though we will later cover tracing to get a bird's-eye view of how requests flow through our production environment, we will first add logging to each separate service to get detailed information about what is happening. For this lab we will use the Serilog library. Serilog is a logging framework that allows us to write logs to a variety of different targets, called sinks.

Our logging solution will be based on OpenTelemetry. OpenTelemetry is an open-source project that provides SDKs, libraries, agents, and collector services to capture logs, distributed traces and metrics from your application. In our case, we need to add OpenTelemetry to our services, so that the necessary information is collected and sent to our logging backend.

Services

Let's start with the Basket.API service, open the Basket.API.csproj file and add the following PackageReferences:

<PackageReference Include="Serilog.AspNetCore" Version="7.0.0" />
<PackageReference Include="Serilog.Sinks.Console" Version="4.1.0" />
<PackageReference Include="Serilog.Sinks.OpenTelemetry" Version="1.1.0" />

<PackageReference Include="OpenTelemetry" Version="1.6.0" />
<PackageReference Include="OpenTelemetry.Exporter.Console" Version="1.6.0" />
<PackageReference Include="OpenTelemetry.Exporter.OpenTelemetryProtocol" Version="1.6.0" />
<PackageReference Include="OpenTelemetry.Extensions.Hosting" Version="1.6.0" />
<PackageReference Include="OpenTelemetry.Instrumentation.AspNetCore" Version="1.5.1-beta.1" />
<PackageReference Include="OpenTelemetry.Instrumentation.Http" Version="1.5.1-beta.1" />
<PackageReference Include="OpenTelemetry.Instrumentation.Runtime" Version="1.5.1" />

Next, open the appsettings.json file and add the Serilog and Otlp sections next to the existing Logging section, as shown below:

  "Logging": {
    "LogLevel": {
      "Default": "Information",
      "Microsoft.AspNetCore": "Warning"
    }
  },
  "Serilog": {
    "MinimumLevel": {
      "Default": "Information",
      "Override": {
        "Microsoft": "Warning",
        "Microsoft.AspNetCore": "Warning",
        "Microsoft.Hosting.Lifetime": "Information",
        "System": "Warning"
      }
    },
    "Using": [ "Serilog.Sinks.Console" ],
    "WriteTo": [
      {
        "Name": "Console"
      }
    ]
  },
  "Otlp": {
    "ServiceName": "BasketAPI",
    "Endpoint": "http://localhost:4317"
  },

Lastly, open the Program.cs file and add the following using statement:

using Serilog;

Also in the Program.cs file, underneath the builder.Services.AddRouting(options => options.LowercaseUrls = true); line add the following code:

builder.Host.UseSerilog((hostingContext, loggerConfiguration) => loggerConfiguration
    .ReadFrom.Configuration(hostingContext.Configuration)
    .WriteTo.OpenTelemetry(options =>
    {
        options.Endpoint = "http://localhost:4317/v1/logs";
        options.Protocol = Serilog.Sinks.OpenTelemetry.OtlpProtocol.Grpc;
        options.ResourceAttributes = new Dictionary<string, object>
        {
            // Read the service name via hostingContext; a bare Configuration
            // identifier does not exist in this scope.
            ["service.name"] = hostingContext.Configuration.GetValue<string>("Otlp:ServiceName")
        };
    }));

This will tell Serilog to write its logs in the OpenTelemetry format to the OpenTelemetry Collector.

At the bottom of the Program.cs file, underneath the app.UseAuthorization(); line add the following code:

app.UseSerilogRequestLogging();
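Besides the automatic request logs, any ILogger<T> you inject now also flows through Serilog to the collector. A minimal sketch of writing a structured log line (the controller and property names here are hypothetical):

```csharp
using Microsoft.AspNetCore.Mvc;

[ApiController]
[Route("[controller]")]
public class BasketController : ControllerBase
{
    private readonly ILogger<BasketController> _logger;

    public BasketController(ILogger<BasketController> logger) => _logger = logger;

    [HttpGet]
    public IActionResult Get()
    {
        // Use a message template instead of string interpolation so that
        // CustomerId is captured as a structured property by the sink.
        _logger.LogInformation("Fetching basket for {CustomerId}", "demo-customer");
        return Ok();
    }
}
```

Structured properties like CustomerId become searchable fields once the logs land in our log store.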

The same setup must be applied to the Catalog.API service. Open the Catalog.API.csproj file and add the following PackageReferences:

<PackageReference Include="Serilog.AspNetCore" Version="7.0.0" />
<PackageReference Include="Serilog.Sinks.Console" Version="4.1.0" />
<PackageReference Include="Serilog.Sinks.OpenTelemetry" Version="1.1.0" />

<PackageReference Include="OpenTelemetry" Version="1.6.0" />
<PackageReference Include="OpenTelemetry.Exporter.Console" Version="1.6.0" />
<PackageReference Include="OpenTelemetry.Exporter.OpenTelemetryProtocol" Version="1.6.0" />
<PackageReference Include="OpenTelemetry.Extensions.Hosting" Version="1.6.0" />
<PackageReference Include="OpenTelemetry.Instrumentation.AspNetCore" Version="1.5.1-beta.1" />
<PackageReference Include="OpenTelemetry.Instrumentation.Http" Version="1.5.1-beta.1" />
<PackageReference Include="OpenTelemetry.Instrumentation.Runtime" Version="1.5.1" />

Next, open the appsettings.json file and add the Serilog and Otlp sections next to the existing Logging section, as shown below:

  "Logging": {
    "LogLevel": {
      "Default": "Information",
      "Microsoft.AspNetCore": "Warning"
    }
  },
  "Serilog": {
    "MinimumLevel": {
      "Default": "Information",
      "Override": {
        "Microsoft": "Warning",
        "Microsoft.AspNetCore": "Warning",
        "Microsoft.Hosting.Lifetime": "Information",
        "System": "Warning"
      }
    },
    "Using": [ "Serilog.Sinks.Console" ],
    "WriteTo": [
      {
        "Name": "Console"
      }
    ]
  },
  "Otlp": {
    "ServiceName": "CatalogAPI",
    "Endpoint": "http://localhost:4317"
  },

Lastly, open the Program.cs file and add the following using statement:

using Serilog;

Also in the Program.cs file, underneath the builder.Services.AddRouting(options => options.LowercaseUrls = true); line add the following code:

builder.Host.UseSerilog((hostingContext, loggerConfiguration) => loggerConfiguration
    .ReadFrom.Configuration(hostingContext.Configuration)
    .WriteTo.OpenTelemetry(options =>
    {
        options.Endpoint = "http://localhost:4317/v1/logs";
        options.Protocol = Serilog.Sinks.OpenTelemetry.OtlpProtocol.Grpc;
        options.ResourceAttributes = new Dictionary<string, object>
        {
            // Read the service name via hostingContext; a bare Configuration
            // identifier does not exist in this scope.
            ["service.name"] = hostingContext.Configuration.GetValue<string>("Otlp:ServiceName")
        };
    }));

This will tell Serilog to write its logs in the OpenTelemetry format to the OpenTelemetry Collector.

At the bottom of the Program.cs file, underneath the app.UseAuthorization(); line add the following code:

app.UseSerilogRequestLogging();

Docker environment

OpenTelemetry Collector docker container

The OpenTelemetry Collector is a vendor-agnostic service that allows you to receive, process and export telemetry data. It is designed to be highly performant, with low resource consumption. The Collector can be run as an agent on the host or as a standalone service. More details can be found in the Collector documentation.

First, let us add the configuration file for the OpenTelemetry Collector to our project. In the root folder (called otel), create a new file called otel.yml with the following content:

receivers:
  otlp:
    protocols:
      grpc:
      http:

exporters:
  loki:
    endpoint: http://loki:3100/loki/api/v1/push
    tls:
      insecure: true

service:
  pipelines:
    logs:
      receivers: [otlp]
      exporters: [loki]

Now let's add the OpenTelemetry Collector to our docker-compose file. Open the docker-compose.yml file in the root of your project and add the following code above the networks section:

For Windows

  otel-collector:
    # https://github.com/open-telemetry/opentelemetry-collector-contrib/releases
    container_name: otel
    hostname: otel
    image: otel/opentelemetry-collector-contrib:0.90.1
    command: ["--config=/etc/otel-collector-config.yaml"]
    volumes:
      - ./otel.yml:/etc/otel-collector-config.yaml
    restart: unless-stopped
    ports:
      - "4317:4317"   # OTLP gRPC receiver
    networks:
      otel:

For Mac

  otel-collector:
    # https://github.com/open-telemetry/opentelemetry-collector-contrib/releases
    container_name: otel
    hostname: otel
    image: otel/opentelemetry-collector-contrib:0.84.2
    command: ["--config=/etc/otel-collector-config.yaml"]
    volumes:
      - ${PWD}/otel.yml:/etc/otel-collector-config.yaml
    restart: unless-stopped
    ports:
      - "8888:8888"   # Prometheus metrics exposed by the collector
      - "8889:8889"   # Prometheus exporter metrics
      - "4317:4317"   # OTLP gRPC receiver
      - "9200:55679"  # zpages
    networks:
      otel:

Our logs will be stored in another Grafana product called Loki: a horizontally scalable, highly available, multi-tenant log aggregation system inspired by Prometheus. It does not index the contents of the logs, but rather a set of labels for each log stream. In this lab we will use the Loki docker image, so we need to add the Loki service to our docker-compose.yml file.

Loki docker container

Like most of the other tools we are using, Loki also needs a configuration file. In the root folder, create a new file called loki.yml and add the following content:

auth_enabled: false

server:
  http_listen_port: 3100
  # grpc_listen_port: 9096

common:
  path_prefix: /loki
  storage:
    filesystem:
      chunks_directory: /loki/chunks
      rules_directory: /loki/rules
  replication_factor: 1
  ring:
    # instance_addr: 127.0.0.1
    kvstore:
      store: inmemory

schema_config:
  configs:
    - from: 2020-10-24
      store: boltdb-shipper
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 24h

We will be using filesystem storage for Loki; there are also database-backed options available, but for this lab we will keep it simple.

Next, we need to add the Loki container. Open the docker-compose.yml file and add the following above the networks section:

For Windows

  loki:
    # https://github.com/grafana/loki/releases
    image: grafana/loki:2.9.2
    container_name: loki
    command: -config.file=/etc/loki/local-config.yaml
    volumes:
      - /etc/localtime:/etc/localtime:ro      
      - ./loki.yml:/etc/loki/local-config.yaml
      - loki:/data/loki
    restart: unless-stopped
    ports:
      - 3100:3100
    networks:
      otel:

For Mac

  loki:
    # https://github.com/grafana/loki/releases
    image: grafana/loki:2.9.2
    container_name: loki
    command: -config.file=/etc/loki/local-config.yaml
    volumes:
      - /etc/localtime:/etc/localtime:ro      
      - ${PWD}/loki.yml:/etc/loki/local-config.yaml
      - loki:/data/loki
    restart: unless-stopped
    ports:
      - 3100:3100
    networks:
      otel:

Also adjust the Grafana depends_on section in the docker-compose.yml file to include the loki service:

    depends_on:
      - prometheus
      - loki

And add Loki to the volumes section:

volumes:
  prometheus:
  loki:
  grafana-data:

To be able to view the logs, we will again add a new datasource to our Grafana environment. In the grafana/provisioning/datasources folder at the root of your project, open the datasources.yml file and add the following code at the end:

- name: Loki
  type: loki
  uid: loki
  access: proxy
  url: http://loki:3100
  basicAuth: false
  isDefault: false
  version: 1
  editable: false
  jsonData:
    derivedFields:
      - datasourceUid: tempo
        matcherRegex: "\u0022traceid\u0022:\u0022(\\w+)\u0022"
        name: TraceId
        url: '$${__value.raw}'

As an extra bonus, we added a regex to the jsonData section; it links log lines in Loki to the Tempo datasource, so that from a log line you can jump straight to the complete trace identified by its TraceId.
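To make the derived field concrete: the \u0022 escapes in matcherRegex are just double quotes, so the pattern looks for a traceid JSON field in each log line and captures its value. A quick sketch of what it matches (the log line itself is a hypothetical example):

```csharp
using System.Text.RegularExpressions;

string logLine = "{\"message\":\"HTTP GET /catalog responded 200\",\"traceid\":\"4bf92f3577b34da6a3ce929d0e0e4736\"}";

// Same pattern as matcherRegex, with the \u0022 escapes written out as quotes.
Match match = Regex.Match(logLine, "\"traceid\":\"(\\w+)\"");
Console.WriteLine(match.Groups[1].Value); // 4bf92f3577b34da6a3ce929d0e0e4736
```

The captured value becomes a clickable TraceId field in Grafana's log details view.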

Note: this link to an actual trace will only work after you have completed the tracing lab!

Grafana UI

Lab 4 - Tracing

One of the big advantages of having observability is that you can follow what is happening in your application. This is especially useful when you have multiple services that are communicating with each other. The way to accomplish this is by using tracing. Tracing is a way to follow the flow of a request from the moment it is received by the first service to the moment it is returned by the last service. This is done by adding tracing information to the request and passing it along to the next service.
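In practice this context travels with the request itself. With OpenTelemetry's defaults it is propagated in the W3C Trace Context traceparent HTTP header, whose format is version-traceid-spanid-flags (the values below are illustrative):

```http
GET /api/catalog HTTP/1.1
Host: catalog-api
traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01
```

The receiving service continues the same trace by creating child spans under the incoming traceid, which is what lets a backend stitch the full request flow together.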

Our distributed tracing solution will be based on OpenTelemetry. OpenTelemetry is an open-source project that provides SDKs, libraries, agents, and collector services to capture logs, distributed traces and metrics from your application. In our case, we need to add OpenTelemetry to our services, so that the necessary information is collected and sent to our tracing solution.

Services

In the Basket.API project, open the Program.cs file and add the following using statements at the top of the file:

using OpenTelemetry.Resources;
using OpenTelemetry.Trace;

Inside the main method, add the following code above the var app = builder.Build(); line:

        Action<ResourceBuilder> appResourceBuilder =
            resource => resource
                .AddTelemetrySdk()
                .AddService(builder.Configuration.GetValue<string>("Otlp:ServiceName"));

        builder.Services.AddOpenTelemetry()
            .ConfigureResource(appResourceBuilder)
            // The lambda parameter is named tracing to avoid shadowing the
            // outer builder variable, which would not compile.
            .WithTracing(tracing => tracing
                .AddAspNetCoreInstrumentation()
                .AddHttpClientInstrumentation()
                .AddSource("APITracing")
                //.AddConsoleExporter()
                .AddOtlpExporter(options => options.Endpoint = new Uri(builder.Configuration.GetValue<string>("Otlp:Endpoint")))
            );

This will add the OpenTelemetry SDK to our application and configure it to send the collected traces to our tracing solution.
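The .AddSource("APITracing") call deserves a note: it subscribes the SDK to a custom ActivitySource, which you can use to emit your own spans from application code. A sketch (the operation name and tag are hypothetical; the source name must match the one registered above):

```csharp
using System.Diagnostics;

public static class Telemetry
{
    // Must match the name passed to .AddSource() in Program.cs.
    public static readonly ActivitySource Source = new("APITracing");
}

// Somewhere inside a request handler:
using (Activity? activity = Telemetry.Source.StartActivity("GetBasket"))
{
    activity?.SetTag("basket.customer", "demo-customer");
    // ...do the actual work; the span is exported when disposed...
}
```

StartActivity returns null when no listener is registered, hence the null-conditional calls.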

Now perform the same actions in the Catalog.API project. Open the Program.cs file and add the following using statements at the top of the file:

using OpenTelemetry.Resources;
using OpenTelemetry.Trace;

Inside the main method, add the following code above the var app = builder.Build(); line:

        Action<ResourceBuilder> appResourceBuilder =
            resource => resource
                .AddTelemetrySdk()
                .AddService(builder.Configuration.GetValue<string>("Otlp:ServiceName"));

        builder.Services.AddOpenTelemetry()
            .ConfigureResource(appResourceBuilder)
            // The lambda parameter is named tracing to avoid shadowing the
            // outer builder variable, which would not compile.
            .WithTracing(tracing => tracing
                .AddAspNetCoreInstrumentation()
                .AddHttpClientInstrumentation()
                .AddSource("APITracing")
                //.AddConsoleExporter()
                .AddOtlpExporter(options => options.Endpoint = new Uri(builder.Configuration.GetValue<string>("Otlp:Endpoint")))
            );

Docker environment

OpenTelemetry Collector docker container

We are going to adjust the otel.yml file with the following content:

receivers:
  otlp:
    protocols:
      grpc:
      http:

exporters:
  loki:
    endpoint: http://loki:3100/loki/api/v1/push
    tls:
      insecure: true
  otlp:
    endpoint: tempo:4007
    tls:
      insecure: true      

service:
  pipelines:
    logs:
      receivers: [otlp]
      exporters: [loki]
    traces:
      receivers: [otlp]
      exporters: [otlp]

Compared to the previous version, we added one exporter (otlp) and a new traces pipeline that reuses the existing otlp receiver. The receiver accepts data via gRPC and HTTP. The new exporter sends trace data to the Tempo service on port 4007. The tls section is configured in insecure mode, which means we are not using any certificates for the communication between the OpenTelemetry Collector and the Tempo service.
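To make the pipeline wiring concrete: each pipeline in the service block forwards everything from its receivers to every exporter listed for that signal type. A toy model in Python (purely illustrative, not how the collector is implemented):

```python
from collections import defaultdict

# Mirrors the service.pipelines section of the otel.yml above.
pipelines = {
    "logs":   {"receivers": ["otlp"], "exporters": ["loki"]},
    "traces": {"receivers": ["otlp"], "exporters": ["otlp"]},
}

def dispatch(records):
    """records: (signal, payload) pairs arriving on the otlp receiver.

    Returns a mapping of exporter name -> payloads routed to it."""
    buckets = defaultdict(list)
    for signal, payload in records:
        for exporter in pipelines[signal]["exporters"]:
            buckets[exporter].append(payload)
    return dict(buckets)

out = dispatch([("logs", "log line"), ("traces", "span A"), ("traces", "span B")])
print(out)  # {'loki': ['log line'], 'otlp': ['span A', 'span B']}
```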

Tempo docker container

We will use Tempo as our tracing solution. Tempo is an open-source distributed tracing system. It is a very lightweight solution that is easy to set up and use.

Tempo also has some configuration settings that we need to supply through a yaml file. Inside the root of the otel folder, create a new file called tempo.yml with the following content:

server:
  http_listen_port: 3200

distributor:
  receivers:
    otlp:
      protocols:
        http:
        grpc:
          endpoint: tempo:4007

ingester:
  max_block_duration: 5m               # cut the headblock when this much time passes. this is being set for demo purposes and should probably be left alone normally

compactor:
  compaction:
    block_retention: 1h                # overall Tempo trace retention. set for demo purposes, default is 14 days

storage:
  trace:
    backend: local                     # backend configuration to use
    wal:
      path: /tmp/tempo/wal             # where to store the wal locally
    local:
      path: /tmp/tempo/blocks

In this configuration file we configure Tempo to accept incoming traces over the OTLP protocol. We also configure Tempo to use the local storage backend, meaning all traces are stored on the local file system. The compactor is set to a retention of 1 hour, so all traces older than 1 hour will be deleted. This is not a good configuration for production environments, but for our demo it is fine.
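The effect of block_retention can be modelled very simply: any block whose end time falls outside the retention window becomes eligible for deletion. A toy sketch in Python (the real compactor is considerably more involved):

```python
from datetime import datetime, timedelta

RETENTION = timedelta(hours=1)  # mirrors block_retention: 1h above

def expired_blocks(blocks, now):
    """Return the ids of blocks older than the retention window.

    blocks: mapping of block id -> block end time."""
    return [bid for bid, end_time in blocks.items() if now - end_time > RETENTION]

now = datetime(2024, 1, 1, 12, 0)
blocks = {
    "block-a": datetime(2024, 1, 1, 10, 30),  # 1.5h old -> expired
    "block-b": datetime(2024, 1, 1, 11, 45),  # 15m old  -> kept
}
print(expired_blocks(blocks, now))  # ['block-a']
```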

Let's add the Tempo container to our docker-compose file. Open the docker-compose.yml file in the root folder and add the following code underneath the prometheus container:

For windows

  tempo:
    # https://github.com/grafana/tempo/releases
    image: grafana/tempo:2.2.3
    container_name: tempo
    command: [ "-config.file=/etc/tempo.yaml" ]
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - ./tempo.yml:/etc/tempo.yaml
      - tempo:/tmp/tempo
    restart: unless-stopped
    ports:
      - 3200:3200  # tempo
      - 4007:4317  # otlp grpc
    depends_on:
      - otel-collector
    networks:
      otel:

For macos

  tempo:
    # https://github.com/grafana/tempo/releases
    image: grafana/tempo:2.2.3
    container_name: tempo
    command: [ "-config.file=/etc/tempo.yaml" ]
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - ${PWD}/tempo.yml:/etc/tempo.yaml
      - tempo:/tmp/tempo
    restart: unless-stopped
    ports:
      - 3200:3200  # tempo
      - 4007:4317  # otlp grpc
    depends_on:
      - otel-collector      
    networks:
      otel:

Also adjust the Grafana depends_on section in the docker-compose.yml file to include the tempo service:

    depends_on:
      - prometheus
      - loki
      - tempo

And add Tempo to the volumes section:

volumes:
  prometheus:
  loki:
  tempo:
  grafana-data:

In the same way we added Prometheus as a datasource, we now need to add Tempo as a datasource to our Grafana instance. Open the grafana folder in the root of your project, navigate to provisioning/datasources, and open the datasources.yml file. Add the following code to the end of the file:

- name: Tempo
  type: tempo
  uid: tempo
  access: proxy
  url: http://tempo:3200
  basicAuth: false
  isDefault: false
  version: 1
  editable: false

To be sure that all the config settings are picked up, we first need to bring down our docker compose environment. So in the terminal run the following command:

docker-compose down
or
docker compose down

After the containers are stopped, we can start them again with the following command:

docker-compose up -d
or
docker compose up -d

When you now run the POST action of the Basket.API, you should be able to see it as a complete trace in Grafana. To get this overview, open the Explore section in Grafana and select Tempo as the datasource. Using the search option you should see the trace that was just created:

Grafana UI

Lab 5 - One more thing

As an extra bonus we will add a different type of service, just to show you that you can observe different kinds of technology stacks. Many production environments have some kind of message queue up and running, so we will add a RabbitMQ service to our monitoring stack to emulate this.

Services

To easily use RabbitMQ, we will first create a class library that we can reuse in different projects. It will allow us to set up a RabbitMQ connection and send and receive messages. In the root of the project folder, run the following command:

dotnet new classlib --name RabbitMQFactory

First we need to add the RabbitMQ nuget package to our project. We will use the RabbitMQ.Client package. Open the RabbitMQFactory.csproj file and add the following lines underneath the closing </PropertyGroup> tag:

<ItemGroup>
    <PackageReference Include="RabbitMQ.Client" Version="6.6.0" />
</ItemGroup>

Rename Class1.cs to RabbitMQFactory.cs and replace all of its code with the following:

using RabbitMQFactory.Models;
using RabbitMQFactory.Models.Interfaces;
using RabbitMQ.Client;

namespace RabbitMQFactory;

public class RabbitMQFactory
{
    private static ConnectionFactory _factory;
    private static IConnection _connection;
    private static IModel _channel;

    public static IBus CreateBus(string hostName)
    {
        _factory = new ConnectionFactory() { HostName = hostName, DispatchConsumersAsync = true };
        _connection = _factory.CreateConnection();
        _channel = _connection.CreateModel();

        return new RabbitMQBus(_channel);
    }
}

Add a new folder Models in the root of the project and inside it another folder Interfaces. Add a new file IBus.cs with the following content:

using RabbitMQ.Client;
using RabbitMQ.Client.Events;

namespace RabbitMQFactory.Models.Interfaces;

public interface IBus
{
    IBasicProperties GetBasicProperties();

    Task SendAsync<T>(string queue, T message, IBasicProperties basicProperties = null);

    Task ReceiveAsync<T>(string queue, Action<T, BasicDeliverEventArgs> onMessage);
}

Inside the folder Models add a new file BusType.cs with the following content:

namespace RabbitMQFactory.Models;

public class BusType
{
    public static string LocalHost { get; } = "localhost";
}

Again in the folder Models add a new file LocationRequest.cs with the following content:

namespace RabbitMQFactory.Models;

public class LocationRequest
{
    public double Latitude { get; set; }
    public double Longitude { get; set; }
}

Again in the folder Models add a new file QueueType.cs with the following content:

namespace RabbitMQFactory.Models;

public class QueueType
{
    public static string Processing { get; } = "Processing";
}

And for the last time in the folder Models add a new file RabbitMQBus.cs with the following content:

using System;
using System.Text;
using System.Text.Json;
using RabbitMQFactory.Models.Interfaces;
using RabbitMQ.Client;
using RabbitMQ.Client.Events;

namespace RabbitMQFactory.Models;

public class RabbitMQBus : IBus
{
    private IModel _channel;

    public RabbitMQBus(IModel channel)
    {
        _channel = channel;
    }

    public IBasicProperties GetBasicProperties() => _channel.CreateBasicProperties();

    public async Task ReceiveAsync<T>(string queue, Action<T, BasicDeliverEventArgs> onMessage)
    {
        _channel.QueueDeclare(
            queue: queue,
            durable: true,
            exclusive: false,
            autoDelete: false
        );

        var consumer = new AsyncEventingBasicConsumer(_channel);

        consumer.Received += async (s, e) =>
        {
            var jsonSpecified = Encoding.UTF8.GetString(e.Body.Span);
            var item = JsonSerializer.Deserialize<T>(jsonSpecified);

            onMessage(item, e);

            await Task.Yield();
        };

        _channel.BasicConsume(queue, true, consumer);

        await Task.Yield();
    }

    public async Task SendAsync<T>(string queue, T message, IBasicProperties basicProperties = null)
    {
        await Task.Run(() =>
        {
            _channel.QueueDeclare(queue, true, false, false);

            var properties = basicProperties ?? _channel.CreateBasicProperties();
            properties.Persistent = false;

            var output = JsonSerializer.Serialize(message);

            _channel.BasicPublish(string.Empty, queue, properties, Encoding.UTF8.GetBytes(output));
        });
    }
}

To be sure all code was added correctly, run a dotnet build in the root of the project. It should build without errors.
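RabbitMQBus moves messages across the queue as UTF-8 encoded JSON: SendAsync serializes and encodes, ReceiveAsync decodes and deserializes. The same round trip, illustrated in Python:

```python
import json

def publish(message: dict) -> bytes:
    # Mirrors SendAsync: JsonSerializer.Serialize + Encoding.UTF8.GetBytes
    return json.dumps(message).encode("utf-8")

def consume(body: bytes) -> dict:
    # Mirrors ReceiveAsync: Encoding.UTF8.GetString + JsonSerializer.Deserialize
    return json.loads(body.decode("utf-8"))

location = {"Latitude": 40.0, "Longitude": 51.0}
assert consume(publish(location)) == location  # lossless round trip
print(consume(publish(location)))  # {'Latitude': 40.0, 'Longitude': 51.0}
```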

Now that we have our RabbitMQ class library, we can create a new project that uses this library to receive messages and act upon them. For demo purposes we will use a console application as our RabbitMQ consumer. Again in the root of the project folder, run the following command:

dotnet new console --name RabbitMQConsumer --use-program-main

First we need to add some nuget packages and a project reference to the RabbitMQFactory class library. Open the RabbitMQConsumer.csproj file and add the following lines underneath the closing </PropertyGroup> tag:

  <ItemGroup>
    <PackageReference Include="Microsoft.Extensions.Configuration" Version="7.0.0" />
    <PackageReference Include="Microsoft.Extensions.Hosting" Version="7.0.1" />

    <PackageReference Include="OpenTelemetry" Version="1.6.0" />
    <PackageReference Include="OpenTelemetry.Extensions.Hosting" Version="1.6.0" />
    <PackageReference Include="OpenTelemetry.Exporter.OpenTelemetryProtocol" Version="1.6.0" />

    <PackageReference Include="Serilog.AspNetCore" Version="7.0.0" />
    <PackageReference Include="Serilog.Sinks.Console" Version="4.1.0" />
    <PackageReference Include="Serilog.Sinks.Grafana.Loki" Version="8.0.1" />
    <PackageReference Include="Serilog.Enrichers.Span" Version="2.3.0" />    
  </ItemGroup>
  <ItemGroup>
    <ProjectReference Include="..\RabbitMQFactory\RabbitMQFactory.csproj" />
  </ItemGroup>
  <ItemGroup>
    <Content Include="appsettings.json">
      <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
    </Content>
  </ItemGroup>

Open the Program.cs file and replace all of its code with the following:

namespace RabbitMQConsumer;

using System;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
using OpenTelemetry.Resources;
using OpenTelemetry.Trace;
using Serilog;

class Program
{
    private static IConfiguration Configuration { get; set; }

    static void Main(string[] args)
    {
        Configuration = new ConfigurationBuilder()
            .AddJsonFile("appsettings.json")
            .AddEnvironmentVariables()
            .AddCommandLine(args)
            .Build();

        CreateHostBuilder(args).Build().Run();
    }

    private static IHostBuilder CreateHostBuilder(string[] args) =>
        Host.CreateDefaultBuilder(args)
            .ConfigureServices((hostContext, services) =>
            {
                services.AddOpenTelemetry()
                    .WithTracing(builder =>
                    {
                        builder.SetResourceBuilder(ResourceBuilder.CreateDefault().AddService(Configuration.GetValue<string>("Otlp:ServiceName")))
                            .AddSource("APITracing")
                            .AddOtlpExporter(options => options.Endpoint = new Uri(Configuration.GetValue<string>("Otlp:Endpoint")));
                    });
                services.AddHostedService<Worker>();
            })
            .UseSerilog((hostingContext, loggerConfiguration) => loggerConfiguration.ReadFrom.Configuration(hostingContext.Configuration));
}

In the root of the project add a new file called Worker.cs with the following content:

namespace RabbitMQConsumer;

using System.Diagnostics;
using System.Text;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;
using OpenTelemetry.Context.Propagation;
using RabbitMQ.Client;
using RabbitMQ.Client.Events;
using RabbitMQFactory;
using RabbitMQFactory.Models.Interfaces;
using RabbitMQFactory.Models;
using OpenTelemetry;

public class Worker : BackgroundService
{
    private readonly ILogger<Worker> _logger;
    private readonly IBus _bus;

    //Important: the Activity source name must match the name passed to AddSource("APITracing") in the Web API's WithTracing setup
    private static readonly ActivitySource Activity = new("APITracing");
    private static readonly TextMapPropagator Propagator = new TraceContextPropagator();

    public Worker(ILogger<Worker> logger)
    {
        _logger = logger;
        _bus = RabbitMQFactory.CreateBus(BusType.LocalHost);
    }

    protected override async Task ExecuteAsync(CancellationToken cancellationToken)
    {
        await _bus.ReceiveAsync<LocationRequest>(QueueType.Processing, (message, args) =>
        {
            Task.Run(() => { ProcessMessage(message, args); }, cancellationToken);
        });
    }

    private void ProcessMessage(LocationRequest message, BasicDeliverEventArgs args)
    {
        var parentContext = Propagator.Extract(default, args.BasicProperties, ExtractTraceContextFromBasicProperties);
        Baggage.Current = parentContext.Baggage;

        using (var activity = Activity.StartActivity("Process Message", ActivityKind.Consumer, parentContext.ActivityContext))
        {
            AddActivityTags(activity);
            _logger.LogInformation($"Message received location: {message.Latitude} - {message.Longitude}");
        }
    }

    private IEnumerable<string> ExtractTraceContextFromBasicProperties(IBasicProperties props, string key)
    {
        try
        {
            if (props.Headers.TryGetValue(key, out var value))
            {
                var bytes = value as byte[];
                return new[] { Encoding.UTF8.GetString(bytes) };
            }
        }
        catch (Exception ex)
        {
            _logger.LogError($"Failed to extract trace context: {ex}");
        }

        return Enumerable.Empty<string>();
    }

    private void AddActivityTags(Activity activity)
    {
        activity?.SetTag("messaging.system", "rabbitmq");
        activity?.SetTag("messaging.destination_kind", "queue");
        activity?.SetTag("messaging.rabbitmq.queue", "sample");
        activity?.SetTag("messaging.destination", string.Empty);
        activity?.SetTag("messaging.rabbitmq.routing_key", QueueType.Processing);
    }
}
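The TraceContextPropagator used above reads and writes the W3C traceparent header (version-traceid-spanid-flags) in the RabbitMQ message headers. A minimal sketch of that format in Python (the real propagator also validates versions and handles the companion tracestate header):

```python
def inject(trace_id: str, span_id: str, headers: dict) -> None:
    # W3C Trace Context: version "00", sampled flag "01"
    headers["traceparent"] = f"00-{trace_id}-{span_id}-01"

def extract(headers: dict):
    # Split the header back into its four dash-separated fields.
    version, trace_id, span_id, flags = headers["traceparent"].split("-")
    return trace_id, span_id

headers = {}
inject("0af7651916cd43dd8448eb211c80319c", "b7ad6b7169203331", headers)
print(extract(headers))  # ('0af7651916cd43dd8448eb211c80319c', 'b7ad6b7169203331')
```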

Lastly we need to have our configuration inside an appsettings.json file. Add a new file called appsettings.json to the root of the project and add the following content:

{
    "Serilog": {
      "MinimumLevel": {
        "Default": "Information",
        "Override": {
          "Microsoft": "Warning",
          "Microsoft.AspNetCore": "Warning",
          "System": "Warning"
        }
      },
      "Using": [ "Serilog.Sinks.Console", "Serilog.Sinks.Grafana.Loki", "Serilog.Enrichers.Span" ],
      "Enrich": [ "WithSpan" ],
      "WriteTo": [
        {
          "Name": "Console",
          "Args": {
            "outputTemplate": "[{Timestamp:yyyy-MM-dd HH:mm:ss.fff}] [{Level}] [{SourceContext}] [{SpanId}] [{TraceId}] {Message:lj}{NewLine}{Exception}"
          }
        },
        {
          "Name": "GrafanaLoki",
          "Args": {
            "uri": "http://localhost:3100",
            "textFormatter": "Serilog.Sinks.Grafana.Loki.LokiJsonTextFormatter, Serilog.Sinks.Grafana.Loki",
            "outputTemplate": "[{Timestamp:yyyy-MM-dd HH:mm:ss.fff}] [{Level}] [{SourceContext}] [{SpanId}] [{TraceId}] {Message:lj}{NewLine}{Exception}",
            "labels": [
              {
                "key": "app",
                "value": "RabbitMQWorker"
              }
            ],
            "propertiesAsLabel": [
              "app"
            ]
          }
        }
      ]
    },
    "AllowedHosts": "*",
    "Otlp": {
      "ServiceName": "RabbitMQConsumer",
      "Endpoint": "http://localhost:4317"
    }
  }

Docker environment

RabbitMQ docker container

RabbitMQ can also run as a docker container. Open the docker-compose.yml file and add the following code above the networks section:

For windows and macos

  rabbitmq:
    # https://github.com/docker-library/rabbitmq
    image: rabbitmq:3.12-management-alpine
    container_name: rabbitmq
    restart: unless-stopped
    ports:
      - 5672:5672
      - 15672:15672
    networks:
      otel:

You can view the RabbitMQ dashboard by going to http://localhost:15672 in your browser. The default username and password are both guest.

With that you should be able to run the project. You won't see logs and traces for RabbitMQ yet, because we still need to send messages to the queue from the Web API project. So let's do that.

Open the Catalog.API project and in the csproj file add the following code beneath the last </ItemGroup> tag, to have a project reference to the RabbitMQFactory project:

  <ItemGroup>
    <ProjectReference Include="..\RabbitMQFactory\RabbitMQFactory.csproj">
      <GlobalPropertiesToRemove></GlobalPropertiesToRemove>
    </ProjectReference>
  </ItemGroup>

Then inside the Program.cs file add the following using statements:

using RabbitMQFactory.Models;

Also in the Program.cs file add the following code beneath the builder.Services.AddRouting(options => options.LowercaseUrls = true); line:

builder.Services.AddSingleton(sp => RabbitMQFactory.RabbitMQFactory.CreateBus(BusType.LocalHost));

Open the CatalogController.cs file and replace the current public IActionResult GetItem(string id) method with the following code (note that the method becomes async):

public async Task<IActionResult> GetItem(string id)
{
    using (var activity = Activity.StartActivity("RabbitMq Publish", ActivityKind.Producer))
    {
        var basicProperties = _bus.GetBasicProperties();
        AddActivityToHeader(activity, basicProperties);

        await _bus.SendAsync(QueueType.Processing, new LocationRequest()
        {
            Latitude = 40,
            Longitude = 51
        }, basicProperties);
    }
    return Ok(_products.FirstOrDefault(i => string.Equals(i.Id, id, StringComparison.InvariantCultureIgnoreCase)));
}

Also in the CatalogController.cs file add the following 2 new methods:

private void AddActivityToHeader(Activity activity, IBasicProperties props)
{
    try
    {
        Propagator.Inject(new PropagationContext(activity.Context, Baggage.Current), props, InjectContextIntoHeader);
        activity?.SetTag("messaging.system", "rabbitmq");
        activity?.SetTag("messaging.destination_kind", "queue");
        activity?.SetTag("messaging.rabbitmq.queue", "sample");
        activity?.SetTag("messaging.destination", string.Empty);
        activity?.SetTag("messaging.rabbitmq.routing_key", QueueType.Processing);
    }
    catch (Exception ex)
    {
        // Demo-only: in a real application, log the exception instead of swallowing it.
        var t = ex.Message;
    }
}

private void InjectContextIntoHeader(IBasicProperties props, string key, string value)
{
    props.Headers ??= new Dictionary<string, object>();
    props.Headers[key] = value;
}

Change the constructor signature of the CatalogController class to the following, keeping the existing code inside the constructor body:

public CatalogController(IBus bus)
{
    _bus = bus;
    // ... keep the existing constructor code here ...
}

We also need to add that private field to the controller class:

private readonly IBus _bus;
//Important: the Activity source name must match the name passed to AddSource("APITracing") in the Web API's WithTracing setup
private static readonly ActivitySource Activity = new("APITracing");
private static readonly TextMapPropagator Propagator = Propagators.DefaultTextMapPropagator;

And lastly add the needed using statements to the top of the CatalogController.cs file:

using System.Diagnostics;
using OpenTelemetry;
using RabbitMQ.Client;
using RabbitMQFactory.Models;
using RabbitMQFactory.Models.Interfaces;

If you now start all 3 projects and trigger a request from the Basket.API Swagger page, you should see trace info from both the Basket.API and the Catalog.API project, together with the RabbitMQ consumer's logs.
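Putting it together: the producer (CatalogController) injects the current trace context into the message headers, and the consumer (Worker) extracts it and starts its span under the same trace, which is what makes the single end-to-end trace in Grafana possible. The flow in miniature, with an in-memory queue standing in for RabbitMQ (illustrative Python sketch, not part of the lab):

```python
import queue

broker = queue.Queue()  # stands in for the RabbitMQ "Processing" queue

def producer(trace_id: str):
    # Mirrors AddActivityToHeader + Propagator.Inject on the API side:
    # the trace context travels inside the message headers.
    headers = {"traceparent": f"00-{trace_id}-b7ad6b7169203331-01"}
    broker.put({"headers": headers, "body": {"Latitude": 40, "Longitude": 51}})

def consumer() -> str:
    # Mirrors Propagator.Extract in the Worker: the consumer span joins
    # the producer's trace instead of starting a fresh one.
    message = broker.get()
    return message["headers"]["traceparent"].split("-")[1]

trace_id = "0af7651916cd43dd8448eb211c80319c"
producer(trace_id)
assert consumer() == trace_id  # same trace id on both sides -> one end-to-end trace
```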
