Visualizing F&O telemetry data with Grafana
Below is a step-by-step breakdown of how I connected Application Insights to Grafana to visualize batch job errors using a custom metric.
Prerequisites
To follow these steps, make sure you have:
- An Application Insights instance receiving telemetry from your F&O environment
- An Azure Managed Grafana workspace in your Azure subscription
- Sufficient permissions to query logs and create dashboards
Step 1: Build the KQL query in Application Insights
Start in the Logs blade of your Application Insights resource. The telemetry I wanted to visualize was coming through a custom metric named BatchJobErrorCount. The metric contains a JSON array of batch job error objects.
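The exact payload depends on how your F&O telemetry emits the metric, but as an illustration (field names taken from the query below; the values and company codes are invented), the BatchesInError dimension might look like this:

```json
[
  { "BatchJobId": "5637144580", "Status": "Error", "BatchJobName": "Sales order invoicing", "Company": "USMF" },
  { "BatchJobId": "5637144581", "Status": "Error", "BatchJobName": "Warehouse release", "Company": "DEMF" }
]
```

Each object in the array becomes one row after the query expands it.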
Here’s the query I used:
```kusto
customMetrics
| where name == "BatchJobErrorCount"
| extend customDimensions = todynamic(customDimensions)
| extend batchesRaw = tostring(customDimensions.BatchesInError)
| extend batchesInError = parse_json(batchesRaw)
| mv-expand batch = batchesInError
| where tostring(batch.Company) != ""
| project timestamp, BatchJobId = tostring(batch.BatchJobId), Status = tostring(batch.Status), BatchJobName = tostring(batch.BatchJobName), Company = tolower(tostring(batch.Company))
| summarize Count = count() by bin(timestamp, 1h), Company
```
This expands the custom payload, filters the entries, and aggregates error counts per company per hour.
After confirming the results in the table view, I used the Chart tab to preview the output using a stacked column chart. This provided a quick way to verify that data was correctly grouped and visually aligned with what I expected.
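If you prefer to stay in the query itself, KQL's render operator can draw the same preview without switching to the Chart tab:

```kusto
// Appended after the final summarize of the query above;
// draws a column chart directly in the Logs blade
| render columnchart
```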
Step 2: Export the query to Azure Managed Grafana
Once the query was producing the expected results, I used the built-in integration to export it to Grafana.
From the Logs view:
- Click the Save dropdown above the query editor
- Select “Pin to Grafana dashboard”
- Choose your Azure Managed Grafana workspace
- Specify a dashboard name and folder, or pick an existing dashboard
Grafana will receive the query and automatically configure the data source using Azure Monitor.
For more details on this integration, you can refer to the official documentation: https://learn.microsoft.com/en-us/azure/azure-monitor/visualize/grafana-plugin
Step 3: Adjust the query for Grafana and configure the panel
For a time series panel, Grafana expects the query result to be:
- Aggregated over time (bin(timestamp, ...))
- Numeric (Count)
- Sorted by time (recommended)
The query is the same as before, with an explicit sort added at the end:

```kusto
customMetrics
| where name == "BatchJobErrorCount"
| extend customDimensions = todynamic(customDimensions)
| extend batchesRaw = tostring(customDimensions.BatchesInError)
| extend batchesInError = parse_json(batchesRaw)
| mv-expand batch = batchesInError
| where tostring(batch.Company) != ""
| project timestamp, BatchJobId = tostring(batch.BatchJobId), Status = tostring(batch.Status), BatchJobName = tostring(batch.BatchJobName), Company = tolower(tostring(batch.Company))
| summarize Count = count() by bin(timestamp, 1h), Company
| order by timestamp asc
```
Why this matters:
Grafana uses the time column as a reference for aligning data points. Adding order by timestamp asc avoids warnings and ensures the chart renders correctly, especially when displaying long time series.
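One optional refinement, which was not part of my original setup: the Azure Monitor data source in Grafana supports time-range macros, so the panel can follow the dashboard's time picker instead of always querying the full retention window. A sketch of where the macro would go:

```kusto
customMetrics
| where name == "BatchJobErrorCount"
| where $__timeFilter(timestamp)   // Grafana substitutes the dashboard's selected time range
// ...rest of the query unchanged...
| order by timestamp asc
```

Note that `$__timeFilter` is resolved by Grafana before the query is sent, so it will not work when pasted back into the Application Insights Logs blade.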
In the panel settings:
- Change the visualization type to Time series
- Use timestamp as the X-axis
- Use Count as the value
- Split series by Company
- Optionally enable stacking to group total error volume by hour
Once adjusted, the panel should reflect the correct time-based distribution of batch job errors, split by legal entity.