Monitoring Batch Job Errors in D365FO using Application Insights
While Dynamics 365 Finance and Operations (D365FO) includes native integration with Application Insights, the out-of-the-box capabilities still have limitations—especially when it comes to monitoring the real-time status of internal processes like batch jobs.
Although batch job task exceptions are tracked, they don’t always reflect the current status of a batch job. Tasks may have failed and later been retried successfully, meaning the job is actually healthy by the time we query it—yet the historical exceptions remain in telemetry.
In this post, I’ll show an alternative technique that extends the telemetry pipeline by programmatically sending custom metrics and structured data to App Insights. We’ll walk through a lightweight batch class that scans for failed batch jobs across all companies and pushes that data into App Insights as both a metric and JSON payload, enabling Kusto-based analysis, dashboards, and alerts.
This approach is flexible and can easily be adapted to monitor the real-time status of other processes within D365FO.
The Batch Class: SendTickMetricsToAppInsights
class SendTickMetricsToAppInsights extends RunBaseBatch implements BatchRetryable
{
    public void run()
    {
        SysApplicationInsightsTelemetryLogger logger = SysApplicationInsightsTelemetryLogger::instance();
        BatchJob    batchJob;
        container   c = conNull();

        // Scan batch jobs across all companies and collect those currently in Error status
        while select crosscompany RecId, Status, Company, Caption from batchJob
            where batchJob.Status == BatchStatus::Error
        {
            // Note: a caption containing double quotes would break this naive JSON;
            // escape the caption if your job names can contain special characters
            str jsonString = strFmt('{"BatchJobId": "%1", "Status": "%2", "Company": "%3", "BatchJobName": "%4"}',
                batchJob.RecId, enum2str(batchJob.Status), batchJob.Company, batchJob.Caption);
            c += jsonString;
        }

        // Build a JSON array from the collected objects and send it to App Insights
        // as a dimension alongside the error-count metric
        str   jsonArray = '[' + con2Str(c, ',') + ']';
        int64 count_    = conLen(c);

        logger.trackMetricWithDimensions('BatchJobErrorCount', count_,
            'LegalEntity', curExt(),
            'BatchName', 'SendTickMetricsToAppInsights',
            'BatchesInError', jsonArray);
    }

    public boolean isRetryable()
    {
        return false;
    }
}
This job:
- Runs cross-company
- Collects all batch jobs in Error status
- Formats the results as a JSON array
- Sends a metric to App Insights with dimensions and the full JSON payload
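To make the payload-assembly step concrete, here is the same logic sketched in Python with made-up job records. The data and field names are hypothetical; the point is that a proper JSON serializer escapes embedded quotes, which the strFmt-based concatenation in the X++ job does not:

```python
import json

# Hypothetical failed jobs, as the cross-company scan might return them
failed_jobs = [
    {"BatchJobId": "5637144576", "Status": "Error",
     "Company": "usmf", "BatchJobName": 'Invoice "retry" post'},
    {"BatchJobId": "5637144577", "Status": "Error",
     "Company": "demf", "BatchJobName": "Warehouse wave processing"},
]

# json.dumps escapes the embedded quotes in the first caption,
# keeping the array parseable by Kusto's parse_json()
payload = json.dumps(failed_jobs)
error_count = len(failed_jobs)
```

If your batch job captions are guaranteed to be plain text, the strFmt approach is fine; otherwise an escaping pass (or a JSON helper class) keeps the `BatchesInError` dimension parseable on the Kusto side.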
Scheduling It
I scheduled this job to run every 2 minutes using the built-in batch scheduling. This frequency gives us near-real-time error monitoring with very low overhead.
What It Looks Like in App Insights
customMetrics
| where name == "BatchJobErrorCount"
| extend Company = tostring(customDimensions.LegalEntity),
BatchName = tostring(customDimensions.BatchName),
BatchesInError = parse_json(tostring(customDimensions.BatchesInError))
| project timestamp, Company, BatchName, value, BatchesInError
Or a simple time series by company:
customMetrics
| where name == "BatchJobErrorCount"
| summarize count_=sum(value) by bin(timestamp, 5m), LegalEntity=tostring(customDimensions.LegalEntity)
| render timechart
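What the summarize step computes can be sketched in Python with made-up telemetry rows (timestamps as epoch seconds, field names mirroring the query's projections); each 5-minute bin per legal entity accumulates the summed metric value, which is exactly what the timechart plots:

```python
from collections import defaultdict

# Hypothetical customMetrics rows: (epoch_seconds, legal_entity, value)
rows = [
    (1700000000, "usmf", 2),
    (1700000060, "usmf", 3),   # same 5-minute bin as the row above
    (1700000400, "demf", 1),
]

BIN = 5 * 60  # bin(timestamp, 5m)

# summarize count_ = sum(value) by bin(timestamp, 5m), LegalEntity
series = defaultdict(int)
for ts, entity, value in rows:
    series[(ts - ts % BIN, entity)] += value

# Each key is (bin start, legal entity); each value is the summed error count
```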
Final Thoughts
This technique is lightweight, extensible, and gives us a more accurate view of the current state of our batch processes. By sending structured telemetry to Application Insights, we unlock powerful analysis capabilities using Kusto, and build a strong foundation for observability in D365FO.
But tracking is only part of the picture. In a follow-up post, I’ll dive into how we can leverage Application Insights' signals and alerting capabilities—for example, triggering alerts when a spike in batch job errors is detected, or when a specific legal entity starts experiencing failures. This enables proactive monitoring and faster incident response across environments.
Stay tuned!