Azure Monitor managed service for Prometheus rule groups

Rules in Prometheus act on data as the data is collected, either to precompute values stored in the time series or to alert on predefined conditions in your collected metrics. Azure Monitor managed service for Prometheus provides predefined sets of each type of rule and allows you to create and manage custom rules using the Azure portal.

Rule group types

A Prometheus rule group is a collection of alert rules and/or recording rules that are evaluated together. Every rule must be a member of a single rule group. A rule group defines the scope of all the rules in the group and how often they're evaluated.

There are two types of Prometheus rules.

Type | Description
Alert | Alert rules let you create an Azure Monitor alert based on the results of a Prometheus Query Language (PromQL) query. Alerts fired by Azure Managed Prometheus alert rules are processed and trigger notifications in similar ways to other Azure Monitor alerts.
Recording | Recording rules allow you to precompute frequently needed or computationally expensive expressions and store their results as a new set of time series. Time series created by recording rules are ingested back into your Azure Monitor workspace as new Prometheus metrics.

Azure Managed Prometheus rule groups follow the structure and terminology of the open-source Prometheus rule groups. Rule names, expressions, labels, and annotations are all supported in Azure.

There are, however, some differences between Azure Managed Prometheus rule groups and open-source Prometheus rule groups. Azure Managed Prometheus rule groups are managed as Azure resources and include information required for resource management, such as the subscription and resource group where the Azure rule group should reside. Alert rules include dedicated properties, such as alert severity, action group association, and alert autoresolve configuration, that allow alerts to be processed like other Azure Monitor alerts.

Scope of a rule group

The scope of an Azure Managed Prometheus rule group defines which resources the rules in the group are applied to. Individual rules can't be applied directly to a Kubernetes cluster. The following table describes the different rule group scopes.

Scope | Description
All clusters in the workspace | All enabled rules in the group are applied to all clusters currently connected to the Azure Monitor workspace.
Specific cluster - Cluster name | All enabled rules in the group are applied only to the selected cluster.
Specific cluster - Cluster name in query | All enabled rules in the group are applied to clusters with the specified text in their name.

View Prometheus rule groups

There are multiple ways to view Prometheus rule groups and their rules in the Azure portal.

Rules in an Azure Monitor workspace

Select Rule groups from an Azure Monitor workspace in the Azure portal to view all rule groups in that workspace. You can expand any rule group to view the list of rules in that group. Select any group or rule to view its details.

Screenshot of Prometheus rule groups from Azure Monitor workspace.

All rules

From the Alerts page in the Monitor menu in the Azure portal, select Prometheus rule groups to view all rule groups in the subscriptions you have access to.

Screenshot that shows how to view Prometheus rule groups from the alerts screen.

This view identifies the workspace where the rule group is located, whether it's enabled, and the cluster if the rule group is limited to a specific cluster scope. Use the filters at the top of the screen to narrow the list of rule groups by various properties. You can delete multiple rule groups from this view by selecting them and then selecting Delete. This can be useful, for example, to clean up rule groups that are no longer needed after deleting a cluster.

Screenshot of all Prometheus rule groups.

Tip

You can also access this same view from the Alerts page of a Kubernetes cluster. This will set the initial filter to the rule groups scoped to that cluster.

Create Prometheus rule groups and rules

Open the All rules view described above and select + Create.

Screenshot that shows option to create a new Prometheus rule group.

Scope

Setting | Description
Azure Monitor workspace | The Azure Monitor workspace the rule group will query data from. This value can't be changed for an existing rule group.
Location | Location of the selected Azure Monitor workspace.
Cluster | Specifies whether the rule group applies to all clusters in the workspace or to a specific cluster. For a specific cluster, either select the cluster or enter text to match against cluster names.

Details

Setting | Description
Subscription | Subscription where the rule group resource will be created. This value can't be changed for an existing rule group.
Resource group | Resource group where the rule group resource will be created. This value can't be changed for an existing rule group.
Name | Name of the rule group resource. This name must be unique within the selected resource group. This value can't be changed for an existing rule group.
Description | Description of the rule group.
Evaluate every | How often the rules in the group are evaluated. Default is 1 minute.
Enabled | Enable or disable the rule group. Disabled rule groups are still created, but their rules run only when the group is enabled.
Labels | Optional label key/value pairs for the rule. These labels are added to the metric created by the rule.
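The Scope and Details settings end up as properties of the rule group resource. As a rough illustration, the following Python sketch assembles such a resource body as a plain dictionary. The resource type and API version are copied from the ARM example later in this article. The `interval`, `enabled`, and `description` property names, and the ISO 8601 duration format used for `interval`, are assumptions based on the `Microsoft.AlertsManagement/prometheusRuleGroups` resource schema. The helper names `minutes_to_iso8601` and `build_rule_group` are hypothetical.

```python
import json


def minutes_to_iso8601(minutes: int) -> str:
    """Format a whole number of minutes as an ISO 8601 duration, for example PT1M."""
    if minutes <= 0:
        raise ValueError("interval must be a positive number of minutes")
    hours, remainder = divmod(minutes, 60)
    duration = "PT"
    if hours:
        duration += f"{hours}H"
    if remainder:
        duration += f"{remainder}M"
    return duration


def build_rule_group(name, location, workspace_id, description="",
                     evaluate_every_minutes=1, enabled=True, rules=None):
    """Build a rule group resource body (hypothetical helper, not an SDK call)."""
    return {
        "name": name,
        "type": "Microsoft.AlertsManagement/prometheusRuleGroups",
        "apiVersion": "2023-03-01",
        "location": location,
        "properties": {
            "description": description,
            # The workspace that the rules query is always the first scope.
            "scopes": [workspace_id],
            "enabled": enabled,
            # "Evaluate every" setting; property name assumed from the schema.
            "interval": minutes_to_iso8601(evaluate_every_minutes),
            "rules": list(rules or []),
        },
    }


if __name__ == "__main__":
    group = build_rule_group(
        name="sampleRuleGroup",
        location="northcentralus",
        workspace_id=(
            "/subscriptions/<subscription-id>/resourcegroups/<resource-group-name>"
            "/providers/microsoft.monitor/accounts/<azure-monitor-workspace-name>"
        ),
        description="Sample Prometheus rule group",
    )
    print(json.dumps(group, indent=2))
```

The default of one minute mirrors the default shown for Evaluate every in the table above.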

Rules

Select Add recording rule or Add alert rule to add rules to the group. Each type of rule has different settings, as described below.

Recording rules

Setting | Description
Name | Name of the recording rule. This name is used for the metric created by the rule.
Enabled | Specifies whether the rule is enabled or disabled. Disabled rules will be created, but won't be evaluated until enabled.
Expression | PromQL expression that defines the rule. Select Run Query to see the results of the expression query visualized in the preview chart. Modify the preview time range to zoom in or out on the expression result history.
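Because the rule name becomes the name of the metric that the rule produces, it has to be a valid Prometheus metric name. The following Python sketch shows one way to represent a recording rule as it might appear in a rule group's `rules` list. The `record`, `expression`, and `enabled` property names are assumptions based on the rule group resource schema, the `recording_rule` helper is hypothetical, and the metric and expression are illustrative only.

```python
import re

# Prometheus metric names may contain ASCII letters, digits, underscores, and
# colons, and must not start with a digit.
METRIC_NAME = re.compile(r"[a-zA-Z_:][a-zA-Z0-9_:]*")


def recording_rule(name: str, expression: str, enabled: bool = True) -> dict:
    """Build a recording rule entry (hypothetical helper, not an SDK call)."""
    if not METRIC_NAME.fullmatch(name):
        raise ValueError(f"{name!r} is not a valid Prometheus metric name")
    if not expression.strip():
        raise ValueError("expression must not be empty")
    return {
        "record": name,            # Name setting: the metric the rule creates
        "expression": expression,  # Expression setting: the PromQL query
        "enabled": enabled,        # Enabled setting
    }


if __name__ == "__main__":
    rule = recording_rule(
        "namespace:container_cpu_usage_seconds_total:sum_rate",
        "sum by (namespace) (rate(container_cpu_usage_seconds_total[5m]))",
    )
    print(rule)
```

The example name follows the common `level:metric:operations` naming convention for recording rules, which is a convention rather than a requirement.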

Alert rules

Setting | Description
Name | Name of the alert rule. This name is the name of alerts fired by the rule.
Severity | Severity value for alerts fired by this rule.
Expression | PromQL expression that defines the rule. Select Run Query to see the results of the expression query visualized in the preview chart. Modify the preview time range to zoom in or out on the expression result history.
Wait for | Time period between when the alert expression first becomes true and when the alert is fired.
Labels | Optional label key/value pairs for the rule. These labels are added to the alerts fired by the rule.
Annotations | Optional annotation key/value pairs for the rule. These annotations are added to the alerts fired by the rule.
Action groups | Action groups that define the response to the alert being fired.
Enabled | Specifies whether the rule is enabled or disabled. Disabled rules will be created, but won't be evaluated until enabled.
Automatically resolve alerts | Automatically resolve alerts if the rule condition is no longer true during the Time to auto-resolve period.
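To make the relationship between these settings and the rule definition concrete, the following Python sketch builds an alert rule entry as a dictionary. The property names (`alert`, `for`, `severity`, `resolveConfiguration`, `actions`, `actionGroupId`) and the ISO 8601 duration values are assumptions based on the rule group resource schema. The `alert_rule` helper, the example expression, and the default durations are hypothetical. The severity check assumes the Azure Monitor severity range of 0 through 4.

```python
def alert_rule(name, expression, severity, wait_for="PT5M", labels=None,
               annotations=None, action_group_ids=(), enabled=True,
               auto_resolve=True, time_to_resolve="PT10M"):
    """Build an alert rule entry (hypothetical helper, not an SDK call)."""
    if severity not in (0, 1, 2, 3, 4):
        raise ValueError("severity must be an integer from 0 to 4")
    return {
        "alert": name,                        # Name setting
        "expression": expression,             # Expression setting (PromQL)
        "for": wait_for,                      # Wait for setting
        "severity": severity,                 # Severity setting
        "enabled": enabled,                   # Enabled setting
        "labels": dict(labels or {}),         # Added to fired alerts
        "annotations": dict(annotations or {}),
        # Automatically resolve alerts and Time to auto-resolve settings.
        "resolveConfiguration": {
            "autoResolved": auto_resolve,
            "timeToResolve": time_to_resolve,
        },
        # Action groups setting: one entry per action group resource ID.
        "actions": [{"actionGroupId": group_id} for group_id in action_group_ids],
    }


if __name__ == "__main__":
    rule = alert_rule(
        name="TargetDown",
        expression="up == 0",
        severity=2,
        annotations={"summary": "A scrape target is unreachable."},
        action_group_ids=[
            "/subscriptions/<subscription-id>/resourcegroups/<resource-group-name>"
            "/providers/microsoft.insights/actiongroups/<action-group-name>"
        ],
    )
    print(rule)
```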

Configure the rule group scope

On the Scope tab:

  1. Select the Azure Monitor workspace from a list of workspaces that are available in your subscriptions. The rules in this group query data from this workspace.

  2. To limit your rule group to a cluster scope, select the Specific cluster option:

    • Select the cluster from the list of clusters that are already connected to the selected Azure Monitor workspace.
    • The default Cluster name value is entered for you. Change this value only if you changed your cluster label value by using cluster_alias.
  3. Select Next to configure the rule group details.

    Screenshot that shows configuration of Prometheus rule group scope.

Convert Prometheus rules file to a Managed Prometheus rule group

If you have a Prometheus rules configuration file in YAML format, you can convert it to an ARM template for an Azure Managed Prometheus rule group using the az-prom-rules-converter utility. The rules file can contain the definition of one or more rule groups.

In addition to the rules file, the utility requires other properties needed to create the Azure Prometheus rule groups including subscription, resource group, location, target Azure Monitor workspace, target cluster ID and name, and action groups. The utility creates a template file that you can deploy using any standard methods for deploying ARM templates.
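The following Python sketch illustrates the kind of field mapping such a conversion involves. It is not the implementation or the command-line interface of az-prom-rules-converter. It assumes that the rules file has already been parsed into a dictionary, that the Azure rule format uses `expression` where open-source Prometheus uses `expr`, and that durations are expressed in ISO 8601 form. All function names are hypothetical, and only the `d`, `h`, `m`, and `s` duration units are handled.

```python
import re

_DURATION_PART = re.compile(r"(\d+)([dhms])")


def prom_duration_to_iso8601(value: str) -> str:
    """Convert a Prometheus duration such as '5m' or '1h30m' to ISO 8601."""
    matches = _DURATION_PART.findall(value)
    parts = {unit: int(number) for number, unit in matches}
    # Reject empty input, unsupported units (such as 'ms'), and repeated units.
    if (not matches or len(parts) != len(matches)
            or "".join(n + u for n, u in matches) != value):
        raise ValueError(f"unsupported duration: {value!r}")
    date = f"{parts['d']}D" if "d" in parts else ""
    time = "".join(f"{parts[u]}{u.upper()}" for u in "hms" if u in parts)
    return "P" + date + ("T" + time if time else "")


def convert_rule(rule: dict) -> dict:
    """Map one open-source rule to the assumed Azure rule shape."""
    converted = {"expression": rule["expr"]}
    if "record" in rule:
        converted["record"] = rule["record"]
    elif "alert" in rule:
        converted["alert"] = rule["alert"]
        if "for" in rule:
            converted["for"] = prom_duration_to_iso8601(rule["for"])
    else:
        raise ValueError("rule must define either 'record' or 'alert'")
    for key in ("labels", "annotations"):
        if key in rule:
            converted[key] = dict(rule[key])
    return converted


def convert_group(group: dict) -> dict:
    """Map one parsed rule group to a partial rule group resource body."""
    properties = {"rules": [convert_rule(rule) for rule in group["rules"]]}
    if "interval" in group:
        properties["interval"] = prom_duration_to_iso8601(group["interval"])
    return {"name": group["name"], "properties": properties}


if __name__ == "__main__":
    parsed_group = {
        "name": "example",
        "interval": "1m",
        "rules": [
            {"record": "job:up:sum", "expr": "sum by (job) (up)"},
            {"alert": "TargetDown", "expr": "up == 0", "for": "5m",
             "labels": {"team": "platform"}},
        ],
    }
    print(convert_group(parsed_group))
```

The sketch deliberately leaves out the Azure-specific values that the utility takes as extra input, such as the subscription, location, workspace, cluster, and action groups.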

Limit rules to a specific cluster

You can optionally limit the rules in a rule group to query data originating from a single specific cluster by adding a cluster scope to your rule group or by using the rule group clusterName property. Limit rules to a single cluster if your Azure Monitor workspace contains a large amount of data from multiple clusters. In such a case, there's a concern that running a single set of rules on all the data might cause performance or throttling issues. By using the cluster scope, you can create multiple rule groups, each configured with the same rules, with each group covering a different cluster.

To limit your rule group to a cluster scope using an ARM template, add the Azure resource ID value of your cluster to the rule group scopes[] list. The scopes list must still include the Azure Monitor workspace resource ID. The following cluster resource types are supported as a cluster scope:

  • Azure Kubernetes Service clusters (Microsoft.ContainerService/managedClusters)
  • Azure Arc-enabled Kubernetes clusters (Microsoft.Kubernetes/connectedClusters)
  • Azure connected appliances (Microsoft.ResourceConnector/appliances)

In addition to the cluster ID, you can configure the clusterName property of your rule group. The clusterName property must match the cluster label that's added to your metrics when scraped from a specific cluster. By default, this label is set to the last part (resource name) of your cluster ID. If you changed this label by using the cluster_alias setting in your cluster scraping ConfigMap, you must include the updated value in the rule group clusterName property. If your scraping uses the default cluster label value, the clusterName property is optional.

Here's an example of a rule group that's configured to limit its queries to a specific cluster:

{
    "name": "sampleRuleGroup",
    "type": "Microsoft.AlertsManagement/prometheusRuleGroups",
    "apiVersion": "2023-03-01",
    "location": "northcentralus",
    "properties": {
         "description": "Sample Prometheus Rule Group limited to a specific cluster",
         "scopes": [
             "/subscriptions/<subscription-id>/resourcegroups/<resource-group-name>/providers/microsoft.monitor/accounts/<azure-monitor-workspace-name>",
             "/subscriptions/<subscription-id>/resourcegroups/<resource-group-name>/providers/microsoft.containerservice/managedclusters/<myClusterName>"
         ],
         "clusterName": "<myClusterName>",
         "rules": [
             {
                ...
             }
         ]
    }
}        

If neither the cluster ID scope nor the clusterName property is specified for a rule group, the rules in the group query data from all clusters in the workspace.
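The scoping rules in this section can be summarized in a short Python sketch: the workspace ID is always present in `scopes`, a cluster ID is optional, and `clusterName` defaults to the last segment of the cluster ID unless a `cluster_alias` value overrides it. The helper name `cluster_scope_properties` and its parameters are hypothetical, and the resource type check is only a simple case-insensitive substring match against the three supported types listed above.

```python
from typing import Optional

SUPPORTED_CLUSTER_TYPES = (
    "microsoft.containerservice/managedclusters",
    "microsoft.kubernetes/connectedclusters",
    "microsoft.resourceconnector/appliances",
)


def cluster_scope_properties(workspace_id: str,
                             cluster_id: Optional[str] = None,
                             cluster_alias: Optional[str] = None) -> dict:
    """Build the scopes and clusterName properties (hypothetical helper)."""
    # The Azure Monitor workspace must always be part of the scopes list.
    properties = {"scopes": [workspace_id]}
    if cluster_id is not None:
        normalized = cluster_id.lower()
        if not any(f"/providers/{t}/" in normalized for t in SUPPORTED_CLUSTER_TYPES):
            raise ValueError(f"unsupported cluster resource type: {cluster_id}")
        properties["scopes"].append(cluster_id)
    if cluster_alias:
        # The cluster label was changed with cluster_alias, so it must be explicit.
        properties["clusterName"] = cluster_alias
    elif cluster_id is not None:
        # Default cluster label: the resource name at the end of the cluster ID.
        properties["clusterName"] = cluster_id.rstrip("/").rsplit("/", 1)[-1]
    return properties


if __name__ == "__main__":
    workspace = (
        "/subscriptions/<subscription-id>/resourcegroups/<resource-group-name>"
        "/providers/microsoft.monitor/accounts/<azure-monitor-workspace-name>"
    )
    cluster = (
        "/subscriptions/<subscription-id>/resourcegroups/<resource-group-name>"
        "/providers/microsoft.containerservice/managedclusters/<myClusterName>"
    )
    print(cluster_scope_properties(workspace))           # all clusters
    print(cluster_scope_properties(workspace, cluster))  # one cluster
```

Calling the helper with only a workspace ID produces no `clusterName`, which corresponds to rules that query data from all clusters in the workspace.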

Configure the rule group details

Screenshot that shows configuration of a Prometheus rule group recording rule.

Screenshot that shows configuration of Prometheus rule group alert rule.

Note

For alert rules, the expression query typically returns only time series that fulfill the expression condition. If the preview chart isn't shown and you get the message "The query returned no result," it's likely that the condition wasn't fulfilled in the preview time range.

Finish creating the rule group

  1. On the Tags tab, set any required Azure resource tags to be added to the rule group resource.

    Screenshot that shows the Tags tab when creating a new alert rule.

  2. On the Review + create tab, the rule group is validated and any issues are reported. On this tab, you can also select the View automation template option and download the template for the group that you're about to create.

  3. After validation passes and you review the settings, select Create.

    Screenshot that shows the Review + create tab when you create a new alert rule.

  4. You can follow up on the rule group deployment to make sure that it finishes successfully or to be notified of any error.

View the resource health states of your Prometheus rule groups

You can view the resource health state of your Prometheus rule groups in the portal. Resource health helps you detect problems in your rule groups, such as incorrect configuration or query throttling.

  1. In the portal, go to the overview of the Prometheus rule group that you want to monitor.

  2. On the left pane, under Help, select Resource health.

    Screenshot that shows how to view the resource health state of a Prometheus rule group.

  3. On the Resource health pane, you can see the current availability state of the rule group. You can also see a history of recent resource health events, up to the last 30 days.

    Screenshot that shows how to view the resource health history of a Prometheus rule group.

    • If the rule group is marked as Available, it's working as expected.
    • If the rule group is marked as Degraded, one or more rules in the group aren't working as expected. The rule query might be throttled, or other issues might cause the rule evaluation to fail. Expand the status entry for more information on the detected problem, suggestions for mitigation, or further troubleshooting.
    • If the rule group is marked as Unavailable, the entire rule group isn't working as expected. There might be a configuration issue (for example, the Azure Monitor workspace can't be detected) or internal service issues. Expand the status entry for more information on the detected problem, suggestions for mitigation, or further troubleshooting.
    • If the rule group is marked as Unknown, the entire rule group is disabled or is in an unknown state.

Disable and enable rule groups

To enable or disable a rule group, select the rule group in the Azure portal. Select either Enable or Disable to change its status.