Anomaly Detection for Voice FAQ

This article addresses common questions regarding the results and behavior of Bandwidth's Anomaly Detection for Voice feature.

Why didn't an alert trigger even though the observed traffic line went outside the expected bounds?

An alert is only generated when three specific conditions are met:

  1. The observed value falls outside the upper or lower bounds set by the solution.

  2. The minimum traffic volume requirement is met.

  3. The minimum anomaly score requirement (the threshold) is met.

The solution is designed to alert only on deviations that are both significant and impactful. In some cases, even if the observed traffic falls outside the bounds, the anomaly score may not be high enough. For instance, if the score reaches 70 but the default minimum required score (the threshold) is 80, no alert will be generated.

Furthermore, an issue must impact enough calls to be worth alerting on (the volume requirement). If an issue affects only a very small number of calls (for example, 10 failed calls when 3 were expected, out of 10,000 total calls), it is not considered a substantial enough traffic impact to warrant an alert, even if the score is high and the traffic is clearly outside the expected bounds.
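As a rough illustration of how the three conditions combine, here is a minimal Python sketch. The function name, the parameter names, and the placeholder minimum call volume are assumptions made for this example; only the default score threshold of 80 comes from the documented default.

```python
def should_alert(observed, lower_bound, upper_bound,
                 anomaly_score, affected_calls,
                 score_threshold=80, min_affected_calls=100):
    """Illustrative only: all three conditions must hold for an alert.

    The threshold default of 80 matches the documented default; the
    minimum affected-call count of 100 is a placeholder, not the real value.
    """
    outside_bounds = observed < lower_bound or observed > upper_bound
    enough_volume = affected_calls >= min_affected_calls
    high_enough_score = anomaly_score >= score_threshold
    return outside_bounds and enough_volume and high_enough_score


# The example from above: a score of 70 never alerts against the default
# threshold of 80, even though the observed traffic is outside the bounds.
print(should_alert(observed=40, lower_bound=50, upper_bound=150,
                   anomaly_score=70, affected_calls=500))   # False
print(should_alert(observed=40, lower_bound=50, upper_bound=150,
                   anomaly_score=85, affected_calls=500))   # True
```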

The solution is alerting because of a change in expected traffic, not an error. Why does it seem to alert "incorrectly"?

The solution is designed to alert on sudden traffic changes, even if those changes are healthy shifts in customer behavior. The machine learning model cannot inherently distinguish between a planned, healthy increase or decrease in traffic (a traffic shift) and an unhealthy drop in traffic (like an outage). From the model's perspective, both scenarios represent a sudden disappearance of expected traffic. Alerting on this drop is expected behavior.

If a traffic shift occurs, how long will it take for the solution to adapt to the new pattern?

The solution works quickly to adjust its expectations following a persistent traffic shift. Broadly, here’s the expected adjustment timeline:

  • 0–1 days after: The solution views the change as anomalous behavior and will likely alert.

  • 2–6 days after: The solution adjusts its boundaries, meaning it should no longer alert, but it has not yet fully learned the new expected traffic value.

  • 7+ days after: The solution has successfully learned the new traffic pattern, and monitoring generally returns to normal.

The solution maintains some memory of the prior traffic patterns for up to three weeks, so you may occasionally see areas of less certainty during that period, though this should be rare.
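For quick reference, the same timeline can be restated as a small lookup. This is only an illustration of the stages described above (including the roughly three-week memory window), not code from the product.

```python
def expected_behavior(days_since_shift):
    """Restates the documented adjustment timeline for a persistent
    traffic shift (illustration only, not the product's logic)."""
    if days_since_shift <= 1:
        return "Shift treated as anomalous; an alert is likely."
    if days_since_shift <= 6:
        return "Bounds adjusted; alerts stop, but the new level is not fully learned."
    if days_since_shift <= 21:
        return "New pattern learned; rare areas of less certainty may remain."
    return "Monitoring fully back to normal."


for day in (0, 3, 10, 30):
    print(day, expected_behavior(day))
```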

How does the solution handle holidays?

The feature attempts to avoid unnecessary alerts during holidays. It uses a list of US federal holidays, along with significant additions like Good Friday and Easter.

The solution applies a separate "holiday behavior" model not just on the holiday itself, but also on business days immediately preceding or following it (e.g., the Friday after July 4th or the day before Christmas Eve).

On days treated with holiday behavior, the solution assumes that traffic volume will be much lower than normal but that the underlying traffic pattern will remain the same. The solution estimates how much the traffic is scaled down throughout the day and reduces its predictions accordingly. For example, if a customer’s traffic runs at 80% of its normal volume on a holiday, the solution scales its expectations down to that 80% level. If the traffic then suddenly drops by 95% or 100%, that would be far outside the expected holiday reduction, and an alert would be triggered.
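To make the holiday scaling concrete, here is a minimal sketch of the arithmetic described above. The function name and the specific call volumes are illustrative assumptions, not the feature's actual implementation.

```python
def holiday_expectation(normal_expected, holiday_scale):
    """Scale the normal expectation by the estimated holiday factor.

    holiday_scale is the fraction of normal volume expected on the holiday
    (e.g. 0.8 means traffic runs at 80% of normal). Illustration only; the
    real solution estimates this factor throughout the day.
    """
    return normal_expected * holiday_scale


normal_expected = 10_000          # calls the model would normally expect
expected_on_holiday = holiday_expectation(normal_expected, holiday_scale=0.8)
print(expected_on_holiday)        # 8000 calls expected on the holiday

# A 95-100% drop is far below even the scaled-down expectation,
# so it would still look anomalous and could trigger an alert.
observed = normal_expected * 0.05  # traffic dropped by 95%
print(observed < expected_on_holiday)  # True: well below the holiday expectation
```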

What is the anomaly score?

The anomaly score is a number ranging from 0 to 100. It represents the confidence the model has that an anomaly has occurred in the monitored traffic.

How is the anomaly score determined?

The score is influenced by several significant factors, which are combined in the illustrative sketch after this list:

  • Difference between expected and observed values: The larger the gap between the traffic volume the solution expected to see and the volume actually observed, the higher the score will be. For example, a 50% drop in call volume scores higher than a 30% drop, assuming all other factors are equal.

  • Duration/persistence of the issue: The score increases the longer the traffic remains outside the expected bounds. This helps ensure that issues that fail to recover quickly are detected, even if the initial difference wasn't massive.

  • Consistency of the traffic pattern: If the typical data pattern is highly consistent (smooth traffic curve), the scoring will be more sensitive. If the data is inherently inconsistent or erratic, the solution applies a "score penalty" because it has less confidence that an extreme-looking movement is truly an anomaly. This measure of consistency updates over time.

  • Traffic volume (impact): The solution is more sensitive to changes in patterns that have a higher volume of traffic. An issue impacting a large number of calls will score higher than the same percentage drop impacting a small number of calls, because the customer impact is greater the more traffic is affected.
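The toy sketch below combines the four factors above into a single 0–100 number. Every weight, constant, and formula in it is an assumption invented for illustration; it is not Bandwidth's scoring model and will not reproduce real scores.

```python
def anomaly_score(expected, observed, hours_outside_bounds,
                  pattern_consistency, traffic_volume,
                  volume_reference=10_000):
    """Toy scoring sketch (not the real model).

    pattern_consistency: 0.0 (erratic) .. 1.0 (very smooth curve).
    volume_reference: a made-up volume at which the impact factor saturates.
    """
    # 1. Relative gap between expected and observed traffic.
    gap = abs(expected - observed) / max(expected, 1)

    # 2. Persistence: longer excursions outside the bounds score higher.
    persistence = min(hours_outside_bounds / 6.0, 1.0)

    # 3. Erratic patterns get a score penalty (less confidence).
    consistency_factor = 0.5 + 0.5 * pattern_consistency

    # 4. Higher-volume patterns are scored more sensitively.
    impact = min(traffic_volume / volume_reference, 1.0)

    raw = gap * (0.5 + 0.5 * persistence) * consistency_factor * (0.5 + 0.5 * impact)
    return round(min(raw, 1.0) * 100)


# A 50% drop scores higher than a 30% drop, all other factors being equal.
print(anomaly_score(10_000, 5_000, 3, 0.9, 10_000))  # larger gap -> higher score
print(anomaly_score(10_000, 7_000, 3, 0.9, 10_000))
```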

Why is the default threshold 80?

The "threshold" displayed in the user interface is the minimum anomaly score required to trigger an alert. The default value of 80 was empirically derived from Bandwidth’s experience and experimentation with voice data. This high threshold ensures that when a customer receives an alert, the solution is highly confident that the alarm represents a significant deviation in the monitored traffic. While this doesn't guarantee every alarm is an actionable problem, it means the alert will only fire if there is something substantial worth checking.

When should you adjust the default threshold?

You should view the score threshold as a "sensitivity knob":

  • A higher threshold requires a larger difference or longer duration for an alert to trigger.

  • A lower threshold means less of a difference is required to trigger an alert.

The valid range for thresholds is generally from 0 to 100. We recommend constraining the score threshold to the range of 50 to 90 (see the sketch after this list):

  • Setting the threshold below 50 is likely to result in more alerts than desired. Notably, if you set the score threshold lower than 50, you might receive alerts solely because an observed value has been outside the bounds repeatedly, even if the actual difference between expected and observed traffic is minimal.

  • Setting the threshold higher than 90 is likely to cause even particularly significant issues to be missed.

  • Setting the threshold to 101 will completely disable alerting for that metric.
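The guidance above can be summarized in a small illustrative helper. It is not part of Bandwidth's API or user interface; it simply restates the documented ranges.

```python
def describe_threshold(threshold):
    """Illustrative check of a score threshold against the guidance above."""
    if threshold == 101:
        return "Alerting disabled for this metric."
    if not 0 <= threshold <= 100:
        raise ValueError("Threshold must be between 0 and 100 (or 101 to disable).")
    if threshold < 50:
        return "Very sensitive: expect extra alerts, including for minor deviations."
    if threshold > 90:
        return "Very insensitive: even significant issues may be missed."
    return "Within the recommended 50-90 range."


for t in (40, 80, 95, 101):
    print(t, describe_threshold(t))
```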

When should you lower the threshold (increase sensitivity)?

Lowering the threshold will necessarily lead to more alerts, potentially including alerts for minor traffic changes or "false positives" (alerts that are not actual problems).

Consider lowering the threshold if:

  • The alert metric is highly critical, and you need to be notified as soon as possible, even at the cost of receiving more false positives.

  • The metric you’re monitoring is extremely reliable and should never meaningfully deviate from the normal pattern.

  • A recent incident occurred, but it didn’t trigger an alert because the maximum score failed to exceed the default threshold.

When should you raise the threshold (decrease sensitivity)?

Raising the threshold will necessarily lead to fewer alerts, which might mean missing actual issues that are deemed too minor to trigger an alert.

Consider raising the threshold if:

  • The current alert setting is "too noisy" and generates a high volume of false positives.

  • You anticipate upcoming unusual traffic (such as during a migration or maintenance) and don’t wish to receive alerts during that period.

  • You only care about being notified for the most extreme problems and wish to filter out minor fluctuations.

What is the difference between Voice Error Threshold Alerts and Voice Anomaly Alerts?

Voice Error Threshold Alerts (standard threshold percentages): rely on pre-set, fixed numeric limits (e.g., error rate > 5%) to flag issues. This approach is simple and easy to implement, but it can miss subtle patterns and may create false positives if the thresholds are too rigid.

Voice Anomaly Alerts (anomaly detection with machine learning): dynamically identify unusual patterns and deviations from normal behavior based on learned data distributions. This approach is more adaptive and can detect complex, non-obvious anomalies that static thresholds might overlook.
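A minimal sketch of the difference, using made-up numbers: a static error-rate threshold applies the same fixed limit at all times, while an anomaly check compares the observation to bounds learned from the traffic pattern (the hard-coded bounds below are placeholders standing in for model output).

```python
# Static threshold check: a fixed limit, the same at every time of day.
def static_threshold_alert(error_rate, limit=0.05):
    return error_rate > limit


# Anomaly-style check: bounds come from a learned model of normal traffic
# for this hour and day, so "normal" moves with the pattern.
def anomaly_alert(observed_errors, learned_lower=10, learned_upper=40):
    return not (learned_lower <= observed_errors <= learned_upper)


print(static_threshold_alert(0.03))   # False: under the fixed 5% limit
print(anomaly_alert(90))              # True: far above the learned bounds
```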

Metrics monitored

| Metric                        | Voice Error Threshold | Voice Anomaly Detection |
| ----------------------------- | --------------------- | ----------------------- |
| Volume of Successful Calls    | No                    | Yes                     |
| Volume of Attempted Calls     | No                    | Yes                     |
| MOU                           | No                    | Yes                     |
| ASR                           | No                    | Yes                     |
| NER                           | No                    | Yes                     |
| Volume of 4XX, 5XX, 6XX       | Yes                   | Yes                     |
| Volume of Failures            | Yes                   | Yes                     |
| Specific SIP Errors           | Yes                   | No                      |

Traffic scenarios monitored

| Scenario                        | Voice Error Threshold | Voice Anomaly Detection |
| ------------------------------- | --------------------- | ----------------------- |
| Drops in traffic                | No                    | Yes                     |
| Problems upstream               | No                    | Yes                     |
| Holidays and weekends           | No                    | Yes                     |
| Organic growth in volume        | No                    | Yes                     |
| Minor long running failures     | No                    | Yes                     |
| Extreme SIP failures            | Yes                   | Yes                     |
| Low volume and sporadic traffic | Yes                   | No*                     |

*Yes, if the sporadic traffic is on an account that has historically had high volume.
