SLO Alerting Strategies

I have recently gotten back into observability and SRE work. It is interesting, really interesting.

Reference URL: https://www.datadoghq.com/blog/monitor-service-performance-with-slo-alerts/
It is part of a three-post series, and all three are very useful.

1. Screenshot of the settings in Datadog

[Image: screenshot of the SLO alert settings in Datadog]

2. Two approaches to SLO alerting

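The two approaches are error budget alerts and burn rate alerts. The first is typically a lower-severity signal, while the second may warrant higher-severity attention.
Burn rate alerting relies on multiple windows and multiple burn rates; those are the key concepts.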

Error budget alerts can give you time to bring your service’s performance into SLO compliance, but they only notify you after the defined portion of the budget is gone. For an even more proactive approach to SLO monitoring, you can create a burn rate alert to notify you if you’re consuming your error budget more quickly than expected.
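To make the error budget idea concrete, here is a minimal sketch in Python. It is not Datadog's implementation: the function names, the request-count inputs, and the 75% threshold are all assumptions for illustration. It treats the budget as the number of bad requests the SLO target allows out of the total observed in the SLO window.

```python
def error_budget_consumed(slo_target: float, good: int, total: int) -> float:
    """Return the fraction of the error budget already spent (1.0 = all of it).

    Hypothetical helper: assumes a request-based SLO where `good` and `total`
    are event counts over the SLO window.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total == 0:
        return 0.0  # no traffic, so nothing has been spent
    allowed_bad = (1.0 - slo_target) * total  # bad events the SLO tolerates
    actual_bad = total - good
    return actual_bad / allowed_bad


def error_budget_alert(consumed: float, threshold: float = 0.75) -> bool:
    """Fire once the consumed fraction reaches the threshold (assumed 75%)."""
    return consumed >= threshold
```

With a 99% target, 10,000 requests allow roughly 100 failures, so 50 failures means about half of the budget is gone and an alert with a 75% threshold stays quiet. The alert only fires after that portion of the budget is already spent, which is the limitation the quoted passage points out.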

Your burn rate fluctuates according to changes in your service’s performance. Datadog automatically [calculates your service’s burn rate](https://docs.datadoghq.com/monitors/service_level_objectives/burn_rate/#how-burn-rate-alerts-work) across a subset of your SLO’s time window, known as an **alerting window**. A burn rate alert uses a long alerting window (to help prevent flapping and alert fatigue) and a short alerting window (to allow the alert to recover quickly when the burn rate falls back below the threshold). The alert will trigger when the burn rate across both windows is above your alert’s threshold.
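The two-window rule described above can be sketched as follows. This is an illustration under stated assumptions rather than Datadog's actual code: burn rate is taken as the observed error rate divided by the error rate the SLO allows, each window is passed in as a hypothetical `(bad, total)` pair of event counts, and the threshold is whatever value you choose.

```python
from typing import Tuple

Window = Tuple[int, int]  # (bad events, total events) inside one alerting window


def burn_rate(window: Window, slo_target: float) -> float:
    """Observed error rate divided by the error rate the SLO allows.

    A value of 1.0 means the budget would last exactly the full SLO window.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    bad, total = window
    if total == 0:
        return 0.0  # no traffic in this window
    return (bad / total) / (1.0 - slo_target)


def burn_rate_alert(long_window: Window, short_window: Window,
                    slo_target: float, threshold: float) -> bool:
    """Trigger only when BOTH windows burn faster than the threshold."""
    return (burn_rate(long_window, slo_target) > threshold
            and burn_rate(short_window, slo_target) > threshold)
```

For example, with a 99.9% target and an assumed threshold of 14.4, a long window at a 2% error rate has a burn rate of about 20. If the short window has already dropped back to a 0.1% error rate (burn rate about 1), the alert does not fire, which is how the short window lets the alert recover quickly.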

Copyright notice:
Author: 倾城
Link: https://www.techfm.club/p/61847.html
Source: TechFM
Copyright of this article belongs to the author. Do not reproduce without permission.
