Getting started with Bosun
Remarks
Bosun is an open-source, MIT licensed, monitoring and alerting system created by Stack Overflow. It has an expressive domain specific language for evaluating alerts and creating detailed notifications. It also lets you test your alerts against historical data for a faster development experience. More details at https://bosun.org/.
Bosun uses a config file to store all the system settings, macros, lookups, notifications, templates, and alert definitions. You specify the config file to use when starting the server, for example /opt/bosun/bosun -c /opt/bosun/config/prod.conf. Changes to the file will not be activated until Bosun is restarted, and it is highly recommended that you store the file in version control.
Versions
Version | Release Date |
---|---|
0.3.0 | 2015-06-13 |
0.4.0 | 2015-09-18 |
0.5.0 | 2016-03-15 |
Sample Alert
Bosun alerts are defined in the config file using a custom DSL. They use functions to evaluate time series data and will generate alerts when the warn or crit expressions are non-zero. Alerts use templates to include additional information in the notifications, which are usually an email message and/or HTTP POST request.
template sample.template {
body = `<p>Alert: {{.Alert.Name}} triggered on {{.Group.host}}
<hr>
<p><strong>Computation</strong>
<table>
{{range .Computations}}
<tr><td><a href="{{$.Expr .Text}}">{{.Text}}</a></td><td>{{.Value}}</td></tr>
{{end}}
</table>
<hr>
{{ .Graph .Alert.Vars.metric }}`
subject = {{.Last.Status}}: {{.Alert.Name}} cpu idle at {{.Alert.Vars.q | .E}}% on {{.Group.host}}
}
notification sample.notification {
email = alerts@example.com
}
alert sample.alert {
template = sample.template
$metric = q("sum:rate:linux.cpu{host=*,type=idle}", "1m", "")
$q = avg($metric)
crit = $q < 40
notification = sample.notification
}
The alert would send an email with the subject Critical: sample.alert cpu idle at 25% on hostname for any host whose idle CPU has averaged less than 40% over the last minute. This example is a "host scoped" alert, but Bosun also supports cluster, datacenter, or globally scoped alerts (see the fundamentals video series for more details).
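As a sketch of how the warn and crit expressions mentioned above can be combined, the alert below adds a warning threshold alongside the critical one. The alert name, the 60% warning threshold, and the 5-minute window are illustrative assumptions rather than recommended values, and q() is written with an explicit empty end duration.

```
alert sample.alert.tiered {
    template = sample.template
    # Series of idle CPU per host over the last 5 minutes (assumed window)
    $metric = q("sum:rate:linux.cpu{host=*,type=idle}", "5m", "")
    # Reduce each host's series to a single average value
    $q = avg($metric)
    # Warn first, then go critical as idle CPU keeps dropping
    warn = $q < 60
    crit = $q < 40
    # Severity-specific notifications; both reuse the sample notification here
    warnNotification = sample.notification
    critNotification = sample.notification
}
```

Because an alert fires when either expression is non-zero, a host averaging 50% idle would be in the warning state, and one averaging 25% idle would be critical.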
Sample Configuration File
Here is an example of a Bosun config file used in a development environment:
tsdbHost = localhost:4242
httpListen = :8070
smtpHost = localhost:25
emailFrom = bosun@example.org
timeAndDate = 202,75,179,136
ledisDir = ../ledis_data
checkFrequency = 5m
notification example.notification {
email = alerts@example.org
print = true
}
In this case the config file indicates Bosun should connect to a local OpenTSDB instance on port 4242, listen for requests on port 8070 (on all IP addresses bound to the host), use the localhost SMTP system for email, display additional time zones, use the built-in Ledis instead of an external Redis for system state, and default alerts to a 5-minute check interval.
The config also defines an example.notification that can be assigned to alerts. The alert definitions themselves are usually placed at the end of the config file (see the sample alert example).
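Since notifications can also be delivered as an HTTP POST request, a second notification could sit next to the email one. This is a minimal sketch, assuming a JSON webhook: the notification name and the endpoint URL are hypothetical, and the payload shape depends entirely on the receiving service.

```
notification example.post {
    # Hypothetical webhook endpoint; replace with a real receiver
    post = https://chat.example.org/hooks/bosun
    # Send the rendered alert subject as a JSON string
    body = {"text": {{.|json}}}
    contentType = application/json
    # Also write the notification to Bosun's log, as in example.notification
    print = true
}
```

An alert would reference it the same way as the email notification, for example with notification = example.post.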
Docker Quick Start
The quick start guide includes information about using Docker to stand up a Bosun instance.
$ docker run -d -p 4242:4242 -p 80:8070 stackexchange/bosun
This will create a new instance of Bosun which you can access by opening a browser to http://docker-server-ip (the command above maps port 80 on the Docker host to Bosun's port 8070). The Docker image includes HBase/OpenTSDB for storing time series data, the Bosun server, and scollector for gathering metrics from inside the Bosun container. You can then point additional scollector instances at the Bosun server and use Grafana to create dashboards of OpenTSDB or Bosun metrics.
The stackexchange/bosun image is designed only for testing. There are no alerts defined in the config file and the data will be deleted when the Docker container is removed, but it is very helpful for getting a feel for how Bosun works. For details on creating a production instance of Bosun see https://bosun.org/resources