Bosun

Topics related to Bosun:

Getting started with Bosun

Bosun is an open-source, MIT-licensed monitoring and alerting system created by Stack Overflow. It has an expressive domain-specific language for evaluating alerts and creating detailed notifications. It also lets you test your alerts against historical data for a faster development experience. More details at http://bosun.org/.

Bosun uses a config file to store all the system settings, macros, lookups, notifications, templates, and alert definitions. You specify the config file to use when starting the server, for example /opt/bosun/bosun -c /opt/bosun/config/prod.conf. Changes to the file do not take effect until Bosun is restarted, and it is highly recommended that you store the file in version control.

lscount

Deprecation

The LogStash query functions are deprecated, and only for use with v1.x of ElasticSearch. If you are running v2 or above of ElasticSearch, then you should refer to the Elastic Query functions.

Caveats

  • There is currently no escaping in the keystring, so if your regex needs to contain a comma or double quote you are out of luck.
  • The regexes in keystring are applied twice: first as a regexp filter to Elastic, and then as a Go regexp to the keys of the result. This is because the value could be an array, in which case you will get groups that should be filtered out. As a result, the usable regex language is the intersection of the Go regex spec and the Elastic regex spec. Elastic uses Lucene-style regexes, which means regexes are always anchored (see the documentation).
  • If the type of the field value in Elastic (aka the mapping) is a number, then the regexes won’t act as a regex. The only thing you can do is an exact match on the number, i.e. “eventlogid:1234”. It is recommended that anything that is an identifier be stored as a string, since identifiers are not numbers even if they are made up entirely of numerals.
  • Alerts using this information likely want to set ignoreUnknown, since only “groups” that appear in the time frame are in the results.
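
The double application of the regex described above can be pictured with a small sketch. It uses Python's re module as a stand-in for both engines, which is an assumption made purely for illustration (neither Lucene nor Go regex syntax is reproduced exactly). re.fullmatch imitates the anchored Lucene-style filter and re.search imitates the unanchored Go match on the result keys. The function name key_survives and the sample values are hypothetical.

```python
import re


def key_survives(pattern: str, value: str) -> bool:
    """Return True if value passes both regex stages.

    Stage 1 imitates the anchored Lucene-style filter (the whole value must match).
    Stage 2 imitates the unanchored Go regexp applied to the keys of the result.
    """
    passes_elastic = re.fullmatch(pattern, value) is not None
    passes_go = re.search(pattern, value) is not None
    return passes_elastic and passes_go


# "web" does not cover the whole value, so the anchored stage rejects it.
print(key_survives("web", "web01"))    # False
# "web.*" covers the whole value and also matches when unanchored.
print(key_survives("web.*", "web01"))  # True
```

In practice this suggests writing patterns that match the entire field value, such as web.*, rather than relying on substring matches.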

lsstat

Deprecation

The LogStash query functions are deprecated, and only for use with v1.x of ElasticSearch. If you are running v2 or above of ElasticSearch, then you should refer to the Elastic Query functions.

Caveats

  • There is currently no escaping in the keystring, so if your regex needs to contain a comma or double quote you are out of luck.
  • The regexes in keystring are applied twice: first as a regexp filter to Elastic, and then as a Go regexp to the keys of the result. This is because the value could be an array, in which case you will get groups that should be filtered out. As a result, the usable regex language is the intersection of the Go regex spec and the Elastic regex spec. Elastic uses Lucene-style regexes, which means regexes are always anchored (see the documentation).
  • If the type of the field value in Elastic (aka the mapping) is a number, then the regexes won’t act as a regex. The only thing you can do is an exact match on the number, i.e. “eventlogid:1234”. It is recommended that anything that is an identifier be stored as a string, since identifiers are not numbers even if they are made up entirely of numerals.
  • Alerts using this information likely want to set ignoreUnknown, since only “groups” that appear in the time frame are in the results.

Templates: HTTPGet and HTTPGetJSON

Notifications: Overview

Complete Examples

Templates: Overview

Bosun templates are based on the Go html/template package and can be shared across multiple alerts, but each alert uses a single template to render all of its Bosun notifications. An alert names its template with the template directive and its notifications with the warnNotification and critNotification directives; each alert can define multiple warn and crit notifications.

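Templates are rendered when an alert instance is triggered. Among other things, they can include graphs (see Templates: Graph and GraphAll) and fetch data over HTTP (see Templates: HTTPGet and HTTPGetJSON).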

The template subject will be displayed as headers on the dashboard, as the subject line of email notifications, and as the default contents of HTTP POST notifications. The template body will be displayed when an alert instance is expanded and as the body of email notifications.

Templates: Graph and GraphAll

Bosun templates can include graphs to provide more information when sending a notification. The graphs can use variables from the alert and filter based on the tagset for the alert instance, or use the GraphAll function to graph all series. When a graph is viewed on the Dashboard or in an email, you can click on it to load it in the Expression page.

You can also create a Generic Template with optional Graphs that can be shared across multiple alerts.

Scollector: Overview

Scollector is a monitoring agent that can be used to send metrics to Bosun or any system that accepts OpenTSDB-style metrics. It is modelled after OpenTSDB's tcollector data collection framework but is written in Go and compiled into a single binary. One of the design goals is to auto-detect services so that metrics will be sent with minimal or no configuration needed. You can also create external collectors that generate metrics using a script or executable and use Scollector to queue and send the metrics to the server.

You are NOT required to use Scollector when using Bosun, as you can also send metrics directly to the /api/put route, use another monitoring agent, or use a different backend like Graphite, InfluxDB, or ElasticSearch.
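
As a rough illustration of sending metrics directly to the /api/put route, the sketch below builds an OpenTSDB-style datapoint and posts a JSON array of such datapoints. It is a minimal example under stated assumptions, not Scollector's own implementation: the function names, the metric and tag values, and the server address in the usage comment are all hypothetical, and details such as compression, batching, and retries are left out.

```python
import json
import time
import urllib.request


def build_datapoint(metric, value, tags, timestamp=None):
    """Build one OpenTSDB-style datapoint as a plain dict."""
    return {
        "metric": metric,
        "timestamp": int(time.time()) if timestamp is None else int(timestamp),
        "value": value,
        "tags": dict(tags),
    }


def send_datapoints(base_url, datapoints, timeout=5.0):
    """POST a JSON array of datapoints to <base_url>/api/put; return the HTTP status."""
    body = json.dumps(list(datapoints)).encode("utf-8")
    request = urllib.request.Request(
        base_url.rstrip("/") + "/api/put",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Usage (hypothetical server address, replace with your own):
#   point = build_datapoint("example.requests", 42, {"host": "web01"})
#   send_datapoints("http://bosun.example.com:8070", [point])
```

Keeping the payload construction separate from the network call makes it easy to inspect or test the datapoints before anything is sent.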

Scollector: External Collectors

Scollector supports tcollector style external collectors that can be used to send metrics to Bosun via custom scripts or executables. External collectors are a great way to get started collecting data, but when possible it is recommended for applications to send data directly to Bosun or to update scollector so that it natively supports additional systems.

The ColDir configuration key specifies the external collector directory, which is usually set to something like /opt/scollector/collectors/ on Linux or C:\Program Files\scollector\collectors\ on Windows. It should contain numbered directories just like the ones used in OpenTSDB tcollector. Each directory name sets how often scollector will try to invoke the collectors in that folder (example: 60 = every 60 seconds). Use a directory named 0 for any executables or scripts that will run continuously and create output on their own schedule. Directories with non-numeric names are ignored; lib and etc directories are often used for library and config data shared by all collectors.
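
The directory naming rule can be summarized in a few lines of code. The helper below is a hypothetical illustration of the convention as described here, not how scollector itself parses directory names: numeric names give the interval in seconds, 0 marks continuously running collectors, and anything else is ignored.

```python
def collector_interval(dir_name: str):
    """Map a collector directory name to a run interval in seconds.

    Returns 0 for the continuous-run directory, a positive integer for a
    periodic directory, and None for names that would be ignored.
    """
    # Accept plain ASCII digits only, so names like "lib" or "etc" are ignored.
    if not (dir_name.isascii() and dir_name.isdigit()):
        return None
    return int(dir_name)


for name in ["0", "60", "300", "lib", "etc"]:
    print(name, collector_interval(name))
```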

External collectors can use either the simple data output format from tcollector or they can send JSON data if they want to include metadata.
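
A minimal external collector using the simple tcollector output format might look like the sketch below. Each line written to standard output has the form "metric timestamp value tag=value ...". The metric name, tag names, and the format_line helper are hypothetical, and the JSON format with metadata is not shown.

```python
import time


def format_line(metric, value, tags, timestamp=None):
    """Format one datapoint in the simple tcollector line format."""
    ts = int(time.time()) if timestamp is None else int(timestamp)
    # Sort tags so the output is stable from run to run.
    tag_text = " ".join(f"{key}={val}" for key, val in sorted(tags.items()))
    return f"{metric} {ts} {value} {tag_text}".rstrip()


if __name__ == "__main__":
    # A collector in a numbered directory prints its datapoints and exits.
    print(format_line("example.queue.length", 17, {"queue": "orders"}))
```

Saved as an executable script in a directory such as 60 under the collector directory, a script like this would be invoked roughly every 60 seconds.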

Scollector: Process and Service Monitoring

Scollector can be used to monitor processes and services in Windows and Linux. Some processes like IIS application pools are monitored automatically, but usually you need to specify which processes and services you want to monitor.

Packages and Initialization Scripts

There currently aren't any installation packages provided for Bosun or Scollector, only binaries on the Bosun release page. It is up to the end user to find the best way to deploy the files and run them as a service.

Expression Tips and Tricks

Silencing and Squelching Alerts

Notifications: Chat Systems

Alerts: Advanced Scoping