Bookinfo Tutorial

This tutorial shows how to use Gloo Shot to run chaos experiments against a simple service mesh app. We use a slightly modified version of the familiar bookinfo app from Istio’s sample app repo: the reviews service has been changed to include a vulnerability that can lead to cascading failure. We will use Gloo Shot to detect this weakness.

The Goal

Services should be built to stay resilient when their dependencies are unavailable, so that cascading failures are avoided. In this example, we show how to detect cascading failures: failures where an error in one service disables other services that interact with it. In the diagram below, we show two versions of a reviews service. The version on the top right fails when it does not receive a valid response from the ratings service. The version on the bottom right handles the error more gracefully: it still provides review information even though the ratings data is not available.

The bookinfo app consists of three services. If the ratings service fails, we do not want it to break the reviews service, as shown in the top-right frame. In a resilient app, the reviews service continues to work even if one of its dependencies is unavailable, as shown in the bottom-right frame.

Prerequisites

To follow this demo, you will need the following:

Setup

Deploy Gloo Shot

Install a service mesh (if you have not already)

Provide metric source configuration to Prometheus

Prometheus is a powerful tool for aggregating metrics. To use Prometheus most effectively, you need to tell it where it can find metrics by specifying a list of scrape configs.

Here is an example config for how Istio’s metrics should be handled by Prometheus. As you can see, scrape configs that are both insightful and resource-efficient can be quite complicated. Additionally, managing Prometheus configs for multiple scrape targets can be difficult.

Fortunately, SuperGloo provides a utility that configures your Prometheus instance in a way that suits your chosen service mesh.

By default, glooshot init deploys an instance of Prometheus (this can be disabled). For best results, you should configure this instance of Prometheus with the metrics that are relevant to your particular service mesh. We will use the supergloo set mesh stats utility for this.

supergloo set mesh stats \
    --target-mesh glooshot.istio-istio-system \
    --prometheus-configmap glooshot.glooshot-prometheus-server

Note that we only had to tell SuperGloo where to find the mesh description and which config map to update. SuperGloo knows which metrics are appropriate for the target mesh and sets them on the active Prometheus config map. You can find more details on setting Prometheus configurations with SuperGloo here.
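To check that the config map was actually updated, it can help to inspect it with kubectl. The sketch below assumes that the value passed to --prometheus-configmap has the form <namespace>.<name>, so that glooshot.glooshot-prometheus-server refers to a config map named glooshot-prometheus-server in the glooshot namespace. The variable names are illustrative only, and the script prints the kubectl command instead of running it so that you can review it first.

```shell
#!/bin/sh
# Assumption: the reference has the form <namespace>.<name>.
ref="glooshot.glooshot-prometheus-server"

# Split on the first dot using POSIX parameter expansion.
namespace="${ref%%.*}"   # everything before the first dot
name="${ref#*.}"         # everything after the first dot

# Build the command as a string and print it for review.
cmd="kubectl get configmap $name --namespace $namespace --output yaml"
echo "$cmd"
```

Running the printed command against your cluster should show the scrape configs in the config map's data, provided the namespace and name assumptions above hold for your installation.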

Deploy the bookinfo app

Create an experiment

Repeat the experiment on a new version of the app