OpenTracing

OpenTracing is an open standard for distributed tracing. Distributed tracing can be used for optimizing end-user latency (the trace gives a breakdown of where time has been spent in distributed requests), root-cause analysis for errors (errors can be annotated in the trace and show how other parts of a distributed system relate to an error), and understanding the bigger picture of the system (traces can give insight into the distinct pieces of a distributed system and how they are connected).

As an example, consider a simple message flow across actors: actor A sends messages to actors B and C, where actor C is running in a different actor system, and actor C sends a message to actor D.

Here’s what a possible trace for this message flow looks like conceptually:

A trace shows a dataflow or an execution path through a distributed system. Each span in the trace represents a logical unit of work. In the case of actors, each span represents the processing of a message by an actor. The duration of the span is recorded. Spans may be nested to model causal relationships, with spans referencing other spans, and for actor tracing these relationships are message sends. Events can be logged within a span.

An actor trace shows the flow of messages, and records when messages were processed and how long it took to process each message. Message sends to other actors are logged within the trace span, as well as any actor events such as actor failures, unhandled messages, dead letters, or logged errors and warnings.

Actor configuration

Actors need to be enabled for tracing, similar to metrics and events. This is an extension of the actor configuration, with a traceable setting that can be enabled for any actor selection.

For example, actors can be selected by class or path and then enabled as traceable, such as in the following configuration:

cinnamon.akka {
  actors {
    "com.example.a.b.*" {
      report-by = class
      traceable = on
    }
    "/user/x/*" {
      report-by = class
      traceable = on
    }
  }
}

Tracing configuration

The OpenTracing integrations for Jaeger and Zipkin both build on the Jaeger client. The tracer supports the configuration described in this section.

Setting a service-name for each node is useful. When it is not set, the main class is used as the service name.

Note: Tracing can produce a very high volume of data, so traces are sampled, with the sampling decision made at the beginning of each trace. The sampler used, and its settings, can be configured. The default sampler is a rate-limiting sampler that captures up to 10 traces per second.

As an example, the following configuration sets the service-name to my-component and configures a rate-limiting sampler with a maximum of 25 traces per second:

Configuration
cinnamon.opentracing {
  tracer {
    service-name = "my-component"

    sampler = rate-limiting-sampler

    rate-limiting-sampler {
      max-traces-per-second = 25
    }
  }
}
Defaults
cinnamon.opentracing {
  tracer {

    # Service name for this application, defaults to main class when not set
    service-name = null

    # Trace sampler to use
    sampler = rate-limiting-sampler

    rate-limiting-sampler {
      # Maximum number of sampled traces per second
      max-traces-per-second = 10
    }

    probabilistic-sampler {
      # Probabilistic sampling rate, between 0.0 and 1.0
      sampling-rate = 0.001
    }

    const-sampler {
      # Constant decision on whether to sample traces
      # Note: this sampler is NOT recommended for production
      decision = true
    }

    # Log trace spans with SLF4J (can be used for debugging the tracer)
    # Set `cinnamon.opentracing.tracer.reporters += trace-logging`
    trace-logging {
      # Name of SLF4J logger to use when logging
      logger = "cinnamon.opentracing.Tracer"
    }

  }
}

Note: These settings are defined in the reference.conf. You only need to specify a setting when you want to override its default.
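
To illustrate overriding these defaults, the following is a minimal sketch of a debugging-oriented setup that switches to the probabilistic sampler and also logs trace spans with SLF4J. It assumes that a sampler is selected by the name of its settings section, as rate-limiting-sampler is in the defaults above. The sampling rate of 0.01 is an arbitrary illustrative value, not a recommendation.

```hocon
cinnamon.opentracing {
  tracer {
    service-name = "my-component"

    # Assumption: the sampler is selected by the name of its settings section
    sampler = probabilistic-sampler

    probabilistic-sampler {
      # Sample roughly 1 in 100 traces (illustrative value, between 0.0 and 1.0)
      sampling-rate = 0.01
    }
  }
}

# Add the SLF4J trace logging reporter, as described in the defaults above
cinnamon.opentracing.tracer.reporters += trace-logging
```

With this in place, spans would be logged to the "cinnamon.opentracing.Tracer" logger unless the logger setting is also overridden.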

Jaeger reporter

Jaeger is a distributed tracing system with support for OpenTracing.

Cinnamon Jaeger dependency

First make sure that your build is configured to use the Cinnamon Agent.

To enable the Jaeger reporter, add the following dependency to your build:

sbt
libraryDependencies += Cinnamon.library.cinnamonOpenTracingJaeger
Maven
<dependency>
  <groupId>com.lightbend.cinnamon</groupId>
  <artifactId>cinnamon-opentracing-jaeger_2.11</artifactId>
  <version>2.4.2</version>
</dependency>
Gradle
dependencies {
  compile group: 'com.lightbend.cinnamon', name: 'cinnamon-opentracing-jaeger_2.11', version: '2.4.2'
}

Jaeger configuration

Jaeger reporting can be configured. For example, set a different endpoint for the Jaeger agent by configuring the host and port settings:

Configuration
cinnamon.opentracing {
  jaeger {
    host = "localhost"
    port = 5432
  }
}
Defaults
cinnamon.opentracing {
  jaeger {

    # Host for Jaeger trace span collector
    host = "localhost"

    # UDP port for Jaeger trace span collector
    port = 5775

    # Max size for UDP packets
    max-packet-size = 65000

    # Flush interval for trace span reporter
    flush-interval = 1s

    # Max queue size of trace span reporter
    max-queue-size = 1000

  }
}

Note: These settings are defined in the reference.conf. You only need to specify a setting when you want to override its default.
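
As a further illustration, here is a sketch that points the reporter at a Jaeger agent on another host and adjusts how spans are buffered. The host name is hypothetical and the values are illustrative only. The sketch uses only settings listed in the defaults above.

```hocon
cinnamon.opentracing {
  jaeger {
    # Hypothetical host name for a remote Jaeger agent
    host = "jaeger-agent.example.internal"

    # Keep the default UDP port
    port = 5775

    # Flush buffered trace spans more often than the 1s default
    flush-interval = 500ms

    # Allow more trace spans to be queued than the default of 1000
    max-queue-size = 5000
  }
}
```

A shorter flush interval should make spans visible sooner at the cost of more frequent packets, while a larger queue leaves more room for bursts of spans between flushes.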

Running Jaeger

See the Jaeger documentation for running Jaeger. The Jaeger getting started guide shows how to run Jaeger locally for development and testing.

Here’s what an example actor trace in Jaeger looks like:

Jaeger trace

Zipkin reporter

Zipkin is a distributed tracing system with support for OpenTracing.

Cinnamon Zipkin dependency

First make sure that your build is configured to use the Cinnamon Agent.

To enable the Zipkin reporter, add the following dependency to your build:

sbt
libraryDependencies += Cinnamon.library.cinnamonOpenTracingZipkin
Maven
<dependency>
  <groupId>com.lightbend.cinnamon</groupId>
  <artifactId>cinnamon-opentracing-zipkin_2.11</artifactId>
  <version>2.4.2</version>
</dependency>
Gradle
dependencies {
  compile group: 'com.lightbend.cinnamon', name: 'cinnamon-opentracing-zipkin_2.11', version: '2.4.2'
}

Zipkin configuration

The default Zipkin sender is the URL connection sender, which can be used for sending trace spans directly to the Zipkin API. This sender can be configured. For example, set a different endpoint for the Zipkin trace span collector by configuring the endpoint setting:

Configuration
cinnamon.opentracing {
  zipkin {
    url-connection {
      endpoint = "http://my.zipkin.host:9411/api/v1/spans"
    }
  }
}
Defaults
cinnamon.opentracing {
  zipkin {

    # Flush interval for trace span reporter
    flush-interval = 1s

    # Max queue size of trace span reporter
    max-queue-size = 1000

    # Zipkin sender to use for reporting trace spans
    sender = url-connection

    # URL connection sender for reporting directly to a Zipkin API endpoint
    url-connection {
      # POST URL for Zipkin's v1 api, usually "http://zipkinhost:9411/api/v1/spans"
      endpoint = "http://localhost:9411/api/v1/spans"

      # Encoding to use for trace spans (thrift or json)
      encoding = "thrift"

      # Timeout for establishing URL connection
      connect-timeout = 10s

      # Timeout for connection reads
      read-timeout = 60s

      # Whether GZIP compression is enabled
      compression = true

      # Maximum size of messages
      max-message-size = 5MiB
    }

  }
}

Note: These settings are defined in the reference.conf. You only need to specify a setting when you want to override its default.
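
As a sketch of overriding more than the endpoint, the following configuration switches the URL connection sender to JSON encoding and shortens the timeouts. The host name matches the earlier example and the timeout values are illustrative. All settings used here appear in the defaults above.

```hocon
cinnamon.opentracing {
  zipkin {
    url-connection {
      endpoint = "http://my.zipkin.host:9411/api/v1/spans"

      # Send trace spans as JSON rather than the default thrift encoding
      encoding = "json"

      # Give up sooner than the defaults (10s and 60s) when Zipkin is unreachable
      connect-timeout = 5s
      read-timeout = 30s
    }
  }
}
```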

See the following sections for configuring the Zipkin sender for Kafka or Scribe.

Zipkin Kafka sender

The Zipkin reporter can be configured to send trace spans to a Kafka topic. This sender supports Kafka 0.10.2+.

To enable the Zipkin Kafka sender, add the following dependency to your build:

sbt
libraryDependencies += Cinnamon.library.cinnamonOpenTracingZipkinKafka
Maven
<dependency>
  <groupId>com.lightbend.cinnamon</groupId>
  <artifactId>cinnamon-opentracing-zipkin-kafka_2.11</artifactId>
  <version>2.4.2</version>
</dependency>
Gradle
dependencies {
  compile group: 'com.lightbend.cinnamon', name: 'cinnamon-opentracing-zipkin-kafka_2.11', version: '2.4.2'
}

You can then configure the Zipkin reporter to use the Kafka sender. You must specify the Kafka bootstrap servers to use. You can also override any of the Kafka producer configs in the properties section.

Configuration
cinnamon.opentracing {
  zipkin {
    sender = kafka

    kafka {
      bootstrap-servers = ["my.kafka.host1:9091", "my.kafka.host2:9091"]
    }
  }
}
Defaults
cinnamon.opentracing {
  zipkin {
    kafka {
      # Initial set of kafka servers to connect to (must be specified)
      bootstrap-servers = []

      # Kafka topic to send trace spans to
      topic = "zipkin"

      # Encoding to use for trace spans (thrift or json)
      encoding = "thrift"

      # Property overrides for producer configs (http://kafka.apache.org/0102/documentation.html#producerconfigs)
      properties {}

      # Maximum size of messages
      max-message-size = 1MB
    }
  }
}

Note: These settings are defined in the reference.conf. You only need to specify a setting when you want to override its default.
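
The properties section can be sketched as follows. The property names compression.type and acks are standard Kafka producer configs, but the values are illustrative. The keys are quoted here on the assumption that each dotted Kafka property name should be treated as a single key. The source does not state whether quoted or nested keys are expected, so check this against your setup.

```hocon
cinnamon.opentracing {
  zipkin {
    sender = kafka

    kafka {
      bootstrap-servers = ["my.kafka.host1:9091", "my.kafka.host2:9091"]

      # Producer config overrides (illustrative values)
      properties {
        # Assumption: quoted so that HOCON keeps the dotted name as one key
        "compression.type" = "gzip"
        "acks" = "1"
      }
    }
  }
}
```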

Zipkin Scribe sender

The Zipkin reporter can be configured to send trace spans to Scribe.

To enable the Zipkin Scribe sender, add the following dependency to your build:

sbt
libraryDependencies += Cinnamon.library.cinnamonOpenTracingZipkinScribe
Maven
<dependency>
  <groupId>com.lightbend.cinnamon</groupId>
  <artifactId>cinnamon-opentracing-zipkin-scribe_2.11</artifactId>
  <version>2.4.2</version>
</dependency>
Gradle
dependencies {
  compile group: 'com.lightbend.cinnamon', name: 'cinnamon-opentracing-zipkin-scribe_2.11', version: '2.4.2'
}

You can then configure the Zipkin reporter to use the Scribe sender. You can set the Scribe endpoint using the host and port settings:

Configuration
cinnamon.opentracing {
  zipkin {
    sender = scribe

    scribe {
      host = "my.scribe.host"
      port = 9410
    }
  }
}
Defaults
cinnamon.opentracing {
  zipkin {
    scribe {
      # Host of Scribe trace collector
      host = "localhost"

      # Port of Scribe trace collector
      port = 9410

      # Timeout for socket reads
      socket-timeout = 60s

      # Timeout for connections
      connect-timeout = 10s

      # Maximum size of messages (scribe default is 16384000 bytes)
      max-message-size = 16000KiB
    }
  }
}

Note: These settings are defined in the reference.conf. You only need to specify a setting when you want to override its default.

Running Zipkin

See the Zipkin documentation for running Zipkin. The Zipkin quickstart shows how to run Zipkin locally for development and testing.

Here’s what an example actor trace in Zipkin looks like:

Zipkin trace