Aeron Insights - CLI

The command-line tool gives you more detailed insight into your Aeron application than is possible with a Grafana dashboard. It can include data that is non-numeric or not a time series, such as error logs and configuration options, and can perform bespoke analysis of this data.

To run the command-line tool, execute insights-cli.jar with the name of the Insight to run, for example cluster-status or errors.

You can see the full set of supported program arguments and Insights by running with --help.

Example:

java -Daeron.observable.sources=/aeron/config/ -jar /aeron/libs/insights-cli.jar cluster-status
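
If you want to call the tool from a script, the invocation above can be assembled programmatically. The following is a minimal Python sketch, not part of Aeron Insights: the function names build_insights_command and run_insight are hypothetical, and the paths are simply the ones used in the example above. It assumes a java executable is on the PATH and makes no assumption about the tool's exit codes.

```python
import subprocess

# Default jar location taken from the example invocation; adjust for your install.
DEFAULT_JAR = "/aeron/libs/insights-cli.jar"


def build_insights_command(sources, insight, jar_path=DEFAULT_JAR):
    """Return the argument list for one Insights CLI invocation (hypothetical helper)."""
    return [
        "java",
        f"-Daeron.observable.sources={sources}",
        "-jar",
        jar_path,
        insight,
    ]


def run_insight(sources, insight, jar_path=DEFAULT_JAR):
    """Run one Insight and return its standard output as text (hypothetical helper)."""
    command = build_insights_command(sources, insight, jar_path)
    # check=False: the meaning of the tool's exit codes is not documented here,
    # so the caller decides how to treat a non-zero return code.
    completed = subprocess.run(command, capture_output=True, text=True, check=False)
    return completed.stdout


# Show the command line that would be executed, without running it.
print(" ".join(build_insights_command("/aeron/config/", "cluster-status")))
```

Passing the arguments as a list avoids shell quoting problems if the sources path contains spaces.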

You can experiment with the Insights CLI by running it on the cluster nodes in the Insights Example.

Configuring Aeron Insights

For Aeron Insights to monitor components, you must provide property files containing information about each component, such as its data directories and the type of component it is.

See Observable Property Files for more details.

Sample Output

Below you can see sample output for the usage menu and for several Insights: Cluster Status, Loss, Processing Time, Describe File, and Errors.

Usage Menu

 Usage: LocalInsightsMain [-Daeron.observable.sources=<path(s)>] <command>

  Arguments:
    -Daeron.observable.sources=<path(s)>  Path(s) to the Aeron driver, cluster and archive configuration files or directories.
    <command>                             Command to execute (mandatory, only one command allowed).

  Options:
    --help, -h            Show this help message and exit.

  Available insights:
    processing-time        Provides information about processing time of the components of an Aeron application.
    loss                   Provides details of any packet loss observed by the Media Driver.
    cluster-log            Provides insights into the cluster log
    backpressure           Provides information about backpressure that has been observed on this node.
    hardware-sanity-check  Checks system hardware to determine suitability for a production Aeron application.
    diagnostic             Checks for various different indicators of problems, such as errors, loss, etc.
    describe-file          Provides a brief description of the specified file or directory.
    log-buffers            Provides information about log buffers.
    directory              Provides insights into the various directories
    cluster-status         Provides information about the status of the specified cluster.
    errors                 Provides information about errors that have been observed on this node.

Cluster Status

  Cluster ID: 0
  Last driver heartbeat: 2025-03-19 11:31:41+0000 (GMT) - 558 ms ago
  Last cluster activity: 2025-03-19 11:31:42+0000 (GMT) - 149 ms ago
  Member ID: 1
  Leadership term ID: 1
  Consensus module state: ACTIVE
  Cluster node role: LEADER
  Cluster election state: CLOSED
  Log position: 4,576
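
Because the Cluster Status output is a list of "name: value" lines, it is straightforward to read from a script. The sketch below is illustrative only: parse_key_value_report is a hypothetical helper, and it assumes the layout shown in the sample above, which is not a documented, stable format.

```python
def parse_key_value_report(text):
    """Turn 'name: value' lines into a dict (hypothetical helper).

    Assumes each field sits on its own line and that the field name
    contains no colon, as in the Cluster Status sample output.
    """
    report = {}
    for line in text.splitlines():
        # Drop leading spaces and any leading '!' marker before splitting.
        key, separator, value = line.lstrip("! ").partition(":")
        if separator and key.strip():
            report[key.strip()] = value.strip()
    return report


SAMPLE_STATUS = """
  Cluster ID: 0
  Last driver heartbeat: 2025-03-19 11:31:41+0000 (GMT) - 558 ms ago
  Member ID: 1
  Consensus module state: ACTIVE
  Cluster node role: LEADER
  Cluster election state: CLOSED
  Log position: 4,576
"""

status = parse_key_value_report(SAMPLE_STATUS)
# Numeric fields use thousands separators, so remove them before converting.
log_position = int(status["Log position"].replace(",", ""))
print(status["Cluster node role"], log_position)
```

Splitting on the first colon only keeps timestamps such as 11:31:41 intact inside the value.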

Loss

! #OBSERVATION_COUNT,TOTAL_BYTES_LOST,FIRST_OBSERVATION,LAST_OBSERVATION,SESSION_ID,STREAM_ID,CHANNEL,SOURCE
! 64,4736,2025-03-18 18:46:03.959+0000,2025-03-18 18:46:15.762+0000,480590929,108,aeron:udp?endpoint=node1:9103|term-length=64k,172.18.100.10:46581

! Total number of NAKs received: 63
! Total number of NAKs sent:     64
  Loss gaps filled:              0
! First NAK sent:                2025-03-18 18:46:03+0000 (UTC). (13,087 ms ago)
! Last NAK sent:                 2025-03-18 18:46:15+0000 (UTC). (1,284 ms ago)
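
The first part of the Loss output is comma-separated, with a header line that starts with #. The following Python sketch shows one way to read those records. It is based only on the sample above: parse_loss_records is a hypothetical helper, and it assumes that no field contains a comma, which may not hold for every channel URI.

```python
def parse_loss_records(text):
    """Parse the comma-separated loss records into dicts (hypothetical helper).

    Assumes the header line starts with '#', that records follow it directly,
    and that a blank line ends the record section, as in the sample output.
    """
    header = None
    records = []
    for line in text.splitlines():
        # Drop leading spaces and the leading '!' marker.
        content = line.lstrip("! ").strip()
        if header is None:
            if content.startswith("#"):
                header = content[1:].split(",")
            continue
        if not content:
            break  # blank line: the NAK summary follows, not more records
        records.append(dict(zip(header, content.split(","))))
    return records


SAMPLE_LOSS = "\n".join([
    "! #OBSERVATION_COUNT,TOTAL_BYTES_LOST,FIRST_OBSERVATION,LAST_OBSERVATION,"
    "SESSION_ID,STREAM_ID,CHANNEL,SOURCE",
    "! 64,4736,2025-03-18 18:46:03.959+0000,2025-03-18 18:46:15.762+0000,"
    "480590929,108,aeron:udp?endpoint=node1:9103|term-length=64k,172.18.100.10:46581",
    "",
    "! Total number of NAKs received: 63",
])

loss_records = parse_loss_records(SAMPLE_LOSS)
total_bytes_lost = sum(int(record["TOTAL_BYTES_LOST"]) for record in loss_records)
print(len(loss_records), total_bytes_lost)
```

Stopping at the first blank line keeps the NAK summary out of the records, even though one of its lines contains a comma.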

Processing Time

===================== Processing times for the Media Driver ====================
    conductor:     Maximum time spent executing a duty cycle: 1,068,247 ns.
    conductor:     Total number of times a duty cycle breached the slow-processing threshold: 0.
    sender:        Maximum time spent executing a duty cycle: 1,068,106 ns.
    sender:        Total number of times a duty cycle breached the slow-processing threshold: 0.
    receiver:      Maximum time spent executing a duty cycle: 1,068,106 ns.
    receiver:      Total number of times a duty cycle breached the slow-processing threshold: 0.
    name-resolver: Maximum time spent resolving an address: 31,630 ns.
    name-resolver: Total number of times name resolution breached the slow-processing threshold: 0.

======================= Processing times for the Archive =======================
!   conductor: Maximum time spent executing a duty cycle: 9,223,372,036 ns.
!   conductor: Total number of times a duty cycle breached the slow-processing threshold: 1.
!   recorder:  Maximum time spent executing a duty cycle: 9,223,372,036 ns.
!   recorder:  Total number of times a duty cycle breached the slow-processing threshold: 1.
!   replayer:  Maximum time spent executing a duty cycle: 9,223,372,036 ns.
!   replayer:  Total number of times a duty cycle breached the slow-processing threshold: 1.
    recorder:  Maximum time spent writing to the archive: 0 ns.
    replayer:  Maximum time spent reading from the archive: 0 ns.
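
In the sample output, lines that begin with ! appear to mark values that warrant attention, such as a breached slow-processing threshold. That reading is inferred from the samples on this page rather than stated by the tool. Under that assumption, a script could pick out the marked lines as sketched below; flagged_lines and max_nanos are hypothetical helpers.

```python
import re

# Matches a nanosecond figure with thousands separators, e.g. "1,068,247 ns."
NANOS_PATTERN = re.compile(r"([\d,]+) ns\.")


def flagged_lines(text):
    """Return the lines that start with '!', without the marker (hypothetical helper)."""
    return [line[1:].strip() for line in text.splitlines() if line.startswith("!")]


def max_nanos(line):
    """Return the nanosecond value reported on a line, or None if there is none."""
    match = NANOS_PATTERN.search(line)
    return int(match.group(1).replace(",", "")) if match else None


SAMPLE_PROCESSING = "\n".join([
    "    conductor:     Maximum time spent executing a duty cycle: 1,068,247 ns.",
    "!   conductor: Maximum time spent executing a duty cycle: 9,223,372,036 ns.",
    "!   conductor: Total number of times a duty cycle breached the slow-processing threshold: 1.",
])

flagged = flagged_lines(SAMPLE_PROCESSING)
for entry in flagged:
    print(entry, "->", max_nanos(entry))
```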

Describe File

  Describe File: /dev/shm/aeron-root/cnc.dat
  Type: Media Driver Command and Control file
  Description: Prevents multiple media drivers from using the same directory. Contains buffers to facilitate communication between the driver and its clients, aeron counters, and the driver error log.
  Last heartbeat was 995 milliseconds ago
  Found 143 counters
  Found 8 publications
  Found 9 subscriptions
  Found 0 errors
  Describe File: /aeron/data/archive/archive.catalog
  Type: Archive Catalog File
  Description: Tracks recordings and their Segment files.  It contains a series of RecordingDescriptor records.
  Found 1 recording

Errors

===================== Error report for Clustered Service 0 =====================
! Total errors for Clustered Service 0:    1
! Distinct errors for Clustered Service 0: 1
! Most recent error:                       2025-03-19 11:44:48+0000 (UTC) (891 ms ago)

! Observations from 2025-03-19 11:44:48+0000 (UTC) to 2025-03-19 11:44:48+0000 (UTC) for: java.lang.RuntimeException: Chaos error
    at io.aeron.insights.examples.cluster.ExampleClusterMain$ExampleClusteredService.chaos(ExampleClusterMain.java:166)
    at io.aeron.insights.examples.cluster.ExampleClusterMain$ExampleClusteredService.onSessionMessage(ExampleClusterMain.java:147)
    at io.aeron.cluster.service.ClusteredServiceAgent.onSessionMessage(ClusteredServiceAgent.java:473)
    at io.aeron.cluster.service.BoundedLogAdapter.onMessage(BoundedLogAdapter.java:156)
    at io.aeron.cluster.service.BoundedLogAdapter.onFragment(BoundedLogAdapter.java:70)
    at io.aeron.Image.boundedControlledPoll(Image.java:599)
    at io.aeron.cluster.service.BoundedLogAdapter.poll(BoundedLogAdapter.java:133)
    at io.aeron.cluster.service.ClusteredServiceAgent.doWork(ClusteredServiceAgent.java:242)
    at org.agrona.concurrent.AgentRunner.doWork(AgentRunner.java:304)
    at org.agrona.concurrent.AgentRunner.workLoop(AgentRunner.java:296)
    at org.agrona.concurrent.AgentRunner.run(AgentRunner.java:162)
    at java.base/java.lang.Thread.run(Thread.java:1583)

  No errors recorded for: Driver
  No errors recorded for: Archive
  No errors recorded for: Cluster