Aeron Insights - CLI
The command-line tool gives you more detailed insight into your Aeron application than is possible with a Grafana dashboard. It can include data that is non-numeric or not a time series, such as error logs and configuration options, and can perform bespoke analysis of this data.
To run the command-line tool, execute insights-cli.jar with the name of the insight to run, for example cluster-status or errors.
To see the full set of supported program arguments and Insights, run the tool with --help.
Example:
java -Daeron.observable.sources=/aeron/config/ -jar /aeron/libs/insights-cli.jar cluster-status
You can experiment with the Insights CLI by running it on the cluster nodes in the Insights Example.
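If you want to call the CLI from a Java program (for example, from a health-check job), the example command above can be wrapped in a ProcessBuilder. The following is a minimal sketch, not part of Aeron Insights: the class name InsightsCliRunner and its methods are hypothetical, the paths are the ones from the example command, and it assumes a java executable is on the PATH.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper, not part of Aeron Insights.
final class InsightsCliRunner {

    // Builds the same argument list as the example command in this section.
    static List<String> buildCommand(String sources, String jarPath, String insight) {
        return List.of(
            "java",
            "-Daeron.observable.sources=" + sources,
            "-jar", jarPath,
            insight);
    }

    // Runs the CLI and returns everything it printed (stdout and stderr merged).
    static String run(String sources, String jarPath, String insight)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(sources, jarPath, insight))
            .redirectErrorStream(true)
            .start();
        String output = new String(
            process.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
        // The meaning of the exit code is not documented here, so it is ignored.
        process.waitFor();
        return output;
    }

    public static void main(String[] args) throws Exception {
        System.out.print(
            run("/aeron/config/", "/aeron/libs/insights-cli.jar", "cluster-status"));
    }
}
```

Merging stderr into stdout keeps the sketch short; a production wrapper would probably read the two streams separately and apply a timeout.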
Configuring Aeron Insights
For Aeron Insights to monitor components, you must provide property files that describe each component, such as its data directories and the type of component it is.
See Observable Property Files for more details.
Sample Output
Below you can see sample output for the usage menu and for several Insights, including Cluster Status, Loss, Processing Time, Describe File and Errors.
Usage Menu
Usage: LocalInsightsMain [-Daeron.observable.sources=<path(s)>] <command>
Arguments:
-Daeron.observable.sources=<path(s)> Path(s) to the Aeron driver, cluster and archive configuration files or directories.
<command> Command to execute (mandatory, only one command allowed).
Options:
--help, -h Show this help message and exit.
Available insights:
processing-time Provides information about processing time of the components of an Aeron application.
loss Provides details of any packet loss observed by the Media Driver.
cluster-log Provides insights into the cluster log
backpressure Provides information about backpressure that has been observed on this node.
hardware-sanity-check Checks system hardware to determine suitability for a production Aeron application.
diagnostic Checks for various different indicators of problems, such as errors, loss, etc.
describe-file Provides a brief description of the specified file or directory.
log-buffers Provides information about log buffers.
directory Provides insights into the various directories
cluster-status Provides information about the status of the specified cluster.
errors Provides information about errors that have been observed on this node.
Cluster Status
Cluster ID: 0
Last driver heartbeat: 2025-03-19 11:31:41+0000 (GMT) - 558 ms ago
Last cluster activity: 2025-03-19 11:31:42+0000 (GMT) - 149 ms ago
Member ID: 1
Leadership term ID: 1
Consensus module state: ACTIVE
Cluster node role: LEADER
Cluster election state: CLOSED
Log position: 4,576
Loss
! #OBSERVATION_COUNT,TOTAL_BYTES_LOST,FIRST_OBSERVATION,LAST_OBSERVATION,SESSION_ID,STREAM_ID,CHANNEL,SOURCE
! 64,4736,2025-03-18 18:46:03.959+0000,2025-03-18 18:46:15.762+0000,480590929,108,aeron:udp?endpoint=node1:9103|term-length=64k,172.18.100.10:46581
! Total number of NAKs received: 63
! Total number of NAKs sent: 64
Loss gaps filled: 0
! First NAK sent: 2025-03-18 18:46:03+0000 (UTC). (13,087 ms ago)
! Last NAK sent: 2025-03-18 18:46:15+0000 (UTC). (1,284 ms ago)
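The comma-separated row in the loss sample above follows the column order given by its #OBSERVATION_COUNT,... header, so it can be turned into a typed value for further analysis. The following is a rough sketch under some assumptions: the LossReportLine class and LossRecord record are hypothetical names, the column order is taken only from the sample header, and records require Java 16 or later. Because a channel URI could in principle contain a comma, the sketch treats the last comma as the boundary between CHANNEL and SOURCE.

```java
import java.util.Optional;

// Hypothetical parser for one line of the sample loss output.
final class LossReportLine {

    record LossRecord(
        long observationCount, long totalBytesLost,
        String firstObservation, String lastObservation,
        int sessionId, int streamId, String channel, String source) {}

    static Optional<LossRecord> parse(String line) {
        String text = line.strip();
        if (text.startsWith("!")) {
            text = text.substring(1).strip(); // drop the leading marker
        }
        if (text.isEmpty() || text.startsWith("#")) {
            return Optional.empty(); // header row or blank line
        }
        // First six columns are fixed; the remainder is "CHANNEL,SOURCE".
        String[] head = text.split(",", 7);
        if (head.length < 7) {
            return Optional.empty(); // summary lines such as NAK totals
        }
        int lastComma = head[6].lastIndexOf(',');
        if (lastComma < 0) {
            return Optional.empty();
        }
        try {
            return Optional.of(new LossRecord(
                Long.parseLong(head[0]), Long.parseLong(head[1]),
                head[2], head[3],
                Integer.parseInt(head[4]), Integer.parseInt(head[5]),
                head[6].substring(0, lastComma),
                head[6].substring(lastComma + 1)));
        } catch (NumberFormatException e) {
            return Optional.empty(); // not a data row after all
        }
    }
}
```

Passing each line of the loss output to LossReportLine.parse yields a record for data rows and an empty Optional for the header and the NAK summary lines.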
Processing Time
===================== Processing times for the Media Driver ====================
conductor: Maximum time spent executing a duty cycle: 1,068,247 ns.
conductor: Total number of times a duty cycle breached the slow-processing threshold: 0.
sender: Maximum time spent executing a duty cycle: 1,068,106 ns.
sender: Total number of times a duty cycle breached the slow-processing threshold: 0.
receiver: Maximum time spent executing a duty cycle: 1,068,106 ns.
receiver: Total number of times a duty cycle breached the slow-processing threshold: 0.
name-resolver: Maximum time spent resolving an address: 31,630 ns.
name-resolver: Total number of times name resolution breached the slow-processing threshold: 0.
======================= Processing times for the Archive =======================
! conductor: Maximum time spent executing a duty cycle: 9,223,372,036 ns.
! conductor: Total number of times a duty cycle breached the slow-processing threshold: 1.
! recorder: Maximum time spent executing a duty cycle: 9,223,372,036 ns.
! recorder: Total number of times a duty cycle breached the slow-processing threshold: 1.
! replayer: Maximum time spent executing a duty cycle: 9,223,372,036 ns.
! replayer: Total number of times a duty cycle breached the slow-processing threshold: 1.
recorder: Maximum time spent writing to the archive: 0 ns.
replayer: Maximum time spent reading from the archive: 0 ns.
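Each measurement line in the processing-time sample has the shape "component: metric: value", optionally followed by "ns" and optionally prefixed with "!". The following sketch extracts those parts so that, for example, components with a non-zero slow-processing breach count can be listed. It rests on assumptions: the ProcessingTimeLine class and Measurement record are hypothetical, the line format is inferred only from the sample above, and the "!" prefix is treated simply as a flag whose exact meaning is not documented here. Records require Java 16 or later.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for one line of the sample processing-time output.
final class ProcessingTimeLine {

    record Measurement(boolean flagged, String component, String metric, long value) {}

    // Optional "!" marker, component name, metric text, then a number
    // with thousands separators and an optional "ns" unit.
    private static final Pattern LINE = Pattern.compile(
        "^(!\\s*)?([A-Za-z-]+):\\s+(.+):\\s+(\\d[\\d,]*)(?:\\s+ns)?\\.$");

    static Optional<Measurement> parse(String line) {
        Matcher m = LINE.matcher(line.strip());
        if (!m.matches()) {
            return Optional.empty(); // section headers and other text
        }
        try {
            long value = Long.parseLong(m.group(4).replace(",", ""));
            return Optional.of(
                new Measurement(m.group(1) != null, m.group(2), m.group(3), value));
        } catch (NumberFormatException e) {
            return Optional.empty(); // value too large for a long
        }
    }
}
```

A caller could keep only measurements whose metric contains "breached the slow-processing threshold" and whose value is greater than zero. Component names repeat across sections (the Media Driver and the Archive both have a conductor), so a fuller version would also track the most recent section header.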
Describe File
Describe File: /dev/shm/aeron-root/cnc.dat
Type: Media Driver Command and Control file
Description: Prevents multiple media drivers from using the same directory. Contains buffers to facilitate communication between the driver and its clients, aeron counters, and the driver error log.
Last heartbeat was 995 milliseconds ago
Found 143 counters
Found 8 publications
Found 9 subscriptions
Found 0 errors

Describe File: /aeron/data/archive/archive.catalog
Type: Archive Catalog File
Description: Tracks recordings and their Segment files. It contains a series of RecordingDescriptor records.
Found 1 recording
Errors
===================== Error report for Clustered Service 0 =====================
! Total errors for Clustered Service 0: 1
! Distinct errors for Clustered Service 0: 1
! Most recent error: 2025-03-19 11:44:48+0000 (UTC) (891 ms ago)
! Observations from 2025-03-19 11:44:48+0000 (UTC) to 2025-03-19 11:44:48+0000 (UTC) for: java.lang.RuntimeException: Chaos error
at io.aeron.insights.examples.cluster.ExampleClusterMain$ExampleClusteredService.chaos(ExampleClusterMain.java:166)
at io.aeron.insights.examples.cluster.ExampleClusterMain$ExampleClusteredService.onSessionMessage(ExampleClusterMain.java:147)
at io.aeron.cluster.service.ClusteredServiceAgent.onSessionMessage(ClusteredServiceAgent.java:473)
at io.aeron.cluster.service.BoundedLogAdapter.onMessage(BoundedLogAdapter.java:156)
at io.aeron.cluster.service.BoundedLogAdapter.onFragment(BoundedLogAdapter.java:70)
at io.aeron.Image.boundedControlledPoll(Image.java:599)
at io.aeron.cluster.service.BoundedLogAdapter.poll(BoundedLogAdapter.java:133)
at io.aeron.cluster.service.ClusteredServiceAgent.doWork(ClusteredServiceAgent.java:242)
at org.agrona.concurrent.AgentRunner.doWork(AgentRunner.java:304)
at org.agrona.concurrent.AgentRunner.workLoop(AgentRunner.java:296)
at org.agrona.concurrent.AgentRunner.run(AgentRunner.java:162)
at java.base/java.lang.Thread.run(Thread.java:1583)
No errors recorded for: Driver
No errors recorded for: Archive
No errors recorded for: Cluster