LTTng trace format

This document was written while trying to understand the LTTng trace format; hopefully it is a useful reference for others. UPDATE: I have since found the official documentation, so this document now records my remaining questions about the format.

Things that I do not understand are marked in red. Things that are currently unused or unimplemented are marked in green.

Directory format

LTTng stores a trace in a directory containing multiple files, rather than in a single file. The directory contains a subdirectory eventdefs, which contains the XML-like definitions of the events, and a subdirectory control, which contains sets of trace files named tracefilename_n, where n is the CPU number. One of these sets is called facilities. Its files are special because they contain facility load events, which load event definitions from the XML-like files. The other sets of files are:

Trace set name  Description
interrupts      Contains any interrupt events.
modules         Contains module load and unload events.
processes       Contains important process events such as creation and exit.

Most events are stored in the trace set called cpu. Unlike the other sets, its trace files are stored in the root of the trace directory, not under control.
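
The naming scheme above can be sketched as follows. This is only a minimal sketch in Python, assuming that every trace file really is named <set>_<cpu> and that nothing else in those directories matches that pattern. The names TRACEFILE_RE, split_tracefile_name and group_tracefiles are made up for this example and are not part of LTTng.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumption: every per-CPU trace file is named "<set>_<cpu>", e.g. "cpu_0".
TRACEFILE_RE = re.compile(r"^(?P<set>.+)_(?P<cpu>\d+)$")


def split_tracefile_name(name):
    """Return (trace_set, cpu_number), or None if the name does not match."""
    match = TRACEFILE_RE.match(name)
    if match is None:
        return None
    return match["set"], int(match["cpu"])


def group_tracefiles(trace_dir):
    """Map each trace set name to a {cpu_number: path} dictionary."""
    root = Path(trace_dir)
    sets = defaultdict(dict)
    # The cpu set lives in the trace root, the other sets under control/.
    for directory in (root, root / "control"):
        if not directory.is_dir():
            continue
        for path in directory.iterdir():
            parts = split_tracefile_name(path.name)
            if path.is_file() and parts is not None:
                trace_set, cpu = parts
                sets[trace_set][cpu] = path
    return dict(sets)
```

Calling group_tracefiles on a trace directory should then give one entry per trace set (cpu, facilities, interrupts, and so on), each keyed by CPU number.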

Block format

Each tracefile consists of a list of blocks. Each block contains a header followed by a list of events.

Each block starts with a header containing the following fields:

Block header format
Name               Offset (bytes)  Type      Description
begin_cycle_count  0               uint64_t  Value of the timestamp counter for the first entry in this block.
begin_freq         8               uint64_t  CPU frequency at the start of the block.
end_cycle_count    16              uint64_t  Value of the timestamp counter for the last entry in this block.
end_freq           24              uint64_t  Clock frequency at the end of the block. Will be used on variable frequency processors.
lost_size          32              uint32_t  The number of unused bytes in this block. As events don't always fill the block entirely, this indicates how many bytes are unused at the end.
buf_size           36              uint32_t  The size of this buffer.
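
To make the layout concrete, here is a minimal sketch of reading the block header with Python's struct module. It assumes the fields are packed exactly at the offsets listed above, and it takes the byte order as a parameter (little endian by default), since the byte order can only be determined from the magic number in the trace file header. The names BlockHeader, parse_block_header and used_bytes are made up for this example.

```python
import struct
from collections import namedtuple

BlockHeader = namedtuple(
    "BlockHeader",
    "begin_cycle_count begin_freq end_cycle_count end_freq lost_size buf_size",
)

# Four uint64_t fields followed by two uint32_t fields, with no padding.
BLOCK_HEADER_SIZE = 40


def parse_block_header(data, byte_order="<"):
    """Decode the block header found at the start of a block.

    byte_order is "<" (little endian) or ">" (big endian); it is an
    assumption here and should come from the magic number check.
    """
    return BlockHeader(*struct.unpack_from(byte_order + "QQQQII", data, 0))


def used_bytes(header):
    """Bytes of the block that are in use, i.e. buf_size minus lost_size."""
    return header.buf_size - header.lost_size
```

Events in a block would then be expected to end at offset used_bytes(header) from the start of the block.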

The number of blocks in a file is determined by dividing the size of the file by the size of the first block. In the future, blocks of variable size will be used; determining the number of blocks will then require a scan of the file.
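
The block count calculation can be sketched like this. The sketch assumes that the size of a block is the buf_size field at offset 36 of the first block header, and that a file whose size is not a multiple of that value should be treated as an error; both are my assumptions. count_blocks is a made-up name.

```python
import struct


def count_blocks(file_size, first_block, byte_order="<"):
    """Number of fixed-size blocks in a trace file.

    first_block holds at least the first 40 bytes of the file.
    Assumption: the block size equals the buf_size field (offset 36).
    """
    (buf_size,) = struct.unpack_from(byte_order + "I", first_block, 36)
    if buf_size == 0:
        raise ValueError("buf_size is zero")
    blocks, remainder = divmod(file_size, buf_size)
    if remainder:
        raise ValueError("file size is not a multiple of buf_size")
    return blocks
```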

The trace file header follows the block header. It is repeated in each block to support the flight recorder mode of operation. The trace file header has a set of generic fields, which are defined for all trace files, followed by a set of extra fields that depends on the trace file version. The generic fields are:

Generic trace file header format
Name              Offset (bytes)  Type      Description
magic_number      40              uint32_t  A magic number 0x00D6B7ED. This field can be read to determine the endianness of the file.
arch_type         44              uint32_t  Architecture of the traced machine. Valid values are:
  • 1: i386
  • 2: ppc
  • 3: sh
  • 4: s390
  • 5: MIPS
  • 6: ARM
  • 7: ppc64
  • 8: x86_64
arch_variant      48              uint32_t  Architecture variant. May not be used on all architectures. Valid values are:
  • 0: No variant
float_word_order  52              uint32_t  Byte order of floats and doubles. Valid values are:
  • 0: No floats in the trace
  • __LITTLE_ENDIAN (1234): Trace has little endian floats
  • __BIG_ENDIAN (4321): Trace has big endian floats
arch_size         56              uint8_t   Size of void * in bytes.
major_version     57              uint8_t   Major version number of the file format.
minor_version     58              uint8_t   Minor version number of the file format.
flight_recorder   59              uint8_t   Is flight recorder mode activated?
has_heartbeat     60              uint8_t   Does this trace have a heartbeat? Valid values:
  • 1 (yes): Event header has 32-bit timestamp.
  • 0 (no): Event header has 64-bit timestamp.
alignment         61              uint8_t   Alignment of event header and event data within the trace file. (Note: this was previously badly named has_alignment.)
freq_scale        62              uint32_t  Amount by which frequency is scaled. real_freq = freq / freq_scale.
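
As an illustration of using the magic number to determine endianness, here is a minimal Python sketch that tries both byte orders and then decodes the generic fields. It assumes the fields are packed at the offsets listed above (40 to 65) with no padding. The names LTT_MAGIC, detect_byte_order and parse_generic_trace_header are made up for this example.

```python
import struct
from collections import namedtuple

LTT_MAGIC = 0x00D6B7ED
TRACE_HEADER_OFFSET = 40  # the trace file header follows the block header

GenericTraceHeader = namedtuple(
    "GenericTraceHeader",
    "magic_number arch_type arch_variant float_word_order arch_size "
    "major_version minor_version flight_recorder has_heartbeat "
    "alignment freq_scale",
)


def detect_byte_order(block):
    """Return "<" or ">" depending on how the magic number decodes."""
    for order in ("<", ">"):
        (magic,) = struct.unpack_from(order + "I", block, TRACE_HEADER_OFFSET)
        if magic == LTT_MAGIC:
            return order
    raise ValueError("bad magic number, not an LTTng trace block?")


def parse_generic_trace_header(block):
    """Return (byte_order, GenericTraceHeader) for one block."""
    order = detect_byte_order(block)
    # Four uint32_t, six uint8_t, one uint32_t, packed without padding.
    fields = struct.unpack_from(order + "IIII6BI", block, TRACE_HEADER_OFFSET)
    return order, GenericTraceHeader(*fields)
```

The byte order found this way can then be passed to whatever code decodes the block header and the events.
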
Version 0.7 trace file header format
Name             Offset (bytes)  Type      Description
start_freq       66              uint64_t  Frequency at the start of the trace. This may differ from the begin_freq field in the first block if LTT was operating in flight recorder mode.
start_tsc        74              uint64_t  Time stamp counter at the start of the trace. This may differ from the begin_cycle_count field in the first block if LTT was operating in flight recorder mode.
unused           82              uint64_t  UNUSED
start_time_sec   90              uint64_t  Seconds component of time at start of the trace. This is the only reference to real time in the trace.
start_time_usec  98              uint64_t  Microseconds component of time at start of the trace.
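
Since start_time_sec and start_time_usec are the only reference to real time, a cycle count has to be converted relative to start_tsc. The sketch below shows one way this could be done. It rests on several assumptions of mine: the five uint64_t fields are packed back to back from offset 66 (so at 66, 74, 82, 90 and 98), the frequency is in Hz once divided by freq_scale, and the frequency stays constant over the trace. The names TraceHeader07, parse_trace_header_0_7 and cycles_to_wall_clock are made up.

```python
import struct
from collections import namedtuple

TraceHeader07 = namedtuple(
    "TraceHeader07",
    "start_freq start_tsc unused start_time_sec start_time_usec",
)


def parse_trace_header_0_7(block, byte_order="<"):
    """Decode the version 0.7 fields, assumed packed from offset 66."""
    return TraceHeader07(*struct.unpack_from(byte_order + "5Q", block, 66))


def cycles_to_wall_clock(tsc, header, freq_scale):
    """Convert a cycle count to seconds since the epoch (as a float).

    Assumes a constant frequency: real_freq = start_freq / freq_scale.
    """
    real_freq = header.start_freq / freq_scale
    elapsed = (tsc - header.start_tsc) / real_freq
    return header.start_time_sec + header.start_time_usec / 1e6 + elapsed
```

On variable frequency processors this would presumably need the per-block begin_freq and end_freq values instead of a single frequency.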

Event format

Following the trace file header is a list of events. Each event consists of an event header, which is generic to all types of events, and event data, which is specific to the event type. Each component is aligned to the alignment field in the trace header.

Event header format (no heartbeat)
Name         Offset (bytes)  Type      Description
timestamp    0               uint64_t  Timestamp counter of this event.
facility_id  8               uint8_t   References which facility this event is in.
event_id     9               uint8_t   Identifies the specific event in the given facility.
event_size   10              uint16_t  Size of event data. Valid values are:
  • 0 - 0xFFFE: Size of the event data.
  • 0xFFFF: Event data too big. Parser must determine data size.
Event header format (with heartbeat)
Name         Offset (bytes)  Type      Description
timestamp    0               uint32_t  32 least significant bits of the timestamp counter of this event.
facility_id  4               uint8_t   References which facility this event is in.
event_id     5               uint8_t   Identifies the specific event in the given facility.
event_size   6               uint16_t  Size of event data. Valid values are:
  • 0 - 0xFFFE: Size of the event data.
  • 0xFFFF: Event data too big. Parser must determine data size.
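
The two event header layouts can be decoded with one function, as in the minimal sketch below. It assumes the header fields are packed (which puts event_size at offset 6 in the heartbeat layout), that the alignment value is a byte count applied to offsets within the buffer passed in, and that an alignment of 0 or 1 means no alignment. These are assumptions on my part, and the names EventHeader, align_up and parse_event_header are made up.

```python
import struct
from collections import namedtuple

EventHeader = namedtuple("EventHeader", "timestamp facility_id event_id event_size")

EVENT_SIZE_UNKNOWN = 0xFFFF  # event data too big, parser must find the size


def align_up(offset, alignment):
    """Round offset up to a multiple of alignment (0 or 1 means none)."""
    if alignment <= 1:
        return offset
    return (offset + alignment - 1) // alignment * alignment


def parse_event_header(data, offset, has_heartbeat, alignment=0, byte_order="<"):
    """Return (EventHeader, offset of the event data).

    event_size is None when the header holds 0xFFFF, meaning the size has
    to be worked out from the event definition instead.
    """
    offset = align_up(offset, alignment)
    # 32-bit timestamp with heartbeat, 64-bit timestamp without.
    fmt = byte_order + ("IBBH" if has_heartbeat else "QBBH")
    timestamp, facility_id, event_id, size = struct.unpack_from(fmt, data, offset)
    if size == EVENT_SIZE_UNKNOWN:
        size = None
    data_offset = align_up(offset + struct.calcsize(fmt), alignment)
    return EventHeader(timestamp, facility_id, event_id, size), data_offset
```

When event_size is known, the next event header would be expected at data_offset + event_size, again rounded up to the alignment.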

Future stuff.

bookmarks.xml

bookmarks.xml will eventually be generated by the viewer to add annotations to the trace (for example, a specific time marked with some text information). It has not been implemented yet.

system.xml

system.xml should be an XML file that contains the system information at trace start as seen by lttctl. Note that this information _should_ be optional, as a trace can start before there is any user space VFS. Moreover, this information is not always relevant: think of a vserver system, where the hostname is different for processes in each vserver. As the whole kernel is traced, which hostname will be chosen? The answer is the hostname as seen in the environment variables of the lttctl process. This, too, has not been implemented yet.

Event type definitions

The format of the event data depends on the event type. Event types are defined in XML-like files found in the eventdefs subdirectory of the trace.

The core events are defined in core.xml, and are available at startup. Other event definitions are loaded explicitly by events in the facilities trace files.

Event definition format

This is best described by the XML schema description.

Events

The following documentation is generated directly from the XML files.

Template: core

The core facility contains the basic tracing related events.

Type: timestamp

Event: facility_load

Facility is loaded.

name
Name of the facility to load.
checksum
XXX: Unknown
id
The ID to assign to this facility.
int_size
Default size of integer for facility.
long_size
Size of long for facility.
pointer_size
Size of pointers for facility.
size_t_size
Size of size_t for facility.
has_alignment
XXX: Size of alignment, or bool?

Event: facility_unload

Facility is unloaded.

id
Identifies the facility to unload.
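
Because the facility_id in each event header only makes sense relative to earlier facility_load and facility_unload events, a reader has to keep a table of loaded facilities. Here is a minimal sketch of such a table. It works on already decoded field values, since the binary layout of the event data is not described here. FacilityTable and its methods are made-up names, and registering core in advance (under ID 0, a placeholder value) is only my reading of "available at startup".

```python
class FacilityTable:
    """Tracks which facility name is currently bound to each facility ID."""

    def __init__(self):
        self._by_id = {}

    def load(self, facility_id, name, checksum=None):
        """Handle a facility_load event: bind name (and checksum) to the ID."""
        self._by_id[facility_id] = (name, checksum)

    def unload(self, facility_id):
        """Handle a facility_unload event; unknown IDs are ignored."""
        self._by_id.pop(facility_id, None)

    def name_of(self, facility_id):
        """Name of the facility bound to the ID, or None if not loaded."""
        entry = self._by_id.get(facility_id)
        return None if entry is None else entry[0]


table = FacilityTable()
table.load(0, "core")  # hypothetical ID, core is assumed to be preloaded
```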

Event: time_heartbeat

System time values sent periodically to detect cycle counter rollovers. Useful when only the 32 LSB of the TSC are saved in the event header: the full 64 bits are saved in this event.

timestamp
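
The way a heartbeat lets a reader rebuild full timestamps can be sketched as follows. This is a minimal sketch under my own assumptions: events are processed in time order, and at most one rollover of the low 32 bits happens between two consecutive timestamps (which is what periodic heartbeats are there to guarantee). TimestampExpander is a made-up name.

```python
class TimestampExpander:
    """Rebuilds 64-bit timestamps from 32-bit event header timestamps."""

    def __init__(self, initial_tsc=0):
        # A sensible initial value might be begin_cycle_count of the block.
        self.last_tsc = initial_tsc

    def heartbeat(self, full_tsc):
        """Record the full 64-bit value carried by a time_heartbeat event."""
        self.last_tsc = full_tsc
        return full_tsc

    def expand(self, low32):
        """Extend a 32-bit timestamp using the last known full value."""
        high = self.last_tsc >> 32
        if low32 < (self.last_tsc & 0xFFFFFFFF):
            # The low 32 bits went backwards, so the counter rolled over.
            high += 1
        self.last_tsc = (high << 32) | low32
        return self.last_tsc
```

For example, after a heartbeat of 0x1FFFFFFF0, a 32-bit timestamp of 0x10 is expanded to 0x200000010.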

Event: state_dump_facility_load_per_trace

Facility is loaded while in state dump. XXX: I am not sure why this is special, and needed in addition to facility load.

name
Name of the facility to load.
checksum
XXX: Unknown
id
The ID to assign to this facility.
int_size
Default size of integer for facility.
long_size
Size of long for facility.
pointer_size
Size of pointers for facility.
size_t_size
Size of size_t for facility.
has_alignment
XXX: Size of alignment, or bool?

Template: fs

The fs facility contains events related to file system operation

Event: buf_wait_start

Starting to wait for a buffer

address
Address of the buffer head.

Event: buf_wait_end

Ending wait for a buffer

address
Address of the buffer head.

Event: exec

Executing a file

filename
File name

Event: open

Opening a file

filename
File name
fd
File descriptor

Event: close

Closing a file descriptor

fd
File descriptor

Event: read

Reading from a file descriptor

fd
File descriptor
count
Number of bytes to read

Event: write

Write to a file descriptor

fd
File descriptor
count
Number of bytes to write

Event: seek

Seek a file descriptor

fd
File descriptor
offset
Offset to seek to, in bytes
origin
Position the offset is relative to

Event: ioctl

Do an ioctl on a file descriptor

fd
File descriptor
cmd
Command
arg
Argument

Event: select

Do a select on a file descriptor

fd
File descriptor
timeout
Time out

Event: poll

Do a poll on a file descriptor

fd
File descriptor

Template: ipc

The ipc facility contains events related to Inter Process Communication

Event: call

IPC call

call_number
Number of IPC call
first
First argument

Event: msg_create

Get an IPC message queue identifier

id
Message queue identifier
flags
Message flags

Event: sem_create

Get an IPC semaphore identifier

id
Semaphore identifier
flags
Semaphore flags

Event: shm_create

Get an IPC shared memory identifier

id
Shared memory identifier
flags
Shared memory flags

Template: kernel

The kernel facility has events related to kernel execution status.

Type: tasklet_priority

Type: irq_mode

Event: trap_entry

Entry in a trap

trap_id
Trap number
address
Address where trap occurred

Event: trap_exit

Exit from a trap

Event: soft_irq_entry

Soft IRQ entry

softirq_id
Soft IRQ number

Event: soft_irq_exit

Soft IRQ exit

softirq_id
Soft IRQ number

Event: tasklet_entry

Tasklet entry

priority
Tasklet priority
address
Tasklet function address
data
Tasklet data address

Event: tasklet_exit

Tasklet exit

priority
Tasklet priority
address
Tasklet function address
data
Tasklet data address

Event: irq_entry

Entry in an irq

irq_id
IRQ number
mode
Are we executing kernel code

Event: irq_exit

Exit from an IRQ

Template: kernel_arch

The kernel_arch facility has events related to kernel execution status for the i386 architecture.

Type: syscall_name

Event: syscall_entry

System call entry

syscall_id
Syscall entry number in entry.S
address
Address from which call was made

Event: syscall_exit

System call exit

Template: memory

The memory facility has memory management events.

Event: page_alloc

Page allocation

order
Order of the page to allocate
address
Assigned page address, or 0 if failed.

Event: page_free

Page free

order
Order of the page to free
address
Address of the page to free.

Event: swap_in

Page swapped into memory

address
Address of the page to swap in.

Event: swap_out

Page swapped to disk

address
Address of the page to swap out.

Event: page_wait_start

Starting to wait for a page

address
Address of the page we wait for.

Event: page_wait_end

Ending wait for a page

address
Address of the page we wait for.

Template: network

The network facility contains events related to low level network operations

Event: packet_in

A packet is arriving

skbuff
Socket buffer pointer: identifies the socket buffer
protocol
Protocol of the packet

Event: packet_out

We send a packet

skbuff
Socket buffer pointer: identifies the socket buffer
protocol
Protocol of the packet

Template: process

The process facility has events related to process handling in the kernel.

Type: signal_name

Event: fork

Process fork

parent_pid
PID of the parent process
child_pid
PID of the child process

Event: kernel_thread

Just created a new kernel thread

pid
PID of the kernel thread
function
Function called

Event: exit

Process exit

pid
PID of the process

Event: wait

Process wait

parent_pid
PID of the waiting process
child_pid
PID of the process waited for

Event: free

Process kernel data structure free (end of life of a zombie)

pid
PID of the freed process

Event: kill

Process kill system call

pid
PID of the process
target_pid
PID of the process to kill
signal
Signal number

Event: signal

Process signal reception

pid
PID of the receiving process
signal
Signal number

Event: wakeup

Process wakeup

pid
PID of the awakened process
state
State of the awakened process. -1 unrunnable, 0 runnable, >0 stopped.

Event: schedchange

Scheduling change

out
Outgoing process
in
Incoming process
out_state
Outgoing process' state. -1 unrunnable, 0 runnable, >0 stopped.
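
The state encoding used by wakeup and schedchange can be turned into a readable label with a small helper. This is only a sketch of the three cases listed above; anything below -1 is not described, so it is reported as unknown. describe_process_state is a made-up name.

```python
def describe_process_state(state):
    """Map the state / out_state value to a label."""
    if state == -1:
        return "unrunnable"
    if state == 0:
        return "runnable"
    if state > 0:
        return "stopped"
    # Values below -1 are not documented here.
    return "unknown"
```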

Template: socket

The socket facility contains events related to sockets

Event: call

Generic socket call. FIXME: should be more detailed.

call_number
Number of socket call
first_argument
First argument of socket call

Event: create

Create a socket

socket
Socket structure address
family
Socket family
type
Socket type
protocol
Socket protocol
fd
Socket file descriptor

Event: sendmsg

Sending a socket message

socket
Socket structure address
family
Socket family
type
Socket type
protocol
Socket protocol
size
Size of the message

Event: recvmsg

Receiving a socket message

socket
Socket structure address
family
Socket family
type
Socket type
protocol
Socket protocol
size
Size of the message

Template: statedump

The statedump facility contains the events generated at trace startup

Type: execution_mode

Type: execution_submode

Type: process_status

Event: enumerate_file_descriptors

List of open file descriptors

name
File name
PID
Process identifier
fd
File descriptor index in this process's task_struct

Event: enumerate_vm_maps

List of active vm maps

PID
Process identifier
start
VM's start address
end
VM's end address
flags
VM area flags
pgoff
VM's page offset
inode
Inode associated with this VM

Event: enumerate_modules

List of loaded kernel modules

name
Module name
state
Module's state
ref
Number of references to this module

Event: enumerate_interrupts

List of registered interrupts

name
Interrupt name
action
Action triggered by interrupt
num
Interrupt number

Event: enumerate_process_state

State of each process when statedump is performed

pid
Process identifier
parent_pid
Parent process identifier
name
Process name
mode
Execution mode
submode
Execution submode
status
Process status

Event: statedump_end

Kernel state dump complete

Template: timer

The timer facility has events related to timer events in the kernel.

Type: itimer_kind

Event: expired

A timer or itimer has expired.

pid
PID of the process to wake up.

Event: softirq

The timer softirq is currently running.

Event: set_itimer

An interval timer is set.

which
Kind of interval timer.
interval_seconds
interval_microseconds
value_seconds
value_microseconds