Naming timestamps "TSC" or "tsc" is an historical artefact dating from
the implementation of libringbuffer, where the initial intent was to use
the x86 "rdtsc" instruction directly, which ended up not being what was
done in reality.
Rename uses of "TSC" and "tsc" to "timestamp" to clarify things and
don't require reviewers to be fluent in x86 instruction set.
5cc8729...
by
Kienan Stewart <email address hidden>
docs: Correct GitHub URLs in lttng-ust.3
The branches follow the format `stable-X.YZ` rather than `vX.YZ`.
Furthermore, when rendering the man pages from source, the URLs were
omitted completely as the subsitution `{lttng_version}` was not
defined. This hasn't been an issue for the published HTML versions as
those are produced via a different script in the `lttng-www` project
which presumably sets the substitution properly.
fix: handle EINTR correctly in get_cpu_mask_from_sysfs
If the read() in get_cpu_mask_from_sysfs() fails with EINTR, the code is
supposed to retry, but the while loop condition has (bytes_read > 0),
which is false when read() fails with EINTR. The result is that the code
exits the loop, having only read part of the string.
Use (bytes_read != 0) in the while loop condition instead, since the
(bytes_read < 0) case is already handled in the loop.
Original fix in liburcu from Benjamin Marzinski <email address hidden>:
commit 9922f33e2986 ("fix: handle EINTR correctly in get_cpu_mask_from_sysfs")
commit 4d4838bad480 ("Use MAP_POPULATE to reduce pagefault when available")
was first introduced in tag v2.11.0 and never backported to stable
branches. Its purpose was to reduce the tracer fast-path latency caused
by handling minor page faults the first time a given application writes
to each page of the ring buffer after mapping them. The discussion
thread leading to this commit can be found here [1]. When using
LTTng-UST for diagnosing real-time applications with very strict
constraints, this added latency is unwanted.
That commit introduced the MAP_POPULATE flag when mapping the ring
buffer pages, which causes the kernel to pre-populate the page table
entries (PTE).
This has, however, unintended consequences for the following scenarios:
* Short-lived applications which write very little to the ring buffer end
up taking more time to start, because of the time it takes to
pre-populate all the ring buffer pages, even though they typically won't
be used by the application.
* Containerized workloads using cpusets will also end up having longer
application startup time than strictly required, and will populate
PTE for ring buffers of CPUs which are not present in the cpuset.
There are, therefore, two sets of irreconcilable requirements:
short-lived and containerized workloads benefit from lazily populating
the PTE, whereas real-time workloads benefit from pre-populating them.
This will therefore require a tunable environment variable that will let
the end-user choose the behavior for each application.
Solution
--------
Allow users to specify whether they want to pre-populate
shared memory pages within the application with an environment
variable.
LTTNG_UST_MAP_POPULATE_POLICY
If set, override the policy used to populate shared memory pages within the
application. The expected values are:
none
Do not pre-populate any pages, take minor faults on first access while
tracing.
cpu_possible Pre-populate pages for all possible CPUs in the system, as listed by /sys/devices/system/cpu/possible.
Default: none. If the policy is unknown, use the default.
Choice of the default
---------------------
Given that users with strict real-time constraints already have to setup
their tracing with specific options (see the "--read-timer"
lttng-enable-channel(3) option [2]), it makes sense that the default
is to lazily populate the ring buffer PTE, and require users with
real-time constraints to explicitly enable the pre-populate through an
environment variable.
Effect on default behavior
--------------------------
The default behavior for ring buffer PTE mapping will be changing across
LTTng-UST versions in the following way:
glibc 2.34 implements close_range(2), which is used by the ssh client
(amongst others). This needs to be overridden to make sure ssh does not
close lttng-ust file descriptors.
`len_type' of a sequence field must be of type unsigned integer. Some
provided examples in the man page were incorrectly using a type signed
integer, resulting in correct compilation, but error while decoding.