Configure logs collection
Elastic Stack Serverless Observability
Learn how to configure and customize logs collection through the Elastic Distribution of OpenTelemetry Collector.
Elasticsearch Ingest Pipelines are not yet applicable to OTel-native data. Use OTel Collector processing pipelines for pre-processing and parsing of logs.
You can parse logs that come in JSON format through
filelog
receiver's operators. Use the router
to check if the format is JSON and route the logs to json-parser
. For example:
# ...
receivers:
filelog:
# ...
operators:
# Check if format is json and route properly
- id: get-format
routes:
- expr: body matches "^\\{"
output: json-parser
type: router
# Parse body as JSON https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/pkg/stanza/docs/operators/json_parser.md
- type: json_parser
id: json-parser
on_error: send_quiet
parse_from: body
parse_to: body
# ...
You can parse multiline logs using the
multiline
option as in the following example:
receivers:
filelog:
include:
- /var/log/example/multiline.log
multiline:
line_start_pattern: ^Exception
The previous configuration can parse the following logs that span across multiple lines and recombine them properly into one single log message:
Exception in thread 1 "main" java.lang.NullPointerException
at com.example.myproject.Book.getTitle(Book.java:16)
at com.example.myproject.Author.getBookTitles(Author.java:25)
at com.example.myproject.Bootstrap.main(Bootstrap.java:14)
Exception in thread 2 "main" java.lang.NullPointerException
at com.example.myproject.Book.getTitle(Book.java:16)
at com.example.myproject.Author.getBookTitles(Author.java:25)
at com.example.myproject.Bootstrap.main(Bootstrap.java:44)
You can configure applications instrumented with OpenTelemetry SDKs to write their logs in OTLP/JSON
format in files stored on disk. The filelog receiver can then collect and parse the logs and
forward them to the otlpjson
connector, which extracts the OTLP
logs from the OTLP/JSON
log lines.
An example OTLP/JSON
log is the following:
{
"resourceLogs": [
{
"resource": {
"attributes": [
{
"key": "deployment.environment.name",
"value": {
"stringValue": "staging"
}
},
{
"key": "service.instance.id",
"value": {
"stringValue": "6ad88e10-238c-4fb7-bf97-38df19053366"
}
},
{
"key": "service.name",
"value": {
"stringValue": "checkout"
}
},
{
"key": "service.namespace",
"value": {
"stringValue": "shop"
}
},
{
"key": "service.version",
"value": {
"stringValue": "1.1"
}
}
]
},
"scopeLogs": [
{
"scope": {
"name": "com.mycompany.checkout.CheckoutServiceServer$CheckoutServiceImpl",
"attributes": []
},
"logRecords": [
{
"timeUnixNano": "1730435085776869000",
"observedTimeUnixNano": "1730435085776944000",
"severityNumber": 9,
"severityText": "INFO",
"body": {
"stringValue": "Order order-12035 successfully placed"
},
"attributes": [
{
"key": "customerId",
"value": {
"stringValue": "customer-49"
}
},
{
"key": "thread.id",
"value": {
"intValue": "44"
}
},
{
"key": "thread.name",
"value": {
"stringValue": "grpc-default-executor-1"
}
}
],
"flags": 1,
"traceId": "42de1f0dd124e27619a9f3c10bccac1c",
"spanId": "270984d03e94bb8b"
}
]
}
],
"schemaUrl": "https://opentelemetry.io/schemas/1.24.0"
}
]
}
You can use the following configuration to properly parse and extract the OTLP
content from these log lines:
receivers:
filelog/otlpjson:
include: [/path/to/myapp/otlpjson.log]
connectors:
otlpjson:
service:
pipelines:
logs/otlpjson:
receivers: [filelog/otlpjson]
processors: []
exporters: [otlpjson]
logs:
receivers: [otlp, otlpjson]
processors: []
exporters: [debug]
...
You can parse logs of a known technology, like Apache logs, through filelog receiver's operators. Use the regex_parser
operator to parse the logs that follow the specific pattern:
receivers:
# Receiver to read the Apache logs
filelog:
include:
- /var/log/*apache*.log
start_at: end
operators:
# Operator to parse the Apache logs
# This operator uses a regex to parse the logs
- id: apache-logs
type: regex_parser
regex: ^(?P<source_ip>\d+\.\d+.\d+\.\d+)\s+-\s+-\s+\[(?P<timestamp_log>\d+/\w+/\d+:\d+:\d+:\d+\s+\+\d+)\]\s"(?P<http_method>\w+)\s+(?P<http_path>.*)\s+(?P<http_version>.*)"\s+(?P<http_code>\d+)\s+(?P<http_size>\d+)$
The OpenTelemetry Collector also supports dynamic logs collection for Kubernetes Pods by defining Pods annotations. For detailed examples refer to Dynamic workload discovery on Kubernetes now supported with EDOT Collector and the Collector documentation.
Make sure that the Collector configuration includes the k8s_observer
and the receiver_creator
:
receivers:
receiver_creator/logs:
watch_observers: [k8s_observer]
discovery:
enabled: true
receivers:
# ...
extensions:
k8s_observer:
# ...
service:
extensions: [k8s_observer]
pipelines:
logs:
receivers: [ receiver_creator/logs]
In addition, make sure to remove or comment out any static filelog receiver. Restrict the log file pattern to avoid log duplication.
Annotating the pod activates custom log collection targeted only for the specific Pod.
# ...
metadata:
annotations:
io.opentelemetry.discovery.logs/enabled: "true"
io.opentelemetry.discovery.logs/config: |
operators:
- id: container-parser
type: container
# Check if format is json and route properly
- id: get-format
routes:
- expr: body matches "^\\{"
output: json-parser
type: router
- id: json-parser
type: json_parser
on_error: send_quiet
parse_from: body
parse_to: body
- id: custom-value
type: add
field: attributes.tag
value: custom-value
spec:
containers:
# ...
Targeting a single container's scope is also possible by scoping the annotation using containers' names, like io.opentelemetry.discovery.logs.my-container/enabled: "true"
. Refer to the Collector's documentation for additional information.
Use the following example to collect and parse Apache logs by annotating Apache containers:
metadata:
annotations:
io.opentelemetry.discovery.logs.apache/enabled: "true"
io.opentelemetry.discovery.logs.apache/config: |
operators:
- type: container
id: container-parser
- id: apache-logs
type: regex_parser
regex: ^(?P<source_ip>\d+\.\d+.\d+\.\d+)\s+-\s+-\s+\[(?P<timestamp_log>\d+/\w+/\d+:\d+:\d+:\d+\s+\+\d+)\]\s"(?P<http_method>\w+)\s+(?P<http_path>.*)\s+(?P<http_version>.*)"\s+(?P<http_code>\d+)\s+(?P<http_size>\d+)$
spec:
containers:
- name: apache
# ...
You can use OpenTelemetry Transform Language (OTTL) functions in the transform processor to parse logs of a specific format or logs that follow a specific pattern.
Use the following transform
processor configuration to parse logs in JSON
format:
processors:
transform:
error_mode: ignore
log_statements:
- context: log
statements:
# Parse body as JSON and merge the resulting map with the cache map, ignoring non-json bodies.
# cache is a field exposed by OTTL that is a temporary storage place for complex operations.
- merge_maps(cache, ParseJSON(body), "upsert") where IsMatch(body, "^\\{")
- set(body,cache["log"]) where cache["log"] != nil
The following configuration can parse Apache access log using OTTL and the transform processor:
exporters:
debug:
verbosity: detailed
receivers:
filelog:
include:
- /Users/chrismark/otelcol/log/apache.log
processors:
transform/apache_logs:
error_mode: ignore
log_statements:
- context: log
statements:
- 'merge_maps(attributes, ExtractPatterns(body, "^(?P<source_ip>\\d+\\.\\d+.\\d+\\.\\d+)\\s+-\\s+-\\s+\\[(?P<timestamp_log>\\d+/\\w+/\\d+:\\d+:\\d+:\\d+\\s+\\+\\d+)\\]\\s\"(?P<http_method>\\w+)\\s+(?P<http_path>.*)\\s+(?P<http_version>.*)\"\\s+(?P<http_code>\\d+)\\s+(?P<http_size>\\d+)$"), "upsert")'
service:
pipelines:
logs:
receivers: [filelog]
processors: [transform/apache_logs]
exporters: [debug]
A more detailed example about using OTTL and the transform processor can be found at the nginx_ingress_controller_otel integration.