Configure your Collector to gather LangSmith telemetry
As seen in the previous section, the services in a LangSmith deployment emit telemetry data in the form of logs, metrics, and traces. You may already have telemetry collectors running in your Kubernetes cluster, or you may want to deploy one to monitor your application.
This section shows you how to configure an OTel Collector to gather telemetry data from LangSmith. The concepts discussed below translate to other collectors such as Fluentd or Fluent Bit.
This section is only applicable to Kubernetes deployments.
Receivers
Logs
As discussed previously, logs are read from the filesystem of the nodes/containers running the application. An example configuration for reading logs from files:
filelog:
  exclude: []
  include:
    - /var/log/pods/*/*/*.log
  include_file_name: false
  include_file_path: true
  operators:
    - id: container-parser
      max_log_size: 102400
      type: container
  retry_on_failure:
    enabled: true
  start_at: end
The above configuration reads logs from all files in the cluster that the collector has access to. If you would like to read only LangSmith logs, you need to either:
- Only include files from containers in your LangSmith namespace (see the sketch after this list), or
- Filter out logs from other namespaces and/or applications in your processing logic.
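For example, a minimal sketch of the first approach, assuming your LangSmith release runs in a namespace named langsmith (on each node, pod log directories follow the <namespace>_<pod_name>_<pod_uid> pattern):
filelog:
  include:
    # only match log files from pods in the langsmith namespace
    - /var/log/pods/langsmith_*/*/*.log
  include_file_path: true
  operators:
    - id: container-parser
      type: container
  start_at: end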
Metrics
Metrics can be scraped from the Prometheus endpoints exposed by each service. The configuration below scrapes all LangSmith database and service metrics:
prometheus:
  config:
    scrape_configs:
      - job_name: database-metrics
        scrape_interval: 15s
        static_configs:
          - targets:
              - <postgres_service_name>.<namespace>.svc.cluster.local:<metrics_port>
              - <redis_service_name>.<namespace>.svc.cluster.local:<metrics_port>
              - <clickhouse_service>.<namespace>.svc.cluster.local:<metrics_port>
        metrics_path: /metrics
      - job_name: service-metrics
        scrape_interval: 15s
        static_configs:
          - targets:
              - <backend_service_name>.<namespace>.svc.cluster.local:<metrics_port>
              - <platform_backend_service_name>.<namespace>.svc.cluster.local:<metrics_port>
              - <playground_service_name>.<namespace>.svc.cluster.local:<metrics_port>
        metrics_path: /metrics
Traces
For traces, you need to enable the OTLP receiver. The following configuration listens for traces over HTTP on port 4318 and gRPC on port 4317:
otlp:
  protocols:
    grpc:
      endpoint: 0.0.0.0:4317
    http:
      endpoint: 0.0.0.0:4318
Processors
Recommended OTel Processors
The following processors are recommended when using the OTel Collector (a combined configuration sketch follows this list):
- Batch Processor: Groups the data into batches before sending to exporters.
- Memory Limiter: Prevents the collector from using too much memory and crashing. When the soft limit is crossed, the collector stops accepting new data.
- Kubernetes Attributes Processor: Adds Kubernetes metadata such as pod name into the telemetry data.
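A minimal processors block combining the three is sketched below; the values are illustrative and should be tuned for your cluster:
processors:
  memory_limiter:
    check_interval: 1s          # how often memory usage is measured
    limit_percentage: 75        # hard limit as a share of available memory
    spike_limit_percentage: 25  # headroom below the hard limit (defines the soft limit)
  batch:
    send_batch_size: 8192       # flush once this many records are buffered
    timeout: 1s                 # ...or after this much time has passed
  k8sattributes:
    extract:
      metadata:
        - k8s.namespace.name
        - k8s.pod.name          # adds the pod name to every record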
Exporters
Exporters point to the external endpoints where you want to send your telemetry. The following configuration defines a separate endpoint for logs, metrics, and traces:
otlphttp/logs:
  endpoint: <your_logs_endpoint>
  tls:
    insecure: false
otlphttp/metrics:
  endpoint: <your_metrics_endpoint>
  tls:
    insecure: false
otlphttp/traces:
  endpoint: <your_traces_endpoint>
  tls:
    insecure: false
The OTel Collector also supports exporting directly to a Datadog endpoint.
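As a rough sketch, a Datadog exporter (available in the Collector contrib distribution) might look like the following; the environment variable holding your API key is an assumption you would need to provide:
exporters:
  datadog:
    api:
      site: datadoghq.com       # adjust to your Datadog site
      key: ${env:DD_API_KEY}    # assumed environment variable containing your Datadog API key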
Example Collector Configuration:
Note that this configuration uses a filter processor to drop any log records that do not come from the LangSmith application in the deployment namespace.
receivers:
  filelog:
    exclude: []
    include:
      - /var/log/pods/*/*/*.log
    include_file_name: false
    include_file_path: true
    operators:
      - id: container-parser
        max_log_size: 102400
        type: container
    retry_on_failure:
      enabled: true
    start_at: end
  prometheus:
    config:
      scrape_configs:
        - job_name: database-metrics
          scrape_interval: 15s
          static_configs:
            - targets:
                - <postgres_service_name>.<namespace>.svc.cluster.local:<metrics_port>
                - <redis_service_name>.<namespace>.svc.cluster.local:<metrics_port>
                - <clickhouse_service>.<namespace>.svc.cluster.local:<metrics_port>
          metrics_path: /metrics
        - job_name: service-metrics
          scrape_interval: 15s
          static_configs:
            - targets:
                - <backend_service_name>.<namespace>.svc.cluster.local:<metrics_port>
                - <platform_backend_service_name>.<namespace>.svc.cluster.local:<metrics_port>
                - <playground_service_name>.<namespace>.svc.cluster.local:<metrics_port>
          metrics_path: /metrics
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
processors:
  batch:
    send_batch_size: 8192
    timeout: 1s
  memory_limiter:
    check_interval: 1m
    limit_percentage: 75
    spike_limit_percentage: 25
  filter:
    error_mode: ignore
    logs:
      log_record:
        - 'resource.attributes["k8s.namespace.name"] != "<langsmith_namespace>"'
        - 'resource.attributes["k8s.app.name"] != "<langsmith_app_name>"'
  k8sattributes:
    extract:
      labels:
        - from: pod
          key: app.kubernetes.io/name
          tag_name: k8s.app.name
      metadata:
        - k8s.namespace.name
exporters:
  otlphttp/logs:
    endpoint: <your_logs_endpoint>
    tls:
      insecure: false
  otlphttp/metrics:
    endpoint: <your_metrics_endpoint>
    tls:
      insecure: false
  otlphttp/traces:
    endpoint: <your_traces_endpoint>
    tls:
      insecure: false
service:
  pipelines:
    logs/langsmith:
      receivers: [filelog]
      processors: [k8sattributes, filter, batch, memory_limiter]
      exporters: [otlphttp/logs]
    metrics/langsmith:
      receivers: [prometheus]
      processors: [batch, memory_limiter]
      exporters: [otlphttp/metrics]
    traces/langsmith:
      receivers: [otlp]
      processors: [batch, memory_limiter]
      exporters: [otlphttp/traces]
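As a rough deployment sketch, if you run the Collector with the open-telemetry/opentelemetry-collector Helm chart, you could place the configuration above under the chart's config key in a values file (the file name below is hypothetical) and run the Collector as a DaemonSet so that each node's pod log files are readable:
# collector-values.yaml (hypothetical file name)
mode: daemonset                                       # one Collector per node, so /var/log/pods is accessible
image:
  repository: otel/opentelemetry-collector-contrib   # contrib image ships filelog, prometheus, and k8sattributes
config:
  # paste the receivers, processors, exporters, and service sections shown above here;
  # the chart merges this block into its default configuration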