Config v2.0 #1351

Open
MrAlias wants to merge 25 commits into open-telemetry:main from MrAlias:config-v2

Contributor

@MrAlias MrAlias commented Feb 23, 2026

  • Design plan methodology for v2.0 of OBI config
  • JSON Schema for v2.0
  • Migration plan
  • Examples and tooling

codecov bot commented Feb 23, 2026

Codecov Report

❌ Patch coverage is 0% with 321 lines in your changes missing coverage. Please review.
✅ Project coverage is 43.31%. Comparing base (4793b9b) to head (8c77956).
⚠️ Report is 14 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| devdocs/config/version-2.0/verify.go | 0.00% | 321 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1351      +/-   ##
==========================================
- Coverage   43.76%   43.31%   -0.45%     
==========================================
  Files         308      313       +5     
  Lines       33495    34214     +719     
==========================================
+ Hits        14658    14819     +161     
- Misses      17894    18439     +545     
- Partials      943      956      +13     

| Flag | Coverage Δ |
| --- | --- |
| integration-test | 21.55% <ø> (-0.13%) ⬇️ |
| integration-test-arm | 0.00% <ø> (ø) |
| integration-test-vm-x86_64-5.15.152 | ? |
| integration-test-vm-x86_64-6.10.6 | ? |
| k8s-integration-test | 2.31% <ø> (-0.02%) ⬇️ |
| oats-test | 0.00% <ø> (ø) |
| unittests | 44.12% <0.00%> (-0.48%) ⬇️ |

Contributor

@NimrodAvni78 NimrodAvni78 left a comment

Nice!
I know it's in draft, but I wanted to give some input as well.

startup_dump: ""
debug_trace_output: disabled
# Optional profiling endpoint.
profiling:

Contributor

Not sure if this is closely related to the new config initiative, but
for profiling and internal metrics, we can think about how to use the existing infrastructure in the collector (pprof extension and internal telemetry) while still allowing OBI to run as an independent binary.

Contributor Author

What do you think about removing these from the OBI config and moving into the command options/args?

Meaning the obi command would take these as flags and set up the appropriate profiling/internal_metrics, and then, since they are not in the config, the collector receiver would just assume the user will configure these profiling/internal metrics settings in the appropriate part of the collector configuration.
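
To make the flag-based idea concrete, here is a minimal sketch of how the standalone binary could read these settings from the command line instead of the config file. The flag names `-profile-port` and `-internal-metrics-port` and the `runtimeOptions` type are hypothetical, chosen only for illustration; nothing in this PR defines them.

```go
package main

import (
	"flag"
	"fmt"
)

// runtimeOptions holds process-level settings that would be read from
// command-line flags rather than from the OBI config file.
// The type and its fields are hypothetical.
type runtimeOptions struct {
	ProfilePort         int
	InternalMetricsPort int
}

// parseRuntimeOptions parses args (without the program name).
// Both flag names are made up for this sketch; they are not existing obi flags.
func parseRuntimeOptions(args []string) (runtimeOptions, error) {
	var opts runtimeOptions
	fs := flag.NewFlagSet("obi", flag.ContinueOnError)
	fs.IntVar(&opts.ProfilePort, "profile-port", 0, "pprof listen port (0 disables profiling)")
	fs.IntVar(&opts.InternalMetricsPort, "internal-metrics-port", 0, "internal metrics listen port (0 disables them)")
	if err := fs.Parse(args); err != nil {
		return runtimeOptions{}, err
	}
	return opts, nil
}

func main() {
	opts, err := parseRuntimeOptions([]string{"-profile-port=6060"})
	if err != nil {
		panic(err)
	}
	fmt.Printf("profiling port: %d, internal metrics port: %d\n", opts.ProfilePort, opts.InternalMetricsPort)
}
```

With this split, the collector receiver would never see these options, which matches the assumption that collector users configure profiling and internal telemetry in the collector's own configuration.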

@MrAlias MrAlias marked this pull request as ready for review February 25, 2026 20:11
@MrAlias MrAlias requested a review from a team as a code owner February 25, 2026 20:11
Copilot AI review requested due to automatic review settings February 25, 2026 20:11

Contributor

Copilot AI left a comment

Pull request overview

This pull request introduces the v2.0 configuration schema for OBI (OpenTelemetry eBPF Instrumentation), representing a comprehensive redesign of the configuration model to better align with OpenTelemetry's declarative configuration format and improve user experience.

Changes:

  • Design documentation defining principles, user journeys, and configuration structure for v2.0
  • JSON Schema definition for the extensions.obi section with comprehensive validation rules
  • Migration plan outlining the strategy for transitioning from v1 to v2 configuration
  • Example default configuration in v2 format with detailed comments
  • Verification tooling (Go and Python) to validate schema correctness and ensure feature parity between v1 and v2

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| devdocs/config/version-2.0/config-v2.md | Design document describing principles, user journeys, and the target v2 configuration structure with detailed field mappings from v1 |
| devdocs/config/version-2.0/migration.md | Migration plan outlining parsing behavior, CLI tooling requirements, and phased rollout strategy |
| devdocs/config/version-2.0/obi-extension.schema.json | JSON Schema (Draft 2020-12) defining the complete v2 configuration structure for the extensions.obi section |
| devdocs/config/version-2.0/examples/default-configuration.yaml | Comprehensive example showing the default v2 configuration with inline comments and OTel integration |
| devdocs/config/version-2.0/verify.go | Go verification tool that validates feature parity between v1 defaults and v2 defaults through 94+ mapping checks |
| devdocs/config/version-2.0/validate_example.py | Python validation script that validates configuration files against both the OBI extension schema and OTel declarative schema |
| devdocs/config/version-2.0/.verify/dump_default_config.go | Utility to dump the current v1 default configuration for verification purposes |
| devdocs/config/version-2.0/.verify/default-config-current.yaml | Snapshot of the current v1 default configuration used as baseline for parity verification |

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Member

fstab commented Feb 26, 2026

Hi Tyler, thanks a lot for your awesome work!

Quick question: One requirement that we are frequently seeing at Grafana is configuration per service. For example, if tracing causes issues for a shopping cart service, you might want to disable it for that specific service but keep it for all other services. Another example: You may want specific HTTP route matchers for individual services.

Is per-service configuration possible with the new config? If not, it would be great to add that.

Contributor Author

MrAlias commented Feb 26, 2026

One requirement that we are frequently seeing at Grafana is configuration per service.

@fstab yes, that should be possible. As I have designed the configuration, something like this would address the user concern:

extensions:
  obi:
    version: "2.0"
    selection:
      policy:
        # Exclude all services not matched
        default_action: exclude
      rules:
        - action: include
          match:
            process:
              exe_path_glob:
                - "/path/to/my_service"
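
As a rough illustration of how a selection block like the one above could be evaluated, the sketch below walks the rules in order and falls back to `default_action` when nothing matches. First-match-wins ordering is an assumption here, and the `rule` and `selection` types are hypothetical stand-ins rather than OBI's actual implementation. Glob matching uses Go's `path.Match`, whose `*` does not cross `/` separators, which may differ from the glob dialect OBI uses.

```go
package main

import (
	"fmt"
	"path"
)

// rule is a hypothetical in-memory form of one entry under selection.rules.
type rule struct {
	Action       string   // "include" or "exclude"
	ExePathGlobs []string // corresponds to match.process.exe_path_glob
}

// selection is a hypothetical in-memory form of the selection block.
type selection struct {
	DefaultAction string // corresponds to policy.default_action
	Rules         []rule
}

// decide returns the action of the first rule whose glob matches exePath,
// or the default action when no rule matches.
func (s selection) decide(exePath string) (string, error) {
	for _, r := range s.Rules {
		for _, g := range r.ExePathGlobs {
			matched, err := path.Match(g, exePath)
			if err != nil {
				return "", fmt.Errorf("invalid glob %q: %w", g, err)
			}
			if matched {
				return r.Action, nil
			}
		}
	}
	return s.DefaultAction, nil
}

func main() {
	sel := selection{
		DefaultAction: "exclude",
		Rules: []rule{
			{Action: "include", ExePathGlobs: []string{"/path/to/my_service"}},
		},
	}
	for _, exe := range []string{"/path/to/my_service", "/usr/bin/other"} {
		action, err := sel.decide(exe)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s -> %s\n", exe, action)
	}
}
```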

Contributor Author

MrAlias commented Feb 26, 2026

Another example: You may want specific HTTP route matchers for individual services.

This is an interesting idea. I do not think it is possible in the current configuration, nor is it explicitly covered by what I have specified here. I can take a look at the HTTP routes section a bit more and see about per-service matching. 👍

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Contributor

@grcevski grcevski left a comment

This is fantastic! I really like the structure and the focus on what the end user wants to achieve.

I just have a comment on providing a way to override defaults for a selection -> rule. Given we run in a "daemonset" mode and instrument many services, more and more so, we've had the request to make certain settings be configurable per selection criteria.

It would be really nice if we could provide a way to override certain global settings.

# Ignore very new processes until they are old enough to evaluate.
min_process_age: 5s
# Ordered include/exclude rules.
rules:

Contributor

I think we need to allow defining overrides of the defaults per selection criterion. Initially, when we built the config, all selection was global, but over time we've needed to override settings for specific rules of the selection criteria.

One example is overriding what telemetry is exported for a rule, e.g. maybe globally we want just metrics, but for this subset we want traces. We also allow folks to override the sampler config for a given rule, maybe some namespace is not very important, so it's OK to sample with 1%.

Right now we have these:

	// Configures what to export. Allowed values are 'metrics', 'traces',
	// or an empty array (disabled). An unspecified value (nil) will use the
	// default configuration value
	ExportModes ExportModes `yaml:"exports"`

	SamplerConfig *SamplerConfig `yaml:"sampler"`

	Routes *CustomRoutesConfig `yaml:"routes"`

	// Metrics configuration that is custom for this service match
	Metrics perapp.SvcMetricsConfig `yaml:"metrics" env:"-"`
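
For illustration, the override semantics described above (an unset value inherits the global default, an empty list disables export) could be resolved roughly as follows. This is a simplified sketch: the `settings` and `override` types and the `SampleRatio` field are hypothetical and much smaller than the real per-rule fields quoted above.

```go
package main

import "fmt"

// settings is a hypothetical, reduced view of the global defaults.
type settings struct {
	Exports     []string // e.g. "metrics", "traces"
	SampleRatio float64  // fraction of traces kept, 1.0 keeps everything
}

// override is a hypothetical per-rule override.
// nil fields inherit the global value; an empty, non-nil Exports disables export.
type override struct {
	Exports     []string
	SampleRatio *float64
}

// resolve returns the effective settings for one selection rule.
func resolve(global settings, o override) settings {
	out := settings{
		// Copy the slice so callers cannot mutate the global defaults.
		Exports:     append([]string(nil), global.Exports...),
		SampleRatio: global.SampleRatio,
	}
	if o.Exports != nil {
		out.Exports = append([]string{}, o.Exports...)
	}
	if o.SampleRatio != nil {
		out.SampleRatio = *o.SampleRatio
	}
	return out
}

func main() {
	global := settings{Exports: []string{"metrics"}, SampleRatio: 1.0}

	onePercent := 0.01
	lowPriority := override{Exports: []string{"traces"}, SampleRatio: &onePercent}

	fmt.Printf("no override:  %+v\n", resolve(global, override{}))
	fmt.Printf("low priority: %+v\n", resolve(global, lowPriority))
}
```

Using a pointer and a nil-versus-empty slice keeps "not set" distinguishable from "explicitly set to zero or empty", which is the distinction the existing `exports` comment relies on.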

Contributor Author

You’re right: v2 has scoped filtering controls, but it does not yet have explicit per-selection-rule overrides equivalent to legacy exports/sampler/routes/metrics. I'm looking into ways to address this.

Contributor Author

I've updated the design with details on how we can support custom sampling while still using the OTel declarative configuration semantics for the tracing pipeline as the single definition. Given the sampler there is extensible, we can provide our own sampler with matching and tuning configuration.

I'm still looking at the refinement/override of instrumentation per selection.

Contributor

@mariomac mariomac left a comment

Amazing! I'd like to drop a few minor comments.

service_name: ""
service_namespace: ""
metrics:
features: 8

Contributor

The global "features" section was deprecated in favor of each entry of "discovery > instrument".

Contributor

Also the auto-generation seems to have failed because the export.Features type is internally an integer, but we actually override the UnmarshalYAML method to accept a list of strings.

The real default value should be: ["application"]
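
The mismatch described above can be sketched as follows: a bitmask type that is written in config as a list of names will dump as a bare integer unless the dump tool converts it back. The type, constant names, bit positions, and the `network` name below are hypothetical and do not correspond to OBI's real `export.Features` values; only the `application` name comes from this thread. In real code the two helpers would sit behind the YAML unmarshal and marshal hooks.

```go
package main

import "fmt"

// features is a hypothetical bitmask, standing in for an integer-backed
// feature set that users write as a list of strings.
type features uint

const (
	featureNetwork     features = 1 << iota // placeholder feature
	featureApplication                      // name taken from the comment above
)

// featureNames maps config names to bits, in a stable order.
var featureNames = []struct {
	name string
	bit  features
}{
	{"network", featureNetwork},
	{"application", featureApplication},
}

// parseFeatures converts a list of names into a bitmask,
// which is what a custom unmarshal hook would do.
func parseFeatures(names []string) (features, error) {
	var f features
	for _, n := range names {
		found := false
		for _, fn := range featureNames {
			if fn.name == n {
				f |= fn.bit
				found = true
				break
			}
		}
		if !found {
			return 0, fmt.Errorf("unknown feature %q", n)
		}
	}
	return f, nil
}

// names converts the bitmask back into its list form,
// which is what a default-config dump would need to emit.
func (f features) names() []string {
	out := []string{}
	for _, fn := range featureNames {
		if f&fn.bit != 0 {
			out = append(out, fn.name)
		}
	}
	return out
}

func main() {
	f, err := parseFeatures([]string{"application"})
	if err != nil {
		panic(err)
	}
	// Printing the raw integer loses the meaning; names() restores it.
	fmt.Println(uint(f), f.names())
}
```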

Comment on lines +269 to +271
services: []
exclude_services: []
default_exclude_services:

Contributor

services, exclude_services and default_exclude_services (regexp-based) were kept for backwards compatibility, but they were deprecated in favor of instrument, exclude_instrument and default_exclude_instrument (glob-based).

Maybe we can take the opportunity to remove them.

enrich:
# Runtime configuration for metadata enrichers.
enrichers:
kubernetes:

Member

If this is run as a collector receiver, it will duplicate the cache and the load on the k8s API alongside https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/k8sattributesprocessor, which does the same enrichment.

from: kubernetes
description: Default Kubernetes label mapping into canonical service identity.
map:
service.name:

Member

k8sattributes does this as well, following the OTel semantic conventions for sourcing service.name as prescribed here

outgoing: outgoing
incoming: incoming
# Attribute enrichment/selection controls.
attributes:

Member

Should be done by collector processors

7 participants