Core Principle

OpenTelemetry Collector feature gates carry a Stage field that conventionally signals maturity progression toward default-on behavior (alpha → beta → stable), with deprecated marking a gate on its way to removal. But some gates are designed as permanent switches, intentionally held at alpha forever, and reading such a gate’s status as “not yet mature” is incorrect. Gate status must be interpreted in light of the gate’s intent, not just its stage label.

The transitional stages describe a pre-GA readiness trajectory: a gate moves alpha → beta once the feature is stable enough to be enabled by default, then stable once the behavior is settled and the gate can no longer be disabled. The permanent pattern is different: the gate exists as a perpetual opt-in control, typically for security or data-integrity reasons, and its alpha status means “you must explicitly consent to this behavior” rather than “this isn’t ready yet.”

Why This Matters

  • Seeing an “alpha feature gate that’s 3+ years old” invites the wrong mental model: “unstable, avoid, wait for graduation.” The correct model for some gates is: “this will never graduate and has long-standing stable semantics — the alpha label is the access control mechanism.”
  • Conflating the two leads to wrong planning. You write comments like “remove this override once it graduates to stable” and then discover years later that it’s not going anywhere.
  • Contributors repeatedly propose promotion for permanent gates, and maintainers repeatedly decline. The decline isn’t “not yet” — it’s “never, by design.” This is easy to miss if you only read the gate’s docs and not the original RFC/issue.

Evidence/Examples

filelog.allowFileDeletion — guards the delete_after_read option in the filelog receiver. Introduced in v0.70.0 (Jan 2023), stage defined in pkg/stanza/fileconsumer/metadata.yaml as stage: alpha with no to_version. Maintainer @djaglowski on the original issue #16314 in 2022:

“The worst case scenario is that a malicious actor is able to delete important files… putting this functionality behind a permanent feature gate… a malicious actor who is only able to reconfigure the collector (but not in control of the startup command) is unable to delete files.”

More than three years later it is still alpha, with no graduation in progress and no activity suggesting it will move. The alpha label functions as a security control: enabling the gate requires an explicit --feature-gates=filelog.allowFileDeletion flag at startup, which cannot be supplied by modifying the config file alone.
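Because the gate lives on the command line, one way to audit a deployment is to inspect the collector's startup arguments rather than its config file. The following is a minimal Python sketch of that check, not collector code. It assumes the --feature-gates value is a comma-separated list of gate IDs in which a leading "-" disables a gate and a bare or "+"-prefixed ID enables it, and that a later mention overrides an earlier one. The function name gate_enabled_by_argv and the example command line are hypothetical.

```python
from typing import Optional, Sequence


def gate_enabled_by_argv(argv: Sequence[str], gate_id: str) -> Optional[bool]:
    """Return True/False if argv explicitly enables/disables gate_id, else None.

    Assumption: --feature-gates takes a comma-separated list where a leading
    '-' disables a gate and a bare or '+'-prefixed ID enables it.
    """
    result: Optional[bool] = None
    args = list(argv)
    i = 0
    while i < len(args):
        arg = args[i]
        value = None
        if arg.startswith("--feature-gates="):
            value = arg.split("=", 1)[1]
        elif arg == "--feature-gates" and i + 1 < len(args):
            # Space-separated form: the value is the next argument.
            value = args[i + 1]
            i += 1
        if value is not None:
            for item in value.split(","):
                item = item.strip()
                if not item:
                    continue
                enabled = not item.startswith("-")
                if item.lstrip("+-") == gate_id:
                    result = enabled  # assumption: the last mention wins
        i += 1
    return result


# Hypothetical startup command for illustration.
cmd = [
    "otelcol-contrib",
    "--config=/etc/otelcol/config.yaml",
    "--feature-gates=filelog.allowFileDeletion",
]
print(gate_enabled_by_argv(cmd, "filelog.allowFileDeletion"))
```

A None result means the command line never mentions the gate, so the alpha default (disabled) applies and delete_after_read stays unavailable no matter what the config file says.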

Other candidates in the same category: security-sensitive or policy-sensitive toggles where the gate is the UX, not the readiness indicator. Worth scanning any long-lived alpha gate’s original RFC for language like “permanent,” “security,” “opt-in,” “defense-in-depth.”
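The scan for intent language can be roughed out in a few lines of Python. This is only a first-pass filter under a simple assumption: the issue or RFC text has already been copied into a string, and plain substring matching on the four terms listed above is good enough to decide which threads deserve a careful read. The name permanence_signals is hypothetical.

```python
from typing import List

# Terms taken from the note above; extend as needed.
INTENT_TERMS = ("permanent", "security", "opt-in", "defense-in-depth")


def permanence_signals(text: str) -> List[str]:
    """Return the intent terms that occur in text, case-insensitively."""
    lowered = text.lower()
    return [term for term in INTENT_TERMS if term in lowered]


# Excerpt from the maintainer comment quoted earlier in this note.
comment = "putting this functionality behind a permanent feature gate"
print(permanence_signals(comment))  # ['permanent']
```

A hit is a prompt to read the thread, not proof of permanence; a miss proves nothing either, since maintainers may phrase the intent differently.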

Implications

  • Read the originating issue, not just the docs, when an alpha gate is load-bearing in your config. Look for maintainer comments on the policy intent. Search for graduation discussion; absence is the signal that there’s no plan.
  • Document override intent in your own config. If you’re enabling a permanent alpha gate, write a comment explaining it’s permanent, not “until it graduates.” Prevents future drift in your own planning and onboarding.
  • Don’t tie the cleanup of your overrides to gate graduation when the gate is permanent. It won’t graduate, so your override is permanent too.
  • Security-relevant features are plausibly permanent alpha candidates. If a feature causes data deletion, exfiltration, or irreversible mutation, suspect the gate is a permanent access control until proven otherwise.
  • Related: OTel Metric Temporality - Delta vs Cumulative is another case where OTel behavior surprises you if you infer it from surface details. There, a temporality preference that is “set” does not lead to the backend cache invalidation you’d expect; here, a gate stage of “alpha” does not mean what you’d expect. Both reward going to primary sources (source files, RFCs, maintainer comments) over docs.

Questions

  • What other long-lived alpha gates in OTel Collector are permanent by design? A systematic audit of stage: alpha gates with from_version older than 18 months would likely reveal a handful.
  • Does OTel have a formal “permanent gate” convention, or is it pattern-only? The featuregate.Stage enum doesn’t distinguish — arguably it should, as a StagePermanent or similar.
  • Do other observability ecosystems (Grafana Agent, Vector, Fluent Bit) have analogous permanent-gate patterns, or do they handle irreversible-action opt-in differently?
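The audit suggested in the first question could start from something like the sketch below. It assumes the gate records have already been collected by hand (for example from metadata.yaml files) and that each from_version has been mapped to a release date manually; the GateRecord type, the function name long_lived_alpha, and the two "example." gates are hypothetical. Only the filelog.allowFileDeletion row reflects data from this note, and its day of month is a placeholder.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class GateRecord:
    gate_id: str
    stage: str
    introduced: date  # release date of from_version, looked up by hand


def long_lived_alpha(gates: List[GateRecord], today: date, min_months: int = 18) -> List[str]:
    """Return IDs of alpha gates introduced at least min_months before today."""
    flagged = []
    for g in gates:
        # Whole-month difference; day of month is ignored.
        age_months = (today.year - g.introduced.year) * 12 + (today.month - g.introduced.month)
        if g.stage == "alpha" and age_months >= min_months:
            flagged.append(g.gate_id)
    return flagged


gates = [
    GateRecord("filelog.allowFileDeletion", "alpha", date(2023, 1, 1)),  # v0.70.0, Jan 2023
    GateRecord("example.recentAlpha", "alpha", date(2025, 6, 1)),        # hypothetical
    GateRecord("example.graduated", "stable", date(2022, 3, 1)),         # hypothetical
]
print(long_lived_alpha(gates, today=date(2026, 1, 1)))  # ['filelog.allowFileDeletion']
```

Age alone only produces candidates. Each flagged gate still needs its originating issue read to tell a permanent switch from a feature that has simply stalled.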