Project Proposal: Ecosystem Explorer #3000

jaydeluca · 2025-09-19T10:11:57Z

This PR contains a project proposal for a new standalone "ecosystem explorer" documentation website.

POC can be seen here: https://jaydeluca.github.io/instrumentation-explorer/

Note: This project proposal is dependent on identifying collaborators for the staffing bit. We will use this project proposal doc to socialize the effort in hopes of finding people interested in participating.

svrnm

Thanks for leading this @jaydeluca -- this also replaces #2246

projects/instrumentation-documentation.md

mx-psi

Is the Collector intentionally out of scope? From the discussions I have had with @svrnm it seems like this should be similar enough (instead of libraries we would be talking about components), so I feel like it would be a good idea to include it as well

jaydeluca · 2025-09-19T12:34:33Z

> Is the Collector intentionally out of scope? From the discussions I have had with @svrnm it seems like this should be similar enough (instead of libraries we would be talking about components), so I feel like it would be a good idea to include it as well

@mx-psi yes and no. It's not entirely left out of the scope (see here): I have added evaluating the feasibility of this approach for both the Collector and JavaScript as within scope, but I personally cannot commit to doing all the legwork for those as well. If we were to commit to a more concrete deliverable for those two projects, I think we would need a larger team.

Do you think the language I used around this isn't clear, or do you think it should be changed?

mx-psi · 2025-09-19T12:38:35Z

> > Is the Collector intentionally out of scope? From the discussions I have had with @svrnm it seems like this should be similar enough (instead of libraries we would be talking about components), so I feel like it would be a good idea to include it as well
>
> @mx-psi yes and no. It's not entirely left out of the scope (see here), I have added evaluating the feasibility of this approach for both the collector and javascript as within scope, but I personally cannot commit to doing all the legwork for those as well. If we were to commit to a more concrete deliverable for those two projects, I think we will need a larger team.
>
> Do you think the language I used around this isn't clear, or do you think it should be changed?

I guess my concern is with the naming. Something like "Ecosystem Documentation" or "Ecosystem Explorer" would make people think that the Collector is (potentially) included; the current naming seems more focused on language libraries.

jaydeluca · 2025-09-19T12:39:50Z

> I guess my concern is with the naming. Something like "Ecosystem Documentation" or "Ecosystem Explorer" would make people think that the Collector is (potentially) included; the current naming seems more focused on language libraries.

Ah yes, that makes sense, I can update. Thanks @mx-psi

svrnm · 2025-09-19T14:26:38Z

@jaydeluca can you add a section about the existing registry and how this project and the registry relate to each other? For me it would be totally fine to say that this is going to replace the registry eventually.

thompson-tomo · 2025-10-01T03:14:12Z

So I have come here from the comment in the backstage channel.

Thinking about this, what are the thoughts on building a Backstage plugin to add support for OpenTelemetry, in a similar fashion to the one for APIs (https://github.com/backstage/backstage/blob/master/plugins/api-docs/README.md)? We could then use Backstage rather than building a tool for the ecosystem explorer.

I would see the development steps as being:

1. Introduce a telemetry section to list the signals and show the definitions in Backstage.
2. Allow components (libraries) to define which signals they produce and link them to the definitions, just like they can for APIs.
3. Provide a way to show technology-specific docs, e.g. for oracledb, which provide descriptive info and the corresponding signals.

I also think it would be beneficial to ensure that weaver can be used to generate the file to add to the ecosystem explorer, so that a user has one tool to use.

A nice thing I foresee is that we could potentially focus the semconv specification section of the website on defining the base signals, while the informative info is captured in the ecosystem explorer.

jaydeluca · 2025-10-01T19:08:52Z

> Thinking about this what are the thoughts about building a backstage plugin to add support for open telemetry in a similar fashion to api's (https://github.com/backstage/backstage/blob/master/plugins/api-docs/README.md). We could then use backstage rather than building a tool for the ecosystem explorer.

Thanks for raising this @thompson-tomo, it is something we also considered and explored a bit.

I think the main concern is the tradeoff in complexity. Backstage brings a lot of functionality, but it also requires provisioning infrastructure, ongoing upgrades, and database maintenance. In my experience, it often needs a dedicated team to keep it running smoothly. For our use case, much of that extra functionality isn't really essential, and could actually slow us down.

The initial POC focused on a static-site approach which is much simpler to operate, has almost no ongoing maintenance burden, and still meets our immediate needs. Personally, I'd much prefer a solution that requires as little operational overhead as possible.

trask · 2025-10-01T19:11:49Z

+1 on optimizing for low operational overhead since that's an area where we struggle as an Open Source project

thompson-tomo · 2025-10-06T05:26:18Z

I am all for a lower operational overhead, but I do get worried when I see suggestions to build something, as that can create tech debt, especially when it is not a key part of the organisation's business/objective. Hence the suggestion to use an established product.

With the objective of low operational effort and static site generation in mind, we should be exploring how weaver can contribute to the solution.

What I could foresee is:

1. The user defines a weaver registry file which imports signals from the semconv registry.
2. The user refines the imported signals based on what they are implementing.
3. The user either defines a separate implementation metadata file or adds the metadata to the registry file (TBD).
4. Weaver codegen runs, i.e. `weaver implementation generate`, for the refined signals and metadata. This would produce boilerplate code which is used in the implementation and would export a YAML definition file (resolved schema).
5. The exported YAML could be added to the ecosystem, followed by `weaver implementation describe` to emit the static content.

All of that would be reusable, especially if the repo/package readme could also be generated in the same manner.
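The flow described above can be sketched roughly as follows. This is only an illustration of the data flow, not weaver itself: the registry contents, the `ImplementationDefinition` shape, and every function name here are hypothetical, and JSON stands in for the YAML export so that the sketch needs nothing beyond the Python standard library.

```python
import json
from dataclasses import dataclass, field, asdict

# Hypothetical stand-in for signals published in a shared semconv registry.
SEMCONV_REGISTRY = {
    "example.client.duration": {
        "kind": "metric",
        "attributes": ["example.system", "example.operation", "server.address"],
    },
    "example.client.rows": {
        "kind": "metric",
        "attributes": ["example.system"],
    },
}


@dataclass
class ImplementationDefinition:
    """Resolved description of what one instrumentation emits (assumed shape)."""
    name: str
    metadata: dict = field(default_factory=dict)
    signals: dict = field(default_factory=dict)


def import_signals(registry, names):
    """Copy the named signals out of the shared registry."""
    return {
        n: {"kind": registry[n]["kind"], "attributes": list(registry[n]["attributes"])}
        for n in names
    }


def refine(signals, emitted_attributes):
    """Keep only the attributes this implementation actually emits."""
    return {
        name: {**sig, "attributes": [a for a in sig["attributes"] if a in emitted_attributes]}
        for name, sig in signals.items()
    }


def export_definition(definition):
    """Serialize the resolved definition for a static site to consume."""
    return json.dumps(asdict(definition), indent=2, sort_keys=True)


imported = import_signals(SEMCONV_REGISTRY, ["example.client.duration"])
refined = refine(imported, {"example.system", "server.address"})
definition = ImplementationDefinition(
    name="example-instrumentation",
    metadata={"library": "example-db-driver"},  # implementation metadata
    signals=refined,
)
resolved = export_definition(definition)
```

The exported document is the only thing a static site would need to read, which is what would keep the explorer decoupled from how each repository produces its definitions.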

jaydeluca · 2025-10-06T10:17:57Z

@thompson-tomo who is a "user" in your example? This project is specifically aimed at creating documentation for Java instrumentation (and possibly the Collector and other languages as a stretch goal). There are over 250 instrumentations in the Java project, and at this time none of them use weaver to define their attributes. I have experimented with introducing weaver to the project in order to generate metrics/attributes for an instrumentation, but as of this time, there are no existing plans by the SIG to rewrite everything in that repo to use weaver (that I am aware of).

I am all for integrating weaver where it makes sense, but if we are saying we need to convert that repo to use weaver before we move ahead with this project, I fear it would be a very long time before we could deliver anything useful. There's also a significant number of things involved here that go beyond just documenting the attributes that I don't think weaver helps with.

Are you perhaps suggesting a pattern for non-official-opentelemetry instrumentations to be included in the ecosystem, as opposed to thinking about the existing OpenTelemetry instrumentation? Or do you think every language will soon refactor all OpenTelemetry instrumentations to use weaver?

As of now, I was thinking of including an integration with weaver's live-check to be able to provide a "score card" for each instrumentation, which indicates whether the signals it emits are semconv compliant. If there are other use cases I would love to explore them, but I don't think the example flow you lay out applies here (or at least I'm having a hard time understanding how it applies).
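To make the "score card" idea concrete, here is a minimal sketch. It does not call weaver or its live-check; it only assumes that some check has already produced the set of attribute names an instrumentation emitted, and compares that set against a known-good list. `ScoreCard`, `build_score_card`, and the sample attribute sets are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoreCard:
    """Per-instrumentation summary of attribute compliance (assumed shape)."""
    instrumentation: str
    compliant: tuple
    non_compliant: tuple

    @property
    def score(self) -> float:
        """Fraction of emitted attributes found in the known set."""
        total = len(self.compliant) + len(self.non_compliant)
        return 1.0 if total == 0 else len(self.compliant) / total


def build_score_card(instrumentation, emitted, known):
    """Split emitted attribute names into compliant and non-compliant groups."""
    compliant = tuple(sorted(a for a in emitted if a in known))
    non_compliant = tuple(sorted(a for a in emitted if a not in known))
    return ScoreCard(instrumentation, compliant, non_compliant)


# Illustrative sample only, not a complete semconv attribute list.
KNOWN_ATTRIBUTES = frozenset({"http.request.method", "server.address", "url.full"})

card = build_score_card(
    "example-http-client",
    {"http.request.method", "server.address", "http.custom_flag"},
    KNOWN_ATTRIBUTES,
)
```

A static site could render `card.score` as a badge and list `card.non_compliant` as the attributes that need attention.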

> I do get worried when I see suggestions to build something as that can create tech debt especially when it is not a key part of the organisation business/objective.

Could you elaborate on which organization business/objectives you are referring to? Is "OpenTelemetry" the "business" in this sense? Or is it CNCF? Are these objectives documented somewhere? Sorry if this is a stupid question, this is my first time proposing a project and it seems like there might be some additional context I need?

thompson-tomo · 2025-10-06T11:52:52Z

The User would be the instrumentation author.

> at this time none of them use weaver to define their attributes

I don't actually think that is the case, given that a lot of them would be following semantic conventions and using the semantic conventions package, which is/should be generated by weaver.

> there are no existing plans by the SIG to rewrite everything in that repo to use weaver

That is fine. Rewriting is an optional step and could be done incrementally, i.e. switching to generated attributes rather than the package first. It is also perfectly OK to just have the registry file and not generate any code.

> Are you perhaps suggesting a pattern for non-official-opentelemetry instrumentations to be included in the ecosystem

It would be a process for both official and third party instrumentations to follow.

I agree weaver doesn't do everything needed, but what it does do is enable you to design/document your telemetry. This is where I see it playing a key role.

Put it this way: I see the output of weaver as the telemetry version of an OpenAPI/AsyncAPI document. This output can either be used to generate a human-readable file, e.g. Markdown using weaver, or it can be used directly in a web portal.

For me, the more tools an author needs to use, the more barriers we create, especially when the tools are not integrated and data needs to be duplicated.

To summarise, I feel it is beneficial to use weaver to document/describe the signals as well as the instrumentation emitting them, as they should share significant amounts of data. This way the ecosystem explorer can focus on presenting the emitted definitions and making them discoverable.

projects/ecosystem-explorer.md

Co-authored-by: Vitor Vasconcellos <[email protected]>

jack-berg · 2025-11-10T15:29:30Z

When I read this proposal, I see this project as an attempt to build a new registry (registry 2.0). I'm very supportive of this as I've been complaining about various deficiencies of the registry for some time.

Significantly, the proposal is to launch the new effort in parallel to the existing registry, allowing the contributors to iterate quickly and avoid the friction of a bunch of up front data model design work and consensus gathering.

Later, once we've worked out the sharp edges, we can merge registry 1.0 into registry 2.0, and EOL registry 1.0.

Some thoughts:

- The registry already has a data model for capturing metadata about components (schema here), but it's insufficient. As you note, there's all sorts of data which is important to capture and visualize for end users and which, as of now, is very difficult for users to discover. The key examples are: configuration schema, schema of telemetry emitted, and instructions on how to get started.
- The registry already has tools for searching for components and displaying bits of information about them, but they're insufficient. More than anything else, the registry is an index of links, requiring the user to follow the link and find installation / configuration / telemetry schema details in the linked website / repository. Except this information is not always available, and when it is, there's no standardization of the display. A richer registry schema would allow a richer / more useful visualization experience.
- The registry already has tooling for scanning project repositories and creating / updating registry component entries. The problem is that it's all centralized in the opentelemetry.io project and standardized, meaning that it's only possible to scrape the types of metadata which can be scraped from a central script. In his prototype, @jaydeluca has built tooling which leverages domain knowledge specific to the opentelemetry-java-instrumentation project, and is therefore able to generate much richer metadata than is captured by the registry today. Any effort for a better registry is probably going to require some coordination between the repos where components live (i.e. SDK, instrumentation, contrib, collector, etc.) and the opentelemetry.io repo. Some shared tooling can / should be centralized in opentelemetry.io, but the repos should also work to do a better job of publishing the types of metadata that can't be generated through simple scanner scripts.
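As a rough illustration of the gap described in these points, the sketch below models a registry entry that carries the three kinds of detail called out above (configuration schema, telemetry schema, getting-started instructions) and reports which ones a given entry lacks. The `ComponentEntry` shape and the `missing_details` helper are assumptions made for this example; they do not reflect the existing registry schema or the prototype's data model.

```python
from dataclasses import dataclass, field


@dataclass
class ComponentEntry:
    """Hypothetical richer registry entry; field names are illustrative."""
    name: str
    repository_url: str
    configuration_schema: dict = field(default_factory=dict)
    telemetry_schema: dict = field(default_factory=dict)
    getting_started: str = ""


def missing_details(entry):
    """Return the names of the rich fields that are still empty for an entry."""
    missing = []
    if not entry.configuration_schema:
        missing.append("configuration_schema")
    if not entry.telemetry_schema:
        missing.append("telemetry_schema")
    if not entry.getting_started.strip():
        missing.append("getting_started")
    return missing


# An entry that is only a link, which is roughly what an index of links provides.
link_only = ComponentEntry(
    name="example-component",
    repository_url="https://example.invalid/example-component",
)

# An entry enriched by tooling that lives next to the component's source.
enriched = ComponentEntry(
    name="example-component",
    repository_url="https://example.invalid/example-component",
    configuration_schema={"example.enabled": {"type": "boolean", "default": True}},
    telemetry_schema={"spans": ["example.operation"]},
    getting_started="Add the example dependency and enable the component.",
)
```

A report built from `missing_details` would show, per repository, which entries a central scanner could not fill in and where repo-local tooling would have to publish the data.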

jaydeluca added 2 commits September 18, 2025 20:59

first draft

d5e9885

updates

23339bc

svrnm requested changes Sep 19, 2025

svrnm mentioned this pull request Sep 19, 2025

Project proposal: Registry Data Quality Improvements #2246

Closed

mx-psi reviewed Sep 19, 2025

review notes

dcc13d4

instrumentation explorer -> ecosystem explorere

0f2e546

jaydeluca changed the title ~~Project Proposal: Instrumentation Documentation~~ Project Proposal: Ecosystem Explorer Sep 19, 2025

fix bad links

6b930d7

jaydeluca marked this pull request as ready for review September 19, 2025 14:31

jaydeluca requested review from a team, alolita, austinlparker, danielgblanco, jpkrohling, mtwo, tedsuo and trask as code owners September 19, 2025 14:31

add notes about replacing the existing registry

38235a6

svrnm added the area/project-proposal Submitting a filled out project template label Oct 1, 2025

trask mentioned this pull request Oct 3, 2025

Define/clarify ownership of go.opentelemetry.io #3049

Open

jaydeluca mentioned this pull request Oct 3, 2025

Instrumentation Metadata System open-telemetry/opentelemetry-java-instrumentation#13468

Closed

24 tasks

updates on layers of the project and responsibilities

0307f01

mx-psi reviewed Oct 29, 2025

mx-psi mentioned this pull request Oct 29, 2025

Ensure end user-focused component documentation is available in opentelemetry.io open-telemetry/opentelemetry-collector#14096

Open

update staffing

8909a1c

vitorvasc reviewed Oct 30, 2025

Update projects/ecosystem-explorer.md

b5c6615

Co-authored-by: Vitor Vasconcellos <[email protected]>

pellared mentioned this pull request Nov 3, 2025

SIG meeting notes open-telemetry/opentelemetry-go#6648

Open

mx-psi mentioned this pull request Nov 5, 2025

Changing requirements for contributing new components open-telemetry/opentelemetry-collector-contrib#44057

Open

clarify weaver should be used for metadata and metadata is out of scope

b7d6ed1

jaydeluca requested a review from maryliag as a code owner November 6, 2025 20:51

jaydeluca added 2 commits November 6, 2025 15:53

clarify no additional SIG

2a49e65

spell check

cc8a428
