Skip to content

Conversation

@sreekanth-db
Copy link
Contributor

@sreekanth-db sreekanth-db commented Oct 30, 2025

This PR implements Phase 1 of telemetry tag definitions (telemetry-activity-based-design.md) for the C# Databricks ADBC driver, establishing structured telemetry collection with privacy-aware export controls.

Changes

Introduces a declarative tag definition system across three telemetry event types:

1. Connection Events - Captures connection lifecycle and driver configuration

  • Identity: workspace ID, session ID
  • Driver metadata: version, OS, .NET runtime
  • Feature flags: CloudFetch, LZ4, direct results, multiple catalog, trace propagation
  • Local-only: server addresses

2. Statement Execution Events - Tracks query performance and resource usage

  • Result metrics: format, chunk count, bytes downloaded, row count
  • Polling metrics: poll count, latency
  • Operation metadata: operation type (SELECT, INSERT, etc.)
  • Local-only: SQL text, catalog/schema names

3. Error Events - Records error patterns and retry behavior

  • Classification: error type, HTTP status codes, SQL states
  • Retry context: retry count, operation context
  • Local-only: error messages, stack traces

Export Scope Design

Tags use TagExportScope enum to control export destinations:

  • ExportLocal - Sensitive data only (SQL, errors, addresses)
  • ExportDatabricks - Only to databricks service
  • ExportAll - Non-sensitive operational metrics (both local & Databricks)

TelemetryTagRegistry provides centralized privacy filtering via GetDatabricksExportTags() and ShouldExportToDatabricks().

Design Note: Added TODO comment acknowledging the current approach duplicates export scope between attributes and whitelist methods.

Testing

  • ✅ All 7 unit tests pass (TelemetryTagRegistryTests)
  • ✅ Privacy filters validated (sensitive tags excluded from Databricks export)
  • ✅ Unknown event types handled gracefully

Next Steps

Future phases will add ActivityListener, metrics aggregator, and exporter components.

@github-actions github-actions bot added this to the ADBC Libraries 21 milestone Oct 30, 2025
@jadewang-db
Copy link
Contributor

please make sure to have good PR descriptions

public TelemetryTagAttribute(string tagName)
{
TagName = tagName ?? throw new ArgumentNullException(nameof(tagName));
ExportScope = TagExportScope.ExportAll;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is this used? Why hardcode to ExportAll?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

defaulted to export local as discussed offline

public const string DriverRuntime = "driver.runtime";

// Feature Flags
[TelemetryTag("feature.cloudfetch", ExportScope = TagExportScope.ExportDatabricks, Description = "CloudFetch enabled")]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these only for Export to Databricks? Not local?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated to export all

/// <summary>
/// Gets tags allowed for Databricks export (privacy whitelist).
/// </summary>
// TODO: Explore alternate approaches to avoid maintaining separate GetDatabricksExportTags methods in each event class.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a todo comment to explore alternate approaches to GetDatabricksExportTags method (other than reflection), or changing the flow based on how we use this method (in the next phases)

Copy link
Contributor

@CurtHagenlocher CurtHagenlocher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I have some questions and recommendations for changes.


namespace Apache.Arrow.Adbc.Drivers.Databricks.Telemetry.TagDefinitions
{
public static class TelemetryTagRegistry
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
public static class TelemetryTagRegistry
internal static class TelemetryTagRegistry

This is only used from inside the driver, right?

/// <summary>
/// Checks if a tag should be exported to Databricks.
/// </summary>
public static bool ShouldExportToDatabricks(TelemetryEventType eventType, string tagName)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While it's difficult to get a sense for how this code will be used before it's integrated with the rest of the code, it looks like this is probably the primary entry point to the list of tags. If I say something like ShouldExportToDatabricks(TelemetryEventType.Error, "tag1") || ShouldExportToDatabricks(TelemetryEventType.Error, "tag2") then the current code will end up instantiating the same HashSet<string> twice -- once per call to ShouldExportToDatabricks. If this is only called fairly rarely, then maybe that's not a big deal. But if this is going to be called fairly frequently then it probably makes more sense to cache the HashSet<string> as a static inside each class rather than instantiating a new one each time.

/// Defines the types of telemetry events that can be emitted by the driver.
/// Each event type has its own set of allowed tags defined in corresponding *Event classes.
/// </summary>
public enum TelemetryEventType
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
public enum TelemetryEventType
public enum TelemetryEventType

This is only used from inside the driver, right? If it's not already that way, we can use InternalsVisibleTo to make it available to tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants