Delete stream data from the remote tier when a stream is deleted #17

@the-mikedavis

Is your feature request related to a problem? Please describe.

Currently, stream deletion is not handled for the remote tier. There is a TODO in rabbitmq_stream_s3_log_manifest:delete/1 (the callback for osiris_log_manifest:delete/1). When a stream queue is deleted, its files in the remote tier should be deleted too.

Describe the solution you'd like

When deleting a stream, rabbit_stream_coordinator deletes individual members (writers and replicas) without any hint that it is deleting the entire cluster. We might be able to change the server and/or Osiris to expose that information. Alternatively, we could work around it by tracking membership ourselves: when the number of members drops to zero, we know the stream has been deleted. A third option is a periodic job, similar to CMR, that runs every now and then and compares the set of stream queues to what we have in the remote tier.
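To make the two workarounds concrete, here is a rough, language-neutral sketch in Python. It is not plugin code. The names `MemberTracker`, `member_added`, `member_deleted`, and `orphaned_streams` are hypothetical, and it assumes we can observe member additions and deletions and can list both the stream queues and the streams present in the remote tier.

```python
from typing import Dict, Iterable, Set


class MemberTracker:
    """Hypothetical helper: tracks known members per stream."""

    def __init__(self) -> None:
        self._members: Dict[str, Set[str]] = {}

    def member_added(self, stream: str, node: str) -> None:
        self._members.setdefault(stream, set()).add(node)

    def member_deleted(self, stream: str, node: str) -> bool:
        """Return True when the last known member of `stream` was removed."""
        nodes = self._members.get(stream)
        if nodes is None:
            return False  # unknown stream: nothing to conclude
        nodes.discard(node)
        if nodes:
            return False  # other members remain, so the stream still exists
        del self._members[stream]
        return True  # assumption: zero members means the stream was deleted


def orphaned_streams(
    live_streams: Iterable[str], remote_streams: Iterable[str]
) -> Set[str]:
    """Periodic check: streams with remote-tier data but no stream queue."""
    return set(remote_streams) - set(live_streams)
```

With the tracker, deleting the last member of a stream is the signal to clean up its remote data. The periodic check is a plain set difference and would also catch anything the tracker missed, for example after a restart that lost the in-memory membership state.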

Describe alternatives you've considered

The easiest alternative is to do nothing on deletion. S3 storage that is never accessed is dirt cheap, so keeping even terabytes of data in the long run is not noticeably expensive. Still, data that costs money and can never actually be deleted seems like a gap, so this is not a serious suggestion.
