Condition publication on mirroring status #103

@rgaudin

With the current setup, the publication of new ZIM files happens in this order:

  1. File is uploaded from worker to master mirror (as a tmp file that is then renamed).
    At this moment, anyone with the path can download it from mirrorbrain or master mirror URL (MB forwards to master if not known).
  2. At a random time, library-gen scans the files and rebuilds library.xml.
    At this moment, the file is in the catalog for everybody.
  3. At a random time, mirrorbrain scans the folder. On discovering new files, it has to compute many hashes, which can add up to minutes for a large file on a busy server.
    Once complete, .md5, .torrent and other endpoints are available.
  4. At a random time, each mirror discovers the new file via rsync and syncs it.
    Once complete, the file is downloadable using the mirror URL.
  5. At a random time, mirrorbrain scans the mirrors and discovers that the file is present on the mirror.
    Only then can it forward clients to that mirror for that file.
    Keep in mind that a mirror can have geo-restrictions, so it may only serve some countries.

That's the journey of a ZIM file. Problems can occur along the way, for instance:

  • After step 2, the file is visible, so users can attempt to download it. It is only on the master mirror at that point, so every user is served by the master mirror.
  • After step 2, other tools like imager-service see the content and use it (it's most likely an update of some existing content). Because the hashes don't exist yet, building the image fails.
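
To make the problem window explicit, here is a minimal sketch that models the five steps as ordered stages. The `Stage` names and the helper functions are hypothetical and not part of any existing tool. The sketch also assumes the steps complete strictly in the listed order, which is a simplification since each one is triggered at a random time.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Hypothetical names for the five steps listed in this issue."""
    UPLOADED = 1      # step 1: file is on the master mirror
    IN_CATALOG = 2    # step 2: library-gen has listed it in library.xml
    HASHED = 3        # step 3: mirrorbrain has computed hashes (.md5, .torrent)
    ON_MIRROR = 4     # step 4: at least one mirror has synced the file
    MIRROR_KNOWN = 5  # step 5: mirrorbrain knows a mirror has the file


def visible_to_users(stage: Stage) -> bool:
    # The file is advertised to everybody once it is in the catalog.
    return stage >= Stage.IN_CATALOG


def hashes_available(stage: Stage) -> bool:
    return stage >= Stage.HASHED


def offloaded_from_master(stage: Stage) -> bool:
    # Clients can only be forwarded to a mirror after step 5.
    return stage >= Stage.MIRROR_KNOWN


def risky_stages() -> list[Stage]:
    """Stages where the file is advertised but hashes or mirrors are missing."""
    return [
        s for s in Stage
        if visible_to_users(s)
        and not (hashes_available(s) and offloaded_from_master(s))
    ]
```

Under these assumptions, `risky_stages()` covers everything from step 2 up to (but not including) step 5, which is the window in which the two problems above can occur.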

I think we want a condition so that files are only published after step 3.
We should also be able to configure a mirroring threshold for worldwide availability.

This obviously requires the load-balancer to share its state and config in a machine-readable format.
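
As a rough illustration of what such a condition could look like, the sketch below gates publication on a per-file state document. Everything in it is an assumption: the JSON shape, the `hashes_ready` and `mirrors` keys, the file names, and the `PublicationPolicy` class are invented for the example and do not describe what mirrorbrain or any other load-balancer exposes today. The mirroring threshold is reduced to a plain mirror count and ignores geo-restrictions.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class PublicationPolicy:
    """Hypothetical, configurable publication conditions."""
    require_hashes: bool = True  # i.e. wait for step 3
    min_mirrors: int = 0         # mirroring threshold (simplified to a count)


def can_publish(file_state: dict, policy: PublicationPolicy) -> bool:
    """Return True if a single file satisfies the policy."""
    if policy.require_hashes and not file_state.get("hashes_ready", False):
        return False
    return len(file_state.get("mirrors", [])) >= policy.min_mirrors


# Hypothetical machine-readable state, as a load-balancer might export it.
SAMPLE_STATE = json.loads("""
{
  "example_2024-01.zim": {"hashes_ready": true, "mirrors": ["mirror-a", "mirror-b"]},
  "other_2024-01.zim": {"hashes_ready": false, "mirrors": []}
}
""")


def publishable(state: dict, policy: PublicationPolicy) -> list[str]:
    """List the files that may be added to the catalog under the policy."""
    return sorted(name for name, fs in state.items() if can_publish(fs, policy))
```

With the default policy, only the file whose hashes are ready would be published. Raising `min_mirrors` would hold it back until enough mirrors report having it.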
