feat(datasets): French translation upload and provenance export #555

Merged
SanjeevLakhwani merged 23 commits into master from feature/dataset-french-translation
Jun 25, 2026
@SanjeevLakhwani SanjeevLakhwani commented Jun 8, 2026

Contributor

Summary

Redmine ticket 2830
Requires: bento-platform/katsu#705

  • Adds a typed API layer (src/api/datasetTranslations.ts) for the katsu translation endpoints (GET/POST/PUT /api/datasets/{id}/translations/{lang}), with bearer token support via useAuthorizationHeader
  • Adds a French Translation modal on each dataset card that checks for an existing translation on open and presents a drag-and-drop JSON upload; validates locally with Zod, posts to the API, and displays DRF field-keyed errors inline
  • The translation button label dynamically reflects whether a translation exists ("Add French Translation" / "Edit French Translation") by probing the API on mount and after the modal closes
  • Adds an Export dropdown next to "View Provenance" with options for English (canonical, downloaded instantly from local state) and French (fetched then downloaded); the French option is hidden when no translation exists
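
As a rough illustration of what a typed layer over these endpoints could look like, here is a minimal sketch that only builds request descriptors. The names `translationUrl`, `buildTranslationRequest`, `AuthHeader` and the base URL are assumptions made for this example; they are not taken from src/api/datasetTranslations.ts, and the real module may be shaped quite differently.

```typescript
// Hypothetical sketch: every name and shape here is an assumption, not the PR's code.
type Lang = "fr";
type Method = "GET" | "POST" | "PUT";

// Assumed shape of what a hook such as useAuthorizationHeader might return.
interface AuthHeader {
  Authorization?: string;
}

interface TranslationRequest {
  url: string;
  method: Method;
  headers: Record<string, string>;
  body?: string;
}

// Builds {base}/api/datasets/{id}/translations/{lang}, trimming trailing slashes on the base.
function translationUrl(baseUrl: string, datasetId: string, lang: Lang): string {
  const base = baseUrl.replace(/\/+$/, "");
  return `${base}/api/datasets/${encodeURIComponent(datasetId)}/translations/${lang}`;
}

// Returns a plain request descriptor; the caller decides how to send it (e.g. with fetch).
function buildTranslationRequest(
  baseUrl: string,
  datasetId: string,
  lang: Lang,
  method: Method,
  auth: AuthHeader,
  payload?: unknown,
): TranslationRequest {
  const headers: Record<string, string> = {};
  if (auth.Authorization) {
    headers["Authorization"] = auth.Authorization;
  }
  const request: TranslationRequest = {
    url: translationUrl(baseUrl, datasetId, lang),
    method,
    headers,
  };
  // Only write methods carry a JSON body.
  if (method !== "GET" && payload !== undefined) {
    headers["Content-Type"] = "application/json";
    request.body = JSON.stringify(payload);
  }
  return request;
}
```

Keeping URL and header construction in one pure function makes it easy to unit-test the bearer-token handling without a running katsu instance.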

@SanjeevLakhwani SanjeevLakhwani requested a review from gsfk June 8, 2026 10:27
gsfk commented Jun 8, 2026

Member

I'll have to throw together some examples for testing, I guess. First comment: it unconditionally checks for translations on every dataset, which produces a lot of calls that return 404 and clogs my console.

gsfk commented Jun 8, 2026

Member

Also not sure what you mean by "private" mode in bento_web.

@gsfk gsfk left a comment

Member

I honestly find this extremely hard to work with:

  • Is there a better way to check for existing translations than checking if calls are 404? This litters my console and network tab with errors and makes development harder. It is particularly bad for multiple datasets.
  • There are now lots of buttons on the dataset card, with overlapping uses. Arguably the new export and translation buttons belong inside the Provenance modal ("Export" at the top level of a dataset arguably should mean "export data").
  • Validation does not work as promised. Most validation errors just produce the antd error message "JSON validation failed: Invalid input" with no more details, even in the browser console (instead your console is just 404 errors).
  • A small number of errors will produce the zod validation error message. But if you add to your JSON one of the many kinds of errors that produce only the antd message, you will lose your zod feedback (i.e. antd always dominates).
  • I also had one zod feedback message that appeared but then disappeared when the modal tried to recheck for an existing translation.
  • This PR continues the creep towards using the word "Provenance" to mean the entire dataset model, which I find confusing. I'm no longer even sure what provenance is supposed to be.
  • It ignores existing French translations of discovery fields, even though discovery fields are included in "Export".
  • When a translation exists, the button changes name to "Edit French Translation", but clicking it will sometimes trigger the check for an existing translation again.

…tions array

Use dataset.translations (already present on all dataset objects) instead of
querying the translations/fr endpoint on mount to check existence.

- Remove checkFrTranslation fetch + useEffect from Dataset.js
- Derive hasFrTranslation directly from value.translations
- Remove checking/fetchError states from DatasetTranslationModal
- isEdit now derived from dataset.translations prop, not async fetch
- Add onSave callback; dispatches fetchProjectsWithDatasets on success
  so translations array stays in sync after an upsert
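
A minimal sketch of deriving the flag from data already on the dataset object, rather than probing the endpoint. The commit does not show the shape of `translations`; this example assumes each entry is an object with a `language` code, and the helper names `hasTranslation` and `translationButtonLabel` are hypothetical.

```typescript
// Assumed shape: the real entries in dataset.translations may differ.
interface TranslationEntry {
  language: string;
}

interface DatasetLike {
  identifier: string;
  translations?: TranslationEntry[];
}

// True when the dataset already carries a translation for the given language.
// No network call is made, so nothing can 404.
function hasTranslation(dataset: DatasetLike, lang: string): boolean {
  return (dataset.translations ?? []).some((t) => t.language === lang);
}

// Button label as described in the PR summary.
function translationButtonLabel(dataset: DatasetLike): string {
  return hasTranslation(dataset, "fr") ? "Edit French Translation" : "Add French Translation";
}
```

Because the value is computed from props, the label updates whenever the store delivers a refreshed dataset.
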

Replace the separate "View Provenance" (read-only) and "Edit" buttons with
a single "Provenance" button that opens one modal. The modal starts read-only
and shows an "Edit" button in the footer (private mode only) to switch to
edit mode in-place, with full draft-saving and JSON import support.

- DatasetProvenanceModal absorbs all edit logic from DatasetFormModal
- Dataset.js: one "Provenance" button, remove "Edit" button and onEdit prop
- Project.js: remove onEditDataset prop threading
- RoutedProject.js: remove handleDatasetEdit, selectedDataset, and the
  edit-mode DatasetFormModal instance

Dataset card header now has only Provenance + Delete (private). The
provenance modal footer hosts the Export dropdown and Add/Edit French
Translation button (private only), keeping all dataset actions in one place.

- DatasetProvenanceModal: add export dropdown, FR translation button,
  and nested DatasetTranslationModal; pull in fetchTranslation +
  authHeader + metadataUrl + downloadJson from Dataset.js
- Dataset.js: remove export dropdown, FR translation button,
  DatasetTranslationModal, and all related state/handlers/imports

The Dataset component is only rendered in the authenticated manager UI,
so the private/public conditional branching was dead weight.

- Remove mode prop and isPrivate derived flag from Dataset
- Inline the truthy branch everywhere (delete button, linked field set
  controls, modal rendering, tab content)
- Remove isPrivate prop from DatasetProvenanceModal, DatasetOverview,
  DatasetDataTypes; always show edit/translation buttons and actions column
- DatasetOverview: drop project prop (only used for the public project
  statistic col that is now removed)
- Project.js: remove mode="private" from Dataset usage

When a French translation exists (edit mode), show a danger "Delete
Translation" button in the modal footer behind a Popconfirm. On success,
dispatches onSave (triggers dataset refresh) and closes the modal.

Adds deleteTranslation() to the datasetTranslations API (DELETE .../translations/fr).

- Add isEdit to handleUpload useCallback deps (exhaustive-deps)
- Remove dataset.translations (now redundant via isEdit)
- Inline Alert message prop (prettier)

Button label (Add/Edit French Translation) was only updating after the
async fetchProjectsWithDatasets() resolved. Fix with an optimistic local
override in DatasetProvenanceModal that takes effect instantly, then clears
once the store delivers the updated translations array.

- onSave now carries hasFrNow: boolean so callers know the new state
- DatasetTranslationModal passes true (add/update) or false (delete)
- DatasetProvenanceModal sets hasFrOverride immediately, resets on store sync

… changes

Instead of refetching all projects, dispatch a targeted GET /api/datasets/{id}
that updates only the affected dataset slice in the Redux store. This ensures
dataset.translations stays in sync so the FR translation button label
(Add/Edit) reflects reality without an optimistic override.

- Add REFRESH_DATASET action type and refreshDataset(datasetId) to actions.js
- Add REFRESH_DATASET.RECEIVE case to reducer, updating items, itemsByID,
  datasets, and datasetsByID
- DatasetProvenanceModal: use refreshDataset for both provenance save and
  translation onSave; remove hasFrOverride optimistic state
- DatasetTranslationModal: revert onSave back to () => void
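
The targeted update could look roughly like the reducer case below. This is a sketch under stated assumptions: the state shape (projects in `items`/`itemsByID`, each holding its `datasets`, plus flat `datasets`/`datasetsByID`), the string action type, and the `data` payload key are guesses based on the field names in the commit message, not the actual reducer.

```typescript
// Assumed state and action shapes; only the four slice names come from the commit message.
interface Dataset {
  identifier: string;
  title: string;
  translations: { language: string }[];
}

interface Project {
  identifier: string;
  datasets: Dataset[];
}

interface ProjectsState {
  items: Project[];
  itemsByID: Record<string, Project>;
  datasets: Dataset[];
  datasetsByID: Record<string, Dataset>;
}

interface RefreshDatasetReceive {
  type: "REFRESH_DATASET.RECEIVE";
  data: Dataset;
}

// Rebuilds an identifier-keyed lookup from a list.
function indexByID<T extends { identifier: string }>(list: T[]): Record<string, T> {
  const out: Record<string, T> = {};
  for (const item of list) {
    out[item.identifier] = item;
  }
  return out;
}

// Swaps the refreshed dataset into every slice that holds a copy of it,
// leaving all other datasets and projects untouched.
function applyRefreshDataset(state: ProjectsState, action: RefreshDatasetReceive): ProjectsState {
  const swap = (d: Dataset): Dataset =>
    d.identifier === action.data.identifier ? action.data : d;
  const items = state.items.map((p) => ({ ...p, datasets: p.datasets.map(swap) }));
  const datasets = state.datasets.map(swap);
  return {
    items,
    itemsByID: indexByID(items),
    datasets,
    datasetsByID: indexByID(datasets),
  };
}
```

Updating all four slices in one pass keeps them consistent, which is what lets the button label read straight from `dataset.translations` without a local override.
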

…object display

- Validate uploaded JSON language matches expected lang before sending to API;
  reject with error instead of silently overriding the language field
- Remove language override in upsertTranslation; file language is now pre-validated
- Coerce funder objects to name string in prepareInitialValues to fix [object Object] display

…translation modal

Replace auto-dismissing toasts for pre-upload errors (invalid JSON, Zod validation
failures, wrong language) with a persistent closable Alert; show all Zod errors
instead of only the first
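
The pre-upload checks from the last two commits can be sketched without any library, as below. This is a dependency-free stand-in, not the PR's Zod schema: the `Issue` type only mimics the path-plus-message shape of a validation issue, and the `title` check and all function names are hypothetical. The point is that every problem is collected and returned, and a language mismatch is rejected rather than overridden.

```typescript
// Stand-in for a schema validation issue: a path into the document plus a message.
interface Issue {
  path: (string | number)[];
  message: string;
}

// One display line per issue, so the Alert can list all of them.
function formatIssues(issues: Issue[]): string[] {
  return issues.map((i) => (i.path.length > 0 ? `${i.path.join(".")}: ${i.message}` : i.message));
}

// Returns every problem found in the uploaded text; an empty array means "safe to send".
function checkUpload(text: string, expectedLang: string): string[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(text);
  } catch {
    return ["File is not valid JSON"];
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return ["Top-level JSON value must be an object"];
  }
  const record = parsed as Record<string, unknown>;
  const issues: Issue[] = [];

  // Reject a wrong language instead of silently rewriting it.
  const language = record["language"];
  if (typeof language !== "string") {
    issues.push({ path: ["language"], message: "Expected a string" });
  } else if (language !== expectedLang) {
    issues.push({ path: ["language"], message: `Expected "${expectedLang}", got "${language}"` });
  }

  // Hypothetical field check, included only to show several issues being reported together.
  if ("title" in record && typeof record["title"] !== "string") {
    issues.push({ path: ["title"], message: "Expected a string" });
  }
  return formatIssues(issues);
}
```

Returning a list rather than throwing on the first failure is what allows a persistent Alert to show the full set of errors at once.
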
@gsfk gsfk self-requested a review June 10, 2026 14:43

@gsfk gsfk left a comment

Member

Much nicer than yesterday, although I have more comments:

  • Found an existing bug: the "Remove" buttons in dataset editing don't work; the fields persist, even after a hard refresh. For the "counts" field you can even see the data you deleted reappear in the UI when you close the modal.
  • Zod error messages persist after closing.
  • Can we go back to plain old "Edit" for the button text instead of "Provenance"?
  • Is there a way to tell which fields it will actually use the translation for? There are some fields where it probably shouldn't accept a translation at all, like PCGL DAC ID.
  • It accepts translations for the entire discovery config but ignores them.

@SanjeevLakhwani SanjeevLakhwani requested a review from gsfk June 10, 2026 17:26

@gsfk gsfk left a comment

Member

  • "Remove" is still broken.
  • Repeating my question: does the "edit/view dataset information" button need to be called "Provenance"? This is probably worth discussing in a meeting.

Two bugs caused field removals (e.g. counts, pcgl_dac_id) to appear
to revert after a successful save:

1. cleanFormValues stripped cleared fields entirely; the PUT merge
   ({ ...dataset, ...values }) then resurrected old values from the
   existing dataset. Now emit explicit null for each cleared top-level
   field so the merge overwrites correctly.

2. onSuccess called form.resetFields() before refreshDataset completed,
   snapping the form back to the stale preparedInitialValues snapshot
   (memoised on identifier, never recomputed). Removed the resetFields
   call; a new useEffect syncs the form to the refreshed dataset prop
   whenever editing is false.
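
The first bug is easy to reproduce in isolation. The sketch below assumes a simplified `cleanFormValues` that takes the list of form-managed top-level fields as a parameter; the signature and the helper names `isCleared` and `buildPutBody` are hypothetical, and only the `{ ...dataset, ...values }` merge comes from the commit message.

```typescript
type Values = Record<string, unknown>;

// A field counts as cleared when the form holds no usable value for it.
function isCleared(value: unknown): boolean {
  return value === undefined || value === null || value === "";
}

// Keeps filled-in values and emits an explicit null for every managed field
// that was cleared, so the spread merge below overwrites the stale value.
function cleanFormValues(values: Values, managedFields: string[]): Values {
  const out: Values = {};
  for (const key of Object.keys(values)) {
    if (!isCleared(values[key])) {
      out[key] = values[key];
    }
  }
  for (const key of managedFields) {
    if (!(key in out)) {
      out[key] = null;
    }
  }
  return out;
}

// The PUT body: existing dataset first, cleaned form values on top.
function buildPutBody(dataset: Values, values: Values, managedFields: string[]): Values {
  return { ...dataset, ...cleanFormValues(values, managedFields) };
}

// Before the fix: a key missing from the form values never overrides the
// existing dataset, so the removed "counts" entry comes back.
const existing: Values = { title: "A", counts: [{ count_entity: "participants" }] };
const resurrected: Values = { ...existing, ...{ title: "A" } };
// resurrected.counts is still the old array here.
```

With the explicit null in place, `buildPutBody(existing, { title: "A", counts: undefined }, ["title", "counts"])` yields a body whose `counts` is `null` rather than the old array.
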

@SanjeevLakhwani SanjeevLakhwani changed the base branch from refactor/unify-datasets-v2-endpoint to master June 15, 2026 20:07

- funding_sources items can be FundingSource or Link, but UI only
  rendered FundingSource fields (funder/grant_numbers)
- z.union([FundingSource, Link]) tried FundingSource first; since all
  its fields are optional it silently matched Link objects too,
  stripping label/url. Reordered to [Link, FundingSource].
- added _type discriminator + FundingSourceCard component (Form.useWatch)
  so the UI renders the right fields per item type
- prepareInitialValues now tags existing link-shaped entries with
  _type: link so they load correctly
- added client-side URL pattern validation matching the backend's
  http(s):// requirement so bad URLs are caught before submit
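
To show why the order of union members matters, here is a dependency-free stand-in for first-match union parsing. It is not Zod: `firstMatch`, `parseLink`, and `parseFundingSource` are hypothetical helpers that imitate a union which returns the first member that accepts the input, with object parsers that drop unknown keys. The `isHttpUrl` pattern is likewise an assumed approximation of the backend's http(s):// rule.

```typescript
interface Link {
  label: string;
  url: string;
}

interface FundingSource {
  funder?: string;
  grant_numbers?: string[];
}

type FundingItem = Link | FundingSource;
type Parser<T> = (input: Record<string, unknown>) => T | undefined;

// Requires both fields, so it rejects funding-source-shaped input.
const parseLink: Parser<Link> = (input) => {
  const label = input["label"];
  const url = input["url"];
  if (typeof label === "string" && typeof url === "string") {
    return { label, url };
  }
  return undefined;
};

// Every field is optional and unknown keys are dropped, so ANY object passes,
// including a link, which comes out as an empty object.
const parseFundingSource: Parser<FundingSource> = (input) => {
  const out: FundingSource = {};
  const funder = input["funder"];
  if (funder !== undefined) {
    if (typeof funder !== "string") return undefined;
    out.funder = funder;
  }
  const grants = input["grant_numbers"];
  if (grants !== undefined) {
    if (!Array.isArray(grants) || !grants.every((g) => typeof g === "string")) return undefined;
    out.grant_numbers = grants as string[];
  }
  return out;
};

// Returns the result of the first parser that accepts the input.
function firstMatch<T>(parsers: Parser<T>[], input: Record<string, unknown>): T | undefined {
  for (const parse of parsers) {
    const result = parse(input);
    if (result !== undefined) return result;
  }
  return undefined;
}

// Assumed client-side check mirroring an http(s):// requirement.
function isHttpUrl(value: string): boolean {
  return /^https?:\/\/\S+$/.test(value);
}
```

Putting the stricter member first means a link is matched as a link, while a funding source still falls through to the permissive parser.
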
@gsfk gsfk self-requested a review June 23, 2026 15:26

@gsfk gsfk left a comment

Member

Seems good. I think most of the annoyances and edge cases are gone. The few that remain arguably aren't that interesting; for example, the translation can add fields but is never permitted to remove them.

@SanjeevLakhwani SanjeevLakhwani merged commit f91ab03 into master Jun 25, 2026
3 checks passed
2 participants