Task-based history export tracking
New pull request from David López merged: Add task-based history export tracking
TLDR
This PR enables the tracking of every history export (and potentially any other exportable object) in the Galaxy database by adding a new table containing structured metadata about the export context.
It also creates a new HistoryExport component in the UI that is based on the asynchronous task framework (Celery) instead of the regular job system for handling exports and short-term downloads (https://github.com/galaxyproject/galaxy/pull/14511).
Backend Changes
- Adds a new store_export_association table to track any object export associated with a task. This table has an export_metadata column containing information about the export context (request and result) as structured JSON (modeled with Pydantic objects).
- Adds a new StoreExportTracker class to associate export metadata with any exportable object.
- Refactors the HistoryExportView to HistoryExportManager for managing the exports associated with histories.
- Adapts the model_stores to generate and associate the export metadata to history downloads and remote exports.
- Refactors the Tasks API to use an abstract AsyncTaskManager and the corresponding CeleryAsyncTaskManager implementation to query task status.
- Modifies the /api/histories/{id}/exports endpoint to accept a new type application/vnd.galaxy.task.export+json that will return the new task-based export associations. It will continue returning the previous "JEHAs" using the default request accept type, so it is backward compatible.
- Some other small refactorings for the schemas, removing unused code, etc.
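To illustrate the content negotiation on the exports endpoint, here is a minimal sketch that builds (but does not send) a request for the task-based export list. The helper name build_exports_request, the example server URL, and the history ID are hypothetical, and authentication is left out on purpose; only the endpoint path and the media type come from the PR description.

```python
from urllib.request import Request

# Media type named in the PR for the new task-based export associations.
TASK_EXPORT_MEDIA_TYPE = "application/vnd.galaxy.task.export+json"


def build_exports_request(base_url: str, history_id: str, task_based: bool = True) -> Request:
    """Build a GET request for /api/histories/{id}/exports (hypothetical helper).

    With task_based=True the Accept header asks for the new export
    associations. Otherwise no Accept header is set, so the server's
    default (the previous "JEHA" listing) applies.
    """
    url = f"{base_url.rstrip('/')}/api/histories/{history_id}/exports"
    headers = {"Accept": TASK_EXPORT_MEDIA_TYPE} if task_based else {}
    return Request(url, headers=headers, method="GET")


if __name__ == "__main__":
    # Example values only; a real call would also need credentials.
    request = build_exports_request("https://galaxy.example.org", "abc123")
    print(request.get_method(), request.full_url)
    print("Accept:", request.get_header("Accept"))
```

Existing clients that never set the Accept header should keep seeing the legacy response, which is what makes the change backward compatible.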
Frontend Changes
Added several new components that display information about the latest export and previous ones, along with actions to download them again, re-import them from remote sources, or choose the preferred archive format. They provide functionality similar to the previous components (which are still used when Celery is not enabled) and add some extra information related to export tracking.
Choose Export Options
You can now decide which export format to use and which dataset files you want to include in the archive.
Direct Download (Temporary)
When you use the direct download option, it will be reused (the package won't be regenerated) whenever you try to download it again until:
- The download link expires
- The history changes
- The export parameters change
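The three conditions above can be sketched as a single reuse check. This is only an illustration of the rule, not the code from the PR: the ExportRecord class, its field names, and the parameter values are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class ExportRecord:
    """Hypothetical snapshot of the state when the archive was generated."""

    expires_at: datetime           # when the download link stops working
    history_update_time: datetime  # last history change at export time
    export_params: dict            # format, included files, etc.


def can_reuse_export(
    record: ExportRecord,
    history_update_time: datetime,
    requested_params: dict,
    now: Optional[datetime] = None,
) -> bool:
    """Return True if the existing package can be served again as-is."""
    now = now or datetime.now(timezone.utc)
    if now >= record.expires_at:
        return False  # the download link expired
    if history_update_time > record.history_update_time:
        return False  # the history changed after the export
    if requested_params != record.export_params:
        return False  # the export parameters changed
    return True


if __name__ == "__main__":
    exported_at = datetime(2022, 11, 1, tzinfo=timezone.utc)
    params = {"format": "example.zip", "include_files": True}  # made-up values
    record = ExportRecord(exported_at + timedelta(hours=24), exported_at, params)
    one_hour_later = exported_at + timedelta(hours=1)
    print(can_reuse_export(record, exported_at, dict(params), now=one_hour_later))
```

If any check fails, a new package would have to be generated before it can be downloaded.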
Export to Remote File Source (Long term)
When you export your history to a remote file source, a reference to it is stored in the export record, and you can reimport it later as long as the remote file source still holds the exported package.
List of previous exports
You can download or reimport copies (snapshots) of previous versions of the history as long as the download has not expired or the remote source still holds the exported package.
Thanks to the reviewers John Chilton and Marius van den Beek. Check out the code at #14839