The Galaxy Europe team at the ELIXIR All Hands Meeting 2024
The Galaxy Europe team participated in the ELIXIR All Hands meeting 2024 in Uppsala, Sweden.
The 10th ELIXIR All Hands meeting was held in Uppsala, Sweden, from June 10 to 12, 2024. The Galaxy Europe team attended this meeting, which aimed to bring together members of the ELIXIR community from across the ELIXIR Nodes, as well as collaborators from partner organisations, to review ELIXIR's achievements and activities and discuss future plans.
In the "Deciphering federated analysis: What is it and how far it can take us" session on the first day, participants discussed the following question: how can we conduct collaborative analysis while respecting data privacy and following data protection regulations? Multiple ELIXIR initiatives are working on federated analysis, which aims to address this problem. The Galaxy Europe team, as part of the EuroScienceGateway project, was present at this gathering. Dr. Björn Grüning presented "How Galaxy Can Be Used for Federated Analytics?" during this session. He discussed the similarity between the Galaxy architecture and EUCAIM, as well as Pulsar nodes, ELIXIR AAI, the GA4GH TES cluster, and the use of distributed nodes to improve various types of efficiency, including carbon efficiency. He also addressed the storage problem for federated data analysis and the Galaxy team's solutions, which include connecting to and selecting among multiple public or personal storage options via API, command line, and GUI, in addition to temporarily expanding users' scratch space. Pulsar provides a varied collection of options, including S3 storage from several countries, commercial S3, OneData, iRODS, and Rucio. In the same session, Dr. Tim Beck presented the role of EOSC-ENTRUST in secure federated data analysis and discussed RO-Crate and workflow processing, which enable the interoperability framework and execution approach for practical pan-Trusted Research Environment (TRE) analysis.
On the second day of the event, the "Tackling single-cell data management and interoperability in Galaxy" session discussed the advancements of analysis tool suites and the abundance of data. Poor interoperability of diverse single-cell data types has been identified as a challenge in this field. New tools and regular releases of R and Python tools bring new data formats, many of which lack backward compatibility, posing challenges for tutorial and workflow sustainability. In this session, ELIXIR members explored how to incorporate ELIXIR data management techniques and how to use available tools, such as the ELIXIR single-cell RDM toolkit, to assist users in standardizing RDM and data interoperability. Wendi Bacon and Pavankumar Videm presented the lifetime of single-cell data on Galaxy and explored the problems in standards and data formats. Participants in this workshop were divided into breakout groups, with Björn Grüning and Wolfgang Maier also leading groups, to create and test workflows and documentation, incorporate public metadata standards into the workflows, and expand the ELIXIR-RDM toolkit page with applicable standards. The output of these workflows was formatted in the research object crate (RO-Crate) format, which is compliant with ELIXIR specifications.
During the "Where is the dark matter? The Galaxy Community shedding light on activities, connections, scope, and unaddressed shortcomings" session, also on the second day of the event, the Galaxy Community in ELIXIR explored the prospects Galaxy brings as both a platform and a community. The primary goal of this workshop was to promote knowledge about the wide range of accessible tools, procedures, and projects from closely related domains. The session was organized in a world café format and divided into three major groups: (a) Research Data Management in Galaxy, moderated by Dr. Frederik Coppens and Dr. Rafael Andrade Buono; (b) Training and Skills, moderated by Dr. Bérénice Batut; and (c) Galaxy Workflows and the IWC, moderated by Dr. Wolfgang Maier.
On the third day of the event, during the "Connecting ELIXIR's technical ambitions" session, the Galaxy Community's opportunities and advancements were presented by Dr. Frederik Coppens under the heading "Reproducible analytics and infrastructure". During this session, EuroScienceGateway and the Galaxy community's advancements, such as increasing content on the Galaxy Training Network (GTN), as well as infrastructure advancements, such as the Bring Your Own Storage (BYOS) and Bring Your Own Compute (BYOC) features, were discussed. The project's future path, which includes lowering environmental impact, community sustainability, federated sensitive data analysis, and growing the array of tools and workflows, was presented to the audience. In addition, the Galaxy Community's ties to other ELIXIR components such as LS Login, Bioschemas, RO-Crate, WorkflowHub, ELIXIR bio.tools, and OpenEBench were discussed. Finally, the different ways of contributing to and getting involved in the Galaxy Community were discussed.
Also on the third day of the event, during the "ELIXIR Cloud - A roadmap towards operationalisation and expansion" workshop, the members discussed the ELIXIR Cloud, a modular and interoperable cloud infrastructure for genome-scale data analysis. It is based on community standards established by the Global Alliance for Genomics and Health (GA4GH). Stian Soiland-Reyes demonstrated WorkflowHub cloud integration in ELIXIR. WorkflowHub standardizes metadata and allows you to link publications, data files, workflows, and protocols, as well as execute, publish, and cite workflows. WorkflowHub can accommodate a wide range of workflow systems, including CWL, Galaxy, Nextflow, and RO-Crate. It also supports integration with GitHub, GitLab, Galaxy IWC, and Nextflow nf-core. WorkflowHub also provides the framework for publishing and sharing workflows with DOIs using DataCite, Zenodo, ORCID, and the COVID-19 Data Portal. In this session, Björn Grüning demonstrated Galaxy's federated storage. The Data Repository Service (DRS), Task Execution Service (TES), and Beacon APIs, as well as the Tool Registry Service (TRS), were presented during this workshop. The challenges of increasing storage requirements, as well as recent Galaxy advancements (e.g., deferred data, temporary additional storage space for up to four weeks, export to long-term archives, and BYOS), were discussed.
On the final day, multiple participants presented the various capabilities of the ELIXIR Research Software Ecosystem (RSEc) and its application in life science research during the seminar "The Research Software Ecosystem in action: examples and use-cases". Dr. Bérénice Batut discussed the latest developments to expose tools and workflows for specific scientific communities, as well as the bi-directional association of tools and training materials. She also demonstrated how user support could be integrated into this package, as well as how Galaxy tools, workflows, training, and support for EDAM and bio.tools IDs are linked together.