June 2020 Galactic News
James Taylor Foundation, BCC2020, Galaxy+SRA, Events, Platforms, ...
In the June 2020 issue
- James Taylor Foundation
- BCC2020 will be online, global, affordable, and accessible
- Galaxy and the NCBI Sequence Read Archive (SRA)
- Upcoming events
- Galaxy Platform News
- Training material and doc updates
- Who's Hiring
- New Releases
- New publications
- And other cool news too
If you have anything to include in next month's newsletter, please send it to outreach@galaxyproject.org.
JTech, the James Taylor Foundation
The recent passing of Dr. James Taylor, the Ralph S. O’Connor Professor of Biology and Professor of Computer Science at Johns Hopkins University, has left an enormous void in the field of computational biology. To help fill this void and continue James’ efforts, the Galaxy community has established a memorial foundation in James’ name.
James believed that scientific progress can best be sustained through mentoring of students and junior faculty. The Junior Training and Educational Connections Hotspot (JTech) foundation will ensure implementation of this vision. To begin, JTech will (1) support graduate students to participate in computational biology and data science conferences, and (2) organize and host mentoring sessions between senior and junior faculty members at high-profile meetings. JTech will later expand its reach as a platform for academic mentorship, including students from high school through college age.
To make this happen, we are accepting contributions here. Please help us continue what James started.
BCC2020 will be Online, Global, Affordable, and Accessible
The 2020 Bioinformatics Community Conference (BCC2020) brings together the Bioinformatics Open Source Conference (BOSC) and the Galaxy Community Conference. If you work in data-intensive life science research, there is no better event for sharing your work and learning from other researchers addressing the challenges of modern data-driven biology. BCC2020 will be held July 17-26 and will offer 2 days of training, a 3-day meeting, and a 4-day CollaborationFest.
Going online and global, combined with low registration rates, makes this the most accessible Galaxy or BOSC conference ever. If you work in open source bioinformatics, anywhere in the world, then this is 2020’s best opportunity to share your work and learn from others.
We are pleased to announce that Prash Suravajhala of BioClues and the Birla Institute of Scientific Research will give a keynote address at BCC2020. Prash is a founder of BioClues, India’s largest bioinformatics society.
BCC2020 registration is now open. Early registration saves 50% off the full rates and starts at $3 per training session and $12 for the three-day meeting.
BCC2020 Sponsors
We are pleased to announce several sponsors for BCC2020! These organizations have stepped up to help make BCC global and affordable:
Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform, offering over 175 fully featured services from data centers globally. AWS has long enabled life science researchers to access dynamically scalable and cost-effective compute resources without requiring an investment in dedicated and expensive local computational infrastructure that can rapidly become outdated. Millions of customers, including the fastest-growing startups, largest enterprises, and leading government agencies, are using AWS to lower costs, become more agile, and innovate faster. You can too.
Please welcome AWS as a Gold Level sponsor of BCC2020.
The Broad Institute's Data Sciences Platform accelerates science, transforms medicine, and improves lives through data technologies. It is a diverse organization of more than 160 people working together and with external collaborators to deliver high-quality open source software and services, such as the Genome Analysis Toolkit (GATK), the Cromwell workflow management system and Terra, the Broad Institute's cloud-based data access and analysis platform. Learn more about Terra with these 3-10 minute video guides and at this BCC training session on Saturday, July 18.
Please welcome Broad Institute's Data Sciences Platform as a Silver Level sponsor of BCC2020.
The Software Sustainability Institute is a world-leading hub for the cultivation and improvement of research software practice. The Institute provides guidance, training, policy, and tools to thousands of researchers and research software engineers working across all disciplines to help achieve a vision of “better software, better research”.
Please welcome the Software Sustainability Institute as a Gold Level sponsor of BCC2020.
GigaScience is an open access, open data, open peer-review journal published by Oxford University Press and BGI. We offer ‘big data’ research from the life sciences, including work that uses difficult-to-access large-scale data, such as imaging, neuroscience, ecology, systems biology, and other types of shareable data. GigaScience uniquely publishes all research objects (data, software tools, source code, workflows, containers and other elements related to the findings in the article). Novel work presented at the meeting is eligible for 15% off the APC.
Please welcome GigaScience as a Silver Level sponsor of BCC2020.
Interested in helping BCC2020 happen? See the sponsorship opportunities page for information.
Galaxy and the NCBI Sequence Read Archive (SRA)
Galaxy and the NIH Sequence Read Archive are now directly connected, enabling researchers to work with SRA data available from NCBI (https://www.ncbi.nlm.nih.gov/sra/) more easily within the Galaxy framework. An upcoming webinar will describe the connection and demonstrate how to use it to get SRA data into Galaxy.
NIH's Sequence Read Archive (SRA) will become an integrated data source on UseGalaxy.org this month. This functionality is also built into the upcoming 20.05 release of Galaxy. With this connection, you will be able to work with SRA data available from NCBI more easily within the Galaxy framework.
A webinar on June 24 at noon Eastern US time (GMT-4) will demonstrate this integration and how to use SRA data in the Galaxy platform. Interested? Register now.
NIH has released a request for information (RFI) to solicit community feedback on new proposed Sequence Read Archive (SRA) data formats. Learn more and share your thoughts at https://go.usa.gov/xvhdr.
The response deadline is July 17, 2020. We encourage you to share this request with your colleagues and networks, and to respond if you are an SRA submitter or data user.
More Upcoming Events
The coronavirus outbreak has impacted BCC2020, and just about every other event for the rest of the year too. Most events through the end of August have been postponed or moved online. We have updated our list of events to reflect what we know. Some highlights:
June is an active month for the Science Gateways Community Institute:
- Jumpstart Your Sustainability Plan mini-course
  - June 16-18, with main presentations 12-1:30 pm ET each day, and optional office hours and special topics presentations from 2-5 pm ET
  - Register by June 12, 2020
- Galaxy: Powering Science from the Desktop to Global Cyberinfrastructure, June 24, Nate Coraor.
This workshop will be held online 22-26 June. Registration deadline is 16 June!
The organizers are seeking French-speaking Teaching Assistants for this workshop. If you can help (Tues + Thurs only) please contact Björn Grüning.
ISMB 2020 has gone virtual and Galaxy is going with it: Check out
- a tutorial,
- a COSI keynote, and
- five posters
so far. Look for more as the schedule comes online.
There are
- 20 upcoming events (most of them virtual)
- covering COVID-19, variant detection, assembly, machine learning, accessing SRA, metabarcoding, bioinformatics education, RNA-Seq and more.
And material from some recent past events is now available:
- Videos of all 5 sessions in the Galaxy-ELIXIR webinar series: FAIR data and Open Infrastructures to tackle the COVID-19 pandemic are now available.
- Slides and video for the webinar Galaxy Project—Enabling an active global research community are online.
Galaxy Platforms News
The Galaxy Platform Directory lists resources for easily running your analysis on Galaxy, including publicly available servers, cloud services, and containers and VMs that run Galaxy. Here's the recent platform news we know about:
The MaREA4Galaxy server supports the Metabolic Reaction Enrichment Analysis and visualization of RNA-seq data toolset. This includes tools to
- Compute Reaction Activity Scores from gene expression (RNA-seq) dataset(s).
- Cluster any dataset using the most commonly used algorithms: K-means, agglomerative clustering, and DBSCAN.
- Analyze and visualize differences in the Reaction Activity Scores (RASs) of groups of samples, as computed by the Expression2RAS tool.
The wQAP Galaxy server is an online version of QAP, the Quasispecies Analysis Package. It contains all the programs in QAP, with almost no differences between the two. It comes with a tutorial and email support. wQAP is supported by the Research Laboratory of Clinical Virology, Ruijin Hospital, Shanghai Jiaotong University, School of Medicine, Shanghai, China.
ChemicalToolBox is a webserver for processing, analysing and visualising chemical data, and performing molecular simulations, with almost 100 tools. Training materials are available, and this toolbox was used in the recent Virtual screening of the SARS-CoV-2 main protease work.
- UseGalaxy.eu has surpassed 100,000 workflow invocations.
- Lots of tool updates on UseGalaxy.eu and UseGalaxy.org.au
Platforms that were referenced/used at least twice in recent publications:
- 43: Huttenhower
- 13: Workflow4Metabolomics
- 12: UseGalaxy.eu
- 6: RepeatExplorer
- 5: ARGs-OAP
- 4: Cistrome
- 3: Galaxy-P
- 2: ARIES
- 2: Mississippi
- 2: RNA Workbench
- 2: Trinity
- 2: UseGalaxy.org.au
Doc, Hub, and Training Updates
Machine learning has had an active Galaxy community for a while, and that community now has its own web page on the Galaxy Hub describing the Machine Learning Workbench, relevant training, and supported tools. Coming soon: community communication channels.
By Alireza Khanteymoori and Anup Kumar.
Discover hidden structure or patterns in unlabeled training data using unsupervised learning with clustering.
By Melanie Föll and Matthias Fahrner
Introduces the data analysis from raw data files to protein identification and quantification of two label-free human serum samples with the MaxQuant software.
This extensive overview slide deck on how Galaxy Code is architected received a major update from John Chilton, Helena Rasche, and Nicola Soranzo.
Simon Bray posted a major update to this COVID-19 related tutorial. (He added Frankenstein.)
Who's Hiring
VIB-UGent Center for Plant Systems Biology, Ghent, Belgium
... We are building on the internationally used platform FAIRDOMhub for data management, and Galaxy (https://www.usegalaxy.be and https://usegalaxy.eu) for data analysis.
Black Canyon Consulting at NCBI, Bethesda, Maryland, United States
New England Biolabs, Ipswich, Massachusetts, United States
Releases
The first full stable release of the blend4php package is out. blend4php is a PHP wrapper for interacting with Galaxy and CloudMan. blend4php currently offers a partial implementation of the Galaxy API and includes support for datasets, data types, folder contents, folders, genomes, group roles, groups, group users, histories, history contents, jobs, libraries, library contents, requests, roles, search, tools, toolshed repositories, users, visualizations and workflows.
Publications
262 new publications referencing, using, extending, and implementing Galaxy were added to the Galaxy Publication Library in April and May. There were 16 Galactic and Stellar publications added, and 14 of them are open access.
Nekrutenko, A., & Schatz, M. C. (2020). Genome Biology, 21(1), 105. https://doi.org/10.1186/s13059-020-02016-0
Knijn, A., Michelacci, V., Orsini, M., & Morabito, S. (2020). BioRxiv, 2020.05.14.095901. https://doi.org/10.1101/2020.05.14.095901
Schäfer, R. A., Lott, S. C., Georg, J., Grüning, B. A., Hess, W. R., & Voß, B. (2020). Bioinformatics. https://doi.org/10.1093/bioinformatics/btaa556
Goonasekera, N., Mahmoud, A., Chilton, J., & Afgan, E. (2020). BioRxiv, 2020.05.28.121772. https://doi.org/10.1101/2020.05.28.121772
Bray, S. A., Lucas, X., Kumar, A., & Grüning, B. A. (2020). Journal of Cheminformatics, 12(1), 40. https://doi.org/10.1186/s13321-020-00442-7
Damiani, C., Rovida, L., Maspero, D., Sala, I., Rosato, L., Di Filippo, M., Pescini, D., Graudenzi, A., Antoniotti, M., & Mauri, G. (2020). Computational and Structural Biotechnology Journal, 18, 993–999. https://doi.org/10.1016/j.csbj.2020.04.008
Tekman, M., Batut, B., Ostrovsky, A., Antoniewski, C., Clements, D., Ramirez, F., Etherington, G. J., Hotz, H.-R., Scholtalbers, J., Manning, J. R., Bellenger, L., Doyle, M. A., Heydarian, M., Huang, N., Soranzo, N., Moreno, P., Papatheodorou, I., Nekrutenko, A., Taylor, J., … Grüning, B. (2020). BioRxiv, 2020.06.06.137570. https://doi.org/10.1101/2020.06.06.137570
Wolff, J., Rabbani, L., Gilsbach, R., Richard, G., Manke, T., Backofen, R., & Grüning, B. A. (2020). Nucleic Acids Research. https://doi.org/10.1093/nar/gkaa220
Su, S.-Y., Lu, I.-H., Cheng, W.-C., Chung, W.-C., Chen, P.-Y., Ho, J.-M., Chen, S.-H., & Lin, C.-Y. (2020). BMC Genomics, 21(3), 163. https://doi.org/10.1186/s12864-019-6404-8
Timme, R. E., Wolfgang, W. J., Balkey, M., Venkata, S. L. G., Randolph, R., Allard, M., & Strain, E. (2020). Preprints. https://doi.org/10.20944/preprints202004.0253.v1
Murat, K., Grüning, B., Poterlowicz, P. W., Westgate, G., Tobin, D. J., & Poterlowicz, K. (2020). GigaScience, 9(5). https://doi.org/10.1093/gigascience/giaa049
Blomberg, N., & Lauer, K. B. (2020). European Journal of Human Genetics, 1–5. https://doi.org/10.1038/s41431-020-0637-5
Allain, F., Mareuil, F., Ménager, H., Nilges, M., & Bardiaux, B. (2020). Nucleic Acids Research. https://doi.org/10.1093/nar/gkaa362
Wang, M., Li, J., Zhang, X., Han, Y., Yu, D., Zhang, D., Yuan, Z., Yang, Z., Huang, J., & Zhang, X. (2020). BMC Genomics, 21(1), 363. https://doi.org/10.1186/s12864-020-6744-4
Publications are tagged with how they use, extend, or reference Galaxy. This batch of publications was tagged as:
- 185: Methods
- 82: UsePublic
- 34: UseMain
- 26: RefPublic
- 19: Workbench
- 17: UseLocal
- 15: IsGalaxy
- 12: Tools
- 5: Reproducibility
- 3: Cloud
- 3: Other
- 3: Project
- 2: Education
- 1: Shared
- 1: UseCloud
- 1: Visualization
Other News
Open Life Science is launching its 2nd cohort. If you are an early-stage researcher who wants to become an ambassador for Life Science in your communities, then please consider applying for the next 16-week mentoring and training program.