December 2014 Galaxy Update
Welcome to the December 2014 Galaxy Update, a summary of what is going on in the Galaxy community. Galaxy Updates complement the Galaxy Development News Briefs which accompany new Galaxy releases and focus on Galaxy code updates.
Events
Galaxy Day: 3 December, Paris
The French Working Group GALAXY-IFB (Institut Français de Bioinformatique) is organizing a second Galaxy Day around the Galaxy portal. The event will be at Institut Curie in Paris over two days. This year, we want to involve two communities: biologists (also known as Galaxy 'users') and bioinformaticians (Galaxy 'developers'). The goal is to present user experience around the portal, from a single user to a wider community:
- Dec 3 (09:00-17:00)
  - Galaxy's user experiences, and discussion on how the platform is (or is not) useful for building analysis.
  - Technology talks (new environment, Galaxy in production, ...)
Interested? Please contact [ifb DOT galaxy AT sb DASH roscoff DOT fr](mailto:ifb DOT galaxy AT sb DASH roscoff DOT fr) for more information.
The French IFB Galaxy Working Group:
URGI, GenoToul, MIGALE, PFEM, SouthGreen, Institut Curie, ABiMS
Intro to Galaxy Workshop, Dec 12, Virginia State U
If you are anywhere close to Richmond, Virginia and you want to attend an Introduction to Galaxy Workshop, then you are in luck. Thanks to sponsorship from the College of Natural and Health Sciences at Virginia State University, the workshop is free and open to all researchers and students. The workshop is split into morning and afternoon sessions. No knowledge of programming or command line interfaces is required.
However, space is limited, so you are encouraged to register right now.
GCC2015: 6-8 July, Norwich UK
Save these dates: 6-8 July 2015
Call for Sponsors
The 2015 Galaxy Community Conference (GCC2015) is now accepting Sponsorships. Your organisation can play a prominent part in the Galaxy community by sponsoring GCC2015. Sponsorship is an excellent way to raise your organization’s visibility.
Several sponsorship levels are available, including two levels of premier sponsorships that include presentations. Premier sponsorships are limited, however, so you are encouraged to act soon.
Please let the GCC2015 Organising Committee know if you are interested in helping make this event a success.
GCC2015 Training Day Topic Nominations ...
... will open shortly (watch those Galaxy channels). The topics offered at the GCC2015 Training Day will be determined by you, the Galaxy Community: nominated topics will be voted on by the community early next year.
Please start giving some thought to what topics you would like to see covered at the GCC2015 Training Day.
Other Events
There are upcoming events on two continents. See the Galaxy Events Google Calendar for details on other events of interest to the community.
Date | Topic/Event | Venue/Location | Contact |
---|---|---|---|
December 3 | Galaxy Day | Institut Curie, Paris, France | IFB Galaxy |
December 5-8 | Next Generation Data Analysis Workshop | UC Riverside, Riverside, California, United States | Rakesh Kaundal |
December 9-11 | Microarray data analysis on Galaxy | BIRD IFB core facility Nantes University/INSERM, Nantes, France | Raluca Teusan, Audrey Bihouée, Edouard Hirchaud |
December 12 | Introduction to Galaxy Workshop | Virginia State University, Petersburg, Virginia, United States | Glenn Harris, Dave Clements |
December 16-19 | RNA-Seq and ChIP-Seq Analysis with Galaxy | UC Davis, California, United States | UC Davis Bioinformatics Training |
2015 | |||
January 10-14 | Galaxy for SNP and Variant Data Analysis | Plant and Animal Genome XXIII (PAG2015), San Diego, California, United States | Dave Clements |
February 9-13 | Analyse bioinformatique de séquences sous Galaxy | Montpellier, France | J.F. Dufayard |
February 16-18 | Accessible and Reproducible Large-Scale Analysis with Galaxy | Genome and Transcriptome Analysis, part of Molecular Medicine Tri-Conference, San Francisco, California, United States | James Taylor |
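February 16-18 | Large-Scale NGS data Analysis on Amazon Web Services Using Globus Genomic | Genomics & Sequencing Data Integration, Analysis and Visualization, part of Molecular Medicine Tri-Conference, San Francisco, California, United States | Ravi Madduri |
February 16-18 | iReport: An Integrative “omics” Reporting and Visualisation Platform | Genomics & Sequencing Data Integration, Analysis and Visualization, part of Molecular Medicine Tri-Conference, San Francisco, California, United States | Andrew Stubbs |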
July 6-8 | 2015 Galaxy Community Conference (GCC2015) | The Sainsbury Lab, Norwich, United Kingdom | Galaxy Outreach |
New Papers
96 papers (a new record) referencing, using, extending, and implementing Galaxy were added to the Galaxy CiteULike Group in November. Some of those papers are:
- "Galaxy Cluster to Cloud - Genomics at Scale" by Enis Afgan, et al. in Proceedings of the 9th Gateway Computing Environments Workshop (2014), pp. 47-50, doi:10.1109/gce.2014.13
- "Automatically exposing OpenLifeData via SADI semantic Web Services" by Alejandro Gonzalez, et al. Journal of Biomedical Semantics, Vol. 5, No. 1. (2014), 46, doi:10.1186/2041-1480-5-46
- "Using phylogenetically-informed annotation (PIA) to search for light-interacting genes in transcriptomes from non-model organisms" by Daniel I. Speiser, et al. BMC Bioinformatics, Vol. 15, No. 1. (19 November 2014), 350, doi:10.1186/s12859-014-0350-x
- "A case study for cloud based high throughput analysis of NGS data using the Globus Genomics system" by Krithika Bhuvaneshwar, et al. Computational and Structural Biotechnology Journal (November 2014), doi:10.1016/j.csbj.2014.11.001
- "RNA-Seq Analysis of Differential Gene Expression in Electroporated Chick Embryonic Spinal Cord" by Felipe M. Vieceli, C. Y. Irene Yan, Journal of Visualized Experiments, No. 93. (1 November 2014), doi:10.3791/51951
- "A Hadoop-Galaxy Adapter for User-friendly and Scalable Data-intensive Bioinformatics in Galaxy" by Luca Pireddu, et al. in Proceedings of the 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (2014), pp. 184-191, doi:10.1145/2649387.2649429
The new papers covered these topics:
# | Tag | # | Tag | # | Tag | # | Tag |
---|---|---|---|---|---|---|---|
4 | Cloud | 1 | Project | 13 | Tools | 13 | UsePublic |
1 | HowTo | 10 | RefPublic | - | UseCloud | - | Visualization |
7 | IsGalaxy | 1 | Reproducibility | 3 | UseLocal | 32 | Workbench |
34 | Methods | 3 | Shared | 8 | UseMain | | |
Who's Hiring
The Galaxy is expanding! Please help it grow.
- Senior Development Engineer - Bioinformatics, and Bioinformatician II, University of Massachusetts Medical School
- Searching for bioinformaticians, post-docs, PhD students and software engineers in Freiburg, Germany at Max Planck Institute of Immunobiology and Epigenetics, and the Bioinformatics Group at the University of Freiburg
- Research Specialist, Michigan State University, United States
- Computational Science Developer I, Cold Spring Harbor Laboratory (CSHL), New York, United States
- Statistical Genomics Postdoc opening in the Makova lab at Penn State
- The Galaxy Project is hiring software engineers and post-docs
Got a Galaxy-related opening? Send it to outreach@galaxyproject.org and we'll put it in the Galaxy News feed and include it in next month's update.
Mailing Lists Moved
The Galaxy mailing lists have been moved from the lists.bx.psu.edu domain to the lists.galaxyproject.org domain. This transition should be largely transparent, but there are a few things to be aware of:
- List sender addresses and headers will change to reflect the updated domain: from bx.psu.edu to galaxyproject.org.
- Existing email filters you have set up may require adjustments.
- Posts from lists.galaxyproject.org could be categorized as spam until you train your filtering method.
The prior bx.psu.edu list posting email addresses will continue to accept email, which will be forwarded to the new list addresses.
New Galaxy-Training Mailing List
The Galaxy-Training mailing list is for anything related to training with Galaxy and training about Galaxy. Galaxy-Training is also the official mailing list of the Galaxy Training Network, a world-wide group of training organizations that give Galaxy-based training.
If you are at all interested in Galaxy Training then you are encouraged to join the list. Messages to this list are publicly archived and will be included in the Galaxy search engine as well.
And please welcome the latest member of the Galaxy Training Network:
- HSPH Bioinformatics Core (HBC), Boston, Massachusetts, United States
This brings total membership to 22 training organizations on 5 continents.
New Public Servers
Two new public Galaxy servers were announced in November:
Pitagora-Galaxy
- Links:
- Domain/Purpose:
  - The public, general purpose Galaxy server of the Pitagora-Galaxy Project. This server is intended for testing and sharing. Heavy analysis should be performed using the project's identical virtual machine (VM) or Amazon Machine Image (AMI).
- Comments:
  - "We are running a website for sharing users' know-how, and distributing a virtual environment where we configured Galaxy with selected workflows and tools. Now, you can perform our analysis workflows on the following three environments.
    - Access to the public web site for testing.
    - Download the virtual machine to your own PC or server.
    - Launch AMI (Amazon machine image) on AWS cloud.
  - Since Pitagora-Galaxy enables us to run the same workflows on any infrastructure and rebuild the environments in any time, we can quickly use Galaxy, and at the same time, ensure the reproducibility of the analyses. In addition, we plan to add a connector for Garuda Desktop, a desktop application platform, for data analyses that cannot be covered only with Galaxy tools."
- User Support:
  - Email: [Ryota Yamanaka](mailto:yamanaka AT genome DOT rcast.u-tokyo.ac.jp)
- Quotas:
  - Public Server: See the instructions at Pitagora-Galaxy Server.
  - VM and AMI: None.
- Sponsor(s):
PIA
- Links:
  - PIA Galaxy
  - "Using phylogenetically-informed annotation (PIA) to search for light-interacting genes in transcriptomes from non-model organisms," by Speiser, et al. in BMC Bioinformatics 2014, 15:350 doi:10.1186/s12859-014-0350-x
  - PIA code in Bitbucket
- Domain/Purpose:
  - Phylogenetics and gene interaction
- Comments:
  - "PIA (Phylogenetically Informed Annotation) is a set of tools for the Galaxy Bioinformatics Platform. In general, PIA uses BLAST, an alignment program, and RAxML's read placement algorithm to put unknown sequences into pre-calculated phylogenetic trees. We provide 102 genes called LIT (Light Interaction Toolkit) - vision genes like phototransduction genes - for use in PIA."
- User Support:
  - PIA Manual
  - Getting Started with PIA screencast.
- Quotas:
- Sponsor(s):
Galaxy Community Hubs
Share your experience now
There were no new Log Board or Deployment Catalog entries in November! Eek! Please don't let this happen again!
The Community Log Board and Deployment Catalog Galaxy community hubs were launched last year. If you have a Galaxy deployment or experience you want to share, then please publish it this month.
New Releases
BioBlend v0.5.2 was released in October. BioBlend is a Python library for interacting with CloudMan and the Galaxy API.
New versions of Galaxy, CloudMan, and blend4j were all released in August.
Look for a new Galaxy distribution soon.
ToolShed Contributions
Galaxy Project ToolShed Repos
Here are new contributions for the past two months.
In no particular order:
Tools
-
From peterjc:
- mira_datatypes: Defines 'mira' datatype for the MIRA Assembly Format. Note that Galaxy already has a 'maf' datatype for the Multiple (sequence) Alignment Format (MAF). This is specifically for the MIRA Assembly Format (also called MAF). Previously only on the Test Tool Shed.
- clc_assembly_cell: Galaxy wrapper for the CLC Assembly Cell suite from CLCBio. This is a wrapper for the commercial "CLC Assembly Cell" suite from CLCBio which includes a de novo assembler and read mapper: http://www.clcbio.com/products/clc-assembly-cell/ Uploaded v0.0.2, previously only on the Test Tool Shed.
- seq_composition: Uploaded v0.0.1 (with embedded citation). Sequence composition. Counts the letters in given sequence files, returning a table listing them with percentages. Suitable for use on assemblies or gene/protein sets. Probably not suitable for raw NGS reads.
- mira4_assembler: MIRA 4.0 assembler Wrapper for core functionality of assembly tool MIRA 4.0. Accepts data from Solexa/Illumina, Roche 454, Ion Torrent, PacBio and Sanger capillary sequencing. The key MIRA output files are captured, but the other files are deleted when the job finishes. Uploaded v0.0.4, previously only on the Test Tool Shed.
- samtools_depad: Runs "samtools depad" to remap a SAM/BAM file using a padded reference (with gap characters) giving a new BAM file using an unpadded (ungapped) reference. Uploaded v0.0.1, previously only on the Test Tool Shed.
- coverage_stats: This tool runs the commands "samtools idxstats" and "samtools depth" from the SAMtools toolkit, and parses their output to produce a concise summary of the coverage information for each reference sequence.
-
From iuc:
- bedtools: a powerful toolset for genome arithmetic. Collectively, the bedtools utilities are a Swiss-army knife of tools for a wide range of genomics analysis tasks. The most widely-used tools enable genome arithmetic: that is, set theory on the genome. For example, bedtools allows one to intersect, merge, count, complement, and shuffle genomic intervals from multiple files in widely-used genomic file formats such as BAM, BED, GFF/GTF, VCF. While each individual tool is designed to do a relatively simple task (e.g., intersect two interval files), quite sophisticated analyses can be conducted by combining multiple bedtools operations.
  - Repository-Maintainer: Björn Grüning
  - Repository-Development: https://github.com/galaxy-iuc/tool_shed/
-
From acrylamide:
- misc_tool_workflow_linkers: Contains my Tool Factory tools. So far includes only one tool for dealing with seqprep output's gzip'd nature.
-
From big-tiandm:
- boost_graph: The Perl module Boost-Graph.
-
From pjbriggs:
- weeder2: Motif discovery in sequences from coregulated genes of a single species. Weeder2 is a program for finding novel motifs (transcription factor binding sites) conserved in a set of regulatory regions of related genes
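The seq_composition entry above describes counting the letters in sequences and reporting them with percentages. As a rough illustration of that idea only (this is not the tool's actual code, and the function name `composition_table` is hypothetical), a minimal Python sketch might look like this:

```python
from collections import Counter


def composition_table(sequences):
    """Count each letter across all sequences.

    Returns a list of (letter, count, percentage) rows sorted by letter.
    Hypothetical helper for illustration; not part of seq_composition.
    """
    counts = Counter()
    for seq in sequences:
        # Treat lower- and upper-case letters as the same symbol.
        counts.update(seq.upper())
    total = sum(counts.values())
    return [
        (letter, n, 100.0 * n / total)
        for letter, n in sorted(counts.items())
    ]


if __name__ == "__main__":
    # Two toy sequences standing in for the records of a FASTA file.
    for letter, n, pct in composition_table(["ACGT", "AAGG"]):
        print(f"{letter}\t{n}\t{pct:.1f}%")
```

The sketch takes plain strings rather than parsing sequence files, so it shows only the counting step.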
Tool Suites
-
From arkarachai-fungtammasan:
- microsat_ngs_profiling_suite_trfm: all dependencies for the microsattelite_ngs package, and microsattelite_ngs itself
Packages / Tool Dependency Definitions
-
From agordon:
- package_datamash_1_0_6: GNU Datamash is a grouping and summarizing tool for tabular data files. It is a command-line program which performs basic numeric, textual and statistical operations on input textual data files. It is designed to be portable and reliable, and to help researchers easily automate analysis pipelines without writing code or even short scripts. Home page: http://www.gnu.org/software/datamash
-
From iuc:
- package_gnuplot_4_6: Contains a tool dependency definition that downloads and compiles version 4.6 of gnuplot. Gnuplot is a portable command-line driven graphing utility for Linux, OS/2, MS Windows, OSX, VMS, and many other platforms. It was originally created to allow scientists and students to visualize mathematical functions and data interactively, but has grown to support many non-interactive uses such as web scripting. It is also used as a plotting engine by third-party applications like Octave. Uploaded package as tested on the Test Tool Shed. http://www.gnuplot.info/
  - Repository-Maintainer: Björn Grüning
  - Repository-Development: https://github.com/galaxy-iuc/tool_shed/
-
From iyad:
- package_blast_2_2_26: Legacy NCBI Blast tools v2.2.26, with tool_dependencies.xml adapted from the Blast Plus 2.2.26 repository. Based on the NCBI Blast Plus package repository, this package will build and install the legacy NCBI Blast tools v2.2.26 for various operating systems and architectures.
-
From devteam:
- package_fastqc_0_11_2: Uploaded from GH: fastqc v 0.11.2
Select Updates
-
From peterjc:
- blast_datatypes: v0.0.19, adds blastdbp and pssm-asn1 datatypes.
-
From devteam:
- structurefold: updated scripts, removed *.pyc and .DS_Store. Uploaded wrapper that correctly handles structure prediction without constraints.
-
From geert-vandeweyer:
- coverage_report: new version 0.0.3 (fix on headless R); changed tool.xml to request R 3.0.3; correction to png calls to use cairo instead of x11. Thanks to Eric Enns for pointing this out.
-
From peterjc:
- effectivet3: Uploaded v0.0.13, embed citation, relax test for floating point differences
- clinod: Uploaded v0.0.7, uses $GALAXY_SLOTS and embeds citation in tool XML.
- predictnls: Uploaded v0.0.7 with embedded citations
- blast_rbh: Uploaded v0.1.5, NCBI BLAST+ 2.2.30 etc
- tmhmm_and_signalp: Uploaded v0.2.6, embedded citations and uses $GALAXY_SLOTS
-
From crs4:
- mosaik2: Update Orione citation. Upgrade Mosaik dependency to v. 2.2.28 (2.2.30 is buggy, see https://github.com/wanpinglee/MOSAIK/issues/11 ). Use 2.1.78 neural networks. Add package_zlib_1_2_8 and package_samtools_0_1_19 dependencies. Add
.
Other News
- Doing data-intensive biology in Poland? There's a server for you! Become a user.
- From Yvan Le Bras: You want to test Stacks on Galaxy? Find the links towards our Genocloud VM on GUGGO training page (with a nod to Julian Catchen)
- Claus from the University of Oslo developed an app for checking Galaxy on your phone. You can help test it.
- From CRS4 Galaxy: Seven Mycoplasma hyosynoviae strains assembled using Orione CRS4 Galaxy
- From Enis Afgan: Just wrapped up my first talk at Supercomputing 14: Deciphering Big Data Stacks: An Overview of Big Data Tools.
- From Robert Davidson: My conference presentation on metabolomics and Galaxy Project at #MMW2014 citable via FigShare
- From Ron Horst: Been away for a month, come back to a new GVL with Galaxy, IPython, Rstudio. See the GVL Dashboard and GET your own from http://genome.edu.au
- From Edinburgh Genomics: Workshop announcement (Feb 2015): "Introduction to Python for Biologists"
- Discussion on Command to compare repositories in different Tool Sheds started by Peter Cock.
- December 31, 2014, is the last day for submissions to GigaScience that will not be charged an article processing charge.