December 2014 Galaxy Update

Galaxy Updates

Welcome to the December 2014 Galaxy Update, a summary of what is going on in the Galaxy community. Galaxy Updates complement the Galaxy Development News Briefs which accompany new Galaxy releases and focus on Galaxy code updates.

Events

Galaxy Day: 3 December, Paris

The French Working Group GALAXY-IFB (Institut Français de Bioinformatique) is organizing a second Galaxy Day around the Galaxy portal. The event will be at Institut Curie in Paris over two days. This year, we want to involve two communities: biologists (also known as Galaxy 'users') and bioinformaticians (Galaxy 'developers'). The goal is to present user experience around the portal, from a single user to a wider community:

  • Dec 3 (09:00-17:00)

    • Galaxy user experiences, and discussion of how the platform is (or is not) useful for building analyses.
    • Technology talks (new environment, Galaxy in production, ...)

Interested? Please contact [ifb DOT galaxy AT sb DASH roscoff DOT fr](mailto:ifb DOT galaxy AT sb DASH roscoff DOT fr) for more information.

The French IFB Galaxy Working Group:

URGI, GenoToul, MIGALE, PFEM, SouthGreen, Institut Curie, ABiMS

Intro to Galaxy Workshop, Dec 12, Virginia State U

If you are anywhere close to Richmond, Virginia and you want to attend an Introduction to Galaxy Workshop, then you are in luck. Thanks to sponsorship from the College of Natural and Health Sciences at Virginia State University, the workshop is free and open to all researchers and students. The workshop is split into morning and afternoon sessions. No knowledge of programming or command line interfaces is required.

However, space is limited, so you are encouraged to register right now.

GCC2015: 6-8 July, Norwich UK

2015 Galaxy Community Conference (GCC2015)

Save these dates: 6-8 July 2015

Call for Sponsors

The 2015 Galaxy Community Conference (GCC2015) is now accepting sponsorships. Your organisation can play a prominent part in the Galaxy community by sponsoring GCC2015, and sponsorship is an excellent way to raise your organisation's visibility.

Several sponsorship levels are available, including two levels of premier sponsorships that include presentations. Premier sponsorships are limited, however, so you are encouraged to act soon.

Please let the GCC2015 Organising Committee know if you are interested in helping make this event a success.

GCC2015 Training Day Topic Nominations ...

... will open shortly. The topics offered at the GCC2015 Training Day will be determined by you, the Galaxy Community. Watch the Galaxy channels for the announcement; nominated topics will be voted on by the community early next year.

Please start giving some thought to what topics you would like to see covered at the GCC2015 Training Day.

Other Events

There are upcoming events on two continents. See the Galaxy Events Google Calendar for details on other events of interest to the community.

| Date | Topic/Event | Venue/Location | Contact |
|---|---|---|---|
| December 3 | Galaxy Day | Institut Curie, Paris, France | IFB Galaxy |
| December 5-8 | Next Generation Data Analysis Workshop | UC Riverside, Riverside, California, United States | Rakesh Kaundal |
| December 9-11 | Microarray data analysis on Galaxy | BIRD IFB core facility Nantes University/INSERM, Nantes, France | Raluca Teusan, Audrey Bihouée, Edouard Hirchaud |
| December 12 | Introduction to Galaxy Workshop | Virginia State University, Petersburg, Virginia, United States | Glenn Harris, Dave Clements |
| December 16-19 | RNA-Seq and ChIP-Seq Analysis with Galaxy | UC Davis, California, United States | UC Davis Bioinformatics Training |
| 2015 | | | |
| January 10-14 | Galaxy for SNP and Variant Data Analysis | Plant and Animal Genome XXIII (PAG2015), San Diego, California, United States | Dave Clements |
| February 9-13 | Analyse bioinformatique de séquences sous Galaxy | Montpellier, France | J.F. Dufayard |
| February 16-18 | Accessible and Reproducible Large-Scale Analysis with Galaxy | Genome and Transcriptome Analysis, part of Molecular Medicine Tri-Conference, San Francisco, California, United States | James Taylor |
| February 16-18 | Large-Scale NGS data Analysis on Amazon Web Services Using Globus Genomic | Genomics & Sequencing Data Integration, Analysis and Visualization, part of Molecular Medicine Tri-Conference, San Francisco, California, United States | Ravi Madduri |
| February 16-18 | iReport: An Integrative “omics” Reporting and Visualisation Platform | | Andrew Stubbs |
| July 6-8 | 2015 Galaxy Community Conference (GCC2015) | The Sainsbury Lab, Norwich, United Kingdom | Galaxy Outreach |

New Papers

96 papers referencing, using, extending, and implementing Galaxy were added to the Galaxy CiteULike Group in November, a new record. Some of those papers are:

The new papers covered these topics:

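| # | Tag | # | Tag | # | Tag | # | Tag |
|---|---|---|---|---|---|---|---|
| 4 | Cloud | 1 | Project | 13 | Tools | 13 | UsePublic |
| 1 | HowTo | 10 | RefPublic | - | UseCloud | - | Visualization |
| 7 | IsGalaxy | 1 | Reproducibility | 3 | UseLocal | 32 | Workbench |
| 34 | Methods | 3 | Shared | 8 | UseMain | | |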

Who's Hiring

Please Help! Yes you!

The Galaxy is expanding! Please help it grow.

Got a Galaxy-related opening? Send it to outreach@galaxyproject.org and we'll put it in the Galaxy News feed and include it in next month's update.


Mailing Lists Moved

The Galaxy mailing lists have been moved from the lists.bx.psu.edu domain to the lists.galaxyproject.org domain. This transition should be largely transparent, but there are a few things to be aware of:

  • List sender addresses and headers will change to reflect the updated domain: from bx.psu.edu to galaxyproject.org.
  • Existing email filters you have set up may require adjustments.
  • Posts from lists.galaxyproject.org could be categorized as spam until you train your filtering method.

The prior bx.psu.edu list posting email addresses will continue to accept email, which will be forwarded to the new list addresses.
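If you keep list addresses in scripts, address books, or filter rules, the domain change can be applied mechanically. The following is a minimal sketch only: the function name `migrate_list_address` and the example address are illustrative assumptions, not part of any Galaxy tooling, and only the two domain names come from the announcement above.

```python
# Domains taken from the announcement; everything else here is illustrative.
OLD_DOMAIN = "lists.bx.psu.edu"
NEW_DOMAIN = "lists.galaxyproject.org"


def migrate_list_address(address: str) -> str:
    """Return the address with the old list domain swapped for the new one.

    Addresses on any other domain (or strings without an '@') are
    returned unchanged.
    """
    local, sep, domain = address.rpartition("@")
    if sep and domain.lower() == OLD_DOMAIN:
        return f"{local}@{NEW_DOMAIN}"
    return address


if __name__ == "__main__":
    # Hypothetical example address on the old domain.
    print(migrate_list_address("galaxy-dev@lists.bx.psu.edu"))
```

Matching on the whole domain, rather than doing a plain substring replace, avoids rewriting unrelated addresses that merely contain `bx.psu.edu` somewhere else.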

New Galaxy-Training Mailing List

The Galaxy-Training mailing list is for anything related to training with Galaxy and training about Galaxy. Galaxy-Training is also the official mailing list of the Galaxy Training Network, a worldwide group of organizations that give Galaxy-based training.

If you are at all interested in Galaxy Training then you are encouraged to join the list. Messages to this list are publicly archived and will be included in the Galaxy search engine as well.

Galaxy Training Network

And please welcome the latest member of the Galaxy Training Network:

This brings total membership to 22 training organizations on 5 continents.

New Public Servers

Two new public Galaxy servers were announced in November:

Pitagora-Galaxy

  • Links:

  • Domain/Purpose:

    • The public, general-purpose Galaxy server of the Pitagora-Galaxy Project. This server is intended for testing and sharing. Heavy analysis should be performed using the project's identical virtual machine (VM) or Amazon Machine Image (AMI).
  • Comments:

    • "We are running a website for sharing users' know-how, and distributing a virtual environment where we configured Galaxy with selected workflows and tools. Now, you can perform our analysis workflows on the following three environments.

      • Access to the public web site for testing.
      • Download the virtual machine to your own PC or server.
      • Launch AMI (Amazon machine image) on AWS cloud.

      Since Pitagora-Galaxy enables us to run the same workflows on any infrastructure and rebuild the environments in any time, we can quickly use Galaxy, and at the same time, ensure the reproducibility of the analyses. In addition, we plan to add a connector for Garuda Desktop, a desktop application platform, for data analyses that cannot be covered only with Galaxy tools."
  • User Support:

    • Email: [Ryota Yamanaka](mailto:yamanaka AT genome DOT rcast.u-tokyo.ac.jp)
  • Quotas:

    • Public Server:

    • VM and AMI:

      • None.
  • Sponsor(s):

PIA

Galaxy Community Hubs

There were no new Log Board or Deployment Catalog entries in November! Eek! Please don't let this happen again!

The Community Log Board and Deployment Catalog Galaxy community hubs were launched last year. If you have a Galaxy deployment or an experience you want to share, then please publish it this month.

New Releases

BioBlend v0.5.2 was released in October. BioBlend is a Python library for interacting with CloudMan and the Galaxy API.

New versions of Galaxy, CloudMan, and blend4j were all released in August.

Look for a new Galaxy distribution soon.


Galaxy ToolShed

ToolShed Contributions

Galaxy Project ToolShed Repos

Here are new contributions for the past two months.

In no particular order:

Tools

  • From peterjc:

    • mira_datatypes: Defines 'mira' datatype for the MIRA Assembly Format. Note that Galaxy already has a 'maf' datatype for the Multiple (sequence) Alignment Format (MAF). This is specifically for the MIRA Assembly Format (also called MAF). Previously only on the Test Tool Shed.
    • clc_assembly_cell: Galaxy wrapper for the CLC Assembly Cell suite from CLCBio. This is a wrapper for the commercial "CLC Assembly Cell" suite from CLCBio which includes a de novo assembler and read mapper: http://www.clcbio.com/products/clc-assembly-cell/ Uploaded v0.0.2, previously only on the Test Tool Shed.
    • seq_composition: Uploaded v0.0.1 (with embedded citation). Sequence composition. Counts the letters in given sequence files, returning a table listing them with percentages. Suitable for use on assemblies or gene/protein sets. Probably not suitable for raw NGS reads.
    • mira4_assembler: MIRA 4.0 assembler. Wrapper for core functionality of the assembly tool MIRA 4.0. Accepts data from Solexa/Illumina, Roche 454, Ion Torrent, PacBio and Sanger capillary sequencing. The key MIRA output files are captured, but the other files are deleted when the job finishes. Uploaded v0.0.4, previously only on the Test Tool Shed.
    • samtools_depad: Runs "samtools depad" to remap a SAM/BAM file using a padded reference (with gap characters) giving a new BAM file using an unpadded (ungapped) reference. Uploaded v0.0.1, previously only on the Test Tool Shed.
    • coverage_stats: This tool runs the commands samtools idxstats and samtools depth from the SAMtools toolkit, and parses their output to produce a concise summary of the coverage information for each reference sequence.
  • From iuc:

    • bedtools: bedtools: a powerful toolset for genome arithmetic. Collectively, the bedtools utilities are a swiss-army knife of tools for a wide range of genomics analysis tasks. The most widely used tools enable genome arithmetic: that is, set theory on the genome. For example, bedtools allows one to intersect, merge, count, complement, and shuffle genomic intervals from multiple files in widely used genomic file formats such as BAM, BED, GFF/GTF, VCF. While each individual tool is designed to do a relatively simple task (e.g., intersect two interval files), quite sophisticated analyses can be conducted by combining multiple bedtools operations.

  • From acrylamide:

    • misc_tool_workflow_linkers: Contains my Tool Factory tools. So far includes only one tool for dealing with seqprep output's gzip'd nature.
  • From big-tiandm:

    • boost_graph: Perl module Boost-Graph. This is a Perl module named Boost-Graph.
  • From pjbriggs:

    • weeder2: Motif discovery in sequences from coregulated genes of a single species. Weeder2 is a program for finding novel motifs (transcription factor binding sites) conserved in a set of regulatory regions of related genes.

Tool Suites

Packages / Tool Dependency Definitions

  • From agordon:

    • package_datamash_1_0_6: GNU Datamash is a grouping and summarizing tool on tabular data files. GNU Datamash is a command-line program which performs basic numeric, textual and statistical operations on input textual data files. It is designed to be portable and reliable, and to aid researchers in easily automating analysis pipelines without writing code or even short scripts. Home page: http://www.gnu.org/software/datamash
  • From iuc:

    • package_gnuplot_4_6: Contains a tool dependency definition that downloads and compiles version 4.6 of gnuplot. Gnuplot is a portable command-line driven graphing utility for Linux, OS/2, MS Windows, OSX, VMS, and many other platforms. It was originally created to allow scientists and students to visualize mathematical functions and data interactively, but has grown to support many non-interactive uses such as web scripting. It is also used as a plotting engine by third-party applications like Octave. Uploaded package as tested on the Test Tool Shed. http://www.gnuplot.info/

  • From iyad:

    • package_blast_2_2_26: Adapted tool_dependencies.xml from Blast Plus 2.2.26 repository. Legacy NCBI Blast tools v2.2.26. Based on the NCBI Blast Plus package repository, this package will build and install the legacy NCBI Blast tools v2.2.26 for various operating systems and architectures.
  • From devteam:

Select Updates

  • From peterjc:

  • From devteam:

    • structurefold: updated scripts, removed *.pyc and .DS_Store. Uploaded wrapper that correctly handles structure prediction without constraints.
  • From geert-vandeweyer:

    • coverage_report: New version 0.0.3 (fix on headless R); changed tool.xml to request R 3.0.3; corrected png calls to use cairo instead of x11. Thanks to Eric Enns for pointing this out.
  • From peterjc:

    • effectivet3: Uploaded v0.0.13, embed citation, relax test for floating point differences
    • clinod: Uploaded v0.0.7, uses $GALAXY_SLOTS and embeds citation in tool XML.
    • predictnls: Uploaded v0.0.7 with embedded citations
    • blast_rbh: Uploaded v0.1.5, NCBI BLAST+ 2.2.30 etc
    • tmhmm_and_signalp: Uploaded v0.2.6, embedded citations and uses $GALAXY_SLOTS
  • From crs4:

Other News