November 2013 Galaxy Update
Welcome to the November 2013 Galaxy Update, a monthly summary of what is going on in the Galaxy community. Galaxy Updates complement the Galaxy Development News Briefs which accompany new Galaxy releases and focus on Galaxy code updates.
New Public Servers
Two new servers were added to the list of publicly accessible Galaxy servers in October, including one that covers some radically new territory for Galaxy:
CoSSci: Complex Social Science Gateway
Tools for solving Galton's problem in Comparative Research and complex network problems in Social Science.
The CoSSci Gateway is a public Galaxy server portal linked to an SDSC XSEDE Gateway for R-based analysis of multiple social science databases. CoSSci is hosted by the Institute for Mathematical Behavioral Science at UC Irvine. The initial six databases are anthropological (cross-cultural, foragers, etc.) and will expand to include cross-national (political, psychological, economic), regional, comparative urban and kinship networks, and solutions to identifying network-based structural cohesion and its consequences.
The initial modeling software solves for autocorrelation effects (spatial, phylogenetic histories, common language, common environment, larger polities and religion, etc.). Dow-Eff software performs autocorrelation regression. logit provides multiple imputations of missing data for single or multiple overlapping datasets, maps, and also incorporates Bayesian and graphical approaches to testing for causal inferences and controlling for biases.
A Visual Manual is available.
BioCiphers Lab Galaxy
The BioCiphers Lab Galaxy provides a user-friendly interface for analysis tools developed by the BioCiphers Lab at the University of Pennsylvania. This includes the AVISPA alternative splicing prediction and analysis tool. See Barash, et al., "AVISPA: a web tool for the prediction and analysis of alternative splicing," Genome Biology 2013, 14:R114. BCL Galaxy offers email support. It is open to anyone from an educational organization.
# New Papers
# | Tag | # | Tag | # | Tag | # | Tag
---|---|---|---|---|---|---|---
2 | Cloud | 1 | Project | 2 | Tools | 3 | UsePublic
1 | HowTo | - | RefPublic | - | UseCloud | - | Visualization
1 | IsGalaxy | 1 | Reproducibility | 4 | UseLocal | 14 | Workbench
29 | Methods | 1 | Shared | 8 | UseMain | |
A record 53 new papers were added to the Galaxy CiteULike Group in October. Some papers that may be particularly interesting to the Galaxy community:
- Li, et al., "Expanding roles in a library-based bioinformatics service program: a case study," Journal of the Medical Library Association: JMLA, Vol. 101, No. 4 (October 2013), pp. 303-309, doi:10.3163/1536-5050.101.4.012
- Nagasaki, et al., "DDBJ Read Annotation Pipeline: A Cloud Computing-Based Pipeline for High-Throughput Analysis of Next-Generation Sequencing Data," DNA Research, Vol. 20, No. 4 (1 August 2013), pp. 383-390, doi:10.1093/dnares/dst017
And ...
Ten Simple Rules for Reproducible Computational Research
Geir Kjetil Sandve, Anton Nekrutenko, James Taylor, and Eivind Hovig's paper "Ten Simple Rules for Reproducible Computational Research" was published in PLoS Computational Biology in October.
Those 10 rules are (in our opinion) worth repeating:
- For Every Result, Keep Track of How It Was Produced
- Avoid Manual Data Manipulation Steps
- Archive the Exact Versions of All External Programs Used
- Version Control All Custom Scripts
- Record All Intermediate Results, When Possible in Standardized Formats
- For Analyses That Include Randomness, Note Underlying Random Seeds
- Always Store Raw Data behind Plots
- Generate Hierarchical Analysis Output, Allowing Layers of Increasing Detail to Be Inspected
- Connect Textual Statements to Underlying Results
- Provide Public Access to Scripts, Runs, and Results
If you care about reproducible research, then take a look.
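As a small illustration of what three of these rules can look like in practice (tracking how a result was produced, recording program versions, and noting random seeds), here is a minimal Python sketch. It is not taken from the paper: the function names, the seed value, and the layout of the record are all hypothetical and chosen only for illustration.

```python
import json
import platform
import random
import sys


def run_analysis(seed, n_draws=5):
    """Toy analysis (hypothetical): draw n_draws numbers from a seeded generator."""
    # A private generator means the seed alone determines the draws.
    rng = random.Random(seed)
    return [rng.random() for _ in range(n_draws)]


def provenance_record(seed, n_draws, result):
    """Bundle a result with the information needed to reproduce it."""
    return {
        "python_version": platform.python_version(),  # interpreter version used
        "command": list(sys.argv),                     # how the script was invoked
        "parameters": {"seed": seed, "n_draws": n_draws},
        "result": result,
    }


if __name__ == "__main__":
    seed = 20131101  # hypothetical seed, for illustration only
    n_draws = 5
    result = run_analysis(seed, n_draws)
    # Store the record next to the result, e.g. by redirecting this output to a file.
    print(json.dumps(provenance_record(seed, n_draws, result), indent=2))
```

Rerunning the script with the same seed on the same Python version should give the same draws, and the printed record keeps the seed and version alongside the result rather than in someone's memory.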
Who's Hiring
The Galaxy is expanding! Please help it grow.
- Fixed-term contract (CDD) in bioinformatics: detecting structural variants by resequencing (NGS)
- Master 2 internship: development and integration of tools for bioanalysis in the Galaxy environment
- Statistical Genomics Postdoc opening in the Makova lab at Penn State
- Computational biology opening at University Pierre-et-Marie-Curie, Paris
- The Galaxy Project is hiring software engineers and post-docs.
Got a Galaxy-related opening? Send it to outreach@galaxyproject.org and we'll put it in the Galaxy News feed and include it in next month's update.
Events
GCC2014: June 30 - July 2, Baltimore
The 2014 Galaxy Community Conference (GCC2014) will be held June 30 through July 2, at the Homewood Campus of Johns Hopkins University, in Baltimore, Maryland, United States.
Galaxy Community Conferences are an opportunity to participate in presentations, discussions, poster sessions, lightning talks and breakouts, all about high-throughput biology and the tools that support it. The conference will also include a Training Day offering in-depth topic coverage, across several concurrent sessions. See the GCC2013 web site for an idea of what happens at a Galaxy Community Conference.
Galaxy Day, December 4, Paris
The IFB (French Bioinformatic Institute) Galaxy working group is organizing a seminar around the Galaxy platform, which will take place in Paris on December 4 (at Institut Curie, 9:30-17:00).
The main aim of this meeting is to share experiences from laboratories and platforms working daily with Galaxy. Thirteen talks have been selected to present insights about installations and releases of Galaxy platforms, new usages / domains, and technological developments.
Registration is open and free, but limited to 90 people.
UC Davis Bioinformatics Boot Camps
Registration is now open for Bioinformatics Bootcamps in December!
The next offering of UC Davis Bioinformatics Bootcamps will be held on the UC Davis campus December 10-13.
These focused one-day courses are for researchers looking to get up to speed quickly on the latest technologies and techniques in bioinformatics. Students will work on their own laptops and have continued access to software and example data used in the exercises through a public Amazon Web Services virtual machine. The first three bootcamps will use the Galaxy platform, and the final bootcamp will use both Galaxy and the command-line. The Alignment and Assembly bootcamps (Dec. 11th & 12th) require you to know Galaxy, so if you are unfamiliar with Galaxy, you should also take the Introduction bootcamp on Dec. 10th.
Tuesday, December 10: Introduction to Next-Generation Sequence Analysis with Galaxy
Wednesday, December 11: Next-Generation Sequence Alignment and Variant Discovery
Thursday, December 12: Genome Assembly using Next-Generation Sequence Data
Friday, December 13: Introduction to the Amazon Cloud for Galaxy and the Command-Line
Enrollment for each bootcamp will be capped at 24 students. Please enroll early to be assured of a seat, as these bootcamps usually fill up quickly! More information, including full descriptions of each bootcamp, is available online. The cost for each bootcamp is $200 (academic/government) or $250 (non-academic/industry).
Other Events
There is a lot going on in the next three months. Also see the [Galaxy Events Google Calendar](http://bit.ly/gxycal) for details on other events of interest to the community.
Lifeportal at the University of Oslo
The University of Oslo (UiO), the host of this year's Galaxy Community Conference, launched Lifeportal in October. While Lifeportal is not a public Galaxy server, it is available to all research institutions in Norway and their collaborators abroad.
From the Lifeportal website: Lifeportal gives you easy access to the High Performance Computing cluster Abel at the University of Oslo. The Galaxy based Lifeportal has a continuously growing list of services, and among them the most widely used tools from the former Bioportal.
The official opening event was on October 9 and included talks and demonstrations by Nils Christophersen, Rector Ole Petter Ottersen, Nikolay Vazov, Katerina Michalickova, Sveinung Gundersen, and Geir Kjetil Sandve, all of whom spoke and/or presented their work at GCC2013. Rector Ole Petter Ottersen commented that "The slogan of the university is 'reaching for the stars.' Great to open an entire galaxy today."
See this article for more on Lifeportal and why UiO chose Galaxy.
Galaxy Distributions
The most recent Galaxy distribution was released on August 12.
A new version of CloudMan was released in July.
Tool Shed Contributions
There were many...
- spp_tool: Cross-correlation analysis package (SPP)
- micomplete: Completeness report for (single-cell amplified) genomes
- BLAT: a very fast sequence alignment tool similar to BLAST
- sequel: correct errors (i.e., insertions, deletions, substitutions) in contigs output from assembly
- gatk_2_7
- macs2: Model-based Analysis of ChIP-Seq (macs2)
- sample: sample records from the input file(s). Supports paired data if paired files are in sync.
- interproscan5: functional annotations/predictions
- cuffmerge, cuffcompare, cuffdiff, and cufflinks
- muscle: multiple alignment tool
- CLC Assembly Cell (CLCbio) (under development and looking for comments)