RNA Bioinformatics: 2012

Thursday, 29 November 2012

Evolving an NZ home into a European one

I've been meaning to share this for ages.

I recently bought a house in New Zealand after living in Europe for nearly a decade. This caused a slight shock to my system: European homes are warm, New Zealand homes are not. My house is a fairly typical 1950s bungalow. It had no insulation in the ceiling, walls or underfloor, no double glazing and no central heating. There is a heat pump attached to the wall, but without insulation one is trying to warm the environment as well as one's house when it is on. Since I'm a miser, this makes me mad, so I've tried to do something about it.

Wednesday, 28 November 2012

Updated hottest researcher figure

I realise the previous version of this figure was not the prettiest thing to look at. I was divorced from the internet in my Dunedin hotel room last night, so I dedicated a bit of time to making it look more palatable. Enjoy!

Monday, 22 October 2012

Is Open Access for Free Too Much to Ask?

Google is now the second most valuable IT company. They made the bulk of their fortune by providing fast and accurate internet searches for free and are funded almost entirely by advertising. If you had told me this would happen a decade or two ago I would’ve thought it was ludicrous! Similarly, the 6th most visited website on the internet is an encyclopedia, called Wikipedia. Wikipedia is written entirely by amateurs and volunteers and is funded entirely by donations. This would have also seemed crazy a decade ago. Wikipedia is supported by a non-profit organisation called the Wikimedia Foundation that employs just 50 people.

Yet this IS the world we live in. Google and the Wikimedia Foundation are remarkably successful and influential businesses, and they show that unusual business models can work. However, the major academic publishing houses have languished. Large publishing houses continue to lock vital medical, basic science and engineering literature behind paywalls. A few major publishers provide a variety of author-pays open access (OA) models, the costs of which range from $300 to $5000 USD. Even with a strong NZ dollar, the average cost is equivalent to a few Summer Scholarships (to test wild research ideas), a new computer or 60GB of next-gen sequencing data at current costs (roughly 20 human genomes' worth of sequence data). Some rare publishers make all their articles open access (with the author’s permission) after 1-2 years.

A few new publishing models have been proposed. One of particular interest is the child of major German, UK and US funding agencies. These are The Max Planck Society (a publicly funded NGO named after theoretical physicist, Max Planck), The Wellcome Trust (founded by pharmaceutical magnate, Sir Henry Wellcome in 1936) and The Howard Hughes Medical Institute (founded by businessman, Howard Hughes in 1953). One can only assume that these charities have become tired of their donations being used to line the pockets of publishers. Therefore, in a cost-cutting exercise, they have launched their own journal, eLife, a new open access journal that is initially experimenting with free OA publishing. The first edition of the journal was released this week!

Other models include a hybrid of traditional publishing and preprint archiving pioneered by PeerJ, with a very reasonably priced Lifetime Subscription model. While on the subject of preprints, there is also the free (physics) preprint archive epitomised by arXiv.org. I’ve recently converted to using arXiv.org and have been very impressed by the near instantaneous indexing by Google Scholar. Also, arXiv.org is ranked very well by Google Scholar’s h5-index. This appears to be a great option for freeing your research, if your field is eligible (thank you qBIO).

As a follower of Impact Factors and other (better) measures of journal quality, I’m not a fan of new journals. Personally I think there are already too many journals. However, the eLife model is so novel, and has the backing of three of the most powerful funding agencies in the world, that it may have the traction to build a successful new journal.

So, what about the other publishing groups? What fees do they charge for OA? Are any providing good value for money? To investigate this I have obtained OA fee ranges from a probably biased table prepared by BMC. Then for each of these publishers I have used the ISI Web of Science(TM) database to look up the range of impact factors for each publisher’s top 5 journals. See figure 1 for a visualisation of this data.

Figure 1: The figure on the left shows the range of OA costs charged by each publishing house. The figure on the right shows the Impact Factor range for each publishing house’s top 5 journals.

Then I became curious: which publishers are providing the best OA deals in terms of dollars per impact factor point (figure 2)? Ignoring eLife for now, Wiley-Blackwell (W-B) may be charging $18.18USD/IF if one can really publish in the insanely high-impact (101.78) journal, “CA: A Cancer Journal for Clinicians”, for $1850USD. However, checking the guidelines for this journal, I found that the only OA option costs $3000USD. This drops W-B to 5th place. Next is the American Chemical Society (ACS), with a potential charge of $24.88USD/IF if one can publish in Chemical Reviews (IF:40.197) for $1000USD. This option is available to ACS Members and Affiliated Subscribers. Dues are $148.00 per year. This is a surprisingly reasonable deal for a publisher with a history of a strong anti-OA stance.

Figure 2: This figure shows the potential range of cost/IF-point for each publisher’s top 5 journals.

Now, the other end of the spectrum. Who is providing the worst deal? This is a difficult question to answer: almost all the publishers support journals with impact factors near zero that charge for OA, and any fee divided by a small IF gives a large value. However, let’s look at the worst deals in the data I have. The list is topped by America’s National Academy of Sciences (which only publishes 3 journals indexed by ISI). “Transportation Research Record” (IF:0.471), as far as I can tell from their website, has no open access policy at all. That is no OA deal at all. Next down the list is “mBio” (IF:5.3) which, according to BMC’s table, may be charging $3285USD for OA publishing. Checking their website, ASM members can publish for $2000USD and non-members for $3000USD. ASM membership costs $50, so this is probably a worthwhile investment. $2050/5.3 drops mBio to the 7th worst OA deal on my list. Next on my hitlist is Hindawi, a bunch of academic spammers if my inbox is anything to go by. Hindawi are potentially charging $1500USD to publish in “Journal of Biomedicine and Biotechnology” (IF:2.436). A quick trip to their website confirms this is the case. We finally have a winner for the worst deal award! Next down the list is, surprisingly, PLoS. PLoS’ 5th-ranked journal is “PLoS Computational Biology” (IF:5.215). According to their fee page, publication of research from middle- to high-income countries incurs a fee of $2250USD. This is cheaper than anticipated, so PLoS moves down to the 8th worst slot. Phew! Anthony Poole and I have just had an article accepted in PLoS CB. I think I’ll finish this tiresome game there.
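The dollars-per-IF-point numbers in this game are plain divisions, so they are easy to re-derive. Here is a minimal Python sketch of that arithmetic. The fees and impact factors are the ones quoted in this post; the function name `cost_per_if` and the layout of the `deals` dictionary are illustrative choices only, not the code behind the figures.

```python
def cost_per_if(fee_usd, impact_factor):
    """Return the OA fee in USD per impact factor point."""
    if impact_factor <= 0:
        raise ValueError("impact factor must be positive")
    return fee_usd / impact_factor


# (fee in USD, impact factor), as quoted in the post
deals = {
    "CA: A Cancer Journal for Clinicians (fee from BMC's table)": (1850, 101.78),
    "CA: A Cancer Journal for Clinicians (fee from journal guidelines)": (3000, 101.78),
    "Chemical Reviews (ACS member rate)": (1000, 40.197),
    "mBio (ASM member rate plus $50 membership)": (2050, 5.3),
    "PLoS Computational Biology": (2250, 5.215),
    "Journal of Biomedicine and Biotechnology": (1500, 2.436),
}

# Sort from best deal (fewest dollars per IF point) to worst.
ranked = sorted(deals.items(), key=lambda item: cost_per_if(*item[1]))

for journal, (fee, impact) in ranked:
    print(f"{journal}: ${cost_per_if(fee, impact):.2f}USD/IF")
```

This only ranks the handful of journals named in the text, so the positions it prints are not the same as the rankings across all publishers in figure 2.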

I have been terribly unfair to the publishers I have mentioned here (and probably the ones I haven’t). The rules of my game have been rather arbitrary. If I were to do this fairly I would survey a large number of academics from a number of disciplines to find out what OA fees they are really paying in which journals. I haven’t the time to do this; however, it is something that could be added to the next SOAP initiative.

Another issue I haven’t discussed is copyright. Some of the publishers still retain the copyright on OA articles, others do not. This topic is covered by other blogposts in the series.

In summary, open access is great but can be extremely expensive. Not all publishers are equal, so it is worth shopping around. Preprint archives can provide a nice intermediate solution. Finally, please buy Paul’s Patented Cognitive Enhancement Vitamin Formula, produced in association with Placeboceuticals.

Conflicts of interest:

1. I am an Assistant Editor in Chief for the Landes Bioscience journal, RNA Biology. I regularly invite contributions and referee articles for the journal. I receive no salary for this position. They did send me an iPad nearly 2 years ago after I cheekily asked for one when they advertised their new iPad App. This was very nice; my kids regularly use it for watching YouTube clips about “Lego” and “Thomas the Tank Engine”. I’m sometimes permitted to use it for email, Facebook, Twitter and as a Kindle.

2. I was funded by the Wellcome Trust for 4 years. They were wonderful: my contract specified that all my articles must be deposited in UKPMC within 6 months of publication, and they happily paid the OA fees when I managed to publish my work.

Abbreviations used in the above figures: T&F=Taylor & Francis, ACS=American Chemical Society, NAS=National Academy of Sciences, PLoS=Public Library of Science, BMC=BioMed Central, W-B=Wiley-Blackwell, BMJ=British Medical Journal, CUP=Cambridge University Press, OUP=Oxford University Press, NPG=Nature Publishing Group.

A version of this article is cross posted on the NZ Creative Commons Blog.

Open Access Week, 2012

Together with a number of other open access advocates, I have written a blog post for Open Access week, 2012. To read it and some other fantastic posts from my colleagues visit the NZ Creative Commons website.

Friday, 24 August 2012

NZ's hottest researchers from 2010-2012.

In response to the recently released "The Hottest Research of 2011" report from ScienceWatch (where several of my former colleagues at the Wellcome Trust Sanger Institute feature) I thought I'd take a look at NZ's hottest research and researchers.

Tuesday, 14 August 2012

Re-blogging: Rfam 11.0 is out!

The really BIG news for this release is the Xfam-Biomart, which finally allows researchers to easily fetch all the sequences in Rfam from their favourite organism. For example, let's pretend I was really interested in Helicobacter pylori 35A. I go to the NCBI Taxonomy, look the species up there and record the taxid (585535). Back at the Biomart I enter the taxid beside "NCBI Taxonomy ID:", hit "Next", then select a number of handy looking features, hit "Results" and suddenly I have ALL the sequences from Helicobacter pylori 35A. This is an extraordinarily useful feature that has, until now, been missing from the Xfam arsenal. I'll be making heavy use of it in future.

See the Xfam blog for more details.

Tuesday, 10 April 2012

LaTeX fun with periodic tables

For a while now I've wanted to generate a simple periodic table of elements in LaTeX. I've googled around a bit without too much joy. So I've made a simple one myself.

Here is the code:

\documentclass[a4paper,12pt]{article}
\pagestyle{empty}
\usepackage{rotating} % provides the sidewaystable environment
\linespread{1.2}

\begin{document}

\begin{sidewaystable}
{
\renewcommand{\arraystretch}{1.5} % taller rows
\bfseries
\centering
% 18 centred columns, one per group of the periodic table
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\cline{1-1} \cline{18-18}
H & \multicolumn{16}{|c|}{} & He \\
\cline{1-2} \cline{13-18}
Li & Be & \multicolumn{10}{|c|}{} & B & C & N & O & F & Ne \\
\cline{1-2} \cline{13-18}
Na & Mg & \multicolumn{10}{|c|}{} & Al & Si & P & S & Cl & Ar \\
\hline
K & Ca & Sc & Ti & V & Cr & Mn & Fe & Co & Ni & Cu & Zn & Ga & Ge & As & Se & Br & Kr \\
\hline
Rb & Sr & Y & Zr & Nb & Mo & Tc & Ru & Rh & Pd & Ag & Cd & In & Sn & Sb & Te & I & Xe \\
\hline
% * marks the position of the lanthanides, ** the actinides
Cs & Ba & * & Hf & Ta & W & Re & Os & Ir & Pt & Au & Hg & Tl & Pb & Bi & Po & At & Rn \\
\hline
Fr & Ra & ** & Rf & Db & Sg & Bh & Hs & Mt & Ds & Rg & Cn & Uut & Uuq & Uup & Uuh & Uus & Uuo \\
\hline
\end{tabular}

% the blank line above ends the paragraph inside the group, so \centering takes effect
}
\end{sidewaystable}

\end{document}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

And here is the resulting table:

Thursday, 15 March 2012

Two Lecturer positions in Bioinformatics/Genomics : Auckland / Palmerston North, New Zealand

Lecturer positions in NZ are as rare as Moa teeth. If you are interested read more at UniJobs, NewScientist and NatureJobs.

Thursday, 8 March 2012

PhD position in the evolution and bioinformatics of RNA in New Zealand

Anthony Poole and I are seeking a talented PhD candidate to explore the evolution and bioinformatics of RNA. Tell your friends!

Closing Date: 30 March 2012
For more, see the information sheet.

Thursday, 23 February 2012

RNA Biology provides incentives to review

The full text of the email from Renee Schroeder and Eva Riedmann from RNA Biology is below. The gist of it is that researchers get free subscriptions and discounts on publication costs in exchange for reviewing. This is a clever move by them (in my completely biased opinion). The journal, Nucleic Acids Research, is the only other journal I know of that offers incentives for reviews. They offer a few pounds towards books or CDs from their preferred suppliers in exchange for reviewing. This is nice, but frankly the selection from those sources is very limited. I'd much rather have full access to the journals I review for (actually, I'd rather everyone had full access, that's another battle). All too often I've wanted to look at the published version of a manuscript that I've reviewed and not had access to it. This is ridiculous.

Dear Research Community –

We are writing to you now because you have either served as a reviewer or have submitted a manuscript to the journal RNA Biology in the past.

RNA Biology will be instituting a reviewer incentive program offering free subscriptions and discounts on publication costs in exchange for timely reviews. To guarantee the success of this program we need to update our database and are requesting a few moments of your time.

Please login to the RNA Biology submission and peer-review website here:

http://rnabiol.msubmit.net/

and click on the link “Modify Profile/Password.”

If you could please ensure that your institutional affiliation, address, and email address are up to date AND please select up to five ‘Areas of Expertise’, we would greatly appreciate this. (For your convenience, we have listed the areas of expertise below.)

This will ensure that we are able to notify you of new developments with the journal. Additionally, this information will help ensure that we have a robust database from which to quickly identify appropriate peer-reviewers.

If you have any questions or concerns please contact us at rna@landesbioscience.com

Thank you very much for your help.

Sincerely,

Renee Schroeder
Editor-in-Chief
University of Vienna

Eva Riedmann, Ph.D.
Acquisitions Editor
Landes Bioscience

Areas of Expertise

Aptamers
biogenesis
bioinformatics
cancer
cell biology
chromatin
developmental biology
epigenetics
mechanism of translation
methods
miRNA
mRNA transport/localization
natural antisense
neurobiology/neurological disease
prokaryotes
protein-RNA interactions
regulation of stability/degradation
regulation of translation
ribonucleases
ribosome
riboswitches
ribozymes
RNA binding proteins
RNA damage/repair
RNA in disease
RNA stability/degradation
RNA viruses
RNomics
siRNA
small and large non-coding RNAs
splicing/pre-mRNA processing
therapeutics
transcriptome
tRNA

Wednesday, 22 February 2012

Fetching sequences from EMBL/ENA using wget/curl

Every time I want to download several EMBL files (e.g. all the bacterial genomes) I spend at least an hour trying to find the right URL syntax. This post is a public note to self that will help me next time and perhaps help others who are also receiving a few lines of HTML when all they want is a verdammt plain-text EMBL formatted file.

There is actual documentation on the right syntax here, which again takes a while to find; searching for wget, curl, EMBL and various related combinations doesn't get you there quickly. However, the main issue I have is that if I go to the recommended sequence record, e.g. here, none of the links work with a simple "wget URL" or "curl -G URL".

So, if I want to fetch the Roseobacter denitrificans genome sequence with EMBL accession CP000362, I use:
wget http://www.ebi.ac.uk/Tools/dbfetch/dbfetch/embl/CP000362
or if you're into curl:
curl -G http://www.ebi.ac.uk/Tools/dbfetch/dbfetch/embl/CP000362 > CP000362.embl

Simple!
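For the "several EMBL files" case, the same URL pattern can be looped over a list of accessions. The following is a minimal Python sketch using only the standard library. The base URL is copied from the wget/curl commands above and is assumed to still resolve; the function names `dbfetch_url` and `fetch_records` are hypothetical helpers, not part of any EBI client.

```python
import urllib.request

# Base URL taken from the wget/curl commands above (assumed to be unchanged).
DBFETCH_BASE = "http://www.ebi.ac.uk/Tools/dbfetch/dbfetch"


def dbfetch_url(accession, database="embl"):
    """Build a dbfetch URL in the same form as the commands above."""
    return f"{DBFETCH_BASE}/{database}/{accession}"


def fetch_records(accessions, database="embl"):
    """Download each accession and save it as <accession>.embl."""
    for accession in accessions:
        url = dbfetch_url(accession, database)
        with urllib.request.urlopen(url) as response:
            text = response.read().decode("utf-8")
        with open(f"{accession}.embl", "w") as handle:
            handle.write(text)
```

Calling `fetch_records(["CP000362"])` should then write CP000362.embl to the current directory, much like the curl command above; add further accessions to the list to fetch more genomes in one go.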

Sunday, 12 February 2012

Excited about eQTLs

During my time at the Sanger Institute I heard many talks from people in Manolis Dermitzakis' group on expression quantitative trait loci (eQTLs). For practical purposes these eQTLs are SNPs that are strongly correlated with expression level, e.g. a population's genotypes at one site might be AA, AG and GG, and a nearby gene might have corresponding median expression levels of 2, 4 and 6 (arbitrary units) across multiple genotyped individuals. Something I've always thought would be very interesting to look at was the functional characterisation of the sites these SNPs lie in. A recent paper by Gaffney et al entitled "Dissecting the regulatory architecture of gene expression QTLs" has made some inroads into this problem. It looks like they've focussed on the promoter regions and found that ~40% are in open chromatin structures and are enriched in transcription factor binding sites. My interests are of course more on the putative cis-regulatory elements such as structured UTR elements (e.g. IREs) and microRNA binding sites that the eQTLs can presumably influence. So it looks like there are still many fun projects that these datasets can spawn.
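To make the AA/AG/GG example concrete, here is a toy Python sketch of what "strongly correlated with expression level" means. The nine individuals and their expression values are invented purely for illustration (chosen so the medians come out at 2, 4 and 6), and the helper names are hypothetical. Real eQTL studies use proper regression models with covariates and multiple-testing correction, none of which is attempted here.

```python
from statistics import median

# Hypothetical (genotype, expression level in arbitrary units) pairs.
samples = [
    ("AA", 1.8), ("AA", 2.0), ("AA", 2.3),
    ("AG", 3.7), ("AG", 4.0), ("AG", 4.4),
    ("GG", 5.6), ("GG", 6.0), ("GG", 6.1),
]


def g_dosage(genotype):
    """Count copies of the G allele (AA -> 0, AG -> 1, GG -> 2)."""
    return genotype.count("G")


def median_expression(samples):
    """Median expression level for each genotype class."""
    by_genotype = {}
    for genotype, level in samples:
        by_genotype.setdefault(genotype, []).append(level)
    return {g: median(levels) for g, levels in by_genotype.items()}


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


medians = median_expression(samples)
dosages = [g_dosage(genotype) for genotype, _ in samples]
levels = [level for _, level in samples]
r = pearson_r(dosages, levels)

print(medians)
print(f"correlation between G dosage and expression: {r:.3f}")
```

Encoding the genotype as an allele count is what lets a single correlation (or regression slope) capture the additive 2, 4, 6 pattern.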
