Monday, March 17, 2014

Course in Molecular Neuroanatomy Group Project

I was lucky enough to visit Okinawa again for the Course in Molecular Neuroanatomy. Thanks to OIST and Allen Brain Atlas.

As a tutor I had to facilitate a group project. I was tasked with the auditory cortex and after a quick google search I found this paper:
Gene Expression Identifies Distinct Ascending Glutamatergic Pathways to Frequency-Organized Auditory Cortex in the Rat Brain, by Storace, Higgins, Chikar, Oliver, and Read

It seems the work was all done in rat. I sent the paper to the group and we checked some of its findings against the Allen mouse datasets. We were happy to find agreement in the Allen mouse gene expression and connectivity atlases, although we only checked the results regarding VGLUT1 and cortex <-> thalamus connections.

Here's the powerpoint presentation that summarizes our work.

Updates (based on response from Douglas Storace):

So it's pretty easy to do a quick spatial correlative search in the Allen Brain atlas, to find other genes that have a similar pattern.

So the one gene we mention with a similar pattern is Kcnma1. I just tried to reproduce that on the Allen website and it's not so easy.

Here's the steps:

  1. search for slc17a7
  2. check the box for the sagittal assay (the first one)
  3. go to the correlation box on the right, leave basic cell groups checked, and also check thalamus
  4. click search in the correlation box
  5. Kcnma1 shows up as 6th.


It seems that search wasn't spatially restricted to thalamus as we have in the slides. You could try variations on that to find more genes - like selecting all three Slc17a7 assays.
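
For anyone who wants to try a spatially restricted version offline, here's a minimal sketch of the idea: correlate each gene's expression profile with the seed gene, using only the voxels inside a region mask. This is not how the Allen website computes its correlations (I don't know those details). The gene names `GeneA` and `GeneB`, the toy values and the mask are all made up for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # flat profile: correlation is undefined, treat as 0
    return cov / math.sqrt(var_x * var_y)

def rank_by_spatial_correlation(seed, expression, mask):
    """Rank genes by correlation with `seed`, using only voxels where mask is True."""
    keep = [i for i, inside in enumerate(mask) if inside]
    seed_profile = [expression[seed][i] for i in keep]
    scores = []
    for gene, profile in expression.items():
        if gene == seed:
            continue
        restricted = [profile[i] for i in keep]
        scores.append((gene, pearson(seed_profile, restricted)))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Toy data (hypothetical): one expression value per voxel for each gene.
expression = {
    "Slc17a7": [1.0, 2.0, 3.0, 4.0, 0.0, 0.0],
    "GeneA":   [2.0, 4.0, 6.0, 8.0, 9.0, 9.0],
    "GeneB":   [4.0, 3.0, 2.0, 1.0, 0.0, 0.0],
}
# Hypothetical mask: True for voxels inside the region of interest.
thalamus_mask = [True, True, True, True, False, False]

ranking = rank_by_spatial_correlation("Slc17a7", expression, thalamus_mask)
```

Changing the mask changes the ranking, which is why a whole-brain search and a thalamus-only search can return different gene lists.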

When you get those lists, you should go through them by hand and take a look for gradients. We did this quickly for our project so the confidence we have in Kcnma1 is limited.

Also, one neat thing we found in the human data is that VGLUT2 is highly expressed in subcortical auditory regions like the cochlear nuclei, inferior colliculus and others. It's expressed all over, but it's relatively higher in those regions.

Tuesday, February 18, 2014

Axon guidance and SNPs

This is a warning to those doing pathway-type analysis on SNPs extracted from Illumina 660-type chips, specifically when looking at brain phenotypes.

Recently, I took a quick approach of grabbing a bunch of genes from some top (but non-significant) SNPs from an analysis of fMRI data. I didn't expect any GO groups to be significant, so I was shocked to see axon guidance on top with a super significant p-value (10^-17 after correction). A bunch of other GO groups come out on top too (see below). We were doing brain stuff so we liked the GO groups but were suspicious for a number of reasons.

Here's a quote that sums up the problem:
"Such an analysis of gene set enrichment is based on the assumptions that all genes are sampled independently from each other with the same probability. These assumptions are violated with data from GWA studies as (i) longer genes usually have more SNPs resulting in a higher probability of being sampled and (ii) overlapping genes are sampled in clusters (Holmans et al., 2009)."

That text is from:
Gowinda: unbiased analysis of gene set enrichment for genome-wide association studies
Robert Kofler and Christian Schlötterer

One quick test I did was to run another gene ontology analysis where the genes are sorted according to how many SNPs they have on the genotyping chip. That confirms the problem. Here are some of the top groups from a quick GOrilla analysis:


GO Group                                FDR q-value
cell adhesion                           3.43E-017
biological adhesion                     2.04E-017
neuron projection guidance              1.39E-017
axon guidance                           1.04E-017
single-organism process                 1.84E-015
synaptic transmission                   3.16E-014
single-organism cellular process        1.20E-013
cellular component movement             1.40E-013
single-multicellular organism process   2.30E-011
ion transport                           2.51E-011
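
The SNP-count ranking behind that check can be sketched roughly like this, assuming you have a SNP-to-gene annotation for the chip. The SNP ids and gene names below are hypothetical placeholders, not real chip content. The output is a ranked gene list, most SNPs first, which is the kind of input a ranked-list GO tool takes.

```python
from collections import Counter

def rank_genes_by_snp_count(snp_to_gene):
    """Return gene symbols sorted by how many chip SNPs map to them (most first)."""
    # SNPs that map to no gene (None) are ignored.
    counts = Counter(gene for gene in snp_to_gene.values() if gene)
    # Ties are broken alphabetically so the ranking is reproducible.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gene for gene, _ in ordered]

# Hypothetical annotation: SNP id -> gene symbol (None for intergenic SNPs).
snp_to_gene = {
    "rs1": "GENE_LONG", "rs2": "GENE_LONG", "rs3": "GENE_LONG",
    "rs4": "GENE_MID", "rs5": "GENE_MID",
    "rs6": "GENE_SHORT",
    "rs7": None,
}

ranked = rank_genes_by_snp_count(snp_to_gene)
```

If GO groups come out significant from a list ranked this way, the chip design alone is enough to produce them.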

This shows a pretty clear bias and I'm guessing it's in part biological. I'd reckon that lots of variation in neuronal guidance genes allows a diversity of brain wiring.

The next step was to hook up Gowinda for the human data and redo the analysis. The result was no significant GO groups.
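
To show why sampling SNPs instead of genes removes the bias, here's a minimal permutation sketch. It is not Gowinda's code, just my simplified take on the general idea: draw random SNP sets of the same size from the chip, map them to genes, and see how often a GO group gets hit at least as often as in the real data. All SNP ids, gene names and the GO group are made up.

```python
import random

def genes_hit(snps, snp_to_gene):
    """Set of genes touched by at least one of the given SNPs."""
    return {snp_to_gene[s] for s in snps if s in snp_to_gene}

def empirical_enrichment(candidate_snps, all_snps, snp_to_gene, go_genes,
                         n_perm=1000, seed=0):
    """Empirical p-value for the number of GO-group genes hit by candidate SNPs."""
    rng = random.Random(seed)
    observed = len(genes_hit(candidate_snps, snp_to_gene) & go_genes)
    at_least = 0
    for _ in range(n_perm):
        # Sample SNPs, not genes, so genes with many SNPs are hit more often.
        drawn = rng.sample(all_snps, len(candidate_snps))
        if len(genes_hit(drawn, snp_to_gene) & go_genes) >= observed:
            at_least += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return observed, (at_least + 1) / (n_perm + 1)

# Hypothetical chip: one long gene with 50 SNPs, fifty short genes with 1 SNP each.
snp_to_gene = {f"long_{i}": "LONG_GUIDANCE_GENE" for i in range(50)}
snp_to_gene.update({f"short_{i}": f"SHORT_GENE_{i}" for i in range(50)})
all_snps = list(snp_to_gene)
go_genes = {"LONG_GUIDANCE_GENE"}  # made-up GO group

# Three of the five "top" SNPs fall in the long gene.
candidate_snps = ["long_0", "long_1", "long_2", "short_0", "short_1"]
observed, p_value = empirical_enrichment(candidate_snps, all_snps, snp_to_gene, go_genes)
```

In this toy chip, almost any random draw of five SNPs hits the long gene, so the empirical p-value stays high even though the gene shows up among the top SNPs.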

Several files are needed for Gowinda. I put them online if anyone wants to do a similar analysis with human data.

We noticed this and fixed it, but I wonder how many other papers might have been tricked by this. One that comes to mind is this paper:
A genomic pathway approach to a complex disease: axon guidance and Parkinson disease
Timothy G Lesnick, Spiridon Papapetropoulos, Deborah C Mash, Jarlath Ffrench-Mullen, Lina Shehadeh, Mariza de Andrade, John R Henley, Walter A Rocca, J. Eric Ahlskog, Demetrius M Maraganore

with this follow-up (among others):
Neither Replication nor Simulation Supports a Role for the Axon Guidance Pathway in the Genetics of Parkinson's Disease
Yonghong Li, Charles Rowland, Georgia Xiromerisiou, Robert J. Lagier, Steven J. Schrodi, Efthimios Dradiotis, David Ross, Nam Bui, Joseph Catanese, Konstantinos Aggelakis, Andrew Grupe, Georgios Hadjigeorgiou

I can't say exactly what might explain the differences between those two, as they did a more complex analysis than just looking for pathway enrichment for certain SNPs. The first paper does have very significant p-values (10^-51), which can be a red flag. It could be that a small enrichment of brain genes for the phenotype was exaggerated by the huge bias in the chips. I didn't spend much time on this though; I just wanted to post it online in case anyone else runs into it.

Update: I noticed a similar problem when running GO analysis on Illumina 450k methylation data. In that case the number of CpG sites per gene is not even across GO groups. I ended up doing an empirical type analysis which seemed to work - the super low p-values disappeared.

Sunday, January 19, 2014

Tools for molecular neuroanatomy

This is just a quick listing of tools I often recommend to students at the Course in Molecular Neuroanatomy. They are mainly for dealing with many genes.

From slides:


Finding out when and where genes are expressed:

  • Excross
    • mouse genes as input
    • returns when and where a list of genes is expressed
    • find out which genes are in other Allen datasets
    • wait for images to load
    • reload page for new input
  • HBAset
    • specific expression for many genes in the 6 Allen HBA donors
    • returns where a list of genes are specifically expressed
    • under construction
    • treeview button doesn't work right in chrome, try firefox
I made the above two so if you find them helpful or are using them for analysis then contact me for extensions and help.

Lapis
  • extracting gene symbols from text
  • visual xml parsing, method:
    • click on the edit simultaneously button
    • select gene symbol or text of interest
    • copy and paste into a new file inside lapis
    • copy again and paste in excel or somewhere else

VennMaster
  • comparing gene lists
  • making venn/Euler diagrams
  • input is a text file with tab separated gene then group entries
    • use a spreadsheet tool to make list, then paste into text file
  • use control+c to copy genes in mac osx
  • used to test if the overlap between two gene sets is significant
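
For the overlap significance question, one common choice is a hypergeometric test. The sketch below is my assumption about the kind of test meant here, not a description of what VennMaster computes internally. The gene names and the universe size are hypothetical.

```python
from math import comb

def overlap_p_value(set_a, set_b, universe_size):
    """Probability of an overlap at least this large if set_b were drawn at random.

    Assumes both sets come from the same background of `universe_size` genes.
    """
    k = len(set_a & set_b)
    n_a, n_b = len(set_a), len(set_b)
    total = comb(universe_size, n_b)
    # Sum the hypergeometric tail from the observed overlap up to the maximum.
    tail = sum(
        comb(n_a, i) * comb(universe_size - n_a, n_b - i)
        for i in range(k, min(n_a, n_b) + 1)
    )
    return k, tail / total

# Hypothetical gene lists drawn from a background of 20 genes.
list_a = {"g1", "g2", "g3", "g4", "g5"}
list_b = {"g3", "g4", "g5", "g6", "g7"}
overlap, p_value = overlap_p_value(list_a, list_b, universe_size=20)
```

The background size matters a lot: use the number of genes that could have ended up in either list (for example, all genes assayed), not the whole genome by default.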

Gene ontology analysis
Gene network



  • easiest to just set input+output to gene symbol

Extracting gene lists from PDF files - pdftohtml may be useful.
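
Once the text is out of the PDF, pulling out gene symbols can be sketched as a lookup against a list of known symbols. This is a rough illustration only; the function name and the small symbol list are made up, and in practice you'd load the full symbol list from a gene annotation file.

```python
import re

def extract_gene_symbols(text, known_symbols):
    """Return known gene symbols found in text, in order of first appearance."""
    # Tokens start with a letter and may contain letters, digits and hyphens.
    tokens = re.findall(r"[A-Za-z][A-Za-z0-9-]*", text)
    # Case-insensitive lookup that returns the canonical spelling.
    lookup = {symbol.lower(): symbol for symbol in known_symbols}
    found = []
    for token in tokens:
        symbol = lookup.get(token.lower())
        if symbol and symbol not in found:
            found.append(symbol)
    return found

text = "VGLUT1 is encoded by Slc17a7; we also looked at KCNMA1 and slc17a6."
known = {"Slc17a7", "Slc17a6", "Kcnma1"}  # tiny illustrative symbol list
symbols = extract_gene_symbols(text, known)
```

Case-insensitive matching will pick up false positives for symbols that are also ordinary words, so the result still needs a look by hand.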

Thanks to Christos Gkogkas for suggesting I keep track of these.