Broad releases FASTG reference format that contains variation

The FASTG Format Specification Working Group is pleased to announce version 1.0 of the FASTG specification.

FASTG is a format for faithfully representing genome assemblies in the face of allelic polymorphism and assembly uncertainty. Currently, genome assemblies are represented linearly, as sequences of bases, recorded in FASTA files. Since chromosomes are in fact linear or circular, this makes sense, so long as one has complete knowledge of the genome. However, many genomes contain polymorphisms that cannot be represented in a simple linear sequence, and almost all assemblies contain errors and omissions, which can result in incorrect biological inferences. The FASTG format aims to address this problem using a flexible graph-based approach to encode any variability in the sequence, along with metadata to score and annotate the source of those variations. Assembly graphs in FASTG can be easily translated into linear FASTA sequences to support current analysis tools for read mapping, annotation, visualization, etc., but our hope is to develop a next generation of assembly and genome analysis algorithms that can work with the graph structure directly. For the complete specification and additional information on FASTG, please visit:

If you are interested in discussing this further, please subscribe to the assemblathon-file-format mailing list:

The immediate plans are to enlist help to develop a reference library and command line suite for parsing, transforming, and querying assemblies in FASTG format, similar to the widely used SAM/SAMTools suite.


Your Genome? Which One?

One thing is clear at this stage: the assumption that each individual has a unique genome has been overthrown to some extent. Think how this might impact common evolutionary studies. For years, evolutionists have claimed small differences between human and chimpanzee genomes. What if the percent difference is a function of the source cells used? Remember, the Yale team found differences between cells in the same organ — human skin. If the percent difference grows or shrinks depending on the source, any conclusions about human-chimp similarities would prove unreliable.


It’s also not clear yet whether geneticists will be able to mask the differences between cells to establish an individual’s genome (to say nothing of a species’ genome) as a useful concept. Results would appear to be a function of investigator choice. Say, for instance, that an evolutionist chooses to compare genes of a particular kind of blood cell between species. If the CNVs and SNPs vary significantly from blood cell to blood cell within the individual, the results will be skewed. Mixing or averaging the maps of numerous cells, though, risks creating a theoretical construct that does not correspond to reality. Which cells should be averaged? Will the averages converge or diverge, depending on which cells are selected? Philosophers of science can have fun with this one.

Claims about evolutionary similarities and differences based on genetics must be taken with a grain of salt from now on. Perhaps the feared “profound implications” will prove inconsequential. If nothing else, though, the Yale study provides an example of conceptual superstructures built on shaky assumptions and “prevailing wisdom.” As those of us in the intelligent design community know, what prevails at a given moment is not necessarily wise.



50 Shades of Grey

50 shades of grey:

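library(ggplot2)  # provides ggplot(), geom_tile() and the identity scales

# 50 grey levels (10, 15, ..., 255) laid out on a 10 x 5 grid
grey50 <- data.frame(
x = rep(1:10, 5),
y = rep(1:5, each=10),
c = unlist(lapply(seq(10,255,5), FUN=function(x) { rgb(x,x,x, maxColorValue=255) })),
# label colour: black text on light tiles, white text on dark tiles
t = unlist(lapply(seq(10,255,5), FUN=function(x) { ifelse(x > 255/2, 'black', 'white') })),
stringsAsFactors = FALSE
)
ggplot(grey50, aes(x=x, y=y, fill=c, label=c, color=t)) + 
geom_tile() + geom_text(size=4) +
scale_fill_identity() + scale_color_identity() + ylab(NULL) + xlab(NULL) + 
theme(axis.ticks=element_blank(), axis.text=element_blank())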

Daily Reports

Availability – miRGator v3.0 update is available at:
Cho S, Jang I, Jun Y, Yoon S, Ko M, Kwon Y, Choi I, Jang H, Ryu D, Lee B, Kim VN, Kim W, Lee S. (2012) miRGator v3.0: a microRNA portal for deep sequencing, expression profiling and mRNA targeting. Nucleic Acids Res

Computational thinking in the era of big data biology
Schatz MC
Genome Biology 2012, 13:177 (29 November 2012)

Genome interpretation and assembly—recent progress and next steps

Winner of the 2012 Nobel Prize in Chemistry

I found it quite inspiring, so I decided to share it!

Video with the winner

Asked for advice for young scientists by a reporter at the news conference, Kobilka said: “This is a fantastic way to spend your life. It’s hard, but if you’re interested and persevere, you can be successful. Every day is a challenge and exciting.” He cited his “irrational optimism” when asked why he continued his work in the face of seemingly insurmountable difficulty. “When something didn’t work, you’d be a little down, then at end of the day at home, you’ll think, ‘Oh, maybe this will work’. You always think something is going to work.”

I hope you enjoy it!

Installing Paperpile 0.5.1, Mendeley 1.8 and ReadCube in Ubuntu 12.04

ReadCube

./winetricks wininet adobeair mfc42 mono210 msxml6 quicktime76 vcrun6

Installing msxml6 failed!

Had to install using a 32-bit Wine prefix (WINEARCH=win32).

Remove the .wine folder and type:

WINEARCH=win32 winecfg
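
The reset above can be sketched as a small script. This is only a sketch under assumptions: the helper name reset_prefix is made up for illustration, it moves the old prefix aside instead of deleting it, and it only prints the winecfg command rather than running it. WINEARCH and WINEPREFIX are standard Wine environment variables; WINEARCH only takes effect when the prefix is first created, which is why the old folder has to go.

```shell
#!/bin/sh
# Hypothetical helper: move an existing Wine prefix out of the way
# (instead of deleting it) and print the command that recreates it
# as a 32-bit prefix. Nothing is launched; this is a dry run.
reset_prefix() {
    prefix="$1"
    if [ -d "$prefix" ]; then
        # Keep a backup named after the current shell's PID.
        mv "$prefix" "$prefix.bak.$$"
    fi
    printf 'WINEARCH=win32 WINEPREFIX="%s" winecfg\n' "$prefix"
}

# Demonstration on a throwaway directory, not on the real ~/.wine.
demo_root="$(mktemp -d)"
mkdir "$demo_root/wineprefix"
reset_prefix "$demo_root/wineprefix"
```

To apply it for real, one would call reset_prefix "$HOME/.wine" and then run the printed command.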

I used Windows XP (winxp) as the profile!

fixme:ras:RasEnumConnectionsW RAS support is not implemented! Configure program to use LAN connection/winsock instead!

Follow these instructions:

1. Run winecfg

2. Click on Libraries

3. Choose “rasapi32” in “new overrides for” and click on add.

4. Click on rasapi32 and hit “edit”

5. Choose “disabled”
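
The same override can probably be applied for a single run without going through winecfg, using Wine's WINEDLLOVERRIDES environment variable, where an empty value after the DLL name disables that DLL. A minimal sketch, assuming the program is started from the command line; the helper name disable_overrides and the executable name ReadCube.exe are placeholders, not taken from the installer:

```shell
#!/bin/sh
# Hypothetical helper: build a WINEDLLOVERRIDES value that disables
# every DLL given as an argument ("name=" with nothing after the
# equals sign means "disabled"; entries are separated by ";").
disable_overrides() {
    out=""
    for dll in "$@"; do
        out="${out:+$out;}${dll}="
    done
    printf '%s\n' "$out"
}

# Dry run: print the command instead of launching Wine.
# "ReadCube.exe" is a placeholder for the installed program.
echo "WINEDLLOVERRIDES=\"$(disable_overrides rasapi32)\" wine ReadCube.exe"
```

Unlike the winecfg steps, this does not change the prefix permanently; the override lasts only for that one invocation.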

Taking forever to load!

Mendeley Desktop version 1.8

This is the most established.


What I liked about this software from the beginning is that they have versions for Linux and Mac. Strange that they couldn’t make a version for Windows :P. I discovered they used Catalyst, a Perl MVC framework, and this really impressed me.

The thing I didn’t like was that they don’t watch folders for new papers, and they duplicate your papers in a folder called ~/.paperpile.

I had some problems importing 90 articles, and I believe I’m supposed to add them “manually” … boring!

Another killer feature is the ability to search for and download papers directly, but unfortunately they don’t have good proxy support. This would be really necessary, since I download articles from wherever I am, not only from work.

The PDF preview sucks a little bit, but they are still developing it. It would be better if I could open the PDF outside the program.

They could let me create an account and sync my data.

Another problem I had was with the 1024×600 screen resolution; I usually use a big screen resolution since my computer is a Pavilion dm1. I had to change to 1280×768.

I accidentally closed the app during the long import and had to import the whole folder again. It takes forever!

I’m receiving this message on terminal:

QNetworkAccessFileBackendFactory: URL has no schema set, use file:// for files

It reported duplicate files when they were actually supplementary material from the same paper!

Every time I add a new paper, I need to add the folder and wait for the import of all my 423 papers again? Seriously? 9 minutes? No way.

Google Scholar not working

Very cool they have Cloud for Authors, Journals and Labels

You can’t put one label inside another, like folders in Mendeley!

Case Study: Adding a citation in your document while writing!





Mendeley is the most mature and has the best features, such as watching folders and syncing with the Web so you can read on your tablet.

Now Configuring Options on Mendeley


Get 10% of triobra1 and calculate REAP

/projects/relatedness/plink-1.07-x86_64/plink --file trio1_bra.phasing --thin 0.2 --noweb --geno 0 --recode --out trio1_bra.phasing.20

/projects/relatedness/plink-1.07-x86_64/plink --file trio1_bra.phasing --thin 0.1 --noweb --geno 0 --recode --out trio1_bra.phasing.10

Of these, 956761 SNPs are new, 5637 already exist

Calculation with only the Brazilians: merge3.ped
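
The two PLINK thinning runs above differ only in the fraction passed to the thin option and in the numeric suffix of the output name, so they could be generated from one loop. A rough sketch, assuming the same binary path and input prefix as in the notes; the helper name thin_cmd is made up, and the commands are printed rather than executed:

```shell
#!/bin/sh
# Path and input prefix copied from the notes above.
PLINK=/projects/relatedness/plink-1.07-x86_64/plink
INPUT=trio1_bra.phasing

# Hypothetical helper: print the PLINK command that keeps a random
# fraction of SNPs, naming the output after the percentage kept.
thin_cmd() {
    frac="$1"
    # Convert the fraction to a rounded whole percentage (0.2 -> 20).
    pct=$(awk -v f="$frac" 'BEGIN { printf "%d", f * 100 + 0.5 }')
    printf '%s --file %s --thin %s --noweb --geno 0 --recode --out %s.%s\n' \
        "$PLINK" "$INPUT" "$frac" "$INPUT" "$pct"
}

# Dry run for the two fractions used in the notes.
for frac in 0.2 0.1; do
    thin_cmd "$frac"
done
```

Piping the printed lines to sh would run them; keeping the loop as a dry run makes it easy to check the output names first.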