Report 02/10/12

Bioconductor Libraries to learn

http://www.bioconductor.org/packages/2.10/bioc/vignettes/topGO/inst/doc/topGO.pdf

Keggapi

Updated R to version 2.15

echo "deb http://cran.mirrors.hoobly.com/bin/linux/ubuntu precise/" | sudo tee -a /etc/apt/sources.list
gpg --keyserver keyserver.ubuntu.com --recv-key E084DAB9
gpg -a --export E084DAB9 | sudo apt-key add -
sudo apt-get update
sudo apt-get install r-base
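
Running the append step twice leaves a duplicate entry in sources.list. A minimal sketch of an idempotent version follows; the function name add_apt_source is hypothetical, and it assumes the caller has write permission on the target file (for /etc/apt/sources.list that means running it as root):

```shell
# Append a repository line to a sources file only if it is not already there.
# $1: the full "deb ..." line, $2: the sources file to edit
add_apt_source() {
    # -F: fixed string, -x: match the whole line, -q: no output
    if ! grep -Fxq "$1" "$2" 2>/dev/null; then
        printf '%s\n' "$1" >> "$2"
    fi
}
```

Example call, as root: add_apt_source "deb http://cran.mirrors.hoobly.com/bin/linux/ubuntu precise/" /etc/apt/sources.list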

Installed

http://www.omegahat.org/RCurl/FAQ.html

Biomart

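library(KEGGSOAP)
# KEGG gene IDs for the pathway, in the form "hsa:<Entrez ID>"
genes <- get.genes.by.pathway("path:hsa04340")
# strip the "hsa:" prefix from each gene to leave the Entrez gene ID
genes <- sub("^hsa:", "", genes)
length(genes)
genes
# biomaRt: map each Entrez gene ID to its GO terms
library("biomaRt")
ensembl <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")

goids <- getBM(attributes = c("entrezgene", "go_id"), filters = "entrezgene", values = genes, mart = ensembl)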

Report 21/09/12

Indel Detection

Atlas, Dindel

ruby1.9.1 /lgc/programs/Atlas2_v1.4.1/Atlas-Indel2/Atlas-Indel2.rb -b ../../input/Exome_1_RP.realigned-recalibrated.bam -r /lgc/datasets/gatk_data/hg19/ucsc.hg19.fasta -o rms_indel -S

VARiD Indel

varid_exec -a ../../../../input/Exome_1_RP.realigned-recalibrated.sam -r /lgc/datasets/gatk_data/hg19/ucsc.hg19.fasta -o varid_rms_detection --threads 4 --format vcf
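
The Atlas-Indel2 run above reads a .bam file, while the VARiD run reads a .sam file with the same base name. The notes do not say how the SAM file was produced. One possible way, assuming samtools is installed, is sketched below; the function names sam_name_for_bam and bam_to_sam are hypothetical:

```shell
# Derive the SAM file name from a BAM file name (foo.bam -> foo.sam).
sam_name_for_bam() {
    printf '%s\n' "${1%.bam}.sam"
}

# Convert a BAM file to SAM next to it, keeping the header (-h).
# Assumes samtools is on the PATH.
bam_to_sam() {
    samtools view -h "$1" > "$(sam_name_for_bam "$1")"
}
```

Example call: bam_to_sam Exome_1_RP.realigned-recalibrated.bam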

Phasing Beagle

 

Now we're talking!

Well, after a long time thinking about the best way to record the activities of my PhD, I decided that the best way to keep everything organized would be through this blog. With some caveats, though: although the blog is called Openscience, I intend to keep the records of my activities closed and post more about the topics I have been working on. Previously I was using a Google Docs document that became extremely large, and that was the main reason I migrated here! To that end, I have prepared a list of some topics that I consider common in the field of bioinformatics:

Markov

Burrows-Wheeler

Monte Carlo

Bayes

SVD

PCA

SVM

 

 

Notes

Datasets from: references for SIFT, PolyPhen, ANNOVAR

OMIM variants extracted by Omicia and provided as a track (OMICIA_auto) on the next release of UCSC tables (http://genome-preview.ucsc.edu/…)

COSMIC rev54 (rev55 as of a couple of days ago), downloaded as a text table that I had to convert to BED with some Perl magic (ftp://ftp.sanger.ac.uk/pub/CGP/cosmic)

dbSNP was not an easy catch, and I am still struggling to get the full information out of its difficult batch download system (so far only feasible through Ensembl BioMart; tip: the hg18 BioMart is at http://may2009.archive.ensembl.org/biomart/martview/). For dbSNP, I searched for records with a phenotype (thanks to another colleague), which is the only available annotation for picking disease variants but in fact includes many association results that are far from being causative.
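
The COSMIC text-to-BED conversion mentioned above could also be sketched in shell with awk instead of Perl. This is only an illustration under assumptions: the function name cosmic_to_bed, the column numbers passed as arguments, and a position field of the form chrom:start-end with 1-based inclusive coordinates are all hypothetical and need checking against the actual COSMIC export. The "chr" prefix is added on the assumption that UCSC-style chromosome names are wanted.

```shell
# Convert a tab-delimited table with a header row into 4-column BED.
# $1: 1-based index of the column holding the variant name (assumed layout)
# $2: 1-based index of the column holding "chrom:start-end" (assumed layout)
cosmic_to_bed() {
    awk -F'\t' -v OFS='\t' -v name_col="$1" -v pos_col="$2" '
        NR == 1 { next }                    # skip the header row
        {
            # split "chrom:start-end" into chrom, start and end
            n = split($pos_col, p, /[-:]/)
            if (n != 3) next                # skip rows without a usable position
            # BED starts are 0-based and BED ends are exclusive,
            # so only the start of a 1-based inclusive interval shifts by one
            print "chr" p[1], p[2] - 1, p[3], $name_col
        }
    '
}
```

Example call, with hypothetical file names: cosmic_to_bed 1 2 < cosmic_table.tsv > cosmic.bed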

Cancer Datasets

http://www.broadinstitute.org/cgi-bin/cancer/datasets.cgi

Breast Cancer Datasets

http://bioinformatics.nki.nl/data.php