A New 23andMe Experience

23andMe made some news today with the launch of a new experience for customers that includes the first and only direct-to-consumer genetic service that meets U.S. FDA standards. We invested nearly two years of work conducting extensive user testing and working with regulators, scientists, physicians, and top product design experts. Most important, we talked to 23andMe customers about how we could improve functionality and comprehension and create tools to improve your experience.

Take a look at our homepage to get a sense of how 23andMe is redefining how genetic information is delivered to people.

We are working to transition our existing customers to the new experience as quickly as possible, but some of that transition is dependent on validation and local requirements. New customers can expect to begin receiving results within four to six weeks of returning their samples to our lab.

On the blog in the coming months, we’ll highlight some of what customers will see in the new product and explore some of the new offerings, including new carrier status* reports, wellness reports, information about traits and ancestry, as well as updated tools for exploring more aspects of their own genetic information.

Source: A New 23andMe Experience

 

24 Good Books to Read. Here’s what Sam Altman of Y Combinator is Reading.

Sam Altman is currently President of Y Combinator, perhaps the world’s most prestigious technology accelerator, which has incubated companies such as Airbnb, Reddit, Stripe, Dropbox and Mixpanel. Over the years, Y Combinator has funded over 800 startups and built a community of over 1,600 founders, and its companies have reached a combined valuation of over $30B.

It goes without saying that there’s a fair bit of knowledge to acquire, and plenty of great books to read, as suggested by the Y Combinator team and alumni. We were lucky enough to find Sam Altman’s Shelfie and have scanned it to give you a handy list of some books you may want to read.

Source: 24 Good Books to Read. Here’s what Sam Altman of Y Combinator is Reading. – Shelfie Blog

 

There’s a Mystery Machine That Sculpts the Human Genome – The Atlantic

Genomes are so regularly represented as strings of letters—As, Gs, Cs, and Ts—that it’s easy to forget that they aren’t just abstract collections of data. They exist in three dimensions. They are made of molecules. They are physical objects that take up space—a lot of space.

Source: There’s a Mystery Machine That Sculpts the Human Genome – The Atlantic

 

The Strangers in Your Brain – The New Yorker

Biologists often say that the most dangerous thing a cell can do is divide. This is because, during the complex process of replication—the unspooling of DNA, the assembling of two genomes from the halves of one—there is always the chance that the cell will make a mistake. Mutations can cost an organism its life, but they are also essential to evolution. Without them, there would be no novelty and no change; the slow-churning Darwinian search algorithm would stop. In this sense, transposons—wandering snippets of DNA that hide in genomes, copying and pasting themselves at random—are unsung heroes of natural selection. Although the information that they carry is spare, they account for fifty per cent of all mammalian genetic material. Our own DNA is a battlefield between self and other.

Source: The Strangers in Your Brain – The New Yorker

 

Document Clustering with Python


 

In this guide, I will explain how to cluster a set of documents using Python. My motivating example is to identify the latent structures within the synopses of the top 100 films of all time (per an IMDB list). See the original post for a more detailed discussion on the example. This guide covers:

  • tokenizing and stemming each synopsis
  • transforming the corpus into vector space using [tf-idf](http://en.wikipedia.org/wiki/Tf%E2%80%93idf)
  • calculating cosine distance between each pair of documents as a measure of similarity
  • clustering the documents using the [k-means algorithm](http://en.wikipedia.org/wiki/K-means_clustering)
  • using [multidimensional scaling](http://en.wikipedia.org/wiki/Multidimensional_scaling) to reduce dimensionality within the corpus
  • plotting the clustering output using [matplotlib](http://matplotlib.org/) and [mpld3](http://mpld3.github.io/)
  • conducting a hierarchical clustering on the corpus using [Ward clustering](http://en.wikipedia.org/wiki/Ward%27s_method)
  • plotting a Ward dendrogram
  • topic modeling using [Latent Dirichlet Allocation (LDA)](http://en.wikipedia.org/wiki/Latent_Dirichlet_allocation)


Source: http://nbviewer.ipython.org/github/brandomr/document_cluster/blob/master/cluster_analysis_web.ipynb
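
To make the first few steps of that list concrete, here is a minimal sketch of tokenizing, tf-idf weighting, cosine distance and k-means. It is not the code from the linked notebook: it uses only the Python standard library, skips stemming, and seeds k-means with the first k documents so the result is repeatable. The four one-line "synopses" and every function name are made up for illustration, and the smoothed idf formula is just one common variant.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep runs of letters (no stemming in this sketch)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return (vocabulary, unit-length tf-idf vectors), one vector per document."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(t for toks in tokenized for t in set(toks))
    # Smoothed idf (an assumed variant): ln((1 + n) / (1 + df)) + 1
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in vocab}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vec = [tf[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors


def cosine_distance(a, b):
    """1 - cosine similarity; assumes both vectors are already unit length."""
    return 1.0 - sum(x * y for x, y in zip(a, b))


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=10):
    """Lloyd's algorithm, seeded with the first k vectors for repeatability."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every document.
        for i, v in enumerate(vectors):
            labels[i] = min(range(k), key=lambda c: sq_dist(v, centroids[c]))
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical toy corpus: two crime plots and two fantasy plots, interleaved.
docs = [
    "mob boss hands his crime family to his son",
    "farm girl and dog reach a magical land",
    "rival crime family attacks the mob boss",
    "a wizard rules the magical land",
]
vocab, vectors = tfidf_vectors(docs)
labels = kmeans(vectors, k=2)
print(labels)  # [0, 1, 0, 1]
```

On a real corpus you would want random or k-means++ initialization and a proper stemmer; the fixed seeding here only works because the first two toy documents happen to belong to different groups.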

 

Default Alive or Default Dead?

When I talk to a startup that’s been operating for more than 8 or 9 months, the first thing I want to know is almost always the same. Assuming their expenses remain constant and their revenue growth is what it’s been over the last several months, do they make it to profitability on the money they have left? Or to put it more dramatically, by default do they live or die?

The startling thing is how often the founders themselves don’t know. Half the founders I talk to don’t know whether they’re default alive or default dead.

Source: Default Alive or Default Dead?
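
The question in that excerpt is simple arithmetic, so here is a rough sketch of it in Python. It is my own illustration, not something from the essay: the function name, the month-by-month model (constant expenses, revenue compounding at a fixed monthly rate) and all the numbers are assumptions.

```python
def default_alive(cash, monthly_expenses, monthly_revenue, monthly_growth,
                  max_months=600):
    """Simulate month by month with constant expenses and compounding revenue.

    Returns (alive, months). alive is True if revenue reaches expenses
    before the cash runs out; months is how long that (or going broke) takes.
    """
    for month in range(max_months + 1):
        if monthly_revenue >= monthly_expenses:
            return True, month  # profitable: default alive
        cash -= monthly_expenses - monthly_revenue  # this month's burn
        if cash < 0:
            return False, month + 1  # out of money: default dead
        monthly_revenue *= 1 + monthly_growth
    return False, max_months  # never reached profitability in the horizon


# Hypothetical startup A: $500k in the bank, $100k/month expenses,
# $50k/month revenue growing 10% a month.
print(default_alive(500_000, 100_000, 50_000, 0.10))  # (True, 8)

# Hypothetical startup B: $200k in the bank, same expenses,
# $20k/month revenue growing 5% a month.
print(default_alive(200_000, 100_000, 20_000, 0.05))  # (False, 3)
```

The useful part is less the answer than how sensitive it is: nudging the growth rate or expenses by a few points can flip a company from one side to the other.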