R Passes SAS in Scholarly Use (finally) | r4stats.com

Way back in 2012 I published a forecast showing that the use of R for scholarly publications would likely pass the use of SAS in 2015. But I didn’t believe the forecast, since I expected the sharp decline in SAS and SPSS use to level off. In 2013, the trend accelerated and I expected R to pass SAS in the middle of 2014. As luck would have it, Google changed their algorithm, somehow finding vast additional quantities of SAS and SPSS articles. I just collected data on the most recent complete year of scholarly publications, and it turns out that 2015 was indeed the year that R passed SAS to garner the #2 position. Once again, models do better than “expert” opinion! I’ve updated The Popularity of Data Analysis Software to reflect this new data and include it here to save you the trouble of reading the whole 45 pages of it.

If you’re interested in learning R, you might consider reading my books R for SAS and SPSS Users, or R for Stata Users. I also teach workshops on R, but I’m currently booked through mid-October, so please plan ahead.

Source: R Passes SAS in Scholarly Use (finally) | r4stats.com

 

Mastering Programming

From years of watching master programmers, I have observed certain common patterns in their workflows. From years of coaching skilled journeyman programmers, I have observed the absence of those patterns. I have seen what a difference introducing the patterns can make.
Here are ways effective programmers get the most out of their precious 3e9 seconds on the planet.
The theme here is scaling your brain. The journeyman learns to solve bigger problems by solving more problems at once. The master learns to solve even bigger problems than that by solving fewer problems at once. Part of the wisdom is subdividing so that integrating the separate solutions will be a smaller problem than just solving them together.

Time

  • Slicing. Take a big project, cut it into thin slices, and rearrange the slices to suit your context. I can always slice projects finer and I can always find new permutations of the slices that meet different needs.
  • One thing at a time. We’re so focused on efficiency that we reduce the number of feedback cycles in an attempt to reduce overhead. This leads to difficult debugging situations whose expected cost is greater than the cycle overhead we avoided.
  • Make it run, make it right, make it fast. (Example of One Thing at a Time, Slicing, and Easy Changes)
  • Easy changes. When faced with a hard change, first make it easy (warning, this may be hard), then make the easy change. (e.g. slicing, one thing at a time, concentration, isolation). Example of slicing.
  • Concentration. If you need to change several elements, first rearrange the code so the change only needs to happen in one element.
  • Isolation. If you only need to change a part of an element, extract that part so the whole subelement changes.
  • Baseline Measurement. Start projects by measuring the current state of the world. This goes against our engineering instincts to start fixing things, but when you measure the baseline you will actually know whether you are fixing things.
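
The "Concentration" and "Easy changes" items above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the `TAX_RATE` constant, and the 8% figure are hypothetical and do not come from the original article.

```python
# Hypothetical example: all names and the 8% rate are made up for illustration.

# Before: the rate is repeated, so changing it means editing several elements.
def invoice_total_before(subtotal):
    return subtotal + subtotal * 0.08


def refund_total_before(subtotal):
    return -(subtotal + subtotal * 0.08)


# Step 1 (make the change easy): rearrange the code so the rate
# lives in exactly one element, without changing behaviour.
TAX_RATE = 0.08


def with_tax(subtotal, rate=TAX_RATE):
    """Return the subtotal with tax applied at the given rate."""
    return subtotal * (1 + rate)


def invoice_total(subtotal):
    return with_tax(subtotal)


def refund_total(subtotal):
    return -with_tax(subtotal)


# Check that the rearrangement preserved behaviour before changing anything.
assert abs(invoice_total(100) - invoice_total_before(100)) < 1e-9
assert abs(refund_total(100) - refund_total_before(100)) < 1e-9
```

Step 2, the easy change, is then a one-line edit to `TAX_RATE` (or to `with_tax`), and the assertions act as a small baseline for confirming that the rearrangement itself changed nothing.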

Source: Mastering Programming

 

The Barbell Effect of Machine Learning

If there is one technology that promises to change the world more than any other over the next several decades, it is arguably machine learning. By enabling computers to learn certain things more efficiently than humans and discover certain things that humans cannot, machine learning promises to bring increasing intelligence to software everywhere and enable computers to develop ever new capabilities — from driving cars to diagnosing disease — that were previously thought impossible.

While most of the core algorithms that drive machine learning have been around for decades, what has magnified its promise so dramatically in recent years is the extraordinary growth of the two fuels that power these algorithms — data and computing power. Both continue to grow at exponential rates, suggesting that machine learning is at the beginning of a very long and productive run.

As revolutionary as machine learning will be, its impact will be highly asymmetric. While most machine learning algorithms, libraries and tools are in the public domain and computing power is a widely available commodity, data ownership is highly concentrated.

This means that machine learning will likely have a profound barbell effect on the technology landscape. On one hand, it will democratize basic intelligence through the commoditization and diffusion of services such as image recognition and translation into software broadly. On the other, it will concentrate higher-order intelligence in the hands of a relatively small number of incumbents that control the lion’s share of their industry’s data.

For startups seeking to take advantage of the machine learning revolution, this barbell effect is a helpful lens through which to look for the biggest business opportunities. While there will be many new kinds of startups that machine learning will enable, the most promising will likely cluster around the incumbent end of the barbell.

Source: The Barbell Effect of Machine Learning — Medium

 

How cancer was created by evolution

The cells inside a tumour change and evolve just like animals in the wild. Understanding how this works could help us stop cancer in its tracks

Will we ever win the “war on cancer”?

The latest figures show just how distant a prospect victory is right now. In the US, the lifetime risk of developing cancer is 42% in men and 38% in women, according to the American Cancer Society. The figures are even worse in the UK. According to Cancer Research UK, 54% of men and 48% of women will get cancer at some point in their lives.

And cases are on the rise. As of 2015 there are 2.5 million people in the UK living with the disease, according to Macmillan Cancer Support. This is an increase of 3% each year, or 400,000 extra cases in five years.

Source: BBC – Earth – How cancer was created by evolution

 

What is the difference between deep learning and usual machine learning?

That’s an interesting question, and I’ll try to answer it in a very general way. The tl;dr version of this is: Deep learning is essentially a set of techniques that help us to parameterize deep neural network structures, neural networks with many, many layers and parameters.

And if we are interested, a more concrete example: Let’s start with multi-layer perceptrons (MLPs) …

On a tangent: The term “perceptron” in MLPs may be a bit confusing since we don’t really want only linear neurons in our network. Using MLPs, we want to learn complex functions to solve non-linear problems. Thus, our network is conventionally composed of one or multiple “hidden” layers that connect the input and output layer. Those hidden layers normally have some sort of sigmoid activation function (log-sigmoid or the hyperbolic tangent etc.). For example, think of a log-sigmoid unit in our network as a logistic regression unit that returns continuous-valued outputs in the range 0 to 1. A simple MLP could look like this
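
To make the description above concrete, here is a minimal sketch of a forward pass through an MLP with one hidden layer of log-sigmoid units, written in plain Python. The layer sizes, weights, and function names are arbitrary assumptions chosen for illustration, not values from the original answer, and no training (backpropagation) is shown.

```python
import math


def log_sigmoid(z):
    """Logistic (log-sigmoid) activation: maps any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weights, biases):
    """Apply one fully connected layer of log-sigmoid units.

    Each row of `weights` holds the incoming weights of one unit.
    """
    return [
        log_sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def mlp_forward(x, hidden_w, hidden_b, out_w, out_b):
    """Input layer -> one hidden layer -> output layer."""
    hidden = layer(x, hidden_w, hidden_b)
    return layer(hidden, out_w, out_b)


# Arbitrary (untrained) weights: 2 inputs, 3 hidden units, 1 output unit.
hidden_w = [[0.5, -0.4], [0.3, 0.8], [-0.6, 0.1]]
hidden_b = [0.0, 0.1, -0.1]
out_w = [[1.0, -1.0, 0.5]]
out_b = [0.2]

y = mlp_forward([1.0, 2.0], hidden_w, hidden_b, out_w, out_b)
print(y)  # a single value strictly between 0 and 1
```

Each unit here behaves like the logistic regression unit mentioned above: a weighted sum of its inputs passed through the log-sigmoid. Stacking such layers is what lets the network represent non-linear functions.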

Source: python-machine-learning-book/difference-deep-and-normal-learning.md at master · rasbt/python-machine-learning-book

 

Scientists Find Form of Crispr Gene Editing With New Capabilities – The New York Times

A common bacterium contains molecules that target RNA, not DNA. If it can be harnessed for use in humans, the process may lead to new forms of bioengineering.

Just a few years ago, Crispr was a cipher — something that sounded to most ears like a device for keeping lettuce fresh. Today, Crispr-Cas9 is widely known as a powerful way to edit genes. Scientists are deploying it in promising experiments, and a number of companies are already using it to develop drugs to treat conditions ranging from cancer to sickle-cell anemia.

Yet there is still a lot of misunderstanding around it. Crispr describes a series of DNA sequences discovered in microbes, part of a system to defend against attacking viruses. Microbes make thousands of forms of Crispr, most of which are just starting to be investigated by scientists. If they can be harnessed, some may bring changes to medicine that we can barely imagine.

On Thursday, in the journal Science, researchers demonstrated just how much is left to discover. They found that an ordinary mouth bacterium makes a form of Crispr that breaks apart not DNA, but RNA — the molecular messenger used by cells to turn genes into proteins.

If scientists can get this process to work in human cells, they may open up a new front in gene engineering, gaining the ability to precisely adjust the proteins in cells, for instance, or to target cancer cells.

“The groundbreaking thing about this work is that it now opens up the RNA world to Crispr,” said Oliver Rackham, a synthetic biologist at the University of Western Australia who was not involved in the study.

Crispr was first discovered in 1987, but it took decades for scientists to figure out that microbes needed the system to recognize DNA from invading viruses and to chop it into pieces, stopping the infection.

In 2012, a team of scientists led by Jennifer Doudna of the University of California, Berkeley, and Emmanuelle Charpentier, then at Umea University in Sweden, discovered how to use this microbial defense as a gene-editing tool that could potentially alter any piece of DNA.

Source: Scientists Find Form of Crispr Gene Editing With New Capabilities – The New York Times

 

Kalzumeus Podcast Episode 12: Salary Negotiation with Josh Doody | Kalzumeus Software

Several years ago I wrote a blog post on salary negotiation for engineers. This probably created more value than anything else I’ve ever written — I have a folder in Gmail with thank-you messages from people, and my running total is something north of $2.3 million in added salary per year, mostly in $15k to $25k chunks.

A buddy of mine, Josh Doody, has decided to thoroughly own this area, and published a book (Amazon link) on the topic. I rather enjoyed the book, and thought I would have him on the podcast to talk about the topic in more detail.

[Patrick notes: As always, the below transcript occasionally has my thoughts inserted in this format.]

What you’ll learn in this podcast:

  • How to avoid the “What is your desired salary?” question
  • How to trade off across multiple axes when doing a salary negotiation (salary, vacation days, equity, etc.)
  • How to get raises after being hired

A brief announcement: Keith Perhac and I have parted ways with regard to the podcast, amicably, largely due to scheduling issues. Both he and I have been quite busy with business and life, and we’ve moved to different countries, so we’ll be running our podcasts independently in the future. We’re still great friends and will probably appear on each other’s programs occasionally.

MP3 Download (~49 minutes, ~42MB): right click here and select Save As.

Podcast format: either subscribe to http://www.kalzumeus.com/category/podcasts/feed in your podcast reader of choice or you can search for Kalzumeus Podcast in iTunes, Overcast, or another aggregator of your choice.

Source: Kalzumeus Podcast Episode 12: Salary Negotiation with Josh Doody | Kalzumeus Software

 

Scientists Announce HGP-Write, Project to Synthesize the Human Genome

The formal announcement of the plans, which leaked last month, seeks to raise $100 million this year. The total price tag could exceed $1 billion.

Scientists on Thursday formally announced the start of a 10-year project aimed at vastly improving the ability to chemically manufacture DNA, with one of the goals being to synthetically create an entire human genome.

Plans for the project, which leaked last month, have already set off an ethical debate, because the ability to chemically fabricate the complete set of human chromosomes could theoretically allow the creation of babies without biological parents.

Some critics also objected to the secrecy surrounding a meeting to discuss the project at Harvard Medical School in May. The organizers said they avoided publicity so as to not jeopardize publication of the proposal in a peer-reviewed scientific journal. The proposal was published on Thursday in the journal Science.

Source: Scientists Announce HGP-Write, Project to Synthesize the Human Genome – The New York Times