Building a Regex Search Engine for DNA

This winter, Tony interned at Benchling and built the foundation of our DNA search feature. Somak and I brought it into production shortly after his internship ended, and we’re finally writing it up!

 

We recently launched DNA search on Benchling – you can have a library of thousands of plasmids and primers, and we’ll search through them in less than 100ms end to end. It was a fun project, and through it we ended up building a regex search engine (on top of Elasticsearch) to support the various search needs of scientists. Here’s how we did it.
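
To make the starting point concrete: Elasticsearch ships with a native regexp query, which is roughly where a project like this begins. Here is a minimal sketch (the sequences index and the sequence field are hypothetical, an illustration rather than Benchling’s production setup):

% curl -s 'localhost:9200/sequences/_search' -d '
  {"query": {"regexp": {"sequence": ".*GATT[AC]CA.*"}}}'

One wrinkle worth knowing: Elasticsearch anchors regexp queries to whole terms, hence the leading and trailing .* when hunting for a motif inside a longer sequence. Limitations like that are part of why a purpose-built layer on top is attractive.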

Source: Building a Regex Search Engine for DNA

 

Row Level Security with PostgreSQL 9.5

Release 9.5 of PostgreSQL delivers many new features like upsert, new JSONB functions, new GROUPING functions, and more. While some of these, like upsert or JSONB, may be useful to many people, a number of the new features really only serve edge cases. If you have the particular edge case a feature solves, though, then that feature can be invaluable. RLS (Row Level Security) is one of these edge-case features.

RLS does just what it says: it secures rows in a table. But you do have to enable it for each table, and you need to commit to using database roles as your main security mechanism. That last part is the barrier, but it is also the reason to use such a feature.

With RLS, you use the database tier to secure the data (at least for the enabled tables). Both multi-tenant tables and analytics schemas where users have general access to the database via a query tool are solid examples of when RLS makes sense.
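
To make that concrete, here is a minimal sketch of the multi-tenant case, assuming a hypothetical accounts table with a tenant_name column and one database role per tenant:

% psql mydb
mydb=# ALTER TABLE accounts ENABLE ROW LEVEL SECURITY;
mydb=# CREATE POLICY tenant_isolation ON accounts
mydb-#     USING (tenant_name = current_user);
mydb=# GRANT SELECT ON accounts TO tenant_a;

With the policy in place, a SELECT issued by tenant_a returns only the rows whose tenant_name matches that role. Note that table owners bypass RLS by default, so the restriction applies to the non-owner roles you hand out.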

Source: Row Level Security with PostgreSQL 9.5

 

Snappy Changes – Labix Blog

As announced last Saturday, Snappy Ubuntu Core 2.0 has just been tagged and made its way into the archives of Ubuntu 16.04, which is due for final release in the coming days. So this is a nice time to start covering interesting aspects of what is being made available in this release.

A good choice for the first post in this series is how snappy performs changes in the system, as that knowledge is useful for observing and understanding what is going on in your snappy platform. Let’s start with the first operation you will likely perform when interacting with the platform — install:

% sudo snap install ubuntu-calculator-app
120.01 MB / 120.01 MB [================================================================] 100.00 % 1.45 MB/s

This operation is traditionally done on analogous systems in an ephemeral way. That is, the software has either a local or a remote database of options to install, and once the change is requested the platform of choice will start acting on it with all state for the modification kept in memory. If something doesn’t go well, such as a reboot or even a crash, the modification is lost… in the best case. Rather than being lost entirely, it might instead be partially applied to the system, with some files spread through the filesystem and perhaps some of the involved hooks run. After the restart, the partial state remains until some manual action is taken.

Snappy instead has an engine that tracks and controls such changes in a persistent manner. All the recent changes, pending or not, may be observed via the API and the command line:

% snap changes
ID   Status  ...  Summary
1    Done    ...  Install "ubuntu-calculator-app" snap

(the spawn and ready date/time columns have been hidden for space)
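
The same information is available programmatically. As a sketch (I’m going from snapd’s documented REST API here, so treat the socket path and query parameter as assumptions), the change log can be fetched over snapd’s local UNIX socket:

% curl -s --unix-socket /run/snapd.socket 'http://localhost/v2/changes?select=all'

The response is JSON describing each change and its individual tasks, the same data that snap changes renders as a table.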

Source: Snappy Changes – Labix Blog

 

Peer review: Troubled from the start : Nature News & Comment


Pivotal moments in the history of academic refereeing have occurred at times when the public status of science was being renegotiated, explains Alex Csiszar.

Referees are overworked. The problem of bias is intractable. The referee system has broken down and become an obstacle to scientific progress. Traditional refereeing is an antiquated form that might have been good for science in the past, but it’s high time to put it out of its misery.

What is this familiar litany? It is a list of grievances aired by scientists a century ago.

If complaining about the faults of referee systems is nothing new, such systems are not as old as historical accounts often claim. Investigators of nature communicated their findings without scientific referees for centuries. Deciding whom and what to trust usually depended on personal knowledge among close-knit groups of researchers. (Many might argue it still does.)

Source: Peer review: Troubled from the start : Nature News & Comment

 

19 Tips For Everyday Git Use

I’ve been using git full time for the past 4 years, and I wanted to share the most practical tips that I’ve learned along the way. Hopefully, it will be useful to somebody out there.

If you are completely new to git, I suggest reading Git Cheat Sheet first. This article is aimed at somebody who has been using git for three months or more.

Table of Contents:

  1. Parameters for better logging
    git log --oneline --graph
  2. Log actual changes in a file
    git log -p filename
  3. Only log changes for specific lines in a file
    git log -L 1,1:some-file.txt
  4. Log changes not yet merged to the parent branch
    git log --no-merges master..
  5. Extract a file from another branch
    git show some-branch:some-file.js
  6. Some notes on rebasing
    git pull --rebase
  7. Remember the branch structure after a local merge
    git merge --no-ff
  8. Fix your previous commit, instead of making a new commit
    git commit --amend
  9. Three stages in git, and how to move between them
    git reset --hard HEAD and git status -s
  10. Revert a commit, softly
    git revert -n
  11. See diff-erence for the entire project (not just one file at a time) in a 3rd party diff tool
    git difftool -d
  12. Ignore the white space
    git diff -w
  13. Only “add” some changes from a file
    git add -p
  14. Discover and zap those old branches
    git branch -a
  15. Stash only some files
    git stash -p
  16. Good commit messages
  17. Git Auto-completion
  18. Create aliases for your most frequently used commands
  19. Quickly find a commit that broke your feature (EXTRA AWESOME; see the sketch after this list)
    git bisect
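
As a quick taste of how a couple of these combine, here is a sketch of tips 18 and 19 together. The alias name and the test script are hypothetical; git bisect run drives the binary search automatically using the script’s exit code:

    git config --global alias.lg "log --oneline --graph"
    git lg
    git bisect start
    git bisect bad                  # current HEAD is broken
    git bisect good v1.0            # last release known to work
    git bisect run ./run-tests.sh   # git walks the history for you
    git bisect reset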

Source: 19 Tips For Everyday Git Use

 

6 Lesser Known Python Data Analysis Libraries

Python offers a great environment and a rich set of libraries to developers working with data. There are tons of useful libraries out there to help novice or experienced developers and analysts process or visualize datasets. Some of these libraries are really popular and used by millions of developers, for example Pandas, NumPy, Scikit-learn, and NLTK. Others are not so well known but have turned out to be handy in my experience. This article introduces six such Python libraries for working with data. Readers might already be familiar with some of them, but I hope this article still proves useful.

Source: 6 Lesser Known Python Data Analysis Libraries

 

Sorry, You Can’t Speed Read – The New York Times


Don’t be fooled by courses or digital technologies that promise otherwise.

OUR favorite Woody Allen joke is the one about taking a speed-reading course. “I read ‘War and Peace’ in 20 minutes,” he says. “It’s about Russia.”

The promise of speed reading — to absorb text several times faster than normal, without any significant loss of comprehension — can indeed seem too good to be true. Nonetheless, it has long been an aspiration for many readers, as well as the entrepreneurs seeking to serve them. And as the production rate for new reading matter has increased, and people read on a growing array of devices, the lure of speed reading has only grown stronger.

Source: Sorry, You Can’t Speed Read – The New York Times

 

AKT: Ancestry and Kinship Toolkit | bioRxiv

Motivation: Ancestry and Kinship Toolkit (AKT) is a statistical genetics tool for analysing large cohorts of whole-genome sequenced samples. It can rapidly detect related samples, characterise sample ancestry, detect IBD segments, calculate correlation between variants, check Mendel consistency and perform data clustering. AKT brings together the functionality of many state-of-the-art methods, with a focus on speed and a unified interface. We believe it will be an invaluable tool for the curation of large WGS data-sets.

Availability: The source code is available at https://illumina.github.io/akt
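
For a flavour of the interface, and going from memory of the AKT documentation (so treat the subcommand names as assumptions rather than verified usage), kinship and ancestry analyses are each a single command over a multi-sample BCF:

% akt kin cohort.bcf > kinship.txt
% akt pca cohort.bcf > projections.txt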

Source: AKT: Ancestry and Kinship Toolkit | bioRxiv