An average data scientist deals with loads of data daily. Some say that 60-70% of a data scientist's time is spent on cleaning and munging data and bringing it into a format suitable for machine learning models. This post focuses on the second part, i.e., applying machine learning models, including the preprocessing steps. The pipelines discussed in this post are the result of over a hundred machine learning competitions that I’ve taken part in. Note that the discussion here is very general but very useful; far more complicated methods also exist and are practised by professionals.

We will be using Python!

Source: Approaching (Almost) Any Machine Learning Problem | Abhishek Thakur | No Free Hunch
