Spark + R = SparkR

Spark is winning more and more hearts. And no wonder: reports from various sources describe a significant speedup (by an order of magnitude) in the analysis of big datasets. Its well-developed system for caching objects in memory lets us avoid hammering hard disks during iterative operations performed on the same data.

Pretty heat maps

Do you know where Kamil Stoch earns most of his points in the 2013/2014 season? Some time ago I came across pheatmap, an R package that generates much nicer heat maps than the standard heatmap() function. This is why the package is named…

Is a simple linear regression able to knock spots off SVM and Random Forest?

A friend of mine took part in a project in which he had to predict future values of a characteristic Y. The problem was that Y showed an increasing trend over time. For the purposes of this post, let us assume that Y was energy demand, the milk yield of cows, or any other characteristic with a positive trend over time.