Last week we downloaded data from a fitness tracker (the motion coprocessor in an iPhone) and loaded it into R. Then, with just a few lines of R code, we decomposed the data into a weekly seasonal component and a trend. Today we are going to see how to plot the number of steps per hour for different days of the week. The same data will then be used to check how often there was any activity at a given time.
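The two summaries described here (steps per hour by day of the week, and how often there was any activity at a given time) can be sketched as a simple group-and-aggregate. The post itself works in R; the snippet below is only an illustrative Python sketch. The records are made up, and the names `records` and `summarise` are hypothetical and do not come from the post.

```python
from collections import defaultdict
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Made-up hourly records: (timestamp, steps counted in that hour).
# Real data would come from the tracker export.
records = [
    ("2015-12-28 08:00", 420),  # a Monday
    ("2015-12-28 09:00", 0),
    ("2016-01-04 08:00", 380),  # the following Monday
    ("2016-01-04 09:00", 150),
    ("2016-01-02 08:00", 0),    # a Saturday
    ("2016-01-02 09:00", 900),
]


def summarise(records):
    """Group records by (weekday, hour) and compute two summaries:
    the mean number of steps, and the share of observations with
    any activity at all (steps > 0)."""
    totals = defaultdict(int)   # sum of steps per (weekday, hour)
    active = defaultdict(int)   # observations with steps > 0
    counts = defaultdict(int)   # all observations
    for stamp, steps in records:
        ts = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        key = (WEEKDAYS[ts.weekday()], ts.hour)
        totals[key] += steps
        active[key] += 1 if steps > 0 else 0
        counts[key] += 1
    return {
        key: {
            "mean_steps": totals[key] / counts[key],
            "active_share": active[key] / counts[key],
        }
        for key in counts
    }


if __name__ == "__main__":
    for (day, hour), stats in sorted(summarise(records).items()):
        print(day, hour, stats)
```

Each `(weekday, hour)` cell then maps directly onto a point in a per-weekday line plot (mean steps) or a heat map (share of days with activity).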
Right Whale Recognition was a computer vision competition organized by NOAA Fisheries on the Kaggle.com data science platform. Our machine learning team at deepsense.ai finished 1st! In this post we describe our solution.
During the Christmas break I met my brother-in-law, who is an ultimate gadgeteer (an excellent trait for a brother). He told me that most iPhones have a built-in motion coprocessor and count steps by default. There is no need to turn anything on; it works all the time (assuming that the phone is with you).
The new version of the ggplot2 package (v 2.0.0) will be available on CRAN in a few days.
It has a very nice mechanism for adding new geoms and stats.
I’ve prepared a short console-based, data-driven R game named “The Proton Game” or “Hack the Proton” (I still cannot decide which name is better). The player’s goal is to act as a hacker and infiltrate Slawomir Pietraszko’s account on a Proton server. To do this, you have to solve four data-based puzzles.
Decision trees are cute.
They are easy to visualize, easy to explain, easy to apply and even easy to construct.
Unfortunately, they are quite unstable, particularly for large sets of correlated features.
Today we are going to illustrate some subtle differences among three statistical packages: R, SAS and SPSS. The differences are small, but sometimes even a very small difference may have large consequences, so it is worth knowing about them.
Two days ago Hadley Wickham tweeted a link to an introduction to his new package, multidplyr. Basically, it’s a tool for taking advantage of multiple cores in dplyr operations. Let’s see how to play with it.
When you execute an action on an RDD, Apache Spark runs a job that in turn triggers tasks using DAGScheduler and TaskScheduler, respectively. These are all low-level details, but they are often useful to understand when a simple transformation is no longer simple performance-wise and takes ages to complete.
The lives of brave firemen are threatened during dangerous emergency missions while they try to save other people and their property. In this post I would like to share my experiences and winning strategy for the AAIA’15 Data Mining Competition: Tagging Firefighter Activities at a Fire Scene, in which I took first place.