eXtreme Gradient Boosting vs Random Forest [and the caret package for R]
/in Data science /by Przemyslaw Biecek
Decision trees are cute.
It is easy to visualize them, easy to explain, easy to apply and even easy to construct.
Unfortunately they are quite unstable, particularly for large sets of correlated features.
R vs SAS vs SPSS
/in Data science /by Przemyslaw Biecek
Today we are going to illustrate some subtle differences among three statistical packages: R, SAS and SPSS. The differences are small, but sometimes even a very small difference may have large consequences, so they are worth knowing.
multidplyr: first impressions
/in Data science /by Przemyslaw Biecek
Two days ago Hadley Wickham tweeted a link to an introduction to his new package, multidplyr. Basically, it's a tool for taking advantage of many cores in dplyr operations. Let's see how to play with it.
Understanding Apache Spark’s Execution Model Using SparkListeners
/in Big data & Spark /by Jacek Laskowski
When you execute an action on an RDD, Apache Spark runs a job that in turn triggers tasks using DAGScheduler and TaskScheduler, respectively. These are low-level details, but they are often useful to understand when a simple transformation is no longer simple performance-wise and takes ages to complete.
Machine Learning for Greater Fire Scene Safety
/in Data science, Machine learning /by Jan Lasek
The lives of brave firemen are threatened during dangerous emergency missions while they try to save other people and their property. In this post I would like to share my experiences and winning strategy for the AAIA'15 Data Mining Competition: Tagging Firefighter Activities at a Fire Scene, in which I took first place.
Data mining of the votes of Members of Parliament
/in Data science /by Przemyslaw Biecek
The 7th term of the Sejm has come to an end. It would be nice to see how the Members of the Polish Parliament have voted over these last 4 years! In total they took part in over 6000 votes. Did representatives of the same clubs vote more similarly to each other? Did Members of Parliament who changed clubs vote differently from the members of their former clubs? Let's see!
Do cats or dogs live longer?
/in Data science /by Przemyslaw Biecek
Some time ago our herd grew by one guinea pig, called Hugo. It turns out that having a pet at home is a great pretext for discussing with children the concepts of randomness, distribution functions and distributions in general.
Statistician like a shoemaker
/in Data science /by Przemyslaw Biecek
Children bring strange homework assignments from school, such as the question: What is your dad's job similar to? After several guesses (a cosmonaut, a Formula 1 driver, a firefighter) it turns out that the work of a statistician is very similar to the work of a shoemaker. Why?
Multilevel classification, Cohen kappa and Krippendorff alpha
/in Data science /by Przemyslaw Biecek
I faced an interesting problem last week. Playing with data from The Cancer Genome Atlas (full genetic and clinical data for thousands of patients), I was building a classifier that predicts the type of cancer based on sets of genetic signatures.
Biplots, correspondence analysis and ggplot2
/in Data science /by Przemyslaw Biecek
I was looking for biplots created with the ggplot2 library (because they look good and are customisable).
It turns out that there are some nice solutions for PCA (like sinhrks/ggfortify; kassambara/factoextra; vqv/ggbiplot; fawda123/ggord), but I could not find a suitable solution for correspondence analysis.
So I created one…