The distribute by and cluster by clauses are really cool features in SparkSQL. Unfortunately, this subject remains relatively unknown to most users, and this post aims to change that.
Tomasz Kulakowski, co-founder and CEO of CodiLime, the sole investor in the Big Data science company deepsense.io, is the winner of Polish Business Roundtable’s 2016 Vision and Innovation Award. PRB’s Jan Wejchert Awards are the most prestigious prizes in the Polish business community.
A few days ago we released Seahorse 1.1, an enhanced version of our machine learning, Big Data manipulation and visualization product. Today, we will show you how the new version of Seahorse can be used for data mining and data visualization.
The latest version of Seahorse, deepsense.io’s flagship Big Data product, adds new features and an improved UI.
deepsense.io experts to present Big Data and deep learning accomplishments at Silicon Valley and Dublin conferences.
Seahorse provides users with reports on their data at every step in the workflow. A user can view reports after each operation to review the intermediate results. In our reports we provide users with distributions for columns in the form of a histogram for continuous data, and a pie chart for categorical data.
deepsense.io tops global competition for predicting dangerous seismic events in active coal mines.
A few days ago we released Seahorse 1.0, a visual platform for machine learning and Big Data manipulation, available to everyone for free! Today, we show you how to use Seahorse to solve a simple classification problem.
New product version and corporate workshop series target the world’s premier big data gathering of Apache Spark professionals.
In Seahorse we want to provide our users with accurate distributions for their categorical data. Categorical data can be thought of as the result of an observation that can take one of K possible outcomes. Some examples: nationality, marital status, gender, type of education.
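To make the idea of a categorical distribution concrete, here is a minimal sketch in plain Python. It is not how Seahorse computes its reports; the function name `categorical_distribution` and the sample values are assumptions made purely for illustration. It counts how often each of the K outcomes occurs and turns the counts into relative frequencies, which is the information a pie chart would display.

```python
from collections import Counter


def categorical_distribution(values):
    """Return {category: relative frequency} for a sequence of labels.

    Hypothetical helper for illustration only, not part of Seahorse.
    """
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        # No observations, so there is no distribution to report.
        return {}
    return {category: count / total for category, count in counts.items()}


# Made-up sample of a "marital status" column.
marital_status = ["single", "married", "married", "divorced", "married", "single"]
distribution = categorical_distribution(marital_status)

# Print categories from most to least frequent.
for category, share in sorted(distribution.items(), key=lambda kv: -kv[1]):
    print(f"{category}: {share:.2f}")
```

On a real Big Data set the counting step would run on a distributed engine rather than in a single process, but the resulting distribution has the same shape: one share per category, summing to 1.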