Introducing Exploratory Desktop — UI for R

Kan Nishida
Published in learn data science
7 min read · May 6, 2016

dplyr is amazing. I fell in love with it the first time I encountered it: each command's interface was simple and beautiful, its use of the 'pipe' made the data analysis pipeline readable for anybody, and the functionality it provided was already comprehensive and practical for real use cases, especially when combined with tidyr. On top of that, the performance was blazing fast. What else did I need?

Data transformation (or data wrangling), which I was not a big fan of before, has become the most fun part of my data analysis workflow since I moved to the 'Hadleyverse'. There, most end-to-end data analysis flows can be accomplished more effectively and with less effort using a series of tools, including dplyr, provided by Hadley Wickham and his collaborators. This is why I have been talking about dplyr and other functions from the 'Hadleyverse' on this blog. I wanted you to have this transformational experience (no pun intended!) and to be aware that there is this amazing set of tools that anybody can use to work with any data so easily.

We are still living in a world where Excel is considered the most popular analysis tool, and anything beyond Excel is assumed to require someone who can program. But I strongly believe that anybody who uses Excel should be able to do what was once thought to be possible only for data scientists and engineers, once they know dplyr or the other tools from the 'Hadleyverse'.

Sure, sounds good, but maybe next weekend or next year

But… unless you are already an R user today, all the great things about dplyr or the 'Hadleyverse' are just nice-sounding news from a different universe. You might not be sure how to set up your R environment in the first place. You might be wondering which R packages you need and how to install them. And you might not be sure how to bring your data into R. While importing popular data types like CSV files can be very simple, as I have shown before in this post, there are many other types of data, such as data in JSON format, data sitting inside Web pages, or data you need to extract through REST APIs like the one for GitHub.

It's one of those things: "Sounds cool, maybe I should take some time to get my hands on it, but probably this weekend, next month, or maybe next year…!" But just reading or hearing about how great dplyr is will not make you fully realize the awesomeness of the 'Hadleyverse' or make you better at data analysis. You have to experience it by using it with your own data.

Here comes Exploratory Desktop

So we have built an app called 'Exploratory Desktop' to make it easy to experience the 'Hadleyverse' quickly. It is a sort of UI for dplyr, but the main interaction between you and your data is still done through dplyr or tidyr commands (functions). We have built some features around the command line interface to make your overall data analysis more visually intuitive, more rapidly interactive, and, most importantly, more fun.

Let me introduce some of those features quickly.

dplyr / tidyr command builder

You might not even know which commands or functions of the 'Hadleyverse' to start with or how to use them. With Exploratory, you can simply select one of the data wrangling operations from a column header menu. This automatically generates the dplyr or tidyr command for you. You can run this command as is or customize it the way you want. If you are already familiar with dplyr / tidyr commands, you don't need the menu at all. You can simply start typing in the command line input area and start interacting with your data right away!
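To give a feel for the kind of one-line command such a menu might produce, here is a minimal sketch in plain R. It is not taken from Exploratory itself: the `sales` data frame and its `month` and `revenue` columns are made up for illustration, and the choice of tidyr's `separate()` is only an assumption about a typical column-level operation.

```r
library(dplyr)
library(tidyr)

# Hypothetical data: a text column that packs year and month together.
sales <- data.frame(
  month   = c("2016-04", "2016-05"),
  revenue = c(120, 150),
  stringsAsFactors = FALSE
)

# A typical column-level wrangling step: split one column into two.
# The new columns are kept as text unless you ask for conversion.
split_sales <- sales %>%
  separate(month, into = c("year", "mon"), sep = "-")

print(split_sales)
```

Because the generated step is ordinary R, you could edit the arguments by hand, for example changing `sep` or the names passed to `into`.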

Context Aware Suggestion List for Hadleyverse and R functions

You might not know the syntax of each function or its arguments, and looking them up in the reference docs or a user guide is cumbersome. So we have built a context-aware suggestion list and a syntax help window. As you type in the command line input area, Exploratory suggests possible columns, functions, function arguments, or operators based on where you are inside the function.

Data Analysis Pipeline — Readability and Reproducibility

Instead of manually typing the pipe operator '%>%', you can add a new step by clicking the + (Plus) button and selecting one of the dplyr / tidyr operations from the list, or simply start typing. Each data wrangling operation is added to the list of steps on the right-hand side. All the steps are chained together behind the scenes, and the data dependencies between them are managed for you. So you can update any step at any time, or even insert a new step in the middle, and you can (and should!) expect the data at every later step to stay consistent because it is updated automatically.

Besides managing your data analysis pipeline inside Exploratory Desktop, you can also export it as an R script (a set of dplyr commands chained with pipes) so that you can run it outside of Exploratory.
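As a rough idea of what a script of dplyr commands chained with pipes looks like, here is a small sketch. It is not actual Exploratory output; it uses the `mtcars` data set that ships with R as a stand-in for your own data, and the `kpl` column and the conversion factor are chosen only for illustration.

```r
library(dplyr)

# Each line after the first is one "step" in the pipeline.
result <- mtcars %>%
  filter(cyl %in% c(4, 6)) %>%             # keep 4- and 6-cylinder cars
  mutate(kpl = mpg * 0.4251) %>%           # miles per gallon -> km per liter (approx.)
  group_by(cyl) %>%                        # one group per cylinder count
  summarize(avg_kpl = mean(kpl), cars = n()) %>%
  arrange(desc(avg_kpl))                   # most efficient group first

print(result)
```

Reading it top to bottom tells you what happens to the data in order, which is what makes a pipeline like this easy to review, edit in the middle, and rerun.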

Visualization at every step

Visualization is an essential part of any data analysis flow. You might want to inspect the data or understand the result of each command quickly and intuitively. Exploratory provides two types of visualization. One is the Summary view, where you can see a quick summary of the data every time you run a command.

The other is the Chart view, where you can interactively visualize your data to find patterns, trends, or outliers.

Data Access and Import

Even with the amazing data wrangling grammar of dplyr and tidyr, you still can't have a truly interesting experience without being able to bring in the data of your choice. So we have built a set of out-of-the-box data import dialogs that help you import data quickly. For local files, we support text files (with readr), Excel files (with readxl), JSON files (with jsonlite), and statistical software files (with haven). Exploratory also supports scraping table data from web pages, extracting data from cloud apps like Google Analytics, Google Spreadsheet, GitHub, etc., and importing data from databases like MySQL, MongoDB, etc.
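For reference, this is roughly what importing with these packages looks like in plain R. It is only a sketch and not what the import dialogs run internally: the sample file is written to a temporary location so the example is self-contained, and the contents and the commented-out file names are made up.

```r
library(readr)
library(jsonlite)

# Write a tiny CSV to a temporary file so there is something to read.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("name,stars", "dplyr,10", "tidyr,7"), csv_path)

# readr: text files such as CSV
from_csv <- read_csv(csv_path)

# jsonlite: an array of JSON objects becomes a data frame
json_text <- '[{"name":"dplyr","stars":10},{"name":"tidyr","stars":7}]'
from_json <- fromJSON(json_text)

# The other packages follow the same one-call pattern
# (hypothetical file names, shown as comments only):
#   readxl::read_excel("report.xlsx")
#   haven::read_sav("survey.sav")

print(from_csv)
print(from_json)
```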

Desktop app that is ready to start dplyr right away

There are many more features, which I will cover in detail in separate posts. But one of the most important things is that Exploratory Desktop is literally a desktop app, which means your data stays on your PC. You don't need to worry about your private or company data being exposed somewhere you don't know about.

You can download Exploratory Desktop to your PC and just double-click it to start the app. The initial setup automatically installs R and the R packages from the 'Hadleyverse' and elsewhere, if you don't have them yet.

Start experiencing Hadleyverse today!

We think Exploratory Desktop is a great way to learn dplyr and ‘Hadleyverse’ quickly by experiencing it. We have been using it for ourselves to work with a diverse set of data. Not only are we becoming better at data analysis but also we are simply having a lot of fun working with such data. We hope you will too!

Beta invitation is open

This is still our beta version and we're just getting started, so you might find some places that feel unpolished. We would love to hear what you think of it. You can sign up for beta access on this page.

Start your journey of ‘Hadleyverse’ now!

CEO / Founder at Exploratory (https://exploratory.io/). Having fun analyzing interesting data and learning something new every day.