Visualize a year of bike rides with Kimono and CartoDB

Tutorial: Map your own location data

Data is more accessible, tangible and interesting when you can visualize and interact with the figures on a page. That’s why we love teaming up with our friends over at CartoDB! Kimono is a smart web scraper that lets you turn data on a website into an API – a structured feed of updating data. CartoDB lets you take that data set and create beautiful interactive maps. In this post, we will use kimono to get over a year’s worth of bike trip data from New York City’s Citibike bike sharing program. We’ll then use CartoDB to plot our friend Andrew’s movements on a map. A big thanks to Andrew for riding his bike a lot and sharing his data with us!

Here’s all that you’ll need to build your own data-driven map:

  • A kimono account (it’s free) and the kimono Chrome extension
  • A CartoDB account (it’s also free)
  • A website with location data – we’ll use data from NYC’s Citibike bike sharing program, but you can capture anything you like (e.g., Uber, Lyft, public transportation routes)
  • 10 minutes

CREATE AN API

Navigate to the website with the data you want to map. For this example we are using the trip history from Andrew’s Citibike account.

Click on the kimono Chrome extension and the kimono toolbar will appear on top of the webpage.

Notice the flashing lock icon on the toolbar. This indicates that the page requires you to log in. Click on the lock icon.

Kimono will then direct you to the site’s login page (if you need to navigate further to get there, click the navigation icon and go to the appropriate page). Once at the login page, you must identify the username, password and submit fields – this teaches kimono how to log in to this site. Do this by clicking the username, password and submit icons on the toolbar and then clicking on the matching field on the webpage.

Click done and then enter your login credentials. Kimono securely stores the credentials so that it can access your data automatically on the schedule you specify.

EXTRACT DATA

Once you’ve completed the login cycle, you will see the original page with your data. Here we want to grab four types of data – the start station, start time, end station and end time. To do this, click on one of the start stations. Kimono will suggest other start stations to you. Click the check mark to accept all the start stations into your first data property. The number in the yellow circle will increase to reflect the number of data points in that property. Now click the grey plus to add a new property and repeat this process with start times, end stations and end times. You can preview the structured data that kimono will extract in the data preview pane and make more granular adjustments in the data model view.

Click DONE to create an API. Select daily to make sure your data is refreshed every day. Once it’s done, click the link to check out your new API.
(Pro-tip: if you are having trouble selecting just the data you want, try clicking and dragging to select just the part you want and kimono will strip off extraneous text.)

CONFIGURE YOUR API

You can view the first page of data extracted on the API detail page for your new API.

The API we just set up only extracts from one page of rides. If you have several pages of data, you’ll need to configure your API to extract from multiple pages as well. To do this, go to the crawl setup tab and click on ‘crawl strategy’ to specify the type of crawl you want to do; in this case, select ‘generated URL list’. The URL generator will then appear on the lower right.

Kimono has broken your source URL into its relevant sub-components. For us, the number after ‘trips’ specifies the page you’re on. To the right of the number parameter, click ‘range’ and specify 1 to 20 (instead of 20 use whatever number corresponds to the total number of pages of data that you have).
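
The range setting amounts to stamping out one URL per page number. Here is a minimal sketch of that idea, assuming the page number is the last path segment after ‘trips’. The base URL and the function name generatePageUrls are placeholders for illustration, not the real Citibike address or anything kimono exposes:

```javascript
// Build one URL per page, mirroring the 'range' option in the URL generator.
// ASSUMPTION: the page number is appended as the last path segment.
function generatePageUrls(baseUrl, firstPage, lastPage) {
  var urls = [];
  for (var page = firstPage; page <= lastPage; page++) {
    urls.push(baseUrl + '/' + page);
  }
  return urls;
}

// Placeholder base URL; swap in the address shown in your own URL generator.
var pageUrls = generatePageUrls('https://example.com/member/trips', 1, 20);
console.log(pageUrls.length); // one entry per page in the range
console.log(pageUrls[0]);
```

If your history has more or fewer pages, only the last argument changes, just as in the range field.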

You will see a list of URLs generated by kimono below. Click save changes, then hit ‘start crawl’ above. Once the crawl completes, go back to the data preview tab and download the CSV. Open it up in Excel and remove the top row (the row that says ‘collection1’) to get it formatted for use with CartoDB.
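
If you would rather not open a spreadsheet, the same cleanup can be sketched in a few lines of script. This is only an illustration: the function name stripCollectionRow is made up, and the exact formatting of the label row (quotes, trailing commas) is an assumption, so the comparison is deliberately forgiving:

```javascript
// Drop the first line of CSV text when it holds only the collection label.
function stripCollectionRow(csvText, label) {
  var lines = csvText.split(/\r?\n/);
  // ASSUMPTION: the label row may carry quotes, commas or spaces around the
  // name, so strip those characters before comparing.
  var first = lines[0].replace(/["',\s]/g, '');
  if (first === label) {
    lines.shift();
  }
  // Lines are rejoined with '\n', so Windows line endings are normalized.
  return lines.join('\n');
}

// Made-up sample rows, only to show the shape of the input and output.
var raw = 'collection1\nproperty2,property3\nStation A,some start time';
console.log(stripCollectionRow(raw, 'collection1'));
```

A file that does not start with the label row is returned with its first line intact.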

Now that we have our structured data set, let’s start mapping our route data.

GEO-CODE YOUR DATA

Log in to your CartoDB account and select ‘tables’. Click the large plus on the right to add a new table.

Choose ‘data file’ and select the kimono CSV file that we just downloaded. Once the data is loaded into CartoDB, click the drop-down next to the property with your start station data. Select ‘georeference’ to translate the station names into coordinates, i.e. latitude/longitude pairs.

Choose to georeference ‘by street address’, specify the city and country, and hit continue. You’ll see a new column appear with latitude and longitude data for each station. We’re almost done.

MAP THE RESULTS

At the top, click on ‘map view’ and click the wizard/wand icon on the right.

To create an animated map with categories, select ‘Torque Cat’, then use the drop-down menus to set ‘time column’ to your start time property. Next, set the category column to the end station and use the fields below it to map colors to end stations (by region, for example).

Ta-da! You’re done! You just built an awesome animated map.

But suppose you wanted to calculate a few more interesting things and plot the output? With kimono’s filter functions you can do just that. Filter functions allow you to write JavaScript functions that operate on the data returned by your API; with filter functions enabled, your APIs will return the processed output. We’re beta testing this feature right now, so just email us at support@kimonolabs.com and we’ll give you early access. Once we’ve enabled filter functions for you, you can access them from the ‘advanced’ tab of your API.

For example, we wrote a frequency function to count the number of times each station appears in the dataset in total, and how many times during the day and during the night. That lets us create a heatmap of where Andrew spends the most time, and how that changes by time of day.

Copy in our frequency function below, to start:

function transform(data, callback) {
  var collection = data.results.collection1; //shortcut for our collection
  var totalTimes ={}; //object for histogram for all times. Key is a string of station name.

  //helper function to return if is during day or night...7am up to (but not including) 7pm = day.
  //assumes a Date-able string as input; an unparseable string is counted as night
  var dayOrNight = function(date){
    var hour = new Date(date).getHours();
    if (hour >= 7 && hour < 19){
      return 'day';
    }
    else{
      return 'night';
    }
  };


  //helps populate totalTimes with key and value pair of station and total, day, night counts
  var addToHistograms = function(station, date){

    //initialize the counters the first time a station is seen
    if(!totalTimes.hasOwnProperty(station)){
      totalTimes[station] = {'station': station, 'total' : 0, 'day' : 0, 'night' : 0 };
    }

    //count this visit in the total and in the matching day/night bucket
    totalTimes[station].total += 1;
    totalTimes[station][dayOrNight(date)] += 1;
  };

  //iterate through property2 (start station) and property3 (start time) of collection1 and add them to totalTimes
  for (var i = 0; i < collection.length; i++){
    var station = collection[i].property2;
    var date = collection[i].property3;
    addToHistograms(station, date);
  }

  //do the same for property4 (end station) and property5 (end time)
  for (var j = 0; j < collection.length; j++){
    station = collection[j].property4;
    date = collection[j].property5;
    addToHistograms(station, date);
  }

  //delete old data
  collection.splice(0, collection.length);

  //pop off totalTimes by just the value, and add to the collection array (for csv formatting purposes)
  for (var key in totalTimes) {
    if (totalTimes.hasOwnProperty(key)) {
      collection.push(totalTimes[key]);
    }
  }


  callback(null, data);
}

Using CartoDB’s bubble plot setting, we can quickly turn the function’s output into a heatmap of where Andrew spends his time.
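
Before pasting a filter function into the ‘advanced’ tab, it can help to run it against a small hand-made payload. The sketch below is one way to do that locally. Only the (data, callback) signature, the data.results.collection1 shape and the property2 to property5 names are taken from the function above; the sample rows, the stand-in filter countStarts and the runFilter helper are all made up for illustration, and runFilter assumes the filter calls back synchronously:

```javascript
// Hypothetical sample payload shaped like the input the frequency function reads.
var sampleData = {
  results: {
    collection1: [
      { property2: 'Station A', property3: 'start 1', property4: 'Station B', property5: 'end 1' },
      { property2: 'Station A', property3: 'start 2', property4: 'Station C', property5: 'end 2' },
      { property2: 'Station B', property3: 'start 3', property4: 'Station A', property5: 'end 3' }
    ]
  }
};

// Stand-in filter with the same (data, callback) signature: replaces the rows
// with one { station, starts } row per start station.
function countStarts(data, callback) {
  var counts = Object.create(null); // no inherited keys to collide with names
  data.results.collection1.forEach(function (row) {
    counts[row.property2] = (counts[row.property2] || 0) + 1;
  });
  data.results.collection1 = Object.keys(counts).map(function (station) {
    return { station: station, starts: counts[station] };
  });
  callback(null, data);
}

// Local harness (not part of kimono): runs a filter on a deep copy so the
// sample payload is left untouched, and returns what the filter calls back with.
function runFilter(filterFn, data) {
  var copy = JSON.parse(JSON.stringify(data));
  var output;
  filterFn(copy, function (err, result) {
    if (err) { throw err; }
    output = result;
  });
  return output; // stays undefined if the filter calls back asynchronously
}

var filtered = runFilter(countStarts, sampleData);
console.log(filtered.results.collection1);
```

Passing the frequency function in place of countStarts works the same way, as long as the sample rows carry date strings that new Date() can parse.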

That’s just a quick preview of the powerful maps you can build with kimono and CartoDB. We’re excited to see what you will build with these tools. Tell us what you create at contribute@kimonolabs.com and reach out to us at support@kimonolabs.com if you get stuck.
