Data

    The Map is not the Territory

    The map is not the territory; seek ground truth whenever possible to accelerate learning.

    uberHOP is a small example from my own experience. The product was a point-to-point service (a.k.a. UV Express) that Uber launched in Manila, along with Seattle and Toronto.

    The way it worked was simple: you would request a specific route during peak hours, and we would batch you in with up to 6 people to take a high-occupancy vehicle along that route.

    uberHOP needed high occupancy to become profitable

    The pricing was at a 70% discount to uberX (the traditional ride product), and drivers were guaranteed earnings, so there was a minimum average occupancy needed to hit profitability. To get to that high occupancy, we needed to ensure that the routes selected were of high quality.
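    To make the occupancy constraint concrete, here is a minimal sketch of the break-even arithmetic. The fare and guarantee figures are hypothetical placeholders, not actual uberHOP numbers, and the model assumes rider fares are the only revenue and the driver guarantee is the only cost.

```python
def breakeven_occupancy(uberx_fare, discount, guaranteed_payout_per_trip):
    """Average riders per trip needed for fares to cover the driver guarantee.

    Simplifying assumption: rider fares are the only revenue and the
    guaranteed payout is the only cost (no commission, incentives, or overhead).
    """
    hop_fare = uberx_fare * (1.0 - discount)  # discounted fare paid by each rider
    if hop_fare <= 0:
        raise ValueError("discounted fare must be positive")
    return guaranteed_payout_per_trip / hop_fare


# Hypothetical inputs, chosen only for illustration.
uberx_fare = 200.0         # assumed uberX fare for the same route
discount = 0.70            # the 70% discount mentioned above
guaranteed_payout = 240.0  # assumed guaranteed driver earnings per trip
max_riders = 6             # riders were batched in groups of up to 6

occupancy_needed = breakeven_occupancy(uberx_fare, discount, guaranteed_payout)
route_can_break_even = occupancy_needed <= max_riders

print(f"Break-even occupancy: {occupancy_needed:.1f} riders per trip")
print(f"Achievable within vehicle capacity: {route_can_break_even}")
```

    With a discount this deep, each rider contributes only a fraction of an uberX fare, so small changes in the guarantee move the required occupancy a lot.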

    A slide showing a news article and the interface for uberHOP.

    Initial approach: Clustering!

    My first instinct as a data person was clustering. We needed to find pairs of longitude and latitude that had enough pickup and dropoff density in them to have a decent chance of becoming profitable.
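    As a rough illustration of the idea, the sketch below snaps pickups and dropoffs to a coarse grid and counts trips per origin-destination cell pair. This is simple grid binning used as a stand-in for clustering, not the algorithm we actually ran, and the coordinates, cell size, and threshold are all made up for the example.

```python
from collections import Counter


def cell(lat, lon, size_deg=0.005):
    """Snap a coordinate to a grid cell index (0.005 degrees is roughly half a kilometre)."""
    return (int(lat // size_deg), int(lon // size_deg))


def dense_routes(trips, min_trips, size_deg=0.005):
    """Return (pickup_cell, dropoff_cell) pairs with at least min_trips trips.

    trips is an iterable of (pickup_lat, pickup_lon, dropoff_lat, dropoff_lon).
    Pairs that start and end in the same cell are ignored.
    """
    counts = Counter(
        (cell(p_lat, p_lon, size_deg), cell(d_lat, d_lon, size_deg))
        for p_lat, p_lon, d_lat, d_lon in trips
    )
    return [
        (pair, n)
        for pair, n in counts.most_common()
        if n >= min_trips and pair[0] != pair[1]
    ]


# Hypothetical trips: three share a pickup area and a dropoff area, one does not.
trips = [
    (14.5902, 121.0563, 14.5548, 121.0244),
    (14.5911, 121.0571, 14.5531, 121.0236),
    (14.5923, 121.0580, 14.5512, 121.0221),
    (14.6512, 121.0312, 14.5548, 121.0244),
]

candidate_routes = dense_routes(trips, min_trips=3)
for (pickup_cell, dropoff_cell), n in candidate_routes:
    print(pickup_cell, "->", dropoff_cell, "trips:", n)
```

    A count like this only says where demand shows up in the data. It says nothing about what the pickup point is like on the ground.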

    The launch routes were selected using this method, but we had limited success: even after the novelty period, cancellation rates remained high. I tried different algorithms, distance metrics, map features, and dispatch radii, all for very incremental gains.

    We used clustering to select the initial routes, but the results were not as expected.

    Seeking ground truth

    What did help was to actually seek ground truth, and the solution was embarrassingly obvious.

    When we physically went to the most successful route’s pickup point, we found two key factors: (a) high-density residential buildings (as opposed to commercial ones), and (b) a driveway, so drivers weren’t a moving target.

    SM Light Residences was a great pickup that embodied all the factors that were required for a good pickup

    We were able to turn the product profitable in a few weeks! This was easy to do because I was physically located in the market. However, this is a perennial challenge for distributed teams, so it’s even more important to consciously seek ground truth in those situations.

    Here’s an abridged version in Twitter thread form:

    Harnessing the Power of Data

    I prepared and presented this talk with the data team at First Circle at Philippine Startup Week 2020. We focused on the practical aspects of creating a data function within a startup, from Data Strategy to Data Engineering and Data Analytics. We intentionally skipped Data Science.

    Here are the slides for reference:

    Dealing with model uncertainty in data products

    Uncertainty can stall the development of data products, particularly in areas where domain experts don’t necessarily understand the end goal.

    Data Science for Operational Impact

    Data science for operational impact is quite an interesting field of study that has more intersections with social sciences and requires more organizational savvy.

    Much of the data science discourse at the time was geared towards what I call Product Data Science, where the goal was to build highly scalable machine learning systems that solve a general problem (think Uber’s surge algorithm). However, I think an equally interesting area is what I’d call Operational Data Science, where there is significantly more iteration and you work with domain experts (local marketers, country managers) to solve problems with a “human-in-the-loop”.

    Thinking about data science this way has enabled me to overcome many adoption barriers in my past work.

    I presented this at SGInnovate, a data science incubator bootcamp in Singapore, while I was still at Uber.

    Exploring the world of Philippine online news

    The Databeers Manila crowd at the Google Philippines office.

    Last week, I had the chance to speak at Databeers Manila, which is a data science event infused with the power of good ol' beer. The event was held in the Google Philippines office, and beer was care of Katipunan Craft Ales.

    On Data Visualization Design

    Last year, I was invited to speak at World Information Architecture Day 2016 in Manila, held at A-Space, Makati. I gave an introduction to data use cases and covered how to think about designing data visualizations. I also explained why I think designing the right ways for people to consume insights from data is just as important as our ability to process or analyze that data.

    The video is up for people to watch. Please let me know your thoughts in the comments section.

    If you’d like to view the slides more clearly, I’ve uploaded them to SlideShare:

    On Why the Hero Generation is an Informed One

    Businesses and governments in the Philippines should adopt a data mindset, where intuition and personal experience are always backed up by data and an effort is always made to see the entire picture, in order to realize the benefits of entering the demographic window.
