Thoughts

The Map is not the Territory

The map is not the territory; seek ground truth whenever possible to accelerate learning.

uberHOP is a small example from my experience. The product was a point-to-point (a.k.a. UV Express) service Uber launched in Manila, along with Seattle and Toronto.

The way it worked was simple: you would make a request to take a specific route during peak hours, and we would batch you in with up to 6 people to take a high occupancy vehicle along the route.

uberHOP needed high occupancy to become profitable

The pricing was at a 70% discount to uberX (the traditional ride product), and drivers were guaranteed earnings, so there was a minimum average occupancy needed to hit profitability. To get to that high occupancy, we needed to ensure that the routes selected were of high quality.
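As a rough illustration of that constraint, the break-even point can be sketched in a few lines of Python. This is a simplified model, not Uber's actual economics: it assumes revenue per trip is just riders times the discounted fare, and that the only cost is the driver's guaranteed payout. The fare and payout figures are hypothetical placeholders; only the 70% discount and the cap of 6 riders come from the text above.

```python
MAX_SEATS = 6  # "up to 6 people" per vehicle, from the text


def break_even_occupancy(uberx_fare, discount, guaranteed_payout):
    """Average riders per trip needed for fares to cover the driver guarantee.

    Simplifying assumption: every rider pays the same discounted fare and
    there are no other costs or revenues.
    """
    fare_per_rider = uberx_fare * (1 - discount)
    return guaranteed_payout / fare_per_rider


# Hypothetical inputs, in any one currency: an equivalent uberX fare of 200
# and a guaranteed driver payout of 300 per trip.
needed = break_even_occupancy(uberx_fare=200.0, discount=0.70, guaranteed_payout=300.0)
is_viable = needed <= MAX_SEATS

print(f"Break-even average occupancy: {needed:.2f} of {MAX_SEATS} seats")
print(f"Route can break even at all: {is_viable}")
```

With these made-up inputs the break-even average is about 5 riders out of 6 seats, which shows how little room a deep discount leaves for empty seats.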

A slide showing a news article and the interface for uberHOP.

Initial approach: Clustering!

My first instinct as a data person was [[clustering]]. We needed to find pairs of longitude and latitude that had enough pickup and dropoff density in them to have a decent chance of becoming profitable.
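To make the density idea concrete, here is a minimal sketch that ranks pickup and dropoff pairs by trip count. It uses plain grid binning rather than the clustering algorithms described in this post, and the coordinates, cell size, and threshold are all made-up assumptions for illustration.

```python
import math
from collections import Counter


def to_cell(lat, lon, cell_deg=0.005):
    """Snap a coordinate to a grid cell index (roughly 500 m cells near Manila)."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def rank_routes(trips, cell_deg=0.005, min_trips=2):
    """Count trips per (pickup cell, dropoff cell) pair, densest first."""
    counts = Counter(
        (to_cell(p_lat, p_lon, cell_deg), to_cell(d_lat, d_lon, cell_deg))
        for (p_lat, p_lon, d_lat, d_lon) in trips
    )
    return [(route, n) for route, n in counts.most_common() if n >= min_trips]


# Hypothetical trips: (pickup_lat, pickup_lon, dropoff_lat, dropoff_lon).
trips = [
    (14.5547, 121.0244, 14.5176, 121.0509),
    (14.5549, 121.0246, 14.5178, 121.0507),
    (14.5546, 121.0243, 14.5179, 121.0508),
    (14.6091, 121.0223, 14.5995, 120.9842),
]

ranked = rank_routes(trips)
for (pickup_cell, dropoff_cell), n in ranked:
    print(pickup_cell, "->", dropoff_cell, "trips:", n)
```

A ranking like this only says where requests concentrate on the map; it knows nothing about what the pickup point is like on the ground.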

The launch routes were selected using this method, but we had limited success: even after a novelty period, cancellation rates remained high. I tried different algorithms, distance metrics, map features, and dispatch radii, all for only incremental gains.

We used clustering to find initial approaches, but the results were not as expected.

Seeking ground truth

What did help was to actually seek ground truth, and the solution was embarrassingly obvious.

When we physically went to the most successful route’s pickup point, the two key factors were: (a) high-density residential buildings (as opposed to commercial), and (b) a driveway so drivers weren’t a moving target.

SM Light Residences was a great pickup that embodied all the factors that were required for a good pickup

We were able to turn the product profitable in a few weeks! This was easy to do because I was physically located in the market. However, this is a perennial challenge for distributed teams, so it’s even more important to consciously seek ground truth in those situations.

Here’s an abridged version in Twitter thread form:

Talks

Data analytics in emerging markets

I was invited by the Civil Service Commission of the Philippines to provide a short talk on data analytics in emerging markets.

I wanted to provide an organization with a rough blueprint for building a sustainable data practice in an emerging market context, including how to:

  • set a Data Strategy
  • build Decision Systems
  • achieve Data Adoption

The slides are here:


Talks

Harnessing the Power of Data

Prepared and presented this talk along with the data team at First Circle at Philippine Startup Week 2020. It focused on the practical aspects of creating a data function within a startup, from Data Strategy to Data Engineering and Data Analytics. We intentionally skipped Data Science.

Here are the slides for reference:

Articles

A data team's product is decisions

Photo by Franki Chamaki (Unsplash)

Treating decisions as your data team’s main product, as opposed to reports, models, or engineered systems, is a great way of communicating the value of the team internally and externally.


Reviews

Book Review: Freakonomics: A Rogue Economist Explores the Hidden Side of Everything

“Morality, it could be argued, represents the way that people would like the world to work—whereas economics represents how it actually does work.” (Freakonomics)

This book has a special place in my life. It sparked my interest in quantitative social science which led to a career in data.

It’s a very light read that goes through various anecdotes, generally exploring the themes of unintended consequences, cases where good intent is not enough to produce good outcomes, and incentives more broadly. The book explores rather taboo topics with a careless irreverence, earning it its title.

It can be a bit difficult to verify some of the claims in the book, as the authors introduce papers and studies very casually. It may be prudent to treat the claims in this book with some skepticism.

At the end of the day, though, I don’t think the point of the book was to convince the reader of the absolute veracity of its analyses, but to introduce the reader to a new way of thinking that is completely pragmatic, a fresh perspective when compared to the dogmatisms of many personal and professional spheres today.

Thoughts

Dealing with model uncertainty in data products

Uncertainty can stall the development of data products, particularly in areas where there are domain experts who don’t necessarily understand the end goal.
