Seek ground truth whenever possible to accelerate learning.
The map is not the territory; seek ground truth whenever possible to accelerate learning.
uberHOP is a small example from my own experience. The product was a point-to-point (a.k.a. UV Express) service that Uber launched in Manila, along with Seattle and Toronto.
The way it worked was simple: you would make a request to take a specific route during peak hours, and we would batch you in with up to 6 people to take a high occupancy vehicle along the route.
Rides were priced at a 70% discount to uberX (the traditional ride product), and drivers were guaranteed earnings, so we needed a minimum average occupancy to reach profitability. To get to that occupancy, we had to make sure the routes we selected were high quality.
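To make the occupancy constraint concrete, here is a rough back-of-the-envelope sketch. It rests on illustrative simplifications rather than figures from the actual product: every rider pays the same discounted fare, the guaranteed driver payout is expressed as a multiple of the equivalent uberX fare, and commission and other costs are ignored. The function name and the `guarantee_ratio` parameter are hypothetical.

```python
def min_avg_occupancy(discount: float, guarantee_ratio: float = 1.0) -> float:
    """Riders per trip needed for fares collected to cover the driver guarantee.

    discount        -- fractional discount vs. the uberX fare (0.7 means 70% off)
    guarantee_ratio -- ASSUMPTION: guaranteed payout as a multiple of the uberX fare
    """
    if not 0 <= discount < 1:
        raise ValueError("discount must be in [0, 1)")
    fare_share = 1.0 - discount          # each rider pays this share of an uberX fare
    return guarantee_ratio / fare_share  # riders needed to cover the guarantee


if __name__ == "__main__":
    # With a 70% discount each rider pays 30% of the uberX fare.
    print(round(min_avg_occupancy(0.7), 2))
```

Under these assumptions, a 70% discount means a trip needs more than three riders on average just to cover a payout equal to one uberX fare, which is why route quality mattered so much.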
My first instinct as a data person was clustering. We needed to find pairs of longitude and latitude that had enough pickup and dropoff density in them to have a decent chance of becoming profitable.
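As a minimal sketch of the idea, the snippet below bins trips onto a coarse grid and counts how often each pickup-cell/dropoff-cell pair occurs. This is plain grid binning, swapped in for illustration; it is not the clustering method actually used for the launch routes. The cell size, the `min_trips` threshold, the function names, and the reading of "pairs" as origin-destination pairs are all assumptions.

```python
import math
from collections import Counter


def cell(lat: float, lon: float, cell_deg: float = 0.005) -> tuple:
    """Snap a coordinate to an integer grid cell (0.005 degrees is roughly 550 m of latitude)."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def dense_od_pairs(trips, cell_deg: float = 0.005, min_trips: int = 3):
    """Return ((pickup_cell, dropoff_cell), count) for pairs with at least min_trips trips.

    trips -- iterable of (pickup_lat, pickup_lon, dropoff_lat, dropoff_lon)
    """
    counts = Counter(
        (cell(p_lat, p_lon, cell_deg), cell(d_lat, d_lon, cell_deg))
        for p_lat, p_lon, d_lat, d_lon in trips
    )
    # most_common() sorts by count, so the densest candidate routes come first
    return [(od, n) for od, n in counts.most_common() if n >= min_trips]


if __name__ == "__main__":
    # Made-up coordinates, for illustration only
    sample_trips = [
        (14.5547, 121.0244, 14.5995, 120.9842),
        (14.5549, 121.0242, 14.5993, 120.9844),
        (14.5545, 121.0246, 14.5997, 120.9840),
        (14.6760, 121.0437, 14.5378, 121.0014),
    ]
    for od_pair, n_trips in dense_od_pairs(sample_trips):
        print(od_pair, n_trips)
```

A fixed grid is the crudest possible choice: it splits dense areas that straddle a cell boundary, which is one reason to try density-based algorithms and different distance metrics instead.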
The launch routes were selected using this method, but we had limited success: even after a novelty period, cancellation rates remained high.
I tried different algorithms, distance metrics, map features, and dispatch radii, all for only incremental gains.