Welcome to my field notes!

Field notes are notes I leave myself as I go through my day-to-day work. The hope is that other people will also find them useful. Note that these notes are unfiltered and unverified.



TJ Palanca


July 31, 2022

Correlation does not imply causation, but correlation does imply that some causation is at work: either a common cause or an actual causal link between the two variables. Bias in models is a result of:

Uncertainty is a result of:

Correlation and causation

  • If you only want to select a population that over-indexes for a trait, then selecting on a correlated trait is okay
  • If your strategy involves intervening on one of the correlated variables to change the other, then correlation alone is not sufficient. You need causation
  • Correlation implies causation (and vice versa) whenever there is no bias: with no confounding and no selection bias, the association you observe is the causal effect
  • If you need a causal result, and all you have is observational data, it’s okay to act on correlation alone if you’re sure there’s no bias. That is, estimation problems aside, you’re sure that there’s no confounding and no selection bias.
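To make the selection versus intervention distinction concrete, here is a minimal simulation sketch. Everything in it is an assumption for illustration: a hidden common cause `z` drives both `x` and `y`, and there is no arrow from `x` to `y` at all. Selecting on `x` still finds units with high `y`, but setting `x` ourselves does nothing to `y`.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation, written out to avoid extra dependencies."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(0)
n = 20_000

# Observational world (toy model): hidden common cause z drives both x and y.
z = [rng.gauss(0, 1) for _ in range(n)]
x_obs = [zi + rng.gauss(0, 1) for zi in z]
y_obs = [zi + rng.gauss(0, 1) for zi in z]

# Selection: picking units with high x also picks units with high y.
selected_y = [yi for xi, yi in zip(x_obs, y_obs) if xi > 1.0]
mean_y_selected = sum(selected_y) / len(selected_y)

# Intervention: we set x ourselves, so it no longer depends on z.
# y has no arrow from x in this toy model, so y is generated as before.
x_do = [rng.gauss(0, 1) for _ in range(n)]
y_do = [zi + rng.gauss(0, 1) for zi in z]

obs_corr = pearson(x_obs, y_obs)
do_corr = pearson(x_do, y_do)

print(f"observational corr(x, y):          {obs_corr:.2f}")
print(f"mean y among selected (x > 1):     {mean_y_selected:.2f}")
print(f"corr(x, y) after intervening on x: {do_corr:.2f}")
```

In this toy setup the observational correlation is clearly positive and the selected group has above-average `y`, while the correlation under intervention sits near zero. That is the gap between "okay for selection" and "not sufficient for intervention".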

Back door criterion

  • This is the criterion for identifying the right variables to control for.
  • You need to control for variables that:
    • ARE NOT effects of the cause we’re interested in, AND
    • ARE on a confounding (back-door) path between the cause and the outcome
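A rough sketch of what controlling for a back-door variable buys you, under assumptions I made up for the simulation: a binary confounder `z` that affects both a binary treatment `x` and the outcome `y`, with a true effect of `x` on `y` fixed at `TRUE_EFFECT`. Here `z` is not an effect of `x` and it sits on the confounding path, so it qualifies. The adjustment below is simple stratification on `z`, weighted by how common each stratum is.

```python
import random

rng = random.Random(1)
n = 50_000
TRUE_EFFECT = 1.0  # effect of x on y, chosen by us for the simulation

rows = []
for _ in range(n):
    z = 1 if rng.random() < 0.5 else 0                   # confounder
    x = 1 if rng.random() < (0.8 if z else 0.2) else 0   # z -> x
    y = TRUE_EFFECT * x + 2.0 * z + rng.gauss(0, 1)      # x -> y and z -> y
    rows.append((z, x, y))

def mean_y(rows, x, z=None):
    """Average outcome for treatment value x, optionally within stratum z."""
    ys = [r[2] for r in rows if r[1] == x and (z is None or r[0] == z)]
    return sum(ys) / len(ys)

# Naive contrast: ignores z, so the back-door path x <- z -> y stays open.
naive = mean_y(rows, 1) - mean_y(rows, 0)

# Back-door adjustment: contrast within each z stratum, weighted by P(z).
adjusted = 0.0
for z in (0, 1):
    p_z = sum(1 for r in rows if r[0] == z) / n
    adjusted += p_z * (mean_y(rows, 1, z) - mean_y(rows, 0, z))

print(f"true effect:       {TRUE_EFFECT:.2f}")
print(f"naive estimate:    {naive:.2f}")
print(f"adjusted estimate: {adjusted:.2f}")
```

The naive estimate overshoots because treated units are mostly `z = 1` units, which have higher `y` regardless of treatment. The stratified estimate should land close to `TRUE_EFFECT`.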

Data Processing Inequality

  • Words daily to represent the whole reality
  • No feature engineering step or change in representation can ever increase the amount of information in the data
  • Data processing can only preserve or reduce the amount of information available to a model
  • IC (Inductive Causation) algorithm - used for finding causal relationships according to Pearl
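The data processing inequality can be checked on a small discrete example. The sketch below uses a made-up joint distribution over a feature `X` (four values) and a target `Y` (two values), and compares the mutual information I(X;Y) with I(f(X);Y) for two transformations: a relabelling, which loses nothing, and a coarsening, which throws information away. The helper names `mutual_information` and `transform_x` are my own.

```python
import math
from collections import defaultdict

def mutual_information(joint):
    """I(X;Y) in bits for a joint distribution given as {(x, y): p}."""
    px, py = defaultdict(float), defaultdict(float)
    for (x, y), p in joint.items():
        px[x] += p
        py[y] += p
    return sum(p * math.log2(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)

def transform_x(joint, f):
    """Joint distribution of (f(X), Y) obtained by processing X with f."""
    out = defaultdict(float)
    for (x, y), p in joint.items():
        out[(f(x), y)] += p
    return dict(out)

# Hypothetical joint distribution p(x, y); the probabilities sum to 1.
joint = {
    (0, 0): 0.20, (0, 1): 0.05,
    (1, 0): 0.05, (1, 1): 0.20,
    (2, 0): 0.15, (2, 1): 0.10,
    (3, 0): 0.10, (3, 1): 0.15,
}

mi_raw = mutual_information(joint)
mi_relabel = mutual_information(transform_x(joint, lambda x: 3 - x))  # bijection
mi_coarse = mutual_information(transform_x(joint, lambda x: x // 2))  # merges values

print(f"I(X;Y)            = {mi_raw:.4f} bits")
print(f"I(relabel(X);Y)   = {mi_relabel:.4f} bits")
print(f"I(coarsen(X);Y)   = {mi_coarse:.4f} bits")
```

In this particular table the coarsening happens to merge values with opposite relationships to `Y`, so the processed feature carries essentially no information about the target. Neither transformation ends up with more information than the raw feature.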

Applied to a driver earnings model

We want to find the earnings values that would reduce churn. If we simply correlated, for example, earnings volatility and churn, we might see a correlation, but we would not know whether reducing volatility would actually cause any change in churn. There may simply be a common cause. For example, casual drivers may have both higher earnings volatility and higher churn. If the entire effect on churn comes from the driver’s cohort, then reducing earnings volatility would not lead to any changes in churn.
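A quick way to sanity check this scenario is to look at the volatility and churn relationship within each cohort. The sketch below is a simulation with invented numbers, not real driver data: the cohort (casual or full-time) sets both the volatility level and the churn probability, and volatility itself has no effect on churn. The `churn_gap` helper is hypothetical and just compares churn rates above and below the median volatility.

```python
import random

rng = random.Random(2)
n = 40_000

# Simulated drivers: (is_casual, earnings_volatility, churned).
drivers = []
for _ in range(n):
    casual = rng.random() < 0.5
    volatility = (2.0 if casual else 1.0) + rng.gauss(0, 0.3)
    churned = rng.random() < (0.4 if casual else 0.1)  # depends on cohort only
    drivers.append((casual, volatility, churned))

def churn_gap(rows):
    """Churn rate of high-volatility minus low-volatility drivers (median split)."""
    vols = sorted(r[1] for r in rows)
    median = vols[len(vols) // 2]
    high = [r[2] for r in rows if r[1] >= median]
    low = [r[2] for r in rows if r[1] < median]
    return sum(high) / len(high) - sum(low) / len(low)

overall_gap = churn_gap(drivers)
casual_gap = churn_gap([r for r in drivers if r[0]])
full_time_gap = churn_gap([r for r in drivers if not r[0]])

print(f"churn gap, all drivers:       {overall_gap:+.3f}")
print(f"churn gap, casual drivers:    {casual_gap:+.3f}")
print(f"churn gap, full-time drivers: {full_time_gap:+.3f}")
```

Across all drivers, high volatility goes with noticeably higher churn. Within each cohort the gap is close to zero, which is what we would expect if the cohort is the common cause and volatility is only along for the ride.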