We present the backbone method, a generic framework that enables sparse and interpretable supervised machine learning methods to scale to ultra-high dimensional problems. We solve sparse regression problems with $10^7$ features in minutes and $10^8$ features in hours, as well as decision tree problems with $10^5$ features in minutes.The proposed method operates in two phases: we first determine the backbone set, consisting of potentially relevant features, by solving a number of tractable subpro...