The Choice: Class X or Linear X in a Logistic Model
This presentation discusses binary logistic models. Let Y be the target with levels 0 and 1, and let X be a "discrete" numeric predictor for a logistic model. "Discrete" means X has only a "few" levels. "Few" is subjective, but is typically under 16. A familiar example of a discrete numeric X is a "count" (e.g. count of children in household).
Predictor X might be entered into a logistic model as a nominal predictor by putting X in a CLASS statement. Alternatively, X may enter the model as a LINEAR predictor, as in:
   PROC LOGISTIC desc; MODEL Y = X <and other predictors>;
Monotonic transforms, such as Log(X), might also be considered in place of X.
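The two ways of entering X can be sketched in SAS as follows. This is a minimal illustration, not the %LOGIT_SCREENX macro or its criteria. The data set WORK.MYDATA, the seed, the sample size, and the coefficients used to simulate Y are all assumptions made for the example.

```sas
/* Hypothetical data: X is a count with levels 0-5, Y is a 0/1 target. */
/* The intercept (-1) and slope (0.3) are arbitrary illustration values. */
data work.mydata;
   call streaminit(12345);
   do i = 1 to 1000;
      X = floor(6 * rand('uniform'));
      p = 1 / (1 + exp(-(-1 + 0.3 * X)));
      Y = rand('bernoulli', p);
      output;
   end;
   drop i p;
run;

/* CLASS X: X is nominal, with one parameter per non-reference level. */
proc logistic data=work.mydata descending;
   class X / param=ref;
   model Y = X;
run;

/* LINEAR X: X is numeric, with a single slope parameter. */
proc logistic data=work.mydata descending;
   model Y = X;
run;
```

With six levels of X, the CLASS model estimates five parameters for X and the LINEAR model estimates one. The LINEAR model is nested within the CLASS model, so the fit statistics from the two runs (for example, -2 Log L) can be compared directly.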
This presentation gives criteria for choosing LINEAR X in preference to CLASS X. These criteria are programmed into a SAS® macro called %LOGIT_SCREENX. %LOGIT_SCREENX can process dozens of X's in a single macro call.
In a final section of the presentation the CLASS X vs. LINEAR X decision is discussed for the cumulative logit model. %LOGIT_SCREENX extends to this case.
A further purpose of this paper is to discuss the statistical properties of X (including nominal X) as a predictor of Y. These properties are established as general theorems or through simulations.
About the Presenter
Bruce Lund is a statistical modeling trainer. For 16 years, from 2002 through 2017, he was affiliated with OneMagnify of Detroit. From 2002 to 2006 he was the first analytics manager at OneMagnify and founded the analytics practice. For the next 12 years he was a consultant for OneMagnify. Before OneMagnify, he was the customer database manager at Ford Motor Company and a mathematics professor at University of New Brunswick, Canada. At Ford and OneMagnify he developed numerous predictive models to support automotive marketing. Bruce has a mathematics PhD from Stanford University. He has presented at SAS Global Forum, AnalyticsX, ASA CSP, and regional SAS user group conferences. Mostly retired now, Bruce gives statistical modeling training at SAS user group conferences.