Welcome to MSUG: Michigan SAS Users Group

 

Join us for our NEXT MEETING: a FREE Webinar!

Date: Thursday, November 20, 2025
Time: Noon - 1:00 PM ET
Place: Online
Cost:  Free!  However, please register by Tuesday, November 18 in order to receive the webinar link. Webinar information will be sent out by Wednesday, November 19.
Register Now!

Agenda
  • Logistic Modeling Without Split-Samples - Bruce Lund, Statistical Trainer

Here is a quote from N. Kriegeskorte et al. (2009), "Circular analysis in systems neuroscience: the dangers of double dipping", Nature Neuroscience … "Double Dipping is the use of the same dataset for selection and selective analysis. It gives distorted descriptive statistics and invalid statistical inference."

Double Dipping would arise if a logistic model were fit to an Analysis dataset and the same Analysis dataset were used for computing model validation statistics (e.g., c-statistic, average squared error, etc.). The long-standing approach to avoiding double dipping is split-sampling. In split-sampling the Analysis dataset is randomly divided into Training and Validation datasets. The split is often 50%-50% but could be 60%-40% or 70%-30%.
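The random division described above can be sketched in a few lines. This is an illustration only, written in Python rather than SAS; the function name `split_sample` and its parameters are hypothetical and are not taken from the paper.

```python
import random

def split_sample(rows, train_fraction=0.7, seed=42):
    """Randomly divide an Analysis dataset into Training and Validation parts.

    `rows` is any list of records; `train_fraction` is the share assigned
    to Training (0.7 gives the 70%-30% split mentioned in the text).
    """
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = rows[:]             # copy, so the caller's list is not reordered
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

if __name__ == "__main__":
    training, validation = split_sample(list(range(100)), train_fraction=0.7)
    print(len(training), len(validation))
```

Every record lands in exactly one of the two parts, so the Validation records play no role in fitting the model.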

The focus of this paper is on binary logistic modeling. In split-sampling, a logistic model is fit on the Training dataset without ever looking at the Validation dataset. Once a final model is fitted, model performance is measured on the Validation dataset.

Can the problem of double dipping be avoided without a split-sample? If so, this would have the advantage of fitting the model on the entire Analysis dataset, giving better predictor variable selection and better coefficient estimation. But this leaves open the question of how to perform model validation.

One purpose of the paper is to show how split-sampling can be avoided. Briefly, the approach uses bootstrap sampling to find an “optimism correction”. The model is fitted on the full Analysis dataset, and the performance metrics (e.g., c-statistic, average squared error, etc.) are also computed on the full Analysis dataset. These performance metrics are then adjusted by the “optimism correction”. The paper explains how bootstrap sampling is used to find the optimism correction. Optimism correction is presented in a book by Efron and Tibshirani (1993), An Introduction to the Bootstrap, pp. 247-252. In recent years, F. Harrell and E. Steyerberg have championed this approach (see references in paper).
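As a rough illustration of the idea, the sketch below follows the commonly described bootstrap recipe: fit on a bootstrap sample, measure the c-statistic on that sample and again on the original data, average the differences to estimate optimism, and subtract that from the apparent c-statistic. It is written in Python, not SAS, and is not the paper's implementation; the names `fit_logistic`, `c_statistic`, `optimism_corrected_c`, and `make_data`, the simulated data, and the plain gradient-ascent fit are all assumptions made for this example.

```python
import math
import random

def predict(w, xi):
    """Predicted probability for one row; w[0] is the intercept."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
    z = max(-30.0, min(30.0, z))          # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, steps=150, lr=0.5):
    """Fit a binary logistic model by simple gradient ascent (a stand-in fitter)."""
    k, n = len(X[0]), len(X)
    w = [0.0] * (k + 1)
    for _ in range(steps):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            err = yi - predict(w, xi)
            grad[0] += err
            for j in range(k):
                grad[j + 1] += err * xi[j]
        w = [wj + lr * gj / n for wj, gj in zip(w, grad)]
    return w

def c_statistic(probs, y):
    """Share of (event, non-event) pairs ranked correctly; ties count one half."""
    pos = [p for p, yi in zip(probs, y) if yi == 1]
    neg = [p for p, yi in zip(probs, y) if yi == 0]
    score = sum((pp > pn) + 0.5 * (pp == pn) for pp in pos for pn in neg)
    return score / (len(pos) * len(neg))

def optimism_corrected_c(X, y, n_boot=30, seed=0):
    """Return (apparent c, estimated optimism, corrected c)."""
    rng = random.Random(seed)
    n = len(y)
    w_full = fit_logistic(X, y)
    apparent = c_statistic([predict(w_full, xi) for xi in X], y)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]   # sample rows with replacement
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        if sum(yb) in (0, n):                        # need both outcomes to fit and score
            continue
        wb = fit_logistic(Xb, yb)
        c_boot = c_statistic([predict(wb, xi) for xi in Xb], yb)  # on bootstrap sample
        c_orig = c_statistic([predict(wb, xi) for xi in X], y)    # on original data
        diffs.append(c_boot - c_orig)
    optimism = sum(diffs) / n_boot
    return apparent, optimism, apparent - optimism

def make_data(n, rng):
    """Simulated data (assumption): first predictor carries signal, two are noise."""
    X, y = [], []
    for _ in range(n):
        xi = [rng.gauss(0, 1) for _ in range(3)]
        p = 1.0 / (1.0 + math.exp(-1.5 * xi[0]))
        X.append(xi)
        y.append(1 if rng.random() < p else 0)
    return X, y

if __name__ == "__main__":
    X, y = make_data(120, random.Random(1))
    apparent, optimism, corrected = optimism_corrected_c(X, y)
    print(f"apparent c = {apparent:.3f}, optimism = {optimism:.3f}, corrected c = {corrected:.3f}")
```

In practice, any predictor selection step would also need to be repeated inside each bootstrap iteration so that the optimism estimate reflects the whole modeling process; this sketch fits a fixed set of predictors for brevity.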

Bruce Lund is a statistical modeling trainer. For 16 years, from 2002 through 2017, he was affiliated with OneMagnify of Detroit. From 2002 to 2006 he was the first analytics manager at OneMagnify and founded the analytics practice. For the next 12 years he was a consultant for OneMagnify. Before OneMagnify, he was the customer database manager at Ford Motor Company and a mathematics professor at University of New Brunswick, Canada. At Ford and OneMagnify he developed numerous predictive models to support automotive marketing. Bruce has a mathematics PhD from Stanford University. He has presented at SAS Global Forum, AnalyticsX, ASA CSP, and regional SAS user group conferences. Mostly retired now, Bruce gives statistical modeling training at SAS user group conferences.

Help plan the future of MSUG!

Please assist us with planning upcoming MSUG events by filling out this short 2-minute survey. Your responses will let us know what types of events you'd like to see: 1-hour webinars? Half-day virtual classes? Another in-person Single-Day Event? We'd like to hear from you!

We are always looking for speakers for future meetings - please contact us if you are interested in presenting!

MSUG Occasional Papers - NEW!

MSUG members may wish to contribute a paper for posting on our website instead of giving a webinar presentation. Instructions for doing so are posted on the new MSUG Occasional Papers page. Find out how you can participate!


A big thank you to the sponsors of our 2025 1-Day Conference:

Premier Sponsor

SAS

Gold Sponsor

Picture

Midwest SAS Users Group
October 5-7, 2025


Save the date! 
Our 2025 1-Day SAS Conference will be held on Wednesday, May 21 at Schoolcraft College in Livonia, MI!


If you are interested in speaking at a future MSUG meeting, please contact us at [email protected]. We also welcome reviews of SAS Press books; please see our Book Review Process.

About Us

Mission Statement

The Michigan SAS Users Group is organized to further the interests of programmers and users of the SAS Software System. Members share information on building applications with SAS Software and how to better use programming tools and user interfaces. MSUG also provides a forum for informing attendees and members of career opportunities.

Newsletter

Members automatically receive email announcements of upcoming meetings, along with a copy of the MSUG newsletter, which is published occasionally. Notes and Notices, users' coding tips and tricks, and "help wanted" ads are included in the newsletter. See our archive of Newsletters and Past Proceedings.

Membership

There are no membership dues at the moment. If you wish to join MSUG, please go to our mailing list website and add your email address to our distribution list for newsletters and meeting announcements! If you have questions, please contact us.

Advertising/Sponsorship

Download a description of MSUG's sponsorship program.  If your company wishes to sponsor an MSUG event, please contact [email protected].


