Proportional Odds Model

CathyVI · Posted 11-28-2022 03:54 PM

Hi,

This is from the free online paper "Using New SAS 9.4 Features for Cumulative Logit Models with Partial Proportional Odds". On page 7, a subset of the data is created: The dataset “MB” is comprised of 408 of the 508 observations in the dataset. Dataset “XV” contains 100 observations and will be used for cross-validation purposes for the model.

Question: Does MB still contain 508 observations or not? My understanding is that MB contains only 408, so how were those 408 separated from the 508? As far as I know, PROC SURVEYSELECT will only create the 100 observations (XV), and the source dataset will still contain all 508. How do I end up with 408 observations rather than 508 when using PROC SURVEYSELECT?

Thanks

PGStats · Posted 11-28-2022 04:46 PM

I guess PROC SURVEYSELECT was used with the OUTALL option, and the output was then split according to the newly created variable Selected. Example:

 proc surveyselect data=sashelp.class out=classSamples outall sampsize=15 seed=75868;
 run;
 
 NOTE: The data set WORK.CLASSSAMPLES has 19 observations and 6 variables.
 NOTE: PROCEDURE SURVEYSELECT used (Total process time):
       real time           0.02 seconds
       cpu time            0.03 seconds
 
 data mb xv;
     set classSamples;
     if Selected then output mb;
     else output xv;
 run;
 
 NOTE: There were 19 observations read from the data set WORK.CLASSSAMPLES.
 NOTE: The data set WORK.MB has 15 observations and 6 variables.
 NOTE: The data set WORK.XV has 4 observations and 6 variables.
 NOTE: DATA statement used (Total process time):
       real time           0.00 seconds
       cpu time            0.01 seconds

PG

CathyVI · Posted 11-28-2022 05:24 PM

@PGStats

Thanks, this code works, but I would like to understand how it works. My code is below.

In the DATA step, as I understand it, cleanp_n has only 1000 obs and stcp.cleanedp3 has 1000 obs. How does stcp.cleanedp4 end up with the remaining 3000 obs, given that the DATA step never references stcp.cleanedp2, which holds all 4000 obs?

/* simple random sampling without replacement (METHOD=SRS) - PROC SURVEYSELECT */
proc surveyselect data=stcp.cleanedp2 method = srs outall sampsize = 1000
seed=535113001 out=cleanp_n ;
run;

data stcp.cleanedp3 stcp.cleanedp4;
set cleanp_n;
if selected then output stcp.cleanedp3;
else output stcp.cleanedp4;
run;

PGStats · Posted 11-28-2022 08:19 PM

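Please check the log after you run your code. It should show that dataset cleanp_n has the same number of obs as stcp.cleanedp2. With the OUTALL option, the output dataset contains every observation from the input dataset, and the variable Selected flags the sampled ones (1 = selected, 0 = not selected).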
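
Besides reading the log, one quick way to confirm this is to tabulate the Selected flag in cleanp_n. The step below is only a sketch that reuses the dataset name from the code above. Assuming stcp.cleanedp2 really has 4000 obs, one would expect 3000 rows with Selected = 0 and 1000 rows with Selected = 1.

```sas
/* Count the rows of the OUTALL output by the Selected flag.        */
/* cleanp_n is the OUT= dataset from the PROC SURVEYSELECT step.    */
proc freq data=cleanp_n;
    tables Selected / nocum nopercent;
run;
```

The two counts should match the sizes of stcp.cleanedp4 and stcp.cleanedp3 after the DATA step split.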

PG
