Good day,
I saw this post previously and am wondering whether the opposite can be done: https://communities.sas.com/t5/Statistical-Procedures/Identifying-Unused-Observations-in-PROC-PHREG/....
Suppose I have a panel dataset. How can I determine how many people were included in my model?
@CEdward wrote:
PGStats
Thank you.
I want to get the number of distinct individuals used rather than observations. I have used that method for observations already.
Does your data set have an individual identifier, or a group of variables that identifies an individual? If so, something like this returns the distinct individuals, and you could use it as the basis for counting, as a subquery.
Proc sql;
select distinct <variable(s) to identify an individual>
from residualdataset
where not missing (residualvariable)
;
quit;
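The counting step mentioned above could be sketched as follows. This is only an illustration under stated assumptions: `resOut` and `resmart` are the names from the PROC PHREG OUTPUT statement quoted in this thread, and `id` is a hypothetical variable that identifies an individual. Substitute your own identifier variable(s).

```sas
/* Sketch: count distinct individuals with at least one row used by the model. */
/* Assumptions: resOut is the OUT= data set from the OUTPUT statement,          */
/* resmart is its martingale residual (missing for unused rows), and            */
/* id is a hypothetical variable identifying an individual.                     */
proc sql;
  select count(*) as numberOfIndividualsUsed
  from (select distinct id
        from resOut
        where not missing(resmart));
quit;
```

With a single identifier variable, `select count(distinct id) ... where not missing(resmart)` should give the same number without the subquery. The subquery form also works when several variables together identify an individual. Either way, an individual is counted if at least one of their rows was used.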
Same logic:
Use the fact that residuals cannot be calculated for unused observations. Try adding the statement
output out=resOut resmart=resmart;
to the phreg procedure, and then
proc print data=resOut; where resmart is NOT missing; run;
to print the obs used, or
proc sql;
select count(resmart) as numberUsedInModel
from resOut;
quit;
to get the number of obs used.
PGStats
Thank you.
I want to get the number of distinct individuals used rather than observations. I have used that method for observations already.
In that post @JacobSimonsen posted an example for one specific regression with a specific data set.
Other regressions might behave similarly, depending on the data, the regression procedure, and the options used.