data stud_Pri (keep =HSE_ID HH_NO Hsehold_Ref_No PriStd_cnt);
set stud_dev;
by Hsehold_Ref_No;
if first.Hsehold_Ref_No then do;
where LVL_OF_EDN in ('0', '1');
PriStd_cnt=0; end;
PriStd_cnt + 1;
if last.Hsehold_Ref_No then output;
run;
I don't understand this code.
If first.Hsehold_Ref_No then do;
where LVL_OF_EDN in ('0', '1');
PriStd_cnt=0;end;
1) What if the first record of a Hsehold_Ref_No group doesn't have LVL_OF_EDN in ('0', '1')? What will SAS do?
2) If a record with LVL_OF_EDN in ('0', '1') is the 2nd or a later record of the same Hsehold_Ref_No, will SAS count it?
Thank you
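The WHERE statement is NOT an executable statement: it applies to every observation read by the SET statement, no matter where it appears in the step. It should therefore be moved out of the conditional block, where its placement is misleading. Its effect is to restrict the input data to observations where LVL_OF_EDN is either '0' or '1'. Because the filter is applied before BY-group processing, first.Hsehold_Ref_No and last.Hsehold_Ref_No refer to the first and last observations that pass the WHERE condition, so a qualifying observation is counted wherever it sits within its group.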
The datastep logic simply counts the number of observations with LVL_OF_EDN either '0' or '1' within each value of Hsehold_Ref_No.
It is assumed that the input dataset is sorted by Hsehold_Ref_No.
The datastep should read:
data stud_Pri (keep =HSE_ID HH_NO Hsehold_Ref_No PriStd_cnt);
set stud_dev;
by Hsehold_Ref_No;
where LVL_OF_EDN in ('0', '1');
if first.Hsehold_Ref_No then do;
PriStd_cnt = 0;
end;
PriStd_cnt + 1;
if last.Hsehold_Ref_No then output;
run;
PG
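To make the two questions concrete, here is a minimal sketch on made-up data. The dataset name demo and its rows are assumptions for illustration only, and HSE_ID and HH_NO are left out for brevity. Household A starts with a non-qualifying record, household B has no qualifying records at all, and household C has a single qualifying record.

```sas
/* Hypothetical sample data, already sorted by Hsehold_Ref_No */
data demo;
input Hsehold_Ref_No $ LVL_OF_EDN $;
datalines;
A 3
A 0
A 1
B 2
B 3
C 1
;
run;

/* WHERE filters the rows before BY-group processing, so         */
/* first./last. are set on the rows that survive the filter.     */
data demo_cnt (keep=Hsehold_Ref_No PriStd_cnt);
set demo;
by Hsehold_Ref_No;
where LVL_OF_EDN in ('0', '1');
if first.Hsehold_Ref_No then PriStd_cnt = 0;
PriStd_cnt + 1;
if last.Hsehold_Ref_No then output;
run;

proc print data=demo_cnt;
run;
```

On this data the output should contain A with PriStd_cnt=2 and C with PriStd_cnt=1. The leading '3' record of household A is simply never read, and household B does not appear at all.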
Note: with the WHERE statement in place, Hsehold_Ref_No groups that have no observations with LVL_OF_EDN either '0' or '1' are skipped entirely, i.e. you will never get PriStd_cnt=0. If you want the zero counts, drop the WHERE statement and add the result of the comparison instead (it evaluates to 1 when true and 0 when false):
data stud_Pri (keep =HSE_ID HH_NO Hsehold_Ref_No PriStd_cnt);
set stud_dev;
by Hsehold_Ref_No;
if first.Hsehold_Ref_No then do;
PriStd_cnt = 0;
end;
PriStd_cnt + (LVL_OF_EDN in ('0', '1'));
if last.Hsehold_Ref_No then output;
run;
PG
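As a quick check of the zero-count behaviour, the following sketch runs the version without WHERE on made-up data. The dataset name demo and its rows are assumptions for illustration only, and HSE_ID and HH_NO are left out for brevity. Household B has no records with LVL_OF_EDN '0' or '1'.

```sas
/* Hypothetical sample data, already sorted by Hsehold_Ref_No */
data demo;
input Hsehold_Ref_No $ LVL_OF_EDN $;
datalines;
A 3
A 0
A 1
B 2
B 3
C 1
;
run;

/* No WHERE: every row is read, and the comparison adds 1 or 0 */
data demo_cnt0 (keep=Hsehold_Ref_No PriStd_cnt);
set demo;
by Hsehold_Ref_No;
if first.Hsehold_Ref_No then PriStd_cnt = 0;
PriStd_cnt + (LVL_OF_EDN in ('0', '1'));
if last.Hsehold_Ref_No then output;
run;

proc print data=demo_cnt0;
run;
```

On this data the output should contain one row per household: A with PriStd_cnt=2, B with PriStd_cnt=0, and C with PriStd_cnt=1.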