Hi,
I have a data set in the following form;
ID Practice Age Status Sex
1 Bingley 77 Eligible Female
2 Saltaire 74 Intervention Male
3 Saltaire 73 Eligible Male
4 Saltaire 75 Intervention Female
5 Bingley 72 Eligible female
Saltaire practice has both 'intervention' and 'eligible' group while 'Bingley' has only 'eligible' group. My cases are 'intervention' group under 'status' variable within 'Saltaire' practice. I want to create two different groups of controls; one that has 'eligible' status within 'Saltaire' practice, and second with 'eligible' status within 'Bingley' practice, matching on age and sex and creating 1:1 match. there are other variable in the data set which I am not using for matching.
I will appreciate help with the syntax please?
Ta
And when I run this syntax to create separate case and control data sets;
data study cont1 cont2; *controls2;
set demo;
if practice= "Saltaire medical practice" and status = "interventionGroup" then output STUDY;
else if practice = "Saltaire medical practice" and status = "Eligible" then output cont1;
else output cont2;
RUN;
I get this log;
NOTE: There were 409 observations read from the data set WORK.DEMO.
NOTE: The data set WORK.STUDY has 0 observations and 16 variables.
NOTE: The data set WORK.CONT1 has 0 observations and 16 variables.
NOTE: The data set WORK.CONT2 has 409 observations and 16 variables.
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.03 seconds
while I expect 106 observations in study, 119 in cont1 and 180 in cont2.
Can someone cast their beady eye and tell me where I am wrong please.
Thanks
S
Yes, absolutely right! its 'intervention' but even correcting that doesn't correct the problem. posted the log in the previous post, posting again;
88 data study cont1 cont2; *controls2;
89 set demo;
90 if practice= "Saltaire medical practice" and status = "intervention" then output STUDY;
91 else if practice = "Saltaire medical practice" and status = "Eligible" then output cont1;
92 else output cont2;
93 RUN;
NOTE: There were 409 observations read from the data set WORK.DEMO.
NOTE: The data set WORK.STUDY has 0 observations and 16 variables.
NOTE: The data set WORK.CONT1 has 0 observations and 16 variables.
NOTE: The data set WORK.CONT2 has 409 observations and 16 variables.
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.03 seconds
Thanks
S
Hi @sks521
Same problem with the variable practice as with status. If it's not "Saltaire medical practice" then output goes to cont2. But after your first post ithe value is "Saltaire".
Sorry I should've been more precise; the actual values in the data set are 'Saltaire medical practice' and 'Bingley medical practice' and for variable 'status' are 'Eligible' and 'InterventioGroup', I just made them up in my first post to make it look short.
But I am assuming something else is wrong with my syntax which I can' figure out.
Thanks
S
Hi @sks521
I can't see any problems with your code. There must be something wrong with your input data, because your code works as expected, if your input data structure and content is as you describe it.
* Demo data;
data demo (drop=rec);
length ID 8 Practice $30 Age 8 Status $20 Sex $6;
input rec $char80.;
id = input(scan(rec,1,','),8.);
Practice = scan(rec,2,',');
Age = input(scan(rec,3,','),8.);
Status = scan(rec,4,',');
Sex = scan(rec,5,',');
datalines;
1,Bingley medical practice,77,Eligible,Female
2,Saltaire medical practice,74,interventionGroup,Male
3,Saltaire medical practice,73,Eligible,Male
4,Saltaire medical practice,75,interventionGroup,Female
5,Bingley medical practice,72,Eligible,female
;
run;
* Your code;
data study cont1 cont2;
set demo;
if practice= "Saltaire medical practice" and status = "interventionGroup" then output STUDY;
else if practice = "Saltaire medical practice" and status = "Eligible" then output cont1;
else output cont2;
run;
160 * Your code;
161 data study cont1 cont2;
162 set demo;
163 if practice= "Saltaire medical practice" and status = "interventionGroup" then output STUDY;
164 else if practice = "Saltaire medical practice" and status = "Eligible" then output cont1;
165 else output cont2;
166 run;
NOTE: There were 5 observations read from the data set WORK.DEMO.
NOTE: The data set WORK.STUDY has 2 observations and 5 variables.
NOTE: The data set WORK.CONT1 has 1 observations and 5 variables.
NOTE: The data set WORK.CONT2 has 2 observations and 5 variables.
NOTE: DATA statement used (Total process time):
real time 0.05 seconds
cpu time 0.00 seconds
Here is the syntax;
data study cont1 cont2; *controls2;
set demo;
if practice= "Saltaire medical practice" and status = "interventionGroup" then output STUDY;
else if practice = "Saltaire medical practice" and status = "Eligible" then output cont1;
else output cont2;
RUN;
and log;
75 data study cont1 cont2; *controls2;
76 set demo;
77 if practice='Saltaire medical practice' and status= 'interventionGroup' then output STUDY;
78 else if practice='Saltaire medical practice' and status= 'Eligible' then output cont1;
79 else output cont2;
80 RUN;
NOTE: There were 409 observations read from the data set WORK.DEMO.
NOTE: The data set WORK.STUDY has 0 observations and 16 variables.
NOTE: The data set WORK.CONT1 has 0 observations and 16 variables.
NOTE: The data set WORK.CONT2 has 409 observations and 16 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.