BookmarkSubscribeRSS Feed
sahoositaram555
Pyrite | Level 9

Hi experts, i have a dataset similar to the dataset(with 5000 subjids) below.

i need to extract the unique subjid who doesn't have cnsdt missing for vis1 only for which i have written a sql code below. kindly let me know the possible reason behind group by is getting converted in to order by as the warning appears in the log and the values i'm getting are all wrong.

data a;

input subjid event cnsdt;

01001 vis1 12/03/2018

01001 vis2 

01001 vis3

01001 vis4  

01002 vis1 08/08/2018

01002 vis2 

01002 vis3

01002 vis4

01003 vis1 . 

01003 vis2 

01003 vis3

01003 vis4  

01004 vis1 23/02/2018

01004 vis2 

01004 vis3

01004 vis4

01005 vis1 .

01005 vis2 

01005 vis3

01005 vis4

;

run;

 

proc sql;
create table missconsdt as select distinct subjid from a where cnsdt ^=. group by subjid;
quit;

 

any other better ways to do this in datastep, without using sql?Kindly suggest

5 REPLIES 5
novinosrin
Tourmaline | Level 20
data a;
infile cards truncover;
input subjid $ event $ cnsdt :ddmmyy10.;
format  cnsdt ddmmyy10.;
cards;
01001 vis1 12/03/2018
01001 vis2 
01001 vis3
01001 vis4  
01002 vis1 08/08/2018
01002 vis2 
01002 vis3
01002 vis4
01003 vis1 . 
01003 vis2 
01003 vis3
01003 vis4  
01004 vis1 23/02/2018
01004 vis2 
01004 vis3
01004 vis4
01005 vis1 .
01005 vis2 
01005 vis3
01005 vis4
;
run;

 
proc sql;
create table missconsdt as 
select  subjid 
from a 
where (event='vis1')*cnsdt ;
quit;
proc print noobs;run;

data missconsdt;
 set a;
 where  (event='vis1')*cnsdt ;
run;
proc print noobs;run;
subjid
01001
01002
01004
sahoositaram555
Pyrite | Level 9

Thanks @novinosrin for the wonderful response. it's working perfectly . before i close your response as a solution, i would like to understand the syntax behind using the line
"where (event='vis1')*cnsdt "
Does cnsdt ^=. will be directly interpreted like that by SAS, by mentioning just the cnsdt (which i can see though from the results ,but what if i would like to do the vice versa , where (event='vis1')*-cnsdt, will this work )? or am i missing something.
Looking forward to having your response. Any paper for reference can be shared from your end would be helpful.

novinosrin
Tourmaline | Level 20
 where  (event='vis1')*cnsdt ;

is a boolean expression equivalent to 

 

where event='vis1' and not missing(dnsdt);

Boolean expressions are mathematical expressions that standardises values into binary 1's and 0' for true and false. I use these a lot considering the utility is phenomenal in the data prep for logistic regression for my credit risk team. You will pick up by practice. The people who inspired me to use this are @SASKiwi , @hashman  and @data_null__   They showed me the usage in terms of brevity but it furthered to get me using in Probability models. So I would recommend reading their posts. 

 

PS Honestly they were gracious to lend me their time and I think they have gotten old now to do the same for today's newbies. lol 😁

 

Kurt_Bremser
Super User

A slight, but important correction:

where event='vis1' and cnsdt not in (0,.);

A non-missing value of zero will cause the multiplication to also result in zero, which is considered "false".

novinosrin
Tourmaline | Level 20

Indeed that's sharp. Agreed 🙂

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1118 views
  • 0 likes
  • 3 in conversation