BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Priya_26
Fluorite | Level 6

Hello Everyone !

 

I am using SAS 9.4 Version.

 

I have one dataset as below and I want to check if VarA's value in present in VarB anywhere, irrespective of its row order.based on the i can create a new variable to mark yes/No based on its presence.

 

Can you please help how can i check it in sas using datastep /Macro/proc sql ? 

 

I am trying to search if XX is present in VarB anywhere ? if yes then mark it as Y else NO
                                  if YY is present in VarB anywhere ? if yes then mark it as Y else NO and so on .. till the dataset ends.

Dataset- Test
VarA     VarB       new_var

 XX         PP              Y

 YY         MM             N

 ZZ          XX              N
AA           BB              Y
BB           AA              Y

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
data have;
input (VarA     VarB) ($);       
datalines;
 XX         PP             
 YY         MM             
 ZZ          XX              
AA           BB              
BB           AA              
;
data want;
 if _n_=1 then do;
  if 0 then set have;
  declare hash h(dataset:'have');
  h.definekey('varb');
  h.definedone();
 end;
 set have;
 if h.check(key:vara)=0 then flag='Y';
  else flag='N';
 run;

View solution in original post

11 REPLIES 11
novinosrin
Tourmaline | Level 20

using Hash object is very easy if you know it,and since you didn't mention, I am giving you the more traditional datastep solution:

 

data have;
input (VarA     VarB) ($);       
datalines;
 XX         PP             
 YY         MM             
 ZZ          XX              
AA           BB              
BB           AA              
;

proc transpose data=have(keep=varb) out=_have;
var varb;
run;

data want;
set have;
if _n_=1 then set _have;
array t(*) col:	;
if vara in t then new_var="Y";
else new_var="N";
keep vara varb new_var;
run;
Priya_26
Fluorite | Level 6

Hi Novinosrin,

 

Thank you very much for your answer. I didn't know that even a single dataset can be looked for different variables within.

Can you please suggest the HASH object suggestion for same  ?

novinosrin
Tourmaline | Level 20

@Priya_26 With no surprises, the genius Ksharp gave you the hash solution even before I woke up to see your message. 🙂

Priya_26
Fluorite | Level 6

@novinosrin Thank you veru much for your time and Solution. Both of them are just working Perfect. Since I have to use this in production with Huge data as input, I will go with HASH solution.

 

Thanks again 🙂

Astounding
PROC Star

My preference:

 

proc sql;

select distinct(quote(varb)) into : b_list separated by ' ' from have;

quit;

 

data want;

set have;

if vara in (&b_list) then new_var='Y';

else new_var='N';

run;

novinosrin
Tourmaline | Level 20

@Astounding's approach will work comfortably for  relatively medium sized datasets however if you are working with very large datasets and the macro var list(if and when exceeds 64k length) as it did once to me in a similar situation will make you look for an alternative solution yet again. 

novinosrin
Tourmaline | Level 20

And Proc SQL:

Proc sql does not guarantee the order of the output though:

data have;
input (VarA     VarB) ($);       
datalines;
 XX         PP             
 YY         MM             
 ZZ          XX              
AA           BB              
BB           AA              
;

proc sql;
create table want as
select a.* ,max(case when a.vara=b.varb then 'Y' else 'N' end) as new_var
from have a, have(keep=varb) b
group by a.vara,a.varb;
quit;
Satish_Parida
Lapis Lazuli | Level 10
proc sql;
  create table test1 as
    select t.*, 
    (case when (select count(*) from test where VarB =t.VarA) =0 then 'N' 
            else 'Y' end) as new_var
  from test t;
quit;

This will work.
It will work like a charm until the size of the DS in small or medium or you have an index on VarB.

srinath3111
Quartz | Level 8

data have;

input (VarA VarB) ($);

datalines;

XX PP

YY MM

ZZ XX

AA BB

BB AA ;

run;

 

proc sql;

select distinct("varb") into : mac separated by ' ' from have;

quit;

 

data want;

set have;

if vara in (&mac) then new_var='Y';

else new_var='N';

run;

 

 

 

Ksharp
Super User
data have;
input (VarA     VarB) ($);       
datalines;
 XX         PP             
 YY         MM             
 ZZ          XX              
AA           BB              
BB           AA              
;
data want;
 if _n_=1 then do;
  if 0 then set have;
  declare hash h(dataset:'have');
  h.definekey('varb');
  h.definedone();
 end;
 set have;
 if h.check(key:vara)=0 then flag='Y';
  else flag='N';
 run;
Priya_26
Fluorite | Level 6

@Ksharp Thanks alot, This is just perfect 🙂 Thank you veru much !! 🙂

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 6130 views
  • 7 likes
  • 6 in conversation