Hi, I have a base dataset called bigdata,
which contains some data lines like:
sumit chohan
sumeet chauhan
sumit chouhan
pratik dahibhat
prateek dahibat
partik dahibhat
and a sample dataset which contains:
summit chauhan
prateek dahibhat
I want to find the count of the possible matches, and also to extract those possible matches from the base dataset, which in our case is bigdata.
Please suggest an approach, if there is one.
Thanks,
Prashant.
Well, there are several ways you could do this. The way I would probably do it is:
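data _null_;
  set small_data end=last;
  /* Open the generated step once, before the first snippet */
  if _n_=1 then call execute('data want; set bigdata end=last; retain count;');
  /* Generate one IF statement per snippet; CALL EXECUTE takes a single string */
  call execute('if index(variable,"' || strip(snippet) || '") > 0 then count=sum(count,1);');
  /* Close the generated step after the last snippet */
  if last then call execute('if last then output; run;');
run;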
This assumes that you have a dataset small_data containing a variable snippet, and a dataset bigdata with a variable called variable. It generates a DATA step with one IF statement for each row of small_data, and outputs one row with the total count. As you also want the matching rows, maybe add a second output dataset:
data _null_;
  set small_data end=last;
  /* Two output datasets: want (the total) and matches (the matching rows) */
  if _n_=1 then call execute('data want matches; set bigdata end=last; retain count;');
  call execute('if index(variable,"' || strip(snippet) || '") > 0 then do; count=sum(count,1); output matches; end;');
  if last then call execute('if last then output want; run;');
run;
You could also do the same by merging the two datasets. Which approach works best depends on how your data looks: whether the casing matches, how close a match needs to be, and so on.
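The merging approach mentioned above could be sketched with a PROC SQL join. This is only a rough sketch under the same assumptions as before (small_data has a variable snippet, bigdata has a variable called variable); the table names matches and match_counts are made up for illustration. The 'i' modifier of FIND makes the comparison ignore case.

```sas
/* Sketch: pair every bigdata row with every snippet and keep the pairs
   where the snippet occurs in the variable, ignoring case. */
proc sql;
  create table matches as
  select b.variable, s.snippet
  from bigdata as b, small_data as s
  where find(b.variable, strip(s.snippet), 'i') > 0;

  /* Number of matching rows per snippet */
  create table match_counts as
  select snippet, count(*) as n_matches
  from matches
  group by snippet;
quit;
```

A snippet with no matching rows will not appear in match_counts at all. Like INDEX, FIND only detects exact substrings, so spelling variants are not picked up.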
Thank you so much for your reply...
Hi,
you can try the sounds-like operator (=*). It is designed for English, though, so there may be some difficulties with Indian names.
data have;
input name $ 1-32;
cards;
sumit chohan
sumeet chauhan
sumit chouhan
pratik dahibhat
prateek dahibat
partik dahibhat
;
data sample;
input name $ 1-32;
cards;
summit chauhan
prateek dahibhat
;
run;
proc sql;
create table want as
select h.name as have_name, s.name as sample_name
from have h,
sample s
where s.name =* h.name;
quit;
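Since the question also asks for the count of possible matches, one way to get it is to summarise the want table created above. A minimal sketch; the table name want_counts and the column n_matches are only illustrative.

```sas
/* Count how many names in have sound like each sample name */
proc sql;
  create table want_counts as
  select sample_name, count(*) as n_matches
  from want
  group by sample_name;
quit;
```

Sample names with no sounds-like match do not appear in want, so they are missing from want_counts as well.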