Hi, I have a base dataset called bigdata,
which contains data lines like:
sumit chohan
sumeet chauhan
sumit chouhan
pratik dahibhat
prateek dahibat
partik dahibhat
and a sample dataset which contains:
summit chauhan
prateek dahibhat
I want to find the count of the possible matches, and also extract the possible matches from the base dataset (bigdata in our case).
Please suggest an approach, if any.
Thanks,
Prashant.
Well, there are several ways you could do this. The way I would probably do it is:
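data _null_;
  set small_data end=last;
  /* Open the generated step once; END= belongs on the SET statement. */
  if _n_=1 then call execute('data want; set bigdata end=last; retain count 0;');
  /* CALL EXECUTE takes one string, so build each IF statement with CATS. */
  call execute(cats('if index(variable,"', snippet, '") > 0 then count=sum(count,1);'));
  if last then call execute(' if last then output; run;');
run;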
This assumes that you have small_data, which contains a variable snippet, and a dataset bigdata with a variable called variable. It generates a data step with one IF statement for each row of small_data, and outputs one row with the total. As you also want the matching rows, maybe add a second output dataset:
data _null_;
  set small_data end=last;
  /* WANT gets the total; MATCHES gets each matching row of bigdata. */
  if _n_=1 then call execute('data want matches; set bigdata end=last; retain count 0;');
  call execute(cats('if index(variable,"', snippet, '") > 0 then do; count=sum(count,1); output matches; end;'));
  if last then call execute(' if last then output want; run;');
run;
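You could also do the same by merging (joining) the two datasets. Note that INDEX only finds exact, case-sensitive substrings, so it will not catch spelling variants. It also depends on how your data looks: does the casing match, how close does a match have to be, and so on.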
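The join idea could be sketched roughly as below. This is only a sketch under the same assumptions as above (small_data with a variable snippet, bigdata with a variable called variable); the output table names matches and match_counts are made up for illustration. It uses FIND with the 'i' modifier so that casing is ignored, but it is still a plain substring match, not a fuzzy one.

```sas
proc sql;
  /* Every row of bigdata that contains a snippet, ignoring case. */
  create table matches as
  select b.variable, s.snippet
  from bigdata as b, small_data as s
  where find(b.variable, strip(s.snippet), 'i') > 0;

  /* Number of matching rows per snippet. */
  create table match_counts as
  select snippet, count(*) as n_matches
  from matches
  group by snippet;
quit;
```

Snippets with no match at all will not appear in match_counts. The join compares every row of bigdata with every snippet, so it may be slow if both datasets are large.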
Thank you so much for your reply...
Hi,
you can try the sounds-like operator (=*). It is designed for English, though, so there may be some difficulties with Indian names.
data have;
input name $ 1-32;
cards;
sumit chohan
sumeet chauhan
sumit chouhan
pratik dahibhat
prateek dahibat
partik dahibhat
;
data sample;
input name $ 1-32;
cards;
summit chauhan
prateek dahibhat
;
run;
proc sql;
create table want as
select h.name as have_name, s.name as sample_name
from have h,
sample s
where s.name =* h.name;
quit;
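Since the question also asks for the count of possible matches, one way to get it is to summarise the want table created above. A minimal sketch, where n_matches is just an illustrative column name:

```sas
proc sql;
  /* How many base names sound like each sample name. */
  select sample_name, count(*) as n_matches
  from want
  group by sample_name;
quit;
```

Sample names with no sounds-like match are absent from want, so they will not show up in this summary.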