prashantchitta4
Calcite | Level 5

Hi, I have a base dataset called bigdata, which contains records such as:

sumit chohan

sumeet chauhan

sumit chouhan

pratik dahibhat

prateek dahibat

partik dahibhat

and a sample dataset which contains:

summit chauhan

prateek dahibhat

I want to count the possible matches and also extract them from the base dataset, which in our case is bigdata.

Please suggest an approach, if any.

Thanks,

Prashant

3 REPLIES
RW9
Diamond | Level 26

Well, there are several ways you could do this. The way I would probably do it is:

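data _null_;
  set small_data end=last;
  /* Open the generated step once; END= belongs on the SET statement */
  if _n_=1 then call execute('data want; set bigdata end=last; retain count;');
  /* CALL EXECUTE takes a single string, so build each IF statement with CATS */
  call execute(cats('if index(variable,"',snippet,'") > 0 then count=sum(count,1);'));
  /* After the last snippet, write the single total row and close the step */
  if last then call execute('if last then output; run;');
run;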

This assumes that you have a dataset small_data containing a variable snippet, and a dataset bigdata with a variable called variable.  It generates a data step with one if statement for each row of small_data, and outputs a single row with the total.  Since you also want the matching rows themselves, you could extend it like this:

data _null_;
  set small_data end=last;
  /* Two outputs: WANT gets the total, MATCHES gets every matching row */
  if _n_=1 then call execute('data want matches; set bigdata end=last; retain count;');
  call execute(cats('if index(variable,"',snippet,'") > 0 then do; count=sum(count,1); output matches; end;'));
  if last then call execute('if last then output want; run;');
run;

You could also do the same by merging or joining the two datasets.  Which approach fits best depends on what your data looks like: whether the casing matches, how close a match needs to be, and so on.
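
As a rough sketch of the join approach, assuming the same illustrative names as above (small_data with a variable snippet, bigdata with a variable called variable), something like the following might work. The dataset name matches and the column match_count are made up for this example, and UPCASE is only there to sidestep casing differences:

```sas
proc sql;
  /* Rows of bigdata that contain any snippet, compared case-insensitively */
  create table matches as
  select b.variable, s.snippet
  from   bigdata b,
         small_data s
  where  index(upcase(b.variable), upcase(strip(s.snippet))) > 0;

  /* Number of matching (row, snippet) pairs */
  select count(*) as match_count
  from   matches;
quit;
```

Like the INDEX version above, this only finds exact substrings, so a row that matches two snippets is counted twice and spelling variants are not picked up.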

prashantchitta4
Calcite | Level 5

Thank you so much for your reply...

AskoLötjönen
Quartz | Level 8

Hi,

you can try the sounds-like operator (=*). It is designed for English, though, so there may be some difficulties with Indian names.

data have;
input name $ 1-32;
cards;
sumit chohan               
sumeet chauhan             
sumit chouhan              
pratik dahibhat            
prateek dahibat            
partik dahibhat            
;

data sample;
input name $ 1-32;
cards;
summit chauhan             
prateek dahibhat           
;
run;

proc sql;
create table want as
select h.name as have_name, s.name as sample_name
from   have h,
       sample s
where  s.name =* h.name;
quit;
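
To also get the count of possible matches, a minimal sketch along the same lines could group the sounds-like join by sample name. The second query is an alternative I am adding as an assumption, not something from the question: it compares names by Levenshtein edit distance with COMPLEV, and the table name want_lev and the cutoff of 3 are arbitrary choices that would need tuning on real data.

```sas
proc sql;
  /* Count of sounds-like matches for each sample name */
  select s.name as sample_name, count(*) as n_matches
  from   have h,
         sample s
  where  s.name =* h.name
  group by s.name;

  /* Alternative: keep pairs within an edit distance of 3 (arbitrary cutoff) */
  create table want_lev as
  select h.name as have_name, s.name as sample_name,
         complev(strip(h.name), strip(s.name)) as distance
  from   have h,
         sample s
  where  calculated distance <= 3
  order by sample_name, distance;
quit;
```

Edit distance does not rely on English pronunciation rules, so it may behave more predictably on names like these, at the cost of having to choose a cutoff.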
