BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
rmacarthur
Pyrite | Level 9

HI SAS friends,

Am using SAS ver 9.4.
Need your help and expertise setting up text searching and selection.
There are two sample datasets below.
The dataset COLORS, has the colors terms that need to be searched for.
In the dataset HAVE, there are two columns (apples , heroes). Need to identify those rows in HAVE where
one of the colors is mentioned at least once. Need to search the two columns, Apples and Heroes,
for any instance of one of the colors.

Some rows will be completely excluded.  Some rows may have a color of interest mentioned more than once.
I'd guess this involves Do Loops, probably one per column of interest (Apples, Heros), but beyond am not sure what to do next.
Thanks for your suggestions, much appreciated.
R

 

data COLORS; 
   	input 		color $10.		; 
	datalines					;
red
purple
green
blue
;
run; 

data	HAVE	;
	input	Apples $14.	Heroes $20.	;
	datalines						;
red delicious	green hornet
sweet fuscia	assain orange
gala red	    lavender lantern
green granny	spidy red
sour grey	    assain purple
:
run	;
1 ACCEPTED SOLUTION

Accepted Solutions
kiranv_
Rhodochrosite | Level 12

something likes this. this will find whether color is there in apples or heroes

 

proc sql;
select distinct a.* from
have a
inner join
colors b
on index(trim(apples),trim(color)) gt 0
or index(trim(heroes),trim(color)) gt 0; 

View solution in original post

3 REPLIES 3
kiranv_
Rhodochrosite | Level 12

something likes this. this will find whether color is there in apples or heroes

 

proc sql;
select distinct a.* from
have a
inner join
colors b
on index(trim(apples),trim(color)) gt 0
or index(trim(heroes),trim(color)) gt 0; 
Karen_Horton
Obsidian | Level 7

How about if you merge the datasets together so that each colour is attached to the fruit (a many to many merge).

 

You could then use the index function to identify the colours present in the fruit name as everything would be present in the same dataset. There are no doubt more complicated methods, but this would be worth a go first. 

rmacarthur
Pyrite | Level 9

Hi, 

Regarding merging the two datasets, via a many to many merge, was trying to avoid that bc the actual set of search terms has about ~12000 entries (rows), and there are actually 2 columns of different "colors" (strings) to search for, not one. 

Also, in the dataset to be searched, there are about 27000 rows and also two columns of interest.

So there's much more data than is provided in the sample datasets. 

That said, it its' probably possible, and I like the simplicity.

Thank you,

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 2659 views
  • 0 likes
  • 3 in conversation