BookmarkSubscribeRSS Feed
KPCklebspn
Obsidian | Level 7

Hi there,

 

I have a dataset where there are a bunch of character values for a variable (var). I want to print all observations that contain one of the values from a list of values for var.

 

I tried this code, but I think you can only put one value if you use the index function (i.e. can only list "abc" or "bbb", not both). Is there a function like the index function but where you can list multiple values?

 

proc print data = dataset; where index (var, "abc", "bbb"); run;

 

 

Example dataset:

var

abc ab

abc bb

abc da

abc rr

bbb ab

ccc

 

(I only want to print first 5 observations)


Thx!

4 REPLIES 4
TomKari
Onyx | Level 15

Here's an option:

 

Tom

 

data Want;
	length TestWord $50;
	set Have;
	WordFound = 0;

	do WordCount = 1 to countw(Var);
		TestWord = scan(Var, WordCount);

		if TestWord in("abc", "bbb") then
			WordFound = 1;
	end;

	if WordFound then
		output;
run;
Astounding
PROC Star

In the examples you listed, the key word appears at the beginning of VAR.  If that pattern holds true you can use:

 

where var in : ('abc', 'bbb');

 

The colon will limit the number of characters used for making a comparison.

mkeintz
PROC Star

Do you want the first 5  of 'aaa' OR 'bbb'?  Or do you want the first 5 'aaa' and the first 5 'bbb'?

 

If it's the earlier then

proc print data = dataset (obs=5);
 where var in: ("aaa", "bbb");
run;

As @Astounding mentioned the in : assumes you are looking a values the start with 'aaa' or 'bbb'.  Note that the "obs=5" honors the where filter.  I.e. you will get 5 qualifying records, not the qualifying subset of the first 5 records.

 

 

But if it's the latter, then:

data need / view=need;
  set dataset (where=(var=:'aaa') obs=5)
      dataset (where=(var=:'bbb') obs=5);
  by id;
run;
proc print data=need;
run;

 

I put in the "by id;" to illustrate what you can do if the data are already sorted by ID and you want the 10 records printed out in original order.  If you don't use the BY statement, then it will print 5 'aaa' records, then 5 'bbb' records.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
JackHamilton
Lapis Lazuli | Level 10
PRXMATCH can test for a match against multiple strings. As was mentioned already, using the IN: construct would be easier (and probably faster) if the qualifying string is always at the beginning.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 4 replies
  • 2606 views
  • 0 likes
  • 5 in conversation