BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
mohamedk
Fluorite | Level 6
Hi All

I’d like to create a variable or macro that includes info from an existing dataset.

What I’m trying to achieve is to add a where clause to look at the info without having to write each one down all the time.

Eg:
Data want;
Set have;
Where column in (column1);
Run;

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26

@mohamedk wrote:
From a previous output

So somehow, these numbers were in a previous output. What you ought to do is have the previous output store the numbers in a data set. Then, you can do something like this, where the variable name in the previous data set is ABC:

 

LEFT JOIN 
	tableB as b ON a.CustomerNum = b.CustomerNum
where 
	b.field1 IN (select distinct abc from mypreviousdataset);
quit;

By the way, when you need help with SQL, don't start the thread by asking a DATA step question.

--
Paige Miller

View solution in original post

10 REPLIES 10
PaigeMiller
Diamond | Level 26

@mohamedk wrote:

I’d like to create a variable or macro that includes info from an existing dataset.

What I’m trying to achieve is to add a where clause to look at the info without having to write each one down all the time.

Eg:
Data want;
Set have;
Where column in (column1);
Run;


I'm afraid I don't understand

 

Where column in (column1);

as this is not valid syntax in neither a DATA step nor in PROC SQL, and in my mind not valid logic here either. What is the logic you are trying to implement? Can you give simple examples?

--
Paige Miller
ballardw
Super User

@mohamedk wrote:
Hi All

I’d like to create a variable or macro that includes info from an existing dataset.

What I’m trying to achieve is to add a where clause to look at the info without having to write each one down all the time.

Eg:
Data want;
Set have;
Where column in (column1);
Run;


Your "request" sounds like you want all the values for a variable. At which point a "where" is useless as by default you would get all the values.

So you should provide a mite more detailed description such as example data, rules for selecting values from that data set and then we might have something that makes sense.

Tom
Super User Tom
Super User

For this to make sense you need to have two datasets.  One with the list of values your want to filter to.  For example a list of subjects or list of states etc.  The other is the dataset you want to analyze.

 

So you could just merge the two.  If they are sorted already by that variable use a simple data step merge.

data want;
  merge have list(in=inlist);
  by subject ;
  if inlist;
run;

Or you could use SQL instead.

proc sql ;
create table want as
  select * 
  from have
  where subject in (select subject from list)
;
quit;

If the list of values is smallish you could put them into a macro variable.

proc sql noprint;
select distinct subject into :slist separated by ' '
from list
;
quit;

Note if the variable is character add quotes.

proc sql noprint;
select distinct quote(trim(subject)) into :slist separated by ' '
from list
;
quit;

Then use the list in your WHERE clause.

data want ;
  set have;
  where subject in (&slist);
run;
mohamedk
Fluorite | Level 6

Apologies All,

It seems I was quite unclear in my questions.

 

My current query is:

proc sql;
create table test as 
select 
a.*,
b.field1,
b.field2,
b.field3
from tableA as a
LEFT JOIN 
	tableB as b ON a.CustomerNum = b.CustomerNum
where 
	b.field1 IN (123,456,789,010,111,212,131,415);
quit;

I'd like to change the way I'm manually adding the numbers in the where clause to something less manual as currently I copied and pasted to test the logic of my query.

 

Hoping the above makes more sense that my initial query

cc:@PaigeMiller @ballardw 

mkeintz
PROC Star

And where did you get the values that you manually entered in the list

   

(123,456,789,010,111,212,131,415)

 

To "de-manual" this process of adding to this list, you need to tell us how these numeric values were established. 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
mohamedk
Fluorite | Level 6
From a previous output
PaigeMiller
Diamond | Level 26

@mohamedk wrote:
From a previous output

So somehow, these numbers were in a previous output. What you ought to do is have the previous output store the numbers in a data set. Then, you can do something like this, where the variable name in the previous data set is ABC:

 

LEFT JOIN 
	tableB as b ON a.CustomerNum = b.CustomerNum
where 
	b.field1 IN (select distinct abc from mypreviousdataset);
quit;

By the way, when you need help with SQL, don't start the thread by asking a DATA step question.

--
Paige Miller
mohamedk
Fluorite | Level 6

Thank you. Will try this out.

Apologies, relatively new to SAS so will post correctly In future.

 

Edit: @PaigeMiller Thank you - your suggestion worked.

Kurt_Bremser
Super User

You can also use a hash for a lookup:

data want;
set have;
if _n_ = 1
then do;
  declare hash l (dataset:"previous (keep=field1)");
  l.definekey(field1);
  l.definedone();
end;
if l.check() = 0;
run;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 10 replies
  • 1973 views
  • 5 likes
  • 6 in conversation