Hello, I have what appears to be an easy question but am getting an error. Perhaps you can help:
Below is an example--I want to subset dataset x based on pt variable common only in dataset y. I want to do this using a macro variable (I realize there are many ways to do this but I just want to do it this way)...Can you help with the error?
data x;
input pt ht;
datalines;
1001 100
1002 32
1003 42
1004 93
;
data y;
input pt;
datalines;
1003
1003
1001
;
proc sql noprint;
select distinct(pt) into :ids separated by ','
from y;
quit;
data x2;
set x;
if pt in &ids then output;
run;
I get an error message for the last datastep...Any ideas why?
Thanks very much for reading!
Just need to make a slight change in
data x2;
set x;
if pt in (&ids) then output;
run;
Thanks very much stat@sas...that was precisely what I was looking for...Didn't realize that the parenthesis are essential but it makes sense...macros are just text substitution so thx for your answer. It works!
Just keep in mind that the maximum number of characters that can be stored in a Macro Variable is 65,534. If you have a sizable dataset you may exceed this limitation and get an error.
Perhaps you could use a join, a merge or another type of lookup to ensure you don't run into this issue.
This is untested, so give it a shot and see how you go.
PROC SQL;
CREATE TABLE Y1 AS
SELECT X.PT, X.HT
FROM X WHERE X.PT IN
(SELECT DISTINCT(Y.PT)
FROM Y)
;
QUIT;
A Hash Table might also be a valid solution. The cool thing about the default functionality of the Hash Table is that it only contains unique values, so there is no need to dedupe or undertake a seperate SQL query to obtain the distinct values.
DATA HASHLOOKUP;
IF _N_ = 1 THEN DO;
DECLARE HASH HH(DATASET:'Y');
HH.DEFINEKEY('PT');
HH.DEFINEDONE();
END;
DO UNTIL (DONE);
SET X END=DONE;
RC = HH.CHECK();
IF RC = 0 THEN OUTPUT;
END;
RUN;
RC = HH.CHECK(); checks to see if the key appears in the Hash Table and where it does it RC is populated with 0 (if not it is populated with a number other than 0). If the values of RC is 0 then we output the values from the X dataset.
Thanks Scott...I really appreciate your response! Thank you so much. I didn't know you could use SQL that way to do the subset (not requiring say an inner join)...very cool for teaching me a new technique.
Thanks Scott for the Hash Table method....Unfortunatly I don't know much about hashes so your code is outside my limited brain capacity...However, I shall learn to use hashes & once I am comfortable enough shall revisit your solution in the future.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.