BookmarkSubscribeRSS Feed
Learn_uk
Calcite | Level 5

Hello, I have what appears to be an easy question but am getting an error. Perhaps you can help:

Below is an example--I want to subset dataset x based on pt variable common only in dataset y. I want to do this using a macro variable (I realize there are many ways to do this but I just want to do it this way)...Can you help with the error?

data x;

input pt ht;

datalines;

1001 100

1002 32

1003 42

1004 93

;

data y;

input pt;

datalines;

1003

1003

1001

;

proc sql noprint;

     select distinct(pt) into :ids separated by ','

     from y;

quit;

data x2;

     set x;

     if pt in &ids then output;

run;

I get an error message for the last datastep...Any ideas why?

Thanks very much for reading!

6 REPLIES 6
stat_sas
Ammonite | Level 13

Just need to make a slight change in

data x2;

     set x;

     if pt in (&ids) then output;

run;

Learn_uk
Calcite | Level 5

Thanks very much stat@sas...that was precisely what I was looking for...Didn't realize that the parenthesis are essential but it makes sense...macros are just text substitution so thx for your answer. It works!

Scott_Mitchell
Quartz | Level 8

Just keep in mind that the maximum number of characters that can be stored in a Macro Variable is 65,534.  If you have a sizable dataset you may exceed this limitation and get an error.

Perhaps you could use a join, a merge or another type of lookup to ensure you don't run into this issue.

This is untested, so give it a shot and see how you go.

PROC SQL;

  CREATE TABLE Y1 AS

    SELECT X.PT, X.HT

  FROM X WHERE X.PT IN

    (SELECT DISTINCT(Y.PT)

  FROM Y)

    ;

QUIT;

Scott_Mitchell
Quartz | Level 8

A Hash Table might also be a valid solution.  The cool thing about the default functionality of the Hash Table is that it only contains unique values, so there is no need to dedupe or undertake a seperate SQL query to obtain the distinct values.

DATA HASHLOOKUP;

  IF _N_ = 1 THEN DO;

  DECLARE HASH HH(DATASET:'Y');

  HH.DEFINEKEY('PT');

  HH.DEFINEDONE();

  END;

  DO UNTIL (DONE);

  SET X END=DONE;

  RC = HH.CHECK();

  IF RC = 0 THEN OUTPUT;

  END;

RUN;

RC = HH.CHECK();  checks to see if the key appears in the Hash Table and where it does it RC is populated with 0 (if not it is populated with a number other than 0).  If the values of RC is 0 then we output the values from the X dataset.

Learn_uk
Calcite | Level 5

Thanks Scott...I really appreciate your response! Thank you so much. I didn't know you could use SQL that way to do the subset (not requiring say an inner join)...very cool for teaching me a new technique.

Learn_uk
Calcite | Level 5

Thanks Scott for the Hash Table method....Unfortunatly I don't know much about hashes so your code is outside my limited brain capacity...However, I shall learn to use hashes  & once I am comfortable enough shall revisit your solution in the future.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1943 views
  • 3 likes
  • 3 in conversation