BookmarkSubscribeRSS Feed
Learn_uk
Calcite | Level 5

Hello, I have what appears to be an easy question but am getting an error. Perhaps you can help:

Below is an example--I want to subset dataset x based on pt variable common only in dataset y. I want to do this using a macro variable (I realize there are many ways to do this but I just want to do it this way)...Can you help with the error?

data x;

input pt ht;

datalines;

1001 100

1002 32

1003 42

1004 93

;

data y;

input pt;

datalines;

1003

1003

1001

;

proc sql noprint;

     select distinct(pt) into :ids separated by ','

     from y;

quit;

data x2;

     set x;

     if pt in &ids then output;

run;

I get an error message for the last datastep...Any ideas why?

Thanks very much for reading!

6 REPLIES 6
stat_sas
Ammonite | Level 13

Just need to make a slight change in

data x2;

     set x;

     if pt in (&ids) then output;

run;

Learn_uk
Calcite | Level 5

Thanks very much stat@sas...that was precisely what I was looking for...Didn't realize that the parenthesis are essential but it makes sense...macros are just text substitution so thx for your answer. It works!

Scott_Mitchell
Quartz | Level 8

Just keep in mind that the maximum number of characters that can be stored in a Macro Variable is 65,534.  If you have a sizable dataset you may exceed this limitation and get an error.

Perhaps you could use a join, a merge or another type of lookup to ensure you don't run into this issue.

This is untested, so give it a shot and see how you go.

PROC SQL;

  CREATE TABLE Y1 AS

    SELECT X.PT, X.HT

  FROM X WHERE X.PT IN

    (SELECT DISTINCT(Y.PT)

  FROM Y)

    ;

QUIT;

Scott_Mitchell
Quartz | Level 8

A Hash Table might also be a valid solution.  The cool thing about the default functionality of the Hash Table is that it only contains unique values, so there is no need to dedupe or undertake a seperate SQL query to obtain the distinct values.

DATA HASHLOOKUP;

  IF _N_ = 1 THEN DO;

  DECLARE HASH HH(DATASET:'Y');

  HH.DEFINEKEY('PT');

  HH.DEFINEDONE();

  END;

  DO UNTIL (DONE);

  SET X END=DONE;

  RC = HH.CHECK();

  IF RC = 0 THEN OUTPUT;

  END;

RUN;

RC = HH.CHECK();  checks to see if the key appears in the Hash Table and where it does it RC is populated with 0 (if not it is populated with a number other than 0).  If the values of RC is 0 then we output the values from the X dataset.

Learn_uk
Calcite | Level 5

Thanks Scott...I really appreciate your response! Thank you so much. I didn't know you could use SQL that way to do the subset (not requiring say an inner join)...very cool for teaching me a new technique.

Learn_uk
Calcite | Level 5

Thanks Scott for the Hash Table method....Unfortunatly I don't know much about hashes so your code is outside my limited brain capacity...However, I shall learn to use hashes  & once I am comfortable enough shall revisit your solution in the future.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 925 views
  • 3 likes
  • 3 in conversation