DATA Step, Macro, Functions and more

Error message: sql, macro variables and datastep integration

Reply
Occasional Contributor
Posts: 15

Error message: sql, macro variables and datastep integration

Hello, I have what appears to be an easy question but am getting an error. Perhaps you can help:

Below is an example--I want to subset dataset x based on pt variable common only in dataset y. I want to do this using a macro variable (I realize there are many ways to do this but I just want to do it this way)...Can you help with the error?

data x;

input pt ht;

datalines;

1001 100

1002 32

1003 42

1004 93

;

data y;

input pt;

datalines;

1003

1003

1001

;

proc sql noprint;

     select distinct(pt) into :ids separated by ','

     from y;

quit;

data x2;

     set x;

     if pt in &ids then output;

run;

I get an error message for the last datastep...Any ideas why?

Thanks very much for reading!

Trusted Advisor
Posts: 1,204

Re: Error message: sql, macro variables and datastep integration

Just need to make a slight change in

data x2;

     set x;

     if pt in (&ids) then output;

run;

Occasional Contributor
Posts: 15

Re: Error message: sql, macro variables and datastep integration

Thanks very much stat@sas...that was precisely what I was looking for...Didn't realize that the parenthesis are essential but it makes sense...macros are just text substitution so thx for your answer. It works!

Super Contributor
Posts: 297

Re: Error message: sql, macro variables and datastep integration

Just keep in mind that the maximum number of characters that can be stored in a Macro Variable is 65,534.  If you have a sizable dataset you may exceed this limitation and get an error.

Perhaps you could use a join, a merge or another type of lookup to ensure you don't run into this issue.

This is untested, so give it a shot and see how you go.

PROC SQL;

  CREATE TABLE Y1 AS

    SELECT X.PT, X.HT

  FROM X WHERE X.PT IN

    (SELECT DISTINCT(Y.PT)

  FROM Y)

    ;

QUIT;

Super Contributor
Posts: 297

Re: Error message: sql, macro variables and datastep integration

A Hash Table might also be a valid solution.  The cool thing about the default functionality of the Hash Table is that it only contains unique values, so there is no need to dedupe or undertake a seperate SQL query to obtain the distinct values.

DATA HASHLOOKUP;

  IF _N_ = 1 THEN DO;

  DECLARE HASH HH(DATASET:'Y');

  HH.DEFINEKEY('PT');

  HH.DEFINEDONE();

  END;

  DO UNTIL (DONE);

  SET X END=DONE;

  RC = HH.CHECK();

  IF RC = 0 THEN OUTPUT;

  END;

RUN;

RC = HH.CHECK();  checks to see if the key appears in the Hash Table and where it does it RC is populated with 0 (if not it is populated with a number other than 0).  If the values of RC is 0 then we output the values from the X dataset.

Occasional Contributor
Posts: 15

Re: Error message: sql, macro variables and datastep integration

Thanks Scott...I really appreciate your response! Thank you so much. I didn't know you could use SQL that way to do the subset (not requiring say an inner join)...very cool for teaching me a new technique.

Occasional Contributor
Posts: 15

Re: Error message: sql, macro variables and datastep integration

Thanks Scott for the Hash Table method....Unfortunatly I don't know much about hashes so your code is outside my limited brain capacity...However, I shall learn to use hashes  & once I am comfortable enough shall revisit your solution in the future.

Ask a Question
Discussion stats
  • 6 replies
  • 256 views
  • 3 likes
  • 3 in conversation