DATA Step, Macro, Functions and more

regarding macro variable resolution

Reply
Regular Contributor
Posts: 167

regarding macro variable resolution

 Hi,

 

I am trying to do a many to many join on the datasets AE and patdata. I am trying to figure out if there is a way to reduce the number of iterations of inner loop. The key on which join is performed is subject.

 

I am planning to split the patdata into few subset based on the value of subject using the code i got from the internet. i tried creating the subsets with the name patdata_001 and patdata_002.

 

During the iteration i am trying to extract the value of subject from outer record (record from AE) into a macro variable value.

 

In the same step I am trying to read the corresponding data

 

set patdata_&value

but here the dataset name is not resolving .

 

but when i am trying to substitute the value directly eg :  patdata_001 it does work.

 

Is it because macro code gets compiled first? is there any alternate way for this.

 

Thanks in advance,

 

Regards,

Sheeba

 

proc sql noprint;

    select distinct SUBJECT into :subject_code

        separated by '|'

 

    from patdata;

quit;

 

%put &subject_code;

options mprint mlogic symbolgen;

%macro splitchar(string=,wordpfx=WORD,dlm=%str( ));

    %do cnt=1 %to %sysfunc(countw(&string,&dlm));

        %global &wordpfx&cnt;

        %let &wordpfx&cnt = %scan(&string,&cnt,%str(&dlm));

 

        %* echo macro var result to log window;

        %put &wordpfx&cnt=&&&wordpfx&cnt;

    %end;

    proc sql;

        %do i = 1 %to %sysfunc(countw(&string,&dlm));

            create table patdata_&&&wordpfx&i

                as select * from

                    patdata

                where subject = %unquote(%str(%')&&&wordpfx&i%str(%'));;

        %end;

    quit;

 

 

%mend splitchar;

 

%splitchar(string=&subject_code,dlm=|);

 

 

%macro temp;

DATA alldata0;

SET ae;

%global value;

call symput ('value', subject);

DROP _: match; * Drop temporary variables;

match=0;

/*Below I am looping the second dataset completely. is there any way to subset it?*/

DO i=1 TO xnobs;

SET patdata_&value(rename=(subject=_subject)) NOBS=xnobs POINT=i;

if subject=_subject THEN

DO;

match=1;

OUTPUT;

END;

END;

IF match=0 THEN DO; * Output AE record if no match in CM;

CALL MISSING(trt_code);

OUTPUT; END;

RUN;

%mend temp;

%temp;

 

 

Super User
Super User
Posts: 7,039

Re: regarding macro variable resolution

[ Edited ]

 

What are you actually trying to create?  Why do you think that you need to create multiple datasets?

You can easily do a many-to-many join in PROC SQL.

 

proc sql;
   create table alldata as
     select * 
     from patdata a
     full join ae b
     on a.subject = b.subject
  ;
quit;

     
Regular Contributor
Posts: 167

Re: regarding macro variable resolution

Hi Tom,

Thanks a lot for the reply.

I need to proceed with data step as I need to do many additional calculations during the creation of datasets . I am trying to achieve a many to many join and need to two datasets to capture matched and not matched .

I trying to iterate through a smaller loop .. So I am trying to create subsets of the second dataset. The macro variables are not getting resolved .. Is it due to the macro variables getting compiled first ?

Thanks again

Regards.
Sheeba Swaminathan
Super User
Super User
Posts: 7,039

Re: regarding macro variable resolution

Yes. You cannot change the name of the dataset used in a SET statement in the middle of a data step.  That is set when the step begins.

 

But I still do not understand what you are trying to do with that complicated piece of code. Or why it is necessary to make many seperate datasets.  Does what you do to the data depend on the id value of the subject?  If not the just use one data step and apply the same rules to every body.  Use by group processing is much easier than having to maintain hundreds or thousands of individual data sets.

 

First if PATDATA has just one record per subject then just merge to the two datasets.  You can decide whether to keep everyone or only those that appear in both datasets by using the IN= data set option to create flag variables.

data want ;
   merge patdata(in=in1) ae(in=in2) ;
   by subject ;
   if in1 and in2;
   ....
run;

Second if PATDATA has more than one record per subject then use proc sql to join them.  You can do more later in another step if you need to.  You can even make the join as a VIEW so that you only process the data in the DATA step.

proc sql ;
  create view patdata_ae as
    select *
    from patdata a inner join ae b
    on a.subject = b.subject 
    order by a.subject
  ;
quit;
data want ;
   set patdata_ae ;
   by subject ;
   ....
run;
Super User
Super User
Posts: 7,039

Re: regarding macro variable resolution

Here is a way to implement a many to many merge with just data steps. 

 

First calculate the first and last observataion number for each subject for one of that datasets.  Then you can do a one to many merge with that new table and use the POINT= option on a SET statement inside of a DO loop to pull in the mulitple observations.

 

data pointers ;
  set patdata ;
  by subject ;
  if first.subject then first=_n_;
  retain first;
  if last.subject then last=_n_;
  if last.subject then output;
  keep subject first last ;
run;

data alldata0;
  merge ae pointers (in=in2) ;
  by subject ;
  if not in2 then do;
    match=0;
    output;
  end;
  else do p=first to last;
    match=1;
    set patdata point=p ;
    output;
  end;
run;
Ask a Question
Discussion stats
  • 4 replies
  • 321 views
  • 2 likes
  • 2 in conversation