BookmarkSubscribeRSS Feed
Sheeba
Lapis Lazuli | Level 10

 Hi,

 

I am trying to do a many to many join on the datasets AE and patdata. I am trying to figure out if there is a way to reduce the number of iterations of inner loop. The key on which join is performed is subject.

 

I am planning to split the patdata into few subset based on the value of subject using the code i got from the internet. i tried creating the subsets with the name patdata_001 and patdata_002.

 

During the iteration i am trying to extract the value of subject from outer record (record from AE) into a macro variable value.

 

In the same step I am trying to read the corresponding data

 

set patdata_&value

but here the dataset name is not resolving .

 

but when i am trying to substitute the value directly eg :  patdata_001 it does work.

 

Is it because macro code gets compiled first? is there any alternate way for this.

 

Thanks in advance,

 

Regards,

Sheeba

 

proc sql noprint;

    select distinct SUBJECT into :subject_code

        separated by '|'

 

    from patdata;

quit;

 

%put &subject_code;

options mprint mlogic symbolgen;

%macro splitchar(string=,wordpfx=WORD,dlm=%str( ));

    %do cnt=1 %to %sysfunc(countw(&string,&dlm));

        %global &wordpfx&cnt;

        %let &wordpfx&cnt = %scan(&string,&cnt,%str(&dlm));

 

        %* echo macro var result to log window;

        %put &wordpfx&cnt=&&&wordpfx&cnt;

    %end;

    proc sql;

        %do i = 1 %to %sysfunc(countw(&string,&dlm));

            create table patdata_&&&wordpfx&i

                as select * from

                    patdata

                where subject = %unquote(%str(%')&&&wordpfx&i%str(%'));;

        %end;

    quit;

 

 

%mend splitchar;

 

%splitchar(string=&subject_code,dlm=|);

 

 

%macro temp;

DATA alldata0;

SET ae;

%global value;

call symput ('value', subject);

DROP _: match; * Drop temporary variables;

match=0;

/*Below I am looping the second dataset completely. is there any way to subset it?*/

DO i=1 TO xnobs;

SET patdata_&value(rename=(subject=_subject)) NOBS=xnobs POINT=i;

if subject=_subject THEN

DO;

match=1;

OUTPUT;

END;

END;

IF match=0 THEN DO; * Output AE record if no match in CM;

CALL MISSING(trt_code);

OUTPUT; END;

RUN;

%mend temp;

%temp;

 

 

4 REPLIES 4
Tom
Super User Tom
Super User

 

What are you actually trying to create?  Why do you think that you need to create multiple datasets?

You can easily do a many-to-many join in PROC SQL.

 

proc sql;
   create table alldata as
     select * 
     from patdata a
     full join ae b
     on a.subject = b.subject
  ;
quit;

     
Sheeba
Lapis Lazuli | Level 10
Hi Tom,

Thanks a lot for the reply.

I need to proceed with data step as I need to do many additional calculations during the creation of datasets . I am trying to achieve a many to many join and need to two datasets to capture matched and not matched .

I trying to iterate through a smaller loop .. So I am trying to create subsets of the second dataset. The macro variables are not getting resolved .. Is it due to the macro variables getting compiled first ?

Thanks again

Regards.
Sheeba Swaminathan
Tom
Super User Tom
Super User

Yes. You cannot change the name of the dataset used in a SET statement in the middle of a data step.  That is set when the step begins.

 

But I still do not understand what you are trying to do with that complicated piece of code. Or why it is necessary to make many seperate datasets.  Does what you do to the data depend on the id value of the subject?  If not the just use one data step and apply the same rules to every body.  Use by group processing is much easier than having to maintain hundreds or thousands of individual data sets.

 

First if PATDATA has just one record per subject then just merge to the two datasets.  You can decide whether to keep everyone or only those that appear in both datasets by using the IN= data set option to create flag variables.

data want ;
   merge patdata(in=in1) ae(in=in2) ;
   by subject ;
   if in1 and in2;
   ....
run;

Second if PATDATA has more than one record per subject then use proc sql to join them.  You can do more later in another step if you need to.  You can even make the join as a VIEW so that you only process the data in the DATA step.

proc sql ;
  create view patdata_ae as
    select *
    from patdata a inner join ae b
    on a.subject = b.subject 
    order by a.subject
  ;
quit;
data want ;
   set patdata_ae ;
   by subject ;
   ....
run;
Tom
Super User Tom
Super User

Here is a way to implement a many to many merge with just data steps. 

 

First calculate the first and last observataion number for each subject for one of that datasets.  Then you can do a one to many merge with that new table and use the POINT= option on a SET statement inside of a DO loop to pull in the mulitple observations.

 

data pointers ;
  set patdata ;
  by subject ;
  if first.subject then first=_n_;
  retain first;
  if last.subject then last=_n_;
  if last.subject then output;
  keep subject first last ;
run;

data alldata0;
  merge ae pointers (in=in2) ;
  by subject ;
  if not in2 then do;
    match=0;
    output;
  end;
  else do p=first to last;
    match=1;
    set patdata point=p ;
    output;
  end;
run;

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 947 views
  • 2 likes
  • 2 in conversation