BookmarkSubscribeRSS Feed
Jamie_H
Fluorite | Level 6

Does anyone know how I would create a macro that would create new columns that are combinations of every join of existing columns.  For example, I have a dataset with column names B, C, D & E.  My desired dataset should have those columns, plus the concatenated values for B&C, B&D. B&E, B&C&D, B&D&E, C&D&E, B&C&D&E.

 

A couple of caveats to this. 

 

1) It needs to be agnostic of the dataset, so would need to go to the dictionary tables to get the list of variables.

2) There will be variables I wouldn't like to be included in the output other than the individual variable itself.  For example, if variables A & F also existed in the dataset, i'd like a way of saying ignore A & F in the macro to concatenate, but still output the value for A & F by themselves on the final dataset.

 

Thanks

3 REPLIES 3
Jamie_H
Fluorite | Level 6
I think a 3rd caveat might also be needed for this challenge. Very quickly, the column names will become too large. So that will also need to be considered. Maybe renaming each variable to a simple letter combination and then having a label hold what the new columns is made up from?
Astounding
PROC Star
What's the logic that tells you to exclude other combinations such as:

C&D
C&E
D&E
Tom
Super User Tom
Super User

You need to explain more what you want to do.  For the main part of the logic you can just use the CALL ALLCOMB() method.  Just a simple extension of the example in the documentation.

First generate an observation with every possible combination of NVARs.  The second step blanks out the names not used for that particular combination.

data combo;
   array name[&nvar] $32 (&varlist);
   n=dim(name);
   do k=1 to n;
     ncomb=comb(n, k);
     do j=1 to ncomb;
       call allcomb(j, k, of name[*]);
       output;
     end;
   end;
run;

data combo;
  set combo;
  array name[&nvar];
  do _n_=k+1 to dim(name); name[_n_]=' '; end;
run;

Partial results:

Tom_0-1718807849880.png

You could easily get the list of variable names into the macro variable VARLIST from the CONTENTS of the source dataset.

proc contents data=have out=contents noprint; run;

proc sql noprint;
  select quote(trim(name)) into :varlist separated by ' '
  from contents
  order by varnum
  ;
%let nvar=&sqlobs;
quit;

Once you have list of combinations you could now use it to generate code.

Not sure what you want but perhaps something like this:

filename code temp;
data _null_;
  set combo;
  file code;
  array name[&nvar] ;
  varname=catx('_','combo',k,j);
  put varname "= catx('|'" @;
  do i=1 to k;
    put ',' name[i] @;
  end;
  put ');' ;
run;

data want;
  set have;
%include code / source2;
run;

Tom_1-1718808323705.png

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 177 views
  • 2 likes
  • 3 in conversation