DATA Step, Macro, Functions and more

Combination condition from 2 files

Accepted Solution Solved
Reply
Super Contributor
Posts: 371
Accepted Solution

Combination condition from 2 files

[ Edited ]

Hello Everyone,

I have 2 files, one file contains condition on variable "white", one contains condition for variable "black".

The following SQL code and create all combinations for white and black conditions.

 

I want to have the code that allow me to specify number of White combine with number of Black.

For example: 2 condition White (W) and 2 condition black(B). so the output file look like:

w1_name  w1_value  w2_name  w2_value     b1_name  b1_value  b2_name  b2_value

 

The output for 2black and 2 white should be:

 

B1 1 B2 5 W1 2 W2 5
B1 1 B2 5 W1 2 W2 5
B2 5 B3 5 W1 2 W2 5
B1 1 B2 5 W1 2 W3 7
B1 1 B2 5 W1 2 W3 7
B2 5 B3 5 W1 2 W3 7
B1 1 B2 5 W2 5 W3 7
B1 1 B2 5 W2 5 W3 7
B2 5 B3 5 W2 5 W3 7

 

I guess it could be done through some sort of DO LOOP but I dont know how to do it.

 

Could anyone help me with that?

 

Thank you so much for your time.

 

HHC

 


data black;
input black_name $ black_value;
datalines;
b1 1
b2 5
b3 5
; run;

data white;
input white_name $ white_value;
datalines;
w1 2
w2 5
w3 7
;run;

*1 black with 1 white;
data black; set black;
d=1;run;

data white; set white;
d=1;run;

proc sql;
create table want1_1 as select * from black as a full join white as b
on a.d=b.d;quit;

 


Accepted Solutions
Solution
‎02-09-2017 11:12 AM
Super Contributor
Posts: 474

Re: Combination condition from 2 files

OK.

I've modified the macro to get you what you want, as long as you don't mind a different order:

%macro doit(IN_W,IN_B,B,W,OUT);
proc sql noprint;
create table &OUT (drop = _:) as
     select distinct *
     from 
     %do I=1 %to &B;
     &IN_B (keep=BLACK_NAME BLACK_VALUE
            rename=(BLACK_NAME=B&I._NAME BLACK_VALUE=B&I._VALUE)) as B&I,%end;
     %do J=1 %to &W;
     &IN_W (keep=WHITE_NAME WHITE_VALUE
            rename=(WHITE_NAME=W&J._NAME WHITE_VALUE=W&J._VALUE)) as W&J,%end;
     &IN_W (obs=1 keep=WHITE_NAME rename=(WHITE_NAME=_DUMMY)) as _DUMMY
     where %do I=1 %to %eval(&B-1); B&I._NAME lt B%eval(&I+1)_NAME and %end;
           %do J=1 %to %eval(&W-1); W&J._NAME lt W%eval(&J+1)_NAME and %end;
     1;
quit;
%mend doit;
%doit(WHITE,BLACK,2,2,WANT); * in_white, in_black, white, black, out;


The cartesian product will produce the permutations following the rules (where condition) of the results you need.

For example for B=2, W=2 the macro will build the following SQL statement:

 

select distinct * from
       BLACK (keep=BLACK_NAME BLACK_VALUE
              rename=(BLACK_NAME=B1_NAME BLACK_VALUE=B1_VALUE)) as B1,
       BLACK (keep=BLACK_NAME BLACK_VALUE
              rename=(BLACK_NAME=B2_NAME BLACK_VALUE=B2_VALUE)) as B2,
       WHITE (keep=WHITE_NAME WHITE_VALUE
              rename=(WHITE_NAME=W1_NAME WHITE_VALUE=W1_VALUE)) as W1,
       WHITE (keep=WHITE_NAME WHITE_VALUE
              rename=(WHITE_NAME=W2_NAME WHITE_VALUE=W2_VALUE)) as W2

where B1_NAME lt B2_NAME and W1_NAME lt W2_NAME


Hope it helps.

Daniel Santos @ www.cgd.pt

View solution in original post


All Replies
Super User
Posts: 10,552

Re: Combination condition from 2 files

It looks like you are either looking to combine data and then transpose or transpose and then combine from your desired output description.

Could you post the exact values for the desired output given your example data?

 

Since d=1 for ALL records in both sets the role it plays is not obvious.

Super User
Posts: 5,099

Re: Combination condition from 2 files

As it stands now, your program gives you all combinations of the observations.  You could have gotten it simply, without creating the variable D:

 

proc sql;

create table want as select * from black, white;

quit;

 

When you describe what you want, it's not clear if you are trying to use the same sets of variables, but with a subset of the combinations.  If that is the case, you are permitted to specify options on the data sets such as:

 

proc sql;

create table want as select * from black (obs=2), white (firstobs=3);

quit;

 

Those may be the tools you are looking for.

Super Contributor
Posts: 371

Re: Combination condition from 2 files

I just update my post with the complete final "want" file to make it clear.

Could you please check it again to see if you can help with that problem?

Thank you,

HHC

Solution
‎02-09-2017 11:12 AM
Super Contributor
Posts: 474

Re: Combination condition from 2 files

OK.

I've modified the macro to get you what you want, as long as you don't mind a different order:

%macro doit(IN_W,IN_B,B,W,OUT);
proc sql noprint;
create table &OUT (drop = _:) as
     select distinct *
     from 
     %do I=1 %to &B;
     &IN_B (keep=BLACK_NAME BLACK_VALUE
            rename=(BLACK_NAME=B&I._NAME BLACK_VALUE=B&I._VALUE)) as B&I,%end;
     %do J=1 %to &W;
     &IN_W (keep=WHITE_NAME WHITE_VALUE
            rename=(WHITE_NAME=W&J._NAME WHITE_VALUE=W&J._VALUE)) as W&J,%end;
     &IN_W (obs=1 keep=WHITE_NAME rename=(WHITE_NAME=_DUMMY)) as _DUMMY
     where %do I=1 %to %eval(&B-1); B&I._NAME lt B%eval(&I+1)_NAME and %end;
           %do J=1 %to %eval(&W-1); W&J._NAME lt W%eval(&J+1)_NAME and %end;
     1;
quit;
%mend doit;
%doit(WHITE,BLACK,2,2,WANT); * in_white, in_black, white, black, out;


The cartesian product will produce the permutations following the rules (where condition) of the results you need.

For example for B=2, W=2 the macro will build the following SQL statement:

 

select distinct * from
       BLACK (keep=BLACK_NAME BLACK_VALUE
              rename=(BLACK_NAME=B1_NAME BLACK_VALUE=B1_VALUE)) as B1,
       BLACK (keep=BLACK_NAME BLACK_VALUE
              rename=(BLACK_NAME=B2_NAME BLACK_VALUE=B2_VALUE)) as B2,
       WHITE (keep=WHITE_NAME WHITE_VALUE
              rename=(WHITE_NAME=W1_NAME WHITE_VALUE=W1_VALUE)) as W1,
       WHITE (keep=WHITE_NAME WHITE_VALUE
              rename=(WHITE_NAME=W2_NAME WHITE_VALUE=W2_VALUE)) as W2

where B1_NAME lt B2_NAME and W1_NAME lt W2_NAME


Hope it helps.

Daniel Santos @ www.cgd.pt

Super Contributor
Posts: 371

Re: Combination condition from 2 files

Hi DanielSantos,

This code is great but complicate.

I can see that for a 2black 2white, the code creates 4 seperate file B1 B2 W1 W2 (from the 2 loop)

1 more file WHITE is created as dummy

 

How you merge them in the last 3 row is really magical to me. Especially the "1" at the end.

Can you ellaborate the code a bit?

Thank you.

HHC

     where %do I=1 %to %eval(&B-1); B&I._NAME lt B%eval(&I+1)_NAME and %end;
           %do J=1 %to %eval(&W-1); W&J._NAME lt W%eval(&J+1)_NAME and %end;
     1;
Super User
Posts: 10,552

Re: Combination condition from 2 files


hhchenfx wrote:

I just update my post with the complete final "want" file to make it clear.

Could you please check it again to see if you can help with that problem?

Thank you,

HHC


You Still have not described the role of your D variable.

This does sort of a two by two case but this will not expand easily to a generic number and I am not sure whether your "want" has been reduced from all combinations or if there is some as yet undefined rule to get that specific result.

proc sql;
   create table blackPaired as
   select a.*, b.black_name as black_name2, b.black_value as black_value2
   from black as a, black as b;
   create table whitePaired as
   select a.*, b.white_name as white_name2, b.white_value as white_value2
   from white as a, white as b;
quit;

data want;
   merge blackpaired whitepaired;
run;
/* or */
proc sql;
  create table want2 as
  select blackpaired.*,whitepaired.*
  from blackpaired, whitepaired;
quit;
Super Contributor
Posts: 371

Re: Combination condition from 2 files

The d=1. I am sorry for my coding skill. I created it only to serve the purpose of merging in the proc sql in the section "on a.d=b.d".

 

Super User
Posts: 10,552

Re: Combination condition from 2 files


hhchenfx wrote:

The d=1. I am sorry for my coding skill. I created it only to serve the purpose of merging in the proc sql in the section "on a.d=b.d".

 


When the variable d has the same value for all records in both sets then a simple join on has the same behavior as a cartesian join which can be accomplished with the

 

From dataset1, dataset2

Super Contributor
Posts: 371

Re: Combination condition from 2 files

Hi Ballardw,

The code produce some undesirable combination within blackpair table and whitepair table.

For example, blackpair table has (b1 b1) , [(b1,b2) and (b2,b1)].

I think if we can eliminate these duplicate, the merge later step will do the work.

 

I try this code below but it only help to eliminate the (b1,b1) type of combination. I cant eliminate the other combination  [(b1,b2) and (b2,b1)]

proc sql;
   create table blackPaired as
   select distinct a.*, b.black_name as b2_name, b.black_value as b2_value
   from black as a left join black as b on a.black_name	^=	b.black_name;
quit;

 

 

Super Contributor
Posts: 474

Re: Combination condition from 2 files

[ Edited ]

Hi.

 

Not very clear for me what are the expected results.

 

If you want to pick the first n, m rows of each datasets and arrange them in a single row, this macro might work for you:

 


%macro doit(IN_W,IN_B,B,W,OUT);
data &OUT;
     drop WHITE_: BLACK_:;
     set WHITE end=_EOFW; * pick from white;
     %do I=1 %to &W;
     if _N_ eq &I then do;
     retain W&I._NAME W&I._VALUE;
     W&I._NAME=WHITE_NAME;
     W&I._VALUE=WHITE_VALUE;
     end;
     %end;
     set BLACK end=_EOFB; * pick from black;
     %do J=1 %to &B;
     if _N_ eq &J then do;
     retain B&J._NAME B&J._VALUE;
     B&J._NAME=BLACK_NAME;
     B&J._VALUE=BLACK_VALUE;
     end;
     %end;
     if _EOFW and _EOFB then output; * check if last row;
run;
%mend doit;

%doit(WHITE,BLACK,2,2,WANT); * in_white, in_black, white, black, out;

Hope it helps.

 

Daniel Santos @ www.cgd.pt

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 10 replies
  • 229 views
  • 1 like
  • 4 in conversation