BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ANDERRH
Calcite | Level 5

Hi,

I'm stuck creating the appropriate Datastep that will perform a calculation on all possible pair-wise combinations of a numeric variable that pertains to an ID variable, as stratified by the combination of two character variables. The number of the numeric variable values varies across the charcter variable combinations and all character variable values have unique nomenclature (as opposed to the general case below).

The general case:

VAR1     VAR2      ID     Y

A               A          A     y1

A               A          B     y2

A               A          C     y3

Desired output:

VAR1     VAR2     VAR3

A               A          y1+y2

A               A          y1+y3

A               A          y2+y3

Any help is much appreciated.

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

I don't understand how UNIQUEID comes into play.

Can you provide a more detailed example, if one of the above suggestions doesn't meet your requirements.

Also, an inefficient method is a cross join and then filter out the results you don't want.

View solution in original post

7 REPLIES 7
jwillis
Quartz | Level 8

Hi Anderrh,

Have you tried the lag function?  I am assuming that y1, y2, and y3 are numeric values.  I am also assuming that VAR1 and VAR2 are always the same and the there are always three rows of equal values in VAR1 and VAR2.  


proc sql; drop table work.have; quit;
data work.have;
input VAR1 $ VAR2 $ ID $ Y $;
datalines;
A A A y1
A A B y2
A A C y3
;
RUN;

PROC SORT DATA=HAVE OUT=HAVE2(DROP=ID);
BY VAR1 VAR2 Y;
RUN;

/** ASSUMING THAT Y IS A NUMERIC VALUE **/

DATA WANT;
    SET HAVE2;
BY VAR1 VAR2;
IF LAST.VAR1 AND LAST.VAR2 THEN DO;
        VAR3 = Y + LAG1(Y);  /*(Y3 + Y2)*/
        OUTPUT;
        VAR3 = Y + LAG2(Y);  /*(Y3 + Y1)*/
        OUTPUT;
        VAR3 = LAG2(Y) + LAG3(Y);  /* Y2 + Y1 */
        OUTPUT;
END;
RETURN;
DROP Y;
END;

ANDERRH
Calcite | Level 5

Thanks for the reply.

Y is a numeric variable but, unfortunately, varies in terms of sample size among the various combinations of VAR1 and VAR2, which are not the same.

There are 87 unique values of VAR1, each of which contains anywhere from 2 to 50 unique values of VAR2, each of which contains anywhere from 3 to 50 unique values of ID, each of which contains a unique Y. I'm trying to permutate through all possible pair-wise combinations of Y values among the VAR1*VAR2 combinations.  

jwillis
Quartz | Level 8

Dear ANDERRH,

The code below should group all your var1 and var2 combinations.  The process of adding the group pairs is not that simple. The principle of my previous code is the same.  For group AudiSedan add 13Y + 12Y then 13Y + 11Y then ...... 12Y + 11Y then............2Y + 1Y;  (13+12+11+10....+1 loops for the group AudiSedan).


proc sql;
  create table var3 as
  select distinct
       strip(make)||strip(type) as group,
       a.make as var1,
       a.type as var2,
       a.invoice as y format=8.,
       count(calculated group) as groupcount
from sashelp.cars as a
group by 1
order by 1,4
;
quit;

                                                          var3                            13:35 Monday, May 19, 2014   2

                            Obs       group       var1     var2             y    groupcount

                              1    AcuraSUV       Acura    SUV          33337         1
                              2    AcuraSedan     Acura    Sedan        21761         5
                              3    AcuraSedan     Acura    Sedan        24647         5
                              4    AcuraSedan     Acura    Sedan        30299         5
                              5    AcuraSedan     Acura    Sedan        39014         5
                              6    AcuraSedan     Acura    Sedan        41100         5
                              7    AcuraSports    Acura    Sports       79978         1
                              8    AudiSedan      Audi     Sedan        23508        13
                              9    AudiSedan      Audi     Sedan        28846        13

Reeza
Super User

I don't understand how UNIQUEID comes into play.

Can you provide a more detailed example, if one of the above suggestions doesn't meet your requirements.

Also, an inefficient method is a cross join and then filter out the results you don't want.

ANDERRH
Calcite | Level 5

Thanks to all - I actually figured out the cross-join method using the entire cartesian product, filtering out those combinations across the VAR1 VAR2 strata. As noted, it's not the most efficient but it works. Thanks again.

Ksharp
Super User

Are you doing Survive Analysis ?

proc sort data=sashelp.class(keep= age weight) out=have; by age weight;run;
data want;
 array x{9999} _temporary_;
 n=0; call missing(of x{*});
 do until(last.age);
 set have;
  by age;
  n+1;x{n}=weight;
 end;
 do i=1 to n-1;
  do j=i+1 to n;
   sum=sum(x{i},x{j});
   output;
  end;
 end;
 keep age sum;
run;


Xia Keshan

jwillis
Quartz | Level 8

Dear Anderrh;

I used sashelp.cars to add the current value to a prior value cascading through all pair combinations.  I tested using 16 rows of the same VAR1 and VAR2 equalities.  I built an array to hold all the values associated to the VAR1 and VAR2 unique combinations; then I used do loops to add different array variable pairs.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 1269 views
  • 0 likes
  • 4 in conversation