Hi,
I'm stuck creating the appropriate Datastep that will perform a calculation on all possible pair-wise combinations of a numeric variable that pertains to an ID variable, as stratified by the combination of two character variables. The number of the numeric variable values varies across the charcter variable combinations and all character variable values have unique nomenclature (as opposed to the general case below).
The general case:
VAR1 VAR2 ID Y
A A A y1
A A B y2
A A C y3
Desired output:
VAR1 VAR2 VAR3
A A y1+y2
A A y1+y3
A A y2+y3
Any help is much appreciated.
I don't understand how UNIQUEID comes into play.
Can you provide a more detailed example, if one of the above suggestions doesn't meet your requirements.
Also, an inefficient method is a cross join and then filter out the results you don't want.
Hi Anderrh,
Have you tried the lag function? I am assuming that y1, y2, and y3 are numeric values. I am also assuming that VAR1 and VAR2 are always the same and the there are always three rows of equal values in VAR1 and VAR2.
proc sql; drop table work.have; quit;
data work.have;
input VAR1 $ VAR2 $ ID $ Y $;
datalines;
A A A y1
A A B y2
A A C y3
;
RUN;
PROC SORT DATA=HAVE OUT=HAVE2(DROP=ID);
BY VAR1 VAR2 Y;
RUN;
/** ASSUMING THAT Y IS A NUMERIC VALUE **/
DATA WANT;
SET HAVE2;
BY VAR1 VAR2;
IF LAST.VAR1 AND LAST.VAR2 THEN DO;
VAR3 = Y + LAG1(Y); /*(Y3 + Y2)*/
OUTPUT;
VAR3 = Y + LAG2(Y); /*(Y3 + Y1)*/
OUTPUT;
VAR3 = LAG2(Y) + LAG3(Y); /* Y2 + Y1 */
OUTPUT;
END;
RETURN;
DROP Y;
END;
Thanks for the reply.
Y is a numeric variable but, unfortunately, varies in terms of sample size among the various combinations of VAR1 and VAR2, which are not the same.
There are 87 unique values of VAR1, each of which contains anywhere from 2 to 50 unique values of VAR2, each of which contains anywhere from 3 to 50 unique values of ID, each of which contains a unique Y. I'm trying to permutate through all possible pair-wise combinations of Y values among the VAR1*VAR2 combinations.
Dear ANDERRH,
The code below should group all your var1 and var2 combinations. The process of adding the group pairs is not that simple. The principle of my previous code is the same. For group AudiSedan add 13Y + 12Y then 13Y + 11Y then ...... 12Y + 11Y then............2Y + 1Y; (13+12+11+10....+1 loops for the group AudiSedan).
proc sql;
create table var3 as
select distinct
strip(make)||strip(type) as group,
a.make as var1,
a.type as var2,
a.invoice as y format=8.,
count(calculated group) as groupcount
from sashelp.cars as a
group by 1
order by 1,4
;
quit;
var3 13:35 Monday, May 19, 2014 2
Obs group var1 var2 y groupcount
1 AcuraSUV Acura SUV 33337 1
2 AcuraSedan Acura Sedan 21761 5
3 AcuraSedan Acura Sedan 24647 5
4 AcuraSedan Acura Sedan 30299 5
5 AcuraSedan Acura Sedan 39014 5
6 AcuraSedan Acura Sedan 41100 5
7 AcuraSports Acura Sports 79978 1
8 AudiSedan Audi Sedan 23508 13
9 AudiSedan Audi Sedan 28846 13
I don't understand how UNIQUEID comes into play.
Can you provide a more detailed example, if one of the above suggestions doesn't meet your requirements.
Also, an inefficient method is a cross join and then filter out the results you don't want.
Thanks to all - I actually figured out the cross-join method using the entire cartesian product, filtering out those combinations across the VAR1 VAR2 strata. As noted, it's not the most efficient but it works. Thanks again.
Are you doing Survive Analysis ?
proc sort data=sashelp.class(keep= age weight) out=have; by age weight;run; data want; array x{9999} _temporary_; n=0; call missing(of x{*}); do until(last.age); set have; by age; n+1;x{n}=weight; end; do i=1 to n-1; do j=i+1 to n; sum=sum(x{i},x{j}); output; end; end; keep age sum; run;
Xia Keshan
Dear Anderrh;
I used sashelp.cars to add the current value to a prior value cascading through all pair combinations. I tested using 16 rows of the same VAR1 and VAR2 equalities. I built an array to hold all the values associated to the VAR1 and VAR2 unique combinations; then I used do loops to add different array variable pairs.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.