topic Re: PROC EXPAND - w/ duplicates in SAS Programming

PROC EXPAND - w/ duplicates

SASaholic629 — Mon, 13 Aug 2018 21:10:45 GMT

I need to do some simple linear interpolation (also extrapolate beyond the endpoints). I am using a simple PROC EXPAND. The problem occurs when it needs to interpolate for duplicate values. See the code below for an example.

DATA Have;
INPUT Grp $ X Y;
DATALINES;
A	1	.
A	2	10
A	3	.
A	3	.
A	4	.
A	5	25
B	1	.
B	2	20
B	3	.
B	4	.
B	5	50
B	6	.
;RUN;

PROC EXPAND
DATA=Have
OUT=Want
METHOD=JOIN
EXTRAPOLATE;
ID X;
BY Grp;
RUN;

Since there are two X-values of 3 in group A, it gives me the following error and removes the entire group A from the output.

ERROR: Observation with duplicate ID value found. The value of the ID variable, X=3, at
       observation number 4 in data set WORK.HAVE is the same as the previous observation.

How can I interpolate (and extrapolate) while keeping the duplicate values? I can't dedup, because I will later need to calculate a 1-in-200 event based on ALL of my observations, including the duplicates

Re: PROC EXPAND - w/ duplicates

Reeza — Mon, 13 Aug 2018 21:34:09 GMT

Can you add a small decimal amount to get the values? I believe SAS will interpret them as equally spaced though.

DATA Have;
INPUT Grp $ X Y;
DATALINES;
A	1	.
A	2	10
A	3	.
A	3	.
A	4	.
A	5	25
B	1	.
B	2	20
B	3	.
B	4	.
B	5	50
B	6	.
;RUN;

data have;
set have;
by grp X;
if first.x then counter=0;
counter+1;
if not (first.x and last.x) then do;
x2 = x + counter/10;
end;
else x2=x;
run;


PROC EXPAND
DATA=Have
OUT=Want
METHOD=JOIN
EXTRAPOLATE;
ID X2;
BY Grp;
RUN;

@SASaholic629 wrote:

I need to do some simple linear interpolation (also extrapolate beyond the endpoints). I am using a simple PROC EXPAND. The problem occurs when it needs to interpolate for duplicate values. See the code below for an example.
DATA Have;
INPUT Grp $ X Y;
DATALINES;
A	1	.
A	2	10
A	3	.
A	3	.
A	4	.
A	5	25
B	1	.
B	2	20
B	3	.
B	4	.
B	5	50
B	6	.
;RUN;

PROC EXPAND
DATA=Have
OUT=Want
METHOD=JOIN
EXTRAPOLATE;
ID X;
BY Grp;
RUN;
Since there are two X-values of 3 in group A, it gives me the following error and removes the entire group A from the output.
ERROR: Observation with duplicate ID value found. The value of the ID variable, X=3, at
       observation number 4 in data set WORK.HAVE is the same as the previous observation.
How can I interpolate (and extrapolate) while keeping the duplicate values? I can't dedup, because I will later need to calculate a 1-in-200 event based on ALL of my observations, including the duplicates

Re: PROC EXPAND - w/ duplicates

data_null__ — Mon, 13 Aug 2018 22:30:33 GMT

Would it work if you create a FREQ variable for the duplicates. I don't have PROC EXPAND so I can't test but you can.

proc summary data=have nway missing;
   class grp x;
   output out=haveN(drop=_type_) idgroup(out(y)=);
   run;

Re: PROC EXPAND - w/ duplicates

PGStats — Mon, 13 Aug 2018 22:42:36 GMT

Use a freq variable, like this:

DATA Have;
INPUT Grp $ X Y;
DATALINES;
A	1	.
A	2	10
A	3	5
A	3	.
A	4	.
A	5	25
B	1	.
B	2	20
B	3	.
B	4	.
B	5	50
B	6	.
;

data want0;
do freq = 1 by 1 until(last.X);
    set have; by grp x;
    yy = coalesce(yy, y);
    end;
drop y;
run;

PROC EXPAND
DATA=Want0
OUT=Want1
METHOD=JOIN
EXTRAPOLATE;
ID X;
BY Grp;
RUN;

data want;
set want1;
y = yy;
do i = 1 to freq;
    output;
    end;
drop i yy freq;
run;