I need to do some simple linear interpolation (also extrapolate beyond the endpoints). I am using a simple PROC EXPAND. The problem occurs when it needs to interpolate for duplicate values. See the code below for an example.
DATA Have;
INPUT Grp $ X Y;
DATALINES;
A 1 .
A 2 10
A 3 .
A 3 .
A 4 .
A 5 25
B 1 .
B 2 20
B 3 .
B 4 .
B 5 50
B 6 .
;RUN;
PROC EXPAND
DATA=Have
OUT=Want
METHOD=JOIN
EXTRAPOLATE;
ID X;
BY Grp;
RUN;
Since there are two X-values of 3 in group A, it gives me the following error and removes the entire group A from the output.
ERROR: Observation with duplicate ID value found. The value of the ID variable, X=3, at
observation number 4 in data set WORK.HAVE is the same as the previous observation.
How can I interpolate (and extrapolate) while keeping the duplicate values? I can't dedup, because I will later need to calculate a 1-in-200 event based on ALL of my observations, including the duplicates
Use a freq variable, like this:
DATA Have;
INPUT Grp $ X Y;
DATALINES;
A 1 .
A 2 10
A 3 5
A 3 .
A 4 .
A 5 25
B 1 .
B 2 20
B 3 .
B 4 .
B 5 50
B 6 .
;
data want0;
do freq = 1 by 1 until(last.X);
set have; by grp x;
yy = coalesce(yy, y);
end;
drop y;
run;
PROC EXPAND
DATA=Want0
OUT=Want1
METHOD=JOIN
EXTRAPOLATE;
ID X;
BY Grp;
RUN;
data want;
set want1;
y = yy;
do i = 1 to freq;
output;
end;
drop i yy freq;
run;
Can you add a small decimal amount to get the values? I believe SAS will interpret them as equally spaced though.
DATA Have; INPUT Grp $ X Y; DATALINES; A 1 . A 2 10 A 3 . A 3 . A 4 . A 5 25 B 1 . B 2 20 B 3 . B 4 . B 5 50 B 6 . ;RUN; data have; set have; by grp X; if first.x then counter=0; counter+1; if not (first.x and last.x) then do; x2 = x + counter/10; end; else x2=x; run; PROC EXPAND DATA=Have OUT=Want METHOD=JOIN EXTRAPOLATE; ID X2; BY Grp; RUN;
@SASaholic629 wrote:
I need to do some simple linear interpolation (also extrapolate beyond the endpoints). I am using a simple PROC EXPAND. The problem occurs when it needs to interpolate for duplicate values. See the code below for an example.
DATA Have; INPUT Grp $ X Y; DATALINES; A 1 . A 2 10 A 3 . A 3 . A 4 . A 5 25 B 1 . B 2 20 B 3 . B 4 . B 5 50 B 6 . ;RUN; PROC EXPAND DATA=Have OUT=Want METHOD=JOIN EXTRAPOLATE; ID X; BY Grp; RUN;
Since there are two X-values of 3 in group A, it gives me the following error and removes the entire group A from the output.
ERROR: Observation with duplicate ID value found. The value of the ID variable, X=3, at observation number 4 in data set WORK.HAVE is the same as the previous observation.
How can I interpolate (and extrapolate) while keeping the duplicate values? I can't dedup, because I will later need to calculate a 1-in-200 event based on ALL of my observations, including the duplicates
Would it work if you create a FREQ variable for the duplicates. I don't have PROC EXPAND so I can't test but you can.
proc summary data=have nway missing;
class grp x;
output out=haveN(drop=_type_) idgroup(out(y)=);
run;
Use a freq variable, like this:
DATA Have;
INPUT Grp $ X Y;
DATALINES;
A 1 .
A 2 10
A 3 5
A 3 .
A 4 .
A 5 25
B 1 .
B 2 20
B 3 .
B 4 .
B 5 50
B 6 .
;
data want0;
do freq = 1 by 1 until(last.X);
set have; by grp x;
yy = coalesce(yy, y);
end;
drop y;
run;
PROC EXPAND
DATA=Want0
OUT=Want1
METHOD=JOIN
EXTRAPOLATE;
ID X;
BY Grp;
RUN;
data want;
set want1;
y = yy;
do i = 1 to freq;
output;
end;
drop i yy freq;
run;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.