I need to do some simple linear interpolation (also extrapolate beyond the endpoints). I am using a simple PROC EXPAND. The problem occurs when it needs to interpolate for duplicate values. See the code below for an example.
DATA Have;
INPUT Grp $ X Y;
DATALINES;
A 1 .
A 2 10
A 3 .
A 3 .
A 4 .
A 5 25
B 1 .
B 2 20
B 3 .
B 4 .
B 5 50
B 6 .
;RUN;
PROC EXPAND
DATA=Have
OUT=Want
METHOD=JOIN
EXTRAPOLATE;
ID X;
BY Grp;
RUN;
Since there are two X-values of 3 in group A, it gives me the following error and removes the entire group A from the output.
ERROR: Observation with duplicate ID value found. The value of the ID variable, X=3, at
observation number 4 in data set WORK.HAVE is the same as the previous observation.
How can I interpolate (and extrapolate) while keeping the duplicate values? I can't dedup, because I will later need to calculate a 1-in-200 event based on ALL of my observations, including the duplicates
Use a freq variable, like this:
DATA Have;
INPUT Grp $ X Y;
DATALINES;
A 1 .
A 2 10
A 3 5
A 3 .
A 4 .
A 5 25
B 1 .
B 2 20
B 3 .
B 4 .
B 5 50
B 6 .
;
data want0;
do freq = 1 by 1 until(last.X);
set have; by grp x;
yy = coalesce(yy, y);
end;
drop y;
run;
PROC EXPAND
DATA=Want0
OUT=Want1
METHOD=JOIN
EXTRAPOLATE;
ID X;
BY Grp;
RUN;
data want;
set want1;
y = yy;
do i = 1 to freq;
output;
end;
drop i yy freq;
run;
Can you add a small decimal amount to get the values? I believe SAS will interpret them as equally spaced though.
DATA Have; INPUT Grp $ X Y; DATALINES; A 1 . A 2 10 A 3 . A 3 . A 4 . A 5 25 B 1 . B 2 20 B 3 . B 4 . B 5 50 B 6 . ;RUN; data have; set have; by grp X; if first.x then counter=0; counter+1; if not (first.x and last.x) then do; x2 = x + counter/10; end; else x2=x; run; PROC EXPAND DATA=Have OUT=Want METHOD=JOIN EXTRAPOLATE; ID X2; BY Grp; RUN;
@SASaholic629 wrote:
I need to do some simple linear interpolation (also extrapolate beyond the endpoints). I am using a simple PROC EXPAND. The problem occurs when it needs to interpolate for duplicate values. See the code below for an example.
DATA Have; INPUT Grp $ X Y; DATALINES; A 1 . A 2 10 A 3 . A 3 . A 4 . A 5 25 B 1 . B 2 20 B 3 . B 4 . B 5 50 B 6 . ;RUN; PROC EXPAND DATA=Have OUT=Want METHOD=JOIN EXTRAPOLATE; ID X; BY Grp; RUN;
Since there are two X-values of 3 in group A, it gives me the following error and removes the entire group A from the output.
ERROR: Observation with duplicate ID value found. The value of the ID variable, X=3, at observation number 4 in data set WORK.HAVE is the same as the previous observation.
How can I interpolate (and extrapolate) while keeping the duplicate values? I can't dedup, because I will later need to calculate a 1-in-200 event based on ALL of my observations, including the duplicates
Would it work if you create a FREQ variable for the duplicates. I don't have PROC EXPAND so I can't test but you can.
proc summary data=have nway missing;
class grp x;
output out=haveN(drop=_type_) idgroup(out(y)=);
run;
Use a freq variable, like this:
DATA Have;
INPUT Grp $ X Y;
DATALINES;
A 1 .
A 2 10
A 3 5
A 3 .
A 4 .
A 5 25
B 1 .
B 2 20
B 3 .
B 4 .
B 5 50
B 6 .
;
data want0;
do freq = 1 by 1 until(last.X);
set have; by grp x;
yy = coalesce(yy, y);
end;
drop y;
run;
PROC EXPAND
DATA=Want0
OUT=Want1
METHOD=JOIN
EXTRAPOLATE;
ID X;
BY Grp;
RUN;
data want;
set want1;
y = yy;
do i = 1 to freq;
output;
end;
drop i yy freq;
run;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and save with the early bird rate—just $795!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.