I have 500 columns of values that I need to linear interpolate with 50 different sets of 20,000 random draws. Are there any quick ways to do this? What I have below, shows mini-examples of both of the tables I have. Then it linear interpolates all 500 columns of values over only 1 set of random draws. I need to know a quick way of doing it over all 50 (give or take) sets of random draws.
/*500 (could be more/less) sets of values and their associated percentile*/
DATA Values;
INFILE DATALINES DELIMITER='|';
INPUT Percentile COL1 COL2 COL500;
DATALINES;
0.1|-100||-75|250
0.3|-20|-40|175
0.5|0|0|0
0.7|60|25|-140
0.9|150|60|-300
;RUN;
/*50 (could be more/less) sets of 20,000 (fixed) random draws*/
DATA RandomDraws;
INFILE DATALINES DELIMITER='|';
INPUT Scenario Percent1 Percent2 Percent50;
DATALINES;
1|0.06|0.54|0.22
2|0.74|0.85|0.11
3|0.98|0.08|0.41
4|0.24|0.86|0.52
20000|0.05|0.56|0.69
;RUN;
/*Stack only the first set of random draws with all 500 sets of values*/
DATA Stacked;
SET Values
RandomDraws (RENAME=(Percent1=Percentile) DROP=Percent2--Percent50 Scenario);
RUN;
/*Linear interpolate for the first set of random draws only -- but for all 500 sets of values*/
PROC TRANSREG DATA=Stacked;
MODEL IDENTITY(COL:) = SPLINE(Percentile / DEGREE=1 NKNOTS=3);
OUTPUT
OUT=Random_Set_1
PREDICTED;
RUN;
BY variable isn't restricted to one variable. Add all the data at once and then run the regressions with the two variables at once, which is 500*50=25,000 regressions, but whatever, the computer handles it.
http://www2.sas.com/proceedings/forum2007/183-2007.pdf
Otherwise you could write a macro to loop it 50 times. Both approaches are covered in the paper above.
I'm not exactly following, but I would suggest transposing your data so you're dealing with a long format instead and then use a BY statement in the PROC TRANSREG to have the regression done by each draw/variable.
similar to the concept here:
If I'm misinterpreting your question, feel free to disregard this message.
I thought about that (used that exact method last week for something else). The BY statement would do it for each of my 500 sets of values over only 1 of my sets of random draws. The BY statement would allow it to work for all 500 at once. Which is good, but not good enough. I need it to do all 500, 50 different times, all at once.
EDIT: Give me a few minutes and I'll post exactly what I have above using the BY statement so you can see it still gets me to the same place.
BY variable isn't restricted to one variable. Add all the data at once and then run the regressions with the two variables at once, which is 500*50=25,000 regressions, but whatever, the computer handles it.
http://www2.sas.com/proceedings/forum2007/183-2007.pdf
Otherwise you could write a macro to loop it 50 times. Both approaches are covered in the paper above.
I didn't realize this could work for more than one variable - thank you!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.