- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I have 500 columns of values that I need to linear interpolate with 50 different sets of 20,000 random draws. Are there any quick ways to do this? What I have below, shows mini-examples of both of the tables I have. Then it linear interpolates all 500 columns of values over only 1 set of random draws. I need to know a quick way of doing it over all 50 (give or take) sets of random draws.
/*500 (could be more/less) sets of values and their associated percentile*/
DATA Values;
INFILE DATALINES DELIMITER='|';
INPUT Percentile COL1 COL2 COL500;
DATALINES;
0.1|-100||-75|250
0.3|-20|-40|175
0.5|0|0|0
0.7|60|25|-140
0.9|150|60|-300
;RUN;
/*50 (could be more/less) sets of 20,000 (fixed) random draws*/
DATA RandomDraws;
INFILE DATALINES DELIMITER='|';
INPUT Scenario Percent1 Percent2 Percent50;
DATALINES;
1|0.06|0.54|0.22
2|0.74|0.85|0.11
3|0.98|0.08|0.41
4|0.24|0.86|0.52
20000|0.05|0.56|0.69
;RUN;
/*Stack only the first set of random draws with all 500 sets of values*/
DATA Stacked;
SET Values
RandomDraws (RENAME=(Percent1=Percentile) DROP=Percent2--Percent50 Scenario);
RUN;
/*Linear interpolate for the first set of random draws only -- but for all 500 sets of values*/
PROC TRANSREG DATA=Stacked;
MODEL IDENTITY(COL:) = SPLINE(Percentile / DEGREE=1 NKNOTS=3);
OUTPUT
OUT=Random_Set_1
PREDICTED;
RUN;
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
BY variable isn't restricted to one variable. Add all the data at once and then run the regressions with the two variables at once, which is 500*50=25,000 regressions, but whatever, the computer handles it.
http://www2.sas.com/proceedings/forum2007/183-2007.pdf
Otherwise you could write a macro to loop it 50 times. Both approaches are covered in the paper above.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I'm not exactly following, but I would suggest transposing your data so you're dealing with a long format instead and then use a BY statement in the PROC TRANSREG to have the regression done by each draw/variable.
similar to the concept here:
If I'm misinterpreting your question, feel free to disregard this message.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I thought about that (used that exact method last week for something else). The BY statement would do it for each of my 500 sets of values over only 1 of my sets of random draws. The BY statement would allow it to work for all 500 at once. Which is good, but not good enough. I need it to do all 500, 50 different times, all at once.
EDIT: Give me a few minutes and I'll post exactly what I have above using the BY statement so you can see it still gets me to the same place.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
BY variable isn't restricted to one variable. Add all the data at once and then run the regressions with the two variables at once, which is 500*50=25,000 regressions, but whatever, the computer handles it.
http://www2.sas.com/proceedings/forum2007/183-2007.pdf
Otherwise you could write a macro to loop it 50 times. Both approaches are covered in the paper above.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I didn't realize this could work for more than one variable - thank you!