Need a help on Linear Interpolation

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 15
Accepted Solution

Need a help on Linear Interpolation

Dear who can help this out;

I have state level socioeconomic data set from census, income, race ratio, marriage ratio, education level etc. It's in each 10 year period (1980, 1990, 2000). I want to linearly interpolate each variable values for the years in between (values for 1981~1989, 1991~1999).

Data structure now looks like this

State     year   income race ratio marriage ratio education level

Alabama  1980 10000     0.40          0.52              0.34

Alabama  1990 12000     0.44          0.50              0.32

Alabama  2000 13000     0.48          0.58              0.39

California 1980 15000     0.20          0.63              0.42

California 1990 16000     0.25          0.66              0.43

California 2000 16500     0.30          0.69              0.47

Any idea on this particular procedure with this set of data would be appreciated!

Thank you,

MB


Accepted Solutions
Solution
‎01-06-2014 05:12 PM
Respected Advisor
Posts: 4,640

Re: Need a help on Linear Interpolation

You could use this code which is a bit more general:


data have2;
length state $16;
input State     year   income race_ratio marriage_ratio education_level;
datalines;
Alabama  1980 10000     0.40          0.52              0.34
Alabama  1990 12000     0.44          0.50              0.32
Alabama  2000 13000     0.48          0.58              0.39
California 1980 15000     0.20          0.63              0.42
California 1990 16000     0.25          0.66              0.43
California 2000 16500     0.30          0.69              0.47
;

data want2;
set have2; by state notsorted;
array prev{5} _temporary_; array actual{5} _temporary_;
array val{5} year income race_ratio marriage_ratio education_level;
do i = 1 to dim(val);
     actual{i} = val{i};
     prev{i} = lag(val{i});
     end;
if first.state then output;
else do;
     do year = prev{1}+1 to actual{1};
          do i = 2 to dim(val);
               val{i} = ((actual{1}-year)*prev{i} + (year-prev{1})*actual{i}) / (actual{1}-prev{1});
               end;
          output;
          end;
     end;
drop i;
run;

proc print data=want2 noobs; run;

PG

PG

View solution in original post


All Replies
Super Contributor
Posts: 644

Re: Need a help on Linear Interpolation

Proc transpose data = want out = step1

          prefix = 'yr_'

          ;

     By     State ;

     id     Year ;

     Var _Numeric_ ;

run ;


/*     you may get a note that year is both an id and analysis variable */


Data step2 ;

     Array annual {*}  yr_1980 - yr_2000 ;

     set step1 ;

     Do k = 1 to 9 ;

          annual (k + 1) = (k * yr_1980 + (10 - k) * yr_1990) / 9 ;

          annual (k + 11) = (k * yr_1990 + (10 - k) * yr_2000) / 9 ;

     end ;

     Drop k ;

Run ;


Proc Transpose data = step2

               out = want (drop = _name_)

               ;

     By      State ;

Run ;

Alert! untested code.  You might have to have another datastep to strip year out of _Name_ in want (in which case do not drop it in the transpose!

Richard in NZ again

Message was edited by: Richard Carson Changed divisor to 9 from 10. 

Occasional Contributor
Posts: 15

Re: Need a help on Linear Interpolation

Thank you Richard for your detailed explanation.

My question is that it seems like the order of k and (10-k) in the following equation matter:

          annual (k + 1) = (k * yr_1980 + (10 - k) * yr_1990) / 9 ;

          annual (k + 11) = (k * yr_1990 + (10 - k) * yr_2000) / 9 ;

For instance, if the variables have increased over the decades, the above formula gives me a decreasing patter in the values for the variable.

If age is 30 at 1980 and 34 in 1990, then it's rational to see values for 1981-1985 to be smaller than values for 1986-1989. But the above code gives me the opposite. Any thoughts would be helpful!

Thank you!

Markov


Solution
‎01-06-2014 05:12 PM
Respected Advisor
Posts: 4,640

Re: Need a help on Linear Interpolation

You could use this code which is a bit more general:


data have2;
length state $16;
input State     year   income race_ratio marriage_ratio education_level;
datalines;
Alabama  1980 10000     0.40          0.52              0.34
Alabama  1990 12000     0.44          0.50              0.32
Alabama  2000 13000     0.48          0.58              0.39
California 1980 15000     0.20          0.63              0.42
California 1990 16000     0.25          0.66              0.43
California 2000 16500     0.30          0.69              0.47
;

data want2;
set have2; by state notsorted;
array prev{5} _temporary_; array actual{5} _temporary_;
array val{5} year income race_ratio marriage_ratio education_level;
do i = 1 to dim(val);
     actual{i} = val{i};
     prev{i} = lag(val{i});
     end;
if first.state then output;
else do;
     do year = prev{1}+1 to actual{1};
          do i = 2 to dim(val);
               val{i} = ((actual{1}-year)*prev{i} + (year-prev{1})*actual{i}) / (actual{1}-prev{1});
               end;
          output;
          end;
     end;
drop i;
run;

proc print data=want2 noobs; run;

PG

PG
Occasional Contributor
Posts: 15

Re: Need a help on Linear Interpolation

Thank you very much PG!

Super Contributor
Posts: 644

Re: Need a help on Linear Interpolation

@Markov

Obviously if you have access to Proc Expand that is the best toool to use.

OK, I did warn you that mycode was untested!  Only 3 errors:  Prefix did not do what I wanted, multipliers in the wrong order and the divisor should have been 10 as I had entered it at first.  Here is the tested (v9.3) code:

options validvarname=v7 ;
Proc transpose data = have out = step1
          ;
     By     State ;
     id     Year ;
     Var _Numeric_ ;
run ;

Data step2 ;
     Array annual {*}  _1980 - _2000 ;
     set step1 ;
     Do k = 1 to 9 ;
          annual (k + 1) = ((10 - k) * _1980 + k * _1990) / 10 ;
          annual (k + 11) = ((10 - k) * _1990 + k * _2000) / 10 ;
     end ;
     Drop k ;
Run ;

Proc Transpose data = step2
               out = want (drop = _name_)
               ;
     By      State ;
Run ;

Richard in NZ

Occasional Contributor
Posts: 15

Re: Need a help on Linear Interpolation

Thank you very much Richard! Your code is very helpful in showing me the logic behind the procedure.

Best,

Markov

Respected Advisor
Posts: 3,777

Re: Need a help on Linear Interpolation

If you have PROC EXPAND it will expand your decade data into years and allow you to compute various types of interpolated values.  I would try to make example but I don't have SAS/ETS.

Respected Advisor
Posts: 4,640

Re: Need a help on Linear Interpolation

You can implement DN's suggestion as follows:

data have;
length state $16;
input State     year   income race_ratio marriage_ratio education_level;
yearD = mdy(1, 1, year);
drop year;
datalines;
Alabama  1980 10000     0.40          0.52              0.34
Alabama  1990 12000     0.44          0.50              0.32
Alabama  2000 13000     0.48          0.58              0.39
California 1980 15000     0.20          0.63              0.42
California 1990 16000     0.25          0.66              0.43
California 2000 16500     0.30          0.69              0.47
;

proc expand data=have out=want from=year10 to=year method=join;
by state notsorted;
id yearD;
run;

proc print data=want noobs; run;

PG

PG
Occasional Contributor
Posts: 15

Re: Need a help on Linear Interpolation

Thank you PG for your explanation.

My few questions on your code would be:

1) is there a particular reason you assigned a new year variable (yearD) and drop year variable?

2)When you specify "from" and "to", what is year10 for? is it a variable that I was supposed to create prior to run proc expand? Also, since we dropped "year" variable in the previous step, is it "yearD" that we need to use for "to"? I am trying to understand your code and it would be great if you can elaborate on this.

Thank you!

Markov

Respected Advisor
Posts: 3,777

Re: Need a help on Linear Interpolation

I think it would be good for you to look at the documentation for PROC EXPAND.

Occasional Contributor
Posts: 15

Re: Need a help on Linear Interpolation

Thanks DN, I am looking into the documents for proc expand for linear interpolation Smiley Happy

Respected Advisor
Posts: 4,640

Re: Need a help on Linear Interpolation

from=year10 means from 10 year intervals,  to=year means convert to 1 year intervals. yearD are proper SAS dates, which is what is expected by proc Expand. You can see the effect of using SAS dates in the interpolation which appears slightly irregular. These small irregularities are the consequence of leap years.

PG

PG
Super User
Posts: 10,466

Re: Need a help on Linear Interpolation

Since the values are most likely rounded, I would be very hesitant about keeping additional digits implying more precision than the data might actually allow.

Super User
Posts: 17,748

Re: Need a help on Linear Interpolation

Can you explain why you're doing this?

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 18 replies
  • 1047 views
  • 7 likes
  • 7 in conversation