BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
elbarto
Obsidian | Level 7

I have data that looks like the following:

 

DATA have;
input unit year variable;
DATALINES;
1	2000	0.12
1	2001	0.12
1	2002	0.98
1	2003	0.89
1	2004	0.59
1	2005	0.70
1	2006	0.17
1	2007	0.71
1	2008	0.70
1	2009	0.85
1	2010	0.86
1	2011	0.25
1	2012	0.60
1	2013	0.11
1	2014	0.39
1	2015	0.28
1	2016	0.04
1	2017	0.18
2	1978	0.97
2	1979	0.70
2	1981	0.30
2	1983	0.15
2	1984	0.23
2	1986	0.02
2	1987	0.90
2	1988	0.22
2	1989	0.70
2	1990	0.13
2	1991	0.95
2	1992	0.23
2	1994	0.57
2	1995	0.83
2	1996	0.89
2	1998	0.39
3	2010	0.33
3	2011	0.14
3	2014	0.09
;
RUN;

I want to create the following dataset with four variables: fwd_sum1 fwd_sum2 back_sum1 back_sum2 as follows:

 

 

DATA want;
input unit year variable fwd_sum1 fwd_sum2 back_sum1 back_sum2;
DATALINES;
1	2000	0.12	6.43	6.43	.	0
1	2001	0.12	5.84	5.84	.	0
1	2002	0.98	5.14	5.14	.	0
1	2003	0.89	4.97	4.97	.	0
1	2004	0.59	4.26	4.26	0.12	0.12
1	2005	0.70	3.56	3.56	0.24	0.24
1	2006	0.17	2.71	2.71	1.22	1.22
1	2007	0.71	1.85	1.85	2.11	2.11
1	2008	0.70	1.60	1.60	2.70	2.70
1	2009	0.85	1.00	1.00	3.40	3.40
1	2010	0.86	0.89	0.89	3.57	3.57
1	2011	0.25	0.50	0.50	4.28	4.28
1	2012	0.60	0.22	0.22	4.98	4.98
1	2013	0.11	0.18	0.18	5.83	5.83
1	2014	0.39	.	0	6.69	6.69
1	2015	0.28	.	0	6.94	6.94
1	2016	0.04	.	0	7.54	7.54
1	2017	0.18	.	0	7.65	7.65
2	1978	0.97	6.21	6.21	.	0
2	1979	0.70	6.21	6.21	.	0
2	1981	0.30	5.83	5.83	.	0
2	1983	0.15	5.81	5.81	1.67	1.67
2	1984	0.23	4.91	4.91	1.67	1.67
2	1986	0.02	3.99	3.99	1.97	1.97
2	1987	0.90	3.86	3.86	2.12	2.12
2	1988	0.22	2.91	2.91	2.35	2.35
2	1989	0.70	2.68	2.68	2.35	2.35
2	1990	0.13	2.68	2.68	2.37	2.37
2	1991	0.95	2.11	2.11	3.27	3.27
2	1992	0.23	1.28	1.28	3.49	3.49
2	1994	0.57	0.39	0.39	4.32	4.32
2	1995	0.83	.	0	5.27	5.27
2	1996	0.89	.	0	5.50	5.5
2	1998	0.39	.	0	6.07	6.07
3	2010	0.33	0.09	0.09	.	0
3	2011	0.14	.	0	.	0
3	2014	0.09	.	0	0.33	0.33
;
RUN;

 

fwd_sum1 is a "forward" sum defined as follows: For each year of a given unit, fwd_sum1 is obtained by the sum of the variable values beginning from four years into the future until the end of the sample period for that unit. For example, for unit 1, year 2000, fwd_sum1 is obtained by adding all values of variable from 2004 (four years from 2000) until 2017 (end of sample period for unit 1). However, note that when year=2014, there is no 2018 for this unit, so fwd_sum1 takes on a missing value. 

 

However, note that the panel of data is not balanced, so some years may be missing for some units, for these missing years, the value of variable is assumed to be missing. For example, for unit 2, year=1978. Four years into the future is 1982, but this year is missing for this unit, so the 1982 value of variable is missing.

 

fwd_sum2 is defined almost the same as fwd_sum1 but if the entire sum given by fwd_sum1 is missing, then set the value to 0.

 

back_sum1 is a "backward" sum defined as follows: For each year of a given unit, back_sum1 is obtained by the sum of the variable values beginning from four years into the past until the start of the sample period for that unit. For example, for unit=1, year=2017, back_sum1 is obtained by adding all values of variable from 2013 (four years in the past) until 2000 (start of sample period for unit 1). Again, note that when year=2003, there is no 1999 for this unit, so back_sum1 is just missing.

 

back_sum2 is defined almost the same as back_sum1 except if the entire sum given by back_sum1 is missing, then set the value to 0.

 

Finally, the above example is given for four years in the future (for the fwd sums) and four years in the past (for the back sums). Ideally, I would like a more general result, i.e., N years in the future (for the forward sum) and M years in the past (for the back sums), where I can toggle N and M by myself.

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

Use a double loop, and an array:

%let fwd = 4;
%let bck =4;

data want;
array years {1900:2099} _temporary_;
call missing(of years{*});
do until (last.unit);
  set have;
  by unit;
  if first.unit then start = year;
  years{year} = variable;
end;
end = year;
do until (last.unit);
  set have;
  by unit;
  do i = year + &fwd. to end;
    fwd_sum1 = fwd_sum1 + years{i};
  end;
  fwd_sum2 = sum(fdw_sum1,0);
  do i = start to year - &bck.;
    back_sum1 = back_sum1 + years{i};
  end;
  back_sum2 = sum(back_sum1,0);
  output;
end;
drop start end i;
run;

Untested, posted from my tablet.

View solution in original post

3 REPLIES 3
Kurt_Bremser
Super User

Use a double loop, and an array:

%let fwd = 4;
%let bck =4;

data want;
array years {1900:2099} _temporary_;
call missing(of years{*});
do until (last.unit);
  set have;
  by unit;
  if first.unit then start = year;
  years{year} = variable;
end;
end = year;
do until (last.unit);
  set have;
  by unit;
  do i = year + &fwd. to end;
    fwd_sum1 = fwd_sum1 + years{i};
  end;
  fwd_sum2 = sum(fdw_sum1,0);
  do i = start to year - &bck.;
    back_sum1 = back_sum1 + years{i};
  end;
  back_sum2 = sum(back_sum1,0);
  output;
end;
drop start end i;
run;

Untested, posted from my tablet.

elbarto
Obsidian | Level 7

Thanks Kurt, however when I tested the code, all the fwd_sum1 and back_sum1 are missing.

 

This is after correcting the typo in

fwd_sum2 = sum(fdw_sum1,0);

 

Any ideas why?

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 279 views
  • 1 like
  • 2 in conversation