BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
vickyCh
Obsidian | Level 7

In the time series of gdp for several countries (part of the data is attached), we want to know the continuing performance of gdp of each country.

1) We want to identify the date (maybe many observations would satisfy the requirement in the series) that was better than the maximum gdp value of the past 6 months.

2) The average gdp growth rate in the following 6 months after we identify the date above.

3) We'd like to start the procedure again to identify another specified date and another average gdp growth rate till the end of the data.

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

If you have SAS/ETS then this proc expand code does what I believe you want:

 

data have1;
  input date :yymmdd10. gdp country;
  format date yymmdds10.;
  if country=257 then gdp=gdp+100;
cards;
19800131 28 222
19800229 29.375 222
19800331 27.125 222
19800430 24.25 222
19800530 26.75 222
19800630 30.75 222
19800731 37.75 222
19800829 40.75 222
19800930 43.375 222
19801031 51.5 222
19801128 56 222
19801231 24.5 222
19810130 22.25 222
19810227 20 222
19810331 22.75 222
19810430 20.5 222
19810529 19.5 222
19810630 18 222
19810731 18.5 222
19810831 17.5 222
19810930 14.25 222
19811030 14.375 222
19811130 15.75 222
19811231 14.625 222
19820129 14.625 222
19820226 12.875 222
19820331 12.625 222
19820430 12.625 222
19820528 12 222
19820630 12.125 222
19820730 10.875 222
19820831 11.375 222
19820930 11.75 222
19821029 14.25 222
19821130 13.75 222
19821231 13 222
19830131 14.25 222
19830228 13.5 222
19830331 14 222
19830429 14.625 222
19830531 14.5 222
19830630 15.75 222
19830729 15.375 222
19830831 14.625 222
19830930 16.625 222
19831031 15 222
19831130 15 222
19831230 15.5 222
19840131 15.375 222
19840229 14 222
19840330 13.625 222
19840430 12.125 222
19840531 11.625 222
19840629 11.5 222
19840731 10.125 222
19840831 11.875 222
19840928 12.125 222
19841031 12.25 222
19841130 11.5 222
19841231 11.125 222
19850131 13.5 222
19850228 14.875 222
19850329 14.25 222
19850430 13.625 222
19850531 13.25 222
19850628 13.375 222
19850731 13.75 222
19850830 12.75 222
19850930 12.25 222
19851031 11.25 222
19851129 12 222
19851231 12.75 222
19860131 13.125 222
19860228 14 222
19860331 14.5 222
19860430 13.25 222
19860530 13.875 222
19860630 13.5 222
19860731 11.875 222
19860829 12.25 222
19860930 11.25 222
19861031 13.25 222
19861128 13 222
19861231 13.125 222
19870130 17 222
19870227 17 222
19870331 16 222
19870430 15.25 222
19870529 19.5 222
19870630 20.25 222
19870731 22 222
19870831 22 222
19870930 21.625 222
19871030 13.125 222
19871130 10.875 222
19871231 13.75 222
19880129 14.5 222
19880229 14.75 222
19880331 13.125 222
19880429 12 222
19880531 13.375 222
19880630 16.625 222
19880729 15.5 222
19880831 15.875 222
19880930 15.25 222
19881031 14.375 222
19881130 13.375 222
19881230 14.5 222
19890131 16.125 222
19890228 16.625 222
19890331 17 222
19890428 18.25 222
19890531 20 222
19890630 18.625 222
19890731 21 222
19890831 21.25 222
19890929 20 222
19891031 17.375 222
19891130 19.875 222
19891229 20.75 222
19800131 25.75 257
19800229 25.625 257
19800331 20.375 257
19800430 20.5 257
19800530 20.75 257
19800630 22 257
19800731 21.625 257
19800829 22.875 257
19800930 24.375 257
19801031 24.5 257
19801128 24.5 257
19801231 22.875 257
19810130 23.75 257
19810227 24.875 257
19810331 28.5 257
19810430 27.75 257
19810529 27.5 257
19810630 27.125 257
19810731 29.5 257
19810831 26.875 257
19810930 27.5 257
19811030 26.75 257
19811130 25.75 257
19811231 23.25 257
19820129 23.5 257
19820226 22.375 257
19820331 22.625 257
19820430 22.125 257
19820528 23.125 257
19820630 22.75 257
19820730 20.5 257
19820831 23.5 257
19820930 24.25 257
19821029 27.75 257
19821130 31 257
19821231 30 257
19830131 31.25 257
19830228 30.875 257
19830331 33.75 257
19830429 32.875 257
19830531 34.375 257
19830630 36.25 257
;
run;

proc expand data=have1 
            out=want (where=((gdp>=max_gdp_hist6) and (centered_window_range=13))) 
            method=none;
  by country;
  id date;
  convert gdp=max_gdp_hist6         / transformin=(lag 1) transformout=(nomiss movmax 6);
  convert gdp=gmean_future_rate_6   / transformin=(ratio 1 log cmovave (0 0 0 0 0 0  0  1 1 1 1 1 1) exp -1);
  convert gdp=centered_window_range / transformin=(*0 +1 cmovsum 13)  ;
 run; 

Although proc expand does not offer forward-looking windows, you can get a centered window, which includes both lags and leads.  I.e. "cmovave 13" gets an average of the preceding 6, current obs, and next 6.  Since PROC EXPAND offers weighting, I use the weights "0 0 0 0 0 0  0  1 1 1 1 1 1" to modify CMOVAVE to get average of the next six observations.  As a result, the statement

 

 

  convert gdp=gmean_future_rate_6  / transformin= (ratio 1   log cmovave (0 0 0 0 0 0  0  1 1 1 1 1 1) exp -1;

 

does the following for each observation:

  1. gets the ratio of current vs single lag of gdp
  2. gets its natural log
  3. gets the average of that log for 13 centered observations, with all the weight on future obs
  4. exponeniates that average
  5. substracts one

which gives the simple monthly growth rate which, when compounded over the next 6 obs, would yield the compounded rate using the actual gdp data.  The program also generates the arithmetic average of growth.

 

Also you don't want short historical or future windows, so there is a where   "centered_range=13" filter.  I.e. the only eligible observations will have 6 preceding and 6 following observations within the country series.

  

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

8 REPLIES 8
Kurt_Bremser
Super User

Please post example data in a data step with datalines. Spreadsheets do not have fixed column attributes and can not convey important information about your real dataset.

vickyCh
Obsidian | Level 7

Part of the monthly data is shown as follows.

 

data have1;
input date :yymmdd10. gdp country;
format date yymmdds10.;
cards;
19800131 28 222
19800229 29.375 222
19800331 27.125 222
19800430 24.25 222
19800530 26.75 222
19800630 30.75 222
19800731 37.75 222
19800829 40.75 222
19800930 43.375 222
19801031 51.5 222
19801128 56 222
19801231 24.5 222
19810130 22.25 222
19810227 20 222
19810331 22.75 222
19810430 20.5 222
19810529 19.5 222
19810630 18 222
19810731 18.5 222
19810831 17.5 222
19810930 14.25 222
19811030 14.375 222
19811130 15.75 222
19811231 14.625 222
19820129 14.625 222
19820226 12.875 222
19820331 12.625 222
19820430 12.625 222
19820528 12 222
19820630 12.125 222
19820730 10.875 222
19820831 11.375 222
19820930 11.75 222
19821029 14.25 222
19821130 13.75 222
19821231 13 222
19830131 14.25 222
19830228 13.5 222
19830331 14 222
19830429 14.625 222
19830531 14.5 222
19830630 15.75 222
19830729 15.375 222
19830831 14.625 222
19830930 16.625 222
19831031 15 222
19831130 15 222
19831230 15.5 222
19840131 15.375 222
19840229 14 222
19840330 13.625 222
19840430 12.125 222
19840531 11.625 222
19840629 11.5 222
19840731 10.125 222
19840831 11.875 222
19840928 12.125 222
19841031 12.25 222
19841130 11.5 222
19841231 11.125 222
19850131 13.5 222
19850228 14.875 222
19850329 14.25 222
19850430 13.625 222
19850531 13.25 222
19850628 13.375 222
19850731 13.75 222
19850830 12.75 222
19850930 12.25 222
19851031 11.25 222
19851129 12 222
19851231 12.75 222
19860131 13.125 222
19860228 14 222
19860331 14.5 222
19860430 13.25 222
19860530 13.875 222
19860630 13.5 222
19860731 11.875 222
19860829 12.25 222
19860930 11.25 222
19861031 13.25 222
19861128 13 222
19861231 13.125 222
19870130 17 222
19870227 17 222
19870331 16 222
19870430 15.25 222
19870529 19.5 222
19870630 20.25 222
19870731 22 222
19870831 22 222
19870930 21.625 222
19871030 13.125 222
19871130 10.875 222
19871231 13.75 222
19880129 14.5 222
19880229 14.75 222
19880331 13.125 222
19880429 12 222
19880531 13.375 222
19880630 16.625 222
19880729 15.5 222
19880831 15.875 222
19880930 15.25 222
19881031 14.375 222
19881130 13.375 222
19881230 14.5 222
19890131 16.125 222
19890228 16.625 222
19890331 17 222
19890428 18.25 222
19890531 20 222
19890630 18.625 222
19890731 21 222
19890831 21.25 222
19890929 20 222
19891031 17.375 222
19891130 19.875 222
19891229 20.75 222
19800131 25.75 257
19800229 25.625 257
19800331 20.375 257
19800430 20.5 257
19800530 20.75 257
19800630 22 257
19800731 21.625 257
19800829 22.875 257
19800930 24.375 257
19801031 24.5 257
19801128 24.5 257
19801231 22.875 257
19810130 23.75 257
19810227 24.875 257
19810331 28.5 257
19810430 27.75 257
19810529 27.5 257
19810630 27.125 257
19810731 29.5 257
19810831 26.875 257
19810930 27.5 257
19811030 26.75 257
19811130 25.75 257
19811231 23.25 257
19820129 23.5 257
19820226 22.375 257
19820331 22.625 257
19820430 22.125 257
19820528 23.125 257
19820630 22.75 257
19820730 20.5 257
19820831 23.5 257
19820930 24.25 257
19821029 27.75 257
19821130 31 257
19821231 30 257
19830131 31.25 257
19830228 30.875 257
19830331 33.75 257
19830429 32.875 257
19830531 34.375 257
19830630 36.25 257
;
run;

PeterClemmensen
Tourmaline | Level 20

For your question 1: You can identify the moving maximum for gdp for the last 6 month by country with PROC EXPAND like this

 

proc expand data=have1 out=have2;
   by country;
   id date;
   convert gdp=lastsixmonthmax / transformout=(movmax 6);
run;

For question 2: How is the growth rate calculated?

PeterClemmensen
Tourmaline | Level 20

This sounds like a job for PROC EXPAND 🙂

 

Do you have SAS/ETS?

mkeintz
PROC Star

If you have SAS/ETS then this proc expand code does what I believe you want:

 

data have1;
  input date :yymmdd10. gdp country;
  format date yymmdds10.;
  if country=257 then gdp=gdp+100;
cards;
19800131 28 222
19800229 29.375 222
19800331 27.125 222
19800430 24.25 222
19800530 26.75 222
19800630 30.75 222
19800731 37.75 222
19800829 40.75 222
19800930 43.375 222
19801031 51.5 222
19801128 56 222
19801231 24.5 222
19810130 22.25 222
19810227 20 222
19810331 22.75 222
19810430 20.5 222
19810529 19.5 222
19810630 18 222
19810731 18.5 222
19810831 17.5 222
19810930 14.25 222
19811030 14.375 222
19811130 15.75 222
19811231 14.625 222
19820129 14.625 222
19820226 12.875 222
19820331 12.625 222
19820430 12.625 222
19820528 12 222
19820630 12.125 222
19820730 10.875 222
19820831 11.375 222
19820930 11.75 222
19821029 14.25 222
19821130 13.75 222
19821231 13 222
19830131 14.25 222
19830228 13.5 222
19830331 14 222
19830429 14.625 222
19830531 14.5 222
19830630 15.75 222
19830729 15.375 222
19830831 14.625 222
19830930 16.625 222
19831031 15 222
19831130 15 222
19831230 15.5 222
19840131 15.375 222
19840229 14 222
19840330 13.625 222
19840430 12.125 222
19840531 11.625 222
19840629 11.5 222
19840731 10.125 222
19840831 11.875 222
19840928 12.125 222
19841031 12.25 222
19841130 11.5 222
19841231 11.125 222
19850131 13.5 222
19850228 14.875 222
19850329 14.25 222
19850430 13.625 222
19850531 13.25 222
19850628 13.375 222
19850731 13.75 222
19850830 12.75 222
19850930 12.25 222
19851031 11.25 222
19851129 12 222
19851231 12.75 222
19860131 13.125 222
19860228 14 222
19860331 14.5 222
19860430 13.25 222
19860530 13.875 222
19860630 13.5 222
19860731 11.875 222
19860829 12.25 222
19860930 11.25 222
19861031 13.25 222
19861128 13 222
19861231 13.125 222
19870130 17 222
19870227 17 222
19870331 16 222
19870430 15.25 222
19870529 19.5 222
19870630 20.25 222
19870731 22 222
19870831 22 222
19870930 21.625 222
19871030 13.125 222
19871130 10.875 222
19871231 13.75 222
19880129 14.5 222
19880229 14.75 222
19880331 13.125 222
19880429 12 222
19880531 13.375 222
19880630 16.625 222
19880729 15.5 222
19880831 15.875 222
19880930 15.25 222
19881031 14.375 222
19881130 13.375 222
19881230 14.5 222
19890131 16.125 222
19890228 16.625 222
19890331 17 222
19890428 18.25 222
19890531 20 222
19890630 18.625 222
19890731 21 222
19890831 21.25 222
19890929 20 222
19891031 17.375 222
19891130 19.875 222
19891229 20.75 222
19800131 25.75 257
19800229 25.625 257
19800331 20.375 257
19800430 20.5 257
19800530 20.75 257
19800630 22 257
19800731 21.625 257
19800829 22.875 257
19800930 24.375 257
19801031 24.5 257
19801128 24.5 257
19801231 22.875 257
19810130 23.75 257
19810227 24.875 257
19810331 28.5 257
19810430 27.75 257
19810529 27.5 257
19810630 27.125 257
19810731 29.5 257
19810831 26.875 257
19810930 27.5 257
19811030 26.75 257
19811130 25.75 257
19811231 23.25 257
19820129 23.5 257
19820226 22.375 257
19820331 22.625 257
19820430 22.125 257
19820528 23.125 257
19820630 22.75 257
19820730 20.5 257
19820831 23.5 257
19820930 24.25 257
19821029 27.75 257
19821130 31 257
19821231 30 257
19830131 31.25 257
19830228 30.875 257
19830331 33.75 257
19830429 32.875 257
19830531 34.375 257
19830630 36.25 257
;
run;

proc expand data=have1 
            out=want (where=((gdp>=max_gdp_hist6) and (centered_window_range=13))) 
            method=none;
  by country;
  id date;
  convert gdp=max_gdp_hist6         / transformin=(lag 1) transformout=(nomiss movmax 6);
  convert gdp=gmean_future_rate_6   / transformin=(ratio 1 log cmovave (0 0 0 0 0 0  0  1 1 1 1 1 1) exp -1);
  convert gdp=centered_window_range / transformin=(*0 +1 cmovsum 13)  ;
 run; 

Although proc expand does not offer forward-looking windows, you can get a centered window, which includes both lags and leads.  I.e. "cmovave 13" gets an average of the preceding 6, current obs, and next 6.  Since PROC EXPAND offers weighting, I use the weights "0 0 0 0 0 0  0  1 1 1 1 1 1" to modify CMOVAVE to get average of the next six observations.  As a result, the statement

 

 

  convert gdp=gmean_future_rate_6  / transformin= (ratio 1   log cmovave (0 0 0 0 0 0  0  1 1 1 1 1 1) exp -1;

 

does the following for each observation:

  1. gets the ratio of current vs single lag of gdp
  2. gets its natural log
  3. gets the average of that log for 13 centered observations, with all the weight on future obs
  4. exponeniates that average
  5. substracts one

which gives the simple monthly growth rate which, when compounded over the next 6 obs, would yield the compounded rate using the actual gdp data.  The program also generates the arithmetic average of growth.

 

Also you don't want short historical or future windows, so there is a where   "centered_range=13" filter.  I.e. the only eligible observations will have 6 preceding and 6 following observations within the country series.

  

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
vickyCh
Obsidian | Level 7

I wonder what would happen if the next specified date is out of the (gmean_future_rate_6) period. For example, if the first specified date is 1980/07/31 for country 222, then the next possible specified date will be after 1981/02/27 (including)?
Meanwhile, I wonder what would happen if there is one month gap between the specified date and the first month of (gmean_future_rate_6) period?
In addition, maybe my expression was not clear to understand. Actually, it was not necessary to add 100 to country 257 gdp.

vickyCh
Obsidian | Level 7

Furthermore, I wonder what would happen if the time period before and after the specified date is not the same.

For example, 6 months before the specified date and 3 months after the specified date.

 

mkeintz
PROC Star

There is nothing in my suggestion that requires the same number of preceding and succeeding months. If instead of 6 months future, you want 3 months future, then instead of a series of 13 weights centered on the current record, you only need 7 weight (3 before, 1 current and 3 following).   You then can change the where condition accordingly .

 

It's time to experiment.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 2627 views
  • 1 like
  • 4 in conversation