BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
drip_
Obsidian | Level 7

Hello,

I'm trying to take variables from future years and insert them into the same row of data as the current year's value, i.e. take dividends from 2019, 2020 and 2021 and add them to the data row of fiscal year 2018.

I'm trying to do it via the Macro below, my problem is that for some reason in the output, only i=3 is actually present, i=1 and i=2 are not there. In my example I would have the row for fiscal year 2018 and would have the 2021 dividends, but not those from 2019 and 2020.

How can I modify my code to not have this problem?

Thanks in advance!

%Macro Leadvariables (var=, dataset=);

	%Do i = 1 %to 3;

		Proc SQL;
			Create table &dataset. as
			select a.*,
			       b.&var. as F&i._&var. label = " "
			      
			From 	  &dataset. as a
			left join &dataset. as b
On a.gvkey = b.gvkey and a.fyear = b.fyear - &i.; Quit; %End; %Mend;

 

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

You are processing the entire dataset three times, once for each desired forward period.  I'd suggest processing the data once, in a data step, using an array to build up a complete history of dividends, and then subsequently retrieve those dividends 

 

BTW - variables FYEAR?  GVKEY?   You are apparently using Compustat data.   And it is probably sorted by GVKEY/DATADATE (almost, but not quite the same as sorted by GVKEY/FYEAR).   I assume you are looking at annual data.  So here is a DATA step to take advantage of the fact that the data are sorted by GVKEY (it won't matter what the order is within GVKEY).

 

Untested in the absence of sample data in a working data step:  

 

%let minyear=1998;
%let maxyear=2021;
%let upper_bound=%eval(&maxyear+3);


data want (drop=_:);

  array _v1_hist{&minyear:&upper_bound} ;  /*dividend history*/
  array _v2_hist{&minyear:&upper_bound} ;  /*some other history*/

  do until (last.gvkey);     /*Build history of variables of interest */
    set have;
    by gvkey;
    _v1_hist{fyear}=var1;
    _v2_hist{fyear}=var2;
  end;

  /* Now reread the data and fetch values from future years*/
  do until (last.gvkey);     
    set have;
    by gvkey;
    var1_f1=_v1_hist{fyear+1};
    var2_f1=_v2_hist{fyear+1};
    var1_f2=_v1_hist{fyear+2};
    var2_f2=_v2_hist{fyear+2};
    var1_f3=_v1_hist{fyear+3};
    var2_f3=_v2_hist{fyear+3};
    output;
  end;
run;

The arrays are indexed by fiscal year.  You can modify the macrovars minyear and maxyear for your data.   This code makes the upper bound of the array equivalent to maxyear+3, so as to avoid an error when retrieving a value for three years after 2021 in this case.

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

4 REPLIES 4
Quentin
Super User

Hi,

 

If you turn on options MPRINT and look at the log, I think you'll see that the macro did generate all three PROC SQL steps, as you expect, and hopefully they all ran without any errors.

 

The problem is you're using the CREATE TABLE statement which does not insert data into an existing dataset, it creates a dataset.  If that dataset already exists, it over-writes the dataset.  Also a problem is that you are over-writing the same dataset you read in.

 

The easiest way in SAS to add rows to an existing dataset is to use PROC APPEND.

 

That said, if you can post a little example of the data you have, and the data you want, people might be able to help with other approaches.  For example, it sounds like you're trying to do a 'look-ahead'?  Generally the structure of data you have (with a year variable) is better than having separate variables for Y2019dividend, Y2020dividend, Y2021dividend..

BASUG is hosting free webinars Next up: Don Henderson presenting on using hash functions (not hash tables!) to segment data on June 12. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
tarheel13
Rhodochrosite | Level 12

can you post some sample data as datalines? it looks like you're joining the same table to itself. not sure if you meant to do that.

PaigeMiller
Diamond | Level 26

When creating macros, you really ought to first have working code without macros and without macro variables that does what you want for 2 instances (in this case 2 years). I'm guessing you didn't do that, and so it is difficult to get code with macros and with macro variables to work.

 

If you did write the code without macros and without macro variables and it works properly for two years, it should be simple to turn this code into a macro. So, please show us the working code without macros and without macro variables and it works properly for two years.

 

NOTE: Most people ignore this advice and continue to struggle writing macros that work. Don't be one of those people.

--
Paige Miller
mkeintz
PROC Star

You are processing the entire dataset three times, once for each desired forward period.  I'd suggest processing the data once, in a data step, using an array to build up a complete history of dividends, and then subsequently retrieve those dividends 

 

BTW - variables FYEAR?  GVKEY?   You are apparently using Compustat data.   And it is probably sorted by GVKEY/DATADATE (almost, but not quite the same as sorted by GVKEY/FYEAR).   I assume you are looking at annual data.  So here is a DATA step to take advantage of the fact that the data are sorted by GVKEY (it won't matter what the order is within GVKEY).

 

Untested in the absence of sample data in a working data step:  

 

%let minyear=1998;
%let maxyear=2021;
%let upper_bound=%eval(&maxyear+3);


data want (drop=_:);

  array _v1_hist{&minyear:&upper_bound} ;  /*dividend history*/
  array _v2_hist{&minyear:&upper_bound} ;  /*some other history*/

  do until (last.gvkey);     /*Build history of variables of interest */
    set have;
    by gvkey;
    _v1_hist{fyear}=var1;
    _v2_hist{fyear}=var2;
  end;

  /* Now reread the data and fetch values from future years*/
  do until (last.gvkey);     
    set have;
    by gvkey;
    var1_f1=_v1_hist{fyear+1};
    var2_f1=_v2_hist{fyear+1};
    var1_f2=_v1_hist{fyear+2};
    var2_f2=_v2_hist{fyear+2};
    var1_f3=_v1_hist{fyear+3};
    var2_f3=_v2_hist{fyear+3};
    output;
  end;
run;

The arrays are indexed by fiscal year.  You can modify the macrovars minyear and maxyear for your data.   This code makes the upper bound of the array equivalent to maxyear+3, so as to avoid an error when retrieving a value for three years after 2021 in this case.

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 455 views
  • 3 likes
  • 5 in conversation