
2 Questions about the following code. firstobs and one macro variable.

Contributor
Posts: 22


 

data _CALDATES;
   merge CRSP.DSIX (keep=CALDT rename=(CALDT=ESTPER_BEG))
         CRSP.DSIX (keep=CALDT firstobs=%eval(&estper) rename=(CALDT=ESTPER_END))

         CRSP.DSIX (keep=CALDT firstobs=%eval(&estper+&gap+1) rename=(CALDT=EVTWIN_BEG))
         CRSP.DSIX (keep=CALDT firstobs=%eval(&estper+&gap-&start+1) rename=(CALDT=&evtdate))
         CRSP.DSIX (keep=CALDT firstobs=%eval(&estper+&gap+&evtwin) rename=(CALDT=EVTWIN_END));
   format ESTPER_BEG ESTPER_END EVTWIN_BEG &evtdate EVTWIN_END date9.;
   if nmiss(ESTPER_BEG,ESTPER_END,EVTWIN_BEG,EVTWIN_END,&evtdate)=0;
   time+1;
run;
%put ### DONE!;

All the macro variables (e.g. &estper, &gap, etc.) are either dates or durations. Although this code worked, I could not quite understand the role of firstobs. From what I can find, it is mostly used to limit the rows shown in the results, but I think its role here is quite different.

 

 

Also, I was not quite sure about the role of &evtdate. According to the original script's description, EVTDATE is the name of the event date variable in the INSET dataset. So I set it to CALDT, one of the date variables, but when I do that the log says:

 

NOTE: Variable CALDT is uninitialized.
NOTE: There were 23787 observations read from the data set CRSP.DSIX.
NOTE: There were 23678 observations read from the data set CRSP.DSIX.
NOTE: There were 23647 observations read from the data set CRSP.DSIX.
NOTE: There were 23646 observations read from the data set CRSP.DSIX.
NOTE: There were 23645 observations read from the data set CRSP.DSIX.
NOTE: The data set WORK._CALDATES has 0 observations and 7 variables.
NOTE: DATA statement used (Total process time):
real time 2.49 seconds
cpu time 0.40 seconds

 

The code shown is one part of a macro, and I am testing each small part so that I can understand how the whole macro works. The macro variable &evtdate is used several times after this part of the code, but I could not see why it stops working when &evtdate is replaced with CALDT, which is a date variable from the dataset.

 

Thank you, as always, for sharing your opinions, SAS community!

 

 

Super User
Posts: 19,867

Re: 2 Questions about the following code. firstobs and one macro variable.

Posted in reply to Leon_Seungmin

FIRSTOBS= limits the data: it sets the first observation that is read from the dataset. Test it on a sample dataset.

data want;
   set sashelp.class(firstobs=5);
run;

 

Regarding the macro variable, what does it resolve to? A date or duration wouldn't make sense. I would expect it to be a variable name.

Trusted Advisor
Posts: 1,584

Re: 2 Questions about the following code. firstobs and one macro variable.

Posted in reply to Leon_Seungmin

The output has 0 observations probably because of:

   if nmiss(ESTPER_BEG,ESTPER_END,EVTWIN_BEG,EVTWIN_END,&evtdate)=0;

The IF <condition>; statement (a subsetting IF) selects which rows are output: a row is kept only when the condition is true. Getting 0 rows means that on every row at least one of the listed variables is missing (so nmiss() is > 0).
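To see the subsetting IF with nmiss() in isolation, here is a minimal sketch on a made-up three-row dataset (the names have, complete, a, b, c and n_missing are only illustrative and are not part of the original macro). Only the rows without any missing value should survive.

```sas
/* Hypothetical toy data: three rows, the second has a missing value */
data have;
   input a b c;
   datalines;
1 2 3
4 . 6
7 8 9
;
run;

data complete;
   set have;
   n_missing = nmiss(a, b, c);   /* number of missing numeric arguments */
   if n_missing = 0;             /* subsetting IF: keep only complete rows */
run;
```

If every row had at least one missing argument, the output dataset would be empty, which is the situation in the log above.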

Super User
Posts: 7,854

Re: 2 Questions about the following code. firstobs and one macro variable.

Posted in reply to Leon_Seungmin

I strongly suspect that &evtdate. contains CALDT; since you renamed CALDT to something else in all input dataset options, SAS will automatically create a new variable CALDT inside the data step, which will always be missing. See the corresponding log note:

NOTE: Variable CALDT is uninitialized.

Therefore the result of your nmiss function will never be less than 1.
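A small sketch of this effect, assuming the sample dataset sashelp.class is available (demo, age_renamed and n_missing are hypothetical names chosen for illustration):

```sas
data demo;
   /* AGE is renamed on input, so no variable called AGE is read */
   set sashelp.class (keep=age rename=(age=age_renamed));

   /* Referencing AGE here makes the DATA step create a new variable
      that is never assigned; the log should report it as uninitialized */
   n_missing = nmiss(age_renamed, age);
run;
```

Because the newly created variable is always missing, n_missing can never drop below 1, and a subsetting IF that requires nmiss(...)=0 would reject every row.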

Trusted Advisor
Posts: 1,584

Re: 2 Questions about the following code. firstobs and one macro variable.

Posted in reply to KurtBremser

Within your code you are reading the same input dataset 5 times. Each time you create a new variable equal to CALDT from a different group of observations, starting at the calculated FIRSTOBS and running to the end of the dataset.

While executing

       if nmiss(ESTPER_BEG,ESTPER_END,EVTWIN_BEG,EVTWIN_END,&evtdate)=0;

you are keeping only the rows that all the groups have in common (the intersection, like an inner join on row position).
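The pairing of rows can be tried on a small stand-in for CRSP.DSIX. This is only a sketch: days, windows, win_beg and win_end are made-up names, and the offset of 4 is arbitrary.

```sas
/* Hypothetical calendar with 10 consecutive dates */
data days;
   do caldt = '01JAN2020'd to '10JAN2020'd;
      output;
   end;
   format caldt date9.;
run;

data windows;
   /* MERGE without BY pairs rows by position: row i of the first copy
      sits next to row i+3 of the original data from the second copy */
   merge days (rename=(caldt=win_beg))
         days (firstobs=4 rename=(caldt=win_end));

   /* After the second copy runs out, win_end is missing, so the
      subsetting IF drops those trailing rows */
   if nmiss(win_beg, win_end) = 0;
   format win_beg win_end date9.;
run;
```

Each output row then holds a start date and the date three rows later, which is how the original step lines up the estimation period and event window dates.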

 

I think it can be done more efficiently by calculating the size of the resulting group and reading only that number of observations, instead of reading to the end of the dataset.

You may try the following code:

 

/* calculate size of desired group */
data _NULL_;
   id   = open('CRSP.DSIX');
   num1 = attrn(id, 'NOBS');        /* dataset number of observations */
   rc   = close(id);
   num2 = &estper;                  /* firstobs to get ESTPER_END */
   num3 = &estper+&gap+1;           /* firstobs to get EVTWIN_BEG */
   num4 = &estper+&gap-&start+1;    /* firstobs to get &evtdate   */
   num5 = &estper+&gap+&evtwin;     /* firstobs to get EVTWIN_END */

   /* rows are complete only until the copy with the largest firstobs runs out */
   group_start = max(1, num2, num3, num4, num5);
   group_size  = num1 - group_start + 1;
   put '>>> Resulting group size is ' group_size 'observations';

   if group_size > 0 then
      call symputx('OBS', group_size);
   else do;
      call symputx('OBS', 0);
      put '>>> Zero observations expected. Program not run.';
      abort;
   end;
run;

/* using the group size in your code */
/* OBS= names the last observation to read, so each copy uses firstobs + &obs - 1 */
data _CALDATES;
   merge CRSP.DSIX (keep=CALDT obs=&obs rename=(CALDT=ESTPER_BEG))
         CRSP.DSIX (keep=CALDT firstobs=%eval(&estper) obs=%eval(&estper+&obs-1) rename=(CALDT=ESTPER_END))

   /* ??? why were gap rows left here - are there other input lines ??? */

         CRSP.DSIX (keep=CALDT firstobs=%eval(&estper+&gap+1) obs=%eval(&estper+&gap+&obs) rename=(CALDT=EVTWIN_BEG))
         CRSP.DSIX (keep=CALDT firstobs=%eval(&estper+&gap-&start+1) obs=%eval(&estper+&gap-&start+&obs) rename=(CALDT=&evtdate))
         CRSP.DSIX (keep=CALDT firstobs=%eval(&estper+&gap+&evtwin) obs=%eval(&estper+&gap+&evtwin+&obs-1) rename=(CALDT=EVTWIN_END));

   format ESTPER_BEG ESTPER_END EVTWIN_BEG &evtdate EVTWIN_END date9.;
   if nmiss(ESTPER_BEG,ESTPER_END,EVTWIN_BEG,EVTWIN_END,&evtdate)=0;  /*** this line may be unnecessary */
   time+1;
run;
%put ### DONE!;

        

Contributor
Posts: 22

Re: 2 Questions about the following code. firstobs and one macro variable.

Thank you so much for your input! I was away for 2 days; I will check all the comments here and post feedback!

 

Thanks a lot!!

Contributor
Posts: 22

Re: 2 Questions about the following code. firstobs and one macro variable.
Posted in reply to Leon_Seungmin

I did a few test runs and found out the meaning of firstobs and why CALDT was not working when it was put in the macro variable &evtdate.

 

I found that when I put firstobs=%eval(1), the date variable started from the first available date, and when I put %eval(2) it started from the second date. I still do not know why it works like that, even after reading about the firstobs and %eval syntax (I would appreciate it if someone could tell me where to look to understand it thoroughly), but at least I know what this code means.

 

Also, CALDT did not work because I had not created a CALDT variable in the merged data. So I just added one more line,

   CRSP.DSIX (keep=CALDT firstobs=%eval(1)); 

and it worked fine.

 

While I was testing this code, I found that when fewer than 4 arguments follow format, it somehow did not work. Once again, I tried to find the syntax of format, but I could not find it. This happens when the name of a statement is not unique, and I am not sure what to do when it does.

 

For now, the best thing I can do when I cannot find a proper explanation seems to be asking for help in this SAS community or running a small test to figure out how the code works. Apart from these two approaches, if there are other things I can do, please let me know!

 

Thank you very much for all of your help.  

Trusted Advisor
Posts: 1,584

Re: 2 Questions about the following code. firstobs and one macro variable.

Posted in reply to Leon_Seungmin

I would like to clarify the usage of FIRSTOBS and %EVAL().

Run the next two exercises:

 

(1).

data test;
   do var = 1 to 5;
      x = 'Row No.';
      output;
   end;
run;

data test1;
   set test(firstobs=2);  /* no need to use %eval as firstobs is given by a literal */
run;

(2).

%macro check;
   %let x = 5;
   %let y = 7;
   %let A = &x + &y;          %put A=&a;   /* plain text, no arithmetic is done */
   %let B = %eval(&x + &y);   %put B=&b;   /* %eval does integer arithmetic */
%mend;

%check;
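Putting the two exercises together, a short sketch of %EVAL() inside FIRSTOBS= could look like this. The values of estper and gap are made up, and numbers and shifted are hypothetical dataset names. As far as I know, FIRSTOBS= expects a single integer, so the arithmetic has to be resolved by the macro processor before the DATA step is compiled.

```sas
%let estper = 3;   /* hypothetical values, only for illustration */
%let gap    = 2;

data numbers;
   do n = 1 to 10;
      output;
   end;
run;

data shifted;
   /* %eval(3 + 2 + 1) resolves to 6 before the step is compiled */
   set numbers (firstobs=%eval(&estper + &gap + 1));
run;
```

With these values SHIFTED should start at the sixth observation of NUMBERS, the same way firstobs=%eval(2) started from the second date in the test above.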
