abdulla
Pyrite | Level 9

Hi,

My data are sorted by gvkey and year. I ran the following code.

Data want;
set have;
if first.gvkey then sales_growth=0;
else sales_growth=log((1+sale)/lag((1+sale)));
run;

I get the sales growth numbers I expect for the whole data set, except for the following.

Gvkey   year    sale     sales_growth
1004    2000    28796    0
1004    2001    34697    .
1004    2002    36784    0.0601

 

My question is: why am I getting a missing value in the second year? In my data set, 1004 is the first gvkey. The problem happens only for the first gvkey; I do not have this problem for the other gvkeys. Can anyone help me, please?

 

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26
NOTE: Missing values were generated as a result of performing an operation on missing values.
Each place is given by: (Number of times) at (Line):(Column).
1 at 3647:19 1 at 3647:32

This note is telling you there is a problem: a missing value is produced at line 3647, column 32. What happens at column 32 of line 3647? You are evaluating the LAG() function.

 

The LAG() function doesn't work the way most people think it works: it returns the value stored the last time LAG() was executed, not the value from the previous record. Your code only evaluates LAG() on records where first.gvkey is zero, so I think you need LAG() to be evaluated on every record instead.
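
To make the queue behaviour concrete, here is a minimal sketch on a toy data set. The data set names toy and demo, the variables prev_every and prev_cond, and all the values are made up for illustration; only gvkey, fy and wsale are names from this thread.

```sas
/* Hypothetical toy data: two firms, three years each (values are made up) */
data toy;
  input gvkey fy wsale;
  datalines;
1 2000 10
1 2001 20
1 2002 30
2 2000 100
2 2001 200
2 2002 300
;
run;

data demo;
  set toy;
  by gvkey fy;
  /* LAG called on every record: always returns the previous record's value */
  prev_every = lag(wsale);
  /* LAG called only when first.gvkey = 0: returns the value stored by the
     previous CALL, which is not always the previous record */
  if not first.gvkey then prev_cond = lag(wsale);
run;

proc print data=demo noobs;
run;
```

On the second record (gvkey 1, 2001), prev_cond is missing because that is the first time the conditional LAG() runs and its queue is still empty. On the fifth record (gvkey 2, 2001), prev_cond is 30, the last value stored for gvkey 1, rather than 100. So with a conditional LAG(), the second year of every later gvkey is not missing, but it is compared against the previous firm's last year, which is worth checking in the original results.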

 

So, this should fix the issue, by evaluating LAG() on every record.

 

data comp33;
    set comp3;
    by gvkey fy;
    /* Call LAG() on every record so its queue always holds the previous record */
    zz = lag(1 + wsale);
    if first.gvkey then sales_growth = 0;
    else sales_growth = log((1 + wsale) / zz);
run;
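
As a possible alternative, here is a sketch that relies on the identity log(a/b) = log(a) - log(b). The DIF() function returns its argument minus the LAG() of that argument, and here it is called on every record, so the same growth figure can be computed without a helper variable. The data set names comp3_toy and growth_toy and the values in them are hypothetical stand-ins, and the sketch assumes 1+wsale is always positive.

```sas
/* Hypothetical toy data standing in for comp3 (values are made up) */
data comp3_toy;
  input gvkey fy wsale;
  datalines;
1004 2000 100
1004 2001 110
1013 2000 50
1013 2001 60
;
run;

data growth_toy;
  set comp3_toy;
  by gvkey fy;
  /* DIF(x) = x - LAG(x), executed on every record, so
     log(1+wsale) - log(1+previous wsale) = log((1+wsale)/(1+previous wsale)) */
  sales_growth = dif(log(1 + wsale));
  /* Overwrite the first year of each firm, where no prior year exists */
  if first.gvkey then sales_growth = 0;
run;

proc print data=growth_toy noobs;
run;
```

The first record of the whole data set still produces a missing DIF() result before it is overwritten with 0, which is harmless here.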

 

--
Paige Miller


6 REPLIES 6
PaigeMiller
Diamond | Level 26

Well, you didn't run that exact code. There's no variable "sale" in the data you show, despite what the code you show says. And of course, that code won't run unless it's in a DATA step, so let's see the entire DATA step. Show us the entire LOG (code plus notes, warnings, errors) for this DATA step. Paste the log into the window that appears when you click on the </> icon, so as to preserve the formatting and make it easier for us to read.

 

Also, show us a portion of the actual data, where the variable names match the variable names in the code.

--
Paige Miller
abdulla
Pyrite | Level 9
I have edited the code, and I don't have any errors in the log.
PaigeMiller
Diamond | Level 26

I don't want to see "edited" code. I want to see the actual code you are running. And show us the LOG anyway, even if there are no errors or warnings. 

--
Paige Miller
abdulla
Pyrite | Level 9


3643 Data comp33;
3644 set comp3;
3645 by gvkey fy;
3646 if first.gvkey then sales_growth=0;
3647 else sales_growth=log((1+wsale)/lag((1+wsale)));
3648 run;

NOTE: Missing values were generated as a result of performing an operation on missing values.
Each place is given by: (Number of times) at (Line):(Column).
1 at 3647:19 1 at 3647:32
NOTE: There were 110195 observations read from the data set WORK.COMP3.
NOTE: The data set WORK.COMP33 has 110195 observations and 43 variables.
NOTE: DATA statement used (Total process time):
real time 0.11 seconds
cpu time 0.07 seconds

 

 

By the way, I don't have any missing observations in my first gvkey, and the problem is that I get a missing value for the first gvkey only.

abdulla
Pyrite | Level 9
Paige Miller, thank you very much. It works.
