BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
lioradam
Obsidian | Level 7

Hello all,

I'm trying to write a code that will calculate the column's colores in yellow in the attached file.

I tryed the following SAS  code :

 

if ID_number is not equal to ID_number_lag

than sales_in_first_year_in_the_ job=salse

else sales_in_first_year_in_the_ job=lag(sales_in_first_year_in_the_ job)

 

But I didn't receive the expected results.

Does anyone have an idea for codes (for the two colums)?

Thanks in advance!

Lior

1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

@lioradam wrote:

I wonder if I can adjust the code in order to receive a column for sales in the second year from the start of the group and a column for sales in the second year from the end of the group (instead of column for sales in the first year and in the last year).


Yes, this is possible. To be more flexible in view of potential future requirements (sales in third year, etc.) you could write the SALES values (of one ID at a time) to a temporary array and then select what you need from the array between the two DOW-loops:

data want(drop=_m);
array s[99] _temporary_;
do _n_=1 by 1 until(last.id);
  set have;
  by id notsorted;
  s[n_yrs]=sales;
  _m=max(_m,n_yrs);
end;
sales_second=s[2];
if _m>=2 then sales_lastbut1=s[_m-1];
do _n_=1 to _n_;
  set have;
  output;
end;
call missing(of s[*]);
run;

This code does no longer rely on the order of observations within the BY groups.

View solution in original post

7 REPLIES 7
FreelanceReinh
Jade | Level 19

Hello @lioradam,

 

One idea is to use a double DOW-loop:

data want;
do _n_=1 by 1 until(last.id);
  set have;
  by id notsorted;
  if n_yrs=1 then sales_first=sales;
  if last.id then sales_last=sales;
end;
do _n_=1 to _n_;
  set have;
  output;
end;
run;

This assumes that the input dataset (HAVE) is grouped by ID and that the last observation within each BY group contains the SALES value for the last year in the job. (If needed, the code could be modified to determine the observation with the highest number of years in the job per ID based on variable N_YRS rather than relying on ascending sort order.) Variable SALES_LAST could also be set unconditionally.

 

In the first loop the new variables are populated. The second loop writes them to dataset WANT, together with the content of HAVE.

lioradam
Obsidian | Level 7

Hi,

Thank you very much!

Its works perfectly.

Since this code is new to me,  I wonder if I can adjust the code in order to receive a column for sales in the second year from the start of the group and a column for sales in the second year from the end of the group (instead of column for sales in the first year and in the last year).

 

Kind Regards,

Lior

 

FreelanceReinh
Jade | Level 19

@lioradam wrote:

I wonder if I can adjust the code in order to receive a column for sales in the second year from the start of the group and a column for sales in the second year from the end of the group (instead of column for sales in the first year and in the last year).


Yes, this is possible. To be more flexible in view of potential future requirements (sales in third year, etc.) you could write the SALES values (of one ID at a time) to a temporary array and then select what you need from the array between the two DOW-loops:

data want(drop=_m);
array s[99] _temporary_;
do _n_=1 by 1 until(last.id);
  set have;
  by id notsorted;
  s[n_yrs]=sales;
  _m=max(_m,n_yrs);
end;
sales_second=s[2];
if _m>=2 then sales_lastbut1=s[_m-1];
do _n_=1 to _n_;
  set have;
  output;
end;
call missing(of s[*]);
run;

This code does no longer rely on the order of observations within the BY groups.

lioradam
Obsidian | Level 7

Thank you!

Lior

Kurt_Bremser
Super User

First of all, get rid of the five (or so) syntax errors you managed to put into three lines of code.

Hint: "is not equal to" is not a valid SAS comparison operator. See SAS Operators 

Next, never call the LAG function in a conditional branch, as that will prevent the FIFO queue from being filled in each data step iteration.

Third, please provide usable example data in a data step with datalines and show the complete log of the step you ran (all code and messages).

 

PaigeMiller
Diamond | Level 26

You don't say what you are trying to calculate. Sum by group? Cumulative sum by group? Mean by group? Std Deviation by group? Kurtosis?

 

--
Paige Miller
lioradam
Obsidian | Level 7

Hi,

Just the value of the sales in the second year.

This means that there will be a column called "sales in the second year", in which in the first four rows the value will be 102, in the next three rows the value will be 40 and in the last six rows, the value will be 1352.

 

Regards,

Lior

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 1001 views
  • 0 likes
  • 4 in conversation