I Have a dataset named Details with the following columns:
Date | Ind | Stacks | Nets | Start | End |
01st May 2020 | Y | 10 | 4 | 9 | 13 |
2nd May 2020 | Y | 5 | 6 | ||
3rd May 2020 | Y | 4 | 7 | ||
4th May 2020 | Y | 3 | 9 | ||
5th May 2020 | Y | 2 | 5 |
I want to create the values of Start and End in the following table using previous day values. Start will be Previous day end and End will be start + nets and need to iterate this for all days using proc data step. The table needs to look like this:
Date | Ind | Stacks | Nets | Start | End |
01st May 2020 | Y | 10 | 4 | 9 | 13 |
2nd May 2020 | Y | 5 | 6 | 13 | 19 |
3rd May 2020 | Y | 4 | 7 | 26 | 33 |
4th May 2020 | Y | 3 | 9 | 33 | 42 |
5th May 2020 | Y | 2 | 5 | 42 | 47 |
I tried the following code with no success and needs help in determining what I did wrong:
data FinalTable;
set Details ;
by Date ;
retain IND STACK NETS START END
;
if first. date then do;
IND = IND;
STACK = STACKS;
NETS = NETS;
START = START;
END = END;
end;
if last. date then do;
END =START + NETS;
output;
START=END;
end;
run;
proc print; run;
data want;
set details;
retain newend;
if _n_=1 then do;
newstart=start;
newend=end;
end;
else if _n_>1 then do;
newstart=newend;
newend=newstart+nets;
end;
run;
Assumes you have fixed the code as indicated by @Tom
/* UNTESTED CODE */
data want;
set have;
prev_end=lag(end);
if missing(start) then start=prev_end;
if missing(end) then end=start+nets;
drop prev_end;
run;
If you want tested code, please provide the data as SAS data step code (instructions), and not as screen captures or Excel.
Please provide data as working SAS data step code, as I explained.
@arjunaj91 wrote:
data Details;
input Date :date. IND $ STACK NETS START END ;
format date date9.;
datalines;
01May-2020 Y 10 4 9 13
02May-2020 Y 5 6
03May-2020 Y 4 7
04May-2020 Y 3 9
05May-2020 Y 2 5
;
That data step does not appear to work properly.
2449 data Details; 2450 input Date :date. IND $ STACK NETS START END ; 2451 format date date9.; 2452 datalines; NOTE: Invalid data for START in line 2455 1-10. NOTE: Invalid data for END in line 2455 12-12. RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0 2455 03May-2020 Y 4 7 Date=02MAY2020 IND=Y STACK=5 NETS=6 START=. END=. _ERROR_=1 _N_=2 NOTE: Invalid data for START in line 2457 1-10. NOTE: Invalid data for END in line 2457 12-12. 2457 05May-2020 Y 2 5 Date=04MAY2020 IND=Y STACK=3 NETS=9 START=. END=. _ERROR_=1 _N_=3 NOTE: SAS went to a new line when INPUT statement reached past the end of a line. NOTE: The data set WORK.DETAILS has 3 observations and 6 variables.
The last four lines only have four values on them.
Did you mean to use a data step like this instead?
data Details;
input Date :date. IND $ STACK NETS START END ;
format date date9.;
datalines;
01May-2020 Y 10 4 9 13
02May-2020 Y 5 6 . .
03May-2020 Y 4 7 . .
04May-2020 Y 3 9 . .
05May-2020 Y 2 5 . .
;
data want;
set details;
retain newend;
if _n_=1 then do;
newstart=start;
newend=end;
end;
else if _n_>1 then do;
newstart=newend;
newend=newstart+nets;
end;
run;
Assumes you have fixed the code as indicated by @Tom
It does no good to RETAIN variables that are sourced from an input dataset. First it is redundant as such variables are already not reset to missing at the start of each data step iteration (which is what RETAIN actually means). But mainly because the values are overwritten by the values read from the input data.
You need to define NEW variables to retain the calculated new values of START and END.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.