Good morning.
The below dataset I have all but the Wanted column. That is what I require. So for every first.reference the Wanted should be 1. For the next reference, if it is the same value as the reference in the line above, check the FLAG value. If it is the same as the line above, it should be the same as the Wanted value above. If the FLAG value is different from the line above (i.e. from 0 to 1 or 1 to 0), the WANTED value should be lag(Wanted )+ 1. DO that until the next Reference and if it is the first reference start again.
I have tried so many pieces of code. The lag creates a blank. Also the lag when using a loop seems to keep an old version of the lag. I am just unsure as to where to start...
Ta,
Arch
Month_A | Month_B | Reference | Date_C | FLAG | Wanted |
Nov-23 | Jan-24 | A | 2/11/2023 | 1 | 1 |
Oct-22 | Jul-23 | B | 3/10/2022 | 1 | 1 |
Nov-23 | Jan-24 | B | 3/10/2022 | 0 | 2 |
May-23 | Jan-24 | C | 15/05/2023 | 1 | 1 |
Feb-23 | Sep-23 | D | 9/02/2023 | 1 | 1 |
Oct-23 | Jan-24 | D | 9/02/2023 | 0 | 2 |
May-23 | Dec-23 | E | 30/05/2023 | 1 | 1 |
Mar-23 | Mar-23 | F | 14/03/2023 | 1 | 1 |
Apr-23 | Jun-23 | F | 27/04/2023 | 1 | 1 |
Jul-23 | Jan-24 | F | 27/04/2023 | 0 | 2 |
May-23 | May-23 | G | 31/05/2023 | 1 | 1 |
Mar-23 | Apr-23 | H | 6/03/2023 | 1 | 1 |
May-23 | Nov-23 | H | 6/03/2023 | 0 | 2 |
Dec-23 | Jan-24 | H | 5/12/2023 | 1 | 3 |
No need to use LAG(). Just tell SAS to treat FLAG as one of the BY variables. Make sure to include the NOTSORTED keyword since the observations are not actually sorted.
data want;
set have;
by reference flag notsorted;
if first.reference then WANT=0;
WANT + first.flag;
run;
proc print;
var reference flag wanted want ;
run;
Result
Obs Reference FLAG Wanted WANT 1 A 1 1 1 2 B 1 1 1 3 B 0 2 2 4 C 1 1 1 5 D 1 1 1 6 D 0 2 2 7 E 1 1 1 8 F 1 1 1 9 F 1 1 1 10 F 0 2 2 11 G 1 1 1 12 H 1 1 1 13 H 0 2 2 14 H 1 3 3
If you really did want to use LAG() you could do that also.
For example if you are not positive if the observations are really in chronological order you could include DATE_C in the BY statement so SAS will check. Then use LAG() (or since it is numeric perhaps DIF()) to find when the FLAG variable changed. Make sure to execute the LAG() function on every observations so you are really comparing to the previous observations value.
data want;
set have;
by reference date_c;
if first.reference then WANT=0;
WANT + (first.reference or lag(flag) ne flag);
run;
No need to use LAG(). Just tell SAS to treat FLAG as one of the BY variables. Make sure to include the NOTSORTED keyword since the observations are not actually sorted.
data want;
set have;
by reference flag notsorted;
if first.reference then WANT=0;
WANT + first.flag;
run;
proc print;
var reference flag wanted want ;
run;
Result
Obs Reference FLAG Wanted WANT 1 A 1 1 1 2 B 1 1 1 3 B 0 2 2 4 C 1 1 1 5 D 1 1 1 6 D 0 2 2 7 E 1 1 1 8 F 1 1 1 9 F 1 1 1 10 F 0 2 2 11 G 1 1 1 12 H 1 1 1 13 H 0 2 2 14 H 1 3 3
If you really did want to use LAG() you could do that also.
For example if you are not positive if the observations are really in chronological order you could include DATE_C in the BY statement so SAS will check. Then use LAG() (or since it is numeric perhaps DIF()) to find when the FLAG variable changed. Make sure to execute the LAG() function on every observations so you are really comparing to the previous observations value.
data want;
set have;
by reference date_c;
if first.reference then WANT=0;
WANT + (first.reference or lag(flag) ne flag);
run;
Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.