data have;
input ID1 ID2 R I Seq ;
datalines;
1 11 1 . 1
2 11 0 1 2
1 143 0 1 1
1 22 1 . 1
2 22 1 . 2
3 22 1 . 1
4 22 1 . 2
1 165 0 1 1
1 164 0 1 1
1 166 0 1 1
1 33 0 1 1
2 33 1 . 2
3 33 1 . 3
4 33 1 . 4
;
proc sort data = have;
by ID2;
run;
data want;
set have;
by ID2;
if I = 1 and lag(R) =1 the Rnew = 1;
Else Rnew = 0;
run;
Want
To create a new field called Rnew where if I = 1 then Rnew = Lag(R)
I sorted the data by ID2 so that this rule applies to only ID2 and changes everytime ID2 changes - but its not working
For ID2 = 11 SAS is reading the R field for the next ID2 (143) and ignoring the lag function and shows RNew = 1 when infact it should be 0;
Any ideas what am I doing incorrect and how do I tell SAS to apply this rule for a set of IDs only.
Thanks
WANT
ID1 | ID2 | R | I | Seq | Rnew |
1 | 11 | 1 | 1 | 0 | |
2 | 11 | 0 | 1 | 2 | 0 |
1 | 22 | 1 | 1 | 0 | |
3 | 22 | 1 | 1 | 0 | |
2 | 22 | 1 | 2 | 0 | |
4 | 22 | 1 | 2 | 0 | |
1 | 33 | 0 | 1 | 1 | 1 |
2 | 33 | 1 | 2 | 0 | |
3 | 33 | 1 | 3 | 0 | |
4 | 33 | 1 | 4 | 0 | |
1 | 143 | 0 | 1 | 1 | 0 |
1 | 164 | 0 | 1 | 1 | 0 |
1 | 165 | 0 | 1 | 1 | 0 |
1 | 166 | 0 | 1 | 1 | 0 |
Is that "want" data show the result of your code? You "want" code has at least one error:
by ID2 and seq;
AS the variable AND on your BY statement does not exist in the shown HAVE data set.
Second, without the "and" you will get an error of
ERROR: BY variables are not properly sorted on data set USER.HAVE.
So, please provide your example data in the form of a data step that will run to create your Have data such as.
data have; input ID1 ID2 R I Seq ; datalines; 1 11 1 . 1 2 11 0 1 2 1 143 0 1 1 1 22 1 . 1 2 22 1 . 2 3 22 1 . 1 4 22 1 . 2 1 165 0 1 1 1 164 0 1 1 1 166 0 1 1 1 33 0 1 1 2 33 1 . 3 3 33 1 . 4 4 33 1 . 2 ;
plus the actual code used to sort the data as your shown "want" data set does not show data sorted by Id2 Seq.
And a similar data set to show the desired result.
Thank you I fixed my question and reposted it with the code.
Sorting is fine its just the if statement that I need help with, How do I tell SAS to stop at each set of ID2.
Does your reporting of data set WANT really represent your desired output? I ask because it does not meet my understanding of your requirement.
I suggest:
data have;
input ID1 ID2 R I Seq ;
datalines;
1 11 1 . 1
2 11 0 1 2
1 143 0 1 1
1 22 1 . 1
2 22 1 . 2
3 22 1 . 1
4 22 1 . 2
1 165 0 1 1
1 164 0 1 1
1 166 0 1 1
1 33 0 1 1
2 33 1 . 2
3 33 1 . 3
4 33 1 . 4
;
proc sort data = have;
by ID2;
run;
data want;
set have;
by ID2;
rnew=ifn(i=1 and first.id2=0,lag(r),0);
run;
The statement
rnew=ifn(i=1 and first.id2=0,lag(r),0);
probably produces the result you describe, while the code you provide:
if I = 1 and lag(R) =1 then Rnew = 1;
Else Rnew = 0;
probably does not. That's for two reasons:
What you probably want is to update the queue with every observation, but return that update to the RNEW result only when the if condition is met. Unlike the IF … THEN … statement, the IFN function always updates both of the possible results (i.e. both LAG(r) and 0). And it returns LAG(r) when I=1 and first.ID2=0, and returns zero otherwise.
Whenever I see or use the LAG function, in my mind I substitute the term UFQ (update-fifo-queue), which I find to be a useful way to recognize what the function is actually doing.
Also:
BTW, it looks like your dataset HAVE is already grouped by ID2, even though it is not SORTED by ID2, you could avoid the PROC SORT. Just change the "BY ID2" statement to "BY ID2 NOTSORTED".
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.