My dataset looks like this.
pagenum
51
51
51
51
52
52
157
157
158
158
158
158
Desired output:
pagenum
51
51
51
51
51
51
157
157
157
157
157
157
my code:
proc sort;by pagenum;run;
data itemqc;
set itemqc;
by pagenum;
if lag(pagenum)=sum(pagenum,-1,0) then do;;pagenum=lag(pagenum);end;
run;
This is not working. Can you please help me either using lag function or retain.
thanks very much
You should not use conditionals with a lag() function. Clearly states it in the manual.
Something like:
proc sort data=itemqc; by pagenum; run; data itemqc; set itemqc (rename=(pagenum=oldpge)); retain pagenum; by oldpge; if _n_=1 then pagenum=oldpge; else if first.oldpge and oldpge ne pagenum+1 then pagenum=oldpge; run;
Note not tested - post test data in the form of a datastep in future, I am not here to type in test data!
You should not use conditionals with a lag() function. Clearly states it in the manual.
Something like:
proc sort data=itemqc; by pagenum; run; data itemqc; set itemqc (rename=(pagenum=oldpge)); retain pagenum; by oldpge; if _n_=1 then pagenum=oldpge; else if first.oldpge and oldpge ne pagenum+1 then pagenum=oldpge; run;
Note not tested - post test data in the form of a datastep in future, I am not here to type in test data!
thank you very much for your quick reply. It works perfectly. I had a feeling I was on a wrong track. 🙂
Many thanks for the solution.
data have;
input pagenum;
cards;
51
51
51
51
52
52
157
157
158
158
158
158
;
data want;
set have;
by pagenum;
retain k k1;
if first.pagenum then do;
k=dif(pagenum);
k1=ifn(k=1,lag(pagenum),k1);
end;
if k=1 then pagenum=k1;
drop k:;
run;
Seems like you could do something like:
if pagenum=sum(lag(pagenum),1) then pagenum=pagenum-1;
Well, that would break for first record if it has pagenum=1.
It also doesn't handle 'chaining', i.e. it's not clear what you would want to do with data like:
51 51 52 53 117
It would recode 53 to 52, not 51.
Conditional lagging works, but the key point to know is that LAG() does not return a value from the previous record. It returns a value from a queue. So using conditional lagging can be tricky. It's a common problem. One way to avoid this complexity is to use the LAG() function unconditionally, to create a temporary variable, e.g.:
lagPagenum=lag(Pagenum) ;
if lagPagenum=sum(pagenum,-1,0) then do;
pagenum=lagPagenum;
end;
drop lagPagenum;
By the way, it the 0 you have as an argument in the sum function shouldn't be doing anything. Sometimes people add a 0 argument to the sum function like y=sum(x,0); so that it will return 0 if X is missing. But in this case you already have an -1 argument, so adding the 0 argument is unnecessary.
Thank you everyone for the response.
This solution by RW9 worked perfectly.
proc sort data=itemqc;
by pagenum;
run;
data itemqc;
set itemqc (rename=(pagenum=oldpge));
retain pagenum;
by oldpge;
if _n_=1 then pagenum=oldpge;
else if first.oldpge and oldpge ne pagenum+1 then pagenum=oldpge;
run;
Not too blow my own trumpet, but if my solution was correct, please mark my post as correct by clicking the button.
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.