My dataset looks like this.
pagenum
51
51
51
51
52
52
157
157
158
158
158
158
Desired output:
pagenum
51
51
51
51
51
51
157
157
157
157
157
157
my code:
proc sort;by pagenum;run;
data itemqc;
set itemqc;
by pagenum;
if lag(pagenum)=sum(pagenum,-1,0) then do;;pagenum=lag(pagenum);end;
run;
This is not working. Can you please help me either using lag function or retain.
thanks very much
You should not use conditionals with a lag() function. Clearly states it in the manual.
Something like:
proc sort data=itemqc; by pagenum; run; data itemqc; set itemqc (rename=(pagenum=oldpge)); retain pagenum; by oldpge; if _n_=1 then pagenum=oldpge; else if first.oldpge and oldpge ne pagenum+1 then pagenum=oldpge; run;
Note not tested - post test data in the form of a datastep in future, I am not here to type in test data!
You should not use conditionals with a lag() function. Clearly states it in the manual.
Something like:
proc sort data=itemqc; by pagenum; run; data itemqc; set itemqc (rename=(pagenum=oldpge)); retain pagenum; by oldpge; if _n_=1 then pagenum=oldpge; else if first.oldpge and oldpge ne pagenum+1 then pagenum=oldpge; run;
Note not tested - post test data in the form of a datastep in future, I am not here to type in test data!
thank you very much for your quick reply. It works perfectly. I had a feeling I was on a wrong track. 🙂
Many thanks for the solution.
data have;
input pagenum;
cards;
51
51
51
51
52
52
157
157
158
158
158
158
;
data want;
set have;
by pagenum;
retain k k1;
if first.pagenum then do;
k=dif(pagenum);
k1=ifn(k=1,lag(pagenum),k1);
end;
if k=1 then pagenum=k1;
drop k:;
run;
Seems like you could do something like:
if pagenum=sum(lag(pagenum),1) then pagenum=pagenum-1;
Well, that would break for first record if it has pagenum=1.
It also doesn't handle 'chaining', i.e. it's not clear what you would want to do with data like:
51 51 52 53 117
It would recode 53 to 52, not 51.
Conditional lagging works, but the key point to know is that LAG() does not return a value from the previous record. It returns a value from a queue. So using conditional lagging can be tricky. It's a common problem. One way to avoid this complexity is to use the LAG() function unconditionally, to create a temporary variable, e.g.:
lagPagenum=lag(Pagenum) ;
if lagPagenum=sum(pagenum,-1,0) then do;
pagenum=lagPagenum;
end;
drop lagPagenum;
By the way, it the 0 you have as an argument in the sum function shouldn't be doing anything. Sometimes people add a 0 argument to the sum function like y=sum(x,0); so that it will return 0 if X is missing. But in this case you already have an -1 argument, so adding the 0 argument is unnecessary.
Thank you everyone for the response.
This solution by RW9 worked perfectly.
proc sort data=itemqc;
by pagenum;
run;
data itemqc;
set itemqc (rename=(pagenum=oldpge));
retain pagenum;
by oldpge;
if _n_=1 then pagenum=oldpge;
else if first.oldpge and oldpge ne pagenum+1 then pagenum=oldpge;
run;
Not too blow my own trumpet, but if my solution was correct, please mark my post as correct by clicking the button.
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.