I'm tasked with answering the following (it appears partially correct based upon results though there may be a more optimal way to do this):
Each patient comes in for one or more visits (VISIT variable) to get their SBP (SBP variable) measured.
A new variable, MAXSBP, is the highest sbp measurement; if the sbp is greater than 300, it is considered invalid (missing).
Another variable, MAXVISIT, should correspond to when the maximum sbp was received.
An example of a scenario:
One patient comes in 3 times with the following measurements:
Visit # SBP
1 120
2 305
3 132
So we'd want the MAXSBP to be 132 when MAXVISIT is 2.
The code I have is as follows and gets me the MAXSBP; however, it's not correlating with the correct visit.
Any help much appreciated!
data sbp1;
set sbp;
if max(of sbp:) > 300 then call missing(of sbp:); /* this looks ok */
maxvisit = max(of visit:);
run;
proc print data = sbp1 (rename=(sbp=maxsbp));
var id maxsbp maxvisit;
run;
That would be more simple.
data have;
input ID visit sbp;
cards;
1348 1 200
1348 2 310
1348 3 .
1352 1 120
1352 2 142
1352 3 132
2152 1 .
2152 2 320
;
run;
data want;
set have ;
by id;
retain max MAXVISIT ;
if first.id then do;call missing(max);MAXVISIT=visit;end;
if SBP le 300 and max lt SBP then do;max=SBP;MAXVISIT=visit;end;
if last.id then output;
keep id max MAXVISIT;
run;
Xia Keshan
It appears that your data may be structured as:
patientid sbp1 sbp2 sb3 .....
and the 1, 2, or 3 indicates the visit, is that correct?
In the case of tie values for maximum sbp which one do you want counted?
/* step is to simulate some date in ranges from 200 to 325 if your measurement isn't whole numbers then we'll need a different approach */
data have;
array s sbp1-sbp20;
do patient = 1 to 10;
do j=1 to 20;
s
end;
output;
end;
drop j;
run;
data want;
set have;
array s sbp1-sbp20;
maxvisit=0; /* this variable will hold the number of the visit with the maximum value <= 300, if there are ties I think this is going to be
the earliest visit*/
do maxsbp= 300 to 0 by -1; /* maxsbp will be the largest value <=300*/
maxvisit= whichn(maxsbp,of s
/* this could work for decimals just change the by value to -0.1, -0.01 and such but could take much longer*/
if maxvisit>0 then leave;
end;
run;
I don't understand why MAXVISIT is 2 ? since MAXSBP is to be 132 .
data have; input Visit SBP; cards; 1 120 2 305 3 132 ; run; data want; set have end=last; retain max max2 MAXVISIT ; if SBP gt max then do;max=SBP;MAXVISIT=visit;end; if SBP le 300 then max2=max(max2,SBP); if last then output; keep max2 MAXVISIT; run;
Xia Keshan
Xia,
Thanks so much!
Here's what I'm still stuggling with though think it's very close - I tweaked your code slightly:
data sbp2;
set sbp1 end=last;
retain max maxsbp MAXVISIT ;
if SBP gt max then do;
max=SBP;
MAXVISIT=visit;
end;
if SBP le 300 then maxsbp=max(maxsbp,SBP);
if last then output;
keep newid sex maxsbp MAXVISIT;
run;
proc print data = sbp2 noobs;
var newid sex maxsbp maxvisit;
run;
The output I get is:
newid sex maxsbp maxvisit
9900 | M | 132 | 2 |
This is great in that it gives me the LAST subject's maxsbp and maxvisit and this subject 9900 had a maxsbp at visit 1 of 116 so is definitely picking correct visit of the two.
I'm just not getting a listing of the other subjects and their maxsbp and maxvisit.
Apologies...I should have been clearer with the data upfront.
So the input data set looks like:
ID visit sbp
125 1 122
13000 1 .
1312 1 116
2000 1 302
1384 2 124
...
And for the output data set I'm creating 2 new vars, maxsbp and maxvisit:
ID maxsbp maxvisit
125 122 1
13000 . 1
1312 116 1
2000 . 1 (set to missing b/c exceeds 300)
1384 124 2
Xia, I had a typo - you are correct - the maxsbp is 132 at visit 3.
When I try your code, it's not excluding the values >= 300 and like with the code I wrote above, I'm not always getting the correct visit (for ex. for ID 1384 I'm not getting visit 2, I'm getting a missing value).
Thanks so much for your help!
Hi,
I would suggest post some test data in the form of a datastep and required output:
proc sql;
create table WANT as
select *
from (select ID,VISIT,case when SBP > 300 then . else SBP end as SBP from HAVE where SBP < 300)
group by ID
having VISIT=max(VISIT);
quit;
Good point thanks...I've taken a small subset below...so you may see if anything ge 300, it disqualifies it for a "maxvisit" and a "maxvisit" may in fact return a missing value if the maxsbp is ge 300.
I'd like to try to not use PROC SQL; am seeing if I may be able to rework the code with retain above.
Many thanks for all your help!
Input data set:
ID visit sbp
13480 1 200
13480 2 310
13480 3 .
1352 1 120
1352 2 142
1352 3 132
2152 1 .
2152 2 320
Output data set:
ID maxvisit maxsbp
13480 1 200
1352 2 142
2152 1 .
Sorry I didn't see this posting before - this works really well, thanks!
That would be more simple.
data have;
input ID visit sbp;
cards;
1348 1 200
1348 2 310
1348 3 .
1352 1 120
1352 2 142
1352 3 132
2152 1 .
2152 2 320
;
run;
data want;
set have ;
by id;
retain max MAXVISIT ;
if first.id then do;call missing(max);MAXVISIT=visit;end;
if SBP le 300 and max lt SBP then do;max=SBP;MAXVISIT=visit;end;
if last.id then output;
keep id max MAXVISIT;
run;
Xia Keshan
Thank you to everyone - really appreciate your feedback!!
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9.
Early bird rate extended! Save $200 when you sign up by March 31.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.