I'm tasked with answering the following (it appears partially correct based upon results though there may be a more optimal way to do this):
Each patient comes in for one or more visits (VISIT variable) to get their SBP (SBP variable) measured.
A new variable, MAXSBP, is the highest sbp measurement; if the sbp is greater than 300, it is considered invalid (missing).
Another variable, MAXVISIT, should correspond to when the maximum sbp was received.
An example of a scenario:
One patient comes in 3 times with the following measurements:
Visit # SBP
1 120
2 305
3 132
So we'd want the MAXSBP to be 132 when MAXVISIT is 2.
The code I have is as follows and gets me the MAXSBP; however, it's not correlating with the correct visit.
Any help much appreciated!
data sbp1;
set sbp;
if max(of sbp:) > 300 then call missing(of sbp:); /* this looks ok */
maxvisit = max(of visit:);
run;
proc print data = sbp1 (rename=(sbp=maxsbp));
var id maxsbp maxvisit;
run;
That would be more simple.
data have;
input ID visit sbp;
cards;
1348 1 200
1348 2 310
1348 3 .
1352 1 120
1352 2 142
1352 3 132
2152 1 .
2152 2 320
;
run;
data want;
set have ;
by id;
retain max MAXVISIT ;
if first.id then do;call missing(max);MAXVISIT=visit;end;
if SBP le 300 and max lt SBP then do;max=SBP;MAXVISIT=visit;end;
if last.id then output;
keep id max MAXVISIT;
run;
Xia Keshan
It appears that your data may be structured as:
patientid sbp1 sbp2 sb3 .....
and the 1, 2, or 3 indicates the visit, is that correct?
In the case of tie values for maximum sbp which one do you want counted?
/* step is to simulate some date in ranges from 200 to 325 if your measurement isn't whole numbers then we'll need a different approach */
data have;
array s sbp1-sbp20;
do patient = 1 to 10;
do j=1 to 20;
s
end;
output;
end;
drop j;
run;
data want;
set have;
array s sbp1-sbp20;
maxvisit=0; /* this variable will hold the number of the visit with the maximum value <= 300, if there are ties I think this is going to be
the earliest visit*/
do maxsbp= 300 to 0 by -1; /* maxsbp will be the largest value <=300*/
maxvisit= whichn(maxsbp,of s
/* this could work for decimals just change the by value to -0.1, -0.01 and such but could take much longer*/
if maxvisit>0 then leave;
end;
run;
I don't understand why MAXVISIT is 2 ? since MAXSBP is to be 132 .
data have; input Visit SBP; cards; 1 120 2 305 3 132 ; run; data want; set have end=last; retain max max2 MAXVISIT ; if SBP gt max then do;max=SBP;MAXVISIT=visit;end; if SBP le 300 then max2=max(max2,SBP); if last then output; keep max2 MAXVISIT; run;
Xia Keshan
Xia,
Thanks so much!
Here's what I'm still stuggling with though think it's very close - I tweaked your code slightly:
data sbp2;
set sbp1 end=last;
retain max maxsbp MAXVISIT ;
if SBP gt max then do;
max=SBP;
MAXVISIT=visit;
end;
if SBP le 300 then maxsbp=max(maxsbp,SBP);
if last then output;
keep newid sex maxsbp MAXVISIT;
run;
proc print data = sbp2 noobs;
var newid sex maxsbp maxvisit;
run;
The output I get is:
newid sex maxsbp maxvisit
9900 | M | 132 | 2 |
This is great in that it gives me the LAST subject's maxsbp and maxvisit and this subject 9900 had a maxsbp at visit 1 of 116 so is definitely picking correct visit of the two.
I'm just not getting a listing of the other subjects and their maxsbp and maxvisit.
Apologies...I should have been clearer with the data upfront.
So the input data set looks like:
ID visit sbp
125 1 122
13000 1 .
1312 1 116
2000 1 302
1384 2 124
...
And for the output data set I'm creating 2 new vars, maxsbp and maxvisit:
ID maxsbp maxvisit
125 122 1
13000 . 1
1312 116 1
2000 . 1 (set to missing b/c exceeds 300)
1384 124 2
Xia, I had a typo - you are correct - the maxsbp is 132 at visit 3.
When I try your code, it's not excluding the values >= 300 and like with the code I wrote above, I'm not always getting the correct visit (for ex. for ID 1384 I'm not getting visit 2, I'm getting a missing value).
Thanks so much for your help!
Hi,
I would suggest post some test data in the form of a datastep and required output:
proc sql;
create table WANT as
select *
from (select ID,VISIT,case when SBP > 300 then . else SBP end as SBP from HAVE where SBP < 300)
group by ID
having VISIT=max(VISIT);
quit;
Good point thanks...I've taken a small subset below...so you may see if anything ge 300, it disqualifies it for a "maxvisit" and a "maxvisit" may in fact return a missing value if the maxsbp is ge 300.
I'd like to try to not use PROC SQL; am seeing if I may be able to rework the code with retain above.
Many thanks for all your help!
Input data set:
ID visit sbp
13480 1 200
13480 2 310
13480 3 .
1352 1 120
1352 2 142
1352 3 132
2152 1 .
2152 2 320
Output data set:
ID maxvisit maxsbp
13480 1 200
1352 2 142
2152 1 .
Sorry I didn't see this posting before - this works really well, thanks!
That would be more simple.
data have;
input ID visit sbp;
cards;
1348 1 200
1348 2 310
1348 3 .
1352 1 120
1352 2 142
1352 3 132
2152 1 .
2152 2 320
;
run;
data want;
set have ;
by id;
retain max MAXVISIT ;
if first.id then do;call missing(max);MAXVISIT=visit;end;
if SBP le 300 and max lt SBP then do;max=SBP;MAXVISIT=visit;end;
if last.id then output;
keep id max MAXVISIT;
run;
Xia Keshan
Thank you to everyone - really appreciate your feedback!!
Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.
If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.