- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
data ONE ;
input id visit sbp ;
cards ;
101 10 95
101 20 .
101 30 90
102 10 100
102 20 105
102 30 .
103 10 90
103 20 90
103 30 115
104 10 .
104 20 100
104 30 .
104 40 105
;
proc sort data = one out = sorted ; by id visit ;run;
I want LOCF BOCF WOCF for the above data I tried my code but I didn't get the desired output, Please any anyone help me write code?
Need output like below :
LOCF means Last observation carryforward
BOCF means Baseline observation carryforward (Here baseline visit as 10)
WOCF means Worst observation carryforward
Obs id visit sbp locf bocf wocf
1 1 01 10 95 95 95 95
2 101 20 . 95 95 95
3 101 30 90 90 90 90
4 102 10 100 100 100 100
5 102 20 105 105 105 105
6 102 30 . 105 100 105
7 103 10 90 90 90 90
8 103 20 90 90 90 90
9 103 30 115 115 115 115
10 104 10 . . . 105
11 104 20 100 100 100 100
12 104 30 . 100 . 105
13 104 40 105 105 105 105
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
/*Assuming I understood what you mean.*/
data ONE ;
input id visit sbp ;
cards ;
101 10 95
101 20 .
101 30 90
102 10 100
102 20 105
102 30 .
103 10 90
103 20 90
103 30 115
104 10 .
104 20 100
104 30 .
104 40 105
;
data want;
do until(last.id);
set one;
by id;
if first.id then baseline=sbp;
max=max(max,sbp);
end;
do until(last.id);
set one;
by id;
if not missing(sbp) then locf=sbp;
bocf=coalesce(sbp,baseline);
wocf=coalesce(sbp,max);
output;
end;
drop max baseline;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Can you show the output data you want?
For simple last obs carry forward, you probably want to use the first value per ID. And after that, use the value of SBP if SBP is not a missing value. You can do that like:
data lf ;
set sorted ;
retain lo;
by id visit ;
if first.id then lo = sbp ;
else if not missing(sbp) then lo=sbp ;
run ;
proc print ;
run ;
I assume BOCF and WOCF are something like "best" and "worst" ?
I think you could take above and adapt it. The ELSE statement would compare the value of SBP to the value of LO and then decide whether or not to update the value of LO.
Next up: SAS Trivia Quiz hosted by SAS on Wednesday May 21.
Register now at https://www.basug.org/events.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You haven't provided any definitions of what you mean by LOCF, BOCF and WOCF. We need those before we can help code for them.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
LOCF (presumably "last" observation carried forward) typically means you want to carry forward the most recent non-missing value of a variable
I guess BOCF and WOCF mean "best" and "worst" observation carried forward. If "best" is highest, and "worst" is lowest, then
dm 'clear log';
data ONE ;
input id visit sbp ;
cards ;
101 10 95
101 20 .
101 30 90
102 10 100
102 20 105
102 30 .
103 10 90
103 20 90
103 30 115
104 10 .
104 20 100
104 30 .
104 40 105
run;
data want;
set one;
by id visit;
retain locf bocf wocf;
locf=ifn(first.id,.,coalesce(lag(sbp),locf));
bocf=ifn(first.id,.,max(bocf,locf));
wocf=ifn(first.id,.,min(wocf,locf));
run;
proc print;
id id;
run;
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set
Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets
--------------------------
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@mkeintz I don't think that logic works.
Those goofy IFN,IFC functions never make much sense to me (ripoffs from EXCEL I think).
If think you are looking for something like:
data want;
set one;
by id visit;
retain locf bocf wocf;
if first.id then call missing(locf,bocf,wocf);
locf=coalesce(sbp,locf);
bocf=max(bocf,locf);
wocf=min(wocf,locf);
run;
Result
id=101 visit sbp locf bocf wocf 10 95 95 95 95 20 . 95 95 95 30 90 90 95 90 id=102 visit sbp locf bocf wocf 10 100 100 100 100 20 105 105 105 100 30 . 105 105 100 id=103 visit sbp locf bocf wocf 10 90 90 90 90 20 90 90 90 90 30 115 115 115 90 id=104 visit sbp locf bocf wocf 10 . . . . 20 100 100 100 100 30 . 100 100 100 40 105 105 105 100
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@Tom wrote:
@mkeintz I don't think that logic works.
Those goofy IFN,IFC functions never make much sense to me (ripoffs from EXCEL I think).
If think you are looking for something like:
data want; set one; by id visit; retain locf bocf wocf; if first.id then call missing(locf,bocf,wocf); locf=coalesce(sbp,locf); bocf=max(bocf,locf); wocf=min(wocf,locf); run;
In this case, I took the OP's request as defining "carry-forward" as excluding the current observation, which I should have clearly stated. And which is also the reason my results differed from yours.
But even if that assumption is wrong, I'm still a fan of the IFN and IFC functions - for good reason. They are particularly beneficial for situations such as conditionally retrieving lags. Since IFN and IFC evaluate both of the alternative return expressions, they always update the lag queue in those expressions, even if the lag is not the returned value. This allows the replacement of these 4 lines
_tempvar=lag(x);
if some-condition then result=_tempvar;
else result=y;
drop _tempvar;
with this single line
result=ifn(some-condition,lag(x),y);
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set
Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets
--------------------------
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
/*Assuming I understood what you mean.*/
data ONE ;
input id visit sbp ;
cards ;
101 10 95
101 20 .
101 30 90
102 10 100
102 20 105
102 30 .
103 10 90
103 20 90
103 30 115
104 10 .
104 20 100
104 30 .
104 40 105
;
data want;
do until(last.id);
set one;
by id;
if first.id then baseline=sbp;
max=max(max,sbp);
end;
do until(last.id);
set one;
by id;
if not missing(sbp) then locf=sbp;
bocf=coalesce(sbp,baseline);
wocf=coalesce(sbp,max);
output;
end;
drop max baseline;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@Ksharp wrote:/*Assuming I understood what you mean.*/ data ONE ; input id visit sbp ; cards ; 101 10 95 101 20 . 101 30 90 102 10 100 102 20 105 102 30 . 103 10 90 103 20 90 103 30 115 104 10 . 104 20 100 104 30 . 104 40 105 ; data want; do until(last.id); set one; by id; if first.id then baseline=sbp; max=max(max,sbp); end; do until(last.id); set one; by id; if not missing(sbp) then locf=sbp; bocf=coalesce(sbp,baseline); wocf=coalesce(sbp,max); output; end; drop max baseline; run;
Thankyou, it helped me in understanding my queries.