Hi there,
I have a dataset with medication claims for different patients. If a certain medication is claimed, this shows up in a specific varbiable (yes=1, no=.). Now if within a single patient_id, the wanted variable is "1" once or more, I would like to be it "1" in every single row of that patient_id.
My data looks like this:
data mydata;
input patient_id $ medication $ claim;
datalines;
ID1 medication .
ID1 medication .
ID1 medicationX 1
ID1 medication .
ID2 medication .
ID2 medication .
ID2 medication .
ID3 medicationX 1
ID3 medication .
ID3 medicationX 1
ID3 medication .
;
run;
I would like to obtain a dataset with filled "claim" variable for each patient_id, if "1" is present once or more within a patient_id, that looks like this:
data mydata;
input patient_id $ medication $ claim;
datalines;
ID1 medication 1
ID1 medication 1
ID1 medicationX 1
ID1 medication 1
ID2 medication .
ID2 medication .
ID2 medication .
ID3 medicationX 1
ID3 medication 1
ID3 medicationX 1
ID3 medication 1
;
run;
I hope it is understandable. Thank you for your help.
data mydata;
input patient_id $ medication $12. claim;
datalines;
ID1 medication .
ID1 medication .
ID1 medicationX 1
ID1 medication .
ID2 medication .
ID2 medication .
ID2 medication .
ID3 medicationX 1
ID3 medication .
ID3 medicationX 1
ID3 medication .
;
run;
proc sql;
create table mydata_filled as
select *, max(claim) as claim_filled
from mydata
group by patient_id;
quit;
SQL is one way.
@Tamino wrote:
Hi there,
I have a dataset with medication claims for different patients. If a certain medication is claimed, this shows up in a specific varbiable (yes=1, no=.). Now if within a single patient_id, the wanted variable is "1" once or more, I would like to be it "1" in every single row of that patient_id.
My data looks like this:
data mydata; input patient_id $ medication $ claim; datalines; ID1 medication . ID1 medication . ID1 medicationX 1 ID1 medication . ID2 medication . ID2 medication . ID2 medication . ID3 medicationX 1 ID3 medication . ID3 medicationX 1 ID3 medication . ; run;
I would like to obtain a dataset with filled "claim" variable for each patient_id, if "1" is present once or more within a patient_id, that looks like this:
data mydata; input patient_id $ medication $ claim; datalines; ID1 medication 1 ID1 medication 1 ID1 medicationX 1 ID1 medication 1 ID2 medication . ID2 medication . ID2 medication . ID3 medicationX 1 ID3 medication 1 ID3 medicationX 1 ID3 medication 1 ; run;
I hope it is understandable. Thank you for your help.
data mydata;
input patient_id $ medication $12. claim;
datalines;
ID1 medication .
ID1 medication .
ID1 medicationX 1
ID1 medication .
ID2 medication .
ID2 medication .
ID2 medication .
ID3 medicationX 1
ID3 medication .
ID3 medicationX 1
ID3 medication .
;
run;
proc sql;
create table mydata_filled as
select *, max(claim) as claim_filled
from mydata
group by patient_id;
quit;
SQL is one way.
@Tamino wrote:
Hi there,
I have a dataset with medication claims for different patients. If a certain medication is claimed, this shows up in a specific varbiable (yes=1, no=.). Now if within a single patient_id, the wanted variable is "1" once or more, I would like to be it "1" in every single row of that patient_id.
My data looks like this:
data mydata; input patient_id $ medication $ claim; datalines; ID1 medication . ID1 medication . ID1 medicationX 1 ID1 medication . ID2 medication . ID2 medication . ID2 medication . ID3 medicationX 1 ID3 medication . ID3 medicationX 1 ID3 medication . ; run;
I would like to obtain a dataset with filled "claim" variable for each patient_id, if "1" is present once or more within a patient_id, that looks like this:
data mydata; input patient_id $ medication $ claim; datalines; ID1 medication 1 ID1 medication 1 ID1 medicationX 1 ID1 medication 1 ID2 medication . ID2 medication . ID2 medication . ID3 medicationX 1 ID3 medication 1 ID3 medicationX 1 ID3 medication 1 ; run;
I hope it is understandable. Thank you for your help.
Can also be done in a DATA step, but PROC SQL might be better since you don't have to sort.
data mydata;
input patient_id $ medication :$11. claim;
datalines;
ID1 medication .
ID1 medication .
ID1 medicationX 1
ID1 medication .
ID2 medication .
ID2 medication .
ID2 medication .
ID3 medicationX 1
ID3 medication .
ID3 medicationX 1
ID3 medication .
;
run;
proc sort data = mydata;
by patient_id descending claim;
run;
data want;
set mydata;
by patient_id;
retain new_claim;
if first.patient_id then call missing(new_claim);
if not missing(claim) then new_claim = claim;
run;
Obs patient_id medication new_claim 1 ID1 medicationX 1 2 ID1 medication 1 3 ID1 medication 1 4 ID1 medication 1 5 ID2 medication . 6 ID2 medication . 7 ID2 medication . 8 ID3 medicationX 1 9 ID3 medicationX 1 10 ID3 medication 1 11 ID3 medication 1
This is commonly called last observation carried forward or LOCF.
It can be done with a single data step, like this:
data want;
merge mydata(drop=claim) mydata(keep=patient_id claim where=(claim=1));
by patient_id;
run;
Still another DATA step method, assuming your data is sorted by patient_id:
data want;
do until (last.patient_id);
set mydata;
by patient_id;
_claim = max(claim,_claim);
end;
do until (last.patient_id);
set mydata;
by patient_id;
claim = _claim;
output;
end;
drop _claim;
run;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.