Hi there,
I have a dataset with medication claims for different patients. If a certain medication is claimed, this shows up in a specific varbiable (yes=1, no=.). Now if within a single patient_id, the wanted variable is "1" once or more, I would like to be it "1" in every single row of that patient_id.
My data looks like this:
data mydata;
input patient_id $ medication $ claim;
datalines;
ID1 medication .
ID1 medication .
ID1 medicationX 1
ID1 medication .
ID2 medication .
ID2 medication .
ID2 medication .
ID3 medicationX 1
ID3 medication .
ID3 medicationX 1
ID3 medication .
;
run;
I would like to obtain a dataset with filled "claim" variable for each patient_id, if "1" is present once or more within a patient_id, that looks like this:
data mydata;
input patient_id $ medication $ claim;
datalines;
ID1 medication 1
ID1 medication 1
ID1 medicationX 1
ID1 medication 1
ID2 medication .
ID2 medication .
ID2 medication .
ID3 medicationX 1
ID3 medication 1
ID3 medicationX 1
ID3 medication 1
;
run;
I hope it is understandable. Thank you for your help.
data mydata;
input patient_id $ medication $12. claim;
datalines;
ID1 medication .
ID1 medication .
ID1 medicationX 1
ID1 medication .
ID2 medication .
ID2 medication .
ID2 medication .
ID3 medicationX 1
ID3 medication .
ID3 medicationX 1
ID3 medication .
;
run;
proc sql;
create table mydata_filled as
select *, max(claim) as claim_filled
from mydata
group by patient_id;
quit;
SQL is one way.
@Tamino wrote:
Hi there,
I have a dataset with medication claims for different patients. If a certain medication is claimed, this shows up in a specific varbiable (yes=1, no=.). Now if within a single patient_id, the wanted variable is "1" once or more, I would like to be it "1" in every single row of that patient_id.
My data looks like this:
data mydata; input patient_id $ medication $ claim; datalines; ID1 medication . ID1 medication . ID1 medicationX 1 ID1 medication . ID2 medication . ID2 medication . ID2 medication . ID3 medicationX 1 ID3 medication . ID3 medicationX 1 ID3 medication . ; run;
I would like to obtain a dataset with filled "claim" variable for each patient_id, if "1" is present once or more within a patient_id, that looks like this:
data mydata; input patient_id $ medication $ claim; datalines; ID1 medication 1 ID1 medication 1 ID1 medicationX 1 ID1 medication 1 ID2 medication . ID2 medication . ID2 medication . ID3 medicationX 1 ID3 medication 1 ID3 medicationX 1 ID3 medication 1 ; run;
I hope it is understandable. Thank you for your help.
data mydata;
input patient_id $ medication $12. claim;
datalines;
ID1 medication .
ID1 medication .
ID1 medicationX 1
ID1 medication .
ID2 medication .
ID2 medication .
ID2 medication .
ID3 medicationX 1
ID3 medication .
ID3 medicationX 1
ID3 medication .
;
run;
proc sql;
create table mydata_filled as
select *, max(claim) as claim_filled
from mydata
group by patient_id;
quit;
SQL is one way.
@Tamino wrote:
Hi there,
I have a dataset with medication claims for different patients. If a certain medication is claimed, this shows up in a specific varbiable (yes=1, no=.). Now if within a single patient_id, the wanted variable is "1" once or more, I would like to be it "1" in every single row of that patient_id.
My data looks like this:
data mydata; input patient_id $ medication $ claim; datalines; ID1 medication . ID1 medication . ID1 medicationX 1 ID1 medication . ID2 medication . ID2 medication . ID2 medication . ID3 medicationX 1 ID3 medication . ID3 medicationX 1 ID3 medication . ; run;
I would like to obtain a dataset with filled "claim" variable for each patient_id, if "1" is present once or more within a patient_id, that looks like this:
data mydata; input patient_id $ medication $ claim; datalines; ID1 medication 1 ID1 medication 1 ID1 medicationX 1 ID1 medication 1 ID2 medication . ID2 medication . ID2 medication . ID3 medicationX 1 ID3 medication 1 ID3 medicationX 1 ID3 medication 1 ; run;
I hope it is understandable. Thank you for your help.
Can also be done in a DATA step, but PROC SQL might be better since you don't have to sort.
data mydata;
input patient_id $ medication :$11. claim;
datalines;
ID1 medication .
ID1 medication .
ID1 medicationX 1
ID1 medication .
ID2 medication .
ID2 medication .
ID2 medication .
ID3 medicationX 1
ID3 medication .
ID3 medicationX 1
ID3 medication .
;
run;
proc sort data = mydata;
by patient_id descending claim;
run;
data want;
set mydata;
by patient_id;
retain new_claim;
if first.patient_id then call missing(new_claim);
if not missing(claim) then new_claim = claim;
run;
Obs patient_id medication new_claim 1 ID1 medicationX 1 2 ID1 medication 1 3 ID1 medication 1 4 ID1 medication 1 5 ID2 medication . 6 ID2 medication . 7 ID2 medication . 8 ID3 medicationX 1 9 ID3 medicationX 1 10 ID3 medication 1 11 ID3 medication 1
This is commonly called last observation carried forward or LOCF.
It can be done with a single data step, like this:
data want;
merge mydata(drop=claim) mydata(keep=patient_id claim where=(claim=1));
by patient_id;
run;
Still another DATA step method, assuming your data is sorted by patient_id:
data want;
do until (last.patient_id);
set mydata;
by patient_id;
_claim = max(claim,_claim);
end;
do until (last.patient_id);
set mydata;
by patient_id;
claim = _claim;
output;
end;
drop _claim;
run;
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.