I have patients test values recorded at their each of visits (n_visits). There are missing in test values in different patterns. I want to keep all the available information while removing rows with missing while tracking for patients who had no test value at all visits (patient 4 and 6).
What would be a solution to flag patients 4 and 6 to indicate that they had no test values at all while keeping other patients information yet getting rid of missing rows?
Thanks!!!!
data have;
input patient_id n_visits value value_now;
cards;
1 1 22 21
1 2 23 23
1 3 . .
1 4 . .
2 1 . .
2 2 . .
2 3 56 .
2 4 . .
2 5 . .
3 1 23 .
3 2 . .
3 3 . .
4 1 . .
5 1 22 .
6 1 . .
6 2 . .
6 3 . .
;
data want;
input patient_id n_visits value value_now flag_missing;
cards;
1 1 22 21 0
1 2 23 23 0
2 3 56 . 0
3 1 23 . 0
4 1 . . 1
5 1 22 . 0
6 1 . . 1
;
data have;
input patient_id n_visits value value_now;
cards;
1 1 22 21
1 2 23 23
1 3 . .
1 4 . .
2 1 . .
2 2 . .
2 3 56 .
2 4 . .
2 5 . .
3 1 23 .
3 2 . .
3 3 . .
4 1 . .
5 1 22 .
6 1 . .
6 2 . .
6 3 . .
;
proc sql;
create table want(drop=r) as
select *,missing(value) as flag_missing,range(calculated flag_missing) as r
from have
group by patient_id
having ((r=0)*n_visits)=min(n_visits) or (value ne .)
order by patient_id,n_visits;
quit;
Assuming I understood what you mean.
data have;
input patient_id n_visits value value_now;
cards;
1 1 22 21
1 2 23 23
1 3 . .
1 4 . .
2 1 . .
2 2 . .
2 3 56 .
2 4 . .
2 5 . .
3 1 23 .
3 2 . .
3 3 . .
4 1 . .
5 1 22 .
6 1 . .
6 2 . .
6 3 . .
;
proc sql;
create table temp as
select *
from have
where value is not missing or value_now is not missing;
create table missing as
select * from have where patient_id not in (select patient_id from temp)
order by 1,2;
quit;
data missing_first;
set missing;
by patient_id;
if first.patient_id;
run;
data want;
set temp missing_first;
by patient_id n_visits;
run;
You want to delete all records with value missing, EXCEPT keep one record for any patient with all missing values. This program does that:
data have;
input patient_id n_visits value value_now;
cards;
1 1 22 21
1 2 23 23
1 3 . .
1 4 . .
2 1 . .
2 2 . .
2 3 56 .
2 4 . .
2 5 . .
3 1 23 .
3 2 . .
3 3 . .
4 1 . .
5 1 22 .
6 1 . .
6 2 . .
6 3 . .
;
data want (drop=_:);
set have;
by patient_id;
if first.patient_id then _nzeroes=0;
if value^=. then flag=0;
_nzeroes+(flag=0);
if last.patient_id and _nzeroes=0 then flag=1;
if flag=1 or flag=0;
run;
However, instead of keeping the first record for id 6, this keeps the last - which you might actually prefer, since n_visit becomes the number of visit records for that patient. But if you actually want to keep the first record:
data have;
input patient_id n_visits value value_now;
cards;
1 1 22 21
1 2 23 23
1 3 . .
1 4 . .
2 1 . .
2 2 . .
2 3 56 .
2 4 . .
2 5 . .
3 1 23 .
3 2 . .
3 3 . .
4 1 . .
5 1 22 .
6 1 . .
6 2 . .
6 3 . .
;
data want (drop=_:);
set have;
by patient_id;
if first.patient_id then do;
_nzeroes=0;
_firstrec=_n_;
end;
if value^=. then flag=0;
_nzeroes+(flag=0);
if last.patient_id and _nzeroes=0 then do;
set have point=_firstrec;
flag=1;
end;
if flag=1 or flag=0;
run;
data have;
input patient_id n_visits value value_now;
cards;
1 1 22 21
1 2 23 23
1 3 . .
1 4 . .
2 1 . .
2 2 . .
2 3 56 .
2 4 . .
2 5 . .
3 1 23 .
3 2 . .
3 3 . .
4 1 . .
5 1 22 .
6 1 . .
6 2 . .
6 3 . .
;
proc sql;
create table want(drop=r) as
select *,missing(value) as flag_missing,range(calculated flag_missing) as r
from have
group by patient_id
having ((r=0)*n_visits)=min(n_visits) or (value ne .)
order by patient_id,n_visits;
quit;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.