Hi folks,
I'd like to know how many missing information each patient has and at what location? For example, PAT_ID=3 had missing at three occasions (Count_Missing=3) corresponding to his/her first, third and fourth visits (Location_Missing=1,0,3,4). Missing data is denoted as 99.
Would you please help solve this problem? The output I want are the the last two columns in mock data: Count_Missing and Location_missing. The output can be presented differently as long as rows are marked up in meaningful ways.
Thanks millions in advance.
DATA HAVE; INPUT PAT_ID date_visit HB1 COUNT_ID Count_Missing Location_Missing; CARDS; 1 15370 99 4 1 1 1 15824 12.1 4 1 0 1 16643 11 4 1 0 1 16820 11.6 4 1 0 2 15377 99 3 2 1 2 15888 99 3 2 2 2 16413 13.6 3 2 0 3 16029 99 4 3 1 3 16399 12.8 4 3 0 3 16587 99 4 3 3 3 16771 99 4 3 4 4 16561 12.6 4 3 0 4 16749 99 4 3 2 4 17099 99 4 3 3 4 17301 99 4 3 4
;
99s are missing. That's a key piece of information you missed as well.....
Use a DoW loop to calculate the number of missing and then use the standard data step to do the counter identification.
data want;
do n=1 by 1 until (last.pat_id);
set have (rename=(count_missing =want1 location_missing=want2));
by pat_id;
if first.pat_id then count_missing = 0;
if hb1=99 then count_missing+1;
end;
do until(last.pat_id);
set have (rename=(count_missing =want1 location_missing=want2));
by pat_id;
if first.pat_id then counter=1;
else counter+1;
if hb1=99 then location_missing=counter;
else location_missing = 0;
output;
end;
drop counter;
run;
@Cruise wrote:
The columns in the mock data: Count_Missing, Location_Missing. Sorry for ambiguity. I'll add this info in the post.
The columns in the mock data: Count_Missing, Location_Missing. Sorry for ambiguity. I'll add this info in the post.
99s are missing. That's a key piece of information you missed as well.....
Use a DoW loop to calculate the number of missing and then use the standard data step to do the counter identification.
data want;
do n=1 by 1 until (last.pat_id);
set have (rename=(count_missing =want1 location_missing=want2));
by pat_id;
if first.pat_id then count_missing = 0;
if hb1=99 then count_missing+1;
end;
do until(last.pat_id);
set have (rename=(count_missing =want1 location_missing=want2));
by pat_id;
if first.pat_id then counter=1;
else counter+1;
if hb1=99 then location_missing=counter;
else location_missing = 0;
output;
end;
drop counter;
run;
@Cruise wrote:
The columns in the mock data: Count_Missing, Location_Missing. Sorry for ambiguity. I'll add this info in the post.
Since we count_id in the incoming dataset, one pass should do
DATA HAVE;
INPUT PAT_ID date_visit HB1 COUNT_ID;* Count_Missing Location_Missing;
format date_visit date9.;
CARDS;
1 15370 99 4 1 1
1 15824 12.1 4 1 0
1 16643 11 4 1 0
1 16820 11.6 4 1 0
2 15377 99 3 2 1
2 15888 99 3 2 2
2 16413 13.6 3 2 0
3 16029 99 4 3 1
3 16399 12.8 4 3 0
3 16587 99 4 3 3
3 16771 99 4 3 4
4 16561 12.6 4 3 0
4 16749 99 4 3 2
4 17099 99 4 3 3
4 17301 99 4 3 4
;
data want;
if 0 then set have;
Count_Missing=0;
do _n_=1 by 1 until(last.pat_id);
set have;
by pat_id;
Count_Missing+(hb1=99);
if hb1=99 then Location_Missing=_n_;
else Location_Missing=0;
output;
end;
run;
Hi @Cruise Hmm good catch and my poor attention to detail. I think in that case, you would need a double pass,
DATA HAVE;
INPUT PAT_ID date_visit HB1 COUNT_ID;* Count_Missing Location_Missing;
format date_visit date9.;
CARDS;
1 15370 99 4 1 1
1 15824 12.1 4 1 0
1 16643 11 4 1 0
1 16820 11.6 4 1 0
2 15377 99 3 2 1
2 15888 99 3 2 2
2 16413 13.6 3 2 0
3 16029 99 4 3 1
3 16399 12.8 4 3 0
3 16587 99 4 3 3
3 16771 99 4 3 4
4 16561 12.6 4 3 0
4 16749 99 4 3 2
4 17099 99 4 3 3
4 17301 99 4 3 4
;
data want;
if 0 then set have;
Count_Missing=0;
do until(last.pat_id);
set have;
by pat_id;
Count_Missing+(hb1=99);
end;
do _n_=1 by 1 until(last.pat_id);
set have;
by pat_id;
Location_Missing=(hb1=99)*_n_;
output;
end;
run;
Let me think through a single pass . Sorry about that. My sincere apologies
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.