Hi there,
I am trying to create flags based on multiple observations for each subject. If any observation for a subject has a missing value, a missing flag should be set for that subject; likewise, if any value is invalid, an invalid flag should be set. In the example below, the valid values for category are AAA, BBB and CCC, so a subject with a category of DDD must be flagged as invalid.
data have;
length id $3. category $3. ;
infile datalines TRUNCOVER;
input id $ category $;
datalines;
101 AAA
101 BBB
101 CCC
201 AAA
201
201 BBB
301 AAA
301
301 DDD
;
run;
data want;
length id $3. missing_flag $1. invalid_flag $1;
infile datalines TRUNCOVER;
input id $ missing_flag $ invalid_flag $ ;
datalines;
101 0 0
201 1 0
301 1 1
;
run;
Thank you in advance for your kind guidance.
Regards,
I would suggest that you are better off with counts rather than 0/1 flags.
To get that:
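data want;
   set have;
   by id;   /* requires have to be sorted by id */
   if first.id then do;
      /* reset the counters at the start of each id */
      missing_count=0;
      invalid_count=0;
   end;
   /* sum statements retain their values across observations */
   if category=' ' then missing_count + 1;
   else if category not in ('AAA', 'BBB', 'CCC') then invalid_count + 1;
   if last.id;   /* keep one row per id */
   keep id missing_count invalid_count;
run;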
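If 0/1 flags are still needed, as in the original request, the counts can be collapsed afterwards. The following is only a sketch: it assumes the `want` dataset produced by the step above, and `want_flags` is a hypothetical output name. The flags come out numeric rather than character as in the requested output.

```sas
data want_flags;   /* hypothetical name for the flag-level output */
   set want;
   /* a comparison evaluates to 1 when true and 0 when false */
   missing_flag = (missing_count > 0);
   invalid_flag = (invalid_count > 0);
   keep id missing_flag invalid_flag;
run;
```

Keeping the counts in `want` and deriving the flags in a separate step means both levels of detail remain available.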
Something like:
data have;
   length id $3. category $3. ;
   infile datalines TRUNCOVER;
   input id $ category $;
   datalines;
101 AAA
101 BBB
101 CCC
201 AAA
201
201 BBB
301 AAA
301
301 DDD
;
run;
data want (keep=id missing_flag invalid_flag);
   set have;
   by id;
   retain missing_flag invalid_flag;
   /* start each id with both flags off */
   if first.id then do;
      missing_flag=0;
      invalid_flag=0;
   end;
   /* a blank category counts as missing, not as invalid */
   if category="" then missing_flag=1;
   else if category not in ("AAA","BBB","CCC") then invalid_flag=1;
   if last.id then output;
run;
Hi Astounding,
Thank you for the proactive solution that includes counters.
Regards,
An alternative solution using SQL code (which can be used in many statistical and database applications such as R, SPSS, MSSQL, etc.):
data have;
length id $3. category $3. ;
infile datalines TRUNCOVER;
input id $ category $;
datalines;
101 AAA
101 BBB
101 CCC
201 AAA
201
201 BBB
301 AAA
301
301 DDD
;
run;
proc sql;
create table flags as
select id
, max(case when category = "" then 1 else 0 end) as missing_flag
, max(case when category ne "" and category not in ('AAA', 'BBB', 'CCC') then 1 else 0 end) as invalid_flag
from have
group by id;
quit;