Hi there,
I am trying to create flags based on multiple observations for each subject. If any observation for a subject has a missing value, a missing flag should be set; likewise, if any value is invalid, an invalid flag should be set. In the example below, the valid values for category are AAA, BBB and CCC, so a category of DDD must be flagged as invalid. The first data step (have) is my input and the second (want) is the desired output.
data have;
length id $3. category $3. ;
infile datalines TRUNCOVER;
input id $ category $;
datalines;
101 AAA
101 BBB
101 CCC
201 AAA
201
201 BBB
301 AAA
301
301 DDD
;
run;
data want;
length id $3. missing_flag $1. invalid_flag $1;
infile datalines TRUNCOVER;
input id $ missing_flag $ invalid_flag $ ;
datalines;
101 0 0
201 1 0
301 1 1
;
run;
Thank you in advance for your kind guidance.
Regards,
I would suggest that you are better off getting counts rather than flags. Getting that:
data want;
set have;
by id;
if first.id then do;
missing_count=0;
invalid_count=0;
end;
if category=' ' then missing_count + 1;
else if category not in ('AAA', 'BBB', 'CCC') then invalid_count + 1;
if last.id;
keep id missing_count invalid_count;
run;
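If 0/1 flags are still needed, as in the original question, the counts can be collapsed in a follow-up step. This is only a minimal sketch: it assumes the want data set produced by the step above, the name want_flags is made up for illustration, and the flags come out numeric rather than $1 character.

```sas
data want_flags;
  set want;
  /* a comparison evaluates to 1 when true and 0 when false */
  missing_flag = (missing_count > 0);
  invalid_flag = (invalid_count > 0);
  keep id missing_flag invalid_flag;
run;
```

Keeping the counts in want and deriving the flags separately means both are available if they are needed later.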
Something like:
data have;
length id $3. category $3. ;
infile datalines TRUNCOVER;
input id $ category $;
datalines;
101 AAA
101 BBB
101 CCC
201 AAA
201
201 BBB
301 AAA
301
301 DDD
;
run;
data want (keep=id missing_flag invalid_flag);
set have;
by id;
retain missing_flag invalid_flag;
* reset both flags at the start of each id;
if first.id then do;
missing_flag=0;
invalid_flag=0;
end;
* a missing value sets only the missing flag, not the invalid flag;
if category="" then missing_flag=1;
else if category not in ("AAA","BBB","CCC") then invalid_flag=1;
if last.id then output;
run;
Hi Astounding,
Thank you for the proactive solution that includes counters.
Regards,
An alternative solution using SQL (which is available in many statistical and database applications, such as R, SPSS, MSSQL, etc.):
data have;
length id $3. category $3. ;
infile datalines TRUNCOVER;
input id $ category $;
datalines;
101 AAA
101 BBB
101 CCC
201 AAA
201
201 BBB
301 AAA
301
301 DDD
;
run;
proc sql;
create table flags as
select id
, max(case when category = "" then 1 else 0 end) as missing_flag
, max(case when category ne "" and category not in ('AAA', 'BBB', 'CCC') then 1 else 0 end) as invalid_flag
from have
group by id;
quit;