## Create flag based on multiple observations for each subject

Solved
Frequent Contributor
Posts: 120

Hi there,

I am trying to create flags based on multiple observations for each subject. If any observation for a subject has a missing value, a missing flag should be set for that subject; likewise, if any value is invalid, an invalid flag should be set. In the example below, the valid values for category are AAA, BBB and CCC, so a subject whose category contains DDD must be flagged as invalid.

```
data have;
length id $3. category $3. ;
infile datalines TRUNCOVER;
input id $ category $;
datalines;
101 AAA
101 BBB
101 CCC
201 AAA
201
201 BBB
301 AAA
301
301 DDD
;
run;

data want;
length id $3. missing_flag $1. invalid_flag $1;
infile datalines TRUNCOVER;
input id $  missing_flag $ invalid_flag $ ;
datalines;
101 0 0
201 1 0
301 1 1
;
run;
```

Regards,

Swain

Accepted Solutions
Solution
‎04-27-2017 09:53 AM
Super User
Posts: 6,921

## Re: Create flag based on multiple observations for each subject

I would suggest that you are better off if:

• your "flags" are numeric, not character, and
• they are counts, not 0/1 flags

To get that:

```
data want;
   set have;
   by id;
   if first.id then do;
      missing_count=0;
      invalid_count=0;
   end;
   if category=' ' then missing_count + 1;
   else if category not in ('AAA', 'BBB', 'CCC') then invalid_count + 1;
   if last.id;
   keep id missing_count invalid_count;
run;
```
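If 0/1 flags are still needed, as in the original request, they can be derived from the counts afterwards. This is only a minimal sketch, assuming the `want` data set produced by the step above; the output name `want_flags` is an illustrative choice, not something from the thread:

```
/* Sketch: turn the per-id counts into numeric 0/1 flags.        */
/* Assumes WANT holds id, missing_count and invalid_count.       */
/* WANT_FLAGS is a hypothetical name chosen for this example.    */
data want_flags;
   set want;
   /* A SAS comparison evaluates to 1 (true) or 0 (false) */
   missing_flag = (missing_count > 0);
   invalid_flag = (invalid_count > 0);
   keep id missing_flag invalid_flag;
run;
```

With the sample data this should give 0/0 for 101, 1/0 for 201 and 1/1 for 301, with the flags stored as numeric rather than character variables.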

All Replies
Super User
Posts: 9,813

## Re: Create flag based on multiple observations for each subject

Something like:

```
data have;
length id $3. category $3. ;
infile datalines TRUNCOVER;
input id $ category $;
datalines;
101 AAA
101 BBB
101 CCC
201 AAA
201
201 BBB
301 AAA
301
301 DDD
;
run;

data want (keep=id missing_flag invalid_flag);
set have;
by id;                          /* assumes have is sorted by id */
retain missing_flag invalid_flag;
if first.id then do;            /* reset both flags at the start of each id */
missing_flag=0;
invalid_flag=0;
end;
if category="" then missing_flag=1;
/* a blank category counts as missing, not as invalid */
else if category not in ("AAA","BBB","CCC") then invalid_flag=1;
if last.id then output;         /* one row per id */
run;
```
Frequent Contributor
Posts: 120

## Re: Create flag based on multiple observations for each subject

Hi Astounding,

Thank you for the proactive solution that includes a counter.

Regards,

Swain
Frequent Contributor
Posts: 93

## Re: Create flag based on multiple observations for each subject

An alternative solution, using SQL (which can also be used in many statistical and database applications such as R, SPSS, MSSQL, etc.):

```
data have;
length id $3. category $3. ;
infile datalines TRUNCOVER;
input id $ category $;
datalines;
101 AAA
101 BBB
101 CCC
201 AAA
201
201 BBB
301 AAA
301
301 DDD
;
run;

proc sql;
create table flags as
select id
/* 1 if any row for the id has a blank category */
, max(case when category = "" then 1 else 0 end) as missing_flag
/* 1 if any non-blank category is outside the valid list */
, max(case when category ne "" and category not in ('AAA','BBB','CCC') then 1 else 0 end) as invalid_flag
from have
group by id;
quit;
```
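In the spirit of the counts suggested earlier in the thread, the same kind of query can return counts instead of 0/1 flags by using sum() rather than max(). This is a rough sketch, assuming the valid values are AAA, BBB and CCC as stated in the question; the table name `counts` is made up for illustration:

```
/* Sketch: per-id counts of missing and invalid categories.      */
/* Assumes HAVE has the variables id and category.               */
/* COUNTS is a hypothetical table name chosen for this example.  */
proc sql;
   create table counts as
   select id
        /* number of rows with a blank category */
        , sum(case when category = '' then 1 else 0 end) as missing_count
        /* number of non-blank rows outside the valid list */
        , sum(case when category ne ''
                    and category not in ('AAA', 'BBB', 'CCC')
                   then 1 else 0 end) as invalid_count
   from have
   group by id;
quit;
```

For the sample data this should give 0/0 for 101, 1/0 for 201 and 1/1 for 301. Unlike the DATA step with a BY statement, GROUP BY does not require `have` to be sorted by id first.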
☑ This topic is solved.