DeepakSwain
Pyrite | Level 9

Hi there,

 

I am trying to create flags from multiple observations per subject. If any observation for a subject has a missing value, a missing flag should be set; likewise, if any value is invalid, an invalid flag should be set. In the example below, the valid values for category are AAA, BBB and CCC, so a subject whose category contains DDD must be flagged as invalid.

data have;
length id $3. category $3. ;
infile datalines TRUNCOVER;
input id $ category $;
datalines;
101 AAA
101 BBB
101 CCC
201 AAA
201    
201 BBB
301 AAA
301    
301 DDD
;
run;

data want;
length id $3. missing_flag $1. invalid_flag $1;
infile datalines TRUNCOVER;
input id $  missing_flag $ invalid_flag $ ;
datalines;
101 0 0 
201 1 0
301 1 1
;
run;

Thank you in advance for your kind guidance. 

 

Regards,

Swain
1 ACCEPTED SOLUTION

Astounding
PROC Star

I would suggest that you would be better off if:

  • your "flags" are numeric, not character, and
  • they are counts, not 0/1 flags

To get that:

 

data want;
  set have;
  by id;
  if first.id then do;
    missing_count=0;
    invalid_count=0;
  end;
  if category=' ' then missing_count + 1;
  else if category not in ('AAA', 'BBB', 'CCC') then invalid_count + 1;
  if last.id;
  keep id missing_count invalid_count;
run;
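
If the 0/1 flags from the original question are still needed, the counts can be collapsed into flags in the same step. The following is only a sketch under a few assumptions: the output name want_flags is made up for illustration, HAVE is assumed to be sorted by id already (BY-group processing needs that), and the flags come out numeric rather than character.

```sas
/* Sketch: derive 0/1 flags from the per-id counts.                    */
/* Assumes HAVE is already sorted by id; want_flags is a made-up name. */
data want_flags;
  set have;
  by id;
  if first.id then do;
    missing_count = 0;
    invalid_count = 0;
  end;
  /* The sum statement (var + 1) retains the counter across rows. */
  if missing(category) then missing_count + 1;
  else if category not in ('AAA', 'BBB', 'CCC') then invalid_count + 1;
  /* Keep only the last row of each id, which holds the final counts. */
  if last.id;
  /* A comparison evaluates to 1 (true) or 0 (false). */
  missing_flag = (missing_count > 0);
  invalid_flag = (invalid_count > 0);
  keep id missing_flag invalid_flag;
run;
```

Keeping the counts around until the last row means the same step could also keep missing_count and invalid_count if both views are wanted.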

4 REPLIES
RW9
Diamond | Level 26

Something like:

data have;
  length id $3. category $3. ;
  infile datalines TRUNCOVER;
  input id $ category $;
datalines;
101 AAA
101 BBB
101 CCC
201 AAA
201    
201 BBB
301 AAA
301    
301 DDD
;
run;
data want (keep=id missing_flag invalid_flag);
  set have;
  by id;
  retain missing_flag invalid_flag;
  if first.id then do;
    missing_flag=0;
    invalid_flag=0;
  end;
  /* A missing category sets only the missing flag; it is not treated as invalid. */
  if category="" then missing_flag=1;
  else if category not in ("AAA","BBB","CCC") then invalid_flag=1;
  if last.id then output;
run;
DeepakSwain
Pyrite | Level 9

Hi Astounding,

Thank you for the proactive solution that includes counters.

Regards,

Swain
thomp7050
Pyrite | Level 9

An alternative solution, using SQL code (which may be used in many statistical and database applications such as R, SPSS, MSSQL, etc.):

 

data have;
length id $3. category $3. ;
infile datalines TRUNCOVER;
input id $ category $;
datalines;
101 AAA
101 BBB
101 CCC
201 AAA
201    
201 BBB
301 AAA
301    
301 DDD
;
run;

proc sql;
create table flags as
select id
, max(case when category = "" then 1 else 0 end) as missing_flag
, max(case when category ne "" and category not in ('AAA', 'BBB', 'CCC') then 1 else 0 end) as invalid_flag
from have
group by id;
quit;
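
The same GROUP BY pattern can also return counts instead of 0/1 flags, in line with the suggestion in the accepted solution. This is only a sketch: the table name counts is made up for illustration, and a missing category is deliberately counted as missing only, not as invalid.

```sas
/* Sketch: per-id counts of missing and invalid categories. */
/* The output table name COUNTS is an assumption.           */
proc sql;
  create table counts as
  select id
       , sum(case when category = ' ' then 1 else 0 end) as missing_count
       , sum(case when category ne ' '
                   and category not in ('AAA', 'BBB', 'CCC')
                  then 1 else 0 end) as invalid_count
  from have
  group by id;
quit;
```

Swapping sum for max in both expressions turns the counts back into 0/1 flags.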
