BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
tarheel13
Rhodochrosite | Level 12

is there a way to create a flag for if all records for a subject are missing? Here is some fake example data. I want to flag 002 since its only record is missing.

 

ID.   class

001 HIGH

001 LOW

002 

003 HIGH

004 LOW

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

You have to pass through each ID twice - the first time to check for non-missing class and setting the flag, and the second time to output the data with the flag value.  In the absence of a working data set, here is an untested program.

data want;
  set have (in=firstpass) have (in=secondpass);
  by id;
  retain flag ;
  if first.id then flag=0;
  if firstpass and missing(class)=0 then flag=1;
  if secondpass;
run;

The program requires dataset HAVE to be sorted by ID.  The SET statement (with the associated "BY ID;") reads each ID in HAVE twice, but allows only the secondpass to be kept in the output dataset.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

13 REPLIES 13
maguiremq
SAS Super FREQ

You can use the MISSING function, which will pick up on missing numeric or character values. It's my preferred way since it has that dual-effect.

if missing(class) then flag = "1";
else flag = "0";

/* alternatively, not my preferred method */

if class = "" then flag = "1";
else flag = "0";

Not sure if you're trying to summarize any missing record to ID's, so let us know if that's what you mean.

tarheel13
Rhodochrosite | Level 12
Hey so actually I wanted to create a Boolean variable for the class variable and then use it with SQL max and group by to collapse it to 1 row per ID. At first, I did class=‘HIGH’ but this didn’t work as it won’t capture someone who has all records missing. All records missing also need to be counted as value of 1 for Boolean variable. I tried highind=class=‘HIGH’ but this didn’t work. So I was thinking if I used highind=Missingflag=1 or class=‘HIGH’ then this might work?
maguiremq
SAS Super FREQ

I'm sorry, I'm not quite understanding. What do you want for your first ID? Is there an inherent order? Do you want the highest value on top? Not sure if this is exactly what you're looking for:

 

data have;
infile datalines missover;
input id :$3. class :$4.;
datalines;
001 HIGH
001 LOW
002 
003 HIGH
004 LOW
;

proc sql;
	select
				distinct id,
				class,
				case 
					when class = "HIGH" then 3
					when class = "LOW" then 2
					when missing(class) then 1
						else 0
				end as value
	from
				have
	group by
				id
	having
				value = max(value);
quit;

I'm assuming you want the highest value on top for each record.

 

id class value 
001 HIGH 3 
002   1 
003 HIGH 3 
004 LOW 2 
tarheel13
Rhodochrosite | Level 12
No I wanted a flag for if all their records were missing so that I could use it in a Boolean variable.
PaigeMiller
Diamond | Level 26

@tarheel13 wrote:
No I wanted a flag for if all their records were missing so that I could use it in a Boolean variable.

Please show us the desired output.

--
Paige Miller
tarheel13
Rhodochrosite | Level 12

Since 002 only has 1 record and it's missing, I want the missingflag=1. If there was another ID that had 2 records and both were missing, then I also want that to be missingflag=1. Does that make sense?

mkeintz
PROC Star

You have to pass through each ID twice - the first time to check for non-missing class and setting the flag, and the second time to output the data with the flag value.  In the absence of a working data set, here is an untested program.

data want;
  set have (in=firstpass) have (in=secondpass);
  by id;
  retain flag ;
  if first.id then flag=0;
  if firstpass and missing(class)=0 then flag=1;
  if secondpass;
run;

The program requires dataset HAVE to be sorted by ID.  The SET statement (with the associated "BY ID;") reads each ID in HAVE twice, but allows only the secondpass to be kept in the output dataset.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
tarheel13
Rhodochrosite | Level 12

Does this flag if all records are missing within a subject?

Ksharp
Super User

Sorry. I found an error in my code .

 

data have;
input ID  class $;
cards;
001 HIGH
001 LOW
002  .
003 HIGH
004 LOW
;

proc sql;
create table want as
select *,n(class) = 0 as flag
 from have
  group by id;
quit;
tarheel13
Rhodochrosite | Level 12
Thanks. Huge fan of SQL!
tarheel13
Rhodochrosite | Level 12
Wait doesn’t the argument have to be a numeric value for nmiss function?
Ksharp
Super User
That is for Data Step . But For SQL you can apply nmiss() , n() ... to either character or numeric type variables .
tarheel13
Rhodochrosite | Level 12

Cool, I just ran it and it worked!

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 13 replies
  • 5165 views
  • 2 likes
  • 5 in conversation