BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
ANKH1
Pyrite | Level 9

We have this table, but we need to replace missing values with "-".

data have;
input Question	sequence sequence2 var1 var4 var8;
datalines;
A	1	0.1	.	.	.
Yes	1	1	VALUE	.	VALUE
No	1	2	VALUE	.	VALUE
Ukn	1	3	VALUE	.	VALUE
B	2	0.1	.	.	.
Yes	2	1	.	VALUE	.
No	2	2	.	VALUE	.
Ukn	2	3	.	VALUE	.
C	3	0.1	.	.	.
Yes	3	1	.	.	VALUE
No	3	2	.	.	VALUE
Ukn	3	3	.	.	VALUE
;
run,

But missing values in rows for when sequence2 = 0.1 should remain missing. See the desired output:

Questionsequencesequence2var1var4var8
A10.1   
Yes11VALUE--VALUE
No12VALUE--VALUE
Ukn13VALUE--VALUE
B20.1   
Yes21--VALUE--
No22--VALUE--
Ukn23--VALUE--
C30.1   
Yes31----VALUE
No32----VALUE
Ukn33----VALUE
1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

That you've got a period stored in a character variable indicates some numeric to character conversion of your data somewhere upstream which doesn't address that missings in characters should be represented by a blank.

 

Is the change you're after just for display/a report or do you really want to change the values permanently? 

For a report look into Proc Report and compute blocks that would allow you to change the display of variables based on another variable. 

For actually changing the internal values in your table code along the line of below should do.

data want;
  set have;
  array vars(*) var1 var4 var8;
  do _i=1 to dim(vars);
    if vars(_i)='.' then
      do;
        if sequence2=0.1 then vars(_i)='--';
        else vars(_i)=' ';
      end;
  end;
  drop _i;
run;

 

View solution in original post

5 REPLIES 5
ballardw
Super User

are your variables actually numeric or character?

Your data step attempts to read variables Var1 Var4 and Var8 as numeric. So all those text "VALUE" also become missing because "VALUE" is not acceptable a numeric value.

 

SAS actually has 28 "levels" of missing. The . is just one of them. Special missing can also be indicated as . plus a letter or the underscore character. .A .B ._

So you could assign a custom format to display special missing as a dash, double dash, word or what have you.

This modifies your data step to 1) read question as a character varaible 2) changed "value" to 9 so it doesn't throw errors when read and replaces the . with .A in places you indicated.

The custom format then will display the values as desired.

data have;
input Question $	sequence sequence2 var1 var4 var8;
datalines;
A	   1	0.1	.	.	.
Yes	1	1	9	.a	9
No	   1	2	9	.a	9
Ukn	1	3	9	.a	9
B	   2	0.1	.	.	.
Yes	2	1	.a	9	.a
No	   2	2	.a	9	.a
Ukn	2	3	.a	9	.a
C	   3	0.1	.	.	.
Yes	3	1	.a	.a	9
No	   3	2	.a	.a	9
Ukn	3	3	.a	.a	9
;
run,

proc format;
value mycustommiss
. = ' '
.A= '--'
other=[best5.];
run;

proc print data=have;
 format var1 var4 var8 mycustommiss.;
run;

If you look at the data set in a the table viewer you will see A, not .A. That is okay and expected.

You can assign a special missing like .A (case doesn't matter) .Z or whatever anyplace you would assign .  as a value.

Note that the FORMAT will need to be available when using these values to display as desired.

 

Special missing will not be included in statistics (n, sum, mean std deviation etc) unless you use/request the MISSING option.

 

I have used these to document reasons data is missing such as in a survey where a respondent might refuse a question (.R) or answer "Don't Know" (.D) when I don't want those counted in the results.

 

ANKH1
Pyrite | Level 9

Hi, thank you for responding. var1, var4, var8 etc are character variables. Data=want is from an output, so I cannot change that (it comes from a macro and I rather don't change the macro). I am not sure I understand how to use format to change to . and .a. Is there any other way to program this?

mkeintz
PROC Star

If you do a PROC FREQ of your character variables VAR1, VAR4, and VAR8, remember that the blanks will be treated as missing values by PROC FREQ, but the dashes will not be so treated.  Is that what you want?

 

But if you really have two types of missingness, indicated by the values " " (blank) and "-", and you want them both automatically treated as missing by various PROC's, then you might be well advised to convert those variables to numeric variables, which as @ballardw has pointed out, supports up to 28 distinct missing values.  You can always assign formats to those variables, so that the displayed values are "Value", " ", and "-", even though the internal values are numeric.

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Patrick
Opal | Level 21

That you've got a period stored in a character variable indicates some numeric to character conversion of your data somewhere upstream which doesn't address that missings in characters should be represented by a blank.

 

Is the change you're after just for display/a report or do you really want to change the values permanently? 

For a report look into Proc Report and compute blocks that would allow you to change the display of variables based on another variable. 

For actually changing the internal values in your table code along the line of below should do.

data want;
  set have;
  array vars(*) var1 var4 var8;
  do _i=1 to dim(vars);
    if vars(_i)='.' then
      do;
        if sequence2=0.1 then vars(_i)='--';
        else vars(_i)=' ';
      end;
  end;
  drop _i;
run;

 

ANKH1
Pyrite | Level 9

Thank you for responding and explaining. The change we are after is for display/report. I am not sure I follow how to code this in proc report.
Thanks again for your help.

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 883 views
  • 1 like
  • 4 in conversation