Why valid values became missing

Frequent Contributor
Posts: 131

Hi,

I have a set of variables which I am recoding so certain valid values are reinterpreted and displayed as .M (missing), .N (not applicable), and .U (unexpected) valid values in a data cleaning step.

To do this, I first create a new variable n2_original variable name = original variable name.


Then, for each n2 variable, I include if/then logic for the .M, .N, and .U values.

For example:

n2_adv= adv;

if (status=1 or status =6) and adv=" " then n2_adv=.M;

if status^=1 or status^=6 then n2_adv=.N;

My original adv variable has 526 character valid values.  However, after I run this code, my .M and .N frequencies are very unexpected and I'm unsure why.  Ideas??

When I run freqs for the original adv variable, SAS shows the 526 character valid values but in the freqs for the new variable it is counting the 526 character valid values as missing.

If I should clarify, let me know.


Thanks in advance!


Accepted Solutions
Solution
‎09-29-2014 03:57 PM
Super User
Posts: 11,104

Re: Why valid values became missing

I'm a little concerned that you say "526 character valid values", because the example code you show assigns numeric values:

if (status=1 or status =6) and adv=" " then n2_adv=.M;

if status^=1 or status^=6 then n2_adv=.N;

If they were actually character values you should be getting all kinds of conversion warnings.

This line is the likely culprit:

if status^=1 or status^=6 then n2_adv=.N;

The condition "not equal to 1 or not equal to 6" is true for any value of status. For example, when status is 1, it is not equal to 6, so the second part of the comparison is true and the whole condition is true.

What explicit values of status are in your data? Which are unexpected, and which are Not Applicable?
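
To make the point about the OR condition concrete, here is a minimal sketch. The data set name demo and the status values 1, 6, 7, and 3 are made up for illustration; they are not taken from the poster's data.

```sas
/* Hypothetical demo data: the status values below are invented for illustration */
data demo;
  input status;
  /* Original condition: true for every row, because no value equals both 1 and 6 */
  always_true = (status ^= 1 or status ^= 6);
  /* Probable intent: true only when status is neither 1 nor 6 */
  neither = (status ^= 1 and status ^= 6);
  /* Equivalent test that is often easier to read */
  neither_in = (status not in (1, 6));
  datalines;
1
6
7
3
;
run;

proc print data=demo;
run;
```

In this sketch always_true is 1 on all four rows, while neither and neither_in are 1 only for the rows with status 7 and 3. Whether "neither 1 nor 6" is really the rule for Not Applicable depends on the answers to the questions above.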


All Replies
Super Contributor
Posts: 490

Re: Why valid values became missing

Could you share your code and sample output?

You need to clarify more.

Frequent Contributor
Posts: 131

Re: Why valid values became missing

[Image: code.PNG]

Above is the code for recoding .M, .N, and .U values and below are the freqs for the relevant new variables. What I don't understand: even though my original adm01_adv var has 526 character valid values and a SAS freq on the original variable will show that, my new n2_adm01_adv variable doesn't show the 526 character valid values. Instead, I think it's interpreting them as missing...

[Images: n2_status.PNG, n2 adv.PNG]

Super User
Posts: 11,104

Re: Why valid values became missing

The first line of your code is possibly doing what you expect, but the second line is assigning ALL other values to .N, which is a "special missing" value.

Create a format:

Proc format;
  value mymiss
    .N='Not Applicable'
    .U='Unexpected'
    .M='Missing'
  ;
run;

and use that format for

Proc freq;
  tables adm01_status n2_adm01_adv / missing;
  format n2_adm01_adv mymiss.;
run;

you might get a clue. There is a strong hint in your current results: the count of 678 happens to be the number of records with adm01_status values of 1, 6, and 7.

Super User
Posts: 5,353

Re: Why valid values became missing

You haven't decided if you want your new variable n2_adv to be character or numeric.  It can't be both.

If it's character, you can't store .N or .M (although you could store ".N" and ".M").  If it's numeric, you can't store all the other ADV values in it.

What are you trying to create as your new variable?
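
As a rough illustration of the character option, the sketch below declares the new variable as character up front and stores the codes as quoted strings. The data set name demo_char, the length of 8, and the sample rows are all assumptions made for the example.

```sas
/* Hypothetical demo: adv is assumed to be a character variable of length 8 */
data demo_char;
  length adv $8 n2_adv $8;   /* declare n2_adv as character before any assignment */
  input status adv $;
  n2_adv = adv;              /* copy the original value */
  /* Quoted codes are ordinary character strings, not special missing values */
  if status in (1, 6) and adv = ' ' then n2_adv = '.M';
  datalines;
1 A12
6 .
7 B34
;
run;

proc print data=demo_char;
run;
```

Because '.M' stored this way is a non-blank string, procedures such as Proc freq treat it as a valid value rather than as missing.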

Frequent Contributor
Posts: 131

Re: Why valid values became missing

Thanks so far all!

n2_adv should be character.  I can't code .M, .N, and .U for a character variable, then?

Super User
Posts: 11,104

Re: Why valid values became missing

I'm not sure why you want .N but the assignment statement

if status^=1 or status^=6 then n2_adv=".N";

should work to assign ".N". UNLESS you have done something to create the variable as numeric above the section of code you have displayed, such as an assignment without the quotes.

However, your logic with multiple "not equal to 1 or not equal to 6 or not equal to 7" conditions will assign almost all of your values to ".N", and you will not get any ".U" values, because you have a logic flaw.
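
One way to avoid later statements overwriting earlier ones is an if / else chain. The sketch below is only an illustration: the rules for each code (statuses 1, 6, and 7 being applicable, and a non-blank adv on any other status counting as unexpected) are guesses, since the actual rules are not stated in this thread, and the data set name demo_recode and sample rows are made up.

```sas
/* Sketch only: the rules for each code are assumptions, not the poster's actual rules */
data demo_recode;
  length adv $8 n2_adv $8;
  input status adv $;
  n2_adv = adv;                             /* keep valid values by default */
  if status in (1, 6, 7) then do;
    if adv = ' ' then n2_adv = '.M';        /* applicable status but blank value */
  end;
  else if adv = ' ' then n2_adv = '.N';     /* not applicable, and blank as expected */
  else n2_adv = '.U';                       /* not applicable, yet a value is present */
  datalines;
1 A12
6 .
3 .
3 C56
;
run;

proc freq data=demo_recode;
  tables n2_adv / missing;
run;
```

Each row falls into exactly one branch, so a value assigned by one rule cannot be replaced by a later one.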

Frequent Contributor
Posts: 131

Re: Why valid values became missing

thanks so much!  Both for pointing out the logic flaw and character/numeric issue

Super Contributor
Posts: 490

Re: Why valid values became missing

then n2_adv = '.N'

then n2_adv = '.M'

then n2_adv = '.U'
