DATA Step, Macro, Functions and more

Trying to create new variable but it is returning missing values for all observations

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 5
Accepted Solution

Trying to create new variable but it is returning missing values for all observations

[ Edited ]

Hi everyone,

 

I am trying to create a new variable but am running into a problem where the new variable returns "." for all observations, even though I can see from my data that these should not be missing.  I am using SAS9.   Please see my code below to illustrate the problem I am having:

 

data mergedtest;
    set work.merged_all;

    if completed="Y" then test1=1;
    if experience= "1 to 5 years" then test2=1;
    if stream= "General Labour" then test3=1;

run;

 

The "completed", "experience", and "stream" variables are all string variables.  SAS does not tell me that there are any errors when I run this data step.  However, when I run the tables command on test1, test2, and test3 (to check to see if variables were created properly), test1 and test2 are fine, but test3 returns all missing values (even though i can see from the raw data that several observations have a value of "General Labour" for the "stream" variable).

 

Has anyone had this issue before?  Any thoughts on what might be causing this problem? Any suggestions would be very appreciated.

Thank you for taking the time to help!

 

Take care,

Yuna


Accepted Solutions
Solution
‎06-22-2017 01:52 PM
Super User
Posts: 5,093

Re: Trying to create new variable but it is returning missing values for all observations

[ Edited ]

So if DEBUG2 is Y, it suggests there is something else in the data besides blanks. 

 

The most common suspects:  a carriage return, or hex null characters instead of blanks. 

 

The way to find this out is to print out the values in hex form.  For example, if STREAM is defined as $ 25, try:

 

hex_format = put(stream, $hex50.);

 

Then run the PROC FREQ on HEX_FORMAT.

 

If you want a quick fix for the program without addressing stray characters in the data, you can always use:

 

if stream =: 'General Labour' then test3 = 1;

View solution in original post


All Replies
Trusted Advisor
Posts: 1,630

Re: Trying to create new variable but it is returning missing values for all observations


Yuna1 wrote:

Hi everyone,

 

I am trying to create a new variable but am running into a problem where the new variable returns "." for all observations, even though I can see from my data that these should not be missing.  I am using SAS9.   Please see my code below to illustrate the problem I am having:

 

data mergedtest;
    set work.merged_all;

    if completed="Y" then test1=1;
    if experience= "1 to 5 years" then test2=1;
    if stream= "General Labour" then test3=1;

run;

 

The "completed", "experience", and "stream" variables are all string variables.  SAS does not tell me that there are any errors when I run this data step.  However, when I run the tables command on test1, test2, and test3 (to check to see if variables were created properly), test1 and test2 are fine, but test3 returns all missing values (even though i can see from the raw data that several observations have a value of "General Labour" for the "stream" variable).


When stream is not "General Labour", test3 gets assign a missing value. Similarly, if completed is not "Y", test1 gets assigned missing. And I'm sure you can see the pattern and so test2 also gets assigned missing.

Occasional Contributor
Posts: 5

Re: Trying to create new variable but it is returning missing values for all observations

[ Edited ]

Hi PaigeMiller,

 

Thank you for your response! I see what you're saying for sure; that makes complete sense.

 

I ran a table command on the "Stream" variable to see how many "General Labour" responses I should expect  and I was hoping that test3 would equal 1 for 11,908 observations (and the rest would be missing).  So I'm at a loss as to what might be going on here.

 

 

Thanks again - any additional thoughts you  may have are, of course, very welcome!

 

Yuna

 

 

Super User
Posts: 5,093

Re: Trying to create new variable but it is returning missing values for all observations

It's time to dig into the values of STREAM, to see what is really contained there.  For example:

 

data want;

set have;

if stream ne 'General Labour' then do;

   if left(stream) = 'General Labour' then debug1='Y';

   if stream =: 'General Labour' then debug2='Y';

end;

run;

 

DEBUG1:  find out whether the values of STREAM contain leading blanks

DEBUG2:  find out whether some stray character such as a carriage return appears after the characters "General Labour"

 

Anything else you could imagine can be tested.  But SAS is telling you that the actual value is actually a little bit different than "General Labour"

 

Trusted Advisor
Posts: 1,400

Re: Trying to create new variable but it is returning missing values for all observations

Some more options to test beyond given by @Astounding:

 

- can it be that "General Labour" has two spaces between the words or some unvisible characters at end? 
   Check by

 

 -  if compbl(stream) = 'General Labour' then debug3='Y';

 -  if length(strip(stream)) ne length('General Labour') then debug4='Y';

Occasional Contributor
Posts: 5

Re: Trying to create new variable but it is returning missing values for all observations

Thank you for your help! These are handy troubleshooting tips to know about.
Occasional Contributor
Posts: 5

Re: Trying to create new variable but it is returning missing values for all observations

Thank you very much!

I tried your suggested code and got the following result:
For the General Labour values, debug1 returns a blank (i.e. not equal to Y) and debug2=Y. Does this suggest I have a stray character after "General Labour"?

Thank you again!
Solution
‎06-22-2017 01:52 PM
Super User
Posts: 5,093

Re: Trying to create new variable but it is returning missing values for all observations

[ Edited ]

So if DEBUG2 is Y, it suggests there is something else in the data besides blanks. 

 

The most common suspects:  a carriage return, or hex null characters instead of blanks. 

 

The way to find this out is to print out the values in hex form.  For example, if STREAM is defined as $ 25, try:

 

hex_format = put(stream, $hex50.);

 

Then run the PROC FREQ on HEX_FORMAT.

 

If you want a quick fix for the program without addressing stray characters in the data, you can always use:

 

if stream =: 'General Labour' then test3 = 1;

Occasional Contributor
Posts: 5

Re: Trying to create new variable but it is returning missing values for all observations

Thank you so much! This was really helpful. I realized there are some hidden carriage returns that were causing problems. I used the compress command (newvar=compress(stream,'0D0A'x)) to get rid of them and it seems to be working. Thank you for helping me to trouble shoot and offering your advice!
Super User
Posts: 10,538

Re: Trying to create new variable but it is returning missing values for all observations

Perhaps instead of

if completed="Y" then test1=1;

You actually want

 

Test1 = (completed="Y");

This will assign a value of 1 ("true", "yes" similar boolean interpretations apply) when completed equals 'Y' and 0 when completed is not equal to 'Y'.

 

And similar for you other test variables.

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 9 replies
  • 131 views
  • 3 likes
  • 5 in conversation