Help using Base SAS procedures

Unable to replace replace missing data

Accepted Solution Solved
Reply
New Contributor
Posts: 2
Accepted Solution

Unable to replace replace missing data

I have been trying to replace the observations which had "NA " in their place with some value ; but what ever I do the value at those points remain "NA"

PROC IMPORT DATAFILE="/folders/myfolders/cars.xls"
	DBMS=XLS
	OUT=WORK.cars_2;
	GETNAMES=YES;
RUN;

data xx;
set cars_2;
if n_cyl=NA then n_cyl=4;		*NOT WORKING; 
if luggage=NA then luggage=9;
run;
proc print data=xx;
run;

Accepted Solutions
Solution
‎01-13-2018 01:05 PM
Super User
Super User
Posts: 8,119

Re: Unable to replace replace missing data

Posted in reply to Sai_Sampath

Take a look at the code you wrote.

if n_cyl=NA then n_cyl=4;		*NOT WORKING; 
if luggage=NA then luggage=9;

So in the first line you are comparing the character variable N_CYL to an unknown variable named NA.  So SAS will create a new variable which will be missing for every observation.  Then in the second line you are comparing the numeric variable LUGGAGE to this new empty variable.  So since NA is missing on every observation then the second line will replace all of the missing values with nines.

1    data test;
2      n_cyl='NA';
3      luggage=.;
4    run;

NOTE: The data set WORK.TEST has 1 observations and 2 variables.
NOTE: DATA statement used (Total process time):
      real time           0.20 seconds
      cpu time            0.01 seconds


5    data want ;
6     set test;
7    if n_cyl=NA then n_cyl=4;		*NOT WORKING;
8    if luggage=NA then luggage=9;
9    run;

NOTE: Numeric values have been converted to character values at the places given by:
      (Line):(Column).
      7:24
NOTE: Character values have been converted to numeric values at the places given by:
      (Line):(Column).
      8:12
NOTE: Variable NA is uninitialized.

The best way to fix this is to NOT put character strings like 'NA' into cells of your Excel file for numeric variables.

 

Do you really mean that the value is missing?  If so then just leave the cell empty in Excel.

 

Do you want to convey that the value is missing, but for a different reason than a simple missing value? In SAS you would use a special missing value, like .A or .B. That is hard to do in Excel and get it to easily translate into SAS.  If you store your raw data in a text file then just using the single letter from the special missing will allow you to store those and normal numeric input will convert the letter into the special missing.  You should run the MISSING statement first to let SAS know which letter to recognize and which to reject as invalid numeric values.

View solution in original post


All Replies
Trusted Advisor
Posts: 1,837

Re: Unable to replace replace missing data

Posted in reply to Sai_Sampath

Is N_CYL a char type variable?

If positive then you nead check for 'NA' instead just NA.

 

If it is numeric with a format, check what code displayes NA ?

PROC Star
Posts: 1,831

Re: Unable to replace replace missing data

right on @Shmuel , neat catch. I am just wondering why the OP hasn't commented the second if as not working like the first one

 

PROC IMPORT DATAFILE="/folders/myfolders/cars.xls"
	DBMS=XLS
	OUT=WORK.cars_2;
	GETNAMES=YES;
RUN;

data xx;
set cars_2;
if n_cyl=NA then n_cyl=4;		*NOT WORKING; 
if luggage=NA then luggage=9; /*this wouldn't work either??????*/
run;
proc print data=xx;
run;
Trusted Advisor
Posts: 1,837

Re: Unable to replace replace missing data

Posted in reply to novinosrin

Pay attention, using NA unquoted treats it as a new numeric variable.

 

"NA" or 'NA' - is treated as a literal.

Super User
Posts: 13,583

Re: Unable to replace replace missing data

Posted in reply to Sai_Sampath

Yet another reason not to use XLXS files.

 

The variable(s) containing "NA" were created as text. So you would have to use a comparison such as

if n_cyl="NA" then n_cyl=4; HOWEVER the variable will actually be text and not numeric. This line will also generate a note in the log that a numeric value was treated as text.

If you want to create a numeric variable then you would a few choices. Easiest at this point is to create new variables by inputting the text:

 

If n_cyl='NA' then Num_n_cyl=4;

else Num_n_cyl = input(n_cyl, 4.);

 

Other options would be to save the xlsx as a CSV file and use a data step to read it using a custom informat that would turn NA into the appropriate value for each variable (or if missing is the desired numeric result a standard SAS numeric informat). Note that Proc import for delimited files will create data step code that you can copy from the log and edit to change informats and such.

 

Solution
‎01-13-2018 01:05 PM
Super User
Super User
Posts: 8,119

Re: Unable to replace replace missing data

Posted in reply to Sai_Sampath

Take a look at the code you wrote.

if n_cyl=NA then n_cyl=4;		*NOT WORKING; 
if luggage=NA then luggage=9;

So in the first line you are comparing the character variable N_CYL to an unknown variable named NA.  So SAS will create a new variable which will be missing for every observation.  Then in the second line you are comparing the numeric variable LUGGAGE to this new empty variable.  So since NA is missing on every observation then the second line will replace all of the missing values with nines.

1    data test;
2      n_cyl='NA';
3      luggage=.;
4    run;

NOTE: The data set WORK.TEST has 1 observations and 2 variables.
NOTE: DATA statement used (Total process time):
      real time           0.20 seconds
      cpu time            0.01 seconds


5    data want ;
6     set test;
7    if n_cyl=NA then n_cyl=4;		*NOT WORKING;
8    if luggage=NA then luggage=9;
9    run;

NOTE: Numeric values have been converted to character values at the places given by:
      (Line):(Column).
      7:24
NOTE: Character values have been converted to numeric values at the places given by:
      (Line):(Column).
      8:12
NOTE: Variable NA is uninitialized.

The best way to fix this is to NOT put character strings like 'NA' into cells of your Excel file for numeric variables.

 

Do you really mean that the value is missing?  If so then just leave the cell empty in Excel.

 

Do you want to convey that the value is missing, but for a different reason than a simple missing value? In SAS you would use a special missing value, like .A or .B. That is hard to do in Excel and get it to easily translate into SAS.  If you store your raw data in a text file then just using the single letter from the special missing will allow you to store those and normal numeric input will convert the letter into the special missing.  You should run the MISSING statement first to let SAS know which letter to recognize and which to reject as invalid numeric values.

New Contributor
Posts: 2

Re: Unable to replace replace missing data

Posted in reply to Sai_Sampath
Thanks Tom
Problem solved ,now I can carry on analysis
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 172 views
  • 1 like
  • 5 in conversation