BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
SSin
Obsidian | Level 7

I have a data set containing "Temperature" variable. This data is continuous numerical data with values, but SAS doesn't see it this way. I tried the following command from a Youtube video to try and correct this:

 

data WORK.IMPORT2; set WORK.IMPORT1;
* Convert  Temperature from character to numeric*
Temperature  = input(Temperature_n, bestw.);
run;

 

No errors show up when I run this, but in new data set all values of Temperature and Temperature_n are missing (showing as ".")

1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

@SSin 

Posting sample data in form of a SAS data step often leads to better suited answers quicker. 

I've now created such sample data (Have). If that's not representative for the data you're dealing with then can you please post an amended Have data set? That would make it so much easier to give you an answer.

data have;
  infile datalines truncover;
  input temperature $15.;
  datalines;
15.5
23.7
N/A
 
32.7
;
run;

data want;
  set have;
  Temperature_n=input(temperature,?? best32.);
/*  if missing(Temperature_n) then delete;*/
run;
proc print data=want;
run;

View solution in original post

12 REPLIES 12
Patrick
Opal | Level 21

@SSin 

Try: Temperature  = input(Temperature_n, best32.);

 

If above doesn't help then please post exact samples of how the string currently looks like which you want to convert into a number.

SSin
Obsidian | Level 7
Thank you for your suggestion. Temperature = input(Temperature_n, best32.); does not work either. The data contains "Temperature" with responses of numeric values. However, there are some responses with "NA". I guess it might be the reason why SAS is treating "Temperature" as character variable. The other data set that does not contain "NA" response, SAS reads it as numeric variable.

Now, I am trying to drop these "NA" response. If you do not mind could you please let me know how to drop those "NA" from my data. Thank you very much again.
Patrick
Opal | Level 21

@SSin 

Posting sample data in form of a SAS data step often leads to better suited answers quicker. 

I've now created such sample data (Have). If that's not representative for the data you're dealing with then can you please post an amended Have data set? That would make it so much easier to give you an answer.

data have;
  infile datalines truncover;
  input temperature $15.;
  datalines;
15.5
23.7
N/A
 
32.7
;
run;

data want;
  set have;
  Temperature_n=input(temperature,?? best32.);
/*  if missing(Temperature_n) then delete;*/
run;
proc print data=want;
run;
SSin
Obsidian | Level 7
Thank you very much for your patience. I really appreciate. Actually I was still learning how to make a sample data :-(. Will try your new suggestion.
SSin
Obsidian | Level 7
This one works perfect. Thank you very much indeed.
Astounding
PROC Star
You have found the right tools, but need to switch them around a bit:

temperature_n = input(temperature, 32.) ;
ballardw
Super User

@SSin wrote:

I have a data set containing "Temperature" variable. This data is continuous numerical data with values, but SAS doesn't see it this way. I tried the following command from a Youtube video to try and correct this:

 

data WORK.IMPORT2; set WORK.IMPORT1;
* Convert  Temperature from character to numeric*
Temperature  = input(Temperature_n, bestw.);
run;

 

No errors show up when I run this, but in new data set all values of Temperature and Temperature_n are missing (showing as ".")


I am going to suggest going back a step or two. You have a data set Work.Import1. How was it created? The explanation likely lies with that step. The data set name sort of implies it was the result of reading an external file, and possibly using Proc Import. There may have been something in the choice of code used to read or import the file that could be addressed at an earlier step.

 

Reading data correctly the first time saves a lot of time "fixing" it later. And if something as simple as temperature is misbehaving I suspect that unless there were very few variables you may find other problems with other variables.

SSin
Obsidian | Level 7
Thank you for the suggestion. My data set contains numeric responses but there are some responses with "NA". May be that is why SAS is treating it as character variable. Now I am trying to drop those "NA"s. If you do not mind, could you please suggest one easiest way to drop them. Thank you very much again.
andreas_lds
Jade | Level 19

@SSin wrote:
Thank you for the suggestion. My data set contains numeric responses but there are some responses with "NA". May be that is why SAS is treating it as character variable. Now I am trying to drop those "NA"s. If you do not mind, could you please suggest one easiest way to drop them. Thank you very much again.

@Patrick  already explained how to get rid of "NA": by using ?? in the input function. The ?? take care of anything that can't be converted to a number, the function returns a missing values (shown as dot). If you want to remove observations having "NA" as value, then use

if Temperature_n = "NA" then delete;

before the input-function.

 

 

Astounding
PROC Star

It depends on what you mean by "drop".  The DELETE statement gets rid of the entire observation, but that would throw out valid values for other variables.  The original solution I gave you was:

 

temperature_n = input(temperature, 32.) ;

 

That works by giving TEMPERATURE_N a missing value when TEMPERATURE is "NA" (or any other invalid numeric).  The modification that @Patrick suggested, adding ??, will suppress messages in the log about the invalid values, and give you the same result.  Best of all would be:

 

if temperature ne "NA" then temperature_n = input(temperature, 32.) ;

 

That gives you a missing TEMPERATURE_N for "NA" but it will still give you a message in the log for other invalid numerics.  That would be valuable to know about, if there are values that could be corrected.  For example, if the units were combined and the TEMPERATURE variable contained "98.5F" a message in the log would point out values that are not "NA" but are still invalid numerics so you could inspect those and see if there is something you can do about them.

ballardw
Super User

@SSin wrote:
Thank you for the suggestion. My data set contains numeric responses but there are some responses with "NA". May be that is why SAS is treating it as character variable. Now I am trying to drop those "NA"s. If you do not mind, could you please suggest one easiest way to drop them. Thank you very much again.

Which reinforces my guess that Proc Import was likely involved. If any value in the rows used to guess the variable type looks like text then the result will be text. An important consideration is that Proc Import uses generic rules to "guess" as the variable type and length. And sometimes because of different reasons such as your case, guesses incorrectly. Two additional frequent problems when using Proc Import to read files of a common structure is that the variable types might be different because of the content the file and character variables get assigned different lengths. Either of which can cause problems if you need to combine the two sets for any use.

 

Use a data step to read the data. You can use the ?? modifier with the informat to suppress error messages or my preferred method to handle known behaviors, a custom informat such as

 

proc format library = work;
invalue mytemp
'N/A' = .
other = [f5.]
;
run;

data have;
  infile datalines truncover;
  input temperature mytemp.;
  datalines;
15.5
23.7
N/A
 
32.7
;
run;

Why the custom informat? Then I don't get errors encountering the expected value but would if there was something else like MAX. The informat modifier would hide the error message. This might mean that I would want to supply a specific numeric value for MAX for use in computations such as 100.

 

Another reason is that instead of a simple missing 'N/A' = . in the informat I could specify a special missing value from .A to .Z which would tell me, and I could create a format to display, an N/A was recorded which in some cases would be different than a simple missing (which might actually mean a data collection error if the field is required). The special missing would be treated the same as missing for any calculation.

 

And if your file is something like Excel, SAVE it to a CSV file before attempting to read.

SSin
Obsidian | Level 7
Thank you very much. Will try your new suggestion. Yes I was using Porc Import to read data in excel sheet.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 12 replies
  • 9167 views
  • 9 likes
  • 5 in conversation