Eojes
Calcite | Level 5

Hi All - Thank you in advance for your response.

 

As a new user, I've successfully used PROC IMPORT to import my Excel data. One variable was imported as character, and I tried to convert it to numeric. I managed to convert the data type, but only in a way that drops all the other variables of interest that already have the correct type. Using the new_var=input(ori_var, informat) structure together with the SET and KEEP statements does not work. I've tried placing the new-variable statement before and after the KEEP statement without success. Below is an example with the two locations where I tried adding the new-variable statement (commented out). Hope this makes sense! Ultimately, to solve the issue, I had to go back to the original Excel file and delete the null data that was causing the problem. In the future I'd love to solve this within the SAS environment.

 

 

data user.base;
set work.base;
*Pb=input(pb_mg_kg,best.);
keep unique_ID siteID StateID CollDate Recreation Food_production Residential Urbanized_area Non_urbanized green_space forests State mg_kg;
*Pb=input(pb_mg_kg,best.);
run;

 

Thanks again for any insight!

Reeza
Super User

You can use DBSASTYPE to force a type on a single column in Excel. Your code looks fine, except you didn't include the new variable in your KEEP statement?

 

It doesn't matter where the KEEP statement goes, and since it's the only other statement in the step, the order of the statements doesn't matter.

This should work as expected.

 

data user.base;
set work.base;

Pb=input(pb_mg_kg,best.);

keep unique_ID siteID StateID CollDate Recreation Food_production Residential Urbanized_area Non_urbanized green_space forests State mg_kg PB;

run;
Eojes
Calcite | Level 5
Hmm, I wonder why it wouldn't convert the data type for me. The log would identify the observations that it couldn't convert and indicated that it listed those as missing, but when I would run proc contents on the data file, the data type would still be listed as "char"...
Example of log comments:
NOTE: Invalid argument to function INPUT at line 444 column 8.
NOTE: Mathematical operations could not be performed at the following places. The results of
the operations have been set to missing values.
Reeza
Super User

Without seeing the log I can't say for sure, but my guess is that some of your data isn't formatted the way you expect or is missing. You could add conditional statements to skip the INPUT in those cases, or add ?? to suppress the log messages about invalid data.

 

The full log will explain what's happening.

 

The new variable will be numeric. You cannot change a variable's type in a data step, so if there is already a variable with the name you're using (PB), that code will not work. You need to create a new variable with a new name and type. If you post the log, someone should be able to help you debug this.

 

Basically, if it didn't work, you're still doing something wrong, because this is the correct methodology.

For example, if I wanted to turn AGE into a character variable this is how I would do it:

 

data want;
set sashelp.class;

age_char = put(age, z2.);

run;

proc contents data=want;
run;


 

 

 

ballardw
Super User

@Eojes wrote:
Hmm, I wonder why it wouldn't convert the data type for me. The log would identify the observations that it couldn't convert and indicated that it listed those as missing, but when I would run proc contents on the data file, the data type would still be listed as "char"...


Your comment "data type would still be listed as char" may indicate REUSE of an existing variable. The target variable name must not already exist in the data set.

 

Another problem with "missing" results is often the actual content of the variable. If your data has something like a measurement unit, for example 235mg, then the INPUT function using "best" sees the letters mg and says "not a numeric value". When you get that type of data, you have to use some sort of string function to remove the letters. Another common value type is currency. Dollar signs and commas will be interpreted as character and need slightly more complex approaches.
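As a rough illustration of the two cases described above, the sketch below strips non-numeric characters with COMPRESS before calling INPUT, and reads a currency-style string with the COMMA informat. The data set name work.cleaned, the variable names, and the sample values are made up for the example; they are not from the original poster's data.

```sas
/* Hypothetical sample values: one with a unit suffix, one formatted as currency */
data work.cleaned;
  infile datalines truncover;
  input raw $20.;

  /* Keep only digits, decimal points and minus signs ('k' = keep, 'd' = digits), */
  /* then convert what is left. ?? suppresses invalid-data notes.                 */
  from_units = input(compress(raw, '.-', 'kd'), ?? best32.);

  /* The COMMA informat ignores dollar signs and commas but not letters, */
  /* so the unit-suffixed value comes back missing here.                 */
  from_money = input(raw, ?? comma32.);

  datalines;
235mg
$1,234.50
;
run;

proc print data=work.cleaned;
run;
```

The COMPRESS approach is blunt: it keeps every digit, decimal point and minus sign wherever it appears, so it is worth checking the results against the raw values before relying on it.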

 

We would have to see some actual values to supply concrete examples in this case. Proc Freq on the character variables may help identify problem values. I generally run the first couple data sets in a new project through Proc Freq on all of the variables just to get a feel for potential issues.
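One possible way to narrow the PROC FREQ output down to just the problem values is to filter first on whether INPUT can read the value. This is only a sketch: work.sample and its values are invented stand-ins for the imported data, and mg_kg is assumed to be the character column in question.

```sas
/* Hypothetical sample standing in for the imported Excel data */
data work.sample;
  length mg_kg $12;
  input mg_kg $;
  datalines;
12.5
<Null>
0.83
N.S.
;
run;

/* Keep only rows where the value is non-blank but does not parse as a number. */
/* ?? suppresses the invalid-data notes while testing each value.              */
data work.problem_values;
  set work.sample;
  if not missing(mg_kg) and missing(input(mg_kg, ?? best32.));
run;

/* Tabulate the distinct values that failed to convert */
proc freq data=work.problem_values;
  tables mg_kg / missing;
run;
```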

 

Eojes
Calcite | Level 5

Hi - thank you for your replies.

 

The variable is unique, and the numeric data only had decimal information - no dollar signs or units. Most of the data for the variable had numeric values, except in a few instances where the word "null" had been used. After running the aforementioned code, in those instances the log said:

 

NOTE: Invalid argument to function INPUT at line 17 column 8.

Top5_LabID=N.S. SiteID=4108 StateID=TX Latitude= Longitude= CollDate=23JUN2010

LandCover1=Planted/Cultivated LandCover2=Pasture/Hay Top5_Depth_cm=N.S. mg_kg=<Null>

 

NOTE: Mathematical operations could not be performed at the following places. The results of

the operations have been set to missing values.

Each place is given by: (Number of times) at (Line):(Column).

16 at 17:8

**********************************

Proc contents following the run, still showed the mg_kg variable as a character type. I went to the original document and deleted the instances of "null" resolving the issue.

 

ballardw
Super User

 

 

@Eojes wrote:
Most of the data for the variable had numeric values, except in a few instances where the word "null" had been used.

 

NOTE: Invalid argument to function INPUT at line 17 column 8.

Top5_LabID=N.S. SiteID=4108 StateID=TX Latitude= Longitude= CollDate=23JUN2010

LandCover1=Planted/Cultivated LandCover2=Pasture/Hay Top5_Depth_cm=N.S. mg_kg=<Null>

 

NOTE: Mathematical operations could not be performed at the following places. The results of

the operations have been set to missing values.

Each place is given by: (Number of times) at (Line):(Column).

16 at 17:8

**********************************

Proc contents following the run, still showed the mg_kg variable as a character type. I went to the original document and deleted the instances of "null" resolving the issue.

 


The original variable mg_kg was character when you brought it into the data step in your originally posted code. It will not change type after creation. Ever.

The invalid data message is what you would expect any time you try to read something that does not look like a number to SAS for the informat you are using.

 

At the top of this thread you posted you had been attempting

Pb=input(pb_mg_kg,best.);

From the list of variables in that note, you are apparently not doing that conversion. If you do not want the message about invalid data, then you would need to use a custom informat to avoid it in the conversion.
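A custom informat along those lines might look like the sketch below. The informat name nullnum, the data set names, and the sample values are hypothetical, and it assumes the placeholder text is "<Null>" or "null" in some mix of upper and lower case.

```sas
/* Hypothetical informat: map the placeholder text to missing, */
/* and read everything else as an ordinary number.             */
proc format;
  invalue nullnum (upcase default=32)
    '<NULL>', 'NULL' = .
    other            = _same_;
run;

/* Hypothetical sample standing in for the imported Excel data */
data work.sample;
  length mg_kg $12;
  input mg_kg $;
  datalines;
12.5
<Null>
0.83
;
run;

data work.converted;
  set work.sample;
  /* New numeric variable; the character mg_kg is left unchanged */
  Pb = input(mg_kg, nullnum.);
run;

proc contents data=work.converted;
run;
```

The UPCASE option makes the match case-insensitive. Values that are neither the placeholder nor a valid number would still produce the invalid-data note, which is arguably useful because it flags anything unexpected.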

 

Please post the entire data step with any messages you want diagnosed. AND post them in a code box using the {i} menu icon.

 

And it really helps to provide actual example data.

 

Note that depending on your original data source you may have non-printable or non-visible characters sneak into your data.

The instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text, so that we can see exactly what you have and test code against it.

That data step will clear up any questions about the actual contents of your data. Make sure that your example data includes some of the values that are not converting.

 

And repeat: show the actual code you are running.

Tom
Super User

To convert a character variable to a number you will need to make a NEW variable.

Pb=input(pb_mg_kg,comma32.);

If you no longer want the variable with character values in the resulting dataset, then use a DROP statement.

drop pb_mg_kg ;

If you want the NEW variable to use the same name as the old variable then you can also add a RENAME statement.

rename pb= pb_mg_kg ;

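If you don't want to see notes about the inability to convert specific strings in the LOG, then don't convert those strings. String comparisons are case-sensitive, so UPCASE is used here to catch both '<null>' and '<Null>'.

if upcase(pb_mg_kg) ne '<NULL>' then Pb=input(pb_mg_kg,comma32.);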

If you don't want to see notes about ANY invalid strings then add the ?? modifier to the informat.

Pb=input(pb_mg_kg,?? comma32.);
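Put together, the pieces above could be combined into one data step roughly as follows. This is a sketch under assumptions: work.sample and work.base_num are hypothetical data set names, the sample values are invented, and pb_mg_kg is assumed to be the name of the character column.

```sas
/* Hypothetical sample standing in for the imported Excel data */
data work.sample;
  length siteID 8 pb_mg_kg $12;
  input siteID pb_mg_kg $;
  datalines;
4108 <Null>
4109 12.5
4110 1,204.75
;
run;

data work.base_num;
  set work.sample;

  /* Create a NEW numeric variable; ?? suppresses invalid-data notes, */
  /* so unreadable strings such as <Null> quietly become missing.     */
  Pb = input(pb_mg_kg, ?? comma32.);

  /* Drop the character version and give the numeric one its name. */
  /* RENAME takes effect when the output data set is written.      */
  drop pb_mg_kg;
  rename Pb = pb_mg_kg;
run;

/* Check that pb_mg_kg is now listed as numeric */
proc contents data=work.base_num;
run;
```

Using ?? hides every conversion failure, not just the known placeholder, so the conditional form may be the safer choice when the data has not been profiled yet.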
