
Converting variable data type while keeping other variables

New Contributor
Posts: 3

Hi All - Thank you in advance for your response.

 

As a new user, I've successfully used PROC IMPORT to import my Excel data. One variable was imported as character, and I tried to convert it to numeric. The conversion worked, but only in a way that dropped all the other variables of interest, which already have the correct data type. Using the new_var=input(ori_var, informat) structure together with the SET and KEEP statements does not work. I've tried placing the new-variable statement both before and after the KEEP statement, without success. Below is an example showing the two locations where I tried adding the new-variable statement (commented out). Hope this makes sense! Ultimately, to solve the issue, I had to go back to the original Excel file and delete the null data that was causing it. In the future I'd love to solve this within the SAS environment.

 

 

data user.base;
set work.base;
*Pb=input(pb_mg_kg,best.);
keep unique_ID siteID StateID CollDate Recreation Food_production Residential Urbanized_area Non_urbanized green_space forests State mg_kg;
*Pb=input(pb_mg_kg,best.);
run;

 

Thanks again for any insight!

Super User
Posts: 17,863

Re: Converting variable data type while keeping other variables

You can use DBSASTYPE to force a type on a single column in Excel. Your code looks fine, except that you didn't include the new variable in your KEEP statement.

 

It doesn't matter where the KEEP statement goes, and since it's the only other statement in the code, the order of the statements doesn't matter.

This should work as expected.

 

data user.base;
set work.base;

Pb=input(pb_mg_kg,best.);

keep unique_ID siteID StateID CollDate Recreation Food_production Residential Urbanized_area Non_urbanized green_space forests State mg_kg PB;

run;
New Contributor
Posts: 3

Re: Converting variable data type while keeping other variables

Hmm, I wonder why it wouldn't convert the data type for me. The log would identify the observations that it couldn't convert and indicated that it listed those as missing, but when I would run proc contents on the data file, the data type would still be listed as "char"...
Example of log comments:
NOTE: Invalid argument to function INPUT at line 444 column 8.
NOTE: Mathematical operations could not be performed at the following places. The results of
the operations have been set to missing values.
Super User
Posts: 17,863

Re: Converting variable data type while keeping other variables

Without seeing the full log I can't say for sure, but my guess is that some of your data isn't formatted the way you expect, or is missing. If so, you could add conditional statements to skip the INPUT for those values, or add the ?? modifier to suppress the invalid-data messages in the log.

 

The full log will explain what's happening.

 

The new variable will be numeric. You cannot change a variable's type in a data step, so if a variable with the name you're using (PB) already exists, that code will not work. You need to create a new variable with a new name and type. If you post the log, someone should be able to help you debug this.

 

Basically, if it didn't work, you're still doing something wrong, because this is the correct methodology.

For example, if I wanted to turn AGE into a character variable this is how I would do it:

 

data want;
set sashelp.class;

age_char = put(age, z2.);

run;

proc contents data=want;
run;


 

 

 

Super User
Posts: 10,516

Re: Converting variable data type while keeping other variables


Eojes wrote:
Hmm, I wonder why it wouldn't convert the data type for me. The log would identify the observations that it couldn't convert and indicated that it listed those as missing, but when I would run proc contents on the data file, the data type would still be listed as "char"...


Your comment "data type would still be listed as char" may indicate REUSE of an existing variable. The target variable name must not already exist in the data set.

 

Another cause of "missing" results is often the actual content of the variable. If your data includes something like a measurement unit, for example 235mg, then INPUT with the "best" informat sees the letters mg and says "not a numeric value". With that type of data you have to use some sort of string function to remove the letters. Another common value type is currency. Dollar signs and commas will be interpreted as character and need a slightly more complex approach.
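As a rough sketch of that kind of cleanup, the step below strips everything except digits, the decimal point and the minus sign before calling INPUT. The data set names (have, want), the variable raw_val and the sample values are all made up for illustration; they are not from the original poster's data.

```sas
/* Hypothetical sample values, for illustration only */
data have;
  length raw_val $12;
  input raw_val $;
  datalines;
235mg
12.5
$1,234.50
<Null>
;
run;

data want;
  set have;
  /* 'k' = keep the listed characters, 'd' = add digits to the list,  */
  /* so only digits, '.' and '-' survive                              */
  digits_only = compress(raw_val, '.-', 'kd');
  /* ?? suppresses the invalid-data notes; anything unreadable        */
  /* becomes a missing value                                          */
  num_val = input(digits_only, ?? 32.);
run;
```

For values that only carry dollar signs and commas, reading the original string with the COMMA informat, as in input(raw_val, comma32.), is usually enough without any string function.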

 

We would have to see some actual values to supply concrete examples in this case. Proc Freq on the character variables may help identify problem values. I generally run the first couple data sets in a new project through Proc Freq on all of the variables just to get a feel for potential issues.
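One possible way to narrow that check down to just the problem values is sketched here. It assumes the character variable is mg_kg in work.base, as in the posted code; problem_values is a hypothetical data set name.

```sas
/* Keep only rows whose mg_kg is non-blank but cannot be read as a number */
data problem_values;
  set work.base;
  if not missing(mg_kg) and missing(input(mg_kg, ?? 32.));
  keep mg_kg;
run;

/* Tabulate the distinct values that fail to convert */
proc freq data=problem_values;
  tables mg_kg;
run;
```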

 

New Contributor
Posts: 3

Re: Converting variable data type while keeping other variables

Hi - thank you for your replies.

 

The variable is unique, and the numeric data only had decimal information - no dollar signs or units. Most of the data for the variable had numeric values, except in a few instances where the word "null" had been used. After running the aforementioned code, in those instances the log said:

 

NOTE: Invalid argument to function INPUT at line 17 column 8.

Top5_LabID=N.S. SiteID=4108 StateID=TX Latitude= Longitude= CollDate=23JUN2010

LandCover1=Planted/Cultivated LandCover2=Pasture/Hay Top5_Depth_cm=N.S. mg_kg=<Null>

 

NOTE: Mathematical operations could not be performed at the following places. The results of

the operations have been set to missing values.

Each place is given by: (Number of times) at (Line):(Column).

16 at 17:8

**********************************

Proc contents following the run, still showed the mg_kg variable as a character type. I went to the original document and deleted the instances of "null" resolving the issue.

 

Super User
Posts: 10,516

Re: Converting variable data type while keeping other variables


The original variable mg_kg was character when you brought it into the data step in your originally posted code. It will not change type after creation. Ever.

The invalid data message is what you would expect any time you try to read something that does not look like a number to SAS for the informat you are using.

 

At the top of this thread you posted you had been attempting

Pb=input(pb_mg_kg,best.);

Judging from the list of variables in that note, you are apparently not doing that conversion. If you do not want the message about invalid data, you would need to use a custom informat to avoid it in the conversion.
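A custom informat for this could look roughly like the sketch below. The informat name nullnum and the output data set work.want are hypothetical, and the source variable is assumed to be mg_kg because that is the name shown in the log note.

```sas
proc format;
  /* UPCASE uppercases the raw value before comparing it to the ranges; */
  /* DEFAULT=32 keeps longer values from being truncated                */
  invalue nullnum (default=32 upcase)
    '<NULL>', 'NULL' = .            /* known placeholder text -> missing */
    other            = [comma32.];  /* everything else read as a number  */
run;

data work.want;
  set work.base;
  Pb = input(mg_kg, nullnum.);      /* new numeric variable */
run;
```

Values that are neither a listed placeholder nor a readable number would still produce the invalid-data note, which is arguably what you want for unexpected content.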

 

Please post the entire data step along with any messages you want diagnosed, and post them in a code box using the {i} menu icon.

 

And it really helps to provide actual example data.

 

Note that depending on your original data source you may have non-printable or non-visible characters sneak into your data.
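A quick way to look for such characters, and to strip them before converting, might be something like this. It is only a sketch: it assumes the variable is mg_kg in work.base, and cleaned and mg_kg_clean are hypothetical names.

```sas
data cleaned;
  set work.base;
  /* Write the first few values in hex to the log so hidden bytes */
  /* (tabs, carriage returns, etc.) become visible                */
  if _n_ <= 5 then put mg_kg= $hex40.;
  /* 'k' = keep, 'w' = printable characters: drops control characters */
  mg_kg_clean = strip(compress(mg_kg, , 'kw'));
  Pb = input(mg_kg_clean, ?? 32.);
run;
```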

The instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon, or attached as text, to show exactly what you have and give us something to test code against.

That data step will clear up any questions about the actual contents of your data. Make sure that your example data includes some of the values that are not converting.

 

And repeat: show the actual code you are running.

Super User
Posts: 6,502

Re: Converting variable data type while keeping other variables

To convert a character variable to a number you will need to make a NEW variable.

Pb=input(pb_mg_kg,comma32.);

If you no longer want the variable with character values in the resulting dataset, use a DROP statement.

drop pb_mg_kg ;

If you want the NEW variable to use the same name as the old variable then you can also add a RENAME statement.

rename pb= pb_mg_kg ;

If you don't want to see notes in the LOG about the inability to convert specific strings, then don't convert those strings.

if upcase(pb_mg_kg) ne '<NULL>' then Pb=input(pb_mg_kg,comma32.);

If you don't want to see notes about ANY invalid strings then add the ?? modifier to the informat.

Pb=input(pb_mg_kg,?? comma32.);
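Put together, those pieces could look something like the step below. This is a sketch that assumes the character variable really is named pb_mg_kg in work.base, as in the first post; adjust the names to match your data.

```sas
data user.base;
  set work.base;
  /* New numeric variable; ?? suppresses notes for strings like <Null> */
  Pb = input(pb_mg_kg, ?? comma32.);
  /* Drop the character original, then give the numeric one its name.  */
  /* DROP is applied before RENAME when the output data set is written */
  drop pb_mg_kg;
  rename Pb = pb_mg_kg;
run;

/* Confirm that pb_mg_kg is now numeric and the other variables remain */
proc contents data=user.base;
run;
```

Because there is no KEEP statement, all the other variables carry through unchanged.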