BookmarkSubscribeRSS Feed
Citrine10
Obsidian | Level 7

Hi there I have the following dataset where the column name is called Text and the datatype of this column is numeric. How do I convert any column that has the name Text to a Character datatype?

Initial dataset:

Age Gender Text
3 F 45
4 M 63
4 M 88
7 M 90

I have use the below to identify the datatypes of the columns and I store this data in a table called myd

proc iml;
tClass = TableCreateFromDataSet("Work", "mycsv");
colnames = TableGetVarName(tClass);
type = TableGetVarType(tClass);
isNum = TableIsVarNumeric(tClass);
print (type // char(isNum))[c=colnames r={"Type" "IsNum"}];
create myd var {colnames type};
append;
close myd;

Then I perform a check:

proc sql;
create table test as 
select * ,
case when colnames like 'Text%' and Type <> 'C' then  'C'
else Type
end as NewType
from myd
;quit;

result:

COLNAMES TYPE NewType
String N C

Now that I can see the types and am able to identify which columns need an update, I am not sure how to instruct my initial dataset to convert my column called Text from an numeric to Character.

 

10 REPLIES 10
ballardw
Super User

I would first ask why a data set that you want a variable to be numeric was created as text. If you make sure that doesn't happen then you prevent problems instead of having to continuously fix them.

 

An extremely high percentage of data type problems like this arise from not controlling how data is brought into SAS. How did you read that data into a SAS data set in the first place? It may be easier to reread the data than to "fix" it afterward.

Citrine10
Obsidian | Level 7
hi, it can not be controlled as the are thousands of csv files loaded by various parties
mkeintz
PROC Star

Do you expect to report minimums, maximums, averages, variances, etc. of these numeric values masquerading as text?  Not even min or max in this list of stats can be reliably produced from numeric character values.

 

If there is future analysis to be done, then make the variable into a type that represents ongoing usage rather that the vagueries of what appears to be "crowd-sourced" data.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Tom
Super User Tom
Super User

@Citrine10 wrote:
hi, it can not be controlled as the are thousands of csv files loaded by various parties

Can you explain more what the issue is?

It is very easy to control how variables are defined when reading a CSV file.  Those are just simple delimited text files, so just write a data step to read the file and you have full control over the names and types of the variables.

ballardw
Super User

@Citrine10 wrote:
hi, it can not be controlled as the are thousands of csv files loaded by various parties

It has nothing to do with who uploaded them, it is the process used to READ the CSV files. That is controlled by you or the recipients.

 

Show the code you are using to read ONE of these files. At least.

Kurt_Bremser
Super User

@Citrine10 wrote:
hi, it can not be controlled as the are thousands of csv files loaded by various parties

No issue at all. CSV files are read with DATA steps, where you have full control over how data is read.

Do not use PROC IMPORT, particularly not when you need consistency.

Tom
Super User Tom
Super User

If you want to convert a number into text you need decide how your want it formatted.

One way is to just use whatever format the variable already had assigned to.  This is really easy with VVALUE() function.

data want;
  set have;
  new_text = vvalue(text);
run;
Ksharp
Super User

Calling @Rick_SAS 

Kurt_Bremser
Super User

@Ksharp wrote:

Calling @Rick_SAS 


This is not a IML issue. The code here is only a very complicated replacement for a simple query against DICTIONARY.COLUMNS.

 

@Citrine10 :

proc sql;
create table test as
  select
    name,
    type,
    case 
      when name like 'Text%' and Type <> 'C' then  'C'
      else Type
    end as Newtype
  from dictionary.columns
  where libname = "WORK" and memname = "MYCSV"
;
quit;

 

 

RacheLGomez123
Fluorite | Level 6
The General Process for Software Updates
Step 1: Learn about the Availability of a New Release.
Step 2: Run the SAS System Evaluation Tool.
Step 3: Run a Deployment Registry Report.
Step 4: Assess the Impact of Upgrading the Software at Your Site.
Step 5: Request a New Product Release or Maintenance Release.
Regards,
Rachel Gomez

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 10 replies
  • 1098 views
  • 0 likes
  • 7 in conversation