Hi there I have the following dataset where the column name is called Text and the datatype of this column is numeric. How do I convert any column that has the name Text to a Character datatype?
Initial dataset:
Age | Gender | Text |
3 | F | 45 |
4 | M | 63 |
4 | M | 88 |
7 | M | 90 |
I have use the below to identify the datatypes of the columns and I store this data in a table called myd
proc iml; tClass = TableCreateFromDataSet("Work", "mycsv"); colnames = TableGetVarName(tClass); type = TableGetVarType(tClass); isNum = TableIsVarNumeric(tClass); print (type // char(isNum))[c=colnames r={"Type" "IsNum"}]; create myd var {colnames type}; append; close myd;
Then I perform a check:
proc sql; create table test as select * , case when colnames like 'Text%' and Type <> 'C' then 'C' else Type end as NewType from myd ;quit;
result:
COLNAMES | TYPE | NewType |
String | N | C |
Now that I can see the types and am able to identify which columns need an update, I am not sure how to instruct my initial dataset to convert my column called Text from an numeric to Character.
I would first ask why a data set that you want a variable to be numeric was created as text. If you make sure that doesn't happen then you prevent problems instead of having to continuously fix them.
An extremely high percentage of data type problems like this arise from not controlling how data is brought into SAS. How did you read that data into a SAS data set in the first place? It may be easier to reread the data than to "fix" it afterward.
Do you expect to report minimums, maximums, averages, variances, etc. of these numeric values masquerading as text? Not even min or max in this list of stats can be reliably produced from numeric character values.
If there is future analysis to be done, then make the variable into a type that represents ongoing usage rather that the vagueries of what appears to be "crowd-sourced" data.
@Citrine10 wrote:
hi, it can not be controlled as the are thousands of csv files loaded by various parties
Can you explain more what the issue is?
It is very easy to control how variables are defined when reading a CSV file. Those are just simple delimited text files, so just write a data step to read the file and you have full control over the names and types of the variables.
@Citrine10 wrote:
hi, it can not be controlled as the are thousands of csv files loaded by various parties
It has nothing to do with who uploaded them, it is the process used to READ the CSV files. That is controlled by you or the recipients.
Show the code you are using to read ONE of these files. At least.
@Citrine10 wrote:
hi, it can not be controlled as the are thousands of csv files loaded by various parties
No issue at all. CSV files are read with DATA steps, where you have full control over how data is read.
Do not use PROC IMPORT, particularly not when you need consistency.
If you want to convert a number into text you need decide how your want it formatted.
One way is to just use whatever format the variable already had assigned to. This is really easy with VVALUE() function.
data want;
set have;
new_text = vvalue(text);
run;
Calling @Rick_SAS
@Ksharp wrote:
Calling @Rick_SAS
This is not a IML issue. The code here is only a very complicated replacement for a simple query against DICTIONARY.COLUMNS.
proc sql;
create table test as
select
name,
type,
case
when name like 'Text%' and Type <> 'C' then 'C'
else Type
end as Newtype
from dictionary.columns
where libname = "WORK" and memname = "MYCSV"
;
quit;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.