BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
MAXISAS
Calcite | Level 5

Hi all,


I have a huge data base including (3000 columns, 500000 rows). Some of the columns are Numeric and some Character. For some reason I want to convert all numeric to characters in my data and retain the order and name of the columns. I know some stuff about how to convert columns one by one but I am looking for a program to do this in one shot. Any help would be appreciated.

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

Try just running two proc transpose steps.

data have ;

   length a b $10 c d 8 e f $10 ;

   input a--f ;

   label c='Middle var';

cards;

x y 1 2 z 4

run;

proc transpose data=have out=middle ;

  var _all_;

run;

proc transpose data=middle out=want (drop=_name_);

var col1;

run;

proc print; run;

View solution in original post

20 REPLIES 20
Jagadishkatam
Amethyst | Level 16

This is a untested code, you may try using arrays as below

data want;

  set have;

array nu(*)  _numeric_;

array ch(*) $ character1-characterxx;

do i = 1 to dim(nu);

if nu(i) ne . then ch(i)=put(n(i),best.);

end;

run;

Thanks,

Jag

Thanks,
Jag
Reeza
Super User

MAXISAS wrote:

For some reason I want to convert all numeric to characters in my data and retain the order and name of the columns.

Some reason?

Tom
Super User Tom
Super User

Try just running two proc transpose steps.

data have ;

   length a b $10 c d 8 e f $10 ;

   input a--f ;

   label c='Middle var';

cards;

x y 1 2 z 4

run;

proc transpose data=have out=middle ;

  var _all_;

run;

proc transpose data=middle out=want (drop=_name_);

var col1;

run;

proc print; run;

Ksharp
Super User

Tom,

OP said he have a huge table . It is not possible for proc transpose under that scenario .

I think SQL is a better choice.

data have ;
   length a b $10 c d 8 e f $10 ;
   input a--f ;
   label c='Middle var';
cards;
x y 1 2 z 4
;
run;

data _null_;
 set sashelp.vcolumn(keep=libname memname name type where=(libname='WORK' and memname='HAVE')) end=last;
 if _n_ eq 1 then call execute('proc sql; create table want as select ');
 if type='char' then call execute(name);
  else call execute('put('||strip(name)||',best32. -l) as '||strip(name));
 if not last then call execute(',');
  else call execute(' from have;quit;');
run;

Xia Keshan

data_null__
Jade | Level 19

You solutions ignores the underlying format associated with each variable you may need to address that deficiency in your program. 

Converting with format is where PROC TRANSPOSE has an advantage although the amount of data and variables may be too much as you suggest.  I reckon the OP needs to decide what is important and perhaps give more details about the conversion from numeric to character.  Maybe for a "some reason" conversion best 32 is adequate we may never know.

art297
Opal | Level 21

I agree with DN and think that the number of variables limitation can be overcome by inserting a counter. e.g.:

data have ;

   length a b $10 c d 8 e f $10 ;

   input a--f ;

   label c='Middle var';

cards;

x y 1 2 z 4

a b 3.3754 2.842 y 77.5

run;

data need;

  set have;

  counter+1;

run;

proc transpose data=need out=middle (where=(_name_ ne 'counter')) ;

  var _all_;

  by counter;

run;

data middle;

  set middle;

  col1=left(col1);

run;

proc transpose data=middle out=want (drop=counter _name_);

var col1;

by counter;

run;

naveen_srini
Quartz | Level 8

I found this on a random search. this may help?

MAXISAS
Calcite | Level 5

Thank you all for your generous helps.

MAXISAS
Calcite | Level 5

Hello again,

I am going to explain a little more about the data and why I want this conversion. As I said the database have almost 3000 columns and I dont know which one is numeric and which one is character (unless I go through them one by one). I was just hoping the code can figure it out itself and transform it.

The whole reason that I am doing this is that I have 20 databases all have the same 2988 variables (the name of variables among all 20 files are the same) and I want to append them together. The important point is that since there are some mistakes in the content of variables (for example variable 13), the type of variables are not the same. For example, variable 13 has numeric type in database 4 and has character type in database 17. when SAS wants to append them together, they put missing data if there is any mismatch. I just want to append all these 20 files without having any missing data even if the data are not matched (cause I can fixed them later). This was the reason I want to convert all my files to character and then append them to avoid loosing any data when I SAS append them together.

Thanks all you guys in advance for your helps,

Bests

Tom
Super User Tom
Super User

If they are actually in text files instead of a database then just read them in as text to begin with.

For example if they are all in CSV files then you could even possibly read them all in one data step.

data want ;

  infile '*.csv' dsd truncover lrecl=32000 ;

  length var1-var3000 $200 ;

  input var1-var3000 ;

run;

MAXISAS
Calcite | Level 5

Thanks Tom,

Actually I have a big file which is csv. and your answer worked. Is it possible to import first line of csv file as variable names in this code?

Thanks for your help

Bests

Tom
Super User Tom
Super User

Usually I just pull the first line into the editor and convert it to a LENGTH statement.  For example if it is comma delimited I change the commas to spaces. Then program looks like.

data want ;

  infile 'myfile.csv' dsd truncover lrecl=32000 ;

  length firstvar nextvar ..... lastvar $200 ;

  input firstvar -- lastvar  ;

run;

If that is too hard you can read the first line with a program and write the names to the log or another file.

data _null_;

  infile 'myfile.csv' obs=1 dsd truncover ;

  file 'myfile.names' ;

  length name $32 ;

  do until (name=' ' );

    input name @;

    put name;

  end;

run;

Peter_C
Rhodochrosite | Level 12

I know you have moved on from the subject line (converting numerics ) to reading from .csv, but just before trying to add value on INPUT consider that SAS makes it easy to convert numbers to strings.

data strings ;

set your.data;

file 'a strings file.txt' dsd lrecl=1000000;

Put (_all_)(:);

run ;

But you want to read......

Suppose you have file big.csv

you are happy to read all cols as string, but want to see their col names.

Have a go with

Data bigstrings(compress=yes);

Infile "big.csv" dsd lrecl=1000000 dlm=','  truncover ;

input (col1-col3000 )( :$200. ) ;

run;

You will see the column headings in thr first row.

That could be followed with a step to prepare "renaming" from that row....

data _null_ ;

Filename gencode "%sysfunc(pathname(work))/gencode.sas" ;

File gencode ;

set bigstrings( obs= 1 );

put 'proc datasets nolist lib=work; '

   / ' modify bigstrings ; rename ' ;

Array col(*) ;

Do c= 1 to dim(col) ;

if not nvalid( col(c), 'V7' ) then continue ;

Put col(c)= ;

End ;

Put ' ; run; quit; ' ;

Stop;

Run;

* now apply the renaming;

%include gencode ;

Ksharp
Super User

Hi  Arthur.T ,

It is easy for SQL.


 
data have ;
   length a b $10 c d 8 e f $10 ;
   input a--f ;
   format d z4.;
   label c='Middle var';
cards;
x y 1 2 z 4
;
run;

data _null_;
 set sashelp.vcolumn(keep=libname memname name type label format where=(libname='WORK' and memname='HAVE')) end=last;
 if _n_ eq 1 then call execute('proc sql; create table want as select ');
 if type='char' then call execute(strip(name)||' label="'||strip(label)||'"');
  else call execute('strip(put('||strip(name)||','||coalescec(strip(format),'best32.')||')) as '||strip(name)||' label="'||strip(label)||'"');
 if not last then call execute(',');
  else call execute(' from have;quit;');
run;

Xia Keshan

sas-innovate-2024.png

Today is the last day to save with the early bird rate! Register today for just $695 - $100 off the standard rate.

 

Plus, pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 20 replies
  • 15371 views
  • 13 likes
  • 10 in conversation