Solved: Re: How can I get excel column names as sas dataset variable names? if...

Linlin · Posted 01-17-2012 07:54 PM

Hi All,

There are more than 200 columns in my excel file. I want the excel column names as sas variable names and the first

row as sas labels. for the attached excel file, I want A,B,C,D as my sas dataset variables and Name, sex,height,weight

as their labels.

Thank you very much for your help!

art297 · Posted 01-18-2012 09:41 AM

Linlin,

Since the SAS assigned names in such cases are systematic, as are the Excel column names, can't you achieve what you want by using KSharp's proposed method and a slight bit of post-processing?

The following can probably be optimized (both code-wize and using proc datasets rather than a datastep), but it represents one way of accomplishing such renames:

/* create a sample dataset */

data have;

retain f1-f250 (250*1);

output;

run;

/* Build a renames macro variable */

proc sql noprint;

select

case

when input(substr(name,2),8.) < 27 then

name||"="||byte(input(substr(name,2),8.)+64)

when mod(input(substr(name,2),8.),26) eq 0 then

name||"="||

byte(floor((input(substr(name,2),8.)-1)/26)+64)||

byte(26+64)

else name||"="||

byte(floor((input(substr(name,2),8.)-1)/26)+64)||

byte(mod(input(substr(name,2),8.),26)+64)

end

into :renames separated by " "

from

dictionary.columns

where libname="WORK" and

memname="HAVE"

;

quit;

/* Rename the variables */

data want;

set have (rename=(&renames.));

run;

View solution in original post

Ksharp · Posted 01-17-2012 10:56 PM

I am afraid that you need to process it after importing.

There is an option getnames=no for proc import, but adding labels need you do it by hand.

Ksharp

Linlin · Posted 01-18-2012 08:30 AM

Hi Ksharp,

Thank you for your reply. When using "getnames=no" option , sas variables will be F1,F2 ....I want the variables as A,B,C D ......

art297 · Posted 01-18-2012 09:41 AM

Linlin,

Since the SAS assigned names in such cases are systematic, as are the Excel column names, can't you achieve what you want by using KSharp's proposed method and a slight bit of post-processing?

The following can probably be optimized (both code-wize and using proc datasets rather than a datastep), but it represents one way of accomplishing such renames:

/* create a sample dataset */

data have;

retain f1-f250 (250*1);

output;

run;

/* Build a renames macro variable */

proc sql noprint;

select

case

when input(substr(name,2),8.) < 27 then

name||"="||byte(input(substr(name,2),8.)+64)

when mod(input(substr(name,2),8.),26) eq 0 then

name||"="||

byte(floor((input(substr(name,2),8.)-1)/26)+64)||

byte(26+64)

else name||"="||

byte(floor((input(substr(name,2),8.)-1)/26)+64)||

byte(mod(input(substr(name,2),8.),26)+64)

end

into :renames separated by " "

from

dictionary.columns

where libname="WORK" and

memname="HAVE"

;

quit;

/* Rename the variables */

data want;

set have (rename=(&renames.));

run;

Linlin · Posted 01-18-2012 10:38 AM

Dear Art,

Thank you very very much!!! This is exactly what I want.

jeeth79usa · Posted 03-25-2012 06:45 PM

Hi,

I have a similar issue with getnames. I have to append the csv files to data sets on daily basis. I have a macro to do this using proc import. Works fine as of now. Strangely, Some of the files have column headers at the end of the csv (only for few days).

I would like to know if there is a way to read the column headers at the end of the files(dynamically, either first row of the file or last row of the file).

I know we can do this using infile, by creating my own variables. But, using infile, my code will be very lengthy, and I wont be able to use a macro (as each file structure is different). I need this to read only few files. Usually the column headers will be in the firstrow itself. Thans for your help in advance.

Here is my code for import step only.

proc import datafile=csvTest

out= work.&csvfile /*&libto..&csvfile*/

dbms = csv replace;

delimiter= ",";

/* mixed=yes ;*/

getnames=yes;

Ksharp · Posted 03-27-2012 04:52 AM

There is no such option for CSV engine.

But For XLS engine there is an option namerow= to define which row contains variable names.

Data Source	Syntax	Valid Value	Default Value	In PROC IMPORT?	In PROC EXPORT?
XLS	ENDCOL	Last column for data	Last column that contains data	Yes	No
	ENDNAMEROW	Last row for variable names	Same as NAMEROW	Yes	No
	ENDROW	Last row for data	Last row that contains data	Yes	No
	GETNAMES	Yes\|No	Yes	Yes	No
	NAMEROW	First row for variable names	First row that contains variable names	Yes	No
	NEWFILE	Yes\|No	No	No	Yes
	PUTNAMES	Yes\|No	Yes	No	Yes
	RANGE	NAME \| SHEET$UL:LR	First row	Yes	No
	SHEET	Sheet name	First sheet	Yes	Yes
	STARTCOL	First column for data	Last column that contains data	Yes	No
	STARTROW	First row for data	First row that contains data	Yes	No

Ksharp

jeeth79usa · Posted 03-27-2012 10:12 PM

Hi Sharp,

Thanks a lot for the information. Unfortunaly, I have to deal with CSV engine for now. I will defenitely keep the XLS option in the future.

I was told that there is a way to get the names from the csv files dynamically whether they are in the first row or the last row(column positions may change in the future), using _infile_ i.e. by reading the columns header line

I was planning to do something like this in 2 steps. Please let me know if you can provide an example for me or if you can let me know if I am dping it correctly.

filename readcsv path; /*for unix envt*/

data _null_;

if 'ID' then do;

if index(_infile_, 'readcsv') then do;

call symput(fountit,_N_);

call symput("inputstatement",trim(_infile_));

end;

run;

data test;

input &inputstatement;

if _n_ = &fountit then delete;

run;

Thanks in advance.

Tom · Posted 03-28-2012 01:01 AM

Here is code to input use PROC IMPORT and skip the first line with the column names.

proc import out=test file=tmpfile1 dbms=dlm replace ;
  delimiter=',';
  datarow=2;
  getnames=No;
run;

Here is code to read the names from the first row and generate LABEL statements.

filename rename temp;
data _null_;
  infile tmpfile1 truncover dlm=',' dsd ;
  file rename ;
  length label $256 ;
  do var=1 by 1 until (label=' ');
    input label @;
    if label=' ' then stop;
    put 'label var' var '=' label : $quote. ';';
  end;
run;

Here is an example of using PROC DATASETS to apply the labels.

proc datasets lib=work nolist;
  modify test;
%inc rename ;
  run;
quit;

Now if you want the variables names to be A to Z, AA to AZ, ... like Excel column names then you could generate rename statments to rename VAR1 to A, VAR2 to B, .... VAR27 to AA, etc.

Ksharp · Posted 03-28-2012 02:45 AM

What you are doing is to judge which row is a name row.

But I need a sample file or sample data to test. What data you have ?

Ksharp

jeeth79usa · Posted 03-28-2012 10:59 AM

Hello Sharp,

I have several csv files all of them with dirrefent names, and files structure.

Proc import will not work in my case as some files have date_time in charectar.

All the files have date_time and id as common variable. In the future new columns may be added very rarely.

after reading the files, as id is numeric, I am getting missing value. So, I am removing by saying if id = . then delete;

I would like to know how read the name row dynamically, wheather names are first or last row. read the row before bringing the data to SAS, and append the names based on comma dlm to the records.

date_time,id,address,abcdef,aeiou,test

2012-02-01 12:34:55,1234,jkhuc,uhibddeiui,iouyefbyj

2012-03-12 12:34:55,1234,jfccuc,uhibweddsiui,iouyffbyj

2012-02-12 12:34:55,1234,jkxehuc,uhibiui,iouybyj

2012-01-11 12:34:55,1234,jkhdduc,uhibideui,iouyfrbyj

2012-01-22 12:34:55,1234,jkhqqduc,uhiefbiui,ioudfgybyj

Tom · Posted 03-28-2012 02:00 PM

I am not sure what you mean. If your files are in random formats then PROC IMPORT is your friend. Otherwise you will need to spend time for each file deciding how many variables it has, the names of the variables, the type and length of the variables. PROC IMPORT will do that for you and get it right well over 80% of the time.

Now if you want to take a CSV file like your example and write a program to read it here are the steps I do.

1) Copy the row with the variable names into my program. Remove the commas prefix with INPUT and add formats based on what I think the type of the variable.

2) Insert the INPUT statement into a data step like this one.

data new ;

infile 'new.txt' dsd dlm=',' truncover firstobs=2 lrecl=10000 ;

input .... ;

run;

3) For some variables you might need to also add FORMAT and INFORMAT statements.

Here is the data step for the example data you posted.

data new ;

infile 'new.txt' dsd dlm=',' truncover firstobs=2 lrecl=10000 ;

informat date_time anydtdtm.;

format date_time datetime.;

length address $100 abcdef aeiou test $20 ;

input date_time id address abcdef aeiou test ;

run;

jeeth79usa · Posted 03-28-2012 03:40 PM

Hello Tom,

Thanks a bunch for both your responses. Sorry if I was not clear with my Q. Actually I did use proc import, and it did work for ~ 75% of the files(in my case). I have used the following macro to read the names dynamically. some files have the column headers at the EOF, instead of first row. This is the reason I was looking for an example to read the variable names dynamically regardless of top or bottom row, based on a common variable in all the files(in my case ID and date_time ). I have found a similar post in the blog, and I just have accommodate my variable name, read the entire column header, and use the _infile_ to read them dynamically so that I can accommodate the changes in the file structure in future.

https://communities.sas.com/message/13449#13449

Mycode:

%let path = "/sasdata/path/";

%let libto = library;

options mprint;

%macro Readcsv (csvfile=);

%let foundit = %sysfunc(exist(&path.&csvfile..csv));

%if &foundit %then %do;

filename csvTest "&path.&csvfile..csv";

data _null_;

infile csvTest;

input;

if _N_ eq 2 then do;

/* call symput("hasRows", "yes");*/

stop;

end;

run;

%end;

proc import datafile=&path.&csvfile..csv

out=&libto..&csvfile replace;

delimiter= ",";

getnames=yes;

scan=yes

mixed=yes;

run;

/* %end;*/

%mend Readcsv;

%Readcsv (csvfile= file_xyz);

%Readcsv (csvfile= file_abc);

There are some charecter constraints to deal with using proc import, and that is why I am reading the files using infile, and defining their lengths, formats and informats.

Ksharp · Posted 03-29-2012 01:57 AM

I think you could code something like :

filename x 'c:\x.csv';
data _null_;
infile x ;
input  date ?? yymmdd10. ; 
 if missing(date) then do;
                 if _n_ eq 1 then call symputx('start',_infile_);
                  else call symputx('end',_infile_);
                 stop;
                      end;
run;
%put start=&start end=&end;

Ksharp

How can I get excel column names as sas dataset variable names?

How can I get excel column names as sas dataset variable names?

How can I get excel column names as sas dataset variable names?

How can I get excel column names as sas dataset variable names?

How can I get excel column names as sas dataset variable names?

How can I get excel column names as sas dataset variable names?

Re: How can I get excel column names as sas dataset variable names? if the column names are at the end of files??

Re: How can I get excel column names as sas dataset variable names? if the column names are at the end of files??

Re: How can I get excel column names as sas dataset variable names? if the column names are at the end of files??

Re: How can I get excel column names as sas dataset variable names? if the column names are at the end of files??

Re: How can I get excel column names as sas dataset variable names? if the column names are at the end of files??

Re: How can I get excel column names as sas dataset variable names? if the column names are at the end of files??

Re: How can I get excel column names as sas dataset variable names? if the column names are at the end of files??

Re: How can I get excel column names as sas dataset variable names? if the column names are at the end of files??

Re: How can I get excel column names as sas dataset variable names? if the column names are at the end of files??

Catch up on SAS Innovate 2026

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away