BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Linlin
Lapis Lazuli | Level 10

Hi All,

There are more than 200 columns in my excel file.  I want the excel column names as sas variable names and the first

row as sas labels. for the attached excel file, I want A,B,C,D as my sas dataset variables and Name, sex,height,weight

as their labels.

Thank you very much for your help!

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

Linlin,

Since the SAS assigned names in such cases are systematic, as are the Excel column names, can't you achieve what you want by using KSharp's proposed method and a slight bit of post-processing?

The following can probably be optimized (both code-wize and using proc datasets rather than a datastep), but it represents one way of accomplishing such renames:

/* create a sample dataset */

data have;

  retain f1-f250 (250*1);

  output;

run;

/* Build a renames macro variable */

proc sql noprint;

    select

      case

        when input(substr(name,2),8.) < 27 then

          name||"="||byte(input(substr(name,2),8.)+64)

              when mod(input(substr(name,2),8.),26) eq 0 then

          name||"="||

          byte(floor((input(substr(name,2),8.)-1)/26)+64)||

          byte(26+64)

        else  name||"="||

          byte(floor((input(substr(name,2),8.)-1)/26)+64)||

          byte(mod(input(substr(name,2),8.),26)+64)

        end

          into :renames separated by " "

            from

              dictionary.columns

                where libname="WORK" and

                      memname="HAVE"

  ;

quit;

/* Rename the variables */

data want;

  set have (rename=(&renames.));

run;

View solution in original post

13 REPLIES 13
Ksharp
Super User

I am afraid that you need to process it after importing.

There is an option getnames=no for proc import, but adding labels need you do it by hand.

Ksharp

Linlin
Lapis Lazuli | Level 10

Hi Ksharp,

Thank you for your reply. When using "getnames=no" option , sas variables will be F1,F2 ....I want the variables as A,B,C D ......

art297
Opal | Level 21

Linlin,

Since the SAS assigned names in such cases are systematic, as are the Excel column names, can't you achieve what you want by using KSharp's proposed method and a slight bit of post-processing?

The following can probably be optimized (both code-wize and using proc datasets rather than a datastep), but it represents one way of accomplishing such renames:

/* create a sample dataset */

data have;

  retain f1-f250 (250*1);

  output;

run;

/* Build a renames macro variable */

proc sql noprint;

    select

      case

        when input(substr(name,2),8.) < 27 then

          name||"="||byte(input(substr(name,2),8.)+64)

              when mod(input(substr(name,2),8.),26) eq 0 then

          name||"="||

          byte(floor((input(substr(name,2),8.)-1)/26)+64)||

          byte(26+64)

        else  name||"="||

          byte(floor((input(substr(name,2),8.)-1)/26)+64)||

          byte(mod(input(substr(name,2),8.),26)+64)

        end

          into :renames separated by " "

            from

              dictionary.columns

                where libname="WORK" and

                      memname="HAVE"

  ;

quit;

/* Rename the variables */

data want;

  set have (rename=(&renames.));

run;

Linlin
Lapis Lazuli | Level 10

Dear Art,

Thank you very very much!!! This is exactly what I want.

jeeth79usa
Calcite | Level 5

Hi,

I have a similar issue with getnames. I have to append the csv files to data sets on daily basis. I have a macro to do this using proc import. Works fine as of now. Strangely, Some of the files have  column headers at the end of the csv (only for few days).

I would like to know if there is a way to read the column headers at the end of the files(dynamically, either first row of the file or last row of the file).

I know we can do this using infile, by creating my own variables. But, using infile, my code will be very lengthy, and I wont be able to use a macro (as each file structure is different). I need this to read only few files. Usually the column headers will be in the firstrow itself. Thans for your help in advance.

Here is my code for import step only.

          proc import datafile=csvTest

            out= work.&csvfile /*&libto..&csvfile*/

            dbms = csv replace;

            delimiter= ",";

/*            mixed=yes ;*/

            getnames=yes;

Ksharp
Super User

There is no such option for CSV engine.

But For XLS engine there is an option namerow= to define which row contains variable names.

Data SourceSyntaxValid ValueDefault ValueIn PROC IMPORT?In PROC EXPORT?
XLSENDCOLLast column for dataLast column that contains dataYesNo
ENDNAMEROWLast row for variable namesSame as NAMEROWYesNo
ENDROWLast row for dataLast row that contains dataYesNo
GETNAMESYes|NoYesYesNo
NAMEROWFirst row for variable namesFirst row that contains variable namesYesNo
NEWFILEYes|NoNoNoYes
PUTNAMESYes|NoYesNoYes
RANGENAME | SHEET$UL:LRFirst rowYesNo
SHEETSheet nameFirst sheetYesYes
STARTCOLFirst column for dataLast column that contains dataYesNo
STARTROWFirst row for dataFirst row that contains dataYesNo

Ksharp

jeeth79usa
Calcite | Level 5

Hi Sharp,

Thanks a lot for the information. Unfortunaly, I have to deal with CSV engine for now. I will defenitely keep the XLS option in the future.

I was told that there is a way to get the names from the csv files dynamically whether they are in the first row or the last row(column positions may change in the future), using _infile_ i.e. by reading the columns header line

I was planning to do something like this in 2 steps. Please let me know if you can provide an example for me or if you can let me know if I am dping it correctly.

filename readcsv path; /*for unix envt*/

data _null_;

  if 'ID' then do;

   if index(_infile_, 'readcsv') then do;

call symput(fountit,_N_);

call symput("inputstatement",trim(_infile_));

end;

end;

run;

data test;

   input &inputstatement;

if _n_ = &fountit then delete;

run;

Thanks in advance.

Tom
Super User Tom
Super User

Here is code to input use PROC IMPORT and skip the first line with the column names.

proc import out=test file=tmpfile1 dbms=dlm replace ;

  delimiter=',';

  datarow=2;

  getnames=No;

run;

Here is code to read the names from the first row and generate LABEL statements.

filename rename temp;

data _null_;

  infile tmpfile1 truncover dlm=',' dsd ;

  file rename ;

  length label $256 ;

  do var=1 by 1 until (label=' ');

    input label @;

    if label=' ' then stop;

    put 'label var' var '=' label : $quote. ';';

  end;

run;

Here is an example of using PROC DATASETS to apply the labels.

proc datasets lib=work nolist;

  modify test;

%inc rename ;

  run;

quit;

Now if you want the variables names to be A to Z, AA to AZ, ... like Excel column names then you could generate rename statments to rename VAR1 to A, VAR2 to B, .... VAR27 to AA, etc.

Ksharp
Super User

What you are doing is to judge which row is a name row.

But I need a sample file or sample data to test. What data you have ?

Ksharp

jeeth79usa
Calcite | Level 5

Hello Sharp,

I have several csv files all of them with dirrefent names, and files structure.

Proc import will not work in my case as some files have date_time in charectar.

All the files have date_time and id as common variable. In the future new columns may be added very rarely.

after reading the files, as id is numeric, I am getting missing value. So, I am removing by saying if id = . then delete;

I would like to know how read the name row dynamically, wheather names are first or last row. read the row before bringing the data to SAS, and append the names based on comma dlm to the records.

date_time,id,address,abcdef,aeiou,test

2012-02-01 12:34:55,1234,jkhuc,uhibddeiui,iouyefbyj

2012-03-12 12:34:55,1234,jfccuc,uhibweddsiui,iouyffbyj

2012-02-12 12:34:55,1234,jkxehuc,uhibiui,iouybyj

2012-01-11 12:34:55,1234,jkhdduc,uhibideui,iouyfrbyj

2012-01-22 12:34:55,1234,jkhqqduc,uhiefbiui,ioudfgybyj

Tom
Super User Tom
Super User

I am not sure what you mean. If your files are in random formats then PROC IMPORT is your friend. Otherwise you will need to spend time for each file deciding how many variables it has, the names of the variables, the type and length of the variables.  PROC IMPORT will do that for you and get it right well over 80% of the time.

Now if you want to take a CSV file like your example and write a program to read it here are the steps I do.

1) Copy the row with the variable names into my program.  Remove the commas prefix with INPUT and add formats based on what I think the type of the variable.

2) Insert the INPUT statement into a data step like this one.

data new ;

   infile 'new.txt' dsd dlm=',' truncover firstobs=2 lrecl=10000 ;

   input .... ;

run;

3) For some variables you might need to also add FORMAT and INFORMAT statements.

Here is the data step for the example data you posted.

data new ;

  infile 'new.txt' dsd dlm=',' truncover firstobs=2 lrecl=10000 ;

   informat date_time anydtdtm.;

   format date_time datetime.;

   length address $100 abcdef aeiou test $20 ;

  input date_time id address abcdef aeiou test ;

run;

jeeth79usa
Calcite | Level 5

Hello Tom,

Thanks a bunch for both your responses. Sorry if I was not clear with my Q. Actually I did use proc import, and it did work for ~ 75% of the files(in my case). I have used the following macro to read the names dynamically. some files have the column headers at the EOF, instead of first row. This is the reason I was looking for an example to read the variable names dynamically regardless of top or bottom row, based on a common variable in all the files(in my case ID and date_time ). I have found a similar post in the blog, and I just have accommodate my variable name, read the entire column header, and use the _infile_ to read them dynamically so that I can accommodate the changes in the file structure in future.

https://communities.sas.com/message/13449#13449

Mycode:

%let path = "/sasdata/path/";

%let libto = library;

options mprint;

%macro Readcsv (csvfile=);

   %let foundit = %sysfunc(exist(&path.&csvfile..csv));

   %if &foundit %then %do;

      filename csvTest "&path.&csvfile..csv";

      data _null_;

         infile csvTest;

         input;

         if _N_ eq 2 then do;

/*            call symput("hasRows", "yes");*/

            stop;

         end;

      run;

         %end;

          proc import datafile=&path.&csvfile..csv

            out=&libto..&csvfile replace;

            delimiter= ",";

            getnames=yes;

            scan=yes

            mixed=yes;

         run;

/*   %end;*/

%mend Readcsv;

%Readcsv (csvfile= file_xyz);

%Readcsv (csvfile= file_abc);

There are some charecter constraints to deal with using proc import, and that is why I am reading the files using infile, and defining their lengths, formats and informats.

Ksharp
Super User

I think you could code something like :

filename x 'c:\x.csv';
data _null_;
infile x ;
input  date ?? yymmdd10. ; 
 if missing(date) then do;
                 if _n_ eq 1 then call symputx('start',_infile_);
                  else call symputx('end',_infile_);
                 stop;
                      end;
run;
%put start=&start end=&end;


Ksharp

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 13 replies
  • 15816 views
  • 4 likes
  • 5 in conversation