BookmarkSubscribeRSS Feed
wylamw1
Calcite | Level 5

Hi,

I'm trying to import a xls file which has a very strange format. Variable name starting in cell E4 and actual data starting from E33. I can specify range of data in import by the "Range" but how can I get SAS to read in variable name from cell E4? Instead of manually deleting other info in the xls, is there anyway to call SAS to read in variable name and data correctly?

4 REPLIES 4
Tom
Super User Tom
Super User

So the names are on the 4th row?
And the data starts on the 33rd row?

Read the data using GETNAMES='NO'.

Then read the column headers the same way.

Then use the column headers to make your own names for the the variables.

Let's make a example sheet with 4 columns that start in column "C". With the headers on row 4 and the data starting on row 9.

%let path=c:\downloads;
data example;
  infile cards dsd truncover ;
  input (col1-col6) (:$20.);
cards;
title1
title2
title3
ignore1,ignore2,var1,var2,var3,var4
skip
skip
skip
skip
skip,skip,1,2,3,4
skip,skip,5,6,7,8
;

proc export file="&path\example.xls" replace dbms=xls ;
  putnames='no';
run;

Now let's try to read the data and the names into separated datasets and then generate a RENAME statement to change the automatically generated names to the values in row 4.


proc import datafile="&path\example.xls" out=all dbms=xls;
  getnames='no';
run;

proc import datafile="&path\example.xls" out=names replace dbms=xls;
  getnames='no';
  range='$C4:';
run;
proc import datafile="&path\example.xls" out=data replace dbms=xls;
  getnames='no';
  range='$C9:';
run;

proc transpose data=names(obs=1) out=names_tall ;
  var _all_;
run;

proc sql noprint ;
  select catx('=',nliteral(_name_),nliteral(col1)) into :renames separated by ' ' from names_tall ;
quit;

data want;
  set data ;
  rename &renames ;
run;
Obs   var1                      var2                      var3                      var4

 1    1                         2                         3                         4
 2    5                         6                         7                         8

        Alphabetic List of Variables and Attributes

#    Variable    Type    Len    Format    Informat    Label

1    var1        Char     23    $23.      $23.        C
2    var2        Char     23    $23.      $23.        D
3    var3        Char     23    $23.      $23.        E
4    var4        Char     23    $23.      $23.        F
wylamw1
Calcite | Level 5

Thank you very much, Tom! Your code works well except that my variable name and the data do not align under the same column. Referring back to your data example provided, the variable name starts in Col C (line 4) but the actual data starts in Col D (line 9). To make things more complicated, I have numeric variable name. In addition, I also need to get data from Col A and C starting from line 9 as well, but names of these two variables are not specified in line 4....

Tom
Super User Tom
Super User

@wylamw1 wrote:

Thank you very much, Tom! Your code works well except that my variable name and the data do not align under the same column. Referring back to your data example provided, the variable name starts in Col C (line 4) but the actual data starts in Col D (line 9). To make things more complicated, I have numeric variable name. In addition, I also need to get data from Col A and C starting from line 9 as well, but names of these two variables are not specified in line 4....


Should be possible, if I understand what you are describing.

The last part is simple. Just read from column A and drop variable B and hard code the new names for A and C in the RENAME statement.

To handle "names" that are numbers you can try just using VALIDVARNAME=ANY and and use name literals in your SAS code. So to reference a variable named 123 your code would need to use '123'n.  The NLITERAL() function in my posted code already did that.  Or in the rename step add a letter or underscore so that name becomes valid SAS name.  For example if the "names" are years then  add 'YEAR' as the prefix so that variables are renamed to YEAR2000 instead of '2000'n .

To handle the matching of the name in column C to the variable in column D you might want to use a different method to match up the old and new names. You could use the output of PROC CONTENTS or DICTIONARY.COLUMNS SQL metadata table and figure out how the match on the VARNUM values.  Or read the names starting in column A also and then use the same PROC TRANSPOSE step to get the names from both datasets then matching starting at third observation from one to starting at fourth observation of the other.  So something like this:

proc transpose data=names(obs=1) out=names_tall ;
  var _all_;
run;

proc transpose data=data(obs=0) out=data_tall ;
  var _all_;
run;

data renames ;
  length oldname newname $32 ;
  set names_tall(firstobs=3 rename=(col1=newname));
  set data_tall(firstobs=4 rename=(_name_=oldname));
  if notname(newname)=1 then newname='_'||newname;
run;

proc sql noprint ;
  select catx('=',nliteral(oldname),nliteral(newname)) into :renames separated by ' ' from renames ;
quit;

 

wylamw1
Calcite | Level 5
Thanks, Tom, I'll test the code out! I really appreciate your help!

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 510 views
  • 3 likes
  • 2 in conversation