DATA Step, Macro, Functions and more

how to replace . in label with # for Excel output?

Reply
Frequent Contributor
Posts: 91

how to replace . in label with # for Excel output?

SAS experts-  I'm asking for your help again...

I need to use PROC EXPORT to send SAS datasets to Excel, but Excel will not accept a variable label that contains a dot.  Instead, Excel wants the label to have # to produce a dot (log file below).  http://support.sas.com/kb/16/842.html

Currently, I am eliminating the dot on the way in from Oracle, but will have to solve this soon. My folks can deal with "E. Coli" not having a dot, but "Grain Size % in 2.76-3.91 um" requires it to make sense.

I don't want the # in my SAS labels, and am using the split255 macro for output. http://support.sas.com/kb/36/904.html

I have been messing with PROC DATASETS and trying to use the TRANSLATE function, but so far have been only able either to delete the labels, or make them "translate('label', '#', '.')"  (oops!)

Thanks so much for your help!

Wendy T

1    DATA TEST_DOT ;
2    INPUT NODOT ONEDOT TWODOT SEPDOT ;
3    LABEL NODOT='No Dot' ONEDOT='One.Dot' TWODOT='Two..Dots' SEPDOT='Sep.arated
3  !  . Dots' ;
4    DATALINES ;

NOTE: The data set WORK.TEST_DOT has 1 observations and 4 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      user cpu time       0.00 seconds
      system cpu time     0.00 seconds
      Memory                            247k
      OS Memory                         5232k
      Timestamp            4/30/2012  11:12:09 AM


6    ;
7    RUN ;
8
9    PROC EXPORT DATA=WORK.TEST_DOT OUTFILE="&data_req\tester.xlsx"
9  ! DBMS=EXCEL2007 REPLACE LABEL ;
10   RUN ;

ERROR: Bind parameters: Invalid column name: One.Dot
WARNING: File deletion failed for _IMEX_.TEST_DOT.DATA.
ERROR: Export unsuccessful.  See SAS Log for details.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE EXPORT used (Total process time):
      real time           1.24 seconds
      user cpu time       0.03 seconds
      system cpu time     0.18 seconds
      Memory                            527k
      OS Memory                         5744k
      Timestamp            4/30/2012  11:12:10 AM

11
12


13   DATA TEST_POUND ;
14     INPUT NODOT ONEDOT TWODOT SEPDOT ;
15       LABEL NODOT='No Dot' ONEDOT='One#Dot' TWODOT='Two##Dots'
15 ! SEPDOT='Sep#arated # Dots' ;
16     DATALINES ;

NOTE: The data set WORK.TEST_POUND has 1 observations and 4 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      user cpu time       0.00 seconds
      system cpu time     0.00 seconds
      Memory                            189k
      OS Memory                         5744k
      Timestamp            4/30/2012  11:12:10 AM


18   ;
19   RUN ;
20
21   PROC EXPORT DATA=WORK.TEST_POUND OUTFILE="&data_req\tester.xlsx"
21 ! DBMS=EXCEL2007 REPLACE LABEL ;
22   RUN ;

NOTE: "TEST_POUND" range/sheet was successfully created.
NOTE: PROCEDURE EXPORT used (Total process time):
      real time           0.99 seconds
      user cpu time       0.00 seconds
      system cpu time     0.04 seconds
      Memory                            479k
      OS Memory                         5744k
      Timestamp            4/30/2012  11:12:11 AM

PROC Star
Posts: 7,366

Re: how to replace . in label with # for Excel output?

Wendy,

If you can get away with exporting an xls rather than an xlsx file, you could just use the xls engine instead.  The following seems to accomplish exactly what you are looking for:

PROC EXPORT DATA=WORK.TEST_DOT OUTFILE="&data_req\tester.xls"

DBMS=xls REPLACE LABEL ;

RUN ;

Frequent Contributor
Posts: 91

Re: how to replace . in label with # for Excel output?

Art-

Drat!  I just tried DBMS=xls as you suggested, and had two problems.  The first problem is that a format I generated wasn't found.  Inconvenient, but not that much of a problem.

Unfortunately, the second problem is a deal-breaker.  Apparently, DBMS=XLS is restricted to a maximum of 65,536 lines of data, and almost everything I export is going to be larger (usually much larger).

Thanks for the idea!

Wendy

Respected Advisor
Posts: 4,662

Re: how to replace . in label with # for Excel output?

The problem seems to occur after the table headings are created in Excel. The following, where I create an empty table and then insert into it, worked for me.

data test;
INPUT NODOT ONEDOT TWODOT SEPDOT ;
LABEL NODOT='No Dot' ONEDOT='One.Dot' TWODOT='Two..Dots' SEPDOT='Sep.arated.Dots' ;
DATALINES ;
1 2 3 4
;

libname xl Excel "&sasforum.\Datasets\DotTest.xlsx" version=2007 scan_text=no;

proc sql dquote=ansi;
drop table xl.dotSet;
select catx(' ', name, 'as', quote(trim(label))) into :renames separated by ','
from dictionary.columns where libname='WORK' and memname='TEST';
create table xl.dotSet as
select &renames. from test where 0; /* creates an empty table */
insert into xl.dotSet select * from test;
quit;

libname xl clear;

PG
Frequent Contributor
Posts: 91

Re: how to replace . in label with # for Excel output?

PGStats-

I was able to run your code successfully for one dataset, but so far have been unable to get it to run inside the %split255 macro.

My connection just cut off for some reason, so I will have another try tomorrow.

Thanks for the idea!

Wendy T

Respected Advisor
Posts: 4,662

Re: how to replace . in label with # for Excel output?

Wendy, revisiting this problem, I realized that the tranlation of dots into pound characters is not only a workaround, but a feature of the import procedure. When you export column names containing pound characters to Excel, they appear as dots in Excel and when you import columns names containing dots from Excel they are translated to pound characters in the variable labels. Therefore, I think the best thing to do when you want to export columns names with dots, is to translate the names beforehand. This is what the following macro does.

%macro saveXlsx(dsn,xlsxFile,lib=WORK);
libname _xl Excel "&xlsxFile." version=2007;

proc sql noprint;
drop table _xl.&dsn.;
select catx(" ", name, "as", quote(translate(trim(label),"#",".")))
into :renames separated by ","
from dictionary.columns
where libname=upcase("&lib.") and memname=upcase("&dsn.");
proc sql dquote=ansi;
create table _xl.&dsn. as
select &renames. from &lib..&dsn.;
quit;

libname _xl clear;
%mend saveXlsx;

PG

PG
Frequent Contributor
Posts: 91

Re: how to replace . in label with # for Excel output?

PGStats-

Thanks so much for the macro!

Unfortunately, I have not explained my problem very well...

I have SAS datasets with labels (not variable names) that contain dots, and I need to output the SAS tables to Excel using the labels as headers (not the variable names).  I am using an adaptation of the %split255 macro to send the SAS datasets to Excel, as quite a few of the SAS datasets have more than 255 columns.  The %split255 macro uses PROC EXPORT, and so far, I have been unable to adapt that section of the macro to use the proc sql solution you suggested yesterday.

A little illustration may help.

dataset PERM.DATASET

Labels:         Station Name     Date of Collection     E. Coli     Secchi-in. 

Variables:     STA          SASDATE          EC          SEC

Using PROC EXPORT, I can choose either the name or label for the header row (and the names are not a viable option).  This is where I run into the 'feature' where Excel wants # in order to display . in the header row.

So, to get    Station Name     Date and Time     E. Coli     Secchi-in.   as the header row in Excel,

The SAS labels need to be  Station Name     Date and Time     E# Coli     Secchi-in#

However, I don't want my SAS labels to actually have # in PERM.DATASET1.

I'm trying to get to

WORK.DATASET1 with

Labels:         Station Name     Date and Time     E# Coli     Secchi-in# 

Variables:     STA               SASDATE          EC     SEC

So I can use PROC EXPORT with LABEL in the %split255 macro to output any dataset, regardless of width.

Sorry to be unclear, and thank you so much for your help!

Wendy T.

Respected Advisor
Posts: 4,662

Re: how to replace . in label with # for Excel output?

Hello, I think you can modify the split255 macro to call  the following macro :

%macro saveXlsx(dsn,xlsxFile,sheet=&dsn,lib=WORK);
libname _xl Excel "&xlsxFile." version=2007;

proc sql noprint;
drop table _xl.&sheet.;
select catx(" ", name, "as", quote(translate(trim(label),"#",".")))
into :renames separated by ","
from dictionary.columns
where libname=upcase("&lib.") and memname=upcase("&dsn.");
proc sql dquote=ansi;
create table _xl.&sheet. as
select &renames. from &lib..&dsn.;
quit;

libname _xl clear;
%mend saveXlsx;

Simply replace :

proc export data=subset&i outfile="&xlsname"

dbms=excel replace;

sheet="&sheetname&i";

run; quit;

in split255 with :

%saveXlsx(subset&i,&xlsname,sheet=&sheetname&i);

Hope it works!

PG

PG
Frequent Contributor
Posts: 91

Re: how to replace . in label with # for Excel output?

PGStats-

It worked beautifully... until I had labels too long to be SAS column names.  Drat.

SYMBOLGEN:  Macro variable RENAMES resolves to STATION as "Station Name",DATE

            as "Date Numeric",SASDATE as "Date",TIME as "Time",SJR_TYPE as

            "Sample Type",SUB_PROJ as "Subproject Name",WATERT as "Water

            Temp",WATERT_ as "Water Temp remark"

going well ... until...

NOTE: Line generated by the macro variable "RENAMES".

16     "Secchi-in remark",DEPTH10 as "Depth to 10% of Surface Light",DEPTH10_ as

16  !  "Depth to 10% of Surface Light remark",STAGE as "Stage",STAGE_ as "Stage

       ------------------------------------

       65

16  ! remark",BOD5 as "BOD",BOD5_ as "BOD remark",APPCOLOR as "Apparent

16  ! Color-Unfiltered",APPCOLOR_ as "Apparent

ERROR 65-58: Name 'Depth to 10% of Surface Light remark' is too long for a SAS

             name in this context.

...but once I looked at the LOG for awhile, your awesome catx() put me on to what will be my answer - changing 'as' to '=' will give me the business end of a LABEL statement!

Thus, I should be able to put the following code at the front of %split255, reset the pointers in the macro to TEMP, and I (hopefully) will have it!

proc sql noprint;

select catx(" ", NAME, "=", quote(translate(trim(label),"#",".")))

into :relabel separated by " "

from dictionary.columns

where libname=upcase("&lib.") and memname=upcase("&dsn.") AND LABEL CONTAINS '.' ;

quit ;

data temp ;

set &lib..&dsn ;

label &relabel ;

run ;

SYMBOLGEN:  Macro variable RELABEL resolves to PH = "pH-# dot Field" DO =

            "Diss# Oxy#" SECCHI_IN = "Secchi-in#"

1974  %PUT &RELABEL ;

PH = "pH-# dot Field" DO = "Diss# Oxy#" SECCHI_IN = "Secchi-in#"

1975  RUN ;

1976

I'll post when I get a working version.

Thank you!!!  I can't begin to tell you how much I appreciate your help!!!   :smileygrin:

Wendy T

Frequent Contributor
Posts: 91

Re: how to replace . in label with # for Excel output?

Many, many thanks to PGStats, who gave me the inspiration with the catx() above.

I am using this macro as part of a 'canned' water quality retrieval from our main Oracle database.  It is run in the early morning by batch jobs, and creates SAS datasets and exports subsets to Excel.  Macro variables used in the retrieval determine the locations, WQ parameters of interest, etc. and the current number of columns varies from 100 to 400 or so, depending on what is requested.

Currently, PROC EXPORT to Excel is limited to 255 columns, and the %split255 macro will split the output over as many tabs as needed in the same workbook.  Also, Excel does not accept column headers containing a dot.  Thus, the need to translate labels.  I also wanted to add and use row number as a column across tabs in the workbook, because the selections may not all have a common set of columns.

I'm sure this can be done better - I had to add ROW to the second LABEL statement to account for cases where none of the original labels contained dots.  Originally, I did not have the WHERE restriction in the SQL for the labels, but it was much slower, and I did not want to hit the limit on macro variable length.  I tinkered with a count of the labels containing . and some %if %then, but couldn't quite get it to work.

Thanks again for the help, and any suggestions for improvements would be very welcome!      

Wendy T

My additions to the %split255 macro:

I set some of the macro variables in the original to values I will typically use.

Locally define two macro variables (and set to null):

&ORLABEL to hold the original label values with .

&RELABEL to hold the transformed label values with #


Set the dataset to itself, and generate the row number with _N_ as ROW to use as a common column on all sheets, and LABEL ROW and &RELABEL to get #  labels for export to Excel.

Then run the original %split255 macro code with the LABEL option added to PROC EXPORT.

At the end of the original code, set the dataset to itself again, LABEL ROW and &ORLABEL to return . to the labels. Then drop ROW to have the original dataset again.

----------------

%macro split255(mylibref=,sasdsname=,tempds=,xlsname=,sheetname=,deleteit=,varcom=,comname=);

*set some of the macro variable values *;
%LET tempdir=D:\SASData\&SYSUSERID ;
%LET tempds=subgroup ;
%LET sheetname=datasheet ;

*define local macro variables for original and transformed labels*;
%LOCAL RELABEL ORLABEL ;
%LET RELABEL= ;
%LET ORLABEL= ;


*obtain values for original and transformed labels*;
proc sql noprint;

select catx(" ", NAME, "=", quote(trim(label)))
into Smiley Surprisedrlabel separated by " "
from dictionary.columns
where libname=upcase("&mylibref.") and memname=upcase("&sasdsname.") AND LABEL CONTAINS '.' ;
%PUT &ORLABEL ;

select catx(" ", NAME, "=", quote(translate(trim(label),"#",".")))
into :relabel separated by " "
from dictionary.columns
where libname=upcase("&mylibref.") and memname=upcase("&sasdsname.") AND LABEL CONTAINS '.' ;
%PUT &RELABEL ;
quit ;
RUN ;


*set dataset to itself  generate row number and add ROW label and relabel if needed* ;
DATA &mylibref..&sasdsname ; SET &mylibref..&sasdsname ;
ROW=_N_  ;
LABEL ROW='Row in SAS dataset' &RELABEL ;
RUN ;


*begin original macro code*;

proc sql;

create table testcols as

select * from dictionary.columns

where libname=upcase("&mylibref") and memname=upcase("&sasdsname");

run;

****** etc********


%end;
%mend loopit;
%loopit;

*end of original macro code ;

*set dataset back to itself then label ROW and back to original label(s) containing . *;
*label for ROW is only present to placehold if &ORLABEL is null* ;
*drop ROW to get original dataset* ;
DATA &mylibref..&sasdsname ; SET &mylibref..&sasdsname ;
LABEL ROW='Row in SAS dataset' &ORLABEL ;
DROP ROW ;
RUN ;

%mend split255;

then I use

%split255(mylibref=&DATA_LIB, sasdsname=&DSN_OVERALL_MATRIX, xlsname=&OUTDIR\&DSN_OVERALL_MATRIX..xlsx, deleteit=destroy, varcom=1, comname=ROW ); 

to use ROW as the common column on the individual tabs.

Respected Advisor
Posts: 4,662

Re: how to replace . in label with # for Excel output?

Very nice of you to wrap this up this way. Thanks!

PG

PG
Respected Advisor
Posts: 4,662

Re: how to replace . in label with # for Excel output?

As a complement of information, I just finished writing the ExcelLoad macro that allows you to export any size of dataset to a single Excel sheet. it is located at :

PG

PG
Ask a Question
Discussion stats
  • 11 replies
  • 2106 views
  • 6 likes
  • 3 in conversation