The SAS Output Delivery System and reporting techniques

Leading Zero's with CSV

Reply
Contributor
Posts: 27

Leading Zero's with CSV

Hi all,

I am looking at creating a series of excel files using SAS 9.1.3 on a mainframe (z/OS). Becuase of size issues and the fact that not much formatting is needed in the excel file I think that the CSV destination in ODS might be my best bet (The files produced using CSV are much smaller than other destinations). I've been playing around with it a little and the one deal breaker I see is losing leading zeros in numbers that need to be treated as text.

What options exist to preserve leading zero's when using CSV as my destination? Also what is the difference between ODS CSV and ODS TAGSETS.CSV?

I came across this solution on the website:

http://support.sas.com/rnd/base/ods/templateFAQ/Excel1.pdf

/* PROC TEMPLATE code that modifies the default CSV tagset by appending the ‘=’ */

proc template;

define tagset Tagsets.test;

parent=tagsets.csv;

define event data;

put "," / if !cmp( COLSTART , "1" );

put '=' """" / if cmp( TYPE , "string" );

put VALUE;

put """" / if cmp( TYPE , "string" );

end;

How would I apply this solution to  a variable in proc report? Are there any other options that people have used?

Thanks,

SAS Super FREQ
Posts: 8,868

Re: Leading Zero's with CSV

Hi:

  That Tech Support PDF file does show an alternate technique that does NOT need a custom TAGSET template. The method requires that you put an '=' sign in front of your quoted character variable, as shown in the Method 1 code below.

  You only need the TAGSET template solution shown in that paper if you are unwilling to put an '=' in front of the variable with the leading zero.

(BTW, TAGSETS.CSV is the tagset template that is the "production" name for the CSV destination. TAGSETS.CSV has had a nickname or alias created, so that you could invoke the destination by coding ODS CSV instead of ODS TAGSETS.CSV.)

  I would actually recommend 2 possible alternate approaches, that do not require a custom tagset template or altering your data (as shown in the Method 2 and Method 3 code below).

cynthia

data method1;
   set sashelp.class;   

   leadzero = catt('=','"000',age,'"');
run; 
           
title; footnote;
ods listing;
proc print data=method1;
var name age leadzero;
run;
       
** Method 1: Use data that has an = in front of value;
ods listing close;
ods csv file='c:\temp\uselead.csv';
          
proc report data=method1 nowd;
column name age leadzero;
run;
ods csv close;
  
** Method 2: Use TAGATTR with TAGSETS.EXCELXP and style=minimal;
ods tagsets.excelxp file='c:\temp\uselead_xp.xml'
    style=minimal;
          
proc report data=sashelp.class nowd;
  column name age height weight;
  define age / display
         style(column)={tagattr="00000"};
  define height / display
         style(column)={tagattr="00000"};
run;
ods tagsets.excelxp close;
          
** Method 3: Use ODS MSOFFICE2K with HTMLSTYLE;
ods msoffice2k file='c:\temp\uselead_mso.xls' style=minimal;
          
proc report data=sashelp.class nowd;
  column name age height weight;
  define age / display
         style(column)={htmlstyle="mso-number-format:00000"};
  define height / display
         style(column)={htmlstyle="mso-number-format:00000"};
run;
ods msoffice2k close;

Contributor
Posts: 27

Leading Zero's with CSV

Posted in reply to Cynthia_sas

Thanks Cynthia,

     Adding the '=' in front of the variable seems like a reasonable approach. I still don't understand what the tagset template solution detailed in the tech support paper is doing or how I would use it?

     I agree that your method 2 will work as I have used it before. My concern with using TAGSETS.EXCELXP is the size of the file it produces. I've been testing different approaches trying to decide which is the best. I created an .xml file using excelxp, the data has 20,000 observations and 17 columns and I had trouble getting my job to run to completion because of space issues. I created using HTML and I got the job to complete and it created a 14,000 kb file. I also created using CSV and it created 1,600 kb file.

     I have never used MSOFFICE2K, but I'm assuming it creates an XML file similar to excelxp? If so, I'm guessing I will run into the same size and space problems as with excelxp.

     This was only a test, in reality many of the files I need to create will be smaller but some will also be larger. I feel if I can solve the leading zero problem in CSV it will be a better aproach because of both the file size and CPU time to create the files.

SAS Super FREQ
Posts: 8,868

Re: Leading Zero's with CSV

Hi:

  ODS MSOFFICE2K creates an HTML file that conforms to the Office 2000 "flavor" of HTML. Microsoft designed their own set of HTML tags specifically for Office 2000. I would expect the HTML file to be slightly smaller than the XML file produced by TAGSETS.EXCELXP.

  The tagset template solution in the Tech Support paper is providing an alternative to writing a program that adds the = to your variable with the leading zeroes. A tagset template is a way to add "markup" information to SAS output results. The "markup" information will depend on what type of "markup" information you are trying to create. For example, if you want to have HTML markup information added to your SAS results, then you would use an HTML based destination; if you want to have XML markup information added to your SAS results, then you would use an XML based destination. A comma-delimited file created with ODS CSV is nothing more than your output results "marked up" with commas and quotes.

  For example, this shopping list, written on paper:

Sarah - books $50

Lee - bike stuff $25

Fred - sweater $25

would be easier to open in an Excel spreadsheet, if it looked like this and was saved as a CSV file:

Person, Gift,Limit

"Sarah","books",50

"Lee","bike stuff",25

"Fred",'Sweater",25

  And, the same list would be openable in an HTML file, if it looked like what is shown in the attached screen shot that shows the HTML markup next to the browser view of the HTML file. As you can see from the screenshot, the "data" pieces of information are the same in the paper list, the comma-delimited version and the HTML version. What changes is the markup information that has been added to make the list readable and renderable by software applications like Excel or a browser.

  A tagset template tells ODS what "markup information" should be added to your results file. The Tech Support solution is creating a customized version of the CSV tagset template, which adds the = for you, so you don't need a DATA step program.

  The program below creates a custom tagset template called TAGSETS.SILLY -- if I want to invoke this tagset template, as an ODS destination, I have to use one of these 2 methods (since I am not able to set an alias for my tagset template like the folks in R&D can do):

ods markup tagset=silly file='c:\temp\silly.txt'; OR

ods tagsets.silly file='c:\temp\silly.txt';

  When I use the SILLY tagset template, my result file looks like this:

~Name~Sex~Age~Height~Weight

~Philip~M~16~72~150

where the tilde character (~) is the "markup information" and the rest of the information is either the Header or Data information for my single observation. Silly example, but conveys the fact that a tagset template merely provides ODS with information about what kind of markup information to add to the result file created by a SAS procedure.

  I have not tried the template approach outlined in the Tech Support example. But looking at the code, I would imagine that it would put an = and double quotes around EVERY string variable in the dataset. That might not be what you want. Any other solution than the one shown will involve a steeper learning curve about tagset templates.

cynthia

Custom TAGSET.SILLY code:

ods path work.temp(update)

         sasuser.templat(update)

         sashelp.tmplmst(read);

                 

proc template;

  define tagset tagsets.silly;

    define event header;

      put "~" value ;

    end;

    define event data;

      put "~" value ;

    end;

    define event row;

      put nl;

    end;

  end;

run;

               

ods tagsets.silly file='c:\temp\silly1.txt';

    

proc print data=sashelp.class noobs;

  where age = 16;

run;

ods tagsets.silly close;

 

ods markup tagset=silly file='c:\temp\silly2.txt';

  

proc print data=sashelp.class noobs;

  where age = 16;

run;

ods markup close;


pix_html_browser_textpad.jpg
Contributor
Posts: 27

Leading Zero's with CSV

Posted in reply to Cynthia_sas

I tried putting an '=' in front of my data and using the template solution but it did not work. Basically it put an ' =""" ' in frontof evey variable for every observation. Maybe I am using an old version of the tagset?

SAS Super FREQ
Posts: 8,868

Re: Leading Zero's with CSV

Hi, I doubt that it is the tagset template that is an issue. The paper outlined an EITHER/OR approach. Either 1) you use a program to put an = in front of your data fields and then use regular ODS CSV

OR

2) you create a custom TAGSET template and then you do NOT have to put an = in front of your data fields. I would expect that things might get ugly.

  In my posting above, the method 1 approach does NOT use a custom tagset template -- it just uses regular old ODS CSV, after putting an = in front of the variable with the leading zeroes.

  Did you try to run a program similar to the method 1 approach shown above without using a custom tagset template?

cynthia

Contributor
Posts: 27

Re: Leading Zero's with CSV

Posted in reply to Cynthia_sas

Cynthia,

      Thank you for your responses it is appreciated. I tried putting an equal sign in front of my variable, policy #, that starts with two leading zero's. But it does not retain the leading zeros in excel. I tried using the custom tagset option with and without appending an '=' sign to the variable. But niether option worked.

    

     I think I might have a syntax error in my define statement because it puts an '=' sign followed by a string of double quotation marks in front of every observation. When looking at the tech support document it is hard to differentiate between single and double quotes, especially when they are strung together.

     I also tried appending and single quote mark in front of my variable and it sort of worked. When I open the excel file I see the single quote mark and the leading zero's. For example:

'0055000

     What I would like to see is:

0055000

edit: I should note that putting a single quote mark in front of a number I want excel to treat as text is my normal method when working in excel and it usually does not display the single quote mark.

     Are there any other approaches I can try with the CSV option or some variation of CSV?

Thanks,    

SAS Super FREQ
Posts: 8,868

Re: Leading Zero's with CSV

HI:

  As I said, if you use the tagset template approach, then EVERY character variable will have an = and double quotes around it -- not just your leading zero variable. This is probably NOT what you want. And if you tried a combo of method 1 (putting in your own =) and method 2 (using a custom tagset template) -- I don't even want to guess what the results might look like -- probably also not what you want. When I run the Tech Support code, shown at the bottom of the post and modified to have more data, this is what I get for a modified version of SASHELP.CLASS (only show partial rows):

 

"Obs","leadzero","x","Name","Sex","Age","Height","Weight"

"1",="00014",="00001",="Alfred",="M",14,69.0,112.5

"2",="00013",="00001",="Alice",="F",13,56.5, 84.0

"3",="00013",="00001",="Barbara",="F",13,65.3, 98.0

The attached screenshots show the view in Excel 2010 and the view of the file in Notepad. In the code, when you see """" , it is 4 double quotes, which resolve as shown above and in the screenshots. Peter's solution of using the letter O is very clever and you would only need 1 letter O to have your whole field treated as a character string by Excel.

Did you ever try my previously posted Method 1 code, unchanged, using SASHELP.CLASS? What happened in Excel with that result file? If you have been playing around with custom tagset templates, you may want to exit out of SAS and get back into a "clean" session before you submit the Method 1 code -- using my example and data. I still believe that is the simplest and most direct approach. At this point, you might want to open a track with Tech Support for more help.

cynthia

** the tagset template code I tested -- this is Method 2 code;

ods path work.tmp(update)

         sasuser.templat(update)

         sashelp.tmplmst(read);

                        

proc template;

  define tagset tagsets.test;

  parent=tagsets.csv;

    define event data;

      put "," / if !cmp(COLSTART, "1");

      put '=' """"/if cmp(TYPE,"string");

      put VALUE;

      put """" / if cmp(TYPE,"string");

    end;

  end;

run;

             

data one;

  length leadzero x $5;

  set sashelp.class;

  x='00001';

  leadzero = catt('000',age);

run;

               

ods markup file='c:\temp\leadzero_spec.csv' tagset=tagsets.test;

proc print data=one;

run;

ods markup close;


view_in_Excel2010.jpgview_in_Notepad.jpg
Contributor
Posts: 27

Re: Leading Zero's with CSV

Posted in reply to Cynthia_sas

Thanks Cyntia, your explanations and screen shots are helpful.

I found something interesting. When I ran your program, I got identical results. I had to add a rs=none to the ODS statetment, but that was the only change.

When I ran your program but switched the proc print to proc report I got very different results. Every observation had an '=' sign and double quotation mark in front of it (insted of double quotes around it). Like this

Proc Report.bmp

Not sure what is causing this, but I think this is the source of a lot of my confusion.

Thanks

SAS Super FREQ
Posts: 8,868

Leading Zero's with CSV

Hi:

  When I use my code in SAS 9.2 or SAS 9.3, with PROC REPORT instead of PROC PRINT, the PROC REPORT output looks the same as the PROC PRINT output. I would suggest that you work with Tech Support on this.

cynthia

Valued Guide
Posts: 2,177

Leading Zero's with CSV

if appearance is what is needed, consider prefixing relevant numeric variables with the letter O. A suitable format might be something like:

proc format ;

  picture znumbers 0-1e11 = '012345678901'(prefix='O' ) ;

run ;

%put %sysfunc( putn( 98765432, znumbers ));

This creates 12 character width with a letter O followed by the integer value with leading zeros

CSV format can be very dense and created quick quickly

Frequent Contributor
Posts: 143

Re: Leading Zero's with CSV

Posted in reply to Cynthia_sas

data method1;
   set sashelp.class;   

   leadzero = catt('=','"000',age,'"');
run; 
           
title; footnote;
ods listing;
proc print data=method1;
var name age leadzero;
run;
       
** Method 1: Use data that has an = in front of value;
ods listing close;
ods csv file='c:\temp\uselead.csv';
          
proc report data=method1 nowd;
column name age leadzero;
run;
ods csv close;

I used your example with success. In this case age is listed as a numeric.  In my situation I have values listed as a character.

data test;

length  REASON_FOR_DEFAULT $3;

set testing;

IF DFLT_RSN_CD = 'DOB' THEN REASON_FOR_DEFAULT = '001';

IF DFLT_RSN_CD = 'TTC' THEN REASON_FOR_DEFAULT = '002';

IF DFLT_RSN_CD = 'INC' THEN REASON_FOR_DEFAULT = 'INC';

run;

data test2;

set test;

REASON_FOR_DEFAULT = catt('=','"000',REASON_FOR_DEFAULT,'"');

run;

ods listing close;

ods csvall file='/XX/xx09/xxrs/sascode/reporting/test22.csv';    

proc report data=for_final nowd;

COLUMNS (_ALL_);

run;

ods csv close;

The output howver looks like this

REASON_FOR_DEFAULT
="0
="0
="0
="0
="0
="0
="0
="0
="0
="0
="0
="0

Does your solution only work when the target variable is numeric

Super User
Posts: 11,343

Re: Leading Zero's with CSV

Should create a new post referencing this. But you are getting the output displayed because you have declared the variable reason_for_default to be 3 characters in length:

length  REASON_FOR_DEFAULT $3;

Therefore anything past the third character is truncated from the result.

Ask a Question
Discussion stats
  • 12 replies
  • 3577 views
  • 3 likes
  • 5 in conversation