I've come accross a behavior that seems odd to me when generating xml using ODS TAGSETS.ExcelXP. The short description is I can't open the xml in Excel. The long description is below, but first:
SAS environment
Software: SAS (r) Proprietary Software Release 9.3 TS1M2, submitted via SAS EG 7.12 HF4
Machine: AIX, oslevel = 7100-03-03-1415
Windows environment
Software: Excel under MS Office Professional Plus 2010. Excel Version 14.0.6129.5000 (32 bit).
Machine: Windows 7 Enterprise, Service Pack 1 (64 bit); Intel Core i5-4310M CPU, 2.70 GHz, 8 GB RAM
Now the the specifics of the behaviour. I'll post SAS code below.
I have an 84 position character string in my SAS dataset. I want to transform my SAS dataset into an Excel file. Using Proc Print and tagset ExcelXP, I create xml which I then try to open in Excel. It won't open. I've traced the problem to the following line in the xml:
<Cell ss:StyleID="data__l1" ss:Index="64"><Data ss:Type="Number">000000000000000000000000000000000000000000000000000000000000000000000000000000000000</Data></Cell>
Apparently what's happening is that Excel when it sees a number that is 84 digits long, tries to convert the number into scientific notation. It can't seem to do it when the character string is all zeroes. At least that's my surmise. Whether my surmise is correct or not, when I change "Number" to "String", Excel opens the xml file without a glitch.
<Cell ss:StyleID="data__l1" ss:Index="64"><Data ss:Type="String">000000000000000000000000000000000000000000000000000000000000000000000000000000000000</Data></Cell>
Now, given that the data is defined as $80. in SAS, I would have expected the xml to have consistently contained "String". However, sometimes the xml for the above cell is "String" in one row and sometimes it is "Number" in another row. Apparently, SAS assigns the type based on the contents of the cell not on the SAS definition, and apparently SAS assigns the type on a row by row basis. Different rows will have different data types in the same column based on the contents of the cell. I guess I would have expected the type to be set based on the SAS data type. Yes, I know format is generally not perserved, particularly for dates, but for Number vs. String, I would have expected that the xml Type be set based on the SAS data type not by the cell contents on a row by row basis.
Of course it doesn't matter what I expect, I not being the center of the SAS universe, 🙂 but the xml rendered can't be opened by Excel, which is a problem.
So, am I doing something wrong? Is there some option or parameter that I should be using. Yes, I'm sure I could control things more using Proc Report instead of Proc Print, but this is just a quickie throwaway program where I'm dumping things into Excel for presentation purposes for a daily defect meeting. And, yes, I'm aware that I can just switch to the html tagset which doesn't have this problem, and then do the detailed formatting in Excel but I have to ask: Is this a bug? Shouldn't the xml generated by ExcelXP always be readable by Excel?
Code snippet (Proc Format, Proc Template, ODS code, and Proc Print code):
PROC FORMAT;
VALUE Highlight_Num . - .E = 'WhiteSmoke'
-9999999999 - -1 = 'Pink'
0 - 9999999998 = 'Red'
9999999999 - HIGH = 'Black'
;
VALUE Highlight_Reg LOW - HIGH = 'WhiteSmoke'
;
VALUE $Highlight_Reg LOW - HIGH = 'WhiteSmoke'
;
VALUE $Highlight_Cntl '^' = 'Black'
'ZZZZZZZ' = 'Black'
OTHER = 'WhiteSmoke'
;
VALUE $Highlight_Char '^' = 'Black'
'ZZZZZZZ' = 'Black'
'X' = 'Yellow'
'XX' = 'Yellow'
'XXX' = 'Yellow'
'XXXXX' = 'Yellow'
'XXXXXX' = 'Yellow'
'XXXXXXX' = 'Yellow'
'XXXXXXXX' = 'Yellow'
'XXXXXXXXX' = 'Yellow'
'XXXXXXXXXX' = 'Yellow'
'XXXXXXXXXXX' = 'Yellow'
'XXXXXXXXXXXX' = 'Yellow'
'XXXXXXXXXXXXX' = 'Yellow'
'XXXXXXXXXXXXXX' = 'Yellow'
'XXXXXXXXXXXXXXX' = 'Yellow'
'XXXXXXXXXXXXXXXX' = 'Yellow'
'XXXXXXXXXXXXXXXXX' = 'Yellow'
'XXXXXXXXXXXXXXXXXX' = 'Yellow'
;
RUN;
PROC TEMPLATE;
DEFINE STYLE styles.XLsansPrinter;
PARENT = styles.sansPrinter;
STYLE HEADER FROM HEADER /
font_face = "Calibri"
font_size = 8pt
foreground = Black
background = DarkGray
just = CENTER
vjust = BOTTOM
;
END;
RUN;
ODS TAGSETS.ExcelXP options(frozen_headers='Yes' FontName="Calibri" Size="8") FILE="&CompBase" PATH="&XMLlib." STYLE=styles.XLsansPrinter;
PROC PRINT DATA=WORK.&Comp_Base.2 (OBS=&PrintObs1);
VAR _OBS_
/ STYLE(DATA) = [BACKGROUND=Highlight_Reg. font=("Calibri",8pt) ]
;
VAR _TYPE_
/ STYLE(DATA) = [BACKGROUND=$Highlight_Reg. font=("Calibri",8pt) ]
;
VAR TIP_PIN_N
/ STYLE(DATA) = [BACKGROUND=Highlight_Reg. font=("Calibri",8pt) ]
;
VAR TIP_TIN
/ STYLE(DATA) = [BACKGROUND=$Highlight_Reg. font=("Calibri",8pt) ]
;
VAR TIP_ACCOUNT_NUMBER
TIP_TRADE_BASE_DT_MDCY
TIP_STATE_CODE
TIP_ID
TIP_ENHANCED_SPECIAL_CMT
TIP_DATE_OPENED
/ STYLE(DATA) = [BACKGROUND=$Highlight_Reg. font=("Calibri",8pt) ]
;
VAR TIP_DATE_OPENED_MONTH
TIP_DATE_OPENED_DAY
/ STYLE(DATA) = [BACKGROUND=Highlight_Reg. font=("Calibri",8pt) ]
;
VAR TIP_DATE_REPORTED
/ STYLE(DATA) = [BACKGROUND=$Highlight_Reg. font=("Calibri",8pt) ]
;
VAR TIP_DATE_REPORTED_MONTH
/ STYLE(DATA) = [BACKGROUND=Highlight_Reg. font=("Calibri",8pt) ]
;
VAR TIP_AMT1_QUAL
TIP_AMT2_QUAL
/ STYLE(DATA) = [BACKGROUND=$Highlight_Reg. font=("Calibri",8pt) ]
;
VAR TIP_AMOUNT_1
TIP_AMOUNT_2
TIP_INQUIRY_AMOUNT
TIP_PUB_REC_AMOUNT
TIP_BKRPT_LIABILITY_AMT
TIP_SUBSCRIBER_ID
/ STYLE(DATA) = [BACKGROUND=Highlight_Reg. font=("Calibri",8pt) ]
;
VAR TIP_ACCT_TYPE_CD
TIP_ENHANCED_ACCT_TYPE_CD
TIP_TERMS
TIP_TERMS_V08
TIP_TERMS_FREQ_CODE
TIP_STATUS_CODE
TIP_ENHANCED_STATUS_CODE
TIP_ACCOUNT_COND_CODE
TIP_LEGAL_DESIGNATOR_CODE
TIP_ECOA_CODE
TIP_TR_BALANCE_DATE
/ STYLE(DATA) = [BACKGROUND=$Highlight_Reg. font=("Calibri",8pt) ]
;
VAR TIP_TR_BAL_DATE_MONTH
TIP_TR_BAL_DATE_DAY
TIP_BALANCE_AMOUNT
/ STYLE(DATA) = [BACKGROUND=Highlight_Reg. font=("Calibri",8pt) ]
;
VAR TIP_PR_STATUS_DATE
/ STYLE(DATA) = [BACKGROUND=$Highlight_Reg. font=("Calibri",8pt) ]
;
VAR TIP_R_STATUS_DATE_MONTH
TIP_PR_STATUS_DATE_DAY
/ STYLE(DATA) = [BACKGROUND=Highlight_Reg. font=("Calibri",8pt) ]
;
VAR TIP_PR_FILE_DATE
/ STYLE(DATA) = [BACKGROUND=$Highlight_Reg. font=("Calibri",8pt) ]
;
VAR TIP_PR_FILE_DATE_MONTH
TIP_PR_FILE_DATE_DAY
/ STYLE(DATA) = [BACKGROUND=Highlight_Reg. font=("Calibri",8pt) ]
;
VAR TIP_INQUIRY_DATE
/ STYLE(DATA) = [BACKGROUND=$Highlight_Reg. font=("Calibri",8pt) ]
;
VAR TIP_INQUIRY_DATE_MONTH
TIP_INQUIRY_DATE_DAY
TIP_AMOUNT_PAST_DUE
TIP_SCH_PMT_AMOUNT
TIP_ACT_PMT_AMOUNT
TIP_BALLOON_PMT_AMOUNT
/ STYLE(DATA) = [BACKGROUND=Highlight_Reg. font=("Calibri",8pt) ]
;
VAR TIP_DEFER_FUTURE_PAY_DATE
/ STYLE(DATA) = [BACKGROUND=$Highlight_Reg. font=("Calibri",8pt) ]
;
VAR TIP_PAST_PMT_DATE_MONTH
TIP_FUT_PMT_DATE_MONTH
/ STYLE(DATA) = [BACKGROUND=Highlight_Reg. font=("Calibri",8pt) ]
;
VAR TIP_LAST_PMT_DATE
/ STYLE(DATA) = [BACKGROUND=$Highlight_Reg. font=("Calibri",8pt) ]
;
VAR TIP_LAST_PMT_DATE_MONTH
TIP_COMPANY_ID
/ STYLE(DATA) = [BACKGROUND=Highlight_Reg. font=("Calibri",8pt) ]
;
VAR TIP_KOB_CODE
TIP_ORIG_CREDITOR_CODE
TIP_SEC_AGENCY_CODE
TIP_CII_IND
TIP_CCC_CODE
TIP_PORTFOLIO_CODE
TIP_MORTGAGE_ID
TIP_GRID_V08
/ STYLE(DATA) = [BACKGROUND=$Highlight_Reg. font=("Calibri",8pt) ]
;
VAR TIP_CREDIT_LIMIT
TIP_HIGH_BALANCE
TIP_CHARGEOFF_AMOUNT
/ STYLE(DATA) = [BACKGROUND=Highlight_Reg. font=("Calibri",8pt) ]
;
VAR TIP_DELINQ_DATE_1
/ STYLE(DATA) = [BACKGROUND=$Highlight_Reg. font=("Calibri",8pt) ]
;
VAR TIP_DELINQ_DATE_1_MONTH
/ STYLE(DATA) = [BACKGROUND=Highlight_Reg. font=("Calibri",8pt) ]
;
VAR TIP_DELINQ_DATE_2
/ STYLE(DATA) = [BACKGROUND=$Highlight_Reg. font=("Calibri",8pt) ]
;
VAR TIP_DELINQ_DATE_2_MONTH
/ STYLE(DATA) = [BACKGROUND=Highlight_Reg. font=("Calibri",8pt) ]
;
VAR TIP_MAX_DELINQ_DATE
/ STYLE(DATA) = [BACKGROUND=$Highlight_Reg. font=("Calibri",8pt) ]
;
VAR TIP_MAX_DELINQ_DATE_MONTH
/ STYLE(DATA) = [BACKGROUND=Highlight_Reg. font=("Calibri",8pt) ]
;
VAR TIP_MAX_DELINQ_CODE
/ STYLE(DATA) = [BACKGROUND=$Highlight_Reg. font=("Calibri",8pt) ]
;
VAR TIP_30_DAY_DELINQ_CT
TIP_60_DAY_DELINQ_CT
TIP_90_DAY_DELINQ_CT
TIP_DEROG_CT
TIP_ORIGINAL_LOAN_AMT
/ STYLE(DATA) = [BACKGROUND=Highlight_Reg. font=("Calibri",8pt) ]
;
VAR TIP_SPECIAL_PAYMENT_CODE
TIP_TRADE_INCLUSION_FLAG
/ STYLE(DATA) = [BACKGROUND=$Highlight_Reg. font=("Calibri",8pt) ]
;
RUN;
ODS TAGSETS.ExcelXP CLOSE;
I did a bit more research. As it turns out any field that contains only numbers that is length 32 or more will prevent xml from opening in Excel. In other words, this isn't unique to fields containing just zeroes.
The code change to the ExcelXP tagset that solves the problem, at least in the sense of creating xml that can be opened in Excel, is shown below. I've just inserted the LENGTH check and the statement that follows where the Type is set to String. The rest of the code is the existing code in the ExcelXP tagset.
define event value_type;
set $format "General";
set $value strip(VALUE);
set $value ' ' /if cmp(value, ' ');
do /if $value;
eval $is_numeric prxmatch($number, $value);
do /if $is_numeric;
do /if LENGTH($value) >= 32;
set $type "String";
else;
set $type "Number";
set $value compress($value, $punctuation);
do /if index(value, "%") > 0;
set $format "Percent" /if index(value, "%") > 0;
/*putlog "Percent value:" $value;*/
do /if $convert_percentages;
eval $tmp inputn($value, $test_format)/100;
else;
eval $tmp inputn($value, $test_format);
done;
/*putlog "Percent value:" $tmp;*/
set $value $tmp;
else /if index(value, $currency) > 0;
set $format $currency_format /if index(value, $currency) > 0;
done;
done;
else;
set $type 'String';
done;
done;
/*putlog "TYPE!! " $attrs['type'] ":" ":" $type ":" ";" type;*/
do /if $attrs['type'];
set $type $attrs['type'];
set $type "DateTime" / if cmp($type, 'Datetime');
else /if ^cmp($type, "Number");
/* default to string for empty values*/
set $type "String" ;
/* only allow actual numbers to pay attention to this */
do /if $is_numeric;
set $type "Number" / if cmp(type, 'int');
set $type "Number" / if cmp(type, 'double');
set $type "String" / if cmp(type, 'string');
done;
done;
end;
Jim
Excel is finicky.
What version of tagsets are you using?
Ah. Good question.
NOTE: This is the Excel XP tagset (Compatible with SAS 9.1.3 and above, v1.130, 05/01/2011). Add options(doc='help') to the ods statement for more information.
Hmm. "v1.130, 05/01/2011" I see at http://support.sas.com/rnd/base/ods/odsmarkup/ that there is a "v1.131, 04/23/2015". Let me try downloading that and see what I get. No doubt I'll have to re-code and re-test. Such is life. 🙂
Jim
Rats. Nope. That didn't change anything. My tagset version is now as follows:
NOTE: This is the Excel XP tagset (Compatible with SAS 9.1.3 and above, v1.131, 04/23/2015). Add options(doc='help') to the ods statement for more information.
But the xml still comes out as:
<Cell ss:StyleID="data__l1" ss:Index="64"><Data ss:Type="Number">000000000000000000000000000000000000000000000000000000000000000000000000000000000000</Data></Cell>
Which causes poor Excel to choke.
I've got a simple but inelegant solution:
IF XML_Identifier = '<Cell ss:StyleID="data__l1" ss:Index="64"><Data ss:Type="Number">' THEN
DO;
XML_Identifier = '<Cell ss:StyleID="data__l1" ss:Index="64"><Data ss:Type="String">';
Entire_Record = XML_Identifier || Rest_of_Record;
END;
PUT Entire_Record;
Yeah, I know, weak code, but hey it works.
Jim
"Just because you have specified a character format in the SAS side of things, does not mean that Excel will respect the SAS format."
To be clear, it is SAS that sets the type as numeric even though the data is character. Excel is merely doing as it is told.
I'm not doing anything specialized. The xml generated by SAS for everyday, ordinary data with no special formatting doesn't work. To me, I would want to have the xml be functional in a case so basic as this, but perhaps that's me.
In any case, the TAGATTR should be a good work around and is certainly a better approach than my quick and dirty DATA step editing of the xml.
Jim
Can you create an excel file with an 84 digit number?
Excel mangles any numbers coming in over 16 digits.
Try saving any large numbers over 16 chars in length in a CSV/Text file and then open in Excel and examine the last set of digits.
Excel doesn't do as it's told IMO, it likes to change things
I think the expectation of a valid XML file is correct. However, I'm not sure I consider this a basic case, since 84 digit numbers aren't really common 🙂
You are correct; Excel will change longer numbers. However, I'm working with a 84 position string, not a number. Each positon is a status indicator and represents the status of the product for that month (each position is a month; 84 months is 7 years).
SAS, when it creates the xml, outputs the Type based on the contents of the cell. When a string is all zeroes, SAS will set the xml Type to numeric. In those cases where the length is 32 or more, Excel will not/cannot open the xml file.
Here is what SAS writes in terms of xml when the contents of a cell are all zeroes:
<Cell ss:StyleID="data__l1" ss:Index="64"><Data ss:Type="Number">000000000000000000000000000000000000000000000000000000000000000000000000000000000000</Data></Cell>
I wrote quickie logic to change the Type to a string, and, voila, Excel opens it just fine -- and displays it properly, i.e. as a string and not in scientific notation.
Any time there is a string consisting of all zeroes that is 32 or more positions long, your SAS xml will get an error when opened in Excel.
As @Cynthia_sas pointed out, you can probably correct the problem with a TAGATTR on a case by case basis. You might have to do this for multiple columns in multiple spreadsheets.
However, one could just fix the tagset. The fix isn't overly difficult. Basically, one checks the cell contents for characters other than zero. If there are non-zero characters, no problem; let SAS do what it has been doing all along. However, if there are no non-zero characters present and the string length is greater than or equal to 32, then don't use SAS's current logic, but rather output the Type as a String
Here's the code:
eval $non_zero prxparse('/[^0]/');
eval $other_than_zero prxmatch($non_zero, $value);
do /if not $other_than_zero and LENGTH($value) >= 32;
set $type "String";
else;
You insert the above code in the value_type event after SAS has made the determination that the value in the cell is in fact numeric. The code only affects those cases where the cell's contents are all zeroes and the length is 32 or more.
So, there's my fix for the non bug. 🙂
Jim
However, I'm working with a string, not a number, that is 84 positions long.
Unless I'm missing something, passing it as a string doesn't cause issues, only when it's numeric. The thing is when using tagsets you need to use tagattr to define the formats, not SAS formats. This isn't restricted to your specific case, but is very common when wanting to have currency or date formats applied.
if there are non-zero characters, no problem; let SAS do what it has been doing.
Is Excel actually interpreting large numbers in XML correctly?
Excel does display the characters without changing them, i.e. a full 84 digits.
There is a little green triangle thing in the upper left hand corner of the cell that says (if you mouse over it): "The number in this cell is formatted as text or preceded by an apostrophe." Since it is formatted as text, I can't perform mathematical operations on it, but these are indicators not numeric values, so I don't need the ability to do math with them.
Jim
Here's an example. I've done nothing within Excel except widening the columns so that all the data shows.
Jim
I know I can get it read in....just that it takes a workaround even for a frickin CSV file.
I did a bit more research. As it turns out any field that contains only numbers that is length 32 or more will prevent xml from opening in Excel. In other words, this isn't unique to fields containing just zeroes.
The code change to the ExcelXP tagset that solves the problem, at least in the sense of creating xml that can be opened in Excel, is shown below. I've just inserted the LENGTH check and the statement that follows where the Type is set to String. The rest of the code is the existing code in the ExcelXP tagset.
define event value_type;
set $format "General";
set $value strip(VALUE);
set $value ' ' /if cmp(value, ' ');
do /if $value;
eval $is_numeric prxmatch($number, $value);
do /if $is_numeric;
do /if LENGTH($value) >= 32;
set $type "String";
else;
set $type "Number";
set $value compress($value, $punctuation);
do /if index(value, "%") > 0;
set $format "Percent" /if index(value, "%") > 0;
/*putlog "Percent value:" $value;*/
do /if $convert_percentages;
eval $tmp inputn($value, $test_format)/100;
else;
eval $tmp inputn($value, $test_format);
done;
/*putlog "Percent value:" $tmp;*/
set $value $tmp;
else /if index(value, $currency) > 0;
set $format $currency_format /if index(value, $currency) > 0;
done;
done;
else;
set $type 'String';
done;
done;
/*putlog "TYPE!! " $attrs['type'] ":" ":" $type ":" ";" type;*/
do /if $attrs['type'];
set $type $attrs['type'];
set $type "DateTime" / if cmp($type, 'Datetime');
else /if ^cmp($type, "Number");
/* default to string for empty values*/
set $type "String" ;
/* only allow actual numbers to pay attention to this */
do /if $is_numeric;
set $type "Number" / if cmp(type, 'int');
set $type "Number" / if cmp(type, 'double');
set $type "String" / if cmp(type, 'string');
done;
done;
end;
Jim
Are you using 32 or 64 bit Excel?
@Reeza I'm using 32 bit Excel which is bit odd since I'm using 64 bit Windows, but that's my set up here at work.
@Cynthia_sas I'm sure the TAGATTR would have worked -- and it would have been a lot faster and a lot easier than trying to read through the rather lengthy tagset code. However, TAGATTR requires that I remember to use it each and every time for any field that might contain only numbers and be 32 or more in length. Fixing the tagset means that my xml files will always be operable even if I forget the TAGATTR. Relying on my memory is... well, let's just say it's not a best practice. 🙂
Jim
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.