- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I have an original table with 2 columns: PLU, UPC. The PLU is a coupon and the UPcs are the items the coupon can be redeemed on. I need a table with 1 row for each PLU and the UPCs in their own individual columns, so I've used proc transpose. From there, I created new columns that concatenate each UPC column with a comma so that when I export into excel, I can create a list by concatenating all of the columns together (rather than having to add in a comma myself; it's already extremely cumbersome to begin with and I already have it all set up & ready to go).
The issue is, for SOME UPCs (13 digits or larger), when I concatenate with a comma, SAS automatically puts the UPC number in scientific notation. How can I stop this from happening?
Example:
PLU | col544 | col545 | _testcol1_ | _testcol2_ |
997039 | 942190214800 | 6839533032788 | ,942190214800 | , 6.839533E12 |
997264 | 7290010582877 | 7502257290010 | ,7.2900106E12 | ,7.5022573E12 |
Notice that the _testcol1_ value for PLU 997039 turns out exactly how I want it. The others are converting to scientific notation and I need that to NOT happen.
Thanks!
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
That's because it's numeric. When you use CATX() it will convert the numbers to characters automatically, but it has to assume a format and it does so incorrectly in this case (and a lot of cases). So you need to explicitly provide the conversion or store them as characters in the first place.
It's probably easiest to convert them to character before the transpose and then use CATX to combine with the commas. CATX is smart enough to only include fields that are present.
combined = catx(", ", of upc1-upc10);
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
That's because it's numeric. When you use CATX() it will convert the numbers to characters automatically, but it has to assume a format and it does so incorrectly in this case (and a lot of cases). So you need to explicitly provide the conversion or store them as characters in the first place.
It's probably easiest to convert them to character before the transpose and then use CATX to combine with the commas. CATX is smart enough to only include fields that are present.
combined = catx(", ", of upc1-upc10);
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thanks! I was JUST about to reply to my own post saying I figured it out.
You're right, if I convert to a text value first and then concatenate, it keeps the entire number.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Note how you can use CATX instaed of adding hte commas manually.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
For now, I'm using a do loop. It takes a minute but it gives me what I need:
data redemption_UPCs_v3;
set redemption_UPCs_v2;
run;
%macro concatenation (howmany);
%do i=2 %to &howmany;
data redemption_UPCs_v3;
set redemption_UPCs_v3;
if UPC&i. ne '' then col_txt_&i. = (','||UPC&i.);
run;
%end;
%mend create;
%concatenation(918);
Thanks so much for your help!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@AMandak wrote:
Yes I tried that, but I think there are too many columns because it doesn't work for every PLU. I think if there are 20+ UPCs, it just won't concatenate. But I'll remember that for future reference.
I don't think that's correct, but I guess as long as you have something that works for you.
You do have to make sure the string is long enough in the first place possibly and may need a loop for conversion depending on when you do that, before or after the transpose.
data test;
array test(50);
array _t(50) $10.;
length catted_var $3000.;
do i=1 to 50;
test(i) = floor(rand('uniform')*10000)+1;
_t(i) = put(test(i), z10.);
end;
catted_var = catx(", ", of _t1-_t50);
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@AMandak wrote:
...
The issue is, for SOME UPCs (13 digits or larger), when I concatenate with a comma, SAS automatically puts the UPC number in scientific notation.
Do not store the UPC codes as numbers! 13 digits is very close to the maximum number of digits that SAS floating point numbers can represent exactly.
121 data _null_; 122 x=constant('exactint'); 123 put x= comma23. ; 124 run; x=9,007,199,254,740,992
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
To expand on this, the magic number is 15 digits. 16 digit and larger integers cannot be assured of being stored correctly in the 8 byte floating point notation that SAS uses. This will happen without warning.
Tom
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi Tom
Is there a way to change 8 byte floating point notation that SAS uses? The proc freq stmt totals a number in scientific notation. Is there a way to modify the format output that will remove scientific notation?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
If you just want to collapse do it directly from the source and skip the PROC TRANPOSE.
If you have read the raw UBC codes as numbers instead of strings then you might be able to convert it using the Z format to make them all have the same number of characters.
data have ;
length plu $10 ubcn 8 ;
input plu ubcn ;
ubc = put(ubcn,z13.);
cards;
997039 942190214800
997039 6839533032788
997264 7290010582877
997264 7502257290010
;
data want ;
do until(last.plu);
set have ;
by plu ;
length ubc_rollup $2000 ;
ubc_rollup=catx(', ',ubc_rollup,ubc);
end;
drop ubcn ubc;
run;
proc print;
run;