Hi Community,
I work with SAS on z/OS, and I was wondering if anyone had a recommendation on setting up a scan for non-printable EBCDIC characters from an incoming dataset.
Thanks for any input.
Like this?
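data T;
  length STR1 STR2 STR3 $256;
  /* Build a test string holding every byte value */
  do I=0 to 255;
    STR1=cats(STR1,byte(I));
  end;
  /* Attempt 1: keep only what the POSIX class treats as printable */
  STR2=prxchange('s/[^[:graph:]]//',-1,STR1);
  /* Attempt 2: keep only an explicit list of EBCDIC characters */
  STR3=prxchange("s/[^ a-i j-r ~-z {-I }-R S-Z 0-9 \x4A-\x4E \x5A-\x5F \x6A-\x6F \x79-\x7F &\/\-\\]//",-1,STR1);
  putlog STR1= / STR2= / STR3=;
run;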
The POSIX character class doesn't seem to handle EBCDIC, but spelling out the characters to keep works.
STR1= € ‚ƒ„
…† -‡ˆ‰Š‹ŒŽ‘’""•–—˜™š›œž âäàáãåçñ¢.<(+|&éêëèíîïìß!$*);^-/ÂÄÀÁÃÅÇѦ,%_>?øÉÊËÈÍÎÏÌ`:#@'="Øabcdefg
hi«»ðýþ±°jklmnopqrªºæ¸Æ¤µ~stuvwxyz¡¿Ð[Þ®¬£¥·©§¶¼½¾Ý¨¯]´×{ABCDEFGHIôöòóõ}JKLMNOPQR¹ûüùúÿ\÷STUVWXYZ²ÔÖÒÓÕ0123456789³ÛÜÙÚŸ
STR2=âäàáãåçñ¢.<(+|&éêëèíîïìß!$*);^-/ÂÄÀÁÃÅÇѦ,%_>?øÉÊËÈÍÎÏÌ`:#@'="Øabcdefghi«»ðýþ±°jklmnopqrªºæ¸Æ¤µ~stuvwxyz¡¿Ð[Þ®¬£¥·©§¶¼½¾Ý¨¯]´×{ABCDEFGH
Iôöòóõ}JKLMNOPQR¹ûüùúÿ\÷STUVWXYZ²ÔÖÒÓÕ0123456789³ÛÜÙÚ
STR3=¢.<(+&!$*);^-/¦,%_>?`:#@'="abcdefghijklmnopqr~stuvwxyz{ABCDEFGHI}JKLMNOPQR\STUVWXYZ0123456789
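Since the original question asks for a scan rather than a cleanup, the same explicit class can be used with PRXMATCH to report where the first unwanted byte sits instead of deleting it. This is only a sketch, assuming an EBCDIC session like the one above; the dataset name SCAN, the variables REC and POS, and the test value are made up for illustration.

```sas
data SCAN;
  length REC $80;
  /* Hypothetical record: 'AB', the byte x'05', then 'CD' */
  REC = 'AB' || '05'x || 'CD';
  /* Same keep list as STR3, negated: match the first byte outside it */
  POS = prxmatch("/[^ a-i j-r ~-z {-I }-R S-Z 0-9 \x4A-\x4E \x5A-\x5F \x6A-\x6F \x79-\x7F &\/\-\\]/", REC);
  /* POS is 0 when every byte of REC is in the keep list */
  if POS > 0 then putlog 'Non-printable byte found: ' POS= REC= $hex20.;
run;
```

Reading the incoming dataset with a SET statement in place of the hard-coded REC would flag each offending record the same way.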
That's great. I was hoping someone would reply back using a PRX function.
@ChrisNZ you're missing one character from STR1. Do you know why? Which one?
This is a good case for using SUBSTR on the left of the assignment or, better still, COLLATE to generate the entire string:
substr(str1,i+1,1)=byte(I);
STR1=collate(0,255);
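For anyone comparing the three ways of building the test string: CATS strips leading and trailing blanks from each argument, so the blank byte never makes it into STR1, which is the character missing above. A small sketch that puts the approaches side by side (the variable names are only for illustration, and a single-byte session encoding is assumed):

```sas
data _null_;
  length S1 S2 S3 $256;
  do I = 0 to 255;
    S1 = cats(S1, byte(I));          /* CATS strips blanks: the blank byte is lost */
    substr(S2, I + 1, 1) = byte(I);  /* SUBSTR on the left writes byte I in place */
  end;
  S3 = collate(0, 255);              /* positions 0-255 of the collating sequence */
  N1 = lengthn(S1);
  N2 = lengthn(S2);
  N3 = lengthn(S3);
  SAME = (S2 = S3);                  /* 1 if the two strings are identical */
  putlog N1= N2= N3= SAME=;
run;
```

N1 should come out one less than N2 and N3, and SAME should confirm that the SUBSTR loop and COLLATE produce the same string.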
Thanks data_null_!
If you want to remove everything that is non-printable, COMPRESS would be easy enough.
compress(str1,,'kw')
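In that call the k modifier tells COMPRESS to keep the listed characters instead of removing them, and w adds the printable ("writable") characters to the list. What counts as printable depends on the session, so a quick way to see it on a given system is to run the modifier over all 256 bytes. A sketch, with variable names chosen for illustration:

```sas
data _null_;
  length ALL KEPT $256;
  ALL  = collate(0, 255);           /* every byte value once */
  KEPT = compress(ALL, , 'kw');     /* keep only what SAS treats as printable */
  N    = lengthn(KEPT);             /* how many bytes survived */
  putlog N= / KEPT= $hex512.;       /* hex dump shows exactly which ones */
run;
```

The hex dump makes it easy to compare the surviving bytes against whatever set you consider printable.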
@data_null__ Thanks for your input.
1- It doesn't matter that a space is missing from the example output. That's not the point.
I must use COLLATE more though. Thanks for the reminder.
2- STR4=compress(STR1,,'kw'); gives the same result as the POSIX expression in STR2. Not good.
1. I didn't say it was the point; I just asked whether you knew which one and why.
2. I guess it depends on your definition of printable. It seems you are defining printable characters as the ones on the keyboard ("type-able"). I can see where that might be useful, but I don't think that's what is being asked.
@data_null__ EBCDIC has 95 printable characters and that's it. Not the 190 that compress() lets through. If not, I want to learn more. 🙂
I would actually call this a defect. Why on earth does the COMPRESS function on z/OS use the POSIX character list?
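One way to sidestep the question of what COMPRESS or the POSIX classes consider printable is to spell out the allowed set yourself and use VERIFY, which returns the position of the first character not in the list (0 if there is none). A sketch, assuming "printable" means the blank plus the 94 letters, digits and punctuation marks listed below, and that the program source is stored in the same encoding as the data; KEEP, REC, POS and CLEAN are illustrative names:

```sas
data _null_;
  length KEEP $128 REC CLEAN $80;
  /* Assumed printable set: blank, letters, digits, common punctuation */
  KEEP = ' abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'
      || '.<(+|&!$*);-/,%_>?`:#@''="~{}\[]^';
  /* Hypothetical record with the byte x'05' in the third position */
  REC = 'AB' || '05'x || 'CD';
  /* Scan: position of the first byte not in KEEP, 0 if all are allowed */
  POS = verify(REC, KEEP);
  /* Clean: the k modifier keeps only the bytes listed in KEEP */
  CLEAN = compress(REC, KEEP, 'k');
  putlog POS= CLEAN=;
run;
```

For this record POS should be 3 and CLEAN should be ABCD. Because the list is typed as ordinary characters, it does not rely on hex ranges, though brackets and similar symbols can map to different bytes across EBCDIC code pages, so the list is worth checking against the actual data.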