Hi Community,
I work with SAS on z/OS, and I was wondering if anyone had a recommendation on setting up a scan for non-printable EBCDIC characters from an incoming dataset.
Thanks for any input.
Like this?
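data T;
  length STR1 STR2 STR3 $256;
  /* Build a string holding every byte value from 0 to 255 */
  do I=0 to 255;
    STR1=cats(STR1,byte(I));
  end;
  /* Attempt 1: remove everything the POSIX class [:graph:] does not match */
  STR2=prxchange('s/[^[:graph:]]//',-1,STR1);
  /* Attempt 2: spell out the characters to keep, partly as EBCDIC code points */
  STR3=prxchange("s/[^ a-i j-r ~-z {-I }-R S-Z 0-9 \x4A-\x4E \x5A-\x5F \x6A-\x6F \x79-\x7F &\/\-\\]//",-1,STR1);
  putlog STR1= / STR2= / STR3=;
run;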
The POSIX character class can't deal with EBCDIC, it seems, but spelling out the characters to keep works.
STR1= € ‚ƒ„
…† -‡ˆ‰Š‹ŒŽ‘’""•–—˜™š›œž âäàáãåçñ¢.<(+|&éêëèíîïìß!$*);^-/ÂÄÀÁÃÅÇѦ,%_>?øÉÊËÈÍÎÏÌ`:#@'="Øabcdefg
hi«»ðýþ±°jklmnopqrªºæ¸Æ¤µ~stuvwxyz¡¿Ð[Þ®¬£¥·©§¶¼½¾Ý¨¯]´×{ABCDEFGHIôöòóõ}JKLMNOPQR¹ûüùúÿ\÷STUVWXYZ²ÔÖÒÓÕ0123456789³ÛÜÙÚŸ
STR2=âäàáãåçñ¢.<(+|&éêëèíîïìß!$*);^-/ÂÄÀÁÃÅÇѦ,%_>?øÉÊËÈÍÎÏÌ`:#@'="Øabcdefghi«»ðýþ±°jklmnopqrªºæ¸Æ¤µ~stuvwxyz¡¿Ð[Þ®¬£¥·©§¶¼½¾Ý¨¯]´×{ABCDEFGH
Iôöòóõ}JKLMNOPQR¹ûüùúÿ\÷STUVWXYZ²ÔÖÒÓÕ0123456789³ÛÜÙÚ
STR3=¢.<(+&!$*);^-/¦,%_>?`:#@'="abcdefghijklmnopqr~stuvwxyz{ABCDEFGHI}JKLMNOPQR\STUVWXYZ0123456789
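For the original question (scanning rather than cleaning), the same keep-list idea can be turned around to flag records instead of stripping characters. This is a minimal sketch, assuming a hypothetical input data set INCOMING with a character variable TEXT. The keep list is only an illustration and would need adjusting to whatever the site's EBCDIC code page treats as printable:

```sas
data FLAGGED;
  /* Sketch only: INCOMING, TEXT, FLAGGED, KEEP, BADPOS and BADHEX are hypothetical names */
  set INCOMING;
  length BADHEX $2;
  /* Characters treated as printable (an assumption, adjust as needed).  */
  /* Single quotes stop the macro processor from reacting to & and %.    */
  retain KEEP 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 .<(+&!$*);-/,%_>?:#@''="~{}\';
  /* VERIFY returns the position of the first character of TEXT that is not in KEEP, or 0 */
  BADPOS=verify(TEXT,KEEP);
  if BADPOS>0 then do;
    BADHEX=put(substr(TEXT,BADPOS,1),$hex2.);  /* hex code of the offending byte */
    output;                                     /* keep only records that fail the scan */
  end;
  drop KEEP;
run;
```

Only the first offending byte per record is reported here; looping with VERIFY from BADPOS+1 onward would list them all. The comments are indented on purpose, since a /* in column 1 can be read as a JCL delimiter when the code runs in-stream on z/OS.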
That's great. I was hoping someone would reply back using a PRX function.
@ChrisNZ you're missing one character from STR1. Do you know which one, and why?
This is a good case for using SUBSTR on the left of the assignment or, better still, COLLATE to generate the entire string.
substr(str1,i+1,1)=byte(I);
collate(0,255);
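Put into a complete step, the two suggestions might look like the sketch below. STR5 and SAME are names introduced only for this illustration. The point is that neither approach goes through CATS, which strips blanks from its arguments and so drops the EBCDIC space:

```sas
data _null_;
  length STR1 STR5 $256;
  /* SUBSTR on the left writes one byte in place, so the blank is not stripped */
  do I=0 to 255;
    substr(STR1,I+1,1)=byte(I);
  end;
  /* COLLATE returns positions 0 through 255 of the collating sequence in one call */
  STR5=collate(0,255);
  /* SAME is 1 when both approaches produced the same string */
  SAME=(STR1=STR5);
  putlog SAME=;
run;
```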
Thanks data_null_!
If you want to remove everything that is non-printable, COMPRESS would be easy enough.
compress(str1,,'kw')
@data_null__ Thanks for your input.
1- It doesn't matter that a space is missing from the example output. That's not the point.
I must use collate more though. Thanks for the reminder.
2- STR4=compress(STR1,,'kw'); gives the same result as the POSIX expression in STR2. Not good.
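To see how far apart the two approaches are on a given system, the kept characters can simply be counted. This is a rough sketch that reuses the STR3 expression from earlier in the thread. N_COMPRESS and N_LIST are names made up for this illustration, and the counts will depend on the session encoding:

```sas
data _null_;
  length STR1 STR3 STR4 $256;
  STR1=collate(0,255);          /* all 256 byte values */
  /* k = keep the listed characters, w = printable (writable) characters */
  STR4=compress(STR1,,'kw');
  /* Explicit keep list, copied from the STR3 expression above */
  STR3=prxchange("s/[^ a-i j-r ~-z {-I }-R S-Z 0-9 \x4A-\x4E \x5A-\x5F \x6A-\x6F \x79-\x7F &\/\-\\]//",-1,STR1);
  /* LENGTHN ignores the trailing blank padding of each result */
  N_COMPRESS=lengthn(STR4);
  N_LIST=lengthn(STR3);
  putlog N_COMPRESS= N_LIST=;
run;
```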
1. I didn't say it was the point; I just asked if you knew which one and why.
2. I guess it depends on your definition of printable. It seems like you are defining printable characters as the characters on the keyboard ("type-able"). I could understand where that might be useful, but I don't think that's what is being asked.
@data_null__ EBCDIC has 95 printable characters and that's it. Not the 190 that compress() lets through. If not, I want to learn more. 🙂
I would call this a defect, actually. Why on earth does the COMPRESS function on z/OS use the POSIX character list?