About Maplefin

Maplefin · ‎09-18-2023

Thanks a lot! Very helpful and innovative solution.

Maplefin · ‎08-14-2023

Thanks for your reply; I have my answer now. The latest version of SAS does not support multibyte characters in the PRX functions. Maybe SAS Viya does? I don't have a license, though, so I cannot try it in SAS Viya.

Maplefin · ‎08-11-2023

Does the latest version of Base SAS not support multibyte characters in the PRX functions?

Maplefin · ‎08-11-2023

Yeah, I read the post and got inspired, so I tried the code in it, but I got only errors. My SAS version is 9.04.01M7P080520.

Maplefin · ‎08-11-2023

As the title shows, I'm wondering how to match multi-byte characters using Perl regular expressions. My SAS session encoding is UTF-8, and I need to match a series of multi-byte characters. Matching single-byte characters is well supported, but matching multi-byte characters does not seem to be. For example, matching any printable ASCII character works fine:

    data _null_;
      text="€123";
      pos=prxmatch("/[\x20-\x7E]/",text);
      put pos=;
    run;

    /* Result */
    pos=4

(The position is a byte offset: "€" occupies three bytes in UTF-8, so "1" is at byte 4.)

But with multi-byte characters, things change:

    data _null_;
      text='à';
      len=length(text);
      put len=;
      /* match Latin small letter a with grave */
      pos1=prxmatch('/\x{C3A0}/', text);  /* 'à': U+00E0 */
      put pos1=;
      pos2=prxmatch('/\xC3A0/', text);    /* 'à': U+00E0 */
      put pos2=;
      pos3=prxmatch('/\xC3\xA0/', text);  /* 'à': U+00E0 */
      put pos3=;
      pos4=prxmatch('/\xC3\xA1/', text);  /* 'á': U+00E1 */
      put pos4=;
    run;

    /* Results */
    len=2 pos1=2 pos2=0 pos3=1 pos4=0

The character I entered is a double-byte character, and I can't get the right position when I try to match it as one. Are the Perl regular expression functions designed for single-byte characters only? If so, how can I match multi-byte characters? For example, I want to match all characters in the range [U+4E00-U+9FA5] (UTF-8 bytes E4B880-E9BEA5). How should I write the code?

Maplefin · ‎08-10-2023

Does this really work? I tried your code in my SAS session (encoding=UTF-8), but it failed to match the Latin-1 Supplement block. It seems that SAS did not recognize the specified range "\x{0080}-\x{00FF}" at all. I got the following errors in the log:

    ERROR: Invalid [] range "}-\x" before HERE mark in regex m/[\x{0080}-\x << HERE {00FF}]/
    ERROR: The regular expression passed to the function PRXMATCH contains a syntax error.

My SAS version is 9.04.01M7P080520.

Maplefin · ‎12-15-2020

Thanks a lot! I tried not specifying the ORDER= option, letting SAS choose the default, and got the right result. Is the default ORDER=FORMATTED? In another case, though, some observations have the ARMCD value "TR", and I want "TR" to show above "RT". Even though I sort the dataset by descending ARMCD, if I don't specify a value for the ORDER= option I get unwanted output in which "RT" shows above "TR". I just want the output sorted the same way as my original dataset, with repeated values suppressed.

Maplefin · ‎12-15-2020

Thanks a lot! I tried the ORDER=INTERNAL option and it works; I got the result I wanted. I think you're right that the ORDER= option applies globally and not "within-group". I had assumed it worked like PROC SORT, where multiple variables are handled by priority: sort by A first, then B, and so on. So I thought ORDER would first suppress repeated values of ARMCD, then suppress repeated SUBJID values within each ARMCD, then PERIOD. But it doesn't seem to work like that. I'm wondering how to specify the ORDER= option so that the output is sorted the same way as my dataset while repeated values are suppressed.

Maplefin · ‎12-13-2020

Hi, I have a problem with PROC REPORT output. I want to use the ORDER option to suppress repeated values, but I got unexpected output. Here is my code:

    data x1;
      informat subject armcd period $20.;
      input subject armcd period;
      cards;
    C007 RT 2
    C007 RT 2
    C010 RT 1
    C010 RT 1
    C010 RT 1
    C010 RT 1
    C010 RT 2
    C010 RT 2
    C010 RT 2
    C010 RT 2
    C010 RT 2
    ;
    proc sort;
      by subject armcd period;
    run;
    proc report data=x1;
      column subject armcd period;
      define subject/order order=data;
      define armcd/order order=data;
      define period/order order=data;
    run;

I thought the output would be:

    C007  RT  2
    C010  RT  1
              2

I was surprised to get:

    C007  RT  2
    C010  RT  2
              1

So why does the output look like this? Why does the value "1" show below the value "2"? Could someone explain this to me? Any help will be greatly appreciated.
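A later reply in this thread reports that ORDER=INTERNAL gave the wanted result. With ORDER=DATA, values are ordered by their first appearance in the whole dataset rather than within each group, and "2" appears first (for C007), so it is listed before "1". A minimal sketch of the changed step, assuming the x1 dataset from the post above:

```sas
proc report data=x1;
  column subject armcd period;
  /* ORDER=INTERNAL orders by the unformatted values, */
  /* so period "1" is listed before period "2"        */
  define subject / order order=internal;
  define armcd   / order order=internal;
  define period  / order order=internal;
run;
```

The ORDER usage still suppresses repeated values in consecutive rows; only the ordering rule changes.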

Maplefin · ‎07-30-2020

Thanks a lot! It will be helpful.

Maplefin · ‎07-30-2020

Hi, I've recently started learning Perl regular expressions, and I wonder how to change the case of a single character in a string. I have read the SAS Help documentation and know that \u, \U, \l, and \L can be used to change case. The example in the documentation confused me:

    data _null_;
      x = 'MCLAUREN';
      x = prxchange("s/(MC)/\u\L$1/i", -1, x);
      put x=;
    run;

SAS writes the following output to the log:

    x=McLAUREN

What are the rules behind this? For example, I want to turn "ABC" into "aBc". How do I write the right Perl regular expression? I tried to imitate the code:

    data _null_;
      x="ABC";
      y=prxchange("s/(abc)/\l\u\l$1/i",-1,x);
      put x= y=;
    run;

but got the wrong result:

    x=ABC y=aBC

So, how can I use a Perl regular expression to get the result I want? In another situation, I want to change "ADAM" into "ADaM".
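In Perl replacement text, \l and \u affect only the single character that follows, while \L and \U apply until \E or the end of the replacement. That is why "\u\L$1" yields "Mc": \L lowercases the whole capture and \u then uppercases its first character. One possible approach, sketched here without having been run on the poster's SAS release, is to capture each character in its own group so that every \l or \u sits directly in front of the character it should change. The patterns and variable names below are illustrative only.

```sas
data _null_;
  x1 = 'ABC';
  /* one capture group per character; \l or \u acts on the */
  /* first character of the group that follows it          */
  y1 = prxchange('s/(a)(b)(c)/\l$1\u$2\l$3/i', -1, x1);

  x2 = 'ADAM';
  /* keep "AD" and "M" as captured, lowercase only the second "A" */
  y2 = prxchange('s/(AD)(A)(M)/$1\l$2$3/i', -1, x2);

  put x1= y1= x2= y2=;
run;
```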

Online Status	Offline
Date Last Visited	‎09-18-2023 06:47 AM

Re: How to match the multi-byte characters in SAS?

Re: How to match the multi-byte characters in SAS?

Re: How to match the multi-byte characters in SAS?

Re: How to match the multi-byte characters in SAS?

How to match the multi-byte characters in SAS?

Re: PRX Functions to Support Multibyte Characters

Re: Proc report define statement order options output problems

Re: Proc report define statement order options output problems

Proc report define statement order options output problems

Re: prxchange how to change the case of specified character in a strin...

prxchange how to change the case of specified character in a string?