Dear all,
I am trying to split the variable based on (),[],{},' '," " , and then separate strings among them into &COMPANY_NAME._inB, strings outside the them into &COMPANY_NAME._noB.
%MACRO CompnayNameM(no=,Company_name=);
data step9.Patstat_total_hrm_Step3;
set step9.Patstat_total_hrm_Step3;
RegExID = prxparse('/<.*>|\[.*\]|\(.*\)|\{.*\}|".*"|''.*''/');
start=1;
stop=length(&COMPANY_NAME._Step23);
call prxnext(RegExID, start, stop, &COMPANY_NAME._Step23, pos, length);
do while (pos > 0);
&COMPANY_NAME._inB = substr(&COMPANY_NAME._Step23, pos+1, length-2);
&COMPANY_NAME._noB = prxchange('s/<.*>|\[.*\]|\(.*\)|\{.*\}|".*"|''.*''/ /', -1, &COMPANY_NAME._Step23);
call prxnext(RegExID, start, stop, &COMPANY_NAME._Step23, pos, length);
end;
drop RegExID pos length start stop;
run;
proc sql;
create table PATSTAT3&no. as
select distinct
&COMPANY_NAME.,
&COMPANY_NAME._Step23,
&COMPANY_NAME._inB,
&COMPANY_NAME._noB
from step9.Patstat_total_hrm_Step3
where &COMPANY_NAME._inB ne ''
;
quit;
%MEND CompnayNameM;
%CompnayNameM(no=1,Company_name=HRM_L2)
%CompnayNameM(no=2,Company_name=PERSON_NAME)
%CompnayNameM(no=3,Company_name=HRM_L2_B)
%CompnayNameM(no=4,Company_name=PERSON_NAME_B)
run;
However, I get the result as
NOTE: Invalid third argument to function SUBSTR at line 2 column 106.
HRM_L2=''AKTAL-LTD PERSON_CTRY_CODE=KZ PERSON_NAME=''AKTAL-LTD' SECTOR=COMPANY PERSON_ADDRESS=
HRM_L2_Step23=''AKTAL-LTD PERSON_NAME_Step23=''AKTAL-LTD' HRM_L2_B= HRM_L2_address=
chflagHRM_L2_Bv23=0 PERSON_NAME_B= PERSON_NAME_address= chflagPERSON_NAME_Bv23=0 HRM_L2_B_Step23=
PERSON_NAME_B_Step23= RegExID=1 start=3 stop=11 pos=0 length=0 HRM_L2_inB='AKTAL-LTD
HRM_L2_noB=AKTAL-LTD _ERROR_=1 _N_=2035185
NOTE: Invalid third argument to function SUBSTR at line 2 column 106.
HRM_L2=CIRCULAR PLANE SAW'' PERSON_CTRY_CODE=RU PERSON_NAME=OOO 'CIRCULAR PLANE SAW'' SECTOR=COMPANY
PERSON_ADDRESS= HRM_L2_Step23=CIRCULAR PLANE SAW'' PERSON_NAME_Step23=OOO 'CIRCULAR PLANE SAW''
HRM_L2_B= HRM_L2_address= chflagHRM_L2_Bv23=0 PERSON_NAME_B= PERSON_NAME_address=
chflagPERSON_NAME_Bv23=0 HRM_L2_B_Step23= PERSON_NAME_B_Step23= RegExID=1 start=21 stop=20 pos=0
length=0 HRM_L2_inB=' HRM_L2_noB=CIRCULAR PLANE SAW _ERROR_=1 _N_=2115372
NOTE: Invalid third argument to function SUBSTR at line 2 column 106.
HRM_L2=JS. CO. ORMETO-YUMZ'' PERSON_CTRY_CODE=RU PERSON_NAME=JS. CO. 'ORMETO-YUMZ'' SECTOR=COMPANY
PERSON_ADDRESS= HRM_L2_Step23=JS. CO. ORMETO-YUMZ'' PERSON_NAME_Step23=JS. CO. 'ORMETO-YUMZ''
HRM_L2_B= HRM_L2_address= chflagHRM_L2_Bv23=0 PERSON_NAME_B= PERSON_NAME_address=
chflagPERSON_NAME_Bv23=0 HRM_L2_B_Step23= PERSON_NAME_B_Step23= RegExID=1 start=22 stop=21 pos=0
length=0 HRM_L2_inB=' HRM_L2_noB=JS. CO. ORMETO-YUMZ _ERROR_=1 _N_=2120410
NOTE: Invalid third argument to function SUBSTR at line 2 column 106.
HRM_L2=OBSCHESTVO S OGRANICHENNOJ OTVETSTVENNOSTYU ''KVARTS PERSON_CTRY_CODE=RU
PERSON_NAME=OBSCHESTVO S OGRANICHENNOJ OTVETSTVENNOSTYU ''KVARTS SECTOR=COMPANY PERSON_ADDRESS=
HRM_L2_Step23=OBSCHESTVO S OGRANICHENNOJ OTVETSTVENNOSTYU ''KVARTS
PERSON_NAME_Step23=OBSCHESTVO S OGRANICHENNOJ OTVETSTVENNOSTYU ''KVARTS HRM_L2_B= HRM_L2_address=
chflagHRM_L2_Bv23=0 PERSON_NAME_B= PERSON_NAME_address= chflagPERSON_NAME_Bv23=0 HRM_L2_B_Step23=
PERSON_NAME_B_Step23= RegExID=1 start=47 stop=52 pos=0 length=0 HRM_L2_inB='KVARTS
HRM_L2_noB=OBSCHESTVO S OGRANICHENNOJ OTVETSTVENNOSTYU KVARTS _ERROR_=1 _N_=2131108
NOTE: Invalid third argument to function SUBSTR at line 2 column 106.
HRM_L2=AZOVZAGAL''MASH JOINT STOCK COMPANY PERSON_CTRY_CODE=UA
PERSON_NAME='AZOVZAGAL''MASH' JOINT STOCK COMPANY SECTOR=COMPANY PERSON_ADDRESS=
HRM_L2_Step23=AZOVZAGAL''MASH JOINT STOCK COMPANY
PERSON_NAME_Step23='AZOVZAGAL''MASH' JOINT STOCK COMPANY HRM_L2_B= HRM_L2_address=
chflagHRM_L2_Bv23=0 PERSON_NAME_B= PERSON_NAME_address= chflagPERSON_NAME_Bv23=0 HRM_L2_B_Step23=
PERSON_NAME_B_Step23= RegExID=1 start=12 stop=35 pos=0 length=0 HRM_L2_inB='MASH JOINT STOCK COMPANY
HRM_L2_noB=AZOVZAGAL MASH JOINT STOCK COMPANY _ERROR_=1 _N_=2245500
NOTE: Invalid third argument to function SUBSTR at line 2 column 106.
HRM_L2=CHERKAS''KE HIMVOLOKNO JOINT STOCK COMPANY PERSON_CTRY_CODE=UA
PERSON_NAME='CHERKAS''KE HIMVOLOKNO' JOINT STOCK COMPANY SECTOR=COMPANY PERSON_ADDRESS=
HRM_L2_Step23=CHERKAS''KE HIMVOLOKNO JOINT STOCK COMPANY
PERSON_NAME_Step23='CHERKAS''KE HIMVOLOKNO' JOINT STOCK COMPANY HRM_L2_B= HRM_L2_address=
chflagHRM_L2_Bv23=0 PERSON_NAME_B= PERSON_NAME_address= chflagPERSON_NAME_Bv23=0 HRM_L2_B_Step23=
PERSON_NAME_B_Step23= RegExID=1 start=10 stop=42 pos=0 length=0
HRM_L2_inB='KE HIMVOLOKNO JOINT STOCK COMPANY HRM_L2_noB=CHERKAS KE HIMVOLOKNO JOINT STOCK COMPANY
_ERROR_=1 _N_=2245623
According to this result, I conclude the problematic variable as
'CIRCULAR PLANE SAW''
''AKTAL-LTD'
DR. REDDY''S LABORATORIES LTD.
ZAKRYTOE AKTSIONERNOE OBSCHESTVO ''SKY LTD.''
''TETRA LTD.' FIRM', LIMITED-LIABILITY COMPANY
''TETRA LTD.' ' FIRM', LIMITED-LIABILITY COMPANY
I expect to put all strings and characters which included in the outermost single quotation marks into &COMPANY_NAME._inB,
and the rest into &COMPANY_NAME._noB
Name | &COMPANY_NAME._inB | &COMPANY_NAME._noB |
CIRCULAR PLANE SAW'' | CIRCULAR PLANE SAW' | |
'AKTAL-LTD' | AKTAL-LTD | |
DR. REDDY''S LABORATORIES LTD. | DR. REDDY S LABORATORIES LTD. | |
ZAKRYTOE AKTSIONERNOE OBSCHESTVO ''SKY LTD.'' | SKY LTD.' | ZAKRYTOE AKTSIONERNOE OBSCHESTVO |
'TETRA LTD.' FIRM', LIMITED-LIABILITY COMPANY | TETRA LTD.' FIRM | , LIMITED-LIABILITY COMPANY |
'TETRA LTD.' ' FIRM', LIMITED-LIABILITY COMPANY | TETRA LTD.' FIRM | , LIMITED-LIABILITY COMPANY |
I have tried to use the online regex tester to fix it but does not work. Could you please give me some suggestions about this?
thanks in advance.
What is the difference between this post and your previous one ?
Have you tried %superq macro function ?
Dear @Shmuel ,
I get the current result(and NOTE) by using the previous code. I expect to update the previous code, thereby fixing the errors.
I do not How to use the %superq macro function, Could you please give me more suggestions about this?
Few remarks:
1) In your code you are overriding your input by the output.
Better way is to assign a different name, in order not to force restart from beginning.
data step9.Patstat_total_hrm_Step3;
set step9.Patstat_total_hrm_Step3;
2) It is difficult to consult without test data.
I assumed it is the same as in your previous post.
3) Debug your code out of a macro program, by:
%let company_name = <any name to test> ;
data check;
set step9.Patstat_total_hrm_Step3;
RegExID = prxparse('/<.*>|\[.*\]|\(.*\)|\{.*\}|".*"|''.*''/');
start=1;
stop=length(&COMPANY_NAME._Step23);
call prxnext(RegExID, start, stop, &COMPANY_NAME._Step23, pos, length);
do while (pos > 0);
&COMPANY_NAME._inB = substr(&COMPANY_NAME._Step23, pos+1, length-2);
&COMPANY_NAME._noB = prxchange('s/<.*>|\[.*\]|\(.*\)|\{.*\}|".*"|''.*''/ /', -1, &COMPANY_NAME._Step23);
call prxnext(RegExID, start, stop, &COMPANY_NAME._Step23, pos, length);
end;
drop RegExID pos length start stop;
run;
proc sql;
create table PATSTAT3&no. as
select distinct
&COMPANY_NAME.,
&COMPANY_NAME._Step23,
&COMPANY_NAME._inB,
&COMPANY_NAME._noB
from check /* step9.Patstat_total_hrm_Step3 */
where &COMPANY_NAME._inB ne ''
;
quit;
4) You got notes/errors, in different observations, like:
NOTE: Invalid third argument to function SUBSTR at line 2 column 106.HRM_L2=''AKTAL-LTD PERSON_CTRY_CODE=KZ PERSON_NAME=''AKTAL-LTD' SECTOR=COMPANY PERSON_ADDRESS=HRM_L2_Step23=''AKTAL-LTD PERSON_NAME_Step23=''AKTAL-LTD' HRM_L2_B= HRM_L2_address=chflagHRM_L2_Bv23=0 PERSON_NAME_B= PERSON_NAME_address= chflagPERSON_NAME_Bv23=0 HRM_L2_B_Step23=PERSON_NAME_B_Step23= RegExID=1 start=3 stop=11 pos=0 length=0 HRM_L2_inB='AKTAL-LTDHRM_L2_noB=AKTAL-LTD _ERROR_=1 _N_=2035185
the only line with substr function is:
do while (pos > 0);
&COMPANY_NAME._inB = substr(&COMPANY_NAME._Step23, pos+1, length-2);
where 3rd argument is defined as length-2, while, according to the note, length=0 and length-2 is negative, an invalid length.
You should check those observation (for example _N_=2035185) why is length=0. Is there no company name ?
Add code to deal with those observations.
5) You assigned argument no= in the macro but never used it (&no does not exist in your code).
What you intended to do with it ?
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.