BookmarkSubscribeRSS Feed
Alexxxxxxx
Pyrite | Level 9

Dear all,

 

I am trying to split the variable based on  (),[],{},' '," " , and then separate strings among them into &COMPANY_NAME._inB, strings outside the them into &COMPANY_NAME._noB.

 

%MACRO CompnayNameM(no=,Company_name=);
	data step9.Patstat_total_hrm_Step3;
		set step9.Patstat_total_hrm_Step3;
		RegExID = prxparse('/<.*>|\[.*\]|\(.*\)|\{.*\}|".*"|''.*''/');
		start=1;
		stop=length(&COMPANY_NAME._Step23);
		call prxnext(RegExID, start, stop, &COMPANY_NAME._Step23, pos, length);
		do while (pos > 0);
			&COMPANY_NAME._inB = substr(&COMPANY_NAME._Step23, pos+1, length-2);
			&COMPANY_NAME._noB = prxchange('s/<.*>|\[.*\]|\(.*\)|\{.*\}|".*"|''.*''/ /', -1, &COMPANY_NAME._Step23);
			call prxnext(RegExID, start, stop, &COMPANY_NAME._Step23, pos, length);
		end;
		drop RegExID pos length start stop;
	run;

	proc sql;
		create table PATSTAT3&no. as
		select distinct
		&COMPANY_NAME.,
		&COMPANY_NAME._Step23,
		&COMPANY_NAME._inB,
		&COMPANY_NAME._noB
		from step9.Patstat_total_hrm_Step3
		where &COMPANY_NAME._inB ne ''
		;
	quit;

%MEND CompnayNameM;
%CompnayNameM(no=1,Company_name=HRM_L2)
%CompnayNameM(no=2,Company_name=PERSON_NAME)
%CompnayNameM(no=3,Company_name=HRM_L2_B)
%CompnayNameM(no=4,Company_name=PERSON_NAME_B)
run;

However, I get the result as

NOTE: Invalid third argument to function SUBSTR at line 2 column 106.
HRM_L2=''AKTAL-LTD PERSON_CTRY_CODE=KZ PERSON_NAME=''AKTAL-LTD' SECTOR=COMPANY PERSON_ADDRESS=
HRM_L2_Step23=''AKTAL-LTD PERSON_NAME_Step23=''AKTAL-LTD' HRM_L2_B= HRM_L2_address=
chflagHRM_L2_Bv23=0 PERSON_NAME_B= PERSON_NAME_address= chflagPERSON_NAME_Bv23=0 HRM_L2_B_Step23=
PERSON_NAME_B_Step23= RegExID=1 start=3 stop=11 pos=0 length=0 HRM_L2_inB='AKTAL-LTD
HRM_L2_noB=AKTAL-LTD _ERROR_=1 _N_=2035185
NOTE: Invalid third argument to function SUBSTR at line 2 column 106.
HRM_L2=CIRCULAR PLANE SAW'' PERSON_CTRY_CODE=RU PERSON_NAME=OOO 'CIRCULAR PLANE SAW'' SECTOR=COMPANY
PERSON_ADDRESS= HRM_L2_Step23=CIRCULAR PLANE SAW'' PERSON_NAME_Step23=OOO 'CIRCULAR PLANE SAW''
HRM_L2_B= HRM_L2_address= chflagHRM_L2_Bv23=0 PERSON_NAME_B= PERSON_NAME_address=
chflagPERSON_NAME_Bv23=0 HRM_L2_B_Step23= PERSON_NAME_B_Step23= RegExID=1 start=21 stop=20 pos=0
length=0 HRM_L2_inB=' HRM_L2_noB=CIRCULAR PLANE SAW _ERROR_=1 _N_=2115372
NOTE: Invalid third argument to function SUBSTR at line 2 column 106.
HRM_L2=JS. CO. ORMETO-YUMZ'' PERSON_CTRY_CODE=RU PERSON_NAME=JS. CO. 'ORMETO-YUMZ'' SECTOR=COMPANY
PERSON_ADDRESS= HRM_L2_Step23=JS. CO. ORMETO-YUMZ'' PERSON_NAME_Step23=JS. CO. 'ORMETO-YUMZ''
HRM_L2_B= HRM_L2_address= chflagHRM_L2_Bv23=0 PERSON_NAME_B= PERSON_NAME_address=
chflagPERSON_NAME_Bv23=0 HRM_L2_B_Step23= PERSON_NAME_B_Step23= RegExID=1 start=22 stop=21 pos=0
length=0 HRM_L2_inB=' HRM_L2_noB=JS. CO. ORMETO-YUMZ _ERROR_=1 _N_=2120410
NOTE: Invalid third argument to function SUBSTR at line 2 column 106.
HRM_L2=OBSCHESTVO S OGRANICHENNOJ OTVETSTVENNOSTYU ''KVARTS PERSON_CTRY_CODE=RU
PERSON_NAME=OBSCHESTVO S OGRANICHENNOJ OTVETSTVENNOSTYU ''KVARTS SECTOR=COMPANY PERSON_ADDRESS=
HRM_L2_Step23=OBSCHESTVO S OGRANICHENNOJ OTVETSTVENNOSTYU ''KVARTS
PERSON_NAME_Step23=OBSCHESTVO S OGRANICHENNOJ OTVETSTVENNOSTYU ''KVARTS HRM_L2_B= HRM_L2_address=
chflagHRM_L2_Bv23=0 PERSON_NAME_B= PERSON_NAME_address= chflagPERSON_NAME_Bv23=0 HRM_L2_B_Step23=
PERSON_NAME_B_Step23= RegExID=1 start=47 stop=52 pos=0 length=0 HRM_L2_inB='KVARTS
HRM_L2_noB=OBSCHESTVO S OGRANICHENNOJ OTVETSTVENNOSTYU KVARTS _ERROR_=1 _N_=2131108
NOTE: Invalid third argument to function SUBSTR at line 2 column 106.
HRM_L2=AZOVZAGAL''MASH JOINT STOCK COMPANY PERSON_CTRY_CODE=UA
PERSON_NAME='AZOVZAGAL''MASH' JOINT STOCK COMPANY SECTOR=COMPANY PERSON_ADDRESS=
HRM_L2_Step23=AZOVZAGAL''MASH JOINT STOCK COMPANY
PERSON_NAME_Step23='AZOVZAGAL''MASH' JOINT STOCK COMPANY HRM_L2_B= HRM_L2_address=
chflagHRM_L2_Bv23=0 PERSON_NAME_B= PERSON_NAME_address= chflagPERSON_NAME_Bv23=0 HRM_L2_B_Step23=
PERSON_NAME_B_Step23= RegExID=1 start=12 stop=35 pos=0 length=0 HRM_L2_inB='MASH JOINT STOCK COMPANY
HRM_L2_noB=AZOVZAGAL MASH JOINT STOCK COMPANY _ERROR_=1 _N_=2245500
NOTE: Invalid third argument to function SUBSTR at line 2 column 106.
HRM_L2=CHERKAS''KE HIMVOLOKNO JOINT STOCK COMPANY PERSON_CTRY_CODE=UA
PERSON_NAME='CHERKAS''KE HIMVOLOKNO' JOINT STOCK COMPANY SECTOR=COMPANY PERSON_ADDRESS=
HRM_L2_Step23=CHERKAS''KE HIMVOLOKNO JOINT STOCK COMPANY
PERSON_NAME_Step23='CHERKAS''KE HIMVOLOKNO' JOINT STOCK COMPANY HRM_L2_B= HRM_L2_address=
chflagHRM_L2_Bv23=0 PERSON_NAME_B= PERSON_NAME_address= chflagPERSON_NAME_Bv23=0 HRM_L2_B_Step23=
PERSON_NAME_B_Step23= RegExID=1 start=10 stop=42 pos=0 length=0
HRM_L2_inB='KE HIMVOLOKNO JOINT STOCK COMPANY HRM_L2_noB=CHERKAS KE HIMVOLOKNO JOINT STOCK COMPANY
_ERROR_=1 _N_=2245623

According to this result, I conclude the problematic variable as

 

'CIRCULAR PLANE SAW''
''AKTAL-LTD'
DR. REDDY''S LABORATORIES LTD.
ZAKRYTOE AKTSIONERNOE OBSCHESTVO ''SKY LTD.''
''TETRA LTD.' FIRM', LIMITED-LIABILITY COMPANY
''TETRA LTD.' ' FIRM', LIMITED-LIABILITY COMPANY

I expect to put all strings and characters which included in the outermost single quotation marks into  &COMPANY_NAME._inB, and the rest into &COMPANY_NAME._noB

Name&COMPANY_NAME._inB&COMPANY_NAME._noB
CIRCULAR PLANE SAW''CIRCULAR PLANE SAW' 
'AKTAL-LTD'AKTAL-LTD 
DR. REDDY''S LABORATORIES LTD. DR. REDDY S LABORATORIES LTD.
ZAKRYTOE AKTSIONERNOE OBSCHESTVO ''SKY LTD.''SKY LTD.'ZAKRYTOE AKTSIONERNOE OBSCHESTVO
'TETRA LTD.' FIRM', LIMITED-LIABILITY COMPANYTETRA LTD.' FIRM, LIMITED-LIABILITY COMPANY
'TETRA LTD.' ' FIRM', LIMITED-LIABILITY COMPANYTETRA LTD.' FIRM, LIMITED-LIABILITY COMPANY

I have tried to use the online regex tester to fix it but does not work. Could you please give me some suggestions about this?

thanks in advance.

 

 

3 REPLIES 3
Shmuel
Garnet | Level 18

What is the difference between this post and your previous one ?

Have you tried %superq macro function ?

Alexxxxxxx
Pyrite | Level 9

Dear @Shmuel ,

 

I get the current result(and NOTE) by using the previous code. I expect to update the previous code, thereby fixing the errors.

 

I do not How to use the %superq macro function, Could you please give me more suggestions about this?

Shmuel
Garnet | Level 18

Few remarks:

 

1) In your code you are overriding your input by the output.

    Better way is to assign a different name, in order not to force restart from beginning.

   

data    step9.Patstat_total_hrm_Step3;
  set   step9.Patstat_total_hrm_Step3;

 

2) It is difficult to consult without test data.

    I assumed it is the same as in your previous post.

 

3) Debug your code out of a macro program, by:

    

%let company_name = <any name to test> ;
data check;
 set step9.Patstat_total_hrm_Step3;
     RegExID = prxparse('/<.*>|\[.*\]|\(.*\)|\{.*\}|".*"|''.*''/');
     start=1;
     stop=length(&COMPANY_NAME._Step23);
     call prxnext(RegExID, start, stop, &COMPANY_NAME._Step23, pos, length);
     do while (pos > 0);
       &COMPANY_NAME._inB = substr(&COMPANY_NAME._Step23, pos+1, length-2);
       &COMPANY_NAME._noB = prxchange('s/<.*>|\[.*\]|\(.*\)|\{.*\}|".*"|''.*''/ /', -1, &COMPANY_NAME._Step23);
       call prxnext(RegExID, start, stop, &COMPANY_NAME._Step23, pos, length);
    end;
drop RegExID pos length start stop;
run;

proc sql;
     create table PATSTAT3&no. as
     select distinct
         &COMPANY_NAME.,
         &COMPANY_NAME._Step23,
         &COMPANY_NAME._inB,
         &COMPANY_NAME._noB
     from check                     /* step9.Patstat_total_hrm_Step3 */
     where &COMPANY_NAME._inB ne ''
  ;
quit;

   4) You got notes/errors, in different observations,  like:

NOTE: Invalid third argument to function SUBSTR at line 2 column 106.HRM_L2=''AKTAL-LTD PERSON_CTRY_CODE=KZ PERSON_NAME=''AKTAL-LTD' SECTOR=COMPANY PERSON_ADDRESS=HRM_L2_Step23=''AKTAL-LTD PERSON_NAME_Step23=''AKTAL-LTD' HRM_L2_B= HRM_L2_address=chflagHRM_L2_Bv23=0 PERSON_NAME_B= PERSON_NAME_address= chflagPERSON_NAME_Bv23=0 HRM_L2_B_Step23=PERSON_NAME_B_Step23= RegExID=1 start=3 stop=11 pos=0 length=0 HRM_L2_inB='AKTAL-LTDHRM_L2_noB=AKTAL-LTD _ERROR_=1 _N_=2035185

the only line with substr function is:

do while (pos > 0);
    &COMPANY_NAME._inB = substr(&COMPANY_NAME._Step23, pos+1, length-2);

where 3rd argument is defined as length-2,  while, according to the note, length=0 and length-2 is negative, an invalid length.

You should check those observation (for example _N_=2035185) why is length=0. Is there no company name ?

Add code to deal with those observations.

 

5) You assigned argument no= in the macro but never used it (&no does not exist in your code).

     What you intended to do with it ?

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 1137 views
  • 0 likes
  • 2 in conversation