BookmarkSubscribeRSS Feed
Alexxxxxxx
Pyrite | Level 9

Dear all,

 

I am trying to split the variable based on  (),[],{},' '," " , and then separate strings among them into &COMPANY_NAME._inB, strings outside the them into &COMPANY_NAME._noB.

 

%MACRO CompnayNameM(no=,Company_name=);
	data step9.Patstat_total_hrm_Step3;
		set step9.Patstat_total_hrm_Step3;
		RegExID = prxparse('/<.*>|\[.*\]|\(.*\)|\{.*\}|".*"|''.*''/');
		start=1;
		stop=length(&COMPANY_NAME._Step23);
		call prxnext(RegExID, start, stop, &COMPANY_NAME._Step23, pos, length);
		do while (pos > 0);
			&COMPANY_NAME._inB = substr(&COMPANY_NAME._Step23, pos+1, length-2);
			&COMPANY_NAME._noB = prxchange('s/<.*>|\[.*\]|\(.*\)|\{.*\}|".*"|''.*''/ /', -1, &COMPANY_NAME._Step23);
			call prxnext(RegExID, start, stop, &COMPANY_NAME._Step23, pos, length);
		end;
		drop RegExID pos length start stop;
	run;

	proc sql;
		create table PATSTAT3&no. as
		select distinct
		&COMPANY_NAME.,
		&COMPANY_NAME._Step23,
		&COMPANY_NAME._inB,
		&COMPANY_NAME._noB
		from step9.Patstat_total_hrm_Step3
		where &COMPANY_NAME._inB ne ''
		;
	quit;

%MEND CompnayNameM;
%CompnayNameM(no=1,Company_name=HRM_L2)
%CompnayNameM(no=2,Company_name=PERSON_NAME)
%CompnayNameM(no=3,Company_name=HRM_L2_B)
%CompnayNameM(no=4,Company_name=PERSON_NAME_B)
run;

However, I get the result as

NOTE: Invalid third argument to function SUBSTR at line 2 column 106.
HRM_L2=''AKTAL-LTD PERSON_CTRY_CODE=KZ PERSON_NAME=''AKTAL-LTD' SECTOR=COMPANY PERSON_ADDRESS=
HRM_L2_Step23=''AKTAL-LTD PERSON_NAME_Step23=''AKTAL-LTD' HRM_L2_B= HRM_L2_address=
chflagHRM_L2_Bv23=0 PERSON_NAME_B= PERSON_NAME_address= chflagPERSON_NAME_Bv23=0 HRM_L2_B_Step23=
PERSON_NAME_B_Step23= RegExID=1 start=3 stop=11 pos=0 length=0 HRM_L2_inB='AKTAL-LTD
HRM_L2_noB=AKTAL-LTD _ERROR_=1 _N_=2035185
NOTE: Invalid third argument to function SUBSTR at line 2 column 106.
HRM_L2=CIRCULAR PLANE SAW'' PERSON_CTRY_CODE=RU PERSON_NAME=OOO 'CIRCULAR PLANE SAW'' SECTOR=COMPANY
PERSON_ADDRESS= HRM_L2_Step23=CIRCULAR PLANE SAW'' PERSON_NAME_Step23=OOO 'CIRCULAR PLANE SAW''
HRM_L2_B= HRM_L2_address= chflagHRM_L2_Bv23=0 PERSON_NAME_B= PERSON_NAME_address=
chflagPERSON_NAME_Bv23=0 HRM_L2_B_Step23= PERSON_NAME_B_Step23= RegExID=1 start=21 stop=20 pos=0
length=0 HRM_L2_inB=' HRM_L2_noB=CIRCULAR PLANE SAW _ERROR_=1 _N_=2115372
NOTE: Invalid third argument to function SUBSTR at line 2 column 106.
HRM_L2=JS. CO. ORMETO-YUMZ'' PERSON_CTRY_CODE=RU PERSON_NAME=JS. CO. 'ORMETO-YUMZ'' SECTOR=COMPANY
PERSON_ADDRESS= HRM_L2_Step23=JS. CO. ORMETO-YUMZ'' PERSON_NAME_Step23=JS. CO. 'ORMETO-YUMZ''
HRM_L2_B= HRM_L2_address= chflagHRM_L2_Bv23=0 PERSON_NAME_B= PERSON_NAME_address=
chflagPERSON_NAME_Bv23=0 HRM_L2_B_Step23= PERSON_NAME_B_Step23= RegExID=1 start=22 stop=21 pos=0
length=0 HRM_L2_inB=' HRM_L2_noB=JS. CO. ORMETO-YUMZ _ERROR_=1 _N_=2120410
NOTE: Invalid third argument to function SUBSTR at line 2 column 106.
HRM_L2=OBSCHESTVO S OGRANICHENNOJ OTVETSTVENNOSTYU ''KVARTS PERSON_CTRY_CODE=RU
PERSON_NAME=OBSCHESTVO S OGRANICHENNOJ OTVETSTVENNOSTYU ''KVARTS SECTOR=COMPANY PERSON_ADDRESS=
HRM_L2_Step23=OBSCHESTVO S OGRANICHENNOJ OTVETSTVENNOSTYU ''KVARTS
PERSON_NAME_Step23=OBSCHESTVO S OGRANICHENNOJ OTVETSTVENNOSTYU ''KVARTS HRM_L2_B= HRM_L2_address=
chflagHRM_L2_Bv23=0 PERSON_NAME_B= PERSON_NAME_address= chflagPERSON_NAME_Bv23=0 HRM_L2_B_Step23=
PERSON_NAME_B_Step23= RegExID=1 start=47 stop=52 pos=0 length=0 HRM_L2_inB='KVARTS
HRM_L2_noB=OBSCHESTVO S OGRANICHENNOJ OTVETSTVENNOSTYU KVARTS _ERROR_=1 _N_=2131108
NOTE: Invalid third argument to function SUBSTR at line 2 column 106.
HRM_L2=AZOVZAGAL''MASH JOINT STOCK COMPANY PERSON_CTRY_CODE=UA
PERSON_NAME='AZOVZAGAL''MASH' JOINT STOCK COMPANY SECTOR=COMPANY PERSON_ADDRESS=
HRM_L2_Step23=AZOVZAGAL''MASH JOINT STOCK COMPANY
PERSON_NAME_Step23='AZOVZAGAL''MASH' JOINT STOCK COMPANY HRM_L2_B= HRM_L2_address=
chflagHRM_L2_Bv23=0 PERSON_NAME_B= PERSON_NAME_address= chflagPERSON_NAME_Bv23=0 HRM_L2_B_Step23=
PERSON_NAME_B_Step23= RegExID=1 start=12 stop=35 pos=0 length=0 HRM_L2_inB='MASH JOINT STOCK COMPANY
HRM_L2_noB=AZOVZAGAL MASH JOINT STOCK COMPANY _ERROR_=1 _N_=2245500
NOTE: Invalid third argument to function SUBSTR at line 2 column 106.
HRM_L2=CHERKAS''KE HIMVOLOKNO JOINT STOCK COMPANY PERSON_CTRY_CODE=UA
PERSON_NAME='CHERKAS''KE HIMVOLOKNO' JOINT STOCK COMPANY SECTOR=COMPANY PERSON_ADDRESS=
HRM_L2_Step23=CHERKAS''KE HIMVOLOKNO JOINT STOCK COMPANY
PERSON_NAME_Step23='CHERKAS''KE HIMVOLOKNO' JOINT STOCK COMPANY HRM_L2_B= HRM_L2_address=
chflagHRM_L2_Bv23=0 PERSON_NAME_B= PERSON_NAME_address= chflagPERSON_NAME_Bv23=0 HRM_L2_B_Step23=
PERSON_NAME_B_Step23= RegExID=1 start=10 stop=42 pos=0 length=0
HRM_L2_inB='KE HIMVOLOKNO JOINT STOCK COMPANY HRM_L2_noB=CHERKAS KE HIMVOLOKNO JOINT STOCK COMPANY
_ERROR_=1 _N_=2245623

According to this result, I conclude the problematic variable as

 

'CIRCULAR PLANE SAW''
''AKTAL-LTD'
DR. REDDY''S LABORATORIES LTD.
ZAKRYTOE AKTSIONERNOE OBSCHESTVO ''SKY LTD.''
''TETRA LTD.' FIRM', LIMITED-LIABILITY COMPANY
''TETRA LTD.' ' FIRM', LIMITED-LIABILITY COMPANY

I expect to put all strings and characters which included in the outermost single quotation marks into  &COMPANY_NAME._inB, and the rest into &COMPANY_NAME._noB

Name&COMPANY_NAME._inB&COMPANY_NAME._noB
CIRCULAR PLANE SAW''CIRCULAR PLANE SAW' 
'AKTAL-LTD'AKTAL-LTD 
DR. REDDY''S LABORATORIES LTD. DR. REDDY S LABORATORIES LTD.
ZAKRYTOE AKTSIONERNOE OBSCHESTVO ''SKY LTD.''SKY LTD.'ZAKRYTOE AKTSIONERNOE OBSCHESTVO
'TETRA LTD.' FIRM', LIMITED-LIABILITY COMPANYTETRA LTD.' FIRM, LIMITED-LIABILITY COMPANY
'TETRA LTD.' ' FIRM', LIMITED-LIABILITY COMPANYTETRA LTD.' FIRM, LIMITED-LIABILITY COMPANY

I have tried to use the online regex tester to fix it but does not work. Could you please give me some suggestions about this?

thanks in advance.

 

 

3 REPLIES 3
Shmuel
Garnet | Level 18

What is the difference between this post and your previous one ?

Have you tried %superq macro function ?

Alexxxxxxx
Pyrite | Level 9

Dear @Shmuel ,

 

I get the current result(and NOTE) by using the previous code. I expect to update the previous code, thereby fixing the errors.

 

I do not How to use the %superq macro function, Could you please give me more suggestions about this?

Shmuel
Garnet | Level 18

Few remarks:

 

1) In your code you are overriding your input by the output.

    Better way is to assign a different name, in order not to force restart from beginning.

   

data    step9.Patstat_total_hrm_Step3;
  set   step9.Patstat_total_hrm_Step3;

 

2) It is difficult to consult without test data.

    I assumed it is the same as in your previous post.

 

3) Debug your code out of a macro program, by:

    

%let company_name = <any name to test> ;
data check;
 set step9.Patstat_total_hrm_Step3;
     RegExID = prxparse('/<.*>|\[.*\]|\(.*\)|\{.*\}|".*"|''.*''/');
     start=1;
     stop=length(&COMPANY_NAME._Step23);
     call prxnext(RegExID, start, stop, &COMPANY_NAME._Step23, pos, length);
     do while (pos > 0);
       &COMPANY_NAME._inB = substr(&COMPANY_NAME._Step23, pos+1, length-2);
       &COMPANY_NAME._noB = prxchange('s/<.*>|\[.*\]|\(.*\)|\{.*\}|".*"|''.*''/ /', -1, &COMPANY_NAME._Step23);
       call prxnext(RegExID, start, stop, &COMPANY_NAME._Step23, pos, length);
    end;
drop RegExID pos length start stop;
run;

proc sql;
     create table PATSTAT3&no. as
     select distinct
         &COMPANY_NAME.,
         &COMPANY_NAME._Step23,
         &COMPANY_NAME._inB,
         &COMPANY_NAME._noB
     from check                     /* step9.Patstat_total_hrm_Step3 */
     where &COMPANY_NAME._inB ne ''
  ;
quit;

   4) You got notes/errors, in different observations,  like:

NOTE: Invalid third argument to function SUBSTR at line 2 column 106.HRM_L2=''AKTAL-LTD PERSON_CTRY_CODE=KZ PERSON_NAME=''AKTAL-LTD' SECTOR=COMPANY PERSON_ADDRESS=HRM_L2_Step23=''AKTAL-LTD PERSON_NAME_Step23=''AKTAL-LTD' HRM_L2_B= HRM_L2_address=chflagHRM_L2_Bv23=0 PERSON_NAME_B= PERSON_NAME_address= chflagPERSON_NAME_Bv23=0 HRM_L2_B_Step23=PERSON_NAME_B_Step23= RegExID=1 start=3 stop=11 pos=0 length=0 HRM_L2_inB='AKTAL-LTDHRM_L2_noB=AKTAL-LTD _ERROR_=1 _N_=2035185

the only line with substr function is:

do while (pos > 0);
    &COMPANY_NAME._inB = substr(&COMPANY_NAME._Step23, pos+1, length-2);

where 3rd argument is defined as length-2,  while, according to the note, length=0 and length-2 is negative, an invalid length.

You should check those observation (for example _N_=2035185) why is length=0. Is there no company name ?

Add code to deal with those observations.

 

5) You assigned argument no= in the macro but never used it (&no does not exist in your code).

     What you intended to do with it ?

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 659 views
  • 0 likes
  • 2 in conversation