'Dear all,
when I use the following MACRO codes, I only get the observations which can create a '&COMPANY_NAME._inB' variable.
do you know why I can not keep all observations?
Data Step9.Datastream_gb_Step3;
Set Step9.Datastream_gb_Step23;
Run;
%MACRO CompnayNameM(no=,Company_name=);
data Step9.Datastream_gb_Step3;
format &Company_name._Step23 &Company_name._inB &Company_name._noB;
set Step9.Datastream_gb_Step3;
RegExID = prxparse('/<\w*>|\[\w*\]|\(\w*\)|"\w*"|''\w*''/');
start=1;
call prxnext(RegExID, start, length(&COMPANY_NAME._Step23), &COMPANY_NAME._Step23, pos, length);
&COMPANY_NAME._noB=prxchange('s/<\w*>|\[\w*\]|\(\w*\)|"\w*"|''\w*''/ /', -1, &COMPANY_NAME._Step23);
do while (pos > 0);
&COMPANY_NAME._inB = substr(&COMPANY_NAME._Step23, pos+1, length-2);
output;
call prxnext(RegExID, start, length(&COMPANY_NAME._Step23), &COMPANY_NAME._Step23, pos, length);
end;
drop RegExID start pos length;
run;
%MEND CompnayNameM;
%CompnayNameM(no=1,Company_name=NAME)
%CompnayNameM(no=2,Company_name=COMPANY_NAME)
%CompnayNameM(no=3,Company_name=COMPANY_NAME_1)
%CompnayNameM(no=4,Company_name=COMPANY_NAME___SHORT)
%CompnayNameM(no=5,Company_name=PREVIOUS_NAME)
run;
do you know why?
I think that is the question to be answered.
look at your format
@Alexxxxxxx wrote:
Dear Reeza,
thanks for your advice.
Could you please explain it in detail? Where should I add the ELSE output? I have added an example in attachment. Could you please have a look?
Do you understand this code, or is it someone else's you're trying to modify?
You have a DO loop that only gets entered if POS is greater than 0. Your OUTPUT statement is in the DO loop. If you don't enter the DO loop you don't get any output. If you still need an output, you have to add a condition to deal with that.
You really really should be commenting your code as well, it helps a lot, especially while learning.
Are you asking why some input observations are not reflected in the output dataset?
Perhaps it is because of your DO/WHILE construct?
Try switching to a DO UNTIL construct instead. That way the first result is always output whether a match is found or not.
Dear Tom,
thanks for your advice.
Do you meaning
Data want;
Set Step9.Datastream_gb_Step23;
Run;
%MACRO CompnayNameM(no=,Company_name=);
data want;
format &Company_name._Step23 &Company_name._inB &Company_name._noB;
set want;
RegExID = prxparse('/<\w*>|\[\w*\]|\(\w*\)|"\w*"|''\w*''/');
start=1;
call prxnext(RegExID, start, length(&COMPANY_NAME._Step23), &COMPANY_NAME._Step23, pos, length);
&COMPANY_NAME._noB=prxchange('s/<\w*>|\[\w*\]|\(\w*\)|"\w*"|''\w*''/ /', -1, &COMPANY_NAME._Step23);
do until (pos > 0);
&COMPANY_NAME._inB = substr(&COMPANY_NAME._Step23, pos+1, length-2);
output;
call prxnext(RegExID, start, length(&COMPANY_NAME._Step23), &COMPANY_NAME._Step23, pos, length);
end;
drop RegExID start pos length;
run;
%MEND CompnayNameM;
%CompnayNameM(no=1,Company_name=NAME)
%CompnayNameM(no=2,Company_name=COMPANY_NAME)
%CompnayNameM(no=3,Company_name=COMPANY_NAME_1)
%CompnayNameM(no=4,Company_name=COMPANY_NAME___SHORT)
%CompnayNameM(no=5,Company_name=PREVIOUS_NAME)
run;
, however, I get
1470 %MEND CompnayNameM;
1471
1472 %CompnayNameM(no=1,Company_name=NAME)
NOTE: Invalid third argument to function SUBSTR at line 2 column 193.
NOTE: Invalid third argument to function SUBSTR at line 2 column 193.
NOTE: Invalid third argument to function SUBSTR at line 2 column 193.
NOTE: Invalid third argument to function SUBSTR at line 2 column 193.
NOTE: Invalid third argument to function SUBSTR at line 2 column 193.
NOTE: Invalid third argument to function SUBSTR at line 2 column 193.
NOTE: Invalid third argument to function SUBSTR at line 2 column 193.
NOTE: Invalid third argument to function SUBSTR at line 2 column 193.
NOTE: Invalid third argument to function SUBSTR at line 2 column 193.
....
I have added an example in attachment. Could you please have a look?
Clean up the logic a little. You can also eliminate the duplicate calls to search.
data want;
set have ;
&COMPANY_NAME._noB=prxchange('s/<\w*>|\[\w*\]|\(\w*\)|"\w*"|''\w*''/ /'
, -1, &COMPANY_NAME._Step23);
pos=0;
do n=1 by 1 until (pos = 0);
call prxnext('/<\w*>|\[\w*\]|\(\w*\)|"\w*"|''\w*''/'
,pos+1, length(&COMPANY_NAME._Step23), &COMPANY_NAME._Step23, pos, length);
if pos=0 then call missing(&COMPANY_NAME._inB);
else &COMPANY_NAME._inB = substr(&COMPANY_NAME._Step23, pos+1, length-2);
if pos or n=1 then output;
end;
drop pos length;
run;
I am not sure why you had the FORMAT statement and why it was BEFORE the SET statement?
Also why were you reading and writing the same dataset? That will make it hard to debug what is happening.
Dear @Tom
thanks for your advice.
by the code, I get
1453 %MACRO CompnayNameM(no=,Company_name=);
1454
1455 data step9.Patstat_gb_hrm_Step3;
1456 set step9.Patstat_gb_hrm_Step3;
1457 &COMPANY_NAME._noB=prxchange('s/<\w*>|\[\w*\]|\(\w*\)|"\w*"|''\w*''/ /', -1,
1457! &COMPANY_NAME._Step23);
1458 pos=0;
1459 do n=1 by 1 until (pos = 0);
1460 call prxnext('/<\w*>|\[\w*\]|\(\w*\)|"\w*"|''\w*''/',pos+1, length(&COMPANY_NAME._Step23),
1460! &COMPANY_NAME._Step23, pos, length);
1461 if pos=0 then call missing(&COMPANY_NAME._inB);
1462 else &COMPANY_NAME._inB = substr(&COMPANY_NAME._Step23, pos+1, length-2);
1463 if pos or n=1 then output;
1464 end;
1465 drop pos length;
1466 run;
1467
1468 proc sql;
1469 create table PATSTAT&no. as
1470 select distinct
1471 &COMPANY_NAME.,
1472 &COMPANY_NAME._Step23,
1473 &COMPANY_NAME._inB,
1474 &COMPANY_NAME._noB
1475 from step9.Patstat_gb_hrm_Step3
1476 where &COMPANY_NAME._inB ne ''
1477 ;
1478 quit;
1479
1480 %MEND CompnayNameM;
1481
1482 %CompnayNameM(no=1,Company_name=HRM_L2)
NOTE: Line generated by the invoked macro "COMPNAYNAMEM".
1 data step9.Patstat_gb_hrm_Step3; set step9.Patstat_gb_hrm_Step3;
1 ! &COMPANY_NAME._noB=prxchange('s/<\w*>|\[\w*\]|\(\w*\)|"\w*"|''\w*''/ /', -1,
1 ! &COMPANY_NAME._Step23); pos=0; do n=1 by 1 until (pos = 0); call prxnext
-------
134
WARNING 134-185: Argument #2 is an expression, which cannot be updated by the PRXNEXT subroutine call.
NOTE: Line generated by the invoked macro "COMPNAYNAMEM".
2 ('/<\w*>|\[\w*\]|\(\w*\)|"\w*"|''\w*''/',pos+1, length(&COMPANY_NAME._Step23),
---------------------------------------
135
2 ! &COMPANY_NAME._Step23, pos, length); if pos=0 then call missing(&COMPANY_NAME._inB);
2 ! else &COMPANY_NAME._inB = substr(&COMPANY_NAME._Step23, pos+1, length-2); if
ERROR 135-185: Attempt to change the value of the constant '/<\w*>|\[\w*\]|\(\w*\)|"\w*"|''\w*''/' in
the PRXNEXT subroutine call.
NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column).
2:2 2:199
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set STEP9.PATSTAT_GB_HRM_STEP3 may be incomplete. When this step was stopped there
were 0 observations and 25 variables.
WARNING: Data set STEP9.PATSTAT_GB_HRM_STEP3 was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds
ERROR: The following columns were not found in the contributing tables: HRM_L2_inB, HRM_L2_noB.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE SQL used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
1483 %CompnayNameM(no=2,Company_name=PERSON_NAME)
NOTE: Line generated by the invoked macro "COMPNAYNAMEM".
1 data step9.Patstat_gb_hrm_Step3; set step9.Patstat_gb_hrm_Step3;
1 ! &COMPANY_NAME._noB=prxchange('s/<\w*>|\[\w*\]|\(\w*\)|"\w*"|''\w*''/ /', -1,
1 ! &COMPANY_NAME._Step23); pos=0; do n=1 by 1 until (pos = 0); call prxnext
-------
134
WARNING 134-185: Argument #2 is an expression, which cannot be updated by the PRXNEXT subroutine call.
NOTE: Line generated by the invoked macro "COMPNAYNAMEM".
2 ('/<\w*>|\[\w*\]|\(\w*\)|"\w*"|''\w*''/',pos+1, length(&COMPANY_NAME._Step23),
---------------------------------------
135
2 ! &COMPANY_NAME._Step23, pos, length); if pos=0 then call missing(&COMPANY_NAME._inB);
2 ! else &COMPANY_NAME._inB = substr(&COMPANY_NAME._Step23, pos+1, length-2); if
ERROR 135-185: Attempt to change the value of the constant '/<\w*>|\[\w*\]|\(\w*\)|"\w*"|''\w*''/' in
the PRXNEXT subroutine call.
NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column).
2:2 2:199
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set STEP9.PATSTAT_GB_HRM_STEP3 may be incomplete. When this step was stopped there
were 0 observations and 25 variables.
WARNING: Data set STEP9.PATSTAT_GB_HRM_STEP3 was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds
ERROR: The following columns were not found in the contributing tables: PERSON_NAME_inB,
PERSON_NAME_noB.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE SQL used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
1484 run;
besides,
when I face the value like
HARDY(FRNS.)'A'
it cannot be processed by the code
I expect to get
name | COMPANY_NAME_inB | COMPANY_NAME_noB |
HARDY(FRNS.)'A' | FRNS. | HARDY |
HARDY(FRNS.)'A' | A | HARDY |
However, I only get
name | COMPANY_NAME_inB | COMPANY_NAME_noB |
HARDY(FRNS.)'A' | A | HARDY(FRNS.) |
Could you please give me some suggestions about these two problem?
Because you only get to the Output statement if the RegEx returns a match.
I start to feel that this is may-be one of these cases where one should first get the design approach right and only use macro coding if it can't be done with data steps and Proc's.
I suggest you post representative sample data (a tested SAS data step creating the data), show us the desired result for this sample data and then explain us what you're trying to implement/the logic from the have to the want data.
@Alexxxxxxx wrote:
Could you please explain it for me in detail?
Sorry, this was my answer to your initial question. I've missed all the discussion that already happened.
But looking into your current code: It looks over complicated and I still believe instead of us trying to help you fix this code it would be better if you provide sample data, desired outcome and explanation what you want to achieve so we can help you to come up with a leaner and cleaner coding approach.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.