BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Alexxxxxxx
Pyrite | Level 9

Dear all,

I am trying to find the company abbreviation that appeared first, by using the SMALLEST function. So, I expect to get the smallest number of POS_: when the POS not equals 0.

However, by using the following code,

data HAVE;
input NAME :& $800.;
cards;
!MAGINETHIS RENOVATIONS
!OBAC
#9060-0495 QUEBEC
3M INNOVATIVE PROPERTIES CO NIPPON TELEGRAPH & TELEPHONE CORP 
A. STUCKI CO A DELAWARE CORP
AMANA CO LP A LTD PARTNERSHIP UNDER THE LAWS OF USA-DELAWARE
FEDERAL-HOFFMAN INC D.B.A. FEDERAL CARTRIDGE CO
3M CO (MINNESOTA MINING AND MANUFACTURING COMPANY) 
3M CO (MINNESOTA MINING AND MANUFACTURING COMPANY 
3M CO (MINNESOTA MINING AND MANUFACTURING COMPANY) A LTD IN UK 
CRANE PLASTICS CO LTD PARTNERSHIP
ABCCC DEF ALME
run; 

data want;
set have;
patternID = prxparse('/ AB /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_AB,LEN_AB);
		if POS_AB  =0 then POS_AB  =.;
		if LEN_AB  =0 then LEN_AB  =.;
		patternID = prxparse('/ AG /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_AG,LEN_AG);
		if POS_AG  =0 then POS_AG  =.;
		if LEN_AG  =0 then LEN_AG  =.;
		patternID = prxparse('/ AS /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_AS,LEN_AS);
		if POS_AS  =0 then POS_AS  =.;
		if LEN_AS  =0 then LEN_AS  =.;
		patternID = prxparse('/ BV /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_BV,LEN_BV);
		if POS_BV  =0 then POS_BV  =.;
		if LEN_BV  =0 then LEN_BV  =.;
		patternID = prxparse('/ BV IO /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_BV_IO,LEN_BV_IO);
		if POS_BV_IO  =0 then POS_BV_IO  =.;
		if LEN_BV_IO  =0 then LEN_BV_IO  =.;
		patternID = prxparse('/ CO /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_CO,LEN_CO);
		if POS_CO=0 then POS_CO=.;
		if LEN_CO=0 then LEN_CO=.;
		patternID = prxparse('/ CO LTD /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_CO_LTD,LEN_CO_LTD);
		if POS_CO_LTD=0 then POS_CO_LTD=.;
		if LEN_CO_LTD=0 then LEN_CO_LTD=.;
		patternID = prxparse('/ CO BV /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_CO_BV,LEN_CO_BV);
		if POS_CO_BV=0 then POS_CO_BV=.;
		if LEN_CO_BV=0 then LEN_CO_BV=.;
		patternID = prxparse('/ CO KG /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_CO_KG,LEN_CO_KG);
		if POS_CO_KG=0 then POS_CO_KG=.;
		if LEN_CO_KG=0 then LEN_CO_KG=.;
		patternID = prxparse('/ CO LP /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_CO_LP,LEN_CO_LP);
		if POS_CO_LP =0 then POS_CO_LP =.;
		if LEN_CO_LP =0 then LEN_CO_LP =.;
		patternID = prxparse('/ CO LC /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_CO_LC,LEN_CO_LC);
		if POS_CO_LC =0 then POS_CO_LC =.;
		if LEN_CO_LC =0 then LEN_CO_LC =.;
		patternID = prxparse('/ CO LP /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_CO_LP,LEN_CO_LP);
		if POS_CO_LP =0 then POS_CO_LP =.;
		if LEN_CO_LP =0 then LEN_CO_LP =.;
		patternID = prxparse('/ CO INC /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_CO_INC,LEN_CO_INC);
		if POS_CO_INC =0 then POS_CO_INC =.;
		if LEN_CO_INC =0 then LEN_CO_INC =.;
		patternID = prxparse('/ CO LLC /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_CO_LLC,LEN_CO_LLC);
		if POS_CO_LLC =0 then POS_CO_LLC =.;
		if LEN_CO_LLC =0 then LEN_CO_LLC =.;
		patternID = prxparse('/ CORP /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_CORP,LEN_CORP);
		if POS_CORP =0 then POS_CORP =.;
		if LEN_CORP =0 then LEN_CORP =.;
		patternID = prxparse('/ COOP /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_COOP,LEN_COOP);
		if POS_COOP =0 then POS_COOP =.;
		if LEN_COOP =0 then LEN_COOP =.;
		patternID = prxparse('/ CV /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_CV,LEN_CV);
		if POS_CV  =0 then POS_CV  =.;
		if LEN_CV  =0 then LEN_CV  =.;
		patternID = prxparse('/ CY /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_CY,LEN_CY);
		if POS_CY  =0 then POS_CY  =.;
		if LEN_CY  =0 then LEN_CY  =.;
		patternID = prxparse('/ DOO /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_DOO,LEN_DOO);
		if POS_DOO  =0 then POS_DOO  =.;
		if LEN_DOO  =0 then LEN_DOO  =.;
		patternID = prxparse('/ EG /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_EG,LEN_EG);
		if POS_EG  =0 then POS_EG  =.;
		if LEN_EG  =0 then LEN_EG  =.;
		patternID = prxparse('/ HB /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_HB,LEN_HB);
		if POS_HB  =0 then POS_HB  =.;
		if LEN_HB  =0 then LEN_HB  =.;
		patternID = prxparse('/ IB /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_IB,LEN_IB);
		if POS_IB  =0 then POS_IB  =.;
		if LEN_IB  =0 then LEN_IB  =.;
		patternID = prxparse('/ INC /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_INC,LEN_INC);
		if POS_INC =0 then POS_INC =.;
		if LEN_INC =0 then LEN_INC =.;
		patternID = prxparse('/ INC LTD /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_INC_LTD,LEN_INC_LTD);
		if POS_INC_LTD=0 then POS_INC_LTD=.;
		if LEN_INC_LTD=0 then LEN_INC_LTD=.;
		patternID = prxparse('/ INK /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_INK,LEN_INK);
		if POS_INK =0 then POS_INK =.;
		if LEN_INK =0 then LEN_INK =.;
		patternID = prxparse('/ INK CORP/');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_INK_CORP,LEN_INK_CORP);
		if POS_INK_CORP =0 then POS_INK_CORP =.;
		if LEN_INK_CORP =0 then LEN_INK_CORP =.;
		patternID = prxparse('/ KB /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_KB,LEN_KB);
		if POS_KB  =0 then POS_KB  =.;
		if LEN_KB  =0 then LEN_KB  =.;
		patternID = prxparse('/ KS /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_KS,LEN_KS);
		if POS_KS  =0 then POS_KS  =.;
		if LEN_KS  =0 then LEN_KS  =.;
		patternID = prxparse('/ KG /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_KG,LEN_KG);
		if POS_KG  =0 then POS_KG  =.;
		if LEN_KG  =0 then LEN_KG  =.;
		patternID = prxparse('/ KK /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_KK,LEN_KK);
		if POS_KK  =0 then POS_KK  =.;
		if LEN_KK  =0 then LEN_KK  =.;
		patternID = prxparse('/ LC /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_LC,LEN_LC);
		if POS_LC  =0 then POS_LC  =.;
		if LEN_LC  =0 then LEN_LC  =.;
		patternID = prxparse('/ LP /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_LP,LEN_LP);
		if POS_LP  =0 then POS_LP  =.;
		if LEN_LP  =0 then LEN_LP  =.;
		patternID = prxparse('/ LTD /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_LTD,LEN_LTD);
		if POS_LTD =0 then POS_LTD =.;
		if LEN_LTD =0 then LEN_LTD =.;
		patternID = prxparse('/ LTD LP /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_LTD_LP,LEN_LTD_LP);
		if POS_LTD_LP =0 then POS_LTD_LP =.;
		if LEN_LTD_LP =0 then LEN_LTD_LP =.;
		patternID = prxparse('/ LLP /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_LLP,LEN_LLP);
		if POS_LLP =0 then POS_LLP =.;
		if LEN_LLP =0 then LEN_LLP =.;
		patternID = prxparse('/ LLLP /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_LLLP,LEN_LLLP);
		if POS_LLLP =0 then POS_LLLP =.;
		if LEN_LLLP =0 then LEN_LLLP =.;
		patternID = prxparse('/ LLC /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_LLC,LEN_LLC);
		if POS_LLC =0 then POS_LLC =.;
		if LEN_LLC =0 then LEN_LLC =.;
		patternID = prxparse('/ MBH /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_MBH,LEN_MBH);
		if POS_MBH =0 then POS_MBH =.;
		if LEN_MBH =0 then LEN_MBH =.;
		patternID = prxparse('/ NA /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_NA,LEN_NA);
		if POS_NA  =0 then POS_NA  =.;
		if LEN_NA  =0 then LEN_NA  =.;
		patternID = prxparse('/ NV /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_NV,LEN_NV);
		if POS_NV  =0 then POS_NV  =.;
		if LEN_NV  =0 then LEN_NV  =.;
		patternID = prxparse('/ NV IO /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_NV_IO,LEN_NV_IO);
		if POS_NV_IO = 0 then POS_NV_IO =.;
		if LEN_NV_IO = 0 then LEN_NV_IO =.;
		patternID = prxparse('/ NL /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_NL,LEN_NL);
		if POS_NL  =0 then POS_NL  =.;
		if LEN_NL  =0 then LEN_NL  =.;
		patternID = prxparse('/ SA /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_SA,LEN_SA);
		if POS_SA  =0 then POS_SA  =.;
		if LEN_SA  =0 then LEN_SA  =.;
		patternID = prxparse('/ SA DE CV /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_SA_DEC,LEN_SA_DEC);
		if POS_SA_DEC =0 then POS_SA_DEC =.;
		if LEN_SA_DEC =0 then LEN_SA_DEC =.;
		patternID = prxparse('/ SAS /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_SAS,LEN_SAS);
		if POS_SAS =0 then POS_SAS =.;
		if LEN_SAS =0 then LEN_SAS =.;
		patternID = prxparse('/ SAS DE CV /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_SAS_DEC,LEN_SAS_DEC);
		if POS_SAS_DEC=0 then POS_SAS_DEC=.;
		if LEN_SAS_DEC=0 then LEN_SAS_DEC=.;
		patternID = prxparse('/ SAB /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_SAB,LEN_SAB);
		if POS_SAB =0 then POS_SAB =.;
		if LEN_SAB =0 then LEN_SAB =.;
		patternID = prxparse('/ SRO /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_SRO,LEN_SRO);
		if POS_SRO =0 then POS_SRO =.;
		if LEN_SRO =0 then LEN_SRO =.;
		patternID = prxparse('/ SRL /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_SRL,LEN_SRL);
		if POS_SRL =0 then POS_SRL =.;
		if LEN_SRL =0 then LEN_SRL =.;
		patternID = prxparse('/ SAL /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_SAL,LEN_SAL);
		if POS_SAL =0 then POS_SAL =.;
		if LEN_SAL =0 then LEN_SAL =.;
		patternID = prxparse('/ SAU /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_SAU,LEN_SAU);
		if POS_SAU =0 then POS_SAU =.;
		if LEN_SAU =0 then LEN_SAU =.;
		patternID = prxparse('/ SPA /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_SPA,LEN_SPA);
		if POS_SPA =0 then POS_SPA =.;
		if LEN_SPA =0 then LEN_SPA =.;
		patternID = prxparse('/ SPZOO /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_SPZOO,LEN_SPZOO);
		if POS_SPZOO =0 then POS_SPZOO =.;
		if LEN_SPZOO =0 then LEN_SPZOO =.;
		patternID = prxparse('/ SPOL /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_SPOL,LEN_SPOL);
		if POS_SPOL =0 then POS_SPOL =.;
		if LEN_SPOL =0 then LEN_SPOL =.;
		patternID = prxparse('/ SNC /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_SNC,LEN_SNC);
		if POS_SNC =0 then POS_SNC =.;
		if LEN_SNC =0 then LEN_SNC =.;
		patternID = prxparse('/ SCA /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_SCA,LEN_SCA);
		if POS_SCA =0 then POS_SCA =.;
		if LEN_SCA =0 then LEN_SCA =.;
		patternID = prxparse('/ SL /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_SL,LEN_SL);
		if POS_SL  =0 then POS_SL  =.;
		if LEN_SL  =0 then LEN_SL  =.;
		patternID = prxparse('/ SC /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_SC,LEN_SC);
		if POS_SC  =0 then POS_SC  =.;
		if LEN_SC  =0 then LEN_SC  =.;
		patternID = prxparse('/ SP /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_SP,LEN_SP);
		if POS_SP  =0 then POS_SP  =.;
		if LEN_SP  =0 then LEN_SP  =.;
		patternID = prxparse('/ PC /');
		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_PC,LEN_PC);
		if POS_PC  =0 then POS_PC  =.;
		if LEN_PC  =0 then LEN_PC  =.;


		array x[*] POS_: ;
		do k=1 to patternID;
			POSITION = smallest(k, of x[*]);
			if POSITION >0 then output;
		end;
		drop k;

	run;

 the result shows

 

NOTE: Argument 1 to function
      SMALLEST(40,.,.,.,.,.,25,.,.,.,.,.,.,.,57,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.) at
      line 5415 column 24 is invalid.
NAME=3M INNOVATIVE PROPERTIES CO NIPPON TELEGRAPH & TELEPHONE CORP patternID=40 POS_AB=. LEN_AB=.
POS_AG=. LEN_AG=. POS_AS=. LEN_AS=. POS_BV=. LEN_BV=. POS_BV_IO=. LEN_BV_IO=. POS_CO=25 LEN_CO=4
POS_CO_LTD=. LEN_CO_LTD=. POS_CO_BV=. LEN_CO_BV=. POS_CO_KG=. LEN_CO_KG=. POS_CO_LP=. LEN_CO_LP=.
POS_CO_LC=. LEN_CO_LC=. POS_CO_INC=. LEN_CO_INC=. POS_CO_LLC=. LEN_CO_LLC=. POS_CORP=57 LEN_CORP=6
POS_COOP=. LEN_COOP=. POS_CV=. LEN_CV=. POS_CY=. LEN_CY=. POS_DOO=. LEN_DOO=. POS_EG=. LEN_EG=.
POS_HB=. LEN_HB=. POS_IB=. LEN_IB=. POS_INC=. LEN_INC=. POS_INC_LTD=. LEN_INC_LTD=. POS_INK=.
LEN_INK=. POS_INK_CORP=. LEN_INK_CORP=. POS_KB=. LEN_KB=. POS_KS=. LEN_KS=. POS_KG=. LEN_KG=. POS_KK=.
LEN_KK=. POS_LC=. LEN_LC=. POS_LP=. LEN_LP=. POS_LTD=. LEN_LTD=. POS_LTD_LP=. LEN_LTD_LP=. POS_LLP=.
LEN_LLP=. POS_LLLP=. LEN_LLLP=. POS_LLC=. LEN_LLC=. POS_MBH=. LEN_MBH=. POS_NA=. LEN_NA=. POS_NV=.
LEN_NV=. k=41 POSITION=. _ERROR_=1 _N_=1
NOTE: Argument 1 to function
      SMALLEST(40,.,.,.,.,.,10,.,.,.,.,.,.,.,24,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.) at
      line 5415 column 24 is invalid.
NAME=A. STUCKI CO A DELAWARE CORP patternID=40 POS_AB=. LEN_AB=. POS_AG=. LEN_AG=. POS_AS=. LEN_AS=.
POS_BV=. LEN_BV=. POS_BV_IO=. LEN_BV_IO=. POS_CO=10 LEN_CO=4 POS_CO_LTD=. LEN_CO_LTD=. POS_CO_BV=.
LEN_CO_BV=. POS_CO_KG=. LEN_CO_KG=. POS_CO_LP=. LEN_CO_LP=. POS_CO_LC=. LEN_CO_LC=. POS_CO_INC=.
LEN_CO_INC=. POS_CO_LLC=. LEN_CO_LLC=. POS_CORP=24 LEN_CORP=6 POS_COOP=. LEN_COOP=. POS_CV=. LEN_CV=.
POS_CY=. LEN_CY=. POS_DOO=. LEN_DOO=. POS_EG=. LEN_EG=. POS_HB=. LEN_HB=. POS_IB=. LEN_IB=. POS_INC=.
LEN_INC=. POS_INC_LTD=. LEN_INC_LTD=. POS_INK=. LEN_INK=. POS_INK_CORP=. LEN_INK_CORP=. POS_KB=.
LEN_KB=. POS_KS=. LEN_KS=. POS_KG=. LEN_KG=. POS_KK=. LEN_KK=. POS_LC=. LEN_LC=. POS_LP=. LEN_LP=.
POS_LTD=. LEN_LTD=. POS_LTD_LP=. LEN_LTD_LP=. POS_LLP=. LEN_LLP=. POS_LLLP=. LEN_LLLP=. POS_LLC=.
LEN_LLC=. POS_MBH=. LEN_MBH=. POS_NA=. LEN_NA=. POS_NV=. LEN_NV=. k=41 POSITION=. _ERROR_=1 _N_=2
...

...
NOTE: Missing values were generated as a result of performing an operation on missing values.
      Each place is given by: (Number of times) at (Line):(Column).
      334 at 5415:24
NOTE: Mathematical operations could not be performed at the following places. The results of the
      operations have been set to missing values.
      Each place is given by: (Number of times) at (Line):(Column).
      9 at 5415:24
NOTE: There were 9 observations read from the data set WORK.HAVE.
NOTE: The data set WORK.WANT has 17 observations and 81 variables.
NOTE: DATA statement used (Total process time):
      real time           0.07 seconds
      cpu time            0.06 seconds

 

How can I fix it?

Could you please give me some suggestions about this?

 

Thanks in advance.

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
ChrisNZ
Tourmaline | Level 20

Note that you extract exactly the same information with this data step:

data WANT;
  set HAVE;
  NAME=compbl(NAME);
  length PATTERN $10;
  do PATTERN = 'AB','AG','AS','BV','CO','CO LTD','CO LP','CORP','INC','LP','LTD';
    POS= index(strip(NAME)||' ' ,' '||trim(PATTERN)||' ' );
    if POS then output;
  end;
run;
NAME PATTERN POS
3M INNOVATIVE PROPERTIES CO NIPPON TELEGRAPH & TELEPHONE CORP CO 25
3M INNOVATIVE PROPERTIES CO NIPPON TELEGRAPH & TELEPHONE CORP CORP 57
A. STUCKI CO A DELAWARE CORP CO 10
A. STUCKI CO A DELAWARE CORP CORP 24
AMANA CO LP A LTD PARTNERSHIP UNDER THE LAWS OF USA-DELAWARE CO 6
AMANA CO LP A LTD PARTNERSHIP UNDER THE LAWS OF USA-DELAWARE CO LP 6
AMANA CO LP A LTD PARTNERSHIP UNDER THE LAWS OF USA-DELAWARE LP 9
AMANA CO LP A LTD PARTNERSHIP UNDER THE LAWS OF USA-DELAWARE LTD 14
FEDERAL-HOFFMAN INC D.B.A. FEDERAL CARTRIDGE CO CO 45
FEDERAL-HOFFMAN INC D.B.A. FEDERAL CARTRIDGE CO INC 16
3M CO (MINNESOTA MINING AND MANUFACTURING COMPANY) CO 3
3M CO (MINNESOTA MINING AND MANUFACTURING COMPANY CO 3
3M CO (MINNESOTA MINING AND MANUFACTURING COMPANY) A LTD IN UK CO 3
3M CO (MINNESOTA MINING AND MANUFACTURING COMPANY) A LTD IN UK LTD 53
CRANE PLASTICS CO LTD PARTNERSHIP CO 15
CRANE PLASTICS CO LTD PARTNERSHIP CO LTD 15
CRANE PLASTICS CO LTD PARTNERSHIP LTD 18

 

 

 

View solution in original post

8 REPLIES 8
ballardw
Super User

From the documentation for the function:

Details

If k is missing, less than zero, or greater than the number of values, the result is a missing value and _ERROR_ is set to 1. Otherwise, if k is greater than the number of nonmissing values, the result is a missing value but _ERROR_ is not set to 1.

 

So if the K you request has is larger than the number of non-missing of x(*) in this part of your code that is what happens:

		do k=1 to patternID;
			POSITION = smallest(k, of x[*]);
			if POSITION >0 then output;
		end;

 you might try testing for a valid value of K such as

		do k=1 to patternID;
			if k le n(of x[*]) then POSITION = smallest(k, of x[*]);
			if POSITION >0 then output;
		end;

 

Alexxxxxxx
Pyrite | Level 9

Hello @ballardw ,

 

Thanks to your code.

 

However, I get 480 observations from 12 observations. Do you know why?

NOTE: There were 12 observations read from the data set WORK.HAVE.
NOTE: The data set WORK.WANT has 480 observations and 121 variables.
NOTE: DATA statement used (Total process time):
      real time           0.06 seconds
      cpu time            0.04 seconds

thanks for your attention to this matter.

 

 

 

Alexxxxxxx
Pyrite | Level 9

Hello @ballardw

 

Thanks to your code.

However, I get 480 observations from 12 observations. 

 

NOTE: There were 12 observations read from the data set WORK.HAVE.
NOTE: The data set WORK.WANT has 480 observations and 121 variables.
NOTE: DATA statement used (Total process time):
      real time           0.06 seconds
      cpu time            0.04 seconds

 

Do you know why? How can I fix it?

ballardw
Super User

@Alexxxxxxx wrote:

Hello @ballardw

 

Thanks to your code.

However, I get 480 observations from 12 observations. 

 

NOTE: There were 12 observations read from the data set WORK.HAVE.
NOTE: The data set WORK.WANT has 480 observations and 121 variables.
NOTE: DATA statement used (Total process time):
      real time           0.06 seconds
      cpu time            0.04 seconds

 

Do you know why? How can I fix it?


You get multiple output observations for some records because your OUTPUT statement in in a loop. Each time the condition is true ins  side that loop it outputs a record.

Because you don't reset the tested variable it can create multiple output records when you don't expect them.

FreelanceReinh
Jade | Level 19

Hi @Alexxxxxxx,

 

You have a duplicate pattern: The code block

		call prxsubstr(patternID,cat(strip(compbl(NAME)),' '),POS_CO_LP,LEN_CO_LP);
		if POS_CO_LP =0 then POS_CO_LP =.;
		if LEN_CO_LP =0 then LEN_CO_LP =.;

occurs twice. This is why you have 60 patternIDs (your log must be from a different code version with 40), but only 59 POS_... variables in array X, hence the "invalid argument" notes for each observation in HAVE in the last (60th) iteration of the DO loop.

 

So, the first step is to delete the above duplicate code block.

ChrisNZ
Tourmaline | Level 20

Note that you extract exactly the same information with this data step:

data WANT;
  set HAVE;
  NAME=compbl(NAME);
  length PATTERN $10;
  do PATTERN = 'AB','AG','AS','BV','CO','CO LTD','CO LP','CORP','INC','LP','LTD';
    POS= index(strip(NAME)||' ' ,' '||trim(PATTERN)||' ' );
    if POS then output;
  end;
run;
NAME PATTERN POS
3M INNOVATIVE PROPERTIES CO NIPPON TELEGRAPH & TELEPHONE CORP CO 25
3M INNOVATIVE PROPERTIES CO NIPPON TELEGRAPH & TELEPHONE CORP CORP 57
A. STUCKI CO A DELAWARE CORP CO 10
A. STUCKI CO A DELAWARE CORP CORP 24
AMANA CO LP A LTD PARTNERSHIP UNDER THE LAWS OF USA-DELAWARE CO 6
AMANA CO LP A LTD PARTNERSHIP UNDER THE LAWS OF USA-DELAWARE CO LP 6
AMANA CO LP A LTD PARTNERSHIP UNDER THE LAWS OF USA-DELAWARE LP 9
AMANA CO LP A LTD PARTNERSHIP UNDER THE LAWS OF USA-DELAWARE LTD 14
FEDERAL-HOFFMAN INC D.B.A. FEDERAL CARTRIDGE CO CO 45
FEDERAL-HOFFMAN INC D.B.A. FEDERAL CARTRIDGE CO INC 16
3M CO (MINNESOTA MINING AND MANUFACTURING COMPANY) CO 3
3M CO (MINNESOTA MINING AND MANUFACTURING COMPANY CO 3
3M CO (MINNESOTA MINING AND MANUFACTURING COMPANY) A LTD IN UK CO 3
3M CO (MINNESOTA MINING AND MANUFACTURING COMPANY) A LTD IN UK LTD 53
CRANE PLASTICS CO LTD PARTNERSHIP CO 15
CRANE PLASTICS CO LTD PARTNERSHIP CO LTD 15
CRANE PLASTICS CO LTD PARTNERSHIP LTD 18

 

 

 

ChrisNZ
Tourmaline | Level 20

What you may actually want though, is this:

data WANT;
  set HAVE;   
  NAME=compbl(NAME);
  length PATTERN $10;
  PATTERNS= 'AB,AG,AS,BV,CO LTD,CO LP,CO,CORP,INC,LP,LTD';
  %* Note that the order matters: CO comes after CO LTD or CO LP;
  MINPOS=999;
  do PATTERNO = 1 to countw(PATTERNS,',');
    PATTERN= scan(PATTERNS,PATTERNO,',');
    POS    = index(strip(NAME)||' ' ,' '||trim(PATTERN)||' ' );  
    if 0<POS<MINPOS then do;      
      MINPOS    =POS;
      MINPATTERN=PATTERN;
    end;
  end;
  keep NAME MIN:;
run;
NAME MINPOS MINPATTERN
!MAGINETHIS RENOVATIONS 999  
!OBAC 999  
#9060-0495 QUEBEC 999  
3M INNOVATIVE PROPERTIES CO NIPPON TELEGRAPH & TELEPHONE CORP 25 CO
A. STUCKI CO A DELAWARE CORP 10 CO
AMANA CO LP A LTD PARTNERSHIP UNDER THE LAWS OF USA-DELAWARE 6 CO LP
FEDERAL-HOFFMAN INC D.B.A. FEDERAL CARTRIDGE CO 16 INC
3M CO (MINNESOTA MINING AND MANUFACTURING COMPANY) 3 CO
3M CO (MINNESOTA MINING AND MANUFACTURING COMPANY 3 CO
3M CO (MINNESOTA MINING AND MANUFACTURING COMPANY) A LTD IN UK 3 CO
CRANE PLASTICS CO LTD PARTNERSHIP 15 CO LTD
ABCCC DEF ALME 999  
Alexxxxxxx
Pyrite | Level 9
Dear ChrisNZ,

Many Thanks

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 1971 views
  • 1 like
  • 4 in conversation