DATA Step, Macro, Functions and more

Datastep lookup with FIND using second dataset

Reply
Occasional Contributor
Posts: 19

Datastep lookup with FIND using second dataset

Thanks for helping..

I have a simple FIND and lookup program that I can't seem to code in EG.

There are two files. The goal is to use multiple columns from table US_DATA to scan/find matching data from a single free form text field in table EU_DATA and return matching item #.

 

Program needs scan all OBS of EU_DATA field "NAME OF ITEM" using the first record and four columns starting with CONVERT in US_DATA.

If match is found on all 4 US_DATA columns then output field ITEM CODE along with CONVERT fields.

Then repeat with next US_DATA record.

 

Real issue here is that the field NAME OF ITEM is in a free format...so I cannot do a simple substring function. The data could be anywhere in the string.

 

I've tried moving all CONVERT fields to macros variables, then doing lookup with a macro call FINDIT... only to realize that my macro variable is set to the last record only. So SYMPUTX is processing ALL my records before the MACRO call.

 

I need to read one US_DATA record, move CONVERT data to a variable that I can then loop out to MACRO for lookup...

then repeat for the next US_DATA record.

 

this should be easy but I'm stuck..

 

 

my code is below along with samples of data...

thanks in advance for your help...

 

 

%macro findit;

 

data test;

set qcdata.EU_DATA;

if (find(eu_name,"&type",'i') and

find(eu_name,"&count",'i') and

find(eu_name,"&tpi",'i') and

find(eu_name,"&binder",'i'))

ge 1 then list;

 

else delete;

 

run;

quit;

%put type= &type;

%put us_type= &us_type;

%put;

%put count= &count;

%put us_count= &us_count;

%put;

%put plies=&plies;

%put us_plies=&us_plies;

%put;

%put tpi= &tpi;

%put us_tpi_dir= &us_tpi_dir;

%put;

%put binder= &binder;

%put us_binder= &us_binder;

%put;

%MEND FINDIT;

 

 

 

 

 

DATA WORK.SEARCH_VAR;

SET qcdata.us_data;

convertbinder=translate(convertbinder,'','&');

call symputx ('type',converttype);

call symputx ('count',convertcount);

call symputx ('plies',convertplies);

call symputx ('tpi',converttpi);

call symputx ('binder',convertbinder);

call symputx ('us_type',type);

call symputx ('us_count',count);

call symputx ('us_plies',plies);

call symputx ('us_tpi_dir',tpi_dir);

call symputx ('us_binder',binder);

call execute('%FINDIT');

run;

quit;

 

 

 

US_DATA SAMPLE:

ConvertTypeConvertCountConvertPliesConvertTPI
EC4341X2S150
EC52.81X2Z040
EC5111X2S118
EC5111X2S158
EC5111X2S158
EC5111X2S158
EC5111X2S189
EC55.51X2S189
EC55.51X2S189
EC6341X2S150
EC61361X2S118
EC61361X2S118
EC61361X2S118
EC6681X2S118
EC6681X2S118
EC6681X2S118
EC6681X2S118
EC6681X2S118
EC7221X24.5S
EC91461X2S150

 

EU_DATA SAMPLE:

ITEM CODENAME OF ITEM
FI00798CHAVERGALEC5,5+PES50=11TEX
FI01477EC511X2ITGKR
FI01512EC511X2S180273ITG
FI01509EC511X2S180273MW
FI01527EC511X2S180620-1CULIM
FI01488EC511X2S180620DMITG
FI01457EC511Z32AB7NAG3K
FI01416EC511Z36273FULLT
FI01455EC511Z40331TGI
FI01434EC511Z40620AGYSTEA
FI01467EC511Z40620-1PORCHDM
FI01474EC511Z40620DMPORCH
FI01466EC511Z40BD2Y52CPIC
FI01459EC511Z40BFCPIC
FI01419EC511Z40BFSTF-C
FI01438EC511Z40DJ420JUSHI
FI01470EC511Z40S3GUANG
FI01409EC511Z40Y5CPIC
FI01531EC511Z40Y53BD1CPIC
FI01450EC511Z40Y6TAISH
FI01480EC52,8X2ITGKR
FI01486EC52,8X2S180620-1ITG
FI01476EC55,5X2ITGKR
FI01478EC55,5X2S180620-1AITG
FI01469EC55,5Z32PHPORCH
FI01415EC55,5Z36273FULLT
FI01431EC55,5Z3627KLFULLT
FI01429EC55,5Z40321TGI
FI01454EC55,5Z40331TGI
FI01411EC55,5Z40620-1PORCHDM
FI01468EC55,5Z40622PORCH
FI01528EC55,5Z40S3GUANG
FI01532EC55,5Z40Y53BD2CPIC
FI01449EC55,5Z40Y6TAISH
Super User
Super User
Posts: 9,599

Re: Datastep lookup with FIND using second dataset

Post test data in the form of a datastep, and show what you want out at the end.  From what I can gather here a simple:

data _null_;
  set qcdata.us_data;
  call execute(cat('data want;  
                      if (find(eu_name,"',converttype,'") and find(eu_name,"',convertcount,'") and
                          find(eu_name,"',convertplies,'") and find(eu_name,"',converttpi,'")) then list;
                      else delete;
                    run;'));
run;
                  

Its difficult to see without input and output, but the above will generate a datastep for each row in us_data.

Occasional Contributor
Posts: 19

Re: Datastep lookup with FIND using second dataset

thank you very much...I see how this produces each data step/OBS(see log below)...but how do I then use this output to read the EU_DATA file?

 

 

NOTE: CALL EXECUTE generated line.

1 + data want; if (find(eu_name,"ARCK ") and find(eu_name,"32.5") and

find(eu_name," ") and find(eu_name,"Z040")) then list; else delete; run;

NOTE: Numeric values have been converted to character values at the places given by: (Line)Smiley SadColumn).

1:45 1:71 1:121 1:147

NOTE: Variable eu_name is uninitialized.

NOTE: The data set WORK.WANT has 0 observations and 1 variables.

NOTE: DATA statement used (Total process time):

real time 0.00 seconds

cpu time 0.00 seconds

 

2 + data want; if (find(eu_name,"EC4 ") and find(eu_name,"34 ") and

find(eu_name," ") and find(eu_name,"Z040")) then list; else delete; run;

NOTE: Numeric values have been converted to character values at the places given by: (Line)Smiley SadColumn).

2:45 2:71 2:121 2:147

NOTE: Variable eu_name is uninitialized.

NOTE: The data set WORK.WANT has 0 observations and 1 variables.

NOTE: DATA statement used (Total process time):

real time 0.00 seconds

cpu time 0.00 seconds

 

Super User
Super User
Posts: 9,599

Re: Datastep lookup with FIND using second dataset

I am not sure what you mean by "but how do I then use this output to read the EU_DATA file?"  The datastep gets created for each line in us_data, then those datasteps get executed to create the dataset with the filtered information.  At them moment each iteration overwrites the previous (as they are all called want), if you changed it to:

data _null_;
  set qcdata.us_data;
  call execute(cat('data want',strip(put(_n_,best.)),';  
                      if (find(eu_name,"',converttype,'") and find(eu_name,"',convertcount,'") and
                          find(eu_name,"',convertplies,'") and find(eu_name,"',converttpi,'")) then list;
                      else delete;
                    run;'));
run;

 

You would see a dataset for each iteration.

Occasional Contributor
Posts: 19

Re: Datastep lookup with FIND using second dataset

sorry for the confusion...and I appreciate your help...

 

I see how the data step gets created...but now I need to use that data step to read another file....EU_DATA.

 

EU_DATA contains the field EU_NAME referenced in the IF statement.

 

I don't know how to branch out of the code you sent me to then read the EU_DATA file then iterate for the next data step...

 

would this work by adding the SET statement to your code?

 

Data _null_;

66 set qcdata.us_data;

67 call execute(cat('data want; set qcdata.EU_DATA;

68 if (find(eu_name,"',converttype,'") and find(eu_name,"',convertcount,'") and

69 find(eu_name,"',convertplies,'") and find(eu_name,"',converttpi,'")) then list;

70 else delete;

71 run;'));

72 run;

 

 

hope that makes sense...

Ask a Question
Discussion stats
  • 4 replies
  • 85 views
  • 0 likes
  • 2 in conversation