BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
gzr2mz39
Quartz | Level 8

I have a macro that takes two values (both are text): pattern and serial.
I have a data set with these two variables (i.e. patterns).
How would I go about reading in, for example, the first 20 rows, then the next 20 rows, etc.
Each time a group of records is inputted the results would be stored or outputted where check equals 1 or 2.

The data that I am applying this macro to is in a separate data set (i.e. all_data).

 

%macro prx(pattern,serial);
b=prxparse("&pattern");
if prxmatch(b,serial_number)>0 then do;
check=1;
serial=&serial;
if (length(serial) = length(serial_number)) then check=2;
end;
%mend;

 

Example values and how they would fit into the macro:
pattern=066957
serial=0669578
%prx(/^066957[\d{1}]/,"0669578");

 

Here's an example of the records that I would like to feed into the macro (only 2 columns):
Serial        Pattern
B266056   B26605
00308110  0030811

 

Thank you.

1 ACCEPTED SOLUTION

Accepted Solutions
ChrisNZ
Tourmaline | Level 20

This is one messy post.

So you want to generate matching code from the smaller pattern table it seems. 

If only you gave proper data in/ data out examples.

Like this?

 

data PATTERNS;
  PATTERN='066957 '; SERIAL ='0669578 '; output;
  PATTERN='B26605 '; SERIAL ='B26056 '; output;
  PATTERN='0030811'; SERIAL ='00308110'; output;
run;

data HAVE;
  SERIAL_NUMBER ='0669578 '; output;
  SERIAL_NUMBER ='B266056 '; output;  
  SERIAL_NUMBER ='0038110'; output;
run;

data _null_;
  retain CODE;
  length CODE $32000;
  set PATTERNS;
  CODE=cats( CODE
           , 'if prxmatch("/^', PATTERN, '\d{1}/", SERIAL_NUMBER)>0 then do;'
           , '  CHECK=1; '
           , '  if (', length(SERIAL), ' = length(SERIAL_NUMBER)) then CHECK=2; '
           , 'end;'
           );
  call symputx('code',CODE);
run;
 
data WANT;
  set HAVE;
  &code.;
run; 

 

SERIAL_NUMBER CHECK
669578 2
B266056 1
38110 .

View solution in original post

9 REPLIES 9
ChrisNZ
Tourmaline | Level 20

1. Why read groups of 20 records? Please clarify this point.

2. Any good reason to use a RegEx instead if the index function? It's a lot slower.

3. Variable SERIAL_NUMBER is used but never defined.

gzr2mz39
Quartz | Level 8

1) 20 is just an example. The data set I'm searching for specific patterns has over 700,000 serial numbers. So, I want to run the macro for about 20-30 patterns at a time.

2) The total number of patterns I'm searching for is about 1,000. In addition to the example I provided, I have some other regular expressions that I'm interested in as wellMaybe I'm mistaken, but RegEx is better at finding specific patterns than the index function? What I'm currently interested in is finding out is how to pipe a list of values through a macro and capture the results.

3) I'm using the macro to find various patterns within SERIAL_NUMBER. The SERIAL_NUMBER variable is found in the data set with over 700,000 records.

ChrisNZ
Tourmaline | Level 20

This is still not very clear.

Here is a canvas. What's missing?


%macro prx(pattern,serial);
  PRX=prxparse("&pattern");
  if prxmatch(PRX,SERIAL_NUMBER)>0 then do;
    CHECK=1;
    if length("&serial") = length(serial_number) then CHECK=2;
  end;
%mend;

data HAVE;
  PATTERN='066957 ';  SERIAL_NUMBER ='0669578 ';  output;
  PATTERN='B26605 ';  SERIAL_NUMBER ='B266056 ';  output;
  PATTERN='0030811';  SERIAL_NUMBER ='00308110';  output;
run;

data WANT;
  set HAVE;
  %prx(/^066957\d/, 0669578);
  drop PRX;
run;

 

gzr2mz39
Quartz | Level 8

asset_data has serial_number (see below).
There are over 700,000 records in asset_data.
I'm interested in the records where check equals 1 or 2.
My goal is to avoid calling %prx over and over again.
So, instead of calling %prx repeatedly how can I have %prx automatically take pattern and serial from patterns_data (see below).

 

data patterns_data;
PATTERN='066957 '; serial ='0669578 '; output;
PATTERN='B26605 '; serial ='B266056 '; output;
PATTERN='0030811'; serial ='00308110'; output;
run;

 

%macro prx(pattern,serial);
b=prxparse("&pattern");
if prxmatch(b,serial_number)>0 then do;
check=1;
serial=&serial;
if (length(serial) = length(serial_number)) then check=2;
end;
%mend;

 

data WANT;
set asset_data;

%prx(/^066957[\d{1}]/,"0669578");
%prx(/^B26605[\d{1}]/,"B266056");

/*etc.*/

run;

 

Thank you.

ChrisNZ
Tourmaline | Level 20

Come on. This doesn;t make sense.

 

Your RegEx in 

 

data WANT;
set asset_data;

%prx(/^066957[\d{1}]/,"0669578");
%prx(/^B26605[\d{1}]/,"B266056");

/*etc.*/

run;

 

is static.

 

The result will be the same for all records in asset_data as the records are not used.

 

 

Tom
Super User Tom
Super User

Don't pipe data through macros. Use datasets to hold data.

You could make a macro that you tell it WHICH records you want.

%macro process
(pattern_dsn=  /* Name of Pattern dataset */
,records_dsn= /* Name of Data to Test */
,out_dsn=  /* Name of dataset to create */
,first_obs=  /* First pattern number */
,n_obs=  /* Number of patterns */
);

data subset ;
  set &pattern_dsn (firstobs=&first_obs obs=&n_obs);
run;
 
....
%mend ;

As to the process of building Reg Ex strings from the patterns do that with DATA step code and NOT with macro code.

BrunoMueller
SAS Super FREQ

Macro processing happens before your code is compiled and executed. So it is not suitable for what you want to do.

 

Have a look at Proc FCMP, this allows you to write your own functions to be used in the DATA Step.

gzr2mz39
Quartz | Level 8

OK, this sounds promising.

So, how would I get two variables (i.e. "pattern" and "serial") from patterns_data in order to look for patterns in serial_number (which is located in asset_data)?

b=prxparse(/^pattern[\d{1}]/);
if prxmatch(b,serial_number)>0 then do;
check=1;
if (length(serial) = length(serial_number)) then check=2;
end;

ChrisNZ
Tourmaline | Level 20

This is one messy post.

So you want to generate matching code from the smaller pattern table it seems. 

If only you gave proper data in/ data out examples.

Like this?

 

data PATTERNS;
  PATTERN='066957 '; SERIAL ='0669578 '; output;
  PATTERN='B26605 '; SERIAL ='B26056 '; output;
  PATTERN='0030811'; SERIAL ='00308110'; output;
run;

data HAVE;
  SERIAL_NUMBER ='0669578 '; output;
  SERIAL_NUMBER ='B266056 '; output;  
  SERIAL_NUMBER ='0038110'; output;
run;

data _null_;
  retain CODE;
  length CODE $32000;
  set PATTERNS;
  CODE=cats( CODE
           , 'if prxmatch("/^', PATTERN, '\d{1}/", SERIAL_NUMBER)>0 then do;'
           , '  CHECK=1; '
           , '  if (', length(SERIAL), ' = length(SERIAL_NUMBER)) then CHECK=2; '
           , 'end;'
           );
  call symputx('code',CODE);
run;
 
data WANT;
  set HAVE;
  &code.;
run; 

 

SERIAL_NUMBER CHECK
669578 2
B266056 1
38110 .

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 1200 views
  • 0 likes
  • 4 in conversation