BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Mishka1
Fluorite | Level 6

Base SAS 9.4--

I have an annoying compliance task that I'm trying to automate.

I need to find an account number (number stored as text or as a number) anywhere it appears in any dataset and in any variable in a library.

 

For example, I will need to find the value 1234567891 anywhere it appears in an entire library. It could be in one dataset in a text varible in the middle of a paragraph such as "This account, 1234567891, needs to be closed" or it could be stored in a different or same dataset in a variable that stores account numbers as numbers.

 

So far, I have generated a list of the datasets and could do something similar with each variable in each dataset using the following code. I was then going to create a loop to look for the account number in each variable in each dataset. When I found a match, it would write the "dataset, variable and obs number" to a table each time and move on to the next value to search.

I am not skilled enough to even get close. Is anyone able to assist with the looping logic?

Proc SQL noprint; /*** Create list of all the datasets to look through. ***/

Select 'work.' || memname as ViewName 

into :TableNames separated by ' '

From sashelp.vmember

Where UPCASE(LIBNAME)='WORK' 

and UPCASE(MEMTYPE)='DATA' 

;quit;

%put &TableNames; 

 

 

Thank you so much!

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User

Oh. You have low version SAS.

Try ANY() function .

 



data a;
input a : $40. b c;
cards;
Thisaccount,1234567891 1234567891 1234567891
wwrew 232442 343
;
run;
data b;
input aa : $40. bb xxxx : $20.;
cards;
qweqwe 233 2323
Thisaccount,1234567891 1234567891 1234567891
;
run;
data c;
input c : $40. bbbbbb;
cards;
Thisaccount 1234567891
Thisaccount,1234567891 2324
;
run;

proc iml;
dsn='work.'+datasets('work');
do i=1 to nrow(dsn);
use (dsn[i]);
 read all var _char_ into x[c=vnames];
 flag=(find(x,'1234567891')^=0);
 do j=1 to nrow(flag);
  if any(flag[j,]=1) then do;
    loc=loc(flag[j,]);
    temp=vnames[loc];
    vname=vname//temp;
    obs=obs//repeat(j,nrow(temp));
    table=table//repeat(dsn[i],nrow(temp));
  end;
 end;
 
 read all var _num_ into x[c=vnames];
 flag=(x=1234567891);
 do j=1 to nrow(flag);
  if any(flag[j,]=1) then do; 
    loc=loc(flag[j,]);
    temp=vnames[loc]; 
    vname=vname//temp;
    obs=obs//repeat(j,nrow(temp));
    table=table//repeat(dsn[i],nrow(temp));
  end;
 end;
close (dsn[i]);
end;

create want var {table vname obs};
append;
close;
quit;

proc print noobs;run;

View solution in original post

13 REPLIES 13
snoopy369
Barite | Level 11

The approach you need is probably to have nested macros.

 

First, write a macro that takes 'variable' and 'value' parameters, checks that variable for that value, and outputs a row with the needed information if it maches.

 

Second, write a macro that looks at the dictionary.columns table (or sashelp.vcolumn) and makes calls to that macro, taking parameters of libname and memname, and then contains a data step that creates your compliance dataset and includes the SET statement for that dataset and the macro calls created for that dataset earlier in this sentence.

 

Third, write a macro that creates macro calls to that second macro, one for each dataset/libname combination (you have largely done that above - just put select cats('%your_macro_here(libname=',libname,',memname=',memnane,',')') or similar in it.

 

No looping required, just create lots and lots of macro calls.

snoopy369
Barite | Level 11

Sort of vague example:

 


%macro find_val(val=,var=);
  *test it and output;
%mend find_val;

%macro find_in_Dataset(data=,lib=,val=);
  proc sql;
    select cats('%find_val(var=',name,",val=&val.)") 
     into :findvallist separated by ' '
     from dictionary.columns
     where libname="&lib." and memname="&data."
  ;
  quit;
  data something;
    set &lib..&data.;
    *probalby some code to save data/lib here;
    &findvallist.
  run;
%mend find_in_dataset;

proc sql;
  select cats('%find_in_dataset(data=',memname,',lib=',libname,",val=&val.)")
    into :datasetlist separated by ' '
    from dictionary.tables;
quit;

&datasetlist.
  

 

Shmuel
Garnet | Level 18

I would make just a slight change to @snoopy369's find_val macro.

If the variable is numeric you can check for equal value otherwise check using index function:

   

%macro find_val(val=,var=, type=);
   %if &type = NUM %then %do;
if &var = &val then output;
%end; %else %do;
if index(&var,"&vak) > then output;
%end; %mend find_val;

 

Mishka1
Fluorite | Level 6

Thank you both! I'll spend some time trying to make it work.

ballardw
Super User

To add to @snoopy369 you can get the name and library of the data set by adding the option INDSNAMe to the set statement.

You would want to assign a length and name to a permanent variable though. A stub of code:

 

%let searchval = 4;

data temp;
   length datasetname $41. variable $32. value $100.;
   set sashelp.cars indsname=dname;
   datasetname=dname;
   array c _character_;
   array n _numeric_;
   RecordNumber = _n_;
   SearchValue = "&searchval";
   do i = 1 to dim(c);
      if find(c[i],"&searchval","i")>0 then do;
      /* found the string somewhere*/
         variable = vname(c[i]);
         value = c[i];
         output;
      end;
   end;/* character search*/
   do i = 1 to dim(n);
      if n[i] = &searchval then do;
      /* found the number*/
         variable = vname(n[i]);
         value = put(n[i],best32.);
         output;
      end;
   end;/* numeric search*/
   keep datasetname searchvalue variable RecordNumber value;
run;

Note that this approach doesn't work well for a character value or a "number" like 123-4 that isn't a valid numeric du to the way I'm looking for equality. An exercise for the interested reader (and not difficult) to either add a check for type of searchval or value.

 

mkeintz
PROC Star

One tiny tweak

 

if dim(c)>0 then do I=1 to dim(c);

    and

if dim(n)>0 then do I=1 to dim(n);

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Ksharp
Super User
It would be very easy, if you are using IML code.



data a;
input a : $40. b c;
cards;
Thisaccount,1234567891 1234567891 1234567891
wwrew 232442 343
;
run;
data b;
input aa : $40. bb xxxx : $20.;
cards;
qweqwe 233 2323
Thisaccount,1234567891 1234567891 1234567891
;
run;
data c;
input c : $40. bbbbbb;
cards;
Thisaccount 1234567891
Thisaccount,1234567891 2324
;
run;

proc iml;
dsn='work.'+datasets('work');
do i=1 to nrow(dsn);
use (dsn[i]);
 read all var _char_ into x[c=vnames];
 flag=(find(x,'1234567891')^=0);
 do j=1 to nrow(flag);
  loc=loc(flag[j,]);
  if ^isempty(loc) then do;
   temp=vnames[loc];
   vname=vname//temp;
   obs=obs//repeat(j,nrow(temp));
   table=table//repeat(dsn[i],nrow(temp));
  end;
 end;
 
 read all var _num_ into x[c=vnames];
 flag=(x=1234567891);
 do j=1 to nrow(flag);
  loc=loc(flag[j,]);
  if ^isempty(loc) then do;
   temp=vnames[loc]; 
   vname=vname//temp;
   obs=obs//repeat(j,nrow(temp));
   table=table//repeat(dsn[i],nrow(temp));
  end;
 end;
close (dsn[i]);
end;

create want var {table vname obs};
append;
close;
quit;

proc print noobs;run;

Mishka1
Fluorite | Level 6

This is very interesting. This is the first time I'm exploring IML and I'm excited about the possibilities. The 3 tables created with data correctly but is there a configuration I need to change in my 9.4 to make it work? It created dataset 'Want' but without any data. See log:

proc iml;
NOTE: IML Ready
280  dsn='work.'+datasets('work');
281  do i=1 to nrow(dsn);
282  use (dsn[i]);
283   read all var _char_ into x[c=vnames];
284   flag=(find(x,'1234567891')^=0);
285   do j=1 to nrow(flag);
286    loc=loc(flag[j,]);
287    if ^isempty(loc) then do;
288     temp=vnames[loc];
289     vname=vname//temp;
290     obs=obs//repeat(j,nrow(temp));
291     table=table//repeat(dsn[i],nrow(temp));
292    end;
293   end;
294
295   read all var _num_ into x[c=vnames];
296   flag=(x=1234567891);
297   do j=1 to nrow(flag);
298    loc=loc(flag[j,]);
299    if ^isempty(loc) then do;
300     temp=vnames[loc];
301     vname=vname//temp;
302     obs=obs//repeat(j,nrow(temp));
303     table=table//repeat(dsn[i],nrow(temp));
304    end;
305   end;
306  close (dsn[i]);
307  end;
ERROR: File WORK.WANT.DATA does not exist.

 statement : USE at line 282 column 1
308
309  create want var {table vname obs};
310  append;
WARNING: All data set variables are unvalued. No APPEND done.

 statement : APPEND at line 310 column 1
311  close;
NOTE: Closing WORK.WANT
NOTE: The data set WORK.WANT has 0 observations and 3 variables.
312  quit;
NOTE: Exiting IML.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE IML used (Total process time):
      real time           0.06 seconds
      cpu time            0.09 seconds

 

Thanks again

Ksharp
Super User

Oh. That means No tables and no variables contain 1234567891.

Can you post some testing data to let me test and debug it ?
Mishka1
Fluorite | Level 6

I tried just running your sample code with the 3 tables of created data (A,B and C) that did include the value we are trying to find.

I closed all SAS instances and tried running it again. When I did. It threw this error--

 

ERROR: Invocation of unresolved module ISEMPTY.

 

25   proc iml;
NOTE: Writing HTML Body file: sashtml.htm
NOTE: IML Ready
26   dsn='work.'+datasets('work');
26 !                               /*Change to your libname*/
27   do i=1 to nrow(dsn);
28   use (dsn[i]);
29    read all var _char_ into x[c=vnames];
29 !                                       /*Seach chars*/
30    flag=(find(x,'123')^=0);
30 !                           /*Change to the value you want to find*/
31    do j=1 to nrow(flag);
32     loc=loc(flag[j,]);
33     if ^isempty(loc) then do;
34      temp=vnames[loc];
35      vname=vname//temp;
36      obs=obs//repeat(j,nrow(temp));
37      table=table//repeat(dsn[i],nrow(temp));
38     end;
39    end;
40
41    read all var _num_ into x[c=vnames];
41 !                                      /*Seach numbers*/
42    flag=(x=123);
42 !               /*Change to the value you want to find*/
43    do j=1 to nrow(flag);
44     loc=loc(flag[j,]);
45     if ^isempty(loc) then do;
46      temp=vnames[loc];
47      vname=vname//temp;
48      obs=obs//repeat(j,nrow(temp));
49      table=table//repeat(dsn[i],nrow(temp));
50     end;
51    end;
52   close (dsn[i]);
53   end;
ERROR: Invocation of unresolved module ISEMPTY.

 

 

If I delete the empty dataset 'Want' and run it again, I get a different error:

135  proc iml;
NOTE: IML Ready
136  dsn='work.'+datasets('work');
136!                               /*Change to your libname*/
137  do i=1 to nrow(dsn);
138  use (dsn[i]);
139   read all var _char_ into x[c=vnames];
139!                                       /*Seach chars*/
140   flag=(find(x,'123')^=0);
140!                           /*Change to the value you want to find*/
141   do j=1 to nrow(flag);
142    loc=loc(flag[j,]);
143    if ^isempty(loc) then do;
144     temp=vnames[loc];
145     vname=vname//temp;
146     obs=obs//repeat(j,nrow(temp));
147     table=table//repeat(dsn[i],nrow(temp));
148    end;
149   end;
150
151   read all var _num_ into x[c=vnames];
151!                                      /*Seach numbers*/
152   flag=(x=123);
152!               /*Change to the value you want to find*/
153   do j=1 to nrow(flag);
154    loc=loc(flag[j,]);
155    if ^isempty(loc) then do;
156     temp=vnames[loc];
157     vname=vname//temp;
158     obs=obs//repeat(j,nrow(temp));
159     table=table//repeat(dsn[i],nrow(temp));
160    end;
161   end;
162  close (dsn[i]);
163  end;
ERROR: File WORK.WANT.DATA does not exist.

 statement : USE at line 138 column 1

And if I leave the empty dataset 'Want' and try running it again, I get this error:

 

171
172  proc iml;
NOTE: IML Ready
173  dsn='work.'+datasets('work');
173!                               /*Change to your libname*/
174  do i=1 to nrow(dsn);
175  use (dsn[i]);
176   read all var _char_ into x[c=vnames];
176!                                       /*Seach chars*/
177   flag=(find(x,'123')^=0);
177!                           /*Change to the value you want to find*/
178   do j=1 to nrow(flag);
179    loc=loc(flag[j,]);
180    if ^isempty(loc) then do;
181     temp=vnames[loc];
182     vname=vname//temp;
183     obs=obs//repeat(j,nrow(temp));
184     table=table//repeat(dsn[i],nrow(temp));
185    end;
186   end;
187
188   read all var _num_ into x[c=vnames];
188!                                      /*Seach numbers*/
189   flag=(x=123);
189!               /*Change to the value you want to find*/
190   do j=1 to nrow(flag);
191    loc=loc(flag[j,]);
192    if ^isempty(loc) then do;
193     temp=vnames[loc];
194     vname=vname//temp;
195     obs=obs//repeat(j,nrow(temp));
196     table=table//repeat(dsn[i],nrow(temp));
197    end;
198   end;
199  close (dsn[i]);
200  end;
WARNING: Data set WORK.WANT is empty.

 statement : USE at line 175 column 1
WARNING: End of File reached.

 statement : READ at line 176 column 2
ERROR: (execution) Module not loaded, operation not available.

 operation : FIND at line 177 column 12
 operands  : x, *LIT1004

x      0 row       0 col     (type ?, size 0)


*LIT1004      1 row       1 col     (character, size 3)

 123

 statement : ASSIGN at line 177 column 2
Ksharp
Super User

Oh. You have low version SAS.

Try ANY() function .

 



data a;
input a : $40. b c;
cards;
Thisaccount,1234567891 1234567891 1234567891
wwrew 232442 343
;
run;
data b;
input aa : $40. bb xxxx : $20.;
cards;
qweqwe 233 2323
Thisaccount,1234567891 1234567891 1234567891
;
run;
data c;
input c : $40. bbbbbb;
cards;
Thisaccount 1234567891
Thisaccount,1234567891 2324
;
run;

proc iml;
dsn='work.'+datasets('work');
do i=1 to nrow(dsn);
use (dsn[i]);
 read all var _char_ into x[c=vnames];
 flag=(find(x,'1234567891')^=0);
 do j=1 to nrow(flag);
  if any(flag[j,]=1) then do;
    loc=loc(flag[j,]);
    temp=vnames[loc];
    vname=vname//temp;
    obs=obs//repeat(j,nrow(temp));
    table=table//repeat(dsn[i],nrow(temp));
  end;
 end;
 
 read all var _num_ into x[c=vnames];
 flag=(x=1234567891);
 do j=1 to nrow(flag);
  if any(flag[j,]=1) then do; 
    loc=loc(flag[j,]);
    temp=vnames[loc]; 
    vname=vname//temp;
    obs=obs//repeat(j,nrow(temp));
    table=table//repeat(dsn[i],nrow(temp));
  end;
 end;
close (dsn[i]);
end;

create want var {table vname obs};
append;
close;
quit;

proc print noobs;run;
LawrenceHW
Quartz | Level 8

I've been looking into trying to understand IML and how it can used for data checking and I came across this very useful thread. What I was interested in was using PRXMATCH functions and how that could work in place of a FIND. Has anyone tried to use PRXMATCH in place of FIND as I tried it but generated an error. Any ideas?

snoopy369
Barite | Level 11
Hi Lawrence: can you post that as a new question?

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 13 replies
  • 10886 views
  • 5 likes
  • 7 in conversation