Base SAS 9.4--
I have an annoying compliance task that I'm trying to automate.
I need to find an account number (number stored as text or as a number) anywhere it appears in any dataset and in any variable in a library.
For example, I will need to find the value 1234567891 anywhere it appears in an entire library. It could be in one dataset in a text varible in the middle of a paragraph such as "This account, 1234567891, needs to be closed" or it could be stored in a different or same dataset in a variable that stores account numbers as numbers.
So far, I have generated a list of the datasets and could do something similar with each variable in each dataset using the following code. I was then going to create a loop to look for the account number in each variable in each dataset. When I found a match, it would write the "dataset, variable and obs number" to a table each time and move on to the next value to search.
I am not skilled enough to even get close. Is anyone able to assist with the looping logic?
Proc SQL noprint; /*** Create list of all the datasets to look through. ***/
Select 'work.' || memname as ViewName
into :TableNames separated by ' '
From sashelp.vmember
Where UPCASE(LIBNAME)='WORK'
and UPCASE(MEMTYPE)='DATA'
;quit;
%put &TableNames;
Thank you so much!
Oh. You have low version SAS.
Try ANY() function .
data a;
input a : $40. b c;
cards;
Thisaccount,1234567891 1234567891 1234567891
wwrew 232442 343
;
run;
data b;
input aa : $40. bb xxxx : $20.;
cards;
qweqwe 233 2323
Thisaccount,1234567891 1234567891 1234567891
;
run;
data c;
input c : $40. bbbbbb;
cards;
Thisaccount 1234567891
Thisaccount,1234567891 2324
;
run;
proc iml;
dsn='work.'+datasets('work');
do i=1 to nrow(dsn);
use (dsn[i]);
read all var _char_ into x[c=vnames];
flag=(find(x,'1234567891')^=0);
do j=1 to nrow(flag);
if any(flag[j,]=1) then do;
loc=loc(flag[j,]);
temp=vnames[loc];
vname=vname//temp;
obs=obs//repeat(j,nrow(temp));
table=table//repeat(dsn[i],nrow(temp));
end;
end;
read all var _num_ into x[c=vnames];
flag=(x=1234567891);
do j=1 to nrow(flag);
if any(flag[j,]=1) then do;
loc=loc(flag[j,]);
temp=vnames[loc];
vname=vname//temp;
obs=obs//repeat(j,nrow(temp));
table=table//repeat(dsn[i],nrow(temp));
end;
end;
close (dsn[i]);
end;
create want var {table vname obs};
append;
close;
quit;
proc print noobs;run;
The approach you need is probably to have nested macros.
First, write a macro that takes 'variable' and 'value' parameters, checks that variable for that value, and outputs a row with the needed information if it maches.
Second, write a macro that looks at the dictionary.columns table (or sashelp.vcolumn) and makes calls to that macro, taking parameters of libname and memname, and then contains a data step that creates your compliance dataset and includes the SET statement for that dataset and the macro calls created for that dataset earlier in this sentence.
Third, write a macro that creates macro calls to that second macro, one for each dataset/libname combination (you have largely done that above - just put select cats('%your_macro_here(libname=',libname,',memname=',memnane,',')') or similar in it.
No looping required, just create lots and lots of macro calls.
Sort of vague example:
%macro find_val(val=,var=);
*test it and output;
%mend find_val;
%macro find_in_Dataset(data=,lib=,val=);
proc sql;
select cats('%find_val(var=',name,",val=&val.)")
into :findvallist separated by ' '
from dictionary.columns
where libname="&lib." and memname="&data."
;
quit;
data something;
set &lib..&data.;
*probalby some code to save data/lib here;
&findvallist.
run;
%mend find_in_dataset;
proc sql;
select cats('%find_in_dataset(data=',memname,',lib=',libname,",val=&val.)")
into :datasetlist separated by ' '
from dictionary.tables;
quit;
&datasetlist.
I would make just a slight change to @snoopy369's find_val macro.
If the variable is numeric you can check for equal value otherwise check using index function:
%macro find_val(val=,var=, type=);
%if &type = NUM %then %do;
if &var = &val then output;
%end; %else %do;
if index(&var,"&vak) > then output;
%end;
%mend find_val;
Thank you both! I'll spend some time trying to make it work.
To add to @snoopy369 you can get the name and library of the data set by adding the option INDSNAMe to the set statement.
You would want to assign a length and name to a permanent variable though. A stub of code:
%let searchval = 4;
data temp;
length datasetname $41. variable $32. value $100.;
set sashelp.cars indsname=dname;
datasetname=dname;
array c _character_;
array n _numeric_;
RecordNumber = _n_;
SearchValue = "&searchval";
do i = 1 to dim(c);
if find(c[i],"&searchval","i")>0 then do;
/* found the string somewhere*/
variable = vname(c[i]);
value = c[i];
output;
end;
end;/* character search*/
do i = 1 to dim(n);
if n[i] = &searchval then do;
/* found the number*/
variable = vname(n[i]);
value = put(n[i],best32.);
output;
end;
end;/* numeric search*/
keep datasetname searchvalue variable RecordNumber value;
run;
Note that this approach doesn't work well for a character value or a "number" like 123-4 that isn't a valid numeric du to the way I'm looking for equality. An exercise for the interested reader (and not difficult) to either add a check for type of searchval or value.
One tiny tweak
if dim(c)>0 then do I=1 to dim(c);
and
if dim(n)>0 then do I=1 to dim(n);
It would be very easy, if you are using IML code. data a; input a : $40. b c; cards; Thisaccount,1234567891 1234567891 1234567891 wwrew 232442 343 ; run; data b; input aa : $40. bb xxxx : $20.; cards; qweqwe 233 2323 Thisaccount,1234567891 1234567891 1234567891 ; run; data c; input c : $40. bbbbbb; cards; Thisaccount 1234567891 Thisaccount,1234567891 2324 ; run; proc iml; dsn='work.'+datasets('work'); do i=1 to nrow(dsn); use (dsn[i]); read all var _char_ into x[c=vnames]; flag=(find(x,'1234567891')^=0); do j=1 to nrow(flag); loc=loc(flag[j,]); if ^isempty(loc) then do; temp=vnames[loc]; vname=vname//temp; obs=obs//repeat(j,nrow(temp)); table=table//repeat(dsn[i],nrow(temp)); end; end; read all var _num_ into x[c=vnames]; flag=(x=1234567891); do j=1 to nrow(flag); loc=loc(flag[j,]); if ^isempty(loc) then do; temp=vnames[loc]; vname=vname//temp; obs=obs//repeat(j,nrow(temp)); table=table//repeat(dsn[i],nrow(temp)); end; end; close (dsn[i]); end; create want var {table vname obs}; append; close; quit; proc print noobs;run;
This is very interesting. This is the first time I'm exploring IML and I'm excited about the possibilities. The 3 tables created with data correctly but is there a configuration I need to change in my 9.4 to make it work? It created dataset 'Want' but without any data. See log:
proc iml;
NOTE: IML Ready
280 dsn='work.'+datasets('work');
281 do i=1 to nrow(dsn);
282 use (dsn[i]);
283 read all var _char_ into x[c=vnames];
284 flag=(find(x,'1234567891')^=0);
285 do j=1 to nrow(flag);
286 loc=loc(flag[j,]);
287 if ^isempty(loc) then do;
288 temp=vnames[loc];
289 vname=vname//temp;
290 obs=obs//repeat(j,nrow(temp));
291 table=table//repeat(dsn[i],nrow(temp));
292 end;
293 end;
294
295 read all var _num_ into x[c=vnames];
296 flag=(x=1234567891);
297 do j=1 to nrow(flag);
298 loc=loc(flag[j,]);
299 if ^isempty(loc) then do;
300 temp=vnames[loc];
301 vname=vname//temp;
302 obs=obs//repeat(j,nrow(temp));
303 table=table//repeat(dsn[i],nrow(temp));
304 end;
305 end;
306 close (dsn[i]);
307 end;
ERROR: File WORK.WANT.DATA does not exist.
statement : USE at line 282 column 1
308
309 create want var {table vname obs};
310 append;
WARNING: All data set variables are unvalued. No APPEND done.
statement : APPEND at line 310 column 1
311 close;
NOTE: Closing WORK.WANT
NOTE: The data set WORK.WANT has 0 observations and 3 variables.
312 quit;
NOTE: Exiting IML.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE IML used (Total process time):
real time 0.06 seconds
cpu time 0.09 seconds
Thanks again
Oh. That means No tables and no variables contain 1234567891. Can you post some testing data to let me test and debug it ?
I tried just running your sample code with the 3 tables of created data (A,B and C) that did include the value we are trying to find.
I closed all SAS instances and tried running it again. When I did. It threw this error--
ERROR: Invocation of unresolved module ISEMPTY.
25 proc iml;
NOTE: Writing HTML Body file: sashtml.htm
NOTE: IML Ready
26 dsn='work.'+datasets('work');
26 ! /*Change to your libname*/
27 do i=1 to nrow(dsn);
28 use (dsn[i]);
29 read all var _char_ into x[c=vnames];
29 ! /*Seach chars*/
30 flag=(find(x,'123')^=0);
30 ! /*Change to the value you want to find*/
31 do j=1 to nrow(flag);
32 loc=loc(flag[j,]);
33 if ^isempty(loc) then do;
34 temp=vnames[loc];
35 vname=vname//temp;
36 obs=obs//repeat(j,nrow(temp));
37 table=table//repeat(dsn[i],nrow(temp));
38 end;
39 end;
40
41 read all var _num_ into x[c=vnames];
41 ! /*Seach numbers*/
42 flag=(x=123);
42 ! /*Change to the value you want to find*/
43 do j=1 to nrow(flag);
44 loc=loc(flag[j,]);
45 if ^isempty(loc) then do;
46 temp=vnames[loc];
47 vname=vname//temp;
48 obs=obs//repeat(j,nrow(temp));
49 table=table//repeat(dsn[i],nrow(temp));
50 end;
51 end;
52 close (dsn[i]);
53 end;
ERROR: Invocation of unresolved module ISEMPTY.
If I delete the empty dataset 'Want' and run it again, I get a different error:
135 proc iml;
NOTE: IML Ready
136 dsn='work.'+datasets('work');
136! /*Change to your libname*/
137 do i=1 to nrow(dsn);
138 use (dsn[i]);
139 read all var _char_ into x[c=vnames];
139! /*Seach chars*/
140 flag=(find(x,'123')^=0);
140! /*Change to the value you want to find*/
141 do j=1 to nrow(flag);
142 loc=loc(flag[j,]);
143 if ^isempty(loc) then do;
144 temp=vnames[loc];
145 vname=vname//temp;
146 obs=obs//repeat(j,nrow(temp));
147 table=table//repeat(dsn[i],nrow(temp));
148 end;
149 end;
150
151 read all var _num_ into x[c=vnames];
151! /*Seach numbers*/
152 flag=(x=123);
152! /*Change to the value you want to find*/
153 do j=1 to nrow(flag);
154 loc=loc(flag[j,]);
155 if ^isempty(loc) then do;
156 temp=vnames[loc];
157 vname=vname//temp;
158 obs=obs//repeat(j,nrow(temp));
159 table=table//repeat(dsn[i],nrow(temp));
160 end;
161 end;
162 close (dsn[i]);
163 end;
ERROR: File WORK.WANT.DATA does not exist.
statement : USE at line 138 column 1
And if I leave the empty dataset 'Want' and try running it again, I get this error:
171
172 proc iml;
NOTE: IML Ready
173 dsn='work.'+datasets('work');
173! /*Change to your libname*/
174 do i=1 to nrow(dsn);
175 use (dsn[i]);
176 read all var _char_ into x[c=vnames];
176! /*Seach chars*/
177 flag=(find(x,'123')^=0);
177! /*Change to the value you want to find*/
178 do j=1 to nrow(flag);
179 loc=loc(flag[j,]);
180 if ^isempty(loc) then do;
181 temp=vnames[loc];
182 vname=vname//temp;
183 obs=obs//repeat(j,nrow(temp));
184 table=table//repeat(dsn[i],nrow(temp));
185 end;
186 end;
187
188 read all var _num_ into x[c=vnames];
188! /*Seach numbers*/
189 flag=(x=123);
189! /*Change to the value you want to find*/
190 do j=1 to nrow(flag);
191 loc=loc(flag[j,]);
192 if ^isempty(loc) then do;
193 temp=vnames[loc];
194 vname=vname//temp;
195 obs=obs//repeat(j,nrow(temp));
196 table=table//repeat(dsn[i],nrow(temp));
197 end;
198 end;
199 close (dsn[i]);
200 end;
WARNING: Data set WORK.WANT is empty.
statement : USE at line 175 column 1
WARNING: End of File reached.
statement : READ at line 176 column 2
ERROR: (execution) Module not loaded, operation not available.
operation : FIND at line 177 column 12
operands : x, *LIT1004
x 0 row 0 col (type ?, size 0)
*LIT1004 1 row 1 col (character, size 3)
123
statement : ASSIGN at line 177 column 2
Oh. You have low version SAS.
Try ANY() function .
data a;
input a : $40. b c;
cards;
Thisaccount,1234567891 1234567891 1234567891
wwrew 232442 343
;
run;
data b;
input aa : $40. bb xxxx : $20.;
cards;
qweqwe 233 2323
Thisaccount,1234567891 1234567891 1234567891
;
run;
data c;
input c : $40. bbbbbb;
cards;
Thisaccount 1234567891
Thisaccount,1234567891 2324
;
run;
proc iml;
dsn='work.'+datasets('work');
do i=1 to nrow(dsn);
use (dsn[i]);
read all var _char_ into x[c=vnames];
flag=(find(x,'1234567891')^=0);
do j=1 to nrow(flag);
if any(flag[j,]=1) then do;
loc=loc(flag[j,]);
temp=vnames[loc];
vname=vname//temp;
obs=obs//repeat(j,nrow(temp));
table=table//repeat(dsn[i],nrow(temp));
end;
end;
read all var _num_ into x[c=vnames];
flag=(x=1234567891);
do j=1 to nrow(flag);
if any(flag[j,]=1) then do;
loc=loc(flag[j,]);
temp=vnames[loc];
vname=vname//temp;
obs=obs//repeat(j,nrow(temp));
table=table//repeat(dsn[i],nrow(temp));
end;
end;
close (dsn[i]);
end;
create want var {table vname obs};
append;
close;
quit;
proc print noobs;run;
I've been looking into trying to understand IML and how it can used for data checking and I came across this very useful thread. What I was interested in was using PRXMATCH functions and how that could work in place of a FIND. Has anyone tried to use PRXMATCH in place of FIND as I tried it but generated an error. Any ideas?
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.