BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
xpnerd
Fluorite | Level 6

I got a list with id's and I have to check the occurrence in multiple tables and I also want to register where this happens (how many times). But I get a lot of errors: invalid statements and syntax errors, I have used multiple sources for the code and what this different is the looped joins, subqueries, multiple libraries, members and column names.

My code in SAS EG are below, first the example data I have in 'work':

 

DATA work.id_search;
INPUT id $;
DATALINES;
A1B6
C3L4
...
Q6Z7 
RUN;

I did a search for the different column names (id and id_thing) in the db's that contain this number, which worked out using this post, link. Which gives me something like this data, library presence:

 

 

DATA table_search;
INPUT libname memname memtype name ...;
DATALINES;
A XY DATA id      ...
A BB ... id_thing ...
B ZY ... ...      ...
... ...  ... ...  ...
Z AK ... id       ...
RUN;

The result I am aiming for:

DATA work.id_search_result;
INPUT id $ A.XY A.BB B.ZY ... Z.AK;
DATALINES;
A1B6 0 5 6 1 ... 0
C3L4 9 4 3 2 ... 2
... ... ... ... ... ... ...
Q6Z7 0 0 0 0 ... 1
RUN;

 

The code I used to try this:

%macro searchIDs; 

proc sql; 
	/* count number of datasets containing id or id_thing */                                      
	select count(memname) into :cnt
	from work.table_search;

	select count(id) into :wcnt
	from work.search_id;

	%let cnt = &cnt.; /* removes trailing blanks from &cnt */ 
	%let wcnt = &wcnt.; 

	/* get names of all datasets and columns containing the variable*/  
	select libname into :ds1 - :ds&cnt.,
		memname into :ms1 - :ms&cnt.,
		name into :ls1 - :ls&cnt.
	from work.table_search;

	/* */ 
	select kenteken into :id1 - :id&wcnt.
	from work.search_id;
	
	create table terecht as
	select t.*
	from work. as t
	%do i = 1 %to &wcnt.;
 		%do j = 1 %to &cnt.;

			left join (select sq.&&ls&j.,
						count(case 	when upcase(sq.&&ls&j.) = "&&ken&i." 
									then 1 else 0 end) as &&ds&j._&&ms&j.
						from &&ds&j..&&ms&j. as sq
						where upcase(sq.&&ls&j.) = "&&ken&i.") as t1
					on t.kenteken = t1.&&ls&j.;
/* I tried: t1.&&ls.&j  t1.&&ls.&j. t1.&ls.&j.  t1.&ls&j.  t1.&ls.&j  to no avail*/
		%end;
	%end;
quit;
%mend searchIDs; 
 
%searchIDs; 
 

I have tried this a lot now and it does not seem to work out, since I am a novice to SAS I undoubtedly miss something...

The sources I used and things I tried:

This is a long question, very kind to stick around until this point. Hopefully someone is able to help me I thank you for your time 😉

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

First obvious error:

from work. as t

you are missing a dataset name here.

 

Depending on the number of tables, variables and values you will quickly exceed the maximum number of tables you can process in a single SQL query (256, IIRC)

 

I would rather do this:

  • create a list of variable names to search in a macro variable
  • create a list of datasets in a  second macro variable
  • in one single data step, put all those datasets into a single SET statement, with an indsname= option; us a %do loop for this
  • at _N_ = 1, create a hash object with all your search values
  • for every observation, do a find() method; if result = 0, output the dataset name, variable name, and value
  • repeat the above statement(s) in a %do loop for all variable names in your first macro variable

When this data step has run, you can use summary procedures to get your counts.

The big advantage: you will do one single, sequential pass through all the datasets, and not do a join.

 

View solution in original post

8 REPLIES 8
Kurt_Bremser
Super User

First obvious error:

from work. as t

you are missing a dataset name here.

 

Depending on the number of tables, variables and values you will quickly exceed the maximum number of tables you can process in a single SQL query (256, IIRC)

 

I would rather do this:

  • create a list of variable names to search in a macro variable
  • create a list of datasets in a  second macro variable
  • in one single data step, put all those datasets into a single SET statement, with an indsname= option; us a %do loop for this
  • at _N_ = 1, create a hash object with all your search values
  • for every observation, do a find() method; if result = 0, output the dataset name, variable name, and value
  • repeat the above statement(s) in a %do loop for all variable names in your first macro variable

When this data step has run, you can use summary procedures to get your counts.

The big advantage: you will do one single, sequential pass through all the datasets, and not do a join.

 

Kurt_Bremser
Super User

To illustrate what I meant, an example using SASHELP datasets:

data search;
input searchvalue :$13.; /* Use the maximum expected variable length here */
datalines;
Acura
Alfred
;

data where_to_search;
input libname :$8. memname :$32. name :$32.;
datalines;
sashelp class name
sashelp cars make
;

proc sql noprint;
select catx('.',libname,memname) into :datasets separated by ' ' from where_to_search;
select trim(name) into :names separated by ' ' from where_to_search;
run;

data all;
length &names $13;
/* set a common length for all variables to be searched */
/* otherwise the hash code might fail */
set &datasets indsname=dsname;
if _n_ = 1
then do;
  length searchvalue $13;
  declare hash h (dataset:"search");
  h.definekey('searchvalue');
  h.definedone();
  call missing(searchvalue); /* prevents "uninitialized" message */
end;
%macro allvars;
%do i = 1 %to %sysfunc(countw(&names));
if h.check(key:%scan(&names,&i)) = 0
then do;
  dataset = dsname;
  variable = "%scan(&names,&i)";
  value = %scan(&names,&i);
  output;
end;
%end;
%mend;
%allvars
keep dataset variable value;
run;

Resulting dataset:

1	SASHELP.CLASS	name	Alfred	
2	SASHELP.CARS	make	Acura	
3	SASHELP.CARS	make	Acura	
4	SASHELP.CARS	make	Acura	
5	SASHELP.CARS	make	Acura	
6	SASHELP.CARS	make	Acura	
7	SASHELP.CARS	make	Acura	
8	SASHELP.CARS	make	Acura
xpnerd
Fluorite | Level 6

You sir are a wizard, the sample code worked perfectly. So below are individual problems and reall depending on the databases (I guess).

 

It did not work for the database I am working on.

When I tried it for my list with different libnames and memnames I got the error: ERROR: Variable <other variable> has been defined as both character and numeric.

All those other variables are the ones I am not comparing to as far as I understand the code.

 

If I run it for a unique libname (and multiple memnames) (since there is no %do loop at 'set &datanames ...'), I get the error: ERROR: CLI cursor extended fetch error: [SAP AG][LIBODBCHDB DLL][HDBODBC] General error;-10427 Conversion of parameter/column (9)
from data type NVARCHAR to ASCII failed

 

sadly the database (which I am not owner of) tables have fields with multiple different char-lengths and is not clean at all.

Kurt_Bremser
Super User

The logic can only be applied to either character or numeric variables, and it needs some adaptions for numeric. From the data examples you posted, I took it that you were looking for categorical values.

xpnerd
Fluorite | Level 6

In case of: ERROR: Variable <other variable> has been defined as both character and numeric, it mentions fields in the table to be searched that are not selected. I also tried it out for the field 'name' contains only one label/name and checked if the field has character values. I just find it odd; an error for all the other fields which should not be searched.

 

As for the target label/name field it has character values, but these can be a mix of letters and numbers such as 'ABC3'. Does the hash not work in the case of this mixture of 'characters' in a string/char what should I change in that case?

 

I also tried to look into the other error, the things I found out about that would be things with encoding. and some technical stuff I don't understand at all.

Kurt_Bremser
Super User

If you have such badly designed data (where the same variable name is used within a single database for variables of different types - who comes up with such a crazy idea?), then you need to take additional measures, like adding logic that connects columns with datasets, so you can add keep= dataset options for all the datasets you read in the step with the hash.

xpnerd
Fluorite | Level 6

I did not design or aggregated those tables, I just have to deal with it I guess.

So I changed a line of code and made it

set &datasets. (keep=&names.) indsname=dsname;

Is this the change you meant ? It was the only thing that worked for the various permutation I tried.

No errors, weird results.

 

If run with the below dataset it does not throw an error, but the result gives me twice a hit in the src_two table, which is not right (might this be due to the %do loop in the code you gave?). In this case the order of the sources matters (if src_one listed second, it results in two hits for src_one).

 

Executing the code for only one source it gives me an empty result, which is weird to me. 

if the field 'name' has values 'id' en 'id_thing' and one and two resp. have a field named in such a way it throws ERROR The variable id in the DROP, KEEP, or RENAME list has never been referenced.

 

Thank you for your kind help.

 

Used example data:

data sources;
input libname :$8. memname :$32. name :$32.;
datalines;
work src_one id
work src_two id
;

data searchv;
input id :$4.;
datalines;
00AG
;

data src_one;
input mixure id :$4.;
datalines;
140 00AG
301 A644
;

data src_two;
input mixure :$6. id :$4.;
datalines;
name1 00AG
name3 B73C
;
run;
xpnerd
Fluorite | Level 6

For those interesed in a solution, as far as I understand all the data steps, the code below seems to do the job.

 

%macro testing(search, searchvalue, where_to_search, result, summary = 0, srchvarlength = $32);

proc sql noprint;
select trim(name) into :names separated by ' ' from &where_to_search.;
/* working value */
select cats(libname,'.', memname,'(keep=', trim(name), ')') into :testsets separated by ' ' from &where_to_search.;
run;

data &result.;
/* 
	prevent %scan from picking the shortest char-length, 
	also names per dictonary.columns max length is 32. 
*/
length variable $32;
length value &srchvarlength.;

/* set a common length for all variables to be searched */
/* otherwise the hash code might fail */
set &testsets. indsname=dsname ;
if _n_ = 1
then do;
  length &searchvalue. &srchvarlength.;
  declare hash h (dataset:"&search.");
  h.definekey("&searchvalue.");
  h.definedone();
  call missing(&searchvalue.); /* prevents "uninitialized" message */
end;

%do i = 1 %to %sysfunc(countw(&names.));
if h.check(key:%scan(&names., &i.)) = 0
then do;
  dataset = dsname;
  variable = "%scan(&names, &i)";
  value = %scan(&names., &i.);
  output;
end;
%end;

keep dataset variable value;
run;
%mend;

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 1243 views
  • 2 likes
  • 2 in conversation