Help using Base SAS procedures

LIKE / CONTAINS on the fritz?

Reply
Occasional Contributor
Posts: 17

LIKE / CONTAINS on the fritz?

I've got two tables, A and B.
In table A there is a field filled with complete names.(A)
In table B there is a field filled with partial names.(B)

Both fields are in upper case (UPCASE).
In the query I call for the entire fieldset (*) of Table A, where field (A) contains field (B). (It doesn't matter which one, so there isn't any JOIN)

This yields no results in some of the cases where I can clearly see they should.
When I make the same comparison using only the fields (A) and (B) as their respective tables, this does work however.

Would anybody know the reason for this?
Super Contributor
Super Contributor
Posts: 3,174

Re: LIKE / CONTAINS on the fritz?

It's important to share your code for good feedback.

Scott Barry
SBBWorks, Inc.
Occasional Contributor
Posts: 17

Information : LIKE / CONTAINS on the fritz?

Here it is. Did have to replace the names though (confidentiality issue):
Table1.naam = Partial Name belonging to an adress
Table2.naam = Full name (originally can contain ' or ", hence COMPRESS)
&SOURCETABLE = reference to source table name. (working of course)

*** The next 2 queries set up the 2 tables as wel as formats NAAM to upper case ***;
PROC SQL;
CREATE TABLE TUSSENDT.TABLE1 AS
SELECT SOURCE.GROEP,
UPCASE(SOURCE.NAAM) AS NAAM,
SOURCE.HUISNR,
SOURCE.POSTCODE
FROM &SOURCETABLE AS SOURCE
WHERE SOURCE.GROEP NOT IS NULL;
QUIT;

PROC SQL;
CREATE TABLE TUSSENDT.TABLE2 AS
SELECT UPCASE(COMPRESS(COMPRESS(TABLE2SOURCE.NAAM,''''),'"')) AS NAAM FORMAT=$200.,
TABLE2SOURCE.ID AS ID FORMAT=11.,
TABLE2SOURCE.HUISNR AS HUISNR FORMAT=$6.,
TABLE2SOURCE.HUISNRTOE AS HUISNRTOE FORMAT=$12.,
TABLE2SOURCE.POSTCODE AS POSTCODE FORMAT=$6.,
TABLE2SOURCE.BUITENLAND_IND AS BUITENLAND_IND FORMAT=$1.,
TABLE2SOURCE.BEKEND_DAT AS BEKEND_DAT FORMAT=MMDDYY8.
FROM SOURCE.TABLE2SOURCE AS TABLE2SOURCE;
QUIT;


*** Checks if TABLE2.naam contains a TABLE1.naam, adds GROEP and NAAM from TABLE1 ***;

PROC SQL;
CREATE TABLE WORK.KOI_INFO_OP_NAAM AS SELECT DISTINCT TABLE2.ID FORMAT=11.,
TABLE1.NAAM FORMAT=$200. AS NAAM_1,
TABLE2.NAAM FORMAT=$200. AS NAAM_2,
TABLE2.HUISNR FORMAT=$6.,
TABLE2.HUISNRTOE AS HUISN2 FORMAT=$12.,
TABLE2.POSTCODE AS POSTCO FORMAT=$6.,
TABLE2.BUITENLAND_IND AS BUITEN FORMAT=$1.,
TABLE2.BEKEND_DAT AS BEKEND FORMAT=MMDDYY8.,
TABLE1.GROEP,
('OP NAAM') AS OORSPRONG
FROM TUSSENDT.TABLE2 AS TABLE2,
TUSSENDT.TABLE2 AS TABLE2
WHERE TABLE2.NAAM Contains TABLE1.NAAM;
QUIT; - As extra information: I'm using SAS 9.1.3


Message was edited by: RFLinnenbank
Super User
Posts: 5,256

Re: Information : LIKE / CONTAINS on the fritz?

I did a simplified example using SASHELP.CLASS:

14 /* Infoga anpassad kod före skickad kod här */
15 options msglevel=i;
16
17 data ClassPartname;
18 set sashelp.class;
19 length PartName $2;
20 PartName = substr(name,2,2);
21 keep Partname;
22 run;

NOTE: There were 19 observations read from the data set SASHELP.CLASS.
NOTE: The data set WORK.CLASSPARTNAME has 19 observations and 1 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.01 seconds


23
24 proc sql;
25 create table ContainsJoin as
26 select class.*, PartName
27 from sashelp.class as class,
28 ClassPartName
29 where class.Name contains ClassPartName.PartName
30 ;
NOTE: The execution of this query involves performing one or more Cartesian product joins that can not be optimized.
NOTE: Table WORK.CONTAINSJOIN created, with 31 rows and 6 columns.

31 quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 0.00 seconds
cpu time 0.01 seconds
To help you out more, please share the input data to the "join", and any LOG information that could be of interest.

/Linus Message was edited by: Linus H
Data never sleeps
Super Contributor
Super Contributor
Posts: 3,174

Re: Information : LIKE / CONTAINS on the fritz?

You can leave off the FORMAT= keyword for character variables. Also, you must have an incorrect FROM in your last PROC SQL query (there is no TABLE1 reference). I suspect that you need a change to CONTAINS TRIM(TABLE1.NAAM), adding the TRIM function, otherwise (at least with a DATA step), the padded variable length (LENGTH, not FORMAT) is used which could result in no-match if there is trailing blanks.

And, so, you will need to dump your TABLE1 values for NAAM to attempt to correlate them against a "compressed" TABLE2.NAAM.

Scott Barry
SBBWorks, Inc.
Occasional Contributor
Posts: 17

Re: Information : LIKE / CONTAINS on the fritz?

Sorry about that, misplaced that one while replacing the naming of the original query.

As far as the trailing blanks are concerned, wouldn't be surprised if those were the cause of it once again. The "SAS fills character-types up with blanks"-thing rears it's ugly head every once in a while, just when you least expect it.

I'll let you know as soon as possible if your suggestions did the trick. I expect so though.

Thanks a lot, even if this wouldn't quite do the trick yet!
Occasional Contributor
Posts: 17

Re: Information : LIKE / CONTAINS on the fritz?

Turns out it was indeed once again a Trimming issue!
Works like a charm now!
Ask a Question
Discussion stats
  • 6 replies
  • 172 views
  • 0 likes
  • 3 in conversation