BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Dontik
Obsidian | Level 7

So, I have a significant problem with proc compare. I have two datasets with the two columns. One column lists table names and the other one - names of variables which correspond to table names from the first column. I want compare values of one of them based on the values of first column. I somewhat made it work but the thing is that these datasets have different sizes due to additional values in one of them. Which means that some new variable was added in the middle of a dataset (new variable was added to a table). Unfortunately, proc compare compares values from two datasets horizontally and checks them against each other for values, so in my case it looks like this:

    ds 1 | ds 2

    cost | box_nr

    other | cost_total

As you can see, a new value `box_nr` was added to the second dataset that appears above the value  that I want it to compare variable `cost` to (`cost_total`). So I would like to know if it's possible to compare values (check for differences in character sequence) that have at least minimal similarity - for example 3 letters (cos) or if it's possible to just put values like `box_nr` at the end suggesting that they don't appear in a certain dataset.

My code:

    PROC Compare base=USERSPDS.MIzew compare=USERSPDS.MIwew
        out=USERSPDS.result outbase outcomp outdif noprint;
        id 'TABLE  HD'n;
        where ;
    run;

    proc print data=USERSPDS.result noobs;
       by 'TABLE  HD'n;
       id 'TTABLE  HD'n;
       title 'COMPARISON:';
    run;

1 ACCEPTED SOLUTION

Accepted Solutions
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Your better off using proc sql, joining the two tables - assuming columns are DATASET, VAR_NAME:

proc sql;

     create table MATCHES as    

     select     COALESCE(A.DATASET,B.DATASET) as DATASET,

                   COALESCE(A.VAR_NAME,B.VAR_NAME) as VAR_NAME,

                    case when A.DATASET="" or A.VAR_NAME="" then "N" else "Y" as IN_BASE,

                    case when B.DATASET="" or B.VAR_NAME="" then "N" else "Y" as IN_COMP,

     from       BASE_DATASET A

     full join   COMP_DATASET B

     on           A.DATASET=B.DATASET

     and          A.VAR_NAME=B.VARNAME;

quit;

This will give a full list of all dataset/varname combinations and a flag if its present in either.

View solution in original post

4 REPLIES 4
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Your better off using proc sql, joining the two tables - assuming columns are DATASET, VAR_NAME:

proc sql;

     create table MATCHES as    

     select     COALESCE(A.DATASET,B.DATASET) as DATASET,

                   COALESCE(A.VAR_NAME,B.VAR_NAME) as VAR_NAME,

                    case when A.DATASET="" or A.VAR_NAME="" then "N" else "Y" as IN_BASE,

                    case when B.DATASET="" or B.VAR_NAME="" then "N" else "Y" as IN_COMP,

     from       BASE_DATASET A

     full join   COMP_DATASET B

     on           A.DATASET=B.DATASET

     and          A.VAR_NAME=B.VARNAME;

quit;

This will give a full list of all dataset/varname combinations and a flag if its present in either.

Dontik
Obsidian | Level 7

And what if I don't want to see values that are present in both of these datasets?

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Change:

     and          A.VAR_NAME=B.VARNAME;

quit;

To:

     and          A.VAR_NAME=B.VARNAME

     where     IN_BASE="N" or IN_COMP="Y";

/* note may need to replace with where CALCULATED IN_BASE="Y"... */

quit;

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 1478 views
  • 0 likes
  • 2 in conversation