BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Michael141
Calcite | Level 5

I have two string variables which I want to compare using SAS in order to establish which observations occur in both variables and which are unique to one of these variables. I’ve provided an example of the data below, where Department_Old and Department_New are the variable names.

 

Department_Old

Department_New

Communications

Advertising

Finance

Communications

Health and Wellbeing

Finance

Transport

HR

 

Health and Wellbeing

 

I have tried to use merge to do this, but from what I can see it can only make comparisons between observations within the same row. What I would like is to check whether an observations in one variable appear anywhere in the other variable, and have to SAS provide me with a list of those observations which are unique to one variable. I know that this is something which can be achieved in excel using the VLOOKUP function, but would ideally like to keep all of my work within SAS. Is there a particular process I could use to achieve this within SAS?

1 ACCEPTED SOLUTION

Accepted Solutions
sbxkoenk
SAS Super FREQ

Hello,

I guess this is what you want:

data have;
LENGTH Department_Old $ 25 Department_New $ 25;
infile cards delimiter='|';
input Department_Old $ Department_New $;
cards;
Communications       |Advertising
Finance              |Communications
Health and Wellbeing |Finance
Transport            |HR
                     |Health and Wellbeing
;
run;

PROC SQL noprint;
 create table want_unique_Department_Old as
 select Department_Old
 from have
 where  upcase(Department_Old)
  NOT IN (select distinct upcase(Department_New) 
          from have)
  AND Department_Old is not missing
;
QUIT;

PROC SQL noprint;
 create table want_unique_Department_New as
 select Department_New
 from have
 where  upcase(Department_New)
  NOT IN (select distinct upcase(Department_Old) 
          from have)
  AND Department_New is not missing
;
QUIT;
/* end of program */

Koen

View solution in original post

7 REPLIES 7
sbxkoenk
SAS Super FREQ

Hello,

I guess this is what you want:

data have;
LENGTH Department_Old $ 25 Department_New $ 25;
infile cards delimiter='|';
input Department_Old $ Department_New $;
cards;
Communications       |Advertising
Finance              |Communications
Health and Wellbeing |Finance
Transport            |HR
                     |Health and Wellbeing
;
run;

PROC SQL noprint;
 create table want_unique_Department_Old as
 select Department_Old
 from have
 where  upcase(Department_Old)
  NOT IN (select distinct upcase(Department_New) 
          from have)
  AND Department_Old is not missing
;
QUIT;

PROC SQL noprint;
 create table want_unique_Department_New as
 select Department_New
 from have
 where  upcase(Department_New)
  NOT IN (select distinct upcase(Department_Old) 
          from have)
  AND Department_New is not missing
;
QUIT;
/* end of program */

Koen

Michael141
Calcite | Level 5

This is exactly what I needed! I've been struggling with this for a few days, so I really appreciate your time and help. Thank you very much for your quick and helpful response

sbxkoenk
SAS Super FREQ

Or this:

data have;
LENGTH Department_Old $ 25 Department_New $ 25;
infile cards delimiter='|';
input Department_Old $ Department_New $;
cards;
Communications       |Advertising
Finance              |Communications
Health and Wellbeing |Finance
Transport            |HR
                     |Health and Wellbeing
;
run;

PROC FREQ data=have noprint;
 tables Department_Old / out=Count_Old(rename=(Department_Old=Department));
 tables Department_New / out=Count_New(rename=(Department_New=Department));
run;

data both(drop=count percent);
 merge Count_Old(in=a) Count_New(in=b);
 by Department;
 if a or b;
 if a then OLD=1; else OLD=0;
 if b then NEW=1; else NEW=0;
run;
/* end of program */
Michael141
Calcite | Level 5

This solution is also very helpful thanks - and while it wasn't I was looking for originally, it might actually be more helpful. Thanks again for your time and help

Reeza
Super User

Here's one different option for you. 

It combines the two columns into one, identifying the source as well. Then you count the source for ones that have only a single source (using HAVING).

 

data have;
LENGTH Department_Old $ 25 Department_New $ 25;
infile cards delimiter='|';
input Department_Old $ Department_New $;
ID = _n_;
cards;
Communications       |Advertising
Finance              |Communications
Health and Wellbeing |Finance
Transport            |HR
                     |Health and Wellbeing
;
run;


data stacked;
set have (keep = department_old rename = (department_old = department) in=t1)
     have (keep = department_new rename = (department_new = department) in=t2);
source = T1*1 + T2*2;
run;

proc sql;
create table unique_values as
select distinct department, source
from want
group by department
having count(distinct source) = 1;
quit;
sbxkoenk
SAS Super FREQ

Also nice,

but instead of 

from want

in your PROC SQL, it should be 

from stacked

Just clarifying for @Michael141 .

 

Kind regards

sbxkoenk
SAS Super FREQ

With pleasure.

Note that I have just edited the last program.

In the last data step (that does the merge) I have dropped the variables count and percent from the output. Count and Percent are coming from the last data set in the merge (Count_New), overriding the Count and Percent from Count_Old. As they don't bring any value and don't have any meaning in the MERGEd output data set, it's better to delete them!

data both(drop=count percent);

Koen

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1334 views
  • 2 likes
  • 3 in conversation