BookmarkSubscribeRSS Feed
sarav93
Fluorite | Level 6


Good Evening,

I have two columns which have comma seperated values. i need to Comapre and caputre the diff in new column as Diff_column.


s.no               column_x                 column_y
----                 ---------                      --------
1                  40,80,70,90,30        40,80,70,30
2                 40,80,70,90,30        40,80,70,10,30

If the column_x has 70,80,90 and the column_y doesn't have 10, then don't capture the values 70 or 80 or 90 in diff_column.

Expected Result:
---------------
s.no             column_x              column_y       diff_column
----              ---------                   --------             -----------
1               40,80,70,90,30       40,80,70,30          -
2               40,80,70,90,30       40,80,70,10,30    90

 

 

 

Thanks in advance.

 

Sarav

4 REPLIES 4
Kurt_Bremser
Super User

See

data have;
input s_no (column_x column_y) (:$20.);
datalines;
1 40,80,70,90,30 40,80,70,30
2 40,80,70,90,30 40,80,70,10,30
;

data want;
set have;
length diff_column $20;
if findw(column_y,'10',',')
then do string = '70','80','90';
  if findw(column_x,string,',') and not findw(column_y,string,',')
  then diff_column = catx(',',diff_column,string);
end;
drop string;
run;

proc print data=want noobs;
run;

Result:

s_no	column_x	column_y	diff_column
1	40,80,70,90,30	40,80,70,30	 
2	40,80,70,90,30	40,80,70,10,30	90
ed_sas_member
Meteorite | Level 14

Hi @sarav93 

 

when you say:


If the column_x has 70,80,90 and the column_y doesn't have 10, then don't capture the values 70 or 80 or 90 in diff_column.


does it mean that this is the exact rule to be literally implemented (as a consequence, '70,80,90' will be hardcoded, as well as the '10'). Or is it just an example and do we need to generalize? In this case, what would be the rule? E.g. if AT LEAST one of the values from column_x is deleted AND replaced by another value in column_y, then the deleted value should be put in diff_column.

E.g. if ONLY one of the values from column_x is deleted AND replaced by another value in column_y, then the deleted value should be put in diff_column.

...

Thank you for the clarification.

 

Best,

sarav93
Fluorite | Level 6

It is just an example. I have two columns with the data which defined in comma separated.i need to capture the difference in a new column as Diff_column.

 

 

Thanks.

Sarav

Tom
Super User Tom
Super User

I don't understand the rule.  I don't see a pattern in even the small two sample datasets you provided.

 

Are you asking for the values in Y that are not in X?  Or the values in X that are not in Y?  Or a combination of the two?

ID  X                Y             X_only Y_only Both
1  40,80,70,90,30  40,80,70,30     90     .      90
2  40,80,70,90,30  40,80,70,10,30  90     10     90,10

What does the comment of 10 and 90 mean?

hackathon24-white-horiz.png

The 2025 SAS Hackathon Kicks Off on June 11!

Watch the live Hackathon Kickoff to get all the essential information about the SAS Hackathon—including how to join, how to participate, and expert tips for success.

YouTube LinkedIn

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1259 views
  • 3 likes
  • 4 in conversation