BookmarkSubscribeRSS Feed
sarav93
Fluorite | Level 6


Good Evening,

I have two columns which have comma seperated values. i need to Comapre and caputre the diff in new column as Diff_column.


s.no               column_x                 column_y
----                 ---------                      --------
1                  40,80,70,90,30        40,80,70,30
2                 40,80,70,90,30        40,80,70,10,30

If the column_x has 70,80,90 and the column_y doesn't have 10, then don't capture the values 70 or 80 or 90 in diff_column.

Expected Result:
---------------
s.no             column_x              column_y       diff_column
----              ---------                   --------             -----------
1               40,80,70,90,30       40,80,70,30          -
2               40,80,70,90,30       40,80,70,10,30    90

 

 

 

Thanks in advance.

 

Sarav

4 REPLIES 4
Kurt_Bremser
Super User

See

data have;
input s_no (column_x column_y) (:$20.);
datalines;
1 40,80,70,90,30 40,80,70,30
2 40,80,70,90,30 40,80,70,10,30
;

data want;
set have;
length diff_column $20;
if findw(column_y,'10',',')
then do string = '70','80','90';
  if findw(column_x,string,',') and not findw(column_y,string,',')
  then diff_column = catx(',',diff_column,string);
end;
drop string;
run;

proc print data=want noobs;
run;

Result:

s_no	column_x	column_y	diff_column
1	40,80,70,90,30	40,80,70,30	 
2	40,80,70,90,30	40,80,70,10,30	90
ed_sas_member
Meteorite | Level 14

Hi @sarav93 

 

when you say:


If the column_x has 70,80,90 and the column_y doesn't have 10, then don't capture the values 70 or 80 or 90 in diff_column.


does it mean that this is the exact rule to be literally implemented (as a consequence, '70,80,90' will be hardcoded, as well as the '10'). Or is it just an example and do we need to generalize? In this case, what would be the rule? E.g. if AT LEAST one of the values from column_x is deleted AND replaced by another value in column_y, then the deleted value should be put in diff_column.

E.g. if ONLY one of the values from column_x is deleted AND replaced by another value in column_y, then the deleted value should be put in diff_column.

...

Thank you for the clarification.

 

Best,

sarav93
Fluorite | Level 6

It is just an example. I have two columns with the data which defined in comma separated.i need to capture the difference in a new column as Diff_column.

 

 

Thanks.

Sarav

Tom
Super User Tom
Super User

I don't understand the rule.  I don't see a pattern in even the small two sample datasets you provided.

 

Are you asking for the values in Y that are not in X?  Or the values in X that are not in Y?  Or a combination of the two?

ID  X                Y             X_only Y_only Both
1  40,80,70,90,30  40,80,70,30     90     .      90
2  40,80,70,90,30  40,80,70,10,30  90     10     90,10

What does the comment of 10 and 90 mean?

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 683 views
  • 3 likes
  • 4 in conversation