Good Evening,
I have two columns which have comma seperated values. i need to Comapre and caputre the diff in new column as Diff_column.
s.no column_x column_y
---- --------- --------
1 40,80,70,90,30 40,80,70,30
2 40,80,70,90,30 40,80,70,10,30
If the column_x has 70,80,90 and the column_y doesn't have 10, then don't capture the values 70 or 80 or 90 in diff_column.
Expected Result:
---------------
s.no column_x column_y diff_column
---- --------- -------- -----------
1 40,80,70,90,30 40,80,70,30 -
2 40,80,70,90,30 40,80,70,10,30 90
Thanks in advance.
Sarav
See
data have;
input s_no (column_x column_y) (:$20.);
datalines;
1 40,80,70,90,30 40,80,70,30
2 40,80,70,90,30 40,80,70,10,30
;
data want;
set have;
length diff_column $20;
if findw(column_y,'10',',')
then do string = '70','80','90';
if findw(column_x,string,',') and not findw(column_y,string,',')
then diff_column = catx(',',diff_column,string);
end;
drop string;
run;
proc print data=want noobs;
run;
Result:
s_no column_x column_y diff_column 1 40,80,70,90,30 40,80,70,30 2 40,80,70,90,30 40,80,70,10,30 90
Hi @sarav93
when you say:
If the column_x has 70,80,90 and the column_y doesn't have 10, then don't capture the values 70 or 80 or 90 in diff_column.
does it mean that this is the exact rule to be literally implemented (as a consequence, '70,80,90' will be hardcoded, as well as the '10'). Or is it just an example and do we need to generalize? In this case, what would be the rule? E.g. if AT LEAST one of the values from column_x is deleted AND replaced by another value in column_y, then the deleted value should be put in diff_column.
E.g. if ONLY one of the values from column_x is deleted AND replaced by another value in column_y, then the deleted value should be put in diff_column.
...
Thank you for the clarification.
Best,
It is just an example. I have two columns with the data which defined in comma separated.i need to capture the difference in a new column as Diff_column.
Thanks.
Sarav
I don't understand the rule. I don't see a pattern in even the small two sample datasets you provided.
Are you asking for the values in Y that are not in X? Or the values in X that are not in Y? Or a combination of the two?
ID X Y X_only Y_only Both
1 40,80,70,90,30 40,80,70,30 90 . 90
2 40,80,70,90,30 40,80,70,10,30 90 10 90,10
What does the comment of 10 and 90 mean?
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.