mazouz
Calcite | Level 5
      col1    col2   col3   col4  col5  col6   col7  col8  col9  col10       FIRST DIFF  SECOND DIFF  THIRD DIFF
obs1  -16,13  -6,45  -9,68  ZE    AM    16822  2     2     XXX   09-janv-19  col6        -            -
obs2  -16,13  -6,45  -9,68  ZE    AM    16822  2     2     XXX   16-janv-19  col6        -            -
obs3  -16,13  -6,45  -9,68  ZE    AM    16822  2     2     XXX   23-janv-19  col6        -            -
obs4  -16,13  -6,45  -9,68  ZE    AM    16822  2     2     XXX   30-janv-19  col6        -            -
obs5  16,13   6,45   9,68   ZE    AM    16892  2     2     XXX   09-janv-19  col6        -            -
obs6  16,13   6,45   9,68   ZE    AM    16892  2     2     XXX   16-janv-19  col6        -            -
obs7  16,13   6,45   9,68   ZE    AM    16892  2     2     XXX   23-janv-19  col6        -            -
obs8  16,13   6,45   9,68   ZE    AM    16892  2     2     XXX   30-janv-19  col6        -            -

I want to pair rows whose col1, col2 and col3 values are the negatives of each other and compare those rows on col4 to col10. I want to detect when paired rows do not match on every column from col4 to col10; in this example, col6 does not match.

6 REPLIES
PeterClemmensen
Tourmaline | Level 20

Why col6? I don't understand this logic. Please be more specific.

mazouz
Calcite | Level 5
Because 16822 is not the same as 16892.
Between two rows, col4 to col10 must be the same when col1, col2 and col3 hold each other's negative counterparts.
andreas_lds
Jade | Level 19

Please don't use only upper-case letters in the subject. This is called "screaming", and I prefer not to be yelled at.

You will also want to post the data in usable form: a data step using datalines/cards. Extend the data so that every case you are looking at has an example.

mazouz
Calcite | Level 5

data have;
/* col1-col3 are amounts, so they are read as numeric; col4-col10 are character class variables */
input col1 col2 col3 col4 $ col5 $ col6 $ col7 $ col8 $ col9 $ col10 $;
datalines;
-16 -6 -9 ze am o 2 km JH date1
-16 -6 -9 ze am o 2 km JH date2
-16 -6 -9 ze am o 2 km JH date3
-16 -6 -9 ze am o 2 km JH date4
16 6 9 ze am n 2 km JH date1
16 6 9 ze am n 2 km JH date2
16 6 9 ze am n 2 km JH date3
16 6 9 ze am n 2 km JH date4
;
run;

Columns 4 to 10 are class variables; they must be the same on both rows.
RichardDeVen
Barite | Level 11
What are your rules for similarity?
For the case of computing a column's dissimilar classification, would it be "the number of distinct character values, or absolute numeric values, in the column is greater than 1"?
Why is column 10 not a DIFF result?
Are you looking at value run patterns instead of similarities (or equivalences) between two halves of the rows? If so, what if you have an odd number of rows?
And, as devil's advocate: if only one column has a single value and all the other columns have varying values, could the static column now be the dissimilar one?
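
The distinct-count rule proposed above can be sketched as follows. This is only an illustration, assuming the sample data from the thread with col1-col3 read as numeric amounts; the table name distinct_counts and the n_col names are hypothetical.

```sas
/* Sample data from the thread; col1-col3 are read as numeric amounts (assumption) */
data have;
  input col1 col2 col3 col4 $ col5 $ col6 $ col7 $ col8 $ col9 $ col10 $;
  datalines;
-16 -6 -9 ze am o 2 km JH date1
-16 -6 -9 ze am o 2 km JH date2
-16 -6 -9 ze am o 2 km JH date3
-16 -6 -9 ze am o 2 km JH date4
16 6 9 ze am n 2 km JH date1
16 6 9 ze am n 2 km JH date2
16 6 9 ze am n 2 km JH date3
16 6 9 ze am n 2 km JH date4
;
run;

/* Count distinct absolute amounts and distinct class values per column. */
/* Under the proposed rule, a count greater than 1 marks the column as dissimilar. */
proc sql;
  create table distinct_counts as
  select count(distinct abs(col1)) as n_col1,
         count(distinct abs(col2)) as n_col2,
         count(distinct abs(col3)) as n_col3,
         count(distinct col4)      as n_col4,
         count(distinct col5)      as n_col5,
         count(distinct col6)      as n_col6,
         count(distinct col7)      as n_col7,
         count(distinct col8)      as n_col8,
         count(distinct col9)      as n_col9,
         count(distinct col10)     as n_col10
  from have;
quit;
```

On this sample, both col6 and col10 have more than one distinct value, so a rule based only on distinct counts would flag col10 as well. That is the reason for the question about column 10.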
mazouz
Calcite | Level 5
To put you in context, this is about invoice cancellation and reimbursement. When an invoice is refunded, I must have two lines that have the same values for the class variables but opposite amounts (-16 against 16).
My objective is to check whether two such lines match on all the class variables and, when one, two or three variables do not match, to detect those variables so that I can correct them.
Column 10 is not a DIFF result because each of its values also appears on the matching positive line.
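
One way to express the objective described above is a self-join that pairs each negative line with its positive counterpart and lists the class variables that differ. This is a minimal sketch, not a confirmed solution from the thread. It assumes that col1-col3 are numeric, that a pair is identified by opposite amounts plus an equal col10, and that only col4-col9 need checking; the names diffs and diff_cols are hypothetical.

```sas
/* Sample data from the thread; col1-col3 are read as numeric amounts (assumption) */
data have;
  input col1 col2 col3 col4 $ col5 $ col6 $ col7 $ col8 $ col9 $ col10 $;
  datalines;
-16 -6 -9 ze am o 2 km JH date1
-16 -6 -9 ze am o 2 km JH date2
-16 -6 -9 ze am o 2 km JH date3
-16 -6 -9 ze am o 2 km JH date4
16 6 9 ze am n 2 km JH date1
16 6 9 ze am n 2 km JH date2
16 6 9 ze am n 2 km JH date3
16 6 9 ze am n 2 km JH date4
;
run;

/* n = negative (cancellation) line, p = positive (original) line.        */
/* Pairing key (assumption): opposite amounts and the same col10 value.   */
/* CATX skips blank arguments, so diff_cols lists only differing columns. */
proc sql;
  create table diffs as
  select n.col1, n.col2, n.col3, n.col10,
         catx(' ',
              case when n.col4 ne p.col4 then 'col4' else ' ' end,
              case when n.col5 ne p.col5 then 'col5' else ' ' end,
              case when n.col6 ne p.col6 then 'col6' else ' ' end,
              case when n.col7 ne p.col7 then 'col7' else ' ' end,
              case when n.col8 ne p.col8 then 'col8' else ' ' end,
              case when n.col9 ne p.col9 then 'col9' else ' ' end
             ) as diff_cols length=40
  from have as n
       inner join have as p
       on  n.col1  = -p.col1
       and n.col2  = -p.col2
       and n.col3  = -p.col3
       and n.col10 = p.col10
  where n.col1 < 0;
quit;
```

With the sample data, each negative line finds exactly one positive partner, and diff_cols should contain col6 for every pair. A blank diff_cols would mean the pair matches on all checked class variables. If the real data can hold several invoices with the same amounts and date, the join key would need an additional identifier.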
