Hi everyone, Monday morning and back to work. Hope you had a nice weekend.
So, in reply to Cynthia:
1) Y/A is NOT acceptable, because it starts to overlap at row Y/B. In this case, I would like SAS to identify that there is a Y/A-Y/B overlap, and then return all of the Y rows.
This means that the ENTIRE set of Y rows is not acceptable and needs manual truncating, which is why I need to get all the Y rows in an output.
2) Z/C is acceptable because they change AT THE SAME TIME, exactly as you deduced.
3) None of Y/A (both occurrences), Y/B or Y/D is acceptable, because countycode changes while postnumbercode stays the same. That's the stuff I want to pick up!
4) My output, given the sample above, would be ALL the rows where postnumbercode is Y, because there is variation in countycode while postnumbercode stays the same.
5) Let's say that POSTNUMBERCODE = Z is first found to have COUNTYCODE = A. Then POSTNUMBERCODE = Z SHOULD NEVER have COUNTYCODE = B, and if it happens, I need to identify it. I want to allow several postnumbercodes for the same countycode, so POSTNUMBERCODE = A, B, C for COUNTYCODE = C is acceptable. But POSTNUMBERCODE = A for COUNTYCODE = C, D, E is NOT acceptable.
And if "unacceptable" variation is found, then I would like SAS to output all the rows of the postnumbercode in which that variation happened. In the case above, all the Y rows.
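To make the rule in 4) and 5) concrete, here is a minimal, untested sketch of the kind of check being described. The dataset name HAVE, the helper variable ROWNUM and the sample values are assumptions made up for illustration, not the real data. The idea is to find every postnumbercode that has more than one distinct countycode and then return all of its rows in their original order.

```sas
/* Hypothetical sample data: dataset name HAVE and all values are assumptions */
data have;
  input postnumbercode $ countycode $;
  rownum = _n_;   /* helper variable (assumed): remembers the original row order */
  datalines;
X A
Y A
Y B
Y A
Y D
Z C
;
run;

proc sql;
  /* Keep every row whose postnumbercode maps to more than one countycode */
  create table flagged as
  select *
  from have
  where postnumbercode in
        (select postnumbercode
         from have
         group by postnumbercode
         having count(distinct countycode) > 1)
  order by rownum;   /* restore the original order, since PROC SQL does not guarantee it */
quit;
```

With this sample, FLAGGED would hold the four Y rows, while X/A and Z/C are left out. Nothing is deleted or re-sorted in HAVE itself, so the manual truncating can still be done afterwards.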
-
In reply to Peter.C:
Yes. The deal is that I have some more variables, and I need a specific numeric order of my data. So re-sorting and (dis)allowing some rows is not an option for me.
In reply to SPR:
Sadly, using NODUPKEY will ruin my dataset, as I need to IDENTIFY where the "overlap" specified above happens and truncate some data manually.
In reply to RSB:
It isn't that the SECOND occurrence of county for the same postcode is unacceptable. The unacceptable part is that there is variation in countycode while postcode stays the same! (See replies 4 and 5 to Cynthia.)
-
Well, thanks a bunch for the input already! I hope that my second post cleared up some stuff.
-
This is a link to a small sample of data. I've deleted some sensitive stuff, but the essence of my problem is visible. All the red rows are the stuff that I want SAS to output to me, while the stuff marked green is acceptable.
Link:
http://dl.dropbox.com/u/1321324/Work/sample.xls
-
-T
PS: My data is sorted with POSTNUMBERCODE from smallest to largest.