Dear SAS community,
I have two source tables which I'd like to combine into one dataset. I'd like to perform a check based on some conditions which'll decide whether I select the records from Table A or the records Table B. The conditions will be based on the maximum depth for a group across the two tables and the number of records within the group in events where the maximum depth is equal across the two tables. With everything being equal, I prefer to use the records from Table A.
For demonstration purposes, let's say I have these tables:
Table A:
Name Depth Coord_X Coord_Y
A 0 5 6
A 100 5 6
A 200 5 6
A 300 5 6
B 0 4 3
B 2000 4 3
B 3000 4 3
B 4000 4 3
C 0 2 1
C 3000 2 1
C 4000 2 1
C 5000 2 1
Table B
Name Depth Coord_X Coord_Y
A 0 5 6
A 250 5 6
A 300 5 6
B 0 4 3
B 2000 4 3
B 3000 4 3
B 4000 4 3
C 0 2 1
C 3000 2 1
C 4000 2 1
C 5000 2 1
C 5500 2 1
D 0 11 12
D 10 11 12
D 20 11 12
As can be seen, the tables contains the same attributes. However, Table A may contain some groups not in Table B, and vice versa. This can be seen in the example above where Table B contains a group D not in Table A.
First, I'd like to test on the maximum depth - I want to select the records from the input table which contains the greatest depth. Using the tables above as an example, for group C, I'd like my target table to contain the records from Table B as this has a greater maximum depth then Table A. The maximum depth is the key condition, and as long as a table contains a greater maximum depth then the other table, the number of records within each group is irrelevant, and I always select the records from the source containing the maximum depth. In other words, even if Table A contained 1000 records and Table B contained 4 records, as long as Table B contains a greater maximum depth I select the 4 records from Table B over the 1000 records from Table A.
Secondly, if the two tables contains an equal maximum depth for a group, I use the number of records within each group as a tie breaker. Using the above tables as an example, for group A, I want to select the records from Table A as this contains 4 records compared to the 3 records in Table B with the maximum depth being equal between the two tables.
In the event that both the maximum depth and the number of records are equal, I want to use Table A as my source.
Overall, based on these conditions, my target table should look like this. Notice that I want to add a column which tells which source has been selected for my Target table:
Target Table
Name Depth Coord_X Coord_Y Source
A 0 5 6 Table A
A 100 5 6 Table A
A 200 5 6 Table A
A 300 5 6 Table A
B 0 4 3 Table A
B 2000 4 3 Table A
B 3000 4 3 Table A
B 4000 4 3 Table A
C 0 2 1 Table B
C 3000 2 1 Table B
C 4000 2 1 Table B
C 5000 2 1 Table B
C 5500 2 1 Table B
D 0 11 12 Table B
D 10 11 12 Table B
D 20 11 12 Table B
I am fairly new to SAS and programming in general, and I am therefore unsure what the best strategy to solving this issue efficiently would be.
... View more