Hi,
I have a list of data in the 10-100's of thousands that changes weekly. I'm trying to find out if a unit has different data entered than the others in the group, on the same date. For example:
ID | Unit | Field | Date |
1297 | 1 | G02 | 04/09/2018 |
1297 | 1 | G02 | 09/30/2018 |
1300 | 1 | G04 | 03/01/2018 |
1300 | 2 | G04 | 03/19/2018 |
1300 | 2 | G04 | 08/01/2018 |
1304 | 1 | G03 | 12/01/2018 |
1315 | 1 | G01 | 01/23/2018 |
1315 | 2 | G01 | 01/23/2018 |
1318 | 1 | G02 | 06/05/2018 |
1318 | 1 | G02 | 12/05/2018 |
1318 | 2 | G02 | 06/05/2018 |
1318 | 2 | G02 | 12/05/2018 |
1318 | 3 | G07 | 06/05/2018 |
In the case above, I'd want a way to flag ID 1318, on 06/05/2018 because 2 of the units have G02, while the third has G07. Also, as an added difficulty step... some of the fields might start with something other than G. I want to ignore those from the comparison (so if a ID has 3 units, and 1 unit is P01, while other 2 are G01, it wouldn't need flagged)
Any help on this would be greatly appreciated! Thank you!
data have;
input ID Unit Field $ Date :mmddyy10.;
format Date mmddyy10.;
cards;
1297 1 G02 04/09/2018
1297 1 G02 09/30/2018
1300 1 G04 03/01/2018
1300 2 G04 03/19/2018
1300 2 G04 08/01/2018
1304 1 G03 12/01/2018
1315 1 G01 01/23/2018
1315 2 G01 01/23/2018
1318 1 G02 06/05/2018
1318 1 G02 12/05/2018
1318 2 G02 06/05/2018
1318 2 G02 12/05/2018
1318 3 G07 06/05/2018
1318 3 P01 06/05/2018
;
proc sql;
create table want(drop=t f) as
select *,max(f) as flag
from
(select *,first(field)='G' as t,(count(distinct field)>1)*(calculated t=1) as f from have group by id,t)
group by id;
quit;
/* Assuming I understand your requirement and the above meets your desired. I added one more PO1 to the last id. If want to flag only G subgroup, proc sql can be made even shorter*/
Hi @BPT227 A more efficient would be DOW, and I find that boring as a video game to play. Kindly wait for somebody to give you that. If not, I will post that later. Proc sql is something not so efficient here, but i enjoy having fun. Cheers
Just to be clear on the logic here. You say:
"In the case above, I'd want a way to flag ID 1318, on 06/05/2018 because 2 of the units have G02, while the third has G07".
However, as I see it, it is the fifth observation in the ID=1318 group that has Field="G07"?
Just so there are choices, here is a simple set of tools:
data test_me / view=test_me;
set have;
field = substr(field, 2);
run;
proc freq data=test_me;
tables id * date * field / noprint out=combos (rename=(count=field_count));
run;
proc freq data=combos;
tables id * date / noprint out=print_me (where=(count > 1) drop=percent);
run;
proc print data=print_me;
run;
The idea is the TEST_ME is view (takes virtually 0 time to create it) tnat revises the FIELD values by chopping off the first character.
Then COMBOS contains one observation for each ID / DATE / FIELD found.
Feed that into a second PROC FREQ, to count how many fields there are per ID / DATE. Keep just those with more than 1 field. The final data set PRINT_ME contains COUNT, a count of the number of different fields found for that ID / DATE.
Very innovative use of proc freq. I learned something new. Thank you @Astounding
@Astounding
Thank you for the response! It's very close to what I need. Here's what I'm seeing:
ID | Unit | Date | Field | flag |
1391 | 3 | 6/15/2018 | E18 | 1 |
1391 | 4 | 6/15/2018 | E18 | 1 |
1391 | 4 | 12/15/2018 | E02 | 1 |
1391 | 5 | 6/15/2018 | E18 | 1 |
1391 | 5 | 12/15/2018 | E02 | 1 |
This ID would not need flagged, due to 6/15/2018 all being E18, while 12/15/2018 is all E02. If the field on the same date is the same, it's good. Only looking to flag the ID's that have different Fields on the same Date, per ID. (If that makes sense)
While I agree with you, I didn't write the code that would assign flag. I only wrote enough to identify who needs to be flagged. So please supply the log from the entire process: creating the flags and then assigning them.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Check out this tutorial series to learn how to build your own steps in SAS Studio.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.