Help using Base SAS procedures

Flagging out observations based on multiple by group values

Accepted Solution Solved
Reply
Contributor
Posts: 68
Accepted Solution

Flagging out observations based on multiple by group values

Dear all,

I have a huge dataset with the following info

ID_AID_BEXP_AEXP_BCREATION_DATERAT_X
147976667KMH124OCT2011:13:52:224
147976667                     06AUG2012:22:44:3017
55112366667ADN103JUN2012:08:05:293
147976667MR0127MAR2012:08:56:0817
147976667JHM124OCT2011:13:51:464
147976667JHM224OCT2011:13:51:434
55112366667ADN103JUN2013:10:54:193

I need to flag or save in a different dataset all the observations that for variable EXP_A have different EXP_B and RAT_X

Please note that generally the EXP_A is grouped under ID_A & ID_B, although it seems that the same EXP_A can be found under two or more different groupings by ID_A & ID_B values (I do not know whether this is an error or not )

Thank you in advance.

Best regards

Nikos


Accepted Solutions
Solution
‎05-03-2014 02:41 PM
Respected Advisor
Posts: 4,925

Re: Flagging out observations based on multiple by group values

If I understand the problem statement, you may want something like this:

data have;

input  ID_A ID_B EXP_A $ EXP_B $ CREATION_DATE $ RAT_X;

drop creation_date;

datalines;

1 4797 6667 KMH1 24OCT2011:13:52:22 4

1 4797 6667 .       06AUG2012:22:44:30 17

55 11236 6667 ADN1 03JUN2012:08:05:29 3

1 4797 6667 MR01 27MAR2012:08:56:08 17

1 4797 6667 JHM1 24OCT2011:13:51:46 4

1 4797 6667 JHM2 24OCT2011:13:51:43 4

55 11236 6667 ADN1 03JUN2013:10:54:19 3

;

proc sql;

create table flagged as

select ID_A, ID_B, EXP_A

from have

group by ID_A, ID_B, EXP_A

having count(distinct catx("-", EXP_B, RAT_X)) > 1;

select * from flagged;

quit;

PG

PG

View solution in original post


All Replies
Solution
‎05-03-2014 02:41 PM
Respected Advisor
Posts: 4,925

Re: Flagging out observations based on multiple by group values

If I understand the problem statement, you may want something like this:

data have;

input  ID_A ID_B EXP_A $ EXP_B $ CREATION_DATE $ RAT_X;

drop creation_date;

datalines;

1 4797 6667 KMH1 24OCT2011:13:52:22 4

1 4797 6667 .       06AUG2012:22:44:30 17

55 11236 6667 ADN1 03JUN2012:08:05:29 3

1 4797 6667 MR01 27MAR2012:08:56:08 17

1 4797 6667 JHM1 24OCT2011:13:51:46 4

1 4797 6667 JHM2 24OCT2011:13:51:43 4

55 11236 6667 ADN1 03JUN2013:10:54:19 3

;

proc sql;

create table flagged as

select ID_A, ID_B, EXP_A

from have

group by ID_A, ID_B, EXP_A

having count(distinct catx("-", EXP_B, RAT_X)) > 1;

select * from flagged;

quit;

PG

PG
🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 1 reply
  • 202 views
  • 0 likes
  • 2 in conversation