Help using Base SAS procedures

Select groupings if contains an intermittent field

Reply
Super Contributor
Posts: 401

Select groupings if contains an intermittent field

I have a large dataset grouped by Field_A and dates for each rows within the groups.  There is also another Field_B that may or may not be intermittent within each group.. I need to pick only the groupings that have Field_B intermittent..  here's what I mean:

HAVE:

Field_A          Date          Field_B
A                 2001.01             T

A                 2001.02             C

A                 2001.03             C

A                 2001.04             T

----------------------------------------------------

B                 2001.01             T 

B                 2001.02             T

B                 2001.03             C

B                 2001.04             C

-----------------------------------------------------

C                 2001.01             C 

C                 2001.02             T

C                 2001.03             C

C                 2001.04             T

-----------------------------------------------------

D                 2001.01             C 

D                 2001.02             C

D                 2001.03             T

D                 2001.04             T

WANT:

Field_A          Date          Field_B
A                 2001.01             T

A                 2001.02             C

A                 2001.03             C

A                 2001.04             T

-----------------------------------------------------

C                 2001.01             T 

C                 2001.02             C

C                 2001.03             T

C                 2001.04             C

           

WANT tables shows that Field_B has values that are mixed within it, not arranged as seen in Field_A = B or D in the HAVE table.

PROC Star
Posts: 7,487

Re: Select groupings if contains an intermittent field

I think you have to specify exactly what you mean by "intermittent".

Super Contributor
Posts: 401

Re: Select groupings if contains an intermittent field

The options could be either Option 1 : Field_B is 1 value and then the other (eg. T's and then C's or vice versa) OR

Option 2 : T's then C's and then T's again or C's then T's and then C's again...  so if there is only one switch in Feild_B then don't keep . if there is more than one switch in Field_B then keep (sorted by date and groupped by Field_A)

Super User
Posts: 19,855

Re: Select groupings if contains an intermittent field

Here's a very basic way. Count the number of switches in a dataset and then select the ones that meet your criteria.

data have;
input Field_A  $        Date   $       Field_B $;
cards;
A                 2001.01             T
A                 2001.02             C
A                 2001.03             C
A                 2001.04             T
B                 2001.01             T
B                 2001.02             T
B                 2001.03             C
B                 2001.04             C
C                 2001.01             C
C                 2001.02             T
C                 2001.03             C
C                 2001.04             T
D                 2001.01             C
D                 2001.02             C
D                 2001.03             T
D                 2001.04             T
;
run;

data have2;
set have;
by field_a field_b notsorted;

retain switch;

if first.field_a then switch=0;

if first.field_b and not first.field_a then switch+1;

run;

proc sql;
create table want as
select * from have2
group by field_a
having max(switch)>1;
quit;

PROC Star
Posts: 7,487

Re: Select groupings if contains an intermittent field

: As usual, you probably have a number of options.  The following probably isn't the most efficient, but easy to follow what the code is doing:

data want (drop=test1-test3);

  set have (in=a) have (in=b);;

  by field_a;

  retain test1-test3;

  length test1 test2 $1;

  if a then do;

    if first.field_a then do;

      test1=field_b;

      call missing(test2);

      call missing(test3);

    end;

    else if missing(test2) then do;

      if field_b ne test1 then test2=field_b;

    end;

    else if field_b ne test2 then test3=1;

  end;

  else if test3 then output;

run;

Respected Advisor
Posts: 4,930

Re: Select groupings if contains an intermittent field

I would count on the BY mechanism to count the number of switches in Field_B, like this :

data have;
input Field_A $ Date $ Field_B $;
datalines;
A                 2001.01             T
A                 2001.02             C
A                 2001.03             C
A                 2001.04             T
B                 2001.01             T 
B                 2001.02             T
B                 2001.03             C
B                 2001.04             C
C                 2001.01             C 
C                 2001.02             T
C                 2001.03             C
C                 2001.04             T
D                 2001.01             C 
D                 2001.02             C
D                 2001.03             T
D                 2001.04             T
;

data want;
do until(last.Field_A);
     set have; by Field_A Field_B notsorted;
     c = sum(c, first.Field_b);
     end;
do until(last.Field_A);
     set have; by Field_A Field_B notsorted;
     if c <= 2 then output;
     end;
drop c;
run;

proc print data=want noobs; run;

PG

PG
Respected Advisor
Posts: 3,156

Re: Select groupings if contains an intermittent field

Liked!

Ask a Question
Discussion stats
  • 6 replies
  • 261 views
  • 2 likes
  • 5 in conversation