DATA Step, Macro, Functions and more

Hard question: Selecting observation(s) conditionally by group

Accepted Solution Solved
Reply
Contributor
Posts: 40
Accepted Solution

Hard question: Selecting observation(s) conditionally by group

I think this question is a hard one.

group x xnum

a ui 100

b gg 100

c hy 40

c kt 59

c jj 1

d kq 88

d qp 3

d qv 9

e nl 100

f aq 34

f zz 2

f bb 30

f mq 34

g qq 66

g gt 5

g zf 29

h hu 100

i aa 40

i xf 20

i ir 40

j er 33

j ae 33

j vt 33

Here, I have to select observation(s) that the value(s) of xnum is(are) maximum by group. One more thing to consider!:  Look at the group of f,i, and j. The group f and i have two maximum values in each group in xnum, and the group j has three maximum values in xnum. So, the final output data should be as follows:

group x xnum

a ui 100

b gg 100

c kt 59

d kq 88

e nl 100

f aq/mq 34

g qq 66

h hu 100

i aa/ir 40

j er/ae/vt 33

I wonder whether or not any SAS expert can solve this problem. Thanks.


Accepted Solutions
Solution
‎12-12-2013 03:13 AM
Respected Advisor
Posts: 4,173

Re: Hard question: Selecting observation(s) conditionally by group

One way to go:

data have;
  infile datalines dlm=' ' truncover;
  input (group x) (:$2.) xnum 8.;
  datalines;
a ui 100
b gg 100
c hy 40
c kt 59
c jj 1
d kq 88
d qp 3
d qv 9
e nl 100
f aq 34
f zz 2
f bb 30
f mq 34
g qq 66
g gt 5
g zf 29
h hu 100
i aa 40
i xf 20
i ir 40
j er 33
j ae 33
j vt 33
;
run;

proc sql;
  create view v_inter as
    select
      group,
      x as _x,
      max(xnum) as max_xnum
    from have
    group by group
    having max(xnum)=xnum
  ;
quit;

data want(drop=_Smiley Happy;
  set v_inter;
  by group;
  length x $30;
  retain x;
  x=catx('/',x,_x);
  if last.group then
    do;
      output;
      call missing(x);
    end;
run;

View solution in original post


All Replies
Solution
‎12-12-2013 03:13 AM
Respected Advisor
Posts: 4,173

Re: Hard question: Selecting observation(s) conditionally by group

One way to go:

data have;
  infile datalines dlm=' ' truncover;
  input (group x) (:$2.) xnum 8.;
  datalines;
a ui 100
b gg 100
c hy 40
c kt 59
c jj 1
d kq 88
d qp 3
d qv 9
e nl 100
f aq 34
f zz 2
f bb 30
f mq 34
g qq 66
g gt 5
g zf 29
h hu 100
i aa 40
i xf 20
i ir 40
j er 33
j ae 33
j vt 33
;
run;

proc sql;
  create view v_inter as
    select
      group,
      x as _x,
      max(xnum) as max_xnum
    from have
    group by group
    having max(xnum)=xnum
  ;
quit;

data want(drop=_Smiley Happy;
  set v_inter;
  by group;
  length x $30;
  retain x;
  x=catx('/',x,_x);
  if last.group then
    do;
      output;
      call missing(x);
    end;
run;

Occasional Contributor
Posts: 7

Re: Hard question: Selecting observation(s) conditionally by group

Here is another:

proc sort data=have;

by group descending xnum;

run;

data want(keep=group--xmax);

length group $1 xxx $100 xmax 8;

do until(last.group);

    set have;

    by group;

    if first.group then xmax=xnum;

    if xnum=xmax then xxx=catx("/",xxx,x);

end;

run;

Super Contributor
Posts: 345

Re: Hard question: Selecting observation(s) conditionally by group

And a one-step-solution.

data _null_;

    length group $ 2

        xList $ 200

        max 8

    ;

   

    if _n_ = 1 then do;

        declare hash h(ordered: 'yes');

        h.defineKey('group');

        h.defineData('group', 'xList', 'max');

        h.defineDone();

        call missing(group, xList, max);

    end;

    set work.have end= jobDone;

    if h.find() ^= 0 then do;

        /* new group */

        max = xnum;

        xList = x;

        h.add();

    end;

    else do;

        if max = xnum then do;

            /* adding x to the group */

            xList = catx('/', xList, x);

            h.replace();

        end;

        if max < xnum then do;

            /* replacing small xnum */

            max = xnum;

            xList = x;

            h.replace();

        end;

    end;

    if jobDone then do;

        h.output(dataset: 'work.want(rename=(xList = x max = xnum))');

    end;

run;

Occasional Contributor
Posts: 9

Re: Hard question: Selecting observation(s) conditionally by group

Hi... Pls the try following code:

data have;

  infile datalines dlm=' ' truncover;

  input (groupname x) (:$2.) xnum 8.;

  datalines;

a ui 100

b gg 100

c hy 40

c kt 59

c jj 1

d kq 88

d qp 3

d qv 9

e nl 100

f aq 34

f zz 2

f bb 30

f mq 34

g qq 66

g gt 5

g zf 29

h hu 100

i aa 40

i xf 20

i ir 40

j er 33

j ae 33

j vt 33

;

run;

proc sql;

create table temp_table as(select groupname, x , xnum from have a where xnum=(select max(xnum) from have b group by groupname having a.groupname=b.groupname));

quit;

proc sort data=temp_table; by groupname xnum; run;

data want;length b1 $20.; retain b1;

set temp_table; by groupname xnum;

if first.xnum then b1=x;

else b1=catt(b1,',',x);

if last.xnum;

drop x; rename b1=x;

run;

Respected Advisor
Posts: 3,156

Re: Hard question: Selecting observation(s) conditionally by group

Surprised that nobody mentioned the classical 2XDOW approach:

data have;

  infile datalines dlm=' ' truncover;

  input (group x) (:$2.) xnum 8.;

  datalines;

a ui 100

b gg 100

c hy 40

c kt 59

c jj 1

d kq 88

d qp 3

d qv 9

e nl 100

f aq 34

f zz 2

f bb 30

f mq 34

g qq 66

g gt 5

g zf 29

h hu 100

i aa 40

i xf 20

i ir 40

j er 33

j ae 33

j vt 33

;

run;

data want;

do until (last.group);

  set have;

    by group;

       length list $ 100;

     max=max(max,xnum);

end;

do until (last.group);

  set have;

    by group;

       if xnum=max then list=catx('/',list,x);

end;

drop x xnum;

run;

Haikuo

Contributor
Posts: 40

Re: Hard question: Selecting observation(s) conditionally by group

Thank you, all. You guys are amazing!

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 413 views
  • 6 likes
  • 6 in conversation