Select Highest Value From Multiple Obs Clusters

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 7
Accepted Solution

Select Highest Value From Multiple Obs Clusters

Greetings, I am working with some complex phone calls logs from an organization that has multiple locations, and a rollover system that routes calls to a given location to multiple extension lines.

 

There is one particular analysis I would like to do, but my SAS skills are not up to the task (I've spent several hours trying to figure this out, but remain 100% stumped).

 

The following is a greatly simplified example of my problem: two calls are shown (one each to Locations A and B), with three rollovers for the first call, and four for the second. What I need to do is flag the rollover with the longest duration for each call (without deleting shorter rollovers).

 

sheet 2.png

Can someone suggest a general approach? I've tried a few things, but am getting nowhere.

 

Thanks.

 

-- Glenn


Accepted Solutions
Solution
‎06-05-2017 05:44 PM
Contributor
Posts: 63

Re: Select Highest Value From Multiple Obs Clusters

@kiranv_: It does not have to quite so complicated:

 

proc sql;
 select *,
   case
     when b=max(b) then 'Y'
     else 'N'
   end as long_flag
 from aaa
 group by id;

should do the trick!


kiranv_ wrote:

I took simplied version of your sample and tried to solve the problem. Hope this helps

 data aaa;
 input id c $ b;
 datalines;
 1 vv 100
 1 kk 90
 2 cc 229
 2 dd 100
 ;
 
 proc sql;
 select a.*,
      case when a.id=c.id 
            and a.b= c.b then 'y'
            else 'n' end as long_flag
  from 
 (select * from aaa) a
 left join
 (select id, max(b) as b
 from aaa
 group by id)c
 on a.id =c.id
 and a.b =c.b;
 quit;

 

 

View solution in original post


All Replies
Regular Contributor
Posts: 223

Re: Select Highest Value From Multiple Obs Clusters

I took simplied version of your sample and tried to solve the problem. Hope this helps

 data aaa;
 input id c $ b;
 datalines;
 1 vv 100
 1 kk 90
 2 cc 229
 2 dd 100
 ;
 
 proc sql;
 select a.*,
      case when a.id=c.id 
            and a.b= c.b then 'y'
            else 'n' end as long_flag
  from 
 (select * from aaa) a
 left join
 (select id, max(b) as b
 from aaa
 group by id)c
 on a.id =c.id
 and a.b =c.b;
 quit;
Solution
‎06-05-2017 05:44 PM
Contributor
Posts: 63

Re: Select Highest Value From Multiple Obs Clusters

@kiranv_: It does not have to quite so complicated:

 

proc sql;
 select *,
   case
     when b=max(b) then 'Y'
     else 'N'
   end as long_flag
 from aaa
 group by id;

should do the trick!


kiranv_ wrote:

I took simplied version of your sample and tried to solve the problem. Hope this helps

 data aaa;
 input id c $ b;
 datalines;
 1 vv 100
 1 kk 90
 2 cc 229
 2 dd 100
 ;
 
 proc sql;
 select a.*,
      case when a.id=c.id 
            and a.b= c.b then 'y'
            else 'n' end as long_flag
  from 
 (select * from aaa) a
 left join
 (select id, max(b) as b
 from aaa
 group by id)c
 on a.id =c.id
 and a.b =c.b;
 quit;

 

 

Regular Contributor
Posts: 223

Re: Select Highest Value From Multiple Obs Clusters

@s_lassen you are absolutely right. Many times I use various RDBMS in which columns in group by and select are required to be same, else you get an error. It is so that I forget sometimes, that it can be done in much easier way  in Proc SQL.

Contributor
Posts: 63

Re: Select Highest Value From Multiple Obs Clusters

@kiranv_: don't I know the feeling! So many times I have tried to use HAVING in Oracle in ways that would be perfectly valid in SAS, and received some mysterious error message for my efforts.


kiranv_ wrote:

@s_lassen you are absolutely right. Many times I use various RDBMS in which columns in group by and select are required to be same, else you get an error. It is so that I forget sometimes, that it can be done in much easier way  in Proc SQL.


 

Occasional Contributor
Posts: 7

Re: Select Highest Value From Multiple Obs Clusters

Worked perfectly, thank you!

 

-- Glenn

Grand Advisor
Posts: 17,297

Re: Select Highest Value From Multiple Obs Clusters

Your data as shown, doesn't provide a way to uniquely identify a call. 


One solution to this problem would be to sort the data - PROC SORT and then use BY processing. If sorted by call and duration, the duration with the longest can be either the first or last record based on how you sort. There's a chapter on BY group processing in the documentaiton. 

 

Another solution, is to use an SQL query - note you still make two passes of the data here. S/he used a subquery but you could probably do it in one step as well, see this example based on SAS SQL. 

 

proc sql;
create table want as
select name, age, sex, height, weight, max(age) as oldest,
case when age= CALCULATED oldest then 1 else 0 end as flag
from sashelp.class
group by sex;
quit;

proc print data=want;run;
Occasional Contributor
Posts: 7

Re: Select Highest Value From Multiple Obs Clusters

Thank you for the replies, I'll try these suggestions out later today, and follow up about the results.

-- Glenn
☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 141 views
  • 2 likes
  • 4 in conversation