Solved: Getting the average amount of transaction for specifics event

Pontch · Posted 03-30-2012 04:02 PM

Hi guys,

I got a table with a lot of events.

table sample would look like

ItemID Event ID

ABC 100

ABC 200

ABC 300

CBA 100

CBA 200

CBA 300

For the same Item i can have a lot of different events associated, for one item i can have multiple time the same events.

What i would like to achieve is to get the average count of event in the whole population of items.

e.g a table that would look like

Event ID AverageAmountofTx

100 1

200 1.5

300 1

What would be the most efficient way to carry out that testing? Is there a direct way to do this using proc means?

Probably an easy way to do that but it's friday ,-) My table is huge so i would like to avoid creating an intermediary table with:

ItemIT EVENT ID Count

ABC 100 1

ABC 200 2

ABC 300 1

etc

Ksharp · Posted 03-30-2012 11:42 PM

How about:

data have;
input Item_ID $ Event_ID;
datalines;
ABC 100
ABC 200
ABC 200
ABC 300
CBA 100
CBA 200
CBA 300
;
run;
proc sql ;
create table want as
 select event_id,count(event_id)/count(distinct item_id) as avg
  from have
   group by event_id;
quit;

Ksharp

View solution in original post

akberali67 · Posted 03-30-2012 04:11 PM

PROC SUMMARY DATA = HAVE NWAY MISSING;

CLASS EVENTID ITEMID;

OUTPUT OUT = HAVE1;

RUN;

_FREQ_ WILL BE YOUR COUNT AND YOU SHOULD ALSO HAVE OTHER STATISTICS DISPLAYED

Hima · Posted 03-30-2012 04:33 PM

data have;
input Item_ID $ Event_ID;
datalines;
ABC 100
ABC 200
ABC 200
ABC 300
CBA 100
CBA 200
CBA 300
;
run;

proc sort data = have;
by Item_ID;
run;

proc freq data = have;
tables Item_ID*Event_ID / list missing norow nocol nocum nopercent;
run;

Output:

                               Item_ID    Event_ID    Frequency
                                ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ
                                ABC             100           1
                                ABC             200           2
                                ABC             300           1
                                CBA             100           1
                                CBA             200           1
                                CBA             300           1

Pontch · Posted 03-30-2012 04:41 PM

Thanks guys for the quick answers, my question was more to know if you can avoid creating that table:

Item_ID Event_ID Frequency

ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ

ABC 100 1

ABC 200 2

ABC 300 1

CBA 100 1

CBA 200 1

CBA 300 1

I would like to create the following table directly without using an intermediary table (my source table is huge)

Event_ID AverageCount

100 1

200 1.5

300 1

akberali67 · Posted 03-30-2012 04:46 PM

didnt you get those required statiistics using my code for Proc summary??

Hima · Posted 03-30-2012 07:01 PM

data have;

input Item_ID $ Event_ID;

datalines;

ABC 100

ABC 200

ABC 300

CBA 100

CBA 200

CBA 300

;

run;

proc sql;

select Event_id,avg(cnt) as aver from

(select count(item_id) as cnt, event_id,item_id from have group by event_id, Item_id)

group by event_id;

quit;

Tested n working.

Ksharp · Posted 03-30-2012 11:42 PM

How about:

data have;
input Item_ID $ Event_ID;
datalines;
ABC 100
ABC 200
ABC 200
ABC 300
CBA 100
CBA 200
CBA 300
;
run;
proc sql ;
create table want as
 select event_id,count(event_id)/count(distinct item_id) as avg
  from have
   group by event_id;
quit;

Ksharp

Pontch · Posted 03-31-2012 04:29 PM

Thanks guys, will give it a try on Monday, the table will have over 1 billion records pretty soon, so will see how each query perform.

I was also thinking maybe i should just do the average on a given period of time and annualize it so that creating an intermediary table would be ok...

Will still give a try to all queries above will keep you guys updated.

PGStats · Posted 03-31-2012 05:01 PM

It depends if you want the average number of each event type in items in which they appear (which Ksharps' query above will calculate) or the average over all items. Compare :

data have;

input Item_ID $ Event_ID;

datalines;

ABC 100

ABC 200

ABC 300

CBA 100

CBA 200

;

proc sql ;

create table want1 as

select event_id,count(event_id)/count(distinct item_id) as avg

from have

group by event_id;

create table want2 as

select event_id,count(event_id)/(select count(distinct item_id) from have) as avg

from have

group by event_id;

select want1.*, want2.avg as avg2

from want1, want2 where want1.event_ID=want2.event_ID;

quit;

PG

Pontch · Posted 04-02-2012 03:23 PM

Awesome guys, great to have so many different answers!

Thanks for the additional information PG, i am using Ksharp query but yours just gave me more depth for my analysis

Getting the average amount of transaction for specifics event

Re: Getting the average amount of transaction for specifics event

Re: Getting the average amount of transaction for specifics event

Getting the average amount of transaction for specifics event

Getting the average amount of transaction for specifics event

Getting the average amount of transaction for specifics event

Re: Getting the average amount of transaction for specifics event

Re: Getting the average amount of transaction for specifics event

Re: Getting the average amount of transaction for specifics event

Re: Getting the average amount of transaction for specifics event

Re: Getting the average amount of transaction for specifics event

SAS Innovate 2025: Save the Date

SAS Training: Just a Click Away