Programming the statistical procedures from SAS

SAS Running total by category and date

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 12
Accepted Solution

SAS Running total by category and date

Hi,

I have the following data:

data have;

   input username $ amount betdate : datetime.;

   dateOnly = datepart(betdate) ;

   format betdate DATETIME.;

   format dateOnly ddmmyy8.;

   datalines;

player1 90 12NOV2008:12:04:01

player1 -100 04NOV2008:09:03:44

player2 120 07NOV2008:14:03:33

player1 -50 05NOV2008:09:00:00

player1 -30 05NOV2008:09:05:00

player1 20 05NOV2008:09:00:05

player2 10 09NOV2008:10:05:10

player2 -35 15NOV2008:15:05:33

run;

PROC PRINT data=have; RUN;

proc sort data=have;

   by username betdate;

run;

data want;

   set have;

   by username betdate dateOnly;  

   retain calendarTime eventTime dailyProfit;

   if first.username then calendarTime = 0;

   if first.dateOnly then calendarTime + 1;

   if first.username then eventTime = 0;

   if first.betdate then eventTime + 1;  

   if first.username then dailyProfit = 0;

    dailyProfit = dailyProfit + amount;

PROC PRINT data=want; RUN;

The value in dailyProfit is incorrect, as it is resetting every new betdate, not every new dateOnly and new username. How can I get the syntax right on this?

Best regards.


Accepted Solutions
Solution
‎06-06-2013 11:36 AM
Grand Advisor
Posts: 10,062

Re: SAS Running total by category and date

Try removing BETDATE from the BY statement and use DATEONLY instead of BETDATE. Dateonly is "first" for each leverl of BETDATE.

View solution in original post


All Replies
Solution
‎06-06-2013 11:36 AM
Grand Advisor
Posts: 10,062

Re: SAS Running total by category and date

Try removing BETDATE from the BY statement and use DATEONLY instead of BETDATE. Dateonly is "first" for each leverl of BETDATE.

Respected Advisor
Posts: 4,770

Re: SAS Running total by category and date

Your program is missing one statement:

retain dailyProfit;

You can add it anywhere within the DATA step.  The other variables (calendarTime and eventTime) don't need it because there are already being retained.  The syntax "variablename + 1" will automatically retain that variable.

Occasional Contributor
Posts: 12

Re: SAS Running total by category and date

Great stuff. That is now giving me the cumulative dailyProfit in each row as I wanted, and resetting to zero each new date and new user.

But rather than a cumulative value, how can I sum from first.betdate to last.betdate over each username? This would give me the total dailyProfit for each username for each date (repearing across multiple rows just like a SUMIF() in Excel).

Thanks again.

Respected Advisor
Posts: 4,770

Re: SAS Running total by category and date

It's up to you when you reset DailyProfit to 0.  Right now you have coded:

if first.username than dailyProfit=0;

You can switch that to whatever conditions you would like, such as:

if first.betdate then dailyProfit=0;

Of course, to do that, BETDATE must remain part of your BY statement.

Occasional Contributor
Posts: 12

Re: SAS Running total by category and date

I'm not sure if that resolves it. I really need to add a flag to each user flagging whether the customers total dailyProfit (for each parituclar day) was positive or not. Because the main query I want to run is this:

proc sql;

  select

  outer.calendarTime as calendarTime,

  [dailyProfitIndicator,]

  mean(outer.stake) as meanStake,

  sum(outer.stake) as totalStake,

  sum(outer.winnings - outer.stake) as profit,

  sum(outer.winnings - outer.stake) /sum(outer.stake) as margin

from work.want as outer

where calendarTime =1

[and dailyProfitIndicator = 1]

group by 1

having sum(outer.winnings - outer.stake) > 0;

quit;

What it's supposed to do is get the averageStake for customers on their first day if the first day was a 'winning' day i.e. if the total daily profit for the user on that day was greater than zero. And I want to do this over ALL customers, so I don't think the 'having sum(outer.winnings - outer.stake) > 0;' is aggregating over each username and each day, just over each row in this query (which is only 1).

I've changed the data step above to add another 'totalProfit' rather than the cumulative 'dailyProfit' variable above. But it still shows a cumulative rather than repeating the total in each row. Is this expected?

data want;

   set work.temp; 

      by username dateOnly betdate; 

   retain calendarTime eventTime dailyProfit totalProfit;

   if first.username then calendarTime = 0;

   if first.dateOnly then calendarTime + 1;

   if first.username then eventTime = 0;

   if first.betdate then eventTime + 1;

   if first.username then dailyProfit = 0;

   if first.dateOnly then dailyProfit = 0;

   if first.betdate then dailyProfit + sum(winnings - stake);

   if first.dateOnly then totalProfit = 0;

   if first.betdate then totalProfit + sum(winnings - stake);

run;

Grand Advisor
Posts: 10,062

Re: SAS Running total by category and date

If I wanted a summary of total by date for each user I would drop out of a datastep and go to proc means or summary;

proc means data=have sum;

     class username betdate;

     var amount;

run;

If a data set is needed then an output statement can be added.


Occasional Contributor
Posts: 12

Re: SAS Running total by category and date

The question I need to answer is 'What is the average spent by profitable people on their first day'? I can add the 'first day' flag (as above), but can't control for getting the average spent by those that are profitable on the first day rather than the entire sample.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 305 views
  • 6 likes
  • 3 in conversation