Hi,
I have the following data:
data have;
input username $ amount betdate : datetime.;
dateOnly = datepart(betdate) ;
format betdate DATETIME.;
format dateOnly ddmmyy8.;
datalines;
player1 90 12NOV2008:12:04:01
player1 -100 04NOV2008:09:03:44
player2 120 07NOV2008:14:03:33
player1 -50 05NOV2008:09:00:00
player1 -30 05NOV2008:09:05:00
player1 20 05NOV2008:09:00:05
player2 10 09NOV2008:10:05:10
player2 -35 15NOV2008:15:05:33
run;
PROC PRINT data=have; RUN;
proc sort data=have;
by username betdate;
run;
data want;
set have;
by username dateOnly betdate;
retain calendarTime eventTime cumulativeDailyProfit profitableFlag totalDailyProfit;
if first.username then calendarTime = 0;
if first.dateOnly then calendarTime + 1;
if first.username then eventTime = 0;
if first.betdate then eventTime + 1;
if first.username then cumulativeDailyProfit = 0;
if first.dateOnly then cumulativeDailyProfit = 0;
if first.betdate then cumulativeDailyProfit + amount;
if first.dateOnly then totalDailyProfit = 0;
if first.betdate then totalDailyProfit + amount;
PROC PRINT data=want; RUN;
I want to run the query below, and get the mean stake on all days (which is correct now) and the mean stake on losing days and on winnings days. If I get the 'profitableFlag' field set properly above this should be ok.
proc sql;
select calendarTime,
mean(amount) as meanStake,
mean(case when 1 = profitableFlag then amount else . End) as meanLosingDayStake,
mean(case when 1 = profitableFlag then amount else . End) as meanWinningDayStake
from want
group by 1;
quit;
Is this possible in the data step above?
What is the business rule for profitable fkag?
Profitable flag should be set to 1 if the cumulative total is greater than 0 and 0 otherwise.
I have tried doing this with a (horrible) subselect as follows, but would rather something a little more elegant:
proc sql;
select calendarTime,
mean(amount ) as meanStake,
mean(case when (case when (select cumulativeDailyProfit from work.want as sub where sub.username = outer.username and sub.betdate = (select max(subsub.betdate) from work.want as subsub where subsub.username = outer.username and subsub.dateOnly = outer.dateOnly)) <= 0 then 1 else 0 end) = 1 then amount else . End) as meanLosingDayStake,
sum(case when (case when (select cumulativeDailyProfit from work.want as sub where sub.username = outer.username and sub.betdate = (select max(subsub.betdate) from work.want as subsub where subsub.username = outer.username and subsub.dateOnly = outer.dateOnly)) <= 0 then 1 else 0 end) = 1 then amount else . End) as sumLosingDayStake,
mean(case when (case when (select cumulativeDailyProfit from work.want as sub where sub.username = outer.username and sub.betdate = (select max(subsub.betdate) from work.want as subsub where subsub.username = outer.username and subsub.dateOnly = outer.dateOnly)) > 0 then 1 else 0 end) = 1 then amount else . End) as meanWinningDayStake,
sum(case when (case when (select cumulativeDailyProfit from work.want as sub where sub.username = outer.username and sub.betdate = (select max(subsub.betdate) from work.want as subsub where subsub.username = outer.username and subsub.dateOnly = outer.dateOnly)) > 0 then 1 else 0 end) = 1 then amount else . End) as sumWinningDayStake
from want as outer
group by 1;
quit;
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.
Find more tutorials on the SAS Users YouTube channel.