Data visualization with SAS programming

SAS Data Visualization (Three 4x4 matrices)

Reply
Occasional Contributor
Posts: 12

SAS Data Visualization (Three 4x4 matrices)

I need some advice about how to represent the following data. The code is lengthy, but i've tested it and it works well to create random data for this sample.

I want to represent the difference in mean stake from Day 1 - Day 3 i.e. the difference in the mean of the variables day1TotalStake, day2TotalStake and day3TotalStakeAll. All I've done at the end of the code below is call a 'proc means' on those three variables and I also tried a 'proc tabulate' on one of those variables: day1TotalStake to see the distribution by stake quartile and return quartile for the first day only. I need some way of showing how the mean value in each quartile of that matrix changes over the next two days.

There is quite a lot of data and eyeballing it in table format is not working. A clever data visualization technique of some kind would nail the representation of the data much better than a very verbose table. I was thinking of some kind of heat map or a graph of some kind but , but have no idea how that would show changes over time i.e. what happens to the value in cell (0,0) on day1, day2 and day3.

I'd appreciate any pointers at all on this.

Thanks.

* This greats a series of rows at random;

data random(keep=date timestamp username stake price winlose winnings);

call streaminit(123); /* set random number seed */

format timestamp datetime.;

format date date9.;

NRows = 1000;MinStake = 0.01;MaxStake = 10;Betprice = 1.83;StartDate = MDY(1,1,2020);DateRange=100;OpeningHour=8;ClosingHour=15;MaxCustomers=1000;

  do i = 1 to NRows;

    u = rand("Uniform");    /* U[0,1] */

  date = StartDate+floor(ranuni(0)*DateRange);

    timestamp = DHMS(date,OpeningHour + floor((1+ClosingHour-0)*u),0 + floor((1+59-0)*u),0 + floor((1+59-0)*u));

  username = cat("player", 1 + floor((1+MaxCustomers-1)*u));

  stake = floor( (MinStake+MaxStake)*u ) * 10;

  price = betprice;

  winlose=rand('BERN',.50);

  winnings = winlose * Betprice * stake;  

    output;

  end;

run;

* This sort is necessary for the next data step;

proc sort data=random;

   by username timestamp;

run;

* This data step calculates a number of variables on the fly;

data random;

   set random; 

      by username date timestamp; 

   retain calendarTime eventTime;

   if first.username then calendarTime = 0;

   if first.date then calendarTime + 1;

   if first.username then eventTime = 0;

   if first.timestamp then eventTime + 1;

run;

* This creates an aggregate table to enable functions like 'proc freq' to be run;

proc sql;

  create table randomAggregate as

  select

  username,

  sum(case when calendarTime =1 then stake else . End) as day1TotalStake,

  sum(case when calendarTime =1 then winnings - stake else . End) as day1TotalWinnings,

  sum(case when calendarTime =1 then winnings - stake else . End)/sum(case when calendarTime =1 then stake else . End)as day1AverageReturn,

  sum(case when calendarTime =2 then stake else . End) as day2TotalStake,

  sum(case when calendarTime =2 then winnings - stake else . End) as day2TotalWinnings,

  sum(case when calendarTime =2 then winnings - stake else . End)/sum(case when calendarTime =2 then stake else . End)as day2AverageReturn,

  sum(case when calendarTime =3 then stake else . End) as day3TotalStake,

  sum(case when calendarTime =3 then winnings - stake else . End) as dasy3TotalWinnings,

  sum(case when calendarTime =3 then winnings - stake else . End)/sum(case when calendarTime =3 then stake else . End)as day3AverageReturn

  from random

  group by username;

quit;

* This adds the quartile flags;

Proc rank data=randomAggregate ties=mean out=randomRanked groups=4;

     var day1TotalStake day1AverageReturn day2TotalStake day2AverageReturn;

  ranks day1TotalStakeQuartile day1AverageReturnQuartile day2TotalStakeQuartile day2AverageReturnQuartile;

  LABEL day1TotalStakeQuartile = 'day1TotalStakeQ';

  LABEL day1AverageReturnQuartile = 'day1AverageReturnQ';

  LABEL day2TotalStakeQuartile = 'day2TotalStakeQ';

  LABEL day2AverageReturnQuartile = 'day2AverageReturnQ';

run;

*This prints out the count and stats for each cell in the matrix in list format for the first THREE days;

proc means data = randomRanked nmiss mean MAXDEC=2;

  var day1TotalStake day2TsotalStake day3TotalStake;

  class day1TotalStakeQuartile day1AverageReturnQuartile;

run;

* This shows the sum and mean at the end of Day 1 in MATRIX format;

* it only shows the TOTAL stake: not the mean;

proc tabulate data=randomRanked noseps missing;

  class day1AverageReturnQuartile day1TotalStakeQuartile;

  var day1TotalStake;

  table day1TotalStakeQuartile= "Stake Quartile", day1TotalStake*day1AverageReturnQuartile="AverageReturn Quartile"*sum;

run;

SAS Super FREQ
Posts: 1,082

Re: SAS Data Visualization (Three 4x4 matrices)

Hard to say, but maybe you can plot a 4x4 matrix using a GTL LAYOUT LATTICE to define a 4x4 matrix of SERIES plot with day on x axis, and mean values on y axis.

Occasional Contributor
Posts: 12

Re: SAS Data Visualization (Three 4x4 matrices)

Great: thanks for that. Any idea how I might be able to compare the three matrices themselves?

Ask a Question
Discussion stats
  • 2 replies
  • 314 views
  • 3 likes
  • 2 in conversation