turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- SAS/GRAPH and ODS Graphics
- /
- SAS Data Visualization (Three 4x4 matrices)

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

06-08-2013 06:58 PM

I need some advice about how to represent the following data. The code is lengthy, but i've tested it and it works well to create random data for this sample.

I want to represent the difference in mean stake from Day 1 - Day 3 i.e. the difference in the mean of the variables day1TotalStake, day2TotalStake and day3TotalStakeAll. All I've done at the end of the code below is call a 'proc means' on those three variables and I also tried a 'proc tabulate' on one of those variables: day1TotalStake to see the distribution by stake quartile and return quartile for the first day only. I need some way of showing how the mean value in each quartile of that matrix changes over the next two days.

There is quite a lot of data and eyeballing it in table format is not working. A clever data visualization technique of some kind would nail the representation of the data much better than a very verbose table. I was thinking of some kind of heat map or a graph of some kind but , but have no idea how that would show changes over time i.e. what happens to the value in cell (0,0) on day1, day2 and day3.

I'd appreciate any pointers at all on this.

Thanks.

* This greats a series of rows at random;

data random(keep=date timestamp username stake price winlose winnings);

call streaminit(123); /* set random number seed */

format timestamp datetime.;

format date date9.;

NRows = 1000;MinStake = 0.01;MaxStake = 10;Betprice = 1.83;StartDate = MDY(1,1,2020);DateRange=100;OpeningHour=8;ClosingHour=15;MaxCustomers=1000;

do i = 1 to NRows;

u = rand("Uniform"); /* U[0,1] */

date = StartDate+floor(ranuni(0)*DateRange);

timestamp = DHMS(date,OpeningHour + floor((1+ClosingHour-0)*u),0 + floor((1+59-0)*u),0 + floor((1+59-0)*u));

username = cat("player", 1 + floor((1+MaxCustomers-1)*u));

stake = floor( (MinStake+MaxStake)*u ) * 10;

price = betprice;

winlose=rand('BERN',.50);

winnings = winlose * Betprice * stake;

output;

end;

run;

* This sort is necessary for the next data step;

proc sort data=random;

by username timestamp;

run;

* This data step calculates a number of variables on the fly;

data random;

set random;

by username date timestamp;

retain calendarTime eventTime;

if first.username then calendarTime = 0;

if first.date then calendarTime + 1;

if first.username then eventTime = 0;

if first.timestamp then eventTime + 1;

run;

* This creates an aggregate table to enable functions like 'proc freq' to be run;

proc sql;

create table randomAggregate as

select

username,

sum(case when calendarTime =1 then stake else . End) as day1TotalStake,

sum(case when calendarTime =1 then winnings - stake else . End) as day1TotalWinnings,

sum(case when calendarTime =1 then winnings - stake else . End)/sum(case when calendarTime =1 then stake else . End)as day1AverageReturn,

sum(case when calendarTime =2 then stake else . End) as day2TotalStake,

sum(case when calendarTime =2 then winnings - stake else . End) as day2TotalWinnings,

sum(case when calendarTime =2 then winnings - stake else . End)/sum(case when calendarTime =2 then stake else . End)as day2AverageReturn,

sum(case when calendarTime =3 then stake else . End) as day3TotalStake,

sum(case when calendarTime =3 then winnings - stake else . End) as dasy3TotalWinnings,

sum(case when calendarTime =3 then winnings - stake else . End)/sum(case when calendarTime =3 then stake else . End)as day3AverageReturn

from random

group by username;

quit;

* This adds the quartile flags;

Proc rank data=randomAggregate ties=mean out=randomRanked groups=4;

var day1TotalStake day1AverageReturn day2TotalStake day2AverageReturn;

ranks day1TotalStakeQuartile day1AverageReturnQuartile day2TotalStakeQuartile day2AverageReturnQuartile;

LABEL day1TotalStakeQuartile = 'day1TotalStakeQ';

LABEL day1AverageReturnQuartile = 'day1AverageReturnQ';

LABEL day2TotalStakeQuartile = 'day2TotalStakeQ';

LABEL day2AverageReturnQuartile = 'day2AverageReturnQ';

run;

*This prints out the count and stats for each cell in the matrix in list format for the first THREE days;

proc means data = randomRanked nmiss mean MAXDEC=2;

var day1TotalStake day2TsotalStake day3TotalStake;

class day1TotalStakeQuartile day1AverageReturnQuartile;

run;

* This shows the sum and mean at the end of Day 1 in MATRIX format;

* it only shows the TOTAL stake: not the mean;

proc tabulate data=randomRanked noseps missing;

class day1AverageReturnQuartile day1TotalStakeQuartile;

var day1TotalStake;

table day1TotalStakeQuartile= "Stake Quartile", day1TotalStake*day1AverageReturnQuartile="AverageReturn Quartile"*sum;

run;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to tobriain

06-09-2013 06:46 AM

Hard to say, but maybe you can plot a 4x4 matrix using a GTL LAYOUT LATTICE to define a 4x4 matrix of SERIES plot with day on x axis, and mean values on y axis.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Sanjay_SAS

06-09-2013 10:04 AM

Great: thanks for that. Any idea how I might be able to compare the three matrices themselves?