BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
mduarte
Quartz | Level 8

Hi - I would like a chart similar to the following, where the x-axis is shown in weeks (note the last bin with >53) (but data in days).  

Capture.PNG

 

Here are my attempts so far:

1. Use SGPLOT with binwdith and xaxis when data is not restricted.  Status: Bins are not for 7 days.  Note that the data ranges from -377 to 277 in my example ( I need to work out how to set the seed ... )  An explanation of this would be great ... 

 

2. Use SGPLOT restric the data from 0 to 365.  Status: This gets me closer however I don't get the last bin ">53".  Labels are also incorrect. Is this possible with SGPLOT without a custom format?

 

3.  Try to create a custom format.  Status: Doesn't format the data as expected

 

Test code is as follows:

/* Add data */
data Hist(drop=i);
label T = "Days";
call streaminit(1);
do i = 1 to 100;
   T = rand("Normal", 120, 60); /* normal with mean 120  */
   T = round(T, 1);             /* round to nearest day */
   output;
end;
run;
proc means data=hist;
run;
Title 'Data not restricted';
/*Attempt One: Use SGPLOT with no data restriction / bins not correct */
proc sgplot data=hist;
histogram T / binwidth=7 binstart=0 showbins;
xaxis values=(0 to 365 by 7);
run;
/*Attempt Two: When data is restricted, bins are correct*/
Title "Data restricted to 0-365";
proc sgplot data=hist(where=(0<=T<=365));
histogram T / binwidth=7 binstart=0 showbins;
xaxis values=(0 to 365 by 7);
run;
/*Start of attempt Three: Use a custom format - not yet working*/
data dayToWeek;
attrib fmtname informat=$10. 
		eexcl informat=$1. 
		hlo informat=$ 1.
		label informat=$3.;
  drop i;
  retain fmtname '$dayToWeek' EEXCL 'Y' HLO '';

do i = 0 to 51;
  start = i*7;
  end = (i+1)*7;
  label = i;
  output;
  end;

  start=364;
  label = '52+';
  HLO = 'H';
	output; 
run;

proc format cntlin=dayToWeek; 
run;

proc freq data=hist;
tables T;
format T daytoweek.;
run;

 

1 ACCEPTED SOLUTION

Accepted Solutions
mduarte
Quartz | Level 8

I ended up using the following code ... 

 

/*Create Format*/
data dayToWeek;
attrib fmtname informat=$10.
hlo informat=$1.
label informat=$3.;
drop i;
retain fmtname 'dayToWeek' SEXCL 'N' EEXCL 'Y' HLO '' TYPE 'N' ;

do i = 0 to 51;
start = i*7;
end = (i+1)*7;
label = i;
output;
end;

start=364;
label = '52+';
HLO = 'H';
EEXCL = 'N';
output;
run;
proc format cntlin=dayToWeek;
run;

/*see all formats*/
PROC FORMAT CNTLOUT=fmtout;
RUN;
PROC PRINT DATA=fmtout;
RUN;

/*Create fake data*/
data Hist(drop=i);
label T = "Days";
call streaminit(1);
do i = 1 to 100;
T = rand("Normal", 220, 90); /* normal with mean 220 */
T = round(T, 1); /* round to nearest day */
output;
end;
run;

proc freq data=hist(where=(T>0)) NOPRINT;
tables T / out=work.hist_bin;
format T dayToWeek.;
run;

title 'Length of Travel';
proc sgplot data=work.hist_bin;
vbar T / response=percent;
xaxis label="Duration (Weeks)";
yaxis label="Percent";
run;

NB. Found this reference extremely handy: Creating a Format from Raw Data or a SAS® Dataset 

View solution in original post

7 REPLIES 7
BrunoMueller
SAS Super FREQ

How about using the WEEK function to create the week number out of a date and then use the VBAR statement of SGPLOT.

 

Below is a code sample.

/* Add data */
data Hist(drop=i);
  label T = "Days";
  call streaminit(1);

  do i = 1 to 100;
    T = rand("Normal", 120, 60); /* normal with mean 120  */
    T = round(T, 1);             /* round to nearest day */
    week = week(t, "V");
    output;
  end;
run;

proc sgplot data=hist;
  vbar week / stat=percent;
  xaxis type=linear 
    min=1 max=53
    values=( 4 to 52 by 4 )
    valueshint
  ;
run;

Bruno

 

 

mduarte
Quartz | Level 8

Thanks Bruno, that is almost there (and much simpler).  Do you know how to get the last bin of > 53 as well?  Do you also know my methods 1 and 3 didn't work?

BrunoMueller
SAS Super FREQ

For method 3, the format name used starts with a $ , so this creates a character format, where as your data is numeric. Also the format would only work for positive numbers. Also the logic would only cover a specific year.

 

I do not know of an easy way to label week 53, but there are only week numbers from 1 to 53. So week 53 is nothing special

 

Bruno

mduarte
Quartz | Level 8

The real data is for duration in days and only has values >= 0.  The last bin is supposed to be for all durations of 53 weeks and greater, i.e. durations of 53, 54, 55, 56, ... weeks.  

ballardw
Super User

Once you have a variable with the week number then you can use a custom format such as :

proc format;
   value weeknum
   53-high = '>53'
   ;
run;

Then use that format with the summary, if you presummarize data before the graph, and graph procedure associated with the week variable. The format will create the appropriate bin.

 

mduarte
Quartz | Level 8

I ended up using the following code ... 

 

/*Create Format*/
data dayToWeek;
attrib fmtname informat=$10.
hlo informat=$1.
label informat=$3.;
drop i;
retain fmtname 'dayToWeek' SEXCL 'N' EEXCL 'Y' HLO '' TYPE 'N' ;

do i = 0 to 51;
start = i*7;
end = (i+1)*7;
label = i;
output;
end;

start=364;
label = '52+';
HLO = 'H';
EEXCL = 'N';
output;
run;
proc format cntlin=dayToWeek;
run;

/*see all formats*/
PROC FORMAT CNTLOUT=fmtout;
RUN;
PROC PRINT DATA=fmtout;
RUN;

/*Create fake data*/
data Hist(drop=i);
label T = "Days";
call streaminit(1);
do i = 1 to 100;
T = rand("Normal", 220, 90); /* normal with mean 220 */
T = round(T, 1); /* round to nearest day */
output;
end;
run;

proc freq data=hist(where=(T>0)) NOPRINT;
tables T / out=work.hist_bin;
format T dayToWeek.;
run;

title 'Length of Travel';
proc sgplot data=work.hist_bin;
vbar T / response=percent;
xaxis label="Duration (Weeks)";
yaxis label="Percent";
run;

NB. Found this reference extremely handy: Creating a Format from Raw Data or a SAS® Dataset 

BrunoMueller
SAS Super FREQ

I was thinking again about the labeling of the x axis to 52 and >=53, so I found a way of doing this using the VALUESFORMAT= option on the XAXIS statement.

 

I also had a closer look at your orginal graph and noticed that there is actually a bar between 52 and >=53

 

Anyway find below the code sample for the x axis labeling as it was in the original graph

 

/*Create Format*/
data dayToWeek;
  attrib
    fmtname informat=$10.
    hlo informat=$1.
    label informat=$3.
  ;
  drop i;
  retain
    fmtname 'dayToWeek' SEXCL 'Y' EEXCL 'N' HLO '' TYPE 'N'
  ;

  do i = 0 to 52;
    start = i*7;
    end = (i+1)*7;

    label = put(i, z2.);
    output;
  end;

  start=371;
  label = '53';
  HLO = 'H';
  EEXCL = 'N';
  output;
run;

proc format cntlin=dayToWeek;
run;

/*Create fake data*/
data Hist(drop=i);
  label T = "Days";
  call streaminit(1);

  do i = 1 to 1000;
    T = rand("Normal", 120, 90); /* normal with mean 220 */
    T = round(T, 1); /* round to nearest day */
    output;
  end;
run;

proc freq data=hist( where = ( T>0 )) NOPRINT;
  tables T / out=work.hist_bin;
  format T dayToWeek.;
run;

*
* convert the bin week to the real week number
*;
data hist_bin2;
  set hist_bin;
  t2 = input( vvalue(t), 8.);
run;

*
* create an all weeks data set
*;
data allWeeks;
  do t2 = 1 to 53;
    output;
  end;
run;

*
* merge data and all weeks
* convert t to char
*;
data hist_bin_all;
  merge hist_bin2 allWeeks;
  by t2;
  tc = put(t2, z2.);
run;

*
* create a format for labeling the x axis
*;
proc format;
  value $tc_display
    "04" = "4"
    "08" = "8"
    "12" = "12"
    "16" = "16"
    "20" = "20"
    "24" = "24"
    "28" = "28"
    "32" = "32"
    "36" = "36"
    "40" = "40"
    "44" = "44"
    "48" = "48"
    "52" = "52"
    "53" = "(*ESC*){unicode '2265'x}53"
    other = " "
  ;
run;

title 'Length of Travel';
ods graphics / width=1600 height=800;

proc sgplot data=work.hist_bin_all;
  vbar Tc / response=percent;
  xaxis  
    label="Duration (Weeks)"
    valuesformat=$tc_display. 
  ;
  yaxis label="Percent";
run;

Bruno

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 2311 views
  • 1 like
  • 3 in conversation