EricS
Fluorite | Level 6

Hi, I am using PROC SGPLOT in SAS 9.4 to plot data for weekdays/non-holidays such that the points are shown consecutively (i.e., excluding weekends and holidays) using a DISCRETE axis. What I would like to do is label only the first day of whatever my interval is (e.g., the first day of each month), but these days will not be evenly spaced. Ideally, I would simply use a separate variable for the XAXIS labels, but I don't see a way to do this. (Can this be done?)

 

The strategy that I've employed* is to use two x-axes: a hidden axis that carries the data, and a displayed dummy axis that carries only the labels. The problem I'm having is that when I use the VALUES option on the XAXIS that I want to display (with values that are not evenly distributed), it shows the correct text, but the axis appears to space the tickmarks evenly, so the tickmarks do not line up with where the points should be displayed.

 

(*Note: I posted an earlier related question to stackexchange, in which a kind user gave the suggestion that I use a second blank xaxis and the values option, which has led me to my current problem; this user recommended that I post here.)

 

Below is code that illustrates this.  I'll quickly describe each step:

(1) inputs raw data with a few months of date/price data

(2) creates month and week variables which are used to create labels

(3) gets a unique list of the labels to use

(4) uses the %ARRAY and %DO_OVER macros to get the VALUES I want to use in the XAXIS (I've attached these in case you're not familiar with them, and there is a description of them here)

(5) plots the data with PROC SGPLOT; for this purpose I've plotted both a SERIES and a SCATTER, which should overlap if plotted correctly; the SERIES is plotted on the hidden X2AXIS, and the SCATTER is plotted on the displayed XAXIS

 

Running this code generates the plot below: the SERIES is plotted correctly (on the hidden axis), but the SCATTER is not, because the tickmarks on the displayed x-axis appear to be spread out evenly across the axis rather than at the points specified by the VALUES option in the XAXIS statement, even though the labels display the correct text. For example, the first tick interval should be much shorter than the later ones, so the value for 1OCT should be moved to the left such that the point appears on the series line, but instead the values appear to be evenly spaced across the x-axis.

 

SGPLOT_1.PNG

 

 

 

I've tried a bunch of things, like including/excluding missing values from the different variables, but I can't figure out how to get these to line up properly.  Any suggestions would be greatly appreciated.  Thank you!

 

*(1) Input raw data;
data rawdata;
	informat date date10.;
	format date date9.;
	input Obs Date price;
	datalines;
1 22SEP2015 12.36
2 23SEP2015 12.24
3 24SEP2015 12.38
4 25SEP2015 12.24
5 28SEP2015 12.38
6 29SEP2015 11.99
7 30SEP2015 12.19
8 01OCT2015 12.31
9 02OCT2015 12.19
10 05OCT2015 12.2
11 06OCT2015 12.12
12 07OCT2015 12.22
13 08OCT2015 12.3
14 09OCT2015 12.69
15 12OCT2015 12.88
16 13OCT2015 12.76
17 14OCT2015 12.78
18 15OCT2015 12.79
19 16OCT2015 13.04
20 19OCT2015 12.94
21 20OCT2015 12.88
22 21OCT2015 13.16
23 22OCT2015 13.09
24 23OCT2015 13.3
25 26OCT2015 13.36
26 27OCT2015 13.24
27 28OCT2015 13.16
28 29OCT2015 13.17
29 30OCT2015 13.11
30 02NOV2015 12.95
31 03NOV2015 13.1
32 04NOV2015 13.12
33 05NOV2015 12.95
34 06NOV2015 13
35 09NOV2015 13.25
36 10NOV2015 12.87
37 11NOV2015 13
38 12NOV2015 13.04
39 13NOV2015 12.85
40 16NOV2015 12.74
41 17NOV2015 13.03
42 18NOV2015 13.17
43 19NOV2015 13.38
44 20NOV2015 13.41
45 23NOV2015 13.53
46 24NOV2015 13.53
47 25NOV2015 13.41
48 26NOV2015 13.5
49 27NOV2015 13.47
50 30NOV2015 13.39
51 01DEC2015 13.72
;
run;

*(2) Create time intervals;
data rawdata1;
	set rawdata;
	year=year(date);
	month=year*100+month(date);
	week=year*100+week(date);
run;

*(3) Get labels to display based on selected time interval;
%let timeint=month;

data rawdata2;
	set rawdata1;
	by &timeint;
	retain xlab;

	if first.&timeint then
		xlab=put(date,date5.);
run;

proc sort data=rawdata2 (keep=obs xlab) out=xlab nodupkey;
	by xlab;
run;

proc sort data=xlab;
	by obs;
run;

*(4) Put selected values in to macro variables using ARRAY;
options nosymbolgen nomprint;

%array(xval,data=xlab,var=obs);
%array(xlab,data=xlab,var=xlab);
%put %do_over(xval,phrase=?);
%put %do_over(xlab,phrase=?);
options mprint;

*(5) Plot series and scatter, using DO_OVER macro to specify VALUES;
proc sgplot data=rawdata2;
	scatter x=obs y=price / datalabel=price;
	series x=date y=price / x2axis;
	xaxis values=(%do_over(xval,phrase="?")) valuesdisplay=(%do_over(xlab,phrase="?")) type=discrete discreteorder=data;
	x2axis display=none type=discrete;
run;

 

 

   

 

9 REPLIES
EricS
Fluorite | Level 6

*EDIT: The code now appears to be OK in what I originally posted, so please ignore this reply.
Reeza
Super User
What's your exact version of SAS (e.g., 9.4 TS1M4)? There are a lot of changes between each iteration these days.
EricS
Fluorite | Level 6
According to the log it is SAS 9.4 (TS1M5). I upgraded the system from 9.3 last week. Thanks!
Jay54
Meteorite | Level 14

Looks like exactly what I was doing here.  🙂

All values are displayed using the discrete axis to squeeze out the holidays.

Note how I have placed a tick value at the first work day of each month.

I will be happy to hear of other ways to deal with this.

https://blogs.sas.com/content/graphicallyspeaking/2017/09/27/stock-chart/

 

EricS
Fluorite | Level 6
Wow, great minds, I guess...

Yes, this is almost identical to what I'm trying to do.
The xaxistable is a great idea. I'm still very curious as to why my code doesn't work, because both axes are DISCRETE.

Thanks very much!
Jay54
Meteorite | Level 14

Update...

 

I noticed you said both your axes were set to discrete.  Clearly, you are still using two different variables, and their values and generated offsets etc could be different, resulting in different mapping.  I tried running your code, but it did not resolve the %array macro.  I suggest you look at your data set just before you get to the SGPLOT step to see what the two variables look like.
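
For readers who cannot resolve the %ARRAY and %DO_OVER macros, here is a minimal sketch of one way to build equivalent quoted lists with PROC SQL and to look at the label data before the SGPLOT step. The data set name XLAB_DEMO and the macro variable names XVALS and XLABS are assumptions made for illustration; the four rows are the first plotted day of each month in the posted data. This only reproduces the lists, it is not claimed to fix the spacing problem.

```sas
/* Hypothetical stand-in for the XLAB data set built in step (3) */
data xlab_demo;
    input obs xlab $;
    datalines;
1 22SEP
8 01OCT
30 02NOV
51 01DEC
;
run;

/* Build space-separated, double-quoted lists in two macro variables */
proc sql noprint;
    select quote(strip(put(obs, best12.))),
           quote(strip(xlab))
        into :xvals separated by ' ',
             :xlabs separated by ' '
        from xlab_demo
        order by obs;
quit;

/* Write both lists to the log, then print the label data for inspection */
%put &=xvals;
%put &=xlabs;

proc print data=xlab_demo;
run;
```

The two macro variables could then be used in place of the %DO_OVER calls, for example values=(&xvals) valuesdisplay=(&xlabs) on the XAXIS statement.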

 

If you want to overlay different plots reliably, it is best to use the same x variable for all of them. In my example, I use only one (discrete) axis. Everything is mapped to the same axis variable (datechar). This is a character variable, and there is one value, in the correct order, for each non-holiday. The values are drawn with equal spacing, so a month with only 20 working days will be shorter than one with 22 days. This also makes the displayed intervals (correctly) unequal. Now, the task is to place the axis tick values at the right location using the axis table. The xAxisTable uses the (default) x=datechar. So, the "DateRef" value is shown at the right discrete location because it is placed in the data set on the right observation. The other DateRef values are missing, so the axis is not cluttered.

 

In your case, I suggest you plot both the series and the scatter against the same (discrete) variable. They will then be sure to be correctly aligned. Then, place the axis values at the right location. The column "DateRef" contains the date only for the first day of each month, so it is very easy to populate.
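
A minimal sketch of this single-axis idea follows. It is not the code from the blog article: the data set names DEMO and PLOTDATA, the use of LAG to detect a month change, and the particular axis options are assumptions, and only a few rows of the posted data are used to keep it short.

```sas
/* A few rows of the posted data, spanning a month boundary */
data demo;
    input date :date9. price;
    format date date9.;
    datalines;
28SEP2015 12.38
29SEP2015 11.99
30SEP2015 12.19
01OCT2015 12.31
02OCT2015 12.19
05OCT2015 12.20
;
run;

/* datechar: one discrete category per plotted day          */
/* dateref : label text, filled only on the first row and   */
/*           on the first plotted day of each later month   */
data plotdata;
    set demo;
    length datechar $9 dateref $5;
    prevdate = lag(date);
    datechar = put(date, date9.);
    if _n_ = 1 then dateref = put(date, date5.);
    else if month(date) ne month(prevdate) then dateref = put(date, date5.);
    drop prevdate;
run;

/* Both plots share the same discrete x variable, so they align. */
/* The built-in tick values are hidden and the axis table shows  */
/* dateref under the matching category instead.                  */
proc sgplot data=plotdata noautolegend;
    series  x=datechar y=price;
    scatter x=datechar y=price;
    xaxis type=discrete discreteorder=data display=(novalues noticks nolabel);
    xaxistable dateref / x=datechar location=outside position=bottom;
run;
```

Because dateref is a character variable, rows without a label are simply blank in the axis table.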

EricS
Fluorite | Level 6
This is extremely helpful. Thanks. In your example, is there a way to use the xaxistable to display tickmarks, or is the refline the only way to mark these points?

I also just responded to your request to chime in on your other post.
Jay54
Meteorite | Level 14

Thanks.  My blog article provides a workaround to compressing out non-existing date values on a time axis.  The DATE formatted axis will work like a scaled linear axis.  If values are absent for some dates, there will be blanks, as the value still exists on the axis, only there is no data.  Currently there is no way to compress out the non existing date values while still keeping the axis as a date axis.
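
To make the difference concrete, here is a small sketch that plots the same four rows twice, once on a TIME axis and once on a DISCRETE axis. The data set name GAPDEMO is an assumption; the rows are taken from the posted data and straddle a weekend.

```sas
/* Thursday, Friday, Monday, Tuesday from the posted data */
data gapdemo;
    input date :date9. price;
    format date date9.;
    datalines;
24SEP2015 12.38
25SEP2015 12.24
28SEP2015 12.38
29SEP2015 11.99
;
run;

/* TIME axis: positions are scaled by date, so 26SEP and 27SEP */
/* still take up space between the Friday and Monday points    */
proc sgplot data=gapdemo;
    series x=date y=price / markers;
    xaxis type=time;
run;

/* DISCRETE axis: one equally spaced slot per observation, */
/* so the weekend is squeezed out                           */
proc sgplot data=gapdemo;
    series x=date y=price / markers;
    xaxis type=discrete;
run;
```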

 

My proposal is to add an option to the time axis to plot only the data that has been provided sequentially as if it was discrete.  Then, plot the axis values at the right place once per interval.  The data will still have to be using a SAS date format.  This will also be able to handle tick marks.  This will basically automate my workaround.

 

About the tick marks for the workaround...I will look into it.  I am sure it is possible with more code.  🙂

EricS
Fluorite | Level 6

Thanks for the reply.  I've been thinking about this more, and it occurred to me that what I was subconsciously hoping to find was an XAXIS option that effectively functioned the way the DATALABEL option does for plots -- meaning they adhere to the x-axis positions of the plotted variables, but only appear when a non-blank value appears.  So I decided to try using a combination of a REFLINE and a SCATTER with datalabels for an axis, and here's what I came up with:

 

I create variables that are populated only in the positions where I want labels: "XAXISVAL"=0 in places where I want tickmarks, and a separate variable ("XLAB") that holds the label text. I then use a REFLINE for the axis line and a SCATTER with SYMBOL=PLUS markers and DATALABEL=XLAB for the tickmarks and labels. Below is the SGPLOT code (full code is pasted further below).

 

 

*Set up variables for xaxis;
data rawdata2;
	set rawdata1;
	by month;

	if first.month then do;
		xlab=put(date,date5.); *Label for the xaxis;
		xaxisval=0; *Tick marks: y value is set to zero so they sit on the reference line;
		pricelab=price; *To be used as DATALABEL for the plotted values;
	end;
run;

*Plot with the real xaxis hidden;
proc sgplot data=rawdata2 noborder noautolegend;
	series x=date y=price;
	scatter x=date y=pricelab / datalabel=pricelab DATALABELPOS=top;
	xaxis display=none type=discrete;
	yaxis label="price";

	*Use refline and scatter as xaxis;
	refline 0 / LINEATTRS=(color=black);
	scatter x=date y=xaxisval / datalabel=xlab DATALABELPOS=bottom MARKERATTRS=(SYMBOL=plus color=black);
run;

This code generates the following plot:

 

SGPLOT_2.PNG

 

 

This is very much a hack, so any suggestions are welcome. Since the functionality seems to be evolving, if we are talking about potential modifications to the discrete XAXIS in future releases, what I'm envisioning is something very simple along the lines of the following. It would assign the tickmark placement and labels based on a variable in the dataset (in this case "xlab"), similarly to how the DATALABEL option works for SCATTER plots. Tickmarks and labels would be placed at the locations on the XAXIS that have non-missing values of that variable (I assume there would be other parameters that would give more flexibility, but this is the basic structure I'm envisioning):

 

 

xaxis type=discrete discreteorder=data tickmarvar=xlab; *NOTE: Hypothetical code only -- does not work;

 

 

(I also believe the TIME axis options described in the last paragraph of this post by @Jay54 would be very useful.) 

 

Thanks for reading!  I've re-pasted the relevant code from above, with modifications to generate the plot above. 

 

*(1) Input raw data;
data rawdata;
informat date date10.;
format date date9.;
input Obs Date price;
datalines;
1 22SEP2015 12.36
2 23SEP2015 12.24
3 24SEP2015 12.38
4 25SEP2015 12.24
5 28SEP2015 12.38
6 29SEP2015 11.99
7 30SEP2015 12.19
8 01OCT2015 12.31
9 02OCT2015 12.19
10 05OCT2015 12.2
11 06OCT2015 12.12
12 07OCT2015 12.22
13 08OCT2015 12.3
14 09OCT2015 12.69
15 12OCT2015 12.88
16 13OCT2015 12.76
17 14OCT2015 12.78
18 15OCT2015 12.79
19 16OCT2015 13.04
20 19OCT2015 12.94
21 20OCT2015 12.88
22 21OCT2015 13.16
23 22OCT2015 13.09
24 23OCT2015 13.3
25 26OCT2015 13.36
26 27OCT2015 13.24
27 28OCT2015 13.16
28 29OCT2015 13.17
29 30OCT2015 13.11
30 02NOV2015 12.95
31 03NOV2015 13.1
32 04NOV2015 13.12
33 05NOV2015 12.95
34 06NOV2015 13
35 09NOV2015 13.25
36 10NOV2015 12.87
37 11NOV2015 13
38 12NOV2015 13.04
39 13NOV2015 12.85
40 16NOV2015 12.74
41 17NOV2015 13.03
42 18NOV2015 13.17
43 19NOV2015 13.38
44 20NOV2015 13.41
45 23NOV2015 13.53
46 24NOV2015 13.53
47 25NOV2015 13.41
48 26NOV2015 13.5
49 27NOV2015 13.47
50 30NOV2015 13.39
51 01DEC2015 13.72
;

run;


*(2) Create time intervals;
data rawdata1;
set rawdata;
year=year(date);
month=year*100+month(date);
week=year*100+week(date);
run;


*(3) Get labels to display based on selected time interval;
data rawdata2;
set rawdata1;
by month;
if first.month then do;
xlab=put(date,date5.); *Label for the xaxis;
xaxisval=0; *Tick marks set the y value to zero to appear on xaxis;
pricelab=price; *To be used as DATALABEL for the plotted values;
end;
run;


*(4) Plot series and scatter, using a REFLINE and SCATTER as the XAXIS;
proc sgplot data=rawdata2 noborder noautolegend;
series x=date y=price;
scatter x=date y=pricelab / datalabel=pricelab DATALABELPOS=top;
xaxis display=none type=discrete;
yaxis label="price";

*Use refline and scatter as xaxis;
refline 0 / LINEATTRS=(color=black);
scatter x=date y=xaxisval / datalabel=xlab DATALABELPOS=bottom MARKERATTRS=(SYMBOL=plus color=black);

run;

 
