Quartz | Level 8

## Using two data sources in a proc sgplot procedure

I'm creating a graph with exposure across different levels of a factor, vehicle age. On a secondary axis I include the average frequency per level. My code is

proc sgplot data=dataset1 (where=(compress(level) not in ('-1','Z.Unknown')));

title "Vehicle Age";

vbar level / group = source response = exposure;

run;

This produces the graph with exposure split by source for each level. I also want to show the average frequency on the secondary axis, but that is only available in another dataset that doesn't split exposure by source. dataset1 does contain an average frequency, but it is given for each source within each vehicle age, and I want the overall average frequency for each level. Something like:

proc sgplot data=Egidwh_oneway_yr3 (where=(rank <= 10 and compress(factor) = "glm_4_vehicl_vehicle_age" and compress(level) not in ('-1','Z.Unknown')));
title "&factor - Exposure By Year";
vbar level / group = source response = exposure;
vline level / response=freq y2axis;

run;

but with the vline data coming from the more summarized dataset.

Any ideas?

Super User

## Re: Using two data sources in a proc sgplot procedure

The place to start would be to combine the data sets:

Data plot;

set dataset1 dataset2;

run;

As long as the summary dataset has no values for Exposure and the first dataset has no values for Freq, you should be good to go.

HINT: If you have lots of variables, keep only the ones you need for the plot. And since your WHERE criteria are so complex, apply them in the combination step; it looks like they belong on the first dataset.
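
As a rough sketch of that hint, the KEEP= and WHERE= data set options can be applied while the two datasets are stacked. The names dataset1 and dataset2 are the placeholders used above, and the variable lists are assumptions: dataset1 is taken to hold level, source and exposure, and dataset2 to hold level and freq.

```sas
/* Sketch only: dataset and variable names are assumed, adjust to your data */
data plot;
  /* Keep just the plot variables and filter the levels while reading */
  set dataset1 (keep=level source exposure
                where=(compress(level) not in ('-1','Z.Unknown')))
      dataset2 (keep=level freq
                where=(compress(level) not in ('-1','Z.Unknown')));
run;
```

Rows from dataset2 then have missing exposure and source, and rows from dataset1 have missing freq, which is the condition described above.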

Quartz | Level 8

## Re: Using two data sources in a proc sgplot procedure

Brilliant, that makes perfect sense. Thanks

Quartz | Level 8

## Re: Using two data sources in a proc sgplot procedure

I tried this but I get the following error message:

Once a GROUP variable is used in a categorical chart, that GROUP

variable must be used in all overlaid charts.  The specified GROUP

variable has been removed from the graph display.

here is my code:

data vehage(keep = factor level all_freq all_sev flag4 );
set Egidwh_oneway4;
if compress(factor) = "glm_4_vehicl_vehicle_age";
run;

data vehageyr(drop = all_freq all_sev ) ;
set Egidwh_oneway_yr3;
if compress(factor) = "glm_4_vehicl_vehicle_age";
run;

data comb;
set vehage vehageyr;
run;

proc sgplot data=comb (where=(flag4 = "Y" and compress(factor) = "glm_4_vehicl_vehicle_age" and compress(level) not in ('-1','Z.Unknown')));
title "vehicle_age - Severity";
vbar level / group = source response=exposure;
vline level /  response=all_freq y2axis;
run;

Super User

## Re: Using two data sources in a proc sgplot procedure

Sorry, I missed the group part before. One way is to add a value for the group variable to the summary data (probably only one is needed) and add the group option to the vline syntax.

data comb;

set vehage (in=in1) vehageyr;

/* vehage is the summary data: it carries all_freq and lacks the Source variable */

if in1 then Source = <a source value>;

run;

To prevent the second plot from adding to the legend, name each of the plots with the NAME="text" option, then add a KEYLEGEND "text"; statement, where text is the same text used for the VBAR name option.

Quartz | Level 8

## Re: Using two data sources in a proc sgplot procedure

Could you give a little example of this? I don't quite follow.

Thanks

Super User

## Re: Using two data sources in a proc sgplot procedure

proc sgplot data=comb (where=(flag4 = "Y" and compress(factor) = "glm_4_vehicl_vehicle_age" and compress(level) not in ('-1','Z.Unknown')));

title "vehicle_age - Severity";

vbar level / group = source response=exposure name='bar';

vline level /  group=source response=all_freq y2axis name='line';

keylegend 'bar';

run;
