Data visualization with SAS programming

Using two data sources in a proc sgplot procedure

Super Contributor
Posts: 256

I'm creating a graph of exposure across the levels of a factor, vehicle age. On a secondary axis I want to include the average frequency per level. My code is:

proc sgplot data=dataset1 (where=(compress(level) not in ('-1','Z.Unknown')));
     title "Vehicle Age";
     vbar level / group = source response = exposure;
run;

This produces the graph with exposure split by source for each level. I also want to include the average frequency on the secondary axis, but that is only available in another dataset, which doesn't split exposure by source. Dataset 1 does contain an average frequency, but it is given for each source within each vehicle age, and I want to use the overall average frequency for each level. Something like:

proc sgplot data=Egidwh_oneway_yr3 (where=(rank <= 10 and compress(factor) = "glm_4_vehicl_vehicle_age" and compress(level) not in ('-1','Z.Unknown')));
     title "&factor - Exposure By Year";
     vbar level / group = source response = exposure;
     vline level / response=freq y2axis;

  run;

but with the vline data coming from the more summarized dataset.

Any ideas?


Accepted Solutions
Solution
‎03-05-2015 04:01 PM
Grand Advisor
Posts: 10,251

Re: Using two data sources in a proc sgplot procedure

The place to start would be to combine the data sets:

Data plot;

     set dataset1 dataset2;

run;

As long as your summary dataset has no values for Exposure, and the first dataset has no values for Freq, you should be good to go.

HINT: If you have lots of variables, keep only the ones you need for the plot. And since your WHERE criteria are so complex, apply them in the combination step; it looks like they belong on the first dataset.
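
A minimal sketch of that hint, assuming the summary dataset is called dataset2 and holds the variables level and freq (the names are taken from the question, but the actual contents of the second dataset are an assumption). The KEEP= and WHERE= dataset options trim and filter each input as it is read, so PROC SGPLOT no longer needs its own WHERE clause:

```sas
/* Stack the detail and summary data, keeping only the plot variables. */
/* Assumption: dataset2 is the summary data with level and freq.       */
data plot;
   set dataset1 (keep=level source exposure
                 where=(compress(level) not in ('-1','Z.Unknown')))
       dataset2 (keep=level freq
                 where=(compress(level) not in ('-1','Z.Unknown')));
run;
```

Rows from dataset1 then have a missing freq and rows from dataset2 have a missing exposure, which is what lets each chart statement pick up only its own data.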



All Replies

Super Contributor
Posts: 256

Re: Using two data sources in a proc sgplot procedure

Brilliant, that makes perfect sense. Thanks

Super Contributor
Posts: 256

Re: Using two data sources in a proc sgplot procedure

I tried this but I get the following error message:

Once a GROUP variable is used in a categorical chart, that GROUP variable must be used in all overlaid charts.  The specified GROUP variable has been removed from the graph display.

Here is my code:

data vehage(keep = factor level all_freq all_sev flag4 );
set Egidwh_oneway4;
if compress(factor) = "glm_4_vehicl_vehicle_age";
run;

data vehageyr(drop = all_freq all_sev ) ;
set Egidwh_oneway_yr3;
if compress(factor) = "glm_4_vehicl_vehicle_age";
run;

data comb;
set vehage vehageyr;
run;

  proc sgplot data=comb (where=(flag4 = "Y" and compress(factor) = "glm_4_vehicl_vehicle_age" and compress(level) not in ('-1','Z.Unknown')));
     title "vehicle_age - Severity";
     vbar level / group = source response=exposure;
     vline level /  response=all_freq y2axis;
  run;

Grand Advisor
Posts: 10,251

Re: Using two data sources in a proc sgplot procedure

Sorry, I missed the group part before. One way is to add a value for the group variable to the summary data (probably only one) and add GROUP= to the vline statement.

data comb;
   set vehage vehageyr (in=in1);
   /* assuming that vehageyr is the summary data lacking the Source variable; */
   /* if vehage is the summary data instead, put the IN= option on vehage     */
   if in1 then Source = <a source value>;
run;

To prevent the second plot from adding to the legend, use the NAME="text" option to name each of the plots, then add a KEYLEGEND "text"; statement, where text is the same text used for the VBAR NAME= option.

Super Contributor
Posts: 256

Re: Using two data sources in a proc sgplot procedure

Could you give a little example of this? I don't quite follow.

Thanks

Grand Advisor
Posts: 10,251

Re: Using two data sources in a proc sgplot procedure

proc sgplot data=comb (where=(flag4 = "Y" and compress(factor) = "glm_4_vehicl_vehicle_age" and compress(level) not in ('-1','Z.Unknown')));
     title "vehicle_age - Severity";
     vbar level / group = source response=exposure name='bar';
     vline level / group=source response=all_freq y2axis name='line';
     keylegend 'bar';
run;
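
For anyone who wants to try the whole approach end to end, here is a small self-contained sketch. The dataset names detail and summary, the flag insum, and all of the numbers are made up for illustration; only the statement options (GROUP=, RESPONSE=, Y2AXIS, NAME=, KEYLEGEND) come from the thread above.

```sas
/* Toy detail data: exposure split by source within each level (made-up values). */
data detail;
   input level $ source $ exposure;
   datalines;
0-2 A 120
0-2 B 80
3-5 A 150
3-5 B 90
;
run;

/* Toy summary data: one overall frequency per level, with no source split. */
data summary;
   input level $ all_freq;
   datalines;
0-2 0.12
3-5 0.09
;
run;

/* Stack the two and give the summary rows a single existing source value, */
/* so that GROUP=source can be used on both overlaid charts.               */
data comb;
   set detail summary (in=insum);
   if insum then source = 'A';
run;

proc sgplot data=comb;
   vbar  level / group=source response=exposure name='bar';
   vline level / group=source response=all_freq y2axis name='line';
   keylegend 'bar';   /* legend lists only the bar chart's groups */
run;
```

Which source value the summary rows receive only affects the color of the line, since the frequency values exist for that one group alone.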
