Hi,
I have following data set with region and happiness score, and I want to create a single plot that shows the distribution of happiness score by region. Only include regions in Asia, Europe, and Africa and make sure the regions appear grouped within these three regions.
Region, Happiness_score
Western Europe, 7.587
Western Europe, 7.561
North America, 7.427
North America, 7.527
North America, 7.327
Australia and New Zealand, 7.286
Australia and New Zealand, 7.386
Central and Eastern Europe, 6.505
Central and Eastern Europe, 6.405
Eastern Asia, 5.984
Eastern Asia, 5.684
Latin America and Caribbean, 6.477
Middle East and Northern Africa, 6.411
Southeastern Asia, 6.455
Southern Asia, 5.194
Sub-Saharan Africa, 5.268
I used the following code but the plot is not shown:
proc sort data=combined2 out=combined2_sorted;
by Region;
run;
Proc format;
value $RTypeFmt
'Eastern Asia', 'Southeastern Asia', 'Southern Asia' = 'Asia'
'Central and Eastern Europe', 'Western Europe' ='Europe'
'Middle East and Northern Africa', 'Sub-Saharan Africa' ='Africa'
other='';
run;
proc boxplot data=combined2_sorted;
plot Happiness_score*Region/horizontal;
where Region='Asia' and 'Europe' and 'Africa';
run;
Looking for your kind support. Kind Regards
PROC BOXPLOT appears to work differently than other SAS PROCs in how it uses the formats. You will have to do a sort by the formatted value.
data combined2a;
set combined2;
region2=put(region,$RTypefmt.);
run;
proc sort data=combined2a;
by region2;
run;
proc boxplot data=combined2a;
plot Happiness_score*Region2/horizontal;
where region2 in ('Asia','Europe','Africa');
run;
From now on, please paste your code into the window that appears when you click on the "little running man" icon. Example:
where Region='Asia' and 'Europe' and 'Africa';
The above line of code is not valid syntax. The following would be valid syntax, except for your formatting.
where Region in ('Asia' ,'Europe' ,'Africa');
but even that won't work because of your formatting. I think you just remove the WHERE statement from your code, as all regions in this data seems to be Asia, Europe or Africa. Also, you need this in PROC BOXPLOT otherwise without this your formats will not get used by PROC BOXPLOT.
format region $RTypeFmt.;
Any time you see errors, copy from the log the submitted code along with all the notes and messages, on the forum open a text box using the </> and paste all the copied text.
An error without the code generating it is mostly useless.
OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK; 68 69 proc boxplot data=combined2_sorted; 70 plot Happiness_score*Region/horizontal; 71 where Region in ('Asia' ,'Europe' ,'Africa'); 72 run; NOTE: No observations were selected from data set WORK.COMBINED2_SORTED. NOTE: There were 0 observations read from the data set WORK.COMBINED2_SORTED. WHERE Region in ('Africa', 'Asia', 'Europe'); NOTE: PROCEDURE BOXPLOT used (Total process time): real time 0.00 seconds user cpu time 0.01 seconds system cpu time 0.00 seconds memory 749.84k OS Memory 32680.00k Timestamp 06/15/2022 03:35:43 PM Step Count 663 Switch Count 0 Page Faults 0 Page Reclaims 51 Page Swaps 0 Voluntary Context Switches 0 Involuntary Context Switches 0 Block Input Operations 0 Block Output Operations 0 73 74 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK; 84
proc sort data=combined2 out=combined2_sorted;
by Region;
run;
Proc format;
value $RTypeFmt
'Eastern Asia', 'Southeastern Asia', 'Southern Asia' = 'Asia'
'Central and Eastern Europe', 'Western Europe' ='Europe'
'Middle East and Northern Africa', 'Sub-Saharan Africa' ='Africa'
other='';
run;
proc boxplot data=combined2_sorted;
plot Happiness_score*Region/horizontal;
where Region in ('Asia' ,'Europe' ,'Africa');
run;
You requested a where statement with the values of the variable not the FORMATTED value of the variable.
Try
proc boxplot data=combined2_sorted; plot Happiness_score*Region/horizontal; where put(Region,$RTypeFmt.) in ('Asia' ,'Europe' ,'Africa');
/* and as was previously mentioned to DISPLAY the formatted value in the graph*/
format region $$RTypeFmt. ; run;
Comparisons such as IF, WHERE, = , not equal and such will use the VALUE of a variable not the formatted value unless you specifically force that value to be used by applying the format in the operation with a Put.
Repeating that has already been said three times now:
its time for you to stop saying you get errors, and SHOW US the log, including your code as it appears in the log.
So
PLEASE SHOW US THE LOG
We can't see your plot. Please use the "Insert Photos" icon to include a screen capture of your plot.
PROC BOXPLOT needs data that is sorted by REGION to produce a meaningful plot.
This is the plot.
So only three regions are shown. The reason some regions are appear in several different places on the axis is that you have to sort the data by REGION before you run PROC BOXPLOT.
Hi Page,
Thanks for your kind effort. Yes, I sorted the data by Region before I run PROC BOXPLOT. Please have a look on the code I used:
proc sort data=combined2 out=combined2_sorted;
by Region;
run;
Proc format;
value $RTypeFmt
'Eastern Asia', 'Southeastern Asia', 'Southern Asia' = 'Asia'
'Central and Eastern Europe', 'Western Europe' ='Europe'
'Middle East and Northern Africa', 'Sub-Saharan Africa' ='Africa'
other='';
run;
proc boxplot data=combined2_sorted;
plot Happiness_score*Region/horizontal;
where put(region,$RTypeFmt.) in ('Asia','Europe','Africa');
/* and as was previously mentioned to DISPLAY the formatted value in the graph*/
format region $RTypeFmt.;
run;
But it shows the same.
Looking for your kind response.
PROC BOXPLOT appears to work differently than other SAS PROCs in how it uses the formats. You will have to do a sort by the formatted value.
data combined2a;
set combined2;
region2=put(region,$RTypefmt.);
run;
proc sort data=combined2a;
by region2;
run;
proc boxplot data=combined2a;
plot Happiness_score*Region2/horizontal;
where region2 in ('Asia','Europe','Africa');
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.