BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Uddin
Fluorite | Level 6

Hi,

I have following data set with region and happiness score, and I want to create a single plot that shows the distribution of happiness score by region. Only include regions in Asia, Europe, and Africa and make sure the regions appear grouped within these three regions.

Region, Happiness_score

Western Europe, 7.587

Western Europe, 7.561

North America, 7.427

North America, 7.527

North America, 7.327

Australia and New Zealand, 7.286

Australia and New Zealand, 7.386

Central and Eastern Europe, 6.505

Central and Eastern Europe, 6.405

Eastern Asia, 5.984

Eastern Asia, 5.684

Latin America and Caribbean, 6.477

Middle East and Northern Africa, 6.411

Southeastern Asia, 6.455

Southern Asia, 5.194

Sub-Saharan Africa, 5.268

 

I used the following code but the plot is not shown:

 

proc sort data=combined2 out=combined2_sorted;

by Region;

run;

 

Proc format;

value $RTypeFmt

'Eastern Asia', 'Southeastern Asia', 'Southern Asia' = 'Asia'

'Central and Eastern Europe', 'Western Europe' ='Europe'

'Middle East and Northern Africa', 'Sub-Saharan Africa' ='Africa'

other='';

run;

proc boxplot data=combined2_sorted;

plot Happiness_score*Region/horizontal;

where Region='Asia' and 'Europe' and 'Africa';

run;

 

Looking for your kind support. Kind Regards

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26

PROC BOXPLOT appears to work differently than other SAS PROCs in how it uses the formats. You will have to do a sort by the formatted value.

 

data combined2a;
     set combined2;
     region2=put(region,$RTypefmt.);
run;
proc sort data=combined2a;
    by region2;
run;
proc boxplot data=combined2a;
     plot Happiness_score*Region2/horizontal;
     where region2 in ('Asia','Europe','Africa');
run;

 

--
Paige Miller

View solution in original post

16 REPLIES 16
PaigeMiller
Diamond | Level 26

From now on, please paste your code into the window that appears when you click on the "little running man" icon. Example:

 

where Region='Asia' and 'Europe' and 'Africa';

 

The above line of code is not valid syntax. The following would be valid syntax, except for your formatting.

 

where Region in ('Asia' ,'Europe' ,'Africa');

 

but even that won't work because of your formatting. I think you just remove the WHERE statement from your code, as all regions in this data seems to be Asia, Europe or Africa. Also, you need this in PROC BOXPLOT otherwise without this your formats will not get used by PROC BOXPLOT.

 

format region $RTypeFmt.;

 

--
Paige Miller
Uddin
Fluorite | Level 6
Hi Paige,

Thank you so much for your kind response. I tried with both ways as you suggested but it shows all group without WHERE statement like this:
[cid:5ecd9f9b-14d3-4c85-b2ad-520051335356]

And with WHERE statement it shows error in WHER statement. Looking forward to your kind response.
ballardw
Super User

Any time you see errors, copy from the log the submitted code along with all the notes and messages, on the forum open a text box using the </> and paste all the copied text.

 

An error without the code generating it is mostly useless.

Uddin
Fluorite | Level 6
OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 68         
 69         proc boxplot data=combined2_sorted;
 70         plot Happiness_score*Region/horizontal;
 71         where Region in ('Asia' ,'Europe' ,'Africa');
 72         run;
 
 NOTE: No observations were selected from data set WORK.COMBINED2_SORTED.
 NOTE: There were 0 observations read from the data set WORK.COMBINED2_SORTED.
       WHERE Region in ('Africa', 'Asia', 'Europe');
 NOTE: PROCEDURE BOXPLOT used (Total process time):
       real time           0.00 seconds
       user cpu time       0.01 seconds
       system cpu time     0.00 seconds
       memory              749.84k
       OS Memory           32680.00k
       Timestamp           06/15/2022 03:35:43 PM
       Step Count                        663  Switch Count  0
       Page Faults                       0
       Page Reclaims                     51
       Page Swaps                        0
       Voluntary Context Switches        0
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           0
       
 
 73         
 74         OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 84         
proc sort data=combined2 out=combined2_sorted;
by Region;
run;
Proc format;
value $RTypeFmt
'Eastern Asia', 'Southeastern Asia', 'Southern Asia' = 'Asia'
'Central and Eastern Europe', 'Western Europe' ='Europe'
'Middle East and Northern Africa', 'Sub-Saharan Africa' ='Africa'
other='';
run;

proc boxplot data=combined2_sorted;
plot Happiness_score*Region/horizontal;
where Region in ('Asia' ,'Europe' ,'Africa');
run;
ballardw
Super User

You requested a where statement with the values of the variable not the FORMATTED value of the variable.

Try

          
 proc boxplot data=combined2_sorted;
     plot Happiness_score*Region/horizontal;
     where put(Region,$RTypeFmt.) in ('Asia' ,'Europe' ,'Africa');
/* and as was previously mentioned to DISPLAY the formatted value in the graph*/
format region $$RTypeFmt. ; run;

 

Comparisons such as IF, WHERE, = , not equal and such will use the VALUE of a variable not the formatted value unless you specifically force that value to be used by applying the format in the operation with a Put.

 

Uddin
Fluorite | Level 6
Hi,

Thank for your kind support. The code you provided it gives the following plot:
[cid:77563531-5e01-4757-8e4a-d5aa68080e7f]

It is not coming with only three regions. It seems it covers all. Looking for your kind response.
PaigeMiller
Diamond | Level 26

Repeating that has already been said three times now:

 

its time for you to stop saying you get errors, and SHOW US the log, including your code as it appears in the log.

 

So

 

PLEASE SHOW US THE LOG

--
Paige Miller
Uddin
Fluorite | Level 6
Here is the log:


1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
68
69 proc boxplot data=combined2_sorted;
70 plot Happiness_score*Region/horizontal;
71 where put(region,$RTypeFmt.) in ('Asia','Europe','Africa');
72 /* and as was previously mentioned to DISPLAY the formatted value in the graph*/
73 format region $RTypeFmt. ;
74 run;

NOTE: Processing beginning for PLOT statement number 1.
NOTE: There were 132 observations read from the data set WORK.COMBINED2_SORTED.
WHERE PUT(region, $RTYPEFMT6.) in ('Africa', 'Asia', 'Europe');
NOTE: PROCEDURE BOXPLOT used (Total process time):
real time 0.16 seconds
user cpu time 0.10 seconds
system cpu time 0.01 seconds
memory 14857.50k
OS Memory 43716.00k
Timestamp 06/15/2022 04:31:26 PM
Step Count 760 Switch Count 1
Page Faults 0
Page Reclaims 2850
Page Swaps 0
Voluntary Context Switches 704
Involuntary Context Switches 1
Block Input Operations 0
Block Output Operations 672


75
76 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
86

Looking for your kind response.
PaigeMiller
Diamond | Level 26

We can't see your plot. Please use the "Insert Photos" icon to include a screen capture of your plot.

 

PROC BOXPLOT needs data that is sorted by REGION to produce a meaningful plot.

--
Paige Miller
Uddin
Fluorite | Level 6

Boxplot.jpg

This is the plot.

PaigeMiller
Diamond | Level 26

So only three regions are shown. The reason some regions are appear in several different places on the axis is that you have to sort the data by REGION before you run PROC BOXPLOT.

--
Paige Miller
Uddin
Fluorite | Level 6

Hi Page,

Thanks for your kind effort. Yes, I sorted the data by Region before I run PROC BOXPLOT. Please have a look on the code I used:

proc sort data=combined2 out=combined2_sorted;
by Region;
run;

Proc format;
value $RTypeFmt
'Eastern Asia', 'Southeastern Asia', 'Southern Asia' = 'Asia'
'Central and Eastern Europe', 'Western Europe' ='Europe'
'Middle East and Northern Africa', 'Sub-Saharan Africa' ='Africa'
other='';
run;

proc boxplot data=combined2_sorted;
     plot Happiness_score*Region/horizontal;
     where put(region,$RTypeFmt.) in ('Asia','Europe','Africa');
     /* and as was previously mentioned to DISPLAY the formatted value in the graph*/
    format region $RTypeFmt.;
run;

But it shows the same.

 

Boxplot.jpg

 

Looking for your kind response.

 

 

 

 

PaigeMiller
Diamond | Level 26

PROC BOXPLOT appears to work differently than other SAS PROCs in how it uses the formats. You will have to do a sort by the formatted value.

 

data combined2a;
     set combined2;
     region2=put(region,$RTypefmt.);
run;
proc sort data=combined2a;
    by region2;
run;
proc boxplot data=combined2a;
     plot Happiness_score*Region2/horizontal;
     where region2 in ('Asia','Europe','Africa');
run;

 

--
Paige Miller
Uddin
Fluorite | Level 6
Hi Paige,

thank you so much for your kind efforts and patience. Really amazing!!!!!!!Appreciated

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 16 replies
  • 800 views
  • 0 likes
  • 4 in conversation