Re: Creating a graph based on group

Uddin · Posted 06-16-2022 12:19 PM

Hi,

I have 3-set data (2017-2019), and I need to make it into long data set. I made it (combined) and then I list the happiness ranks and happiness scores for Australia and the countries that were in the top 5 or the bottom 5.

Top Country:

Country, Happiness_rank, Happiness_score, Year

Finland, 1, 7.769, 2019

Denmark, 2, 7.6, 2019

Norway, 3, 7.554, 2019

Iceland, 4, 7.494, 2019

Netherlands, 5, 7.488, 2019

Finland, 2, 7.369, 2018

Denmark, 3, 7.4, 2018

Norway, 1, 7.254, 2018

Iceland, 6, 7.394, 2018

Netherlands, 5, 7.188, 2018

Finland, 1, 7.369, 2017

Denmark, 4, 7.4, 2017

Norway, 1, 7.254, 2017

Iceland, 3, 7.394, 2017

Netherlands, 5, 7.188, 2017

Bottom 5:

Country, Happiness_rank, Happiness_score, Year

South Sudan, 156, 2.853, 2019

Central Africa, 155, 3.083, 2019

Afghanistan, 154, 3.203, 2019

Tanzania, 153, 3.231, 2019

Rwanda, 152, 3.334, 2019

South Sudan, 156, 2.653, 2018

Central Africa, 155, 3.183, 2018

Afghanistan, 154, 3.223, 2018

Tanzania, 153, 3.261, 2018

Rwanda, 152, 3.364, 2018

South Sudan, 156, 2.753, 2017

Central Africa, 155, 3.183, 2017

Afghanistan, 154, 3.293, 2017

Tanzania, 153, 3.331, 2017

Rwanda, 152, 3.344, 2017

Australia

Country, Happiness_rank, Happiness_score, Year

Australia, 11, 7.228, 2019

Australia, 10, 7.128, 2018

Australia, 12, 7.628, 2017

Now I want to create graphs for the change in happiness index over time for the two groupings identified in above: the top 5 plus Australia and then the bottom 5. In addition to this, I want to showall11countriesonthesame graph.

I used the following code:

data have;
set combined;
if Happiness_score <=7.488 then group=top;
else if Happiness_score >=3.334 then group=bottom;
else group=Australia;
run;
title 'Changes in happiness score by group (top and bottom) over time including Australia';
proc sgplot data=combined;
series x=year y=Happiness_score /group = Hapiness_score;
yaxis grid minor minorgrid min=0;
xaxis label='Year';
run;

Could you please help me in this regard. Would be highly appreciated

tarheel13 · Posted 06-16-2022 12:33 PM

I think if you want group to be a character variable, then you would need to wrap the text in quotes.

ballardw · Posted 06-16-2022 01:55 PM

Your code has numerous errors. You are attempting to use variables Top, Bottom and Australia which do not appear in your example data. IF it ran you would assign Australia to either top or bottom group as your last "else" never executes.

And the direction of your comparisons places the lowest scores in the Top category and the range comparison you use sets some countries as both "top" and "bottom". So that whole logic is flaky.

Why are you using "group=Hapiness_score"? That is going to attempt to create a separate "group" for each of your Y variables (if you spell the name correctly. As shown I expect an error of a variable not found.)

This will make a graph with a separate line for each country.

You need to tell use how the Top, Bottom or Australia are intended to be displayed on the graph. Different color? Different line? In the legend?

data combined;
   infile datalines dlmstr=', ';
   input Country :$15. Happiness_rank Happiness_score Year;
   length Display $ 10;
datalines;
Finland, 1, 7.769, 2019
Denmark, 2, 7.6, 2019
Norway, 3, 7.554, 2019
Iceland, 4, 7.494, 2019
Netherlands, 5, 7.488, 2019
Finland, 2, 7.369, 2018
Denmark, 3, 7.4, 2018
Norway, 1, 7.254, 2018
Iceland, 6, 7.394, 2018
Netherlands, 5, 7.188, 2018
Finland, 1, 7.369, 2017
Denmark, 4, 7.4, 2017
Norway, 1, 7.254, 2017
Iceland, 3, 7.394, 2017
Netherlands, 5, 7.188, 2017
South Sudan, 156, 2.853, 2019
Central Africa, 155, 3.083, 2019
Afghanistan, 154, 3.203, 2019
Tanzania, 153, 3.231, 2019
Rwanda, 152, 3.334, 2019
South Sudan, 156, 2.653, 2018
Central Africa, 155, 3.183, 2018
Afghanistan, 154, 3.223, 2018
Tanzania, 153, 3.261, 2018
Rwanda, 152, 3.364, 2018
South Sudan, 156, 2.753, 2017
Central Africa, 155, 3.183, 2017
Afghanistan, 154, 3.293, 2017
Tanzania, 153, 3.331, 2017
Rwanda, 152, 3.344, 2017
Australia, 11, 7.228, 2019
Australia, 10, 7.128, 2018
Australia, 12, 7.628, 2017
; 


proc sgplot data=combined;
   series x=year y=Happiness_score /group = country ;
   yaxis grid minor minorgrid min=0;
   xaxis label='Year' values=(2017, 2018, 2019);
run;

Do want 11 separate series? Then the Group variable would be the Country.

Uddin · Posted 06-19-2022 12:55 AM

Hi Ballardw,

Thank you so much for your kind support. Kind Regards

Uddin · Posted 06-19-2022 12:57 AM

Hi,

Here is the solution I found:

PROC SQL;
CREATE TABLE Top_countries as
SELECT * 
FROM work.combined
WHERE (Country like 'Finland') or (Country like 'Denmark') or (Country like 'Norway') or (Country like 'Iceland') or (Country like 'Netherlands') or (Country like 'Australia');
quit;

title 'Changes in happiness_score by top 5 countries plus Australia over time';
proc sgplot data=Top_countries;
   series x=year y=Happiness_score /group = country;
   run;
   title;

PROC SQL;
CREATE TABLE Bottom_countries as
SELECT * 
FROM work.combined
WHERE (Country like 'South Sudan') or (Country like 'Central African') or (Country like 'Afghanistan') or (Country like 'Tanzania') or (Country like 'Rwanda');
quit;

title 'Changes in happiness_score by bottom 5 countries over time';
proc sgplot data=Bottom_countries;
   series x=year y=Happiness_score /group = country;
   run;
   title;

Thanks everybody for your kind support

Creating a graph based on group