Hi,
I have 3-set data (2017-2019), and I need to make it into long data set. I made it (combined) and then I list the happiness ranks and happiness scores for Australia and the countries that were in the top 5 or the bottom 5.
Top Country:
Country, Happiness_rank, Happiness_score, Year
Finland, 1, 7.769, 2019
Denmark, 2, 7.6, 2019
Norway, 3, 7.554, 2019
Iceland, 4, 7.494, 2019
Netherlands, 5, 7.488, 2019
Finland, 2, 7.369, 2018
Denmark, 3, 7.4, 2018
Norway, 1, 7.254, 2018
Iceland, 6, 7.394, 2018
Netherlands, 5, 7.188, 2018
Finland, 1, 7.369, 2017
Denmark, 4, 7.4, 2017
Norway, 1, 7.254, 2017
Iceland, 3, 7.394, 2017
Netherlands, 5, 7.188, 2017
Bottom 5:
Country, Happiness_rank, Happiness_score, Year
South Sudan, 156, 2.853, 2019
Central Africa, 155, 3.083, 2019
Afghanistan, 154, 3.203, 2019
Tanzania, 153, 3.231, 2019
Rwanda, 152, 3.334, 2019
South Sudan, 156, 2.653, 2018
Central Africa, 155, 3.183, 2018
Afghanistan, 154, 3.223, 2018
Tanzania, 153, 3.261, 2018
Rwanda, 152, 3.364, 2018
South Sudan, 156, 2.753, 2017
Central Africa, 155, 3.183, 2017
Afghanistan, 154, 3.293, 2017
Tanzania, 153, 3.331, 2017
Rwanda, 152, 3.344, 2017
Australia
Country, Happiness_rank, Happiness_score, Year
Australia, 11, 7.228, 2019
Australia, 10, 7.128, 2018
Australia, 12, 7.628, 2017
Now I want to create graphs for the change in happiness index over time for the two groupings identified in above: the top 5 plus Australia and then the bottom 5. In addition to this, I want to showall11countriesonthesame graph.
I used the following code:
data have;
set combined;
if Happiness_score <=7.488 then group=top;
else if Happiness_score >=3.334 then group=bottom;
else group=Australia;
run;
title 'Changes in happiness score by group (top and bottom) over time including Australia';
proc sgplot data=combined;
series x=year y=Happiness_score /group = Hapiness_score;
yaxis grid minor minorgrid min=0;
xaxis label='Year';
run;
Could you please help me in this regard. Would be highly appreciated
I think if you want group to be a character variable, then you would need to wrap the text in quotes.
Your code has numerous errors. You are attempting to use variables Top, Bottom and Australia which do not appear in your example data. IF it ran you would assign Australia to either top or bottom group as your last "else" never executes.
And the direction of your comparisons places the lowest scores in the Top category and the range comparison you use sets some countries as both "top" and "bottom". So that whole logic is flaky.
Why are you using "group=Hapiness_score"? That is going to attempt to create a separate "group" for each of your Y variables (if you spell the name correctly. As shown I expect an error of a variable not found.)
This will make a graph with a separate line for each country.
You need to tell use how the Top, Bottom or Australia are intended to be displayed on the graph. Different color? Different line? In the legend?
data combined; infile datalines dlmstr=', '; input Country :$15. Happiness_rank Happiness_score Year; length Display $ 10; datalines; Finland, 1, 7.769, 2019 Denmark, 2, 7.6, 2019 Norway, 3, 7.554, 2019 Iceland, 4, 7.494, 2019 Netherlands, 5, 7.488, 2019 Finland, 2, 7.369, 2018 Denmark, 3, 7.4, 2018 Norway, 1, 7.254, 2018 Iceland, 6, 7.394, 2018 Netherlands, 5, 7.188, 2018 Finland, 1, 7.369, 2017 Denmark, 4, 7.4, 2017 Norway, 1, 7.254, 2017 Iceland, 3, 7.394, 2017 Netherlands, 5, 7.188, 2017 South Sudan, 156, 2.853, 2019 Central Africa, 155, 3.083, 2019 Afghanistan, 154, 3.203, 2019 Tanzania, 153, 3.231, 2019 Rwanda, 152, 3.334, 2019 South Sudan, 156, 2.653, 2018 Central Africa, 155, 3.183, 2018 Afghanistan, 154, 3.223, 2018 Tanzania, 153, 3.261, 2018 Rwanda, 152, 3.364, 2018 South Sudan, 156, 2.753, 2017 Central Africa, 155, 3.183, 2017 Afghanistan, 154, 3.293, 2017 Tanzania, 153, 3.331, 2017 Rwanda, 152, 3.344, 2017 Australia, 11, 7.228, 2019 Australia, 10, 7.128, 2018 Australia, 12, 7.628, 2017 ; proc sgplot data=combined; series x=year y=Happiness_score /group = country ; yaxis grid minor minorgrid min=0; xaxis label='Year' values=(2017, 2018, 2019); run;
Do want 11 separate series? Then the Group variable would be the Country.
Hi,
Here is the solution I found:
PROC SQL;
CREATE TABLE Top_countries as
SELECT *
FROM work.combined
WHERE (Country like 'Finland') or (Country like 'Denmark') or (Country like 'Norway') or (Country like 'Iceland') or (Country like 'Netherlands') or (Country like 'Australia');
quit;
title 'Changes in happiness_score by top 5 countries plus Australia over time';
proc sgplot data=Top_countries;
series x=year y=Happiness_score /group = country;
run;
title;
PROC SQL;
CREATE TABLE Bottom_countries as
SELECT *
FROM work.combined
WHERE (Country like 'South Sudan') or (Country like 'Central African') or (Country like 'Afghanistan') or (Country like 'Tanzania') or (Country like 'Rwanda');
quit;
title 'Changes in happiness_score by bottom 5 countries over time';
proc sgplot data=Bottom_countries;
series x=year y=Happiness_score /group = country;
run;
title;
Thanks everybody for your kind support
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.