_maldini_
Barite | Level 11

I'd like to create a line graph showing the proportion of respondents answering "Yes" to past month tobacco use, by age category, over time. Basically I am trying to create a graph like this one: 

IMG_7992.jpg

 

These instructions are helpful, but I'm not sure how to manipulate the data before running the PROC SGPLOT.

When I run this syntax using the data as is... 


PROC SGPLOT DATA = input;
   SERIES X = year Y = past_month_use / group=age_cat;
   SERIES X = year Y = past_month_use / group=age_cat;
   SERIES X = year Y = past_month_use / group=age_cat;
RUN; 

I get this: 

Screen Shot 2022-03-31 at 11.46.17 AM.png

 

How do I need to manipulate these data so that I can produce the desired outcome?

 

Thank you. 

 

Sample data below:
4 vars:
id (1-20 respondents)
age_cat (1,2,3)
past_month_use (0=no, 1=yes)
year (2018-2021)

 

id age_cat past_month_use year
1 1 1 2018
1 1 1 2019
1 1 1 2020
1 2 0 2021
2 2 0 2018
2 2 0 2019
2 2 1 2020
2 2 1 2021
3 3 1 2018
3 3 1 2019
3 3 1 2020
3 3 1 2021
4 2 1 2018
4 3 1 2019
4 3 1 2020
4 3 1 2021
5 1 0 2018
5 1 0 2019
5 1 0 2020
5 1 0 2021
6 3 0 2018
6 3 1 2019
6 3 0 2020
6 3 1 2021
7 2 1 2018
7 2 1 2019
7 2 1 2020
7 2 1 2021
8 2 0 2018
8 2 1 2019
8 3 0 2020
8 3 1 2021
9 3 1 2018
9 3 1 2019
9 3 1 2020
9 3 1 2021
10 1 1 2018
10 1 0 2019
10 1 0 2020
10 1 0 2021
11 2 0 2018
11 2 0 2019
11 2 1 2020
11 2 1 2021
12 1 1 2018
12 1 0 2019
12 1 1 2020
12 2 1 2021
13 3 1 2018
13 3 1 2019
13 3 1 2020
13 3 1 2021
14 2 1 2018
14 2 1 2019
14 2 0 2020
14 3 0 2021
15 1 0 2018
15 1 0 2019
15 1 1 2020
15 1 1 2021
16 2 1 2018
16 2 0 2019
16 2 0 2020
16 2 0 2021
17 2 0 2018
17 3 1 2019
17 3 1 2020
17 3 1 2021
18 2 1 2018
18 2 1 2019
18 2 1 2020
18 2 1 2021
19 3 1 2018
19 3 1 2019
19 3 1 2020
19 3 1 2021
20 1 1 2018
20 2 1 2019
20 2 1 2020
20 2 1 2021
Accepted Solutions
ballardw
Super User

You need to summarize the data to create the proportion.

With a 1/0 coded variable, the mean is the proportion of 1 values. I used PROC SUMMARY (other methods are possible) to create a data set holding that mean for each combination of age_cat and year.

The XAXIS and YAXIS statements make the X values "nicer" and force 0 onto the Y axis.

The format is entirely optional but gives the axis a nicer appearance. Adding labels to your variables would give nicer axis labels.

 

data have;
  input id age_cat past_month_use year;
datalines;
1 1 1 2018
1 1 1 2019
1 1 1 2020
1 2 0 2021
2 2 0 2018
2 2 0 2019
2 2 1 2020
2 2 1 2021
3 3 1 2018
3 3 1 2019
3 3 1 2020
3 3 1 2021
4 2 1 2018
4 3 1 2019
4 3 1 2020
4 3 1 2021
5 1 0 2018
5 1 0 2019
5 1 0 2020
5 1 0 2021
6 3 0 2018
6 3 1 2019
6 3 0 2020
6 3 1 2021
7 2 1 2018
7 2 1 2019
7 2 1 2020
7 2 1 2021
8 2 0 2018
8 2 1 2019
8 3 0 2020
8 3 1 2021
9 3 1 2018
9 3 1 2019
9 3 1 2020
9 3 1 2021
10 1 1 2018
10 1 0 2019
10 1 0 2020
10 1 0 2021
11 2 0 2018
11 2 0 2019
11 2 1 2020
11 2 1 2021
12 1 1 2018
12 1 0 2019
12 1 1 2020
12 2 1 2021
13 3 1 2018
13 3 1 2019
13 3 1 2020
13 3 1 2021
14 2 1 2018
14 2 1 2019
14 2 0 2020
14 3 0 2021
15 1 0 2018
15 1 0 2019
15 1 1 2020
15 1 1 2021
16 2 1 2018
16 2 0 2019
16 2 0 2020
16 2 0 2021
17 2 0 2018
17 3 1 2019
17 3 1 2020
17 3 1 2021
18 2 1 2018
18 2 1 2019
18 2 1 2020
18 2 1 2021
19 3 1 2018
19 3 1 2019
19 3 1 2020
19 3 1 2021
20 1 1 2018
20 2 1 2019
20 2 1 2020
20 2 1 2021
;

proc summary data=have nway;
   class age_cat year;
   var past_month_use ;
   output out=summary mean=;
run;

proc sgplot data=summary;
   series x=year y=past_month_use/ group=age_cat;
   xaxis values=(2018 to 2021 by 1) ;
   yaxis values=(0 to 1 by .1);
   format past_month_use percent6.;
run;
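
As a follow-up to the remark about labels, here is a minimal sketch of how labels and a format on the group variable could be added to the plot. It assumes the SUMMARY data set created above. The format name agefmt and its category text are hypothetical placeholders, since the thread does not say what age ranges the codes 1, 2 and 3 stand for, and the axis label text is illustrative only.

```sas
proc format;
   /* Hypothetical labels: replace with the real age ranges for codes 1-3 */
   value agefmt 1='Age group 1'
                2='Age group 2'
                3='Age group 3';
run;

proc sgplot data=summary;
   /* MARKERS adds a symbol at each year so single points stay visible */
   series x=year y=past_month_use / group=age_cat markers;
   xaxis values=(2018 to 2021 by 1) label='Survey year';
   yaxis values=(0 to 1 by .1) label='Past-month tobacco use';
   /* The format on age_cat controls the text shown in the legend */
   format past_month_use percent6. age_cat agefmt.;
   label age_cat='Age category';
run;
```

The LABEL= options only change the axis titles; the plotted values are the same means produced by PROC SUMMARY.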


4 REPLIES
yabwon
Onyx | Level 15

Do you mean something like this:

data have;
input id age_cat past_month_use year;
cards;
1 1 1 2018
1 1 1 2019
1 1 1 2020
1 2 0 2021
2 2 0 2018
2 2 0 2019
2 2 1 2020
2 2 1 2021
3 3 1 2018
3 3 1 2019
3 3 1 2020
3 3 1 2021
4 2 1 2018
4 3 1 2019
4 3 1 2020
4 3 1 2021
5 1 0 2018
5 1 0 2019
5 1 0 2020
5 1 0 2021
6 3 0 2018
6 3 1 2019
6 3 0 2020
6 3 1 2021
7 2 1 2018
7 2 1 2019
7 2 1 2020
7 2 1 2021
8 2 0 2018
8 2 1 2019
8 3 0 2020
8 3 1 2021
9 3 1 2018
9 3 1 2019
9 3 1 2020
9 3 1 2021
10 1 1 2018
10 1 0 2019
10 1 0 2020
10 1 0 2021
11 2 0 2018
11 2 0 2019
11 2 1 2020
11 2 1 2021
12 1 1 2018
12 1 0 2019
12 1 1 2020
12 2 1 2021
13 3 1 2018
13 3 1 2019
13 3 1 2020
13 3 1 2021
14 2 1 2018
14 2 1 2019
14 2 0 2020
14 3 0 2021
15 1 0 2018
15 1 0 2019
15 1 1 2020
15 1 1 2021
16 2 1 2018
16 2 0 2019
16 2 0 2020
16 2 0 2021
17 2 0 2018
17 3 1 2019
17 3 1 2020
17 3 1 2021
18 2 1 2018
18 2 1 2019
18 2 1 2020
18 2 1 2021
19 3 1 2018
19 3 1 2019
19 3 1 2020
19 3 1 2021
20 1 1 2018
20 2 1 2019
20 2 1 2020
20 2 1 2021
;
run;

proc sort data = have;
  by year age_cat past_month_use id;
run;

data want;
  set have;
  by year age_cat;
  if first.age_cat then
    do;
      d = 0;
      n = 0;
    end;

  d + past_month_use;
  n + 1;

  if last.age_cat then
    do;
      prop = divide(d,n);
      output; /* write every group, including those where prop = 0 */
      format prop percent10.2;
    end;
run;

proc sgplot data = want;
  series x = year y = prop / group=age_cat;
  xaxis integer;
run;

?

Bart




_maldini_
Barite | Level 11

Thank you!

Can you accomplish the same thing using PROC FREQ instead of PROC SUMMARY?

 

 

ballardw
Super User

@_maldini_ wrote:

Thank you!

Can you accomplish the same thing using PROC FREQ instead of PROC SUMMARY?

 

 


My suggestion is always "try and see".

The main thing with PROC FREQ is that it counts every formatted value, so you end up with counts and percentages for 0 as well as 1. PROC FREQ also reports percentages on a 0 to 100 scale rather than 0 to 1, so the Y axis values change.

 

proc freq data=have noprint;
  tables age_cat*year*past_month_use / outpct out=freqout;

run;

proc sgplot data=freqout;
   where past_month_use=1;
   series x=year y=pct_row/ group=age_cat;
   xaxis values=(2018 to 2021 by 1) ;
   yaxis values=(0 to 100 by 10);
run;
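
If you would rather keep the same 0 to 1 axis and percent format as the PROC SUMMARY version, one option is to rescale the PROC FREQ output first. This is only a sketch, assuming the FREQOUT data set created above; the data set name freqprop and the variable name prop are made up for illustration.

```sas
data freqprop;
   set freqout;
   /* Keep only the "Yes" rows; pct_row is the percent within each age_cat/year */
   where past_month_use = 1;
   /* Convert the 0-100 percentage from PROC FREQ to a 0-1 proportion */
   prop = pct_row / 100;
   format prop percent6.;
run;

proc sgplot data=freqprop;
   series x=year y=prop / group=age_cat;
   xaxis values=(2018 to 2021 by 1);
   yaxis values=(0 to 1 by .1);
run;
```

Note that any age_cat and year combination with no "Yes" responses has no past_month_use=1 row in FREQOUT, so that point would be missing from the line rather than plotted at 0.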
