marghe
Fluorite | Level 6

Hello,

I have been having trouble with density plots in SGPLOT/SGPANEL. My dataset has 4 subgroups and, when I use the DENSITY statement with the GROUP= option, a density is computed within each group without taking into consideration the proportion of each subgroup in the whole dataset. How can I "resize" these curves, so that each density reflects its actual size with respect to the whole dataset?

Thanks in advance.

PaigeMiller
Diamond | Level 26

I am not sure I understand the request. But please look at this example and scroll down to the second example, is that what you want? https://documentation.sas.com/?docsetId=grstatproc&docsetTarget=p0o7d7xxvzs9qmn1vctlufpz1448.htm&doc...

 
--
Paige Miller
marghe
Fluorite | Level 6
Thanks for your reply, but unfortunately I couldn't find what I was looking for. Here is an example to explain better:
I need to analyse the distribution of a continuous variable (let's call it X) across these 4 subgroups:

TRT  GENDER  #SUBJECTS
A    M       20
A    F       30
B    M       10
B    F       40
             ____
Total        100
If we take the first subgroup "A - M", for example, the density of X is computed on a total of 20 subjects instead of 100. As a result, the procedure gives 4 density functions, each conditional on its subgroup, whereas I would like a joint density function on (X, TRT, GENDER), made up of 4 densities. The difference comes down to the total used in the computations.
I hope this example gives you some more insight into my question.
Thank you!
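
To make the distinction above concrete, the relation between the conditional densities and the joint density can be written out as below. The notation is an assumption added for illustration and does not come from the thread: G stands for the combined (TRT, GENDER) subgroup, n_g for the number of subjects in subgroup g, and N = 100 for the total.

```latex
\[
f_{X,G}(x, g) \;=\; P(G = g)\, f_{X \mid G}(x \mid g),
\qquad
\widehat{P}(G = g) \;=\; \frac{n_g}{N}
\]
\[
\widehat{P}(\text{A, M}) = \tfrac{20}{100} = 0.20,\quad
\widehat{P}(\text{A, F}) = \tfrac{30}{100} = 0.30,\quad
\widehat{P}(\text{B, M}) = \tfrac{10}{100} = 0.10,\quad
\widehat{P}(\text{B, F}) = \tfrac{40}{100} = 0.40
\]
\[
\int f_{X,G}(x, g)\, dx \;=\; P(G = g),
\qquad
\sum_{g} f_{X,G}(x, g) \;=\; f_X(x)
\]
```

Read this way, each conditional curve has area 1, while each rescaled curve has area equal to its subgroup's share, and the four rescaled curves add up to the density of X over all 100 subjects.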
PaigeMiller
Diamond | Level 26

Well, probably I still don't understand. Can you explain why the example I mentioned is NOT what you want?

 
--
Paige Miller
ballardw
Super User

Please at least show the code you have attempted. Without that we have no clue what options you may already be using (or not using) that might be part of a solution. Post the code in a code box opened with the forum's </> or "running man" icon to preserve formatting.

 

And it would help a lot to have a set of data to display.

Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the <> icon or attached as text to show exactly what you have and that we can test code against.

 

Only include the variables needed to make the plot.

marghe
Fluorite | Level 6

Thanks for your replies, here is my code (it's easier for me to use a sashelp dataset):

data dataset;
   set sashelp.class;
run;

/* Age categorization */
data class;
   set dataset;
   if Age<=13 then Age_cat='<=13';
   else Age_cat='>13';
run;

/* Sex and Age category combined to get a group with 4 levels */
data class_group;
   set class;
   if Sex='M' then do;
      if Age_cat='<=13' then cat='M - <=13';
      else cat='M - >13';
   end;
   else do;
      if Age_cat='<=13' then cat='F - <=13';
      else cat='F - >13';
   end;
run;

proc sort data=class_group;
   by cat;
run;

/* Density plot by group (Plot 1) */
title 'Height of boys and girls according to age category';
proc sgplot data=class_group;
   density height / group=cat type=kernel;
   xaxis label='HEIGHT';
run;

/* In class_group, 5 subjects have cat='F - <=13' and the total dataset has 19 subjects */
/* In a histogram, frequencies are computed within each subgroup (Plot 2) */
title 'Height of boys and girls according to age category';
proc sgplot data=class_group(where=(cat='F - <=13'));
   histogram height / group=cat binwidth=10;
   xaxis label='HEIGHT' values=(50 to 70 by 10);
run;

Plot 1 (Plot1.PNG): kernel density of height by group

Plot 2 (Plot2.PNG): histogram of height for 'F - <=13'

The red lines in Plot 2 indicate what the bar heights would be if the frequency for 'F - <=13' were computed on the total dataset (19 subjects) instead of on the subgroup (5 subjects).

I would like the densities for each group to be computed using all 19 subjects, not only those belonging to that particular subgroup. Do I have to use weights?

I hope this clarifies my question a bit. Thank you all!
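
One possible way to get curves scaled by group share is sketched below. It is not a confirmed answer from this thread, only an illustration under stated assumptions: it estimates a kernel density per group with PROC KDE (which requires SAS/STAT), multiplies each curve by the group's proportion of all subjects, and draws the result with a SERIES statement instead of DENSITY. The data set names dens, props and dens_scaled and the variable scaled_density are made up for this example, and it assumes class_group has been built and sorted by cat as in the code above. PROC KDE may choose a different bandwidth than the DENSITY statement, so the curve shapes may not match Plot 1 exactly.

```sas
/* Sketch only: dens, props, dens_scaled and scaled_density are illustrative names. */

/* 1. Kernel density of height within each group.                          */
/*    The OUT= data set holds the grid points (value) and the estimated    */
/*    density at each point (density), one set of rows per BY group.       */
proc kde data=class_group;
   by cat;
   univar height / out=dens;
run;

/* 2. Share of each group in the whole data set (PERCENT is on a 0-100 scale). */
proc freq data=class_group noprint;
   tables cat / out=props(keep=cat percent);
run;

/* 3. Scale each group's density by its share, e.g. 5/19 for 'F - <=13'. */
data dens_scaled;
   merge dens props;
   by cat;
   scaled_density = density * percent / 100;
run;

/* 4. Plot the scaled curves, one line per group. */
title 'Height density by group, scaled by group share';
proc sgplot data=dens_scaled;
   series x=value y=scaled_density / group=cat;
   xaxis label='HEIGHT';
   yaxis label='Density x group proportion';
run;
title;
```

With this scaling, the area under each curve equals the group's proportion rather than 1, so the four areas together add up to 1.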
