Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Home
- /
- Analytics
- /
- Stat Procs
- /
- How to divide the frequency of a variable's category

Options

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

🔒 This topic is **solved** and **locked**.
Need further help from the community? Please
sign in and ask a **new** question.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Posted 06-28-2018 01:19 PM
(2869 views)

Hi everyone,

I have an age group variable that I am looking at... they have categories as follows:

We need to create a category of 0 to 5 years, but unfortunately 5 year olds are lumped with the 6 year olds. How do I create a variable or even a category within the same variable (AGEGRP), where the **FREQUENCY** is divided by 2? (we have a rationale as to why we are doing this). Such that the number of 5 year olds show up with a frequency of ~9435?

Thank you very much!!

1 ACCEPTED SOLUTION

Accepted Solutions

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

See if this gets you started.

Proc freq data=raw noprint; tables agegrp/out=work.summary; run; data work.temp; set work.summary; if agegrp=2 then do; count5= floor(count/2); count6= count-count5; count = count5; agegrp=1; output; count= count6; agegrp=2; output; end; else output; end; proc summary data=work.temp nway; class agegrp; var count; output out=want (drop= _:) sum=; run; /* WANT will have two variable: agegrp and the count(or frequency) per agegrp with the 5 years combined to the agegrp=1 and agegrp=2 only the 6 year olds Create a new format to show agegrp=1 as 0-5, agegrp=2 as 6 */ Proc freq data=want; tables agegrp; format agegrp thenewformat.; weight count; run;

Since you haven't provided SAS variable names, data sets I'm guessing. If you don't have a sas data set then you need to read that csv.

Your desired output shows that at some point you have set the age group 88 to "missing" I suspect.

You will need to create a new format to show the new age group meanings for agegrp 1 and 2.

10 REPLIES 10

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

@kneeduh wrote:

Hi everyone,

I have an age group variable that I am looking at... they have categories as follows:

We need to create a category of 0 to 5 years, but unfortunately 5 year olds are lumped with the 6 year olds. How do I create a variable or even a category within the same variable (AGEGRP), where the

FREQUENCYis divided by 2? (we have a rationale as to why we are doing this). Such that the number of 5 year olds show up with a frequency of ~9435?

Thank you very much!!

I can't make heads or tails out of this. Show us the desired output, and explain the steps to get there.

By the way, it's nearly impossible to decipher "How do I create a variable or even a category ... " because variables and categories are different things.

--

Paige Miller

Paige Miller

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Sorry - am fairly new when it comes to working with SAS.

Essentially I want the frequency of 0 to 5 year olds to equal 55249.5, this would be derived from adding the number of 0 to 4 year olds (45815) with half of the number of 5 to 6 year olds (18869/2).

So far, this is a code that is getting me part way there.

data children;

set folder.census2006;

if AGEGRP = 1 then oldcount04 = 1;

if AGEGRP ^= 1 then oldcount04 = 0;

if AGEGRP = 2 then oldcount56 = 1;

if AGEGRP ^= 2 then oldcount56 = 0;

run;

data children;

set children;

newcount05= oldcount04+round(oldcount56/2);

run;

PROC FREQ DATA=CHILDREN;

TABLES AGEGRP oldcount04 oldcount56 newcount05;

RUN;

However- seems that the 'oldcount56' variable is not dividing by 2 successfully. As a result, I'm seeing 0 to 4 year olds just adding together with the entire 5 to 6 year old category.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

@kneeduh wrote:

Sorry - am fairly new when it comes to working with SAS.

Essentially I want the frequency of 0 to 5 year olds to equal 55249.5, this would be derived from adding the number of 0 to 4 year olds (45815) with half of the number of 5 to 6 year olds (18869/2).

Reading this literally, you want a single calculation of 45815 + 18869/2.

However, I get the feeling you want a table with many calculations. Do you?

Anyway, I asked you to SHOW us the desired output. Type it in. The only thing you have shown us is this single calculation. We're not going to be able to provide code for this table (if that's what you want) unless you SHOW us what you want. And I don't really think we have to show you how to code that single calculation.

So I await your reply.

--

Paige Miller

Paige Miller

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

I've attached the output here. Hope this is what you meant.

Thanks!

I actually only want this calculation above. I simply need the number of 0 to 5 year olds to show up, so I use this it subset my sample and get rid of anyone above the age of 5. I just cannot get the frequency to reflect the 5-6 year age group to be divided by 2, for the addition. 0 to 4 year old and 0 to 5 year old groups add together instead.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

So the output you want, as shown in sas_output.docx, has additional columns that are not in the input data, these additional columns are percent, cumulative frequency and cumulative percent. Nowhere do I see an output that moves the 5 yr olds into a 0 to 5 category, nor do I see what would happen to the other categories.

You haven't shown me the desired output, as I understand what you are asking. You have shown me what is probably an output from PROC FREQ given your existing data, not what I asked for.

--

Paige Miller

Paige Miller

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

@kneeduh wrote:

Hi everyone,

I have an age group variable that I am looking at... they have categories as follows:

We need to create a category of 0 to 5 years, but unfortunately 5 year olds are lumped with the 6 year olds. How do I create a variable or even a category within the same variable (AGEGRP), where the

FREQUENCYis divided by 2? (we have a rationale as to why we are doing this). Such that the number of 5 year olds show up with a frequency of ~9435?

Thank you very much!!

The first thing is to start with a data set. That is a picture. What does your data actually look like?

Personally if the base data that generated that "report" does not have individual ages then you get to have some fun.

One possibly defensible approach is to assume that proportion of ages within each age group is uniform then 5 year olds would be half of the 5 and 6 group. But Having worked with single age US Census population estimates I know that is extremely unlikely.

A possibly better approach if this data is for the US is to find the single age population estimates for the appropriate year and state, calculate the proportion between 5 and 6 from that file to apply it to your data.

How to do that would depend on what your data set looks like.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Thanks for all of your input! Dividing by 2 is fairly similar to the number of children born in Canada in 2001 (who would be 5 years in 2006), so that was why we just decided to stick with dividing by 2. But I would certainly be open to any further suggestions/advice you have!

I am struggling to upload the dataset as the file is far too large. Could you possibly advise on how I can go around this?

Thanks again!

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Giving this a shot

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

See if this gets you started.

Proc freq data=raw noprint; tables agegrp/out=work.summary; run; data work.temp; set work.summary; if agegrp=2 then do; count5= floor(count/2); count6= count-count5; count = count5; agegrp=1; output; count= count6; agegrp=2; output; end; else output; end; proc summary data=work.temp nway; class agegrp; var count; output out=want (drop= _:) sum=; run; /* WANT will have two variable: agegrp and the count(or frequency) per agegrp with the 5 years combined to the agegrp=1 and agegrp=2 only the 6 year olds Create a new format to show agegrp=1 as 0-5, agegrp=2 as 6 */ Proc freq data=want; tables agegrp; format agegrp thenewformat.; weight count; run;

Since you haven't provided SAS variable names, data sets I'm guessing. If you don't have a sas data set then you need to read that csv.

Your desired output shows that at some point you have set the age group 88 to "missing" I suspect.

You will need to create a new format to show the new age group meanings for agegrp 1 and 2.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Thank you so very much! This helped immensely. I really appreciate it!

**Available on demand!**

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.