Hello,
I'm new to SAS and wanted to see if someone can help me with a frequency table. The following code works, but the tables do not look correct for the internetuserate and incomeperperson variables. Can someone help me add code to create ranges (i.e., bins) for these values?
Thanks,
Tim
DATA new; set mydata.gapminder;
PROC SORT; by Country;
LABEL internetuserate = " Internet User Rate"
polityscore = " Democracy Score"
incomeperperson = "Income Per Person";
PROC FREQ; TABLES internetuserate polityscore incomeperperson;
RUN;
@mulltr66 wrote:
The following code is working but the tables do not look correct for internetuserate and incomeperperson variables.
We don't know what this means. Can you explain further what is wrong, and show us exactly what is incorrect so we can see what you are seeing?
Paige Miller
Attached is the data set. I'm trying to create buckets for two variables. An example for the internetuserate variable is below.
PROC SORT; by Country;
proc format;
value internetuserate_adj
low - 10 = '1'
11 - 20 = '2'
21 - 30 = '3'
31 - 40 = '4'
41 - 50 = '5'
51 - 60 = '6'
61 - 70 = '7'
71 - 80 = '8'
81 - 90 = '9'
91 - 100 = '10'
format internetuserate internetuserate_adj; * applies the format
LABEL internetuserate = " Internet User Rate "
PROC FREQ; TABLES internetuserate_adj;
RUN;
@mulltr66 wrote:
Attached is the data set. I'm trying to create buckets for two variables. An example for the internetuserate variable is below.
PROC SORT; by Country;proc format;
value internetuserate_adj
low - 10 = '1'
11 - 20 = '2'
21 - 30 = '3'
31 - 40 = '4'
41 - 50 = '5'
51 - 60 = '6'
61 - 70 = '7'
71 - 80 = '8'
81 - 90 = '9'
91 - 100 = '10'
format internetuserate internetuserate_adj; * applies the format
LABEL internetuserate = " Internet User Rate "
PROC FREQ; TABLES internetuserate_adj;
RUN;
If you ran that code as shown, you have lots of problems.
I don't see anywhere that you have told PROC FREQ to use the format with the variable. Your FORMAT statement above is also incorrect: the variable must be followed by the format name, and the format name must end with a PERIOD when it is used.
If you have a statement like:
Format somevarname ;
without an actual format name then you are clearing the previously assigned format, if any, to a default for the variable type. For numeric variables that is usually BEST12.
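As a minimal sketch of the two forms of the FORMAT statement described above (this assumes the sashelp.class sample data is available; the format name age_demo and the data set name class_fmt are made up for illustration):

```sas
/* hypothetical format, for illustration only */
proc format;
  value age_demo
    low -< 13 = 'Under 13'
    13 - high = '13 and over';
run;

/* store the format with the variable in a copy of the sample data */
data class_fmt;
  set sashelp.class;
  format age age_demo.;   /* variable name, then format name ending in a period */
run;

/* the stored format is used here, so age is grouped into the two labels */
proc freq data=class_fmt;
  tables age;
run;

/* FORMAT with no format name removes the format for this step,
   so each distinct age is listed on its own row again */
proc freq data=class_fmt;
  tables age;
  format age;
run;
```

Comparing the two PROC FREQ tables should make it clear whether a format is actually being applied.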
internetuserate has many distinct values. I'm trying to put the results into a 10-bucket frequency table.
I had the naming wrong because I thought I needed to create a new variable.
Here's an example of how to create a format to bin your data. You can run this code on your system and see the output.
/*this is an example of creating a custom format and then applying it to a data set*/
*create the format;
proc format;
value age_group
low -< 13 = 'Pre-Teen' /*excludes 13, so age 13 is counted as Teen*/
13 - 15 = 'Teen'
16 - high = 'Adult';
run;
title 'Example of an applied format';
proc print data=sashelp.class;
format age age_group.; *applies the format;
run;
data class;
set sashelp.class;
age_category = put(age, age_group.); *creates a character variable with the age category;
label age_category = 'Age Category'; *adds a nice label for the printed output;
run;
title 'Example of creating a new variable with the format';
proc print data=class label;
run;
*show format used directly;
proc freq data=sashelp.class;
table age / out= formatted_age;
format age age_group.;
run;
https://github.com/statgeek/SAS-Tutorials/blob/master/proc_format_example.sas
The answers that have been posted contain the right information. I'm just concerned that the explanations might be difficult for a new user to digest.
Here is the change in context:
PROC FREQ;
TABLES internetuserate;
format internetuserate internetuserate_adj. ;
LABEL internetuserate = " Internet User Rate ";
RUN;
This code assumes that you have already run PROC FORMAT to create the format named internetuserate_adj.
Yes, it is just difficult as a new user. I got the following to run, but the results contain values other than just 1-10. I think decimals are causing the issue. Do you know how to specify greater-than and less-than ranges in PROC FORMAT?
LIBNAME mydata "/courses/d1406ae5ba27fe300 " access=readonly;
DATA new; set mydata.gapminder;
proc format;
value internetuserate_adj
low - 10 = '1'
11 - 20 = '2'
21 - 30 = '3'
31 - 40 = '4'
41 - 50 = '5'
51 - 60 = '6'
61 - 70 = '7'
71 - 80 = '8'
81 - 90 = '9'
91 - 100 = '10';
PROC FREQ;
TABLES internetuserate;
format internetuserate internetuserate_adj. ;
LABEL internetuserate = " Internet User Rate ";
RUN;
Your help is much appreciated.
Thanks,
Tim
Once you know the syntax, it's very do-able. However, defining the problem comes first.
What groups should these values fall into?
10.001
0
-99
Here's the strangest part of the PROC FORMAT syntax:
proc format;
value internetuserate_adj
low - 10 = '1'
10 <- 20 = '2'
20 <- 30 = '3'
30 <- 40 = '4'
40 <- 50 = '5'
50 <- 60 = '6'
60 <- 70 = '7'
70 <- 80 = '8'
80 <- 90 = '9'
90 <- 100 = '10';
run;
The strange placement of "<" means (for example) that 80 exactly is part of group "8" and 80.001 is part of group "9".
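One way to see where the boundaries fall is to push a few test values through the format with PUT. This is only a sketch: the format name rate_bin_demo and the data set boundary_check are made up so that nothing you have already defined gets overwritten, and the ranges are copied from the format above.

```sas
/* same ranges as above, under a hypothetical name */
proc format;
  value rate_bin_demo
    low - 10  = '1'
    10 <- 20  = '2'
    20 <- 30  = '3'
    30 <- 40  = '4'
    40 <- 50  = '5'
    50 <- 60  = '6'
    60 <- 70  = '7'
    70 <- 80  = '8'
    80 <- 90  = '9'
    90 <- 100 = '10';
run;

/* a handful of boundary values, read inline */
data boundary_check;
  input rate @@;
  bucket = put(rate, rate_bin_demo.);  /* formatted value as a character variable */
  datalines;
0 10 10.001 80 80.001 100
;
run;

/* 10 lands in bucket 1 and 10.001 in bucket 2;
   80 lands in bucket 8 and 80.001 in bucket 9 */
proc print data=boundary_check noobs;
run;
```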
10 would go in bucket 2.
There are no negative numbers.
0 would go into bucket 1.
If 10 exactly should go into bucket 2, a slight change is required:
proc format;
value internetuserate_adj
low -< 10 = '1'
10 -< 20 = '2'
20 -< 30 = '3'
30 -< 40 = '4'
40 -< 50 = '5'
50 -< 60 = '6'
60 -< 70 = '7'
70 -< 80 = '8'
80 -< 90 = '9'
90 - 100 = '10';
run;
In general, it's a mistake to make assumptions about what will or won't be in the data. But if you are confident about the values all falling in the range of 0 to 100, that part is your decision.
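If you would rather not rely on that assumption, one possible approach is to give missing and out-of-range values their own labels so they are visible instead of being silently binned or printed unformatted. This is a sketch only: the format name rate_bin_safe, the data set range_check, and the label text are all made up for illustration, and starting bucket 1 at 0 instead of LOW is my own choice here.

```sas
proc format;
  value rate_bin_safe
    .         = 'Missing'
    0 -< 10   = '1'
    10 -< 20  = '2'
    20 -< 30  = '3'
    30 -< 40  = '4'
    40 -< 50  = '5'
    50 -< 60  = '6'
    60 -< 70  = '7'
    70 -< 80  = '8'
    80 -< 90  = '9'
    90 - 100  = '10'
    other     = 'Out of range';  /* anything not matched above, e.g. -99 or 150 */
run;

/* test values: missing, negative, boundaries, and above 100 */
data range_check;
  input rate @@;
  bucket = put(rate, rate_bin_safe.);
  datalines;
. -99 0 10 99.9 100 150
;
run;

proc print data=range_check noobs;
run;
```

With this version, -99 and 150 show up as 'Out of range', which makes unexpected values easy to spot before you trust the table.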