mulltr66
Calcite | Level 5

Hello,

 

I'm new to SAS and wanted to see if someone can help me with a frequency table. The following code is working but the tables do not look correct for the internetuserate and incomeperperson variables. Can someone help me add code to create ranges (i.e., bins) for these values?

 

Thanks,

 

Tim

 

 

DATA new; set mydata.gapminder;
PROC SORT; by Country;
LABEL internetuserate = " Internet User Rate"
polityscore = " Democracy Score"
incomeperperson = "Income Per Person";

PROC FREQ; TABLES internetuserate polityscore incomeperperson;
RUN;

  

PaigeMiller
Diamond | Level 26

@mulltr66 wrote:

The following code is working but the tables do not look correct for internetuserate and incomeperperson variables.  


We don't know what this means. Can you explain further about what is wrong; and show us the exact thing that is incorrect so we can see what you are seeing?

--
Paige Miller
mulltr66
Calcite | Level 5

Attached is the data set. I'm trying to create buckets for two variables. Example below for the internetuserate variable:

 


PROC SORT; by Country;

proc format;
value internetuserate_adj
low - 10 = '1'
11 - 20 = '2'
21 - 30 = '3'
31 - 40 = '4'
41 - 50 = '5'
51 - 60 = '6'
61 - 70 = '7'
71 - 80 = '8'
81 - 90 = '9'
91 - 100 = '10'


format internetuserate internetuserate_adj; * applies the format


LABEL internetuserate = " Internet User Rate "



PROC FREQ; TABLES internetuserate_adj;
RUN;

ballardw
Super User

If you ran that code as shown you have lots of problems.

I don't see anywhere that you have told PROC FREQ to use the format with the variable. Your FORMAT statement above is also incorrect: it needs the variable name followed by the format name, with a PERIOD at the end of the format name.

 

If you have a statement like:

Format somevarname ;

without an actual format name then you are clearing the previously assigned format, if any, to a default for the variable type. For numeric variables that is usually BEST12.

Reeza
Super User
Your FORMAT and LABEL statements need to be moved into your PROC FREQ step. See the examples I posted in this thread.
mulltr66
Calcite | Level 5

internetuserate has distinct values.  I'm trying to put the results into a 10 bucket frequency table.

 

I had the naming wrong because I thought I needed to create a variable.

Reeza
Super User

Here's an example of how to create a format to bin your data. You can run this code on your system and see the output.

 

/*this is an example of creating a custom format and then applying it to a data set*/

*create the format;
proc format;
value age_group
low -< 13 = 'Pre-Teen'
13 - 15 = 'Teen'
16 - high = 'Adult';
run;

title 'Example of an applied format';
proc print data=sashelp.class;
format age age_group.; *applies the format;
run;


data class;
set sashelp.class;
age_category = put(age, age_group.); *creates a character variable with the age category;
label age_category = 'Age Category'; *adds a nice label for the printed output;
run;

title 'Example of creating a new variable with the format';
proc print data=class label;
run;

*show format used directly;

proc freq data=sashelp.class;
table age / out= formatted_age;
format age age_group.;
run;

https://github.com/statgeek/SAS-Tutorials/blob/master/proc_format_example.sas

 


Astounding
PROC Star

The answers that have been posted contain the right information. I'm just concerned that, for a new user, the explanations might be difficult to digest.


Here is the change in context:

 


PROC FREQ; 
TABLES internetuserate;
format internetuserate internetuserate_adj. ; 
LABEL internetuserate = " Internet User Rate ";
RUN;

This code assumes that you have already run PROC FORMAT to create the format named internetuserate_adj.

mulltr66
Calcite | Level 5

Yes, it's just difficult as a new user. I got the following to run, but the results have values other than just 1-10. I think decimals are causing the issue. Do you know how to do greater than and less than in PROC FORMAT?

 

LIBNAME mydata "/courses/d1406ae5ba27fe300 " access=readonly;
DATA new; set mydata.gapminder;


proc format;
value internetuserate_adj
low - 10 = '1'
11 - 20 = '2'
21 - 30 = '3'
31 - 40 = '4'
41 - 50 = '5'
51 - 60 = '6'
61 - 70 = '7'
71 - 80 = '8'
81 - 90 = '9'
91 - 100 = '10';


PROC FREQ;
TABLES internetuserate;
format internetuserate internetuserate_adj. ;
LABEL internetuserate = " Internet User Rate ";

RUN;

 

help was much appreciated.

 

Thanks,

 

Tim

Astounding
PROC Star

Once you know the syntax, it's very do-able.  However, defining the problem comes first.

 

What groups should these values fall into?

 

10.001

0

-99

 

Here's the strangest part of the PROC FORMAT syntax:

proc format;
value internetuserate_adj
low - 10 = '1'
10 <- 20 = '2'
20 <- 30 = '3'
30 <- 40 = '4'
40 <- 50 = '5'
50 <- 60 = '6'
60 <- 70 = '7'
70 <- 80 = '8'
80 <- 90 = '9'
90 <- 100 = '10';
run;

The strange placement of "<" means (for example) that 80 exactly is part of group "8" and 80.001 is part of group "9".
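
One way to check this boundary behavior is to push a few test values through PUT() and read the results in the log. The sketch below is not from the original posts: the format names gap_demo and edge_demo and the test values are made up for illustration, and only three buckets are defined to keep it short. gap_demo repeats the integer-only style of ranges from the earlier attempt, so a value such as 10.5 matches no range and is not mapped to a bucket label. edge_demo uses the "<-" ranges, which leave no gaps.

```sas
proc format;
  /* hypothetical format: integer-only ranges leave gaps (e.g. between 10 and 11) */
  value gap_demo
    low - 10 = 'Bucket 1'
    11 - 20  = 'Bucket 2'
    21 - 30  = 'Bucket 3';

  /* hypothetical format: "<-" excludes the low endpoint, so the ranges join up */
  value edge_demo
    low - 10 = 'Bucket 1'
    10 <- 20 = 'Bucket 2'
    20 <- 30 = 'Bucket 3';
run;

data _null_;
  /* made-up test values sitting on and just past the range edges */
  do x = 0, 10, 10.001, 10.5, 20, 20.001;
    with_gaps  = put(x, gap_demo.);   /* formatted value under the gapped ranges */
    with_edges = put(x, edge_demo.);  /* formatted value under the "<-" ranges   */
    put x= with_gaps= with_edges=;    /* writes one line per value to the log    */
  end;
run;
```

Comparing the two columns in the log shows which values fall through the gaps and which side of each boundary an exact 10 or 20 lands on.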

mulltr66
Calcite | Level 5

10 would go in bucket 2

 

no negative numbers

 

0 would go into bucket 1

Astounding
PROC Star

If 10 exactly should go into bucket 2, a slight change is required:

 

proc format;
value internetuserate_adj
low -< 10 = '1'
10 -< 20 = '2'
20 -< 30 = '3'
30 -< 40 = '4'
40 -< 50 = '5'
50 -< 60 = '6'
60 -< 70 = '7'
70 -< 80 = '8'
80 -< 90 = '9'
90 - 100 = '10';
run;

In general, it's a mistake to make assumptions about what will or won't be in the data.  But if you are confident about the values all falling in the range of 0 to 100, that part is your decision.
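
If you would rather not rely on that assumption, one option is to give missing and unexpected values their own labels so they show up in the table instead of passing through unformatted. This is only a sketch: the format name rate_bucket, the data set demo, and its values are invented for illustration, and the bucket edges follow the rule stated above (0 in bucket 1, 10 exactly in bucket 2).

```sas
proc format;
  value rate_bucket                /* hypothetical format name */
    .          = 'Missing'
    0   -< 10  = '1'
    10  -< 20  = '2'
    20  -< 30  = '3'
    30  -< 40  = '4'
    40  -< 50  = '5'
    50  -< 60  = '6'
    60  -< 70  = '7'
    70  -< 80  = '8'
    80  -< 90  = '9'
    90  -  100 = '10'
    other      = 'Out of range';   /* catches anything not listed above, e.g. -99 or 120 */
run;

/* made-up values, including a negative, a value above 100, and a missing value */
data demo;
  input internetuserate @@;
  datalines;
0 9.99 10 55.5 100 -99 120 .
;
run;

proc freq data=demo;
  tables internetuserate / missing;          /* MISSING keeps the missing level in the table */
  format internetuserate rate_bucket.;       /* note the period after the format name */
  label internetuserate = "Internet User Rate";
run;
```

With the real data you would replace demo with your own data set. If the 'Out of range' row is empty, the assumption about 0 to 100 held.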
