Hi, I'm new to SAS and I'm struggling with this homework problem. Any help is greatly appreciated! Thank you.
1. For each department, use a two-way table to show both the number of employees whose salar01 is above 30,000
and also the number of employees whose salarie01 is not above 30,000;
Employee_id Name gender years dept salary01 salary02 salary03; 1 Mitchell, Jane A f 6 shoe 22,450 23,000 26,600 2 Miller, Frances T f 8 appliance . 32,500 33,000 3 Evans, Richard A m 9 appliance 42,900 43,900 . 4 Fair, Suzanne K f 3 clothing 29,700 32,900 34,500 5 Meyers, Thomas D m 5 appliance 33,700 34,400 37,000 6 Rogers, Steven F m 3 shoe 27,000 27,800 . 7 Anderson, Frank F m 5 clothing 33,000 35,100 36,000 10 Baxter, David T m 2 shoe 23,900 . 31,300 11 Wood, Brenda L f 3 clothing 33,000 34,000 35,700 12 Wheeler, Vickie M f 7 appliance 31,500 33,200 35,600 13 Hancock, Sharon T f 1 clothing 21,000 . 22,500 14 Looney, Roger M m 10 appliance 42,900 36,200 37,800 15 Fry, Marie E f 6 clothing 29,700 30,500 31,200
You should post code on this forum in a code box opened with either the {I} or "running man" icon to prevent the message windows from reformatting text. If your code runs in you SAS system then likely there are spaces that were reformatted out by the message window.
You aren't using a long enough informat for dept to hold the entire value.
One way to group continuous variables is to create a format to display a range of values with given text. An example that should work:
proc format library=work; value sal low - 30000= '30k or less' 30000 < high='Greater than 30k' ; run; proc freq data= aa; tables dept*salary01; format salary01 sal.; run;
Or create variable and conditionally assign a value based on the range you want:<this does not run as the data step input code is malformed>
data aa; infile datalines; input Employee_id Name $17. gender$ years dept$ salary01: comma6. salary02: comma6. salary03: comma6.; if salary01 > 30000 then Salaryrange = 'Greater than 30k'; else Salaryrange = '30k or less'; datalines; 1 Mitchell, Jane A f 6 shoe 22,450 23,000 26,600 2 Miller, Frances T f 8 appliance . 32,500 33,000 3 Evans, Richard A m 9 appliance 42,900 43,900 . 4 Fair, Suzanne K f 3 clothing 29,700 32,900 34,500 5 Meyers, Thomas D m 5 appliance 33,700 34,400 37,000 6 Rogers, Steven F m 3 shoe 27,000 27,800 . 7 Anderson, Frank F m 5 clothing 33,000 35,100 36,000 10 Baxter, David T m 2 shoe 23,900 . 31,300 11 Wood, Brenda L f 3 clothing 33,000 34,000 35,700 12 Wheeler, Vickie M f 7 appliance 31,500 33,200 35,600 13 Hancock, Sharon T f 1 clothing 21,000 . 22,500 14 Looney, Roger M m 10 appliance 42,900 36,200 37,800 15 Fry, Marie E f 6 clothing 29,700 30,500 31,200 ; run; proc freq data= aa; tables dept*salaryrange; run;
Procs Report and Tabulate will also do this using formats for the groups.
You can create code to manually count the values but and output them as a data step + Proc print but that's not the best use of the SAS system or your time. Proc Freq will have things in the output you may not want, such as a bunch of percentages and row/column summaries. The syntax to suppress them is pretty simple to find in the Tables statement options for Proc Freq in the documentation.
You should post code on this forum in a code box opened with either the {I} or "running man" icon to prevent the message windows from reformatting text. If your code runs in you SAS system then likely there are spaces that were reformatted out by the message window.
You aren't using a long enough informat for dept to hold the entire value.
One way to group continuous variables is to create a format to display a range of values with given text. An example that should work:
proc format library=work; value sal low - 30000= '30k or less' 30000 < high='Greater than 30k' ; run; proc freq data= aa; tables dept*salary01; format salary01 sal.; run;
Or create variable and conditionally assign a value based on the range you want:<this does not run as the data step input code is malformed>
data aa; infile datalines; input Employee_id Name $17. gender$ years dept$ salary01: comma6. salary02: comma6. salary03: comma6.; if salary01 > 30000 then Salaryrange = 'Greater than 30k'; else Salaryrange = '30k or less'; datalines; 1 Mitchell, Jane A f 6 shoe 22,450 23,000 26,600 2 Miller, Frances T f 8 appliance . 32,500 33,000 3 Evans, Richard A m 9 appliance 42,900 43,900 . 4 Fair, Suzanne K f 3 clothing 29,700 32,900 34,500 5 Meyers, Thomas D m 5 appliance 33,700 34,400 37,000 6 Rogers, Steven F m 3 shoe 27,000 27,800 . 7 Anderson, Frank F m 5 clothing 33,000 35,100 36,000 10 Baxter, David T m 2 shoe 23,900 . 31,300 11 Wood, Brenda L f 3 clothing 33,000 34,000 35,700 12 Wheeler, Vickie M f 7 appliance 31,500 33,200 35,600 13 Hancock, Sharon T f 1 clothing 21,000 . 22,500 14 Looney, Roger M m 10 appliance 42,900 36,200 37,800 15 Fry, Marie E f 6 clothing 29,700 30,500 31,200 ; run; proc freq data= aa; tables dept*salaryrange; run;
Procs Report and Tabulate will also do this using formats for the groups.
You can create code to manually count the values but and output them as a data step + Proc print but that's not the best use of the SAS system or your time. Proc Freq will have things in the output you may not want, such as a bunch of percentages and row/column summaries. The syntax to suppress them is pretty simple to find in the Tables statement options for Proc Freq in the documentation.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.