Lesson 4 creating new columns practice p104p04.sas

melissagodfrey · Posted 01-15-2019 04:40 PM

Instructions in Green

***********************************************************;
* LESSON 4, PRACTICE 4 *;
* a) Create a new column named SqMiles by multiplying *;
* Acres by .0015625. *;
* b) Create a new column named Camping as the sum of *;
* OtherCamping, TentCampers, RVCampers, and *;
* BackcountryCampers. *;
* c) Format SqMiles and Camping to include commas and *;
* zero decimal places. *;
* d) Modify the KEEP statement to include the new *;
* columns. Run the program. *;
***********************************************************;

My answer in Pink:

data np_summary_update;
set pg1.np_summary;
keep Reg ParkName DayVisits OtherLodging Acres SqMiles Camping;
*Add assignment statements;
SqMiles=Acres*.0015625;
Camping=sum(OtherCamping, TentCampers, RVCampers, BackcountryCampers);
format SqMiles Camping comma.;
run;

What is the difference between format SqMiles Camping comma. and the solution format below?

format SqMiles comma6. Camping comma10.;

It appears that comma. is more flexible: it simply adds commas to the number, and the period with no number after it ensures no decimals. Why would you need to specify the number of digits, e.g. comma6.?

Thanks!

Reeza · Posted 01-15-2019 07:21 PM

A format takes the form:

formatNameW.d

formatName is the name of the format, e.g. comma

W is the total width of the output field. It tells SAS how many characters to use to display the value, including any commas, the decimal point, and a sign

d is the decimal portion: how many decimal places to show.

Note that with a width of 10 and d=3, the decimal portion actually takes up 4 positions (the decimal point plus 3 digits), so

F10.3 means you have a number of the form ######.###
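As a minimal sketch of how W and d combine, the step below writes one value to the log with a few widths. The value 123456.789 is a made-up sample, not data from the course.

```sas
data _null_;
   x = 123456.789;     /* hypothetical sample value */
   put x f10.3;        /* 6 integer digits + decimal point + 3 decimals = 10 positions */
   put x comma11.3;    /* the thousands separator needs one more position: 11 */
   put x comma8.;      /* d omitted, so no decimal places are shown */
run;
```

Counting every character that has to appear (digits, commas, decimal point, sign) is a quick way to choose W.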

This is all documented here

https://support.sas.com/documentation/cdl/en/leforinforref/64790/HTML/default/viewer.htm#n134ahpcz8m...

PS. Instead of using colour to highlight your code please consider using the code blocks (7th icon in editor) which ensures your code is posted correctly.

ballardw · Posted 01-15-2019 07:25 PM

If you read the documentation for the COMMAw.d format it has an important section regarding W:

w

specifies the width of the output field.

Default 6

This means that if you do not specify a W value, the format will attempt to display the value in only 6 characters, i.e. as COMMA6.

When you use COMMA10., it allows more characters in the display.
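A small sketch of the default width in action, using two made-up values (the variable names small and large are only for illustration):

```sas
data _null_;
   small = 1234;        /* hypothetical value: 1,234 fits in 6 positions */
   large = 1234567;     /* hypothetical value: 1,234,567 needs 9 positions */
   put small= comma. small= comma6.;    /* both requests use a width of 6 */
   put large= comma. large= comma10.;   /* only comma10. has room for the commas */
run;
```

The first line in the log should show the same text twice, since comma. is treated as comma6. On the second line, the comma. request does not have room for the full value, so SAS has to fall back on some other representation, while comma10. can show it as requested.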

Always consider what you want to display in the output. If your width, W, is not large enough to hold the digits before the decimal point, the commas (or parentheses, % sign, or what have you), the decimal point itself, and any decimal places you asked for, then the result will get truncated.

Consider this code (look in the log for the result):

data _null_;
   x=123456789.45678;
   put x=comma6. x=comma10. x=f13.5 x=f15.5 x=comma15.5 x=comma19.5;
run;

The result shows the same value displayed with a number of formats. You might find the 1st and 5th very interesting, as SAS takes alternate approaches to "best fitting" the displayed value to your request. Also see the note in the log about the format being too small.

Remember also that when you have a list of variables in a FORMAT statement, each format is applied to all of the variables listed before it (back to the previous format, if there is one).
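To illustrate with the two FORMAT statements from the question, here is a sketch on a tiny made-up data set (work.demo and its values are hypothetical, not the course data):

```sas
data work.demo;                      /* hypothetical one-row data set */
   SqMiles = 1234.5;
   Camping = 1234567;
   /* one format at the end: applies to both variables listed before it */
   format SqMiles Camping comma10.;
run;

data work.demo2;
   set work.demo;
   /* each variable takes the format that follows it */
   format SqMiles comma6. Camping comma10.;
run;

/* the Format column of the variable list shows which format each variable ended up with */
proc contents data=work.demo2;
run;
```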

Every format has a default width, and sometimes you may find funny results such as 2E3 instead of an expected 2019 because the format has been set to BEST3., which reduces the number of positions allowed, so SAS goes to scientific notation.
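The BEST3. case can be reproduced with a short sketch like this (the variable name year is just an assumption for the example):

```sas
data _null_;
   year = 2019;
   put year= best3.;    /* only 3 positions for a 4-digit value */
   put year= best4.;    /* wide enough for all four digits */
run;
```

Comparing the two lines in the log shows what one extra position of width changes.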

melissagodfrey · Posted 01-15-2019 08:18 PM

Wonderful, thank you, that was exactly the answer I was looking for. It makes sense that the default is 6 digits; now I understand why they allow you to specify. Thanks a bunch!
