KatieMessi
Calcite | Level 5

I'm using SAS Studio online. I'm having trouble setting the character length, and the last IF statement in my code won't work. Any help would be greatly appreciated.

This is my code...

 

Data golf;
infile '/home/c153469110/golf.txt';
input Distance_yards 18-22
Golfer_ID 28-29
Brand $ 39;
run;

proc print data=golf;
run;

Data golf;

set golf;
length Distance_Catagory $20;


if Distance_Yards<290 then do;
Distance_Catagory='Poor';
end;

if Distance_Yards>=290 and Distance_Yards<=310 then do;
Distance_Catagory='Average';
end;

if Distance_Yards<=310 then do;
Distance Catagory='Above Average';
end;


run;
r_behata
Barite | Level 11

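Your IF conditions should be mutually exclusive. The last one, Distance_Yards<=310, overlaps the first two and should be Distance_Yards > 310. The assignment in that block is also missing the underscore in Distance_Catagory: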

 

Data golf;
	
	set golf;
	length Distance_Catagory $20;
	 
	 
	 if Distance_Yards<290 then do;
	 Distance_Catagory='Poor';
	 end;
	 
	 if 290 <= Distance_Yards <=310 then do;
	 Distance_Catagory='Average';
	 end;
	
	 if Distance_Yards > 310 then do;
	 Distance_Catagory='Above Average';
	 end;
	 
	 
run;
	
Reeza
Super User

Just going to add two things to this.

 

When the input and output data sets have the same name, the code becomes harder to debug and you overwrite your source data, so you have to read it in again. For small data sets this is fine; for larger ones it can be quite problematic. Also, Category is spelled incorrectly 😞

 

Data golf; *choose a different name to avoid overwriting your original data;
	
	set golf;
	length Distance_Catagory $20; *spelled incorrectly;
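Putting both suggestions together, a minimal sketch, assuming the original data set is WORK.golf as read in the question. The output name golf_cat and the corrected variable name Distance_Category are placeholders chosen for illustration, not names from the original post.

```sas
/* Write to a new data set so WORK.golf stays intact for re-runs. */
data golf_cat;                      /* hypothetical output name */
    set golf;
    length Distance_Category $13;   /* corrected spelling; 13 = length of 'Above Average' */

    /* Note: a missing Distance_Yards compares as less than 290 and would be labelled 'Poor'. */
    if Distance_Yards < 290 then Distance_Category = 'Poor';
    else if Distance_Yards <= 310 then Distance_Category = 'Average';
    else Distance_Category = 'Above Average';
run;
```

If the categories turn out wrong, the step can simply be rerun, because golf itself was never modified.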


Shmuel
Garnet | Level 18

1) As the longest Distance_Catagory value is 'Above Average', the minimum required length is $13.

    Defining the length as $20 will not take up 20 characters when printed, only 13.

 

2) The code proposed by @r_behata can be shortened to:

Data golf;
	set golf;
	length Distance_Catagory $20;
	 
	 if Distance_Yards <  290 then Distance_Catagory='Poor'; else
	 if Distance_Yards <= 310 then Distance_Catagory='Average'; else
	                               Distance_Catagory='Above Average';
run;
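On point 1, a small sketch of why the LENGTH statement matters and why it has to come before the first assignment. Without it, SAS takes the variable's length from the first value assigned in the code ('Poor', 4 characters) and truncates longer values. The data set name len_demo, the variable Cat and the inline values are made up for illustration.

```sas
data len_demo;                      /* hypothetical demo data set */
    input Distance_Yards;
    /* No LENGTH statement: Cat gets length 4 from 'Poor' at compile time. */
    if Distance_Yards < 290 then Cat = 'Poor';
    else if Distance_Yards <= 310 then Cat = 'Average';
    else Cat = 'Above Average';
    datalines;
280
300
320
;
run;

proc print data=len_demo;           /* Cat prints as Poor, Aver, Abov */
run;
```

Adding length Cat $13; before the first IF statement removes the truncation.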

 

ballardw
Super User

For a large number of uses you do not need to add a variable to accomplish this:

	 
	 if Distance_Yards<290 then do;
	 Distance_Catagory='Poor';
	 end;
	 
	 if Distance_Yards>=290 and Distance_Yards<=310 then do;
	 Distance_Catagory='Average';
	 end;
	
	 if Distance_Yards<=310 then do;
	 Distance Catagory='Above Average';
	 end;

When a single variable needs to be displayed differently, or groups need to be created, a format is often a more flexible tool. Please see:

 

proc format library=work;
value gd
0   -< 290  = 'Poor'
290 -  310  = 'Average'
310 <- high = 'Above Average'
;
run;

proc freq data=golf;
   tables distance_yards;
   format distance_yards gd.;
run;
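If a character variable is still needed, for example for an export, the same format can create it through the PUT function. A sketch, assuming the gd format exists in WORK and golf contains Distance_Yards; the names golf_fmt and Distance_Group are placeholders.

```sas
data golf_fmt;                                   /* hypothetical output name */
    set golf;
    length Distance_Group $13;                   /* long enough for 'Above Average' */
    Distance_Group = put(Distance_Yards, gd.);   /* formatted value stored as text */
run;
```

Values that fall outside every range of the format, such as missing or negative distances, come back as the plain number rather than a label.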

One of the major flexibilities is that if I decide that I want to have a 5 category response I create a new format and then use the new format with the analysis or data display. No need to go back through the data to create another variable. Formats may also be made from data sets. So you could possibly use something like Proc Rank to create 10 or 15 groups.
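As a rough sketch of building a format from a data set, the same three groups could be fed to PROC FORMAT through the CNTLIN= option. The control data set name gd_ctrl is made up; FMTNAME, TYPE, START, END, SEXCL, EEXCL, HLO and LABEL are the variable names PROC FORMAT expects in a control data set.

```sas
data gd_ctrl;                                    /* hypothetical control data set */
    length fmtname $8 type $1 label $13 sexcl eexcl hlo $1;
    fmtname = 'gd';                              /* name of the format to create */
    type    = 'N';                               /* numeric format */
    /* sexcl/eexcl = 'Y' excludes the start/end value; hlo = 'H' means the range ends at HIGH */
    start = 0;   end = 290; sexcl = 'N'; eexcl = 'Y'; hlo = ' '; label = 'Poor';          output;
    start = 290; end = 310; sexcl = 'N'; eexcl = 'N'; hlo = ' '; label = 'Average';       output;
    start = 310; end = .;   sexcl = 'Y'; eexcl = 'N'; hlo = 'H'; label = 'Above Average'; output;
run;

proc format library=work cntlin=gd_ctrl;
run;
```

In practice the rows would come from another step, such as cut points derived from data, rather than being typed in by hand.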

 

With numeric ranges, a < next to the hyphen excludes the endpoint on that side:

0 -< 290 means [0, 290): 0 is included, and so are values up to but not including 290

290 - 310 means [290, 310]: both endpoints are included

310 <- high means (310, infinity): 310 is excluded and everything above it is included.

The key words LOW and HIGH are used for the smallest and largest values SAS will recognize.

This approach also makes it easy to ensure non-overlapping groups.

 

BTW you can rewrite

Distance_Yards>=290 and Distance_Yards<=310 

as either of these forms

290 <= Distance_yards <= 310
290 le Distance_yards le 310

which many programming languages don't allow.

 
