BookmarkSubscribeRSS Feed
unwashedhelimix
Obsidian | Level 7

I have a concatenated data set from 2 data sets that looks something like this:

 

 

ID     Gender   Starting_Weight    End_Weight

1        M            170                       180
2        M            182                       175
3        M            155                       162
4        M            203                       196
5        F            124                       135
6        F            152                       155
7        F            118                       133
8        F            160                       152

There are more observations, but what I want to do is remove all rows where the change between the weights is less than 10. Here is my atttempt:

data WeightsCombined;
	set WeightsMale WeightsFemale;
	Weight_change = sum(-Starting_Weight, End_Weight);
	if Weight_change<10 then delete;
run;

proc print data = WeightsCombined;
run;

For some reason, this outputs nothing. Why is an output not showing and how can I remove the rows that meet my condition?

 

3 REPLIES 3
Patrick
Opal | Level 21

Your logic should at least delete some rows from the sample data you've provided. 

If it's just about change then I'd be using the abs() function. To not mask cases with missings I wouldn't use the sum() function.

data weight_data;
  input ID Gender $ Starting_Weight End_Weight;
  del_flg=  0<=abs(Starting_Weight-End_Weight)<10;
/*  if 0<=abs(Starting_Weight-End_Weight)<10 then delete;*/
  datalines;
1 M 170 180
2 M 182 175
3 M 155 162
4 M 203 196
5 F 124 135
6 F 152 155
7 F 118 133
8 F 160 152
9 F . 152
;
run;

proc print data=weight_data;
run;
A_Kh
Lapis Lazuli | Level 10

Another approach using ABS function. If we need to read only observations where change is greater than or equals 10 then  where statement is more efficient in terms of system processing. 

data want;
	set have;
	where abs(starting_weight - end_weight) ge 10;
run; 
Tom
Super User Tom
Super User

Your SET statement looks wrong to me.

set WeightsMale WeightsFemale;

Do you really have the data in two separate datasets?  Do they have the same variables? 

 

Why use the SUM() function?  That will yield invalid results when only one of the readings is missing.

 

Let's save the intermediate calculations into variables.

data have;
  input ID Gender :$1. Starting_Weight End_Weight;
cards;
1 M 170 180
2 M 182 175
3 M 155 162
4 M 203 196
5 F 124 135
6 F 152 155
7 F 118 133
8 F 160 152
;

data want;
  set have;
  weight_gain = end_weight - starting_weight;
  lessthan10 = weight_gain < 10;
run;

Here is the result for your example:

Tom_0-1729511830097.png

Which observations do you want to delete?  Your current logic would eliminate all but 3 of the observations, the ones where LESSTHAN10 variable has value 0.

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 297 views
  • 0 likes
  • 4 in conversation