I have a concatenated data set from 2 data sets that looks something like this:
ID Gender Starting_Weight End_Weight 1 M 170 180 2 M 182 175 3 M 155 162 4 M 203 196 5 F 124 135 6 F 152 155 7 F 118 133 8 F 160 152
There are more observations, but what I want to do is remove all rows where the change between the weights is less than 10. Here is my atttempt:
data WeightsCombined; set WeightsMale WeightsFemale; Weight_change = sum(-Starting_Weight, End_Weight); if Weight_change<10 then delete; run; proc print data = WeightsCombined; run;
For some reason, this outputs nothing. Why is an output not showing and how can I remove the rows that meet my condition?
Your logic should at least delete some rows from the sample data you've provided.
If it's just about change then I'd be using the abs() function. To not mask cases with missings I wouldn't use the sum() function.
data weight_data;
input ID Gender $ Starting_Weight End_Weight;
del_flg= 0<=abs(Starting_Weight-End_Weight)<10;
/* if 0<=abs(Starting_Weight-End_Weight)<10 then delete;*/
datalines;
1 M 170 180
2 M 182 175
3 M 155 162
4 M 203 196
5 F 124 135
6 F 152 155
7 F 118 133
8 F 160 152
9 F . 152
;
run;
proc print data=weight_data;
run;
Another approach using ABS function. If we need to read only observations where change is greater than or equals 10 then where statement is more efficient in terms of system processing.
data want;
set have;
where abs(starting_weight - end_weight) ge 10;
run;
Your SET statement looks wrong to me.
set WeightsMale WeightsFemale;
Do you really have the data in two separate datasets? Do they have the same variables?
Why use the SUM() function? That will yield invalid results when only one of the readings is missing.
Let's save the intermediate calculations into variables.
data have;
input ID Gender :$1. Starting_Weight End_Weight;
cards;
1 M 170 180
2 M 182 175
3 M 155 162
4 M 203 196
5 F 124 135
6 F 152 155
7 F 118 133
8 F 160 152
;
data want;
set have;
weight_gain = end_weight - starting_weight;
lessthan10 = weight_gain < 10;
run;
Here is the result for your example:
Which observations do you want to delete? Your current logic would eliminate all but 3 of the observations, the ones where LESSTHAN10 variable has value 0.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.