Hi.
I am reviewing SAS from the book Learning SAS by Example. I am trying to do everything without looking in the book. However, on this problem, several issues came up.
1. Why can't I use 'Combined', which I create with an assignment statement after Set Learn.Blood, in the WHERE statement?
2. Why don't the OUTPUT statements generate Subset_A and Subset_B as written in the code below? When I PROC PRINT Subset_A and Subset_B, SAS tells me that they do not exist. Thanks for your help.
Libname Review'/folders/myfolders/Review' ;
Libname Learn'/folders/myfolders/Learn' ;
Libname myformat'/folders/myfolders/sasuser.v94' ;
Options fmtsearch=(myformat) ;

Data Review.Prob10_1 ;
   Set Learn.Blood ;
   Combined = Sum(.001*WBC + RBC) ;
   Where Gender = 'Female' AND BloodType = 'AB' ;
   Output = Subset_A ;
   Where Gender = 'Female' AND BloodType = 'AB' AND Combined ge 14 ;
   Output = Subset_B ;
run ;

Proc Print data=Subset_A noobs ;
run ;

Proc Print data=Subset_B noobs ;
run ;
Use IF instead of WHERE:
Where Gender = 'Female' AND BloodType = 'AB' AND Combined ge 14 ;
ERROR: Variable Combined is not on file LEARN.BLOOD.
if Gender = 'Female' AND BloodType = 'AB' AND Combined ge 14 ;
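Data Review.Prob10_1 Subset_A Subset_B ;
   Set Learn.Blood ;
   Combined = Sum(.001*WBC + RBC) ;
   /* IF-THEN (not a subsetting IF followed by ELSE) so each test routes the row on its own */
   if Gender = 'Female' AND BloodType = 'AB' then Output Subset_A ;
   /* A separate IF, not ELSE IF: rows that meet this test also meet the one above */
   if Gender = 'Female' AND BloodType = 'AB' AND Combined ge 14 then Output Subset_B ;
   /* Note: Review.Prob10_1 receives no rows unless an OUTPUT statement names it */
run ;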
I should also include the original data set learn.blood. It is below.
data learn.blood;
   length Gender $ 6 BloodType $ 2 AgeGroup $ 5;
   input Subject Gender BloodType AgeGroup WBC RBC Chol;
   label Gender = "Gender"
         BloodType = "Blood Type"
         AgeGroup = "Age Group"
         Chol = "Cholesterol";
Datalines ;
1 Female AB Young 7710 7.4 258
2 Male AB Old 6560 4.7 .
3 Male A Young 5690 7.53 184
4 Male B Old 6680 6.85 .
5 Male A Young . 7.72 187
6 Male A Old 6140 3.69 142
7 Female A Young 6550 4.78 290
8 Male O Old 5200 4.96 151
9 Male O Young . 5.66 311
10 Female O Young 7710 5.55 .
11 Male B Young . 5.62 152
12 Female O Young 7410 5.85 241
13 Male O Young 5780 4.37 .
Below is the LOG:
1    OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
61
62   Libname Review'/folders/myfolders/Review' ;
NOTE: Libref REVIEW was successfully assigned as follows:
      Engine: V9
      Physical Name: /folders/myfolders/Review
63   Libname Learn'/folders/myfolders/Learn' ;
NOTE: Libref LEARN refers to the same physical library as LEARN2.
NOTE: Libref LEARN was successfully assigned as follows:
      Engine: V9
      Physical Name: /folders/myfolders/Learn
64   Libname myformat'/folders/myfolders/sasuser.v94' ;
NOTE: Libref MYFORMAT refers to the same physical library as SASUSER.
NOTE: Libref MYFORMAT was successfully assigned as follows:
      Engine: V9
      Physical Name: /folders/myfolders/sasuser.v94
65   Options fmtsearch=(myformat) ;
66
67   Data Review.Prob10_1 ;
68   Set Learn.Blood ;
69   Combined = Sum(.001*WBC + RBC) ;
70   Where Gender = 'Female' AND BloodType = 'AB' ;
71   Output = Subset_A ;
72   Where Gender = 'Female' AND BloodType = 'AB' AND Combined ge 14 ;
ERROR: Variable Combined is not on file LEARN.BLOOD.
73   Output = Subset_B ;
74   run ;
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set REVIEW.PROB10_1 may be incomplete. When this step was stopped there were 0 observations and 11 variables.
WARNING: Data set REVIEW.PROB10_1 was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time 0.00 seconds
      cpu time 0.01 seconds
75
76
77   Proc Print data=Subset_A noobs ;
ERROR: File WORK.SUBSET_A.DATA does not exist.
78   run ;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE PRINT used (Total process time):
      real time 0.00 seconds
      cpu time 0.00 seconds
79
80   Proc Print data=Subset_B noobs ;
ERROR: File WORK.SUBSET_B.DATA does not exist.
81   run ;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE PRINT used (Total process time):
      real time 0.00 seconds
      cpu time 0.00 seconds
82
83   OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
96
Several features that you will need to learn ...
You can only output to data sets that you are actually creating. For example, if you want to output to Subset_A and Subset_B, your DATA statement needs to say:
data subset_a subset_b;
The WHERE statement can only refer to variables that exist in the incoming data, not to variables that you create along the way. When a DATA step contains multiple WHERE statements, the second one replaces the first, so the first has absolutely no impact on the program.
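To make the WHERE point concrete, here is a minimal sketch. It assumes a small WORK copy of a few rows from the data posted in this thread; the names Blood_Sample and Female_AB_High are made up for illustration. WHERE filters on variables already stored in the input data set, and a subsetting IF then tests the computed Combined.

```sas
/* Hypothetical sample: three rows copied from the data posted above */
data Blood_Sample ;
   length Gender $ 6 BloodType $ 2 AgeGroup $ 5 ;
   input Subject Gender BloodType AgeGroup WBC RBC Chol ;
   datalines ;
1 Female AB Young 7710 7.4 258
2 Male AB Old 6560 4.7 .
7 Female A Young 6550 4.78 290
;
run ;

data Female_AB_High ;
   set Blood_Sample ;
   /* WHERE is applied as rows are read, so it may only name stored variables */
   where Gender = 'Female' and BloodType = 'AB' ;
   Combined = Sum(.001*WBC + RBC) ;
   /* The subsetting IF runs after the assignment, so it can see Combined */
   if Combined ge 14 ;
run ;
```

Moving the Combined test into the WHERE statement would bring back the "not on file" error from the log, because Combined does not exist in Blood_Sample.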
The OUTPUT statement doesn't use an equal sign.
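This also suggests why Output = Subset_A ; raised no error of its own in the log: with an equal sign, SAS reads the line as an ordinary assignment, creating a numeric variable named Output from an uninitialized variable named Subset_A. That would account for the 11 variables the log reports (7 from Learn.Blood plus Combined, Output, Subset_A, and Subset_B). A minimal sketch, using a made-up one-row data set named Demo:

```sas
/* Demo is a hypothetical data set used only for illustration */
data Demo ;
   x = 1 ;
   /* Assignment, not an OUTPUT statement: creates variables Output and Subset_A */
   Output = Subset_A ;
run ;

/* The variable list shows x, Output, and Subset_A; no data set Subset_A is created */
proc contents data=Demo ;
run ;
```

The log for this step should also carry a note that Subset_A is uninitialized, which is a useful hint that a name was treated as a variable rather than a data set.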
The idea of converting to IF/THEN is a good one. For example (assuming all the other mistakes are corrected):
if Gender = 'Female' AND BloodType = 'AB' then output Subset_A;
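Putting these points together, one possible corrected version of the original step might look like the sketch below. It assumes the Learn libref from the question is already assigned, and it writes only the two subsets; whether Review.Prob10_1 should also be kept is not clear from the post, so it is left out here.

```sas
/* Assumes the Learn libref from the question is already assigned */
data Subset_A Subset_B ;
   set Learn.Blood ;
   Combined = Sum(.001*WBC + RBC) ;
   if Gender = 'Female' and BloodType = 'AB' then do ;
      output Subset_A ;
      /* Subset_B additionally requires Combined of at least 14 */
      if Combined ge 14 then output Subset_B ;
   end ;
run ;

proc print data=Subset_A noobs ;
run ;

proc print data=Subset_B noobs ;
run ;
```

Nesting the second test inside the DO group avoids repeating the Gender and BloodType conditions, and every row in Subset_B is also in Subset_A.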