I'm trying to create a table using specific variables from a dataset. The issue I'm having is that I need to use a certain subset of one specific variable and cannot figure out how.
Here is a sample of the original dataset:
DATA Study;
INPUT SSN $11. WtLb BMI SBP CODCd $13.;
DATALINES;
999-99-9999 150 22.7 176 Stroke
000-00-0000 130 26.5 160 Heart Failure
...
;
For the table, I want to use only the values where CODCd is Heart Failure.
Here is the code I already have, which essentially only formats the table:
TITLE1 "Subjects Dead from Heart Failure";
PROC PRINT DATA = Study
ID SSN;
VAR WtLb BMI SBP;
RUN;
TITLE;
How can I print only the observations where CODCd is Heart Failure? I'm not sure if it matters, but in the actual original dataset the variable CODCd is numeric.
@sdevenny wrote:
How can I print only the observations where CODCd is Heart Failure? I'm not sure if it matters, but in the actual original dataset the variable CODCd is numeric.
Hi @sdevenny,
This sounds like "Heart Failure" is only the formatted value of the numeric variable CODCd. In this case use the PUT function and the appropriate format name (maybe something like CODCdFmt, see PROC CONTENTS output) in the WHERE statement:
where put(CODCd, CODCdFmt.)='Heart Failure';
Or define the subset in a preliminary DATA step
data hf / view=hf;
set study;
if vvalue(CODCd)='Heart Failure';
run;
and then use hf (without a WHERE statement) in the PROC PRINT step:
proc print data=hf ...; ...
TITLE1 "Subjects Who Died from Heart Failure"; PROC PRINT DATA = Study SPLIT = '*'; Where CODCd ="Heart Failure"; LABEL WtLb = 'Subject*Weight' BMI = 'Body*Mass*Index' SBP = 'Systolic*Blood*Pressure' SSN = 'Soc Sec Number'; ID SSN; VAR WtLb BMI SBP; RUN; TITLE;
The Where statement is available in many procedures to process only records where a condition is true.
I've tried the WHERE statement, and got this in the log:
ERROR: WHERE clause operator requires compatible variables.
@sdevenny wrote:
I've tried the WHERE statement, and got this in the log:
ERROR: WHERE clause operator requires compatible variables.
Your example data has the variable of concern as a character variable. When you misrepresent your data it makes it much harder to come up with proper answers.
@sdevenny wrote:
How can I print only the observations where CODCd is Heart Failure? I'm not sure if it matters, but in the actual original dataset the variable CODCd is numeric.
Hi @sdevenny,
This sounds like "Heart Failure" is only the formatted value of the numeric variable CODCd. In this case use the PUT function and the appropriate format name (maybe something like CODCdFmt, see PROC CONTENTS output) in the WHERE statement:
where put(CODCd, CODCdFmt.)='Heart Failure';
Or define the subset in a preliminary DATA step
data hf / view=hf;
set study;
if vvalue(CODCd)='Heart Failure';
run;
and then use hf (without a WHERE statement) in the PROC PRINT step:
proc print data=hf ...; ...
Hello!
A WHERE statement is necessary. You will want to make sure you use whatever unformatted value you want for CODc.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.