I have a dataset called records in which each row is an individual patient (no duplicates), with one column for the number of patient visits during 2015 and another for the number of patient visits during 2016.
I want to create a new table that contains only patients who had 2 or more (≥2) patient visits during 2015 and/or during 2016.
That is, include patients who had either:
1. ≥2 visits during 2015 and 0 visits during 2016
or
2. 0 visits during 2015 and ≥ 2 visits during 2016
or
3. 1 visit during 2015 and 1 visit during 2016
This code has to be incorrect:
PROC SQL;
create table records_15_16 as
select *
from records
WHERE (visits15 >= 2 AND visits16 = 0) OR (visits15 = 0 AND visits16 >=2) OR (visits15 = 1 AND visits16 = 1) ;
QUIT;
Why do you say it's incorrect? Was the output you got not what you were expecting?
Well, my column names are spelled visits15.pcp and visits16.pcp, and SAS gives a syntax error for the period in the column name.
Column names with non-alphanumeric characters must be specified as name literals, i.e. a quoted string followed by the letter n: 'visits16.pcp'n. Name literals can be used anywhere a column name is required (with the VALIDVARNAME=ANY system option in effect).
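A minimal sketch of the name-literal approach, assuming the columns really are stored as visits15.pcp and visits16.pcp. The sample rows and the patient_id column are made up for illustration, and VALIDVARNAME=ANY is set because names containing a period are not accepted otherwise:

```sas
/* Names containing a period are only accepted under VALIDVARNAME=ANY */
options validvarname=any;

/* Hypothetical sample data, for illustration only */
data records;
  input patient_id 'visits15.pcp'n 'visits16.pcp'n;
  datalines;
1 2 0
2 0 1
3 1 1
;
run;

/* Refer to each column by its name literal: quoted string followed by n */
proc sql;
  create table records_15_16 as
  select *
  from records
  where 'visits15.pcp'n + 'visits16.pcp'n >= 2;
quit;
```

With these rows, patients 1 and 3 have a combined total of 2 and are kept; patient 2 has a total of 1 and is dropped.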
Or would this work?
proc sql;
create table records_15_16 AS
select *, visits15 + visits16 AS total
from records
where calculated total >=2;
quit;
If it were me, I would run the query, then open the dataset and check it, or use proc print to print the 3 cases you mentioned above.
If the period in the column name is an issue, then you might have to rename the column.
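One way to carry out that check is to run the query on a handful of rows whose expected outcome is known in advance. This is only a sketch: the test rows and the patient_id column are hypothetical, and the column names visits15 and visits16 are assumed to match the query above.

```sas
/* Hypothetical test rows: patients 1-3 cover the three cases, */
/* patients 4-5 should be excluded.                            */
data records;
  input patient_id visits15 visits16;
  datalines;
1 3 0
2 0 2
3 1 1
4 1 0
5 0 0
;
run;

proc sql;
  create table records_15_16 as
  select *, visits15 + visits16 as total
  from records
  where calculated total >= 2;
quit;

/* Kept rows: every line should show a total of 2 or more */
proc print data=records_15_16;
run;

/* Dropped rows: none of these should belong to the three cases */
proc sql;
  select *
  from records
  where not (visits15 + visits16 >= 2);
quit;
```

The second query also lists any row where either count is missing, because a missing total never satisfies >= 2. That makes it a convenient place to spot incomplete records.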
Well, using the rename option is not working.
DATA records (rename=(visits15.pcp=visits15 visits16.pcp=visits16));
set records_2 ;
RUN;
proc datasets;
contents data=records_2 order=collate;
quit;
Variable name visits15.pcp is not valid.
ERROR: Invalid value for the RENAME option.
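The RENAME= option rejects visits15.pcp because a bare name cannot contain a period. If the stored names really did contain one, a sketch along these lines might work, assuming VALIDVARNAME=ANY and writing the old names as name literals. The sample rows are hypothetical. If the import step already converted the period to an underscore, the stored name is visits15_pcp and no name literal is needed.

```sas
options validvarname=any;

/* Hypothetical input whose column names contain a period */
data records_2;
  input 'visits15.pcp'n 'visits16.pcp'n;
  datalines;
2 0
1 1
;
run;

/* Old names are given as name literals; new names are ordinary SAS names */
data records;
  set records_2(rename=('visits15.pcp'n=visits15 'visits16.pcp'n=visits16));
run;

/* Inspect the renamed dataset (records), not the input (records_2) */
proc contents data=records order=collate;
run;
```

Running PROC CONTENTS on the source dataset first is a quick way to see the names SAS actually stored before attempting any rename.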
Okay, ignore the syntax error. I was reading the column name off Excel and overlooked that SAS had automatically changed visits15.pcp to visits15_pcp.
I corrected my code as follows:
PROC SQL;
create table records_15_16 as
select *
from records
WHERE (visits15_pcp >= 2 AND visits16_pcp = 0) OR (visits15_pcp = 0 AND visits16_pcp >=2) OR (visits15_pcp >= 1 AND visits16_pcp >= 1) ;
QUIT;
The proc sql step above gives the same output as the proc sql step below:
proc sql;
create table records_15_16 AS
select *, visits15_pcp + visits16_pcp AS total
from records
where calculated total >=2;
quit;
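The two filters agree because, for non-negative whole-number counts, a total of 2 or more means either both years have at least 1 visit, or one year has 0 and the other has at least 2. A rough way to confirm this on actual data is to compute both conditions as 0/1 flags and count the rows where they differ. The sample rows and the flag names by_cases and by_total are hypothetical.

```sas
/* Hypothetical sample data, for illustration only */
data records;
  input patient_id visits15_pcp visits16_pcp;
  datalines;
1 2 0
2 0 3
3 1 1
4 2 1
5 1 0
6 0 0
;
run;

proc sql;
  /* Each parenthesized condition evaluates to 1 (true) or 0 (false) */
  select count(*) as n_disagree
  from (select
          ((visits15_pcp >= 2 and visits16_pcp = 0)
            or (visits15_pcp = 0 and visits16_pcp >= 2)
            or (visits15_pcp >= 1 and visits16_pcp >= 1)) as by_cases,
          (visits15_pcp + visits16_pcp >= 2) as by_total
        from records)
  where by_cases ne by_total;
quit;
```

For these six rows n_disagree should be 0. Rows with a missing count are excluded by both filters, since a missing value fails every comparison used here and makes the sum missing as well.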