I have a dataset of hospital visits, an identifier for each unique patient, and the diagnosis codes related to that visit. The code flags any records with a binary 0/1 indicator for a visit that has any injury-related DX codes.
I would like to sum the number of visits for each unique patient and keep it as a static value so I can tell which patients had 1, 2, 3+ etc. visits in the dataset.
I've tired Proc SQL, but it ends up counting even records without any injury flags as one; I can't figure out how to make it conditionally sum. I've also tried using the first. and .last data step approach, but that removes records from my dataset that I need to keep.
PROC FREQ is the tool for counting. You can use the WHERE statement in PROC FREQ to tell it to not count records with specific injury flags.
If you have one observation per VISIT your SQL should work.
Otherwise if you have unique visit ID you could count the number of distinct visits where the condition is true by nesting a CASE clause inside the aggregate function.
count(distinct case when cause_any_inj then visitid else null end ) as cause_inj_cnt
And using just BY group processing in a data step will not remove any observations, unless you have a DELETE statement, or a subsetting IF statement, or a conditional OUTPUT statement.
Can you share a simple example dataset (does not need to be the real data) and the expected result for that input.
Can you make a little example, showing the data you have, and the desired output.
Your SQL code will error because you have the option NOSQLREMERGE but the select * will trigger a remerge. If you change the SELECT statement to select only ID and aggregated variables, your code looks reasonable to me.
Are you trying to create a new variable that is not shown in your code?
data have ;
input ID cause_any_inj nat_any_inj;
cards ;
1 0 1
1 1 1
2 0 1
2 0 1
;
proc sql noremerge;
create table want as
select
ID
,sum(cause_any_inj) as cause_inj_cnt
,sum(nat_any_inj) as nat_inj_cnt
from have
group by ID
;
quit ;
proc print data=want ;
run ;
<Pedantic mode:on>
Sun in your subject line and Count in your question are quite different mathematical concepts.
<Pedantice mode:off>
It is a good idea to describe, better provide some example, the starting data and the desired result from the given data. You should also include the LOG when you get errors such as you code will throw.
Copy from the log, open a text box using the </> icon that appears above the main message window and paste the copied text. Example:
24 proc sql nosqlremerge;
------------
1
WARNING 1-322: Assuming the symbol NOREMERGE was misspelled as
nosqlremerge.
25 create table junk as
26 select *,
27 sum(age) as agesum
28 from sashelp.class
29 group by sex
30 ;
ERROR: The query requires remerging summary statistics back with the
original data. This is disallowed due to the NOREMERGE proc option
or NOSQLREMERGE system option.
31 quit;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE SQL used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
My version of SAS also doesn't like the option you specified as NOSQLREMERGE but does use the NOREMERGE option and attempts the code with that option and then errors because of the required remerege.
Removing the noremerge to run the code:
48 proc sql ;
49 create table junk as
50 select *,
51 sum(age) as agesum
52 from sashelp.class
53 group by sex
54 ;
NOTE: The query requires remerging summary statistics back with the
original data.
NOTE: Table WORK.JUNK created, with 19 rows and 6 columns.
55 quit;
You can copy the code from the above and run it on your system as you should have the SASHELP.CLASS data set available.
Does your desired output data want to have the same "sum" value repeated for each value of the groupid variable? If you look at the above created data set JUNK you will see 9 values of 119 for the Agesum variable and 10 of 134 .
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.