06-05-2013 03:14 PM
I have typically used the first last statement for unique counting. I wanted to ask if there was a way to use the first last statemetn for duplicate counting as well? For example, patientA visited clinic1 clinic2 clinic3 clinic1 clinic4 clinic1 (for a total visit of 6 visits for patientA) in month1 and PatientB visitted clinic2 clinic3 clinic1 clinic2 clinic2 (total 5 visits) in month1 for patientB and PatientC visitted clinic1 clinic3 clinic1 clinic2 clinic2 (total 5 visits) in month1 for patientC. I want to know the number of visits by month.
So I would have in month1 column
6 visits=1 patient.
Can I use first last or is there an alternative way to approach this?
06-05-2013 03:47 PM
Data looks like this but at a more grander scale. The dollar amounts are co-pays but Im not interested in that. I want to know how many patients made 5 visits to clinics or 6 visits to clinics etc....
06-05-2013 03:41 PM
You could use first. and last. variables, if your data were sorted by MONTH PATIENT. It wouldn't be a bad exercise to try that. But another important tool to learn is that PROC FREQ is built to count, and can create output data sets. This report won't be pretty, but it will have the information in it:
proc freq data=have;
tables month * patient / noprint out=counts;
proc freq data=counts;
tables month * count;
label count='Patients with this Number of Visits';
06-05-2013 03:44 PM
To add to Astounding's idea, if you don't have the month but have a datevalue associate a format such as MONYY. to do the grouping without having to add additional varialbes.
If you data set is of long enough time you probably want include the year also and the above will take care of it.
06-05-2013 03:55 PM
Yes I have considered proc freq but i am interested in the number of patients making the total number of clinic visits per month.
In my data example above I would want to know
there were 2 patients visited the clinic 5 times in January and 1 patient visiting 6 clinics in January.
5 clinic visits= 2 patients
6 clinic visits=1 patient
06-05-2013 04:11 PM
Yes, that's what the second PROC FREQ gives you. If you actually have a separate file for each month, but don't have a variable called MONTH, the program becomes easier:
proc freq data=january;
tables patient / noprint out=counts;
proc freq data=counts;
title 'Number of Patients with N Visits in January';
06-05-2013 04:18 PM
Ohh double wow! Thanks Astounding- thats an even easier approach. No separate file for each month- so how would I produce the counts by month using the proc freq procedure?
06-05-2013 04:40 PM
Could add month easy as well
proc summary noprint nway n
data = visits ;
class month patient ;
output out = want (
drop = _type_
rename = (_freq_ = visits)) ;
06-05-2013 05:46 PM
Thank you for responding. This one isnt quite what I had in mind. I did not need each patient displayed. I need the number of patients and the number of clinics visited each month. So below would read- two patients made 5 clinic visits and 1 patient made 6 clinic visits.
number of visits number of patients
06-05-2013 04:49 PM
To include MONTH as a variable, just use the first posted solution. You might want to play with the format of the output from the second PROC FREQ, such as:
tables month * count / LIST;
Steve, that's just my bias. I prefer to use PROC FREQ to count and PROC SUMMARY to generate statistics. But there is an area where their capabilities overlap. At any rate, you would need a second summarization of your output data set to count how many patients had 5 visits, how many had 6 visits, etc.
06-05-2013 05:49 PM
Astounding- I just realized that the frequency unduplicated the counts (counted a clinic only once). I need every clinic counted for each patient. Do you think I can still use the frequency procedure? .
06-05-2013 03:55 PM
Yes, it can be done with .last variables. I assumed the data has the structure you presented :
length patient visit $10;
infile datalines missover;
input patient visit @;
do while (not missing(visit));
input visit @;
patientA clinic1 clinic2 clinic3 clinic1 clinic4 clinic1
PatientB clinic2 clinic3 clinic1 clinic2 clinic2
PatientC clinic1 clinic3 clinic1 clinic2 clinic2
set visits; by patient notsorted;
nbVisits = sum(nbVisits, 1);
proc sort data=nbVisits; by nbVisits; run;
set nbVisits; by nbVisits;
nbPatients = sum(nbPatients, 1);
proc print noobs; run;