Write a program that calculates the total number of records (all screening records in the file), the total number of partial records (Vstatus), the total number of complete records (Vstatus), and the total number of complete records that are eligible and ineligible (ELIG) for each site (SITE). The idea here is not to create separate datasets/subsets for each of these, but to produce a single dataset with these totals.
Vstatus is coded as “Complete” or “Partial”
ELIG is coded as 0 (no) or 1 (yes)
SITE is coded as “City A” or “City B”
My professor is very specific about how she wants this done, but when I run my code, all the values of my "Status" variable (Vstatus) change to "Complete" and the output simply counts from 1 to 50.
My code:
PROC SORT DATA = sarah.screener_data;
  by site;
RUN;

DATA TOTVSTATUS;
  SET Sarah.screener_data;
  COUNT + 1;
  BY Site;
  IF Vstatus = partial then COUNT=0;
  If vstatus = complete then count= 1;
RUN;

DATA TOTVSTATUS2;
  SET TOTVSTATUS;
  BY Vstatus Elig Site;
  IF LAST.Vstatus ;
  If Last.elig;
  If last.site;
RUN;

PROC SORT DATA = sarah.screener_data;
  by Vstatus Elig site;
RUN;

DATA TOTVSTATUS;
  SET Sarah.screener_data;
  COUNT + 1;
  BY Vstatus Elig Site;
  IF FIRST.Vstatus then COUNT=1;
RUN;

DATA TOTVSTATUS2;
  SET TOTVSTATUS;
  BY Vstatus Elig Site;
  IF LAST.Vstatus ;
run;
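For comparison, here is a minimal sketch of the single-dataset result the assignment describes: one row per site with all five totals. It is not necessarily the method the professor has in mind. It assumes Vstatus is a character variable holding exactly "Complete" or "Partial" and ELIG is numeric 0/1, as listed above. The names screener_sorted, site_totals, and the n_* variables are hypothetical.

```sas
/* Sketch only. Assumptions: Vstatus is character ('Complete'/'Partial'),
   ELIG is numeric (0/1); screener_sorted, site_totals and the n_* variables
   are hypothetical names. */
proc sort data=sarah.screener_data out=screener_sorted;
  by site;
run;

data site_totals;
  set screener_sorted;
  by site;

  /* Reset the running totals at the first record of each site */
  if first.site then do;
    n_total = 0;
    n_partial = 0;
    n_complete = 0;
    n_complete_elig = 0;
    n_complete_inelig = 0;
  end;

  /* Sum statements retain their values across records; a comparison
     evaluates to 1 when true and 0 when false */
  n_total + 1;
  n_partial + (Vstatus = 'Partial');
  n_complete + (Vstatus = 'Complete');
  n_complete_elig + (Vstatus = 'Complete' and elig = 1);
  n_complete_inelig + (Vstatus = 'Complete' and elig = 0);

  /* Keep one row per site, carrying the finished totals */
  if last.site;
  keep site n_total n_partial n_complete n_complete_elig n_complete_inelig;
run;
```

Note that the values are written as quoted literals here. Character comparisons in SAS are case-sensitive, so the literals must match the stored values exactly, and an unquoted word such as partial is read as a variable name rather than a text value.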