I have following dataset "Have". I want to create a table "want" where the ID is only counted once in a category ClassStartQuarter. If the ClassStartQuarter is missing then the individual did not take a class and therefore do not count in any quarter. Proc freq will give duplicated count, is there other option to get a distinct count for each quarter?
Thank you.
| Dataset: Have | |
| ID | ClassStartQuarter |
| 1 | |
| 1 | |
| 1 | |
| 1 | |
| 1 | |
| 2 | 2017Q3 |
| 2 | 2017Q3 |
| 2 | 2017Q3 |
| 2 | 2017Q3 |
| 3 | 2018Q1 |
| 3 | 2018Q1 |
| 3 | 2018Q1 |
| 3 | 2018Q1 |
| 3 | 2018Q1 |
| 3 | 2018Q1 |
| 3 | 2018Q1 |
| 3 | 2018Q1 |
| 4 | 2018Q2 |
| 4 | 2018Q2 |
| 4 | 2018Q2 |
| 4 | 2018Q2 |
| 4 | 2018Q2 |
| 4 | 2018Q2 |
| 5 | 2015Q4 |
| 5 | 2017Q1 |
| 5 | 2017Q3 |
| 5 | 2018Q1 |
| 5 | 2015Q4 |
| 5 | 2017Q1 |
| 5 | 2017Q3 |
| 5 | 2018Q1 |
| 5 | 2015Q4 |
| 5 | 2017Q1 |
| 5 | 2017Q3 |
| 5 | 2018Q1 |
| 5 | 2015Q4 |
| 5 | 2017Q1 |
| 5 | 2017Q3 |
| 5 | 2018Q1 |
| 5 | 2015Q4 |
| 5 | 2017Q1 |
| 5 | 2017Q3 |
| 5 | 2018Q1 |
| 5 | 2015Q4 |
| 5 | 2017Q1 |
| 5 | 2017Q3 |
| 5 | 2018Q1 |
| 6 | 2017Q2 |
| 6 | 2017Q2 |
| 6 | 2017Q2 |
| 6 | 2017Q2 |
| 6 | 2017Q2 |
| 6 | 2017Q2 |
| 6 | 2017Q2 |
| 6 | 2017Q2 |
| 6 | 2017Q2 |
| 6 | 2017Q2 |
| 6 | 2017Q2 |
| 6 | 2017Q2 |
| 6 | 2017Q2 |
| 6 | 2017Q2 |
| 6 | 2017Q2 |
| 6 | 2017Q2 |
| 6 | 2017Q2 |
| 6 | 2017Q2 |
| 7 | 2017Q2 |
| 7 | 2017Q3 |
| 7 | 2018Q3 |
| 7 | 2017Q2 |
| 7 | 2017Q3 |
| 7 | 2018Q3 |
| 7 | 2017Q2 |
| 7 | 2017Q3 |
| 7 | 2018Q3 |
| 7 | 2017Q2 |
| 7 | 2017Q3 |
| 7 | 2018Q3 |
| 7 | 2017Q2 |
| 7 | 2017Q3 |
| 7 | 2018Q3 |
| 8 | |
| 8 | |
| 8 | |
| 8 | |
| 8 | |
| 8 | |
| 9 | 2016Q1 |
| 9 | 2017Q1 |
| 9 | 2017Q4 |
| 9 | 2016Q1 |
| 9 | 2017Q1 |
| 9 | 2017Q4 |
| 9 | 2016Q1 |
| 9 | 2017Q1 |
| 9 | 2017Q4 |
| 9 | 2016Q1 |
| 9 | 2017Q1 |
| 9 | 2017Q4 |
| 9 | 2016Q1 |
| 9 | 2017Q1 |
| 9 | 2017Q4 |
| 9 | 2016Q1 |
| 9 | 2017Q1 |
| 9 | 2017Q4 |
| 10 | 2017Q4 |
| 10 | 2017Q4 |
| 10 | 2018Q2 |
| 10 | 2018Q2 |
| 10 | 2017Q4 |
| 10 | 2017Q4 |
| 10 | 2018Q2 |
| 10 | 2018Q2 |
| 11 | 2017Q4 |
| 11 | 2017Q4 |
| 11 | 2017Q4 |
| 11 | 2017Q4 |
| 12 | 2014Q3 |
| 12 | 2017Q1 |
| 12 | 2014Q3 |
| 12 | 2017Q1 |
| 12 | 2014Q3 |
| 12 | 2017Q1 |
| 12 | 2014Q3 |
| 12 | 2017Q1 |
| Table: Want (This is just an example) | |
| Quarter | No.of Distinct ID |
| 2013Q1 | 0 |
| 2013Q2 | 0 |
| 2013Q3 | 12 |
| 2013Q4 | 4 |
| 2014Q1 | 4 |
| 2014Q2 | 3 |
| 2014Q3 | 16 |
| 2014Q4 | 28 |
| 2015Q1 | 20 |
| 2015Q2 | 21 |
| 2015Q3 | 0 |
| 2015Q4 | 15 |
| 2016Q1 | 28 |
| 2016Q2 | 29 |
| 2016Q3 | 0 |
| 2016Q4 | 0 |
| 2017Q1 | 28 |
| 2017Q2 | 27 |
| 2017Q3 | 60 |
| 2017Q4 | 25 |
| 2018Q1 | 27 |
| 2018Q2 | 33 |
| 2018Q3 | 13 |
| 2018Q4 | 13 |
This might work for you. It counts the Occurrences and the number if ID(subjects) per Quarter. You may not need the IDGROUP output.
data class;
infile cards missover firstobs=2;
input id Quarter:yyq.;
format quarter yyq.;
cards;
ID ClassStartQuarter
1
1
1
1
1
2 2017Q3
2 2017Q3
2 2017Q3
2 2017Q3
3 2018Q1
3 2018Q1
3 2018Q1
3 2018Q1
3 2018Q1
3 2018Q1
3 2018Q1
3 2018Q1
4 2018Q2
4 2018Q2
4 2018Q2
4 2018Q2
4 2018Q2
4 2018Q2
5 2015Q4
5 2017Q1
5 2017Q3
5 2018Q1
5 2015Q4
5 2017Q1
5 2017Q3
5 2018Q1
6 2017Q3
6 2017Q3
6 2018Q1
6 2018Q1
6 2018Q1
6 2018Q1
6 2018Q1
6 2018Q1
6 2018Q1
6 2018Q1
;;;;
run;
proc print;
run;
proc summary nway data=class missing;
class id quarter;
output out=u1(drop=_type_ rename=_freq_=occ);
run;
proc print;
run;
data quarter;
do year=2015 to 2018;
do q=1 to 4;
Quarter=yyq(year,q);
output;
end;
end;
keep quarter;
format quarter yyq.;
run;
proc summary nway missing data=u1 classdata=quarter;
class quarter;
output out=u2(drop=_type_) sum(occ)= idgroup(out[5](id)=);
run;
proc print label;
label _freq_='No.of Distinct ID';
run;
PROC FREQ can do this. Start by getting the (unsatisfactory) duplicated counts:
proc freq data=have;
where ClassStartQuarter > ' ';
tables id * ClassStartQuarter / noprint out=unique;
run;
These counts contain duplicates, but the output also contains just one observation per ID. Just count the result:
proc freq data=unique;
tables ClassStartQuarter;
run;
proc sql;
create table want as
select Quarter,count(distinct id) as n_distinct_id
from class
group by Quarter;
quit;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.