dear experts,
I have data in the following format:
id | upc_code | year |
2000351 | 1230070413 | 2004 |
2000351 | . | 2004 |
2000351 | 1230000018 | 2004 |
2000351 | 1230070499 | 2004 |
2000351 | . | 2004 |
2000351 | . | 2005 |
2000351 | 1230011813 | 2005 |
2000351 | . | 2005 |
2000351 | . | 2005 |
2000351 | . | 2005 |
2000351 | 1230070413 | 2005 |
2000351 | . | 2005 |
2000351 | . | 2005 |
2000351 | . | 2005 |
2000351 | . | 2005 |
2000351 | 1230011813 | 2005 |
I want to create something like this using sql for each id number:
id | year | count |
2000351 | 2004 | 3 |
2000351 | 2005 | 3 |
count counts each nonempty upc_code as 1.
can someone please tell me how this can be done? it's my first time using sas to make data summary such as this.
many thanks,
c
How about if you try something like:
proc sql;
create table my_count as
select id,sum(not missing(upc_code)) as
num_purchase_year, year
from have
group by id,year
;
quit;
FYI: I tried something like this.
proc sql;
create table my_count as
select *, sum(yes_purchase) as num_purchase_year
(select id, (upc_code ne .) as yes_purchase)
group by year;
quit;
How about if you try something like:
proc sql;
create table my_count as
select id,sum(not missing(upc_code)) as
num_purchase_year, year
from have
group by id,year
;
quit;
thank you very much!
Or
data temp; input id upc_code year ; cards; 2000351 1230070413 2004 2000351 . 2004 2000351 1230000018 2004 2000351 1230070499 2004 2000351 . 2004 2000351 . 2005 2000351 1230011813 2005 2000351 . 2005 2000351 . 2005 2000351 . 2005 2000351 1230070413 2005 2000351 . 2005 2000351 . 2005 2000351 . 2005 2000351 . 2005 2000351 1230011813 2005 ; run; proc sql noprint; create table want as select id,year,count(*) as freq from temp where upc_code is not missing group by id,year; quit;
Ksharp
data temp;
input id upc_code year ;
cards;
2000351 1230070413 2004
2000351 . 2004
2000351 1230000018 2004
2000351 1230070499 2004
2000351 . 2004
2000351 . 2005
2000351 1230011813 2005
2000351 . 2005
2000351 . 2005
2000351 . 2005
2000351 1230070413 2005
2000351 . 2005
2000351 . 2005
2000351 . 2005
2000351 . 2005
2000351 1230011813 2005
;
run;
proc format;
value upcfmt
0-high='1'
other='0';
run;
proc freq data=temp;
tables upc_code /out=want(drop=upc_code percent) noprint;
by id year;
where upc_code > 0;
format upc_code upcfmt;
run;
or
proc means data=temp n noprint;
var upc_code;
by id year;
output out=want(drop=_type_ _freq_) n=count;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.