dear experts,
I have data in the following format:
id | upc_code | year |
2000351 | 1230070413 | 2004 |
2000351 | . | 2004 |
2000351 | 1230000018 | 2004 |
2000351 | 1230070499 | 2004 |
2000351 | . | 2004 |
2000351 | . | 2005 |
2000351 | 1230011813 | 2005 |
2000351 | . | 2005 |
2000351 | . | 2005 |
2000351 | . | 2005 |
2000351 | 1230070413 | 2005 |
2000351 | . | 2005 |
2000351 | . | 2005 |
2000351 | . | 2005 |
2000351 | . | 2005 |
2000351 | 1230011813 | 2005 |
I want to create something like this using sql for each id number:
id | year | count |
2000351 | 2004 | 3 |
2000351 | 2005 | 3 |
count counts each nonempty upc_code as 1.
can someone please tell me how this can be done? it's my first time using sas to make data summary such as this.
many thanks,
c
How about if you try something like:
proc sql;
create table my_count as
select id,sum(not missing(upc_code)) as
num_purchase_year, year
from have
group by id,year
;
quit;
FYI: I tried something like this.
proc sql;
create table my_count as
select *, sum(yes_purchase) as num_purchase_year
(select id, (upc_code ne .) as yes_purchase)
group by year;
quit;
How about if you try something like:
proc sql;
create table my_count as
select id,sum(not missing(upc_code)) as
num_purchase_year, year
from have
group by id,year
;
quit;
thank you very much!
Or
data temp; input id upc_code year ; cards; 2000351 1230070413 2004 2000351 . 2004 2000351 1230000018 2004 2000351 1230070499 2004 2000351 . 2004 2000351 . 2005 2000351 1230011813 2005 2000351 . 2005 2000351 . 2005 2000351 . 2005 2000351 1230070413 2005 2000351 . 2005 2000351 . 2005 2000351 . 2005 2000351 . 2005 2000351 1230011813 2005 ; run; proc sql noprint; create table want as select id,year,count(*) as freq from temp where upc_code is not missing group by id,year; quit;
Ksharp
data temp;
input id upc_code year ;
cards;
2000351 1230070413 2004
2000351 . 2004
2000351 1230000018 2004
2000351 1230070499 2004
2000351 . 2004
2000351 . 2005
2000351 1230011813 2005
2000351 . 2005
2000351 . 2005
2000351 . 2005
2000351 1230070413 2005
2000351 . 2005
2000351 . 2005
2000351 . 2005
2000351 . 2005
2000351 1230011813 2005
;
run;
proc format;
value upcfmt
0-high='1'
other='0';
run;
proc freq data=temp;
tables upc_code /out=want(drop=upc_code percent) noprint;
by id year;
where upc_code > 0;
format upc_code upcfmt;
run;
or
proc means data=temp n noprint;
var upc_code;
by id year;
output out=want(drop=_type_ _freq_) n=count;
run;
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.