Below is a sample dataset similiar to the data I am working with. I would like to have a month for each birdtype even if that bird does not have any
information for that month. My months are based on a 13 month. For example, robin has no data for the months jun_2019 and nov_2019 and sparrow has no data for
APR_2019 and AUG_2019.I would like to add a row for those months with zeros across. I have also attached a excel sheet with the sample data.
THIS IS THE DATA I HAVE
BIRDTYPES MON_YEAR RARE NON-RARE TOTAL
BLUEBIRD JAN_2018 5 2 7
BLUEBIRD FEB_2018 10 4 14
BLUEBIRD MAR_2018 15 6 21
BLUEBIRD APR_2019 20 8 28
BLUEBIRD MAY_2019 25 10 35
BLUEBIRD JUN_2019 30 12 42
BLUEBIRD JUL_2019 35 14 49
BLUEBIRD AUG_2019 40 16 56
BLUEBIRD SEP_2019 45 18 63
BLUEBIRD OCT_2019 50 20 70
BLUEBIRD NOV_2019 55 22 77
BLUEBIRD DEC_2019 60 24 84
BLUEBIRD JAN_2019 65 26 91
ROBIN JAN_2018 5 2 7
ROBIN FEB_2018 10 4 14
ROBIN MAR_2018 15 6 21
ROBIN APR_2019 20 8 28
ROBIN MAY_2019 25 10 35
ROBIN JUL_2019 35 14 49
ROBIN AUG_2019 40 16 56
ROBIN SEP_2019 45 18 63
ROBIN OCT_2019 50 20 70
ROBIN DEC_2019 60 24 84
ROBIN JAN_2019 65 26 91
SPARROW JAN_2018 5 2 7
SPARROW FEB_2018 10 4 14
SPARROW MAR_2018 15 6 21
SPARROW MAY_2019 25 10 35
SPARROW JUN_2019 30 12 42
SPARROW JUL_2019 35 14 49
SPARROW SEP_2019 45 18 63
SPARROW OCT_2019 50 20 70
SPARROW NOV_2019 55 22 77
SPARROW DEC_2019 60 24 84
SPARROW JAN_2019 65 26 91
DATA WANT
BLUEBIRD JUN_2019 30 12 42
BLUEBIRD JUL_2019 35 14 49
BLUEBIRD AUG_2019 40 16 56
BLUEBIRD SEP_2019 45 18 63
BLUEBIRD OCT_2019 50 20 70
BLUEBIRD NOV_2019 55 22 77
BLUEBIRD DEC_2019 60 24 84
BLUEBIRD JAN_2019 65 26 91
ROBIN JAN_2018 5 2 7
ROBIN FEB_2018 10 4 14
ROBIN MAR_2018 15 6 21
ROBIN APR_2019 20 8 28
ROBIN MAY_2019 25 10 35
ROBIN JUN_2019 0 0 0
ROBIN JUL_2019 35 14 49
ROBIN AUG_2019 40 16 56
ROBIN SEP_2019 45 18 63
ROBIN OCT_2019 50 20 70
ROBIN NOV_2019 0 0 0
ROBIN DEC_2019 60 24 84
ROBIN JAN_2019 65 26 91
SPARROW JAN_2018 5 2 7
SPARROW FEB_2018 10 4 14
SPARROW MAR_2018 15 6 21
SPARROW APR_2019 0 0 0
SPARROW MAY_2019 25 10 35
SPARROW JUN_2019 30 12 42
SPARROW JUL_2019 35 14 49
SPARROW AUG_2019 0 0 0
SPARROW SEP_2019 45 18 63
SPARROW OCT_2019 50 20 70
SPARROW NOV_2019 55 22 77
SPARROW DEC_2019 60 24 84
SPARROW JAN_2019 65 26 91
HI @luvscandy27 Keeping it simple with SPARSE
data have;
input BIRDTYPES :$12. MON_YEAR :monyy9. RARE NON_RARE TOTAL;
format mon_year monyy7.;
cards;
BLUEBIRD JAN_2018 5 2 7
BLUEBIRD FEB_2018 10 4 14
BLUEBIRD MAR_2018 15 6 21
BLUEBIRD APR_2019 20 8 28
BLUEBIRD MAY_2019 25 10 35
BLUEBIRD JUN_2019 30 12 42
BLUEBIRD JUL_2019 35 14 49
BLUEBIRD AUG_2019 40 16 56
BLUEBIRD SEP_2019 45 18 63
BLUEBIRD OCT_2019 50 20 70
BLUEBIRD NOV_2019 55 22 77
BLUEBIRD DEC_2019 60 24 84
BLUEBIRD JAN_2019 65 26 91
ROBIN JAN_2018 5 2 7
ROBIN FEB_2018 10 4 14
ROBIN MAR_2018 15 6 21
ROBIN APR_2019 20 8 28
ROBIN MAY_2019 25 10 35
ROBIN JUL_2019 35 14 49
ROBIN AUG_2019 40 16 56
ROBIN SEP_2019 45 18 63
ROBIN OCT_2019 50 20 70
ROBIN DEC_2019 60 24 84
ROBIN JAN_2019 65 26 91
SPARROW JAN_2018 5 2 7
SPARROW FEB_2018 10 4 14
SPARROW MAR_2018 15 6 21
SPARROW MAY_2019 25 10 35
SPARROW JUN_2019 30 12 42
SPARROW JUL_2019 35 14 49
SPARROW SEP_2019 45 18 63
SPARROW OCT_2019 50 20 70
SPARROW NOV_2019 55 22 77
SPARROW DEC_2019 60 24 84
SPARROW JAN_2019 65 26 91
;
proc freq data=have noprint;
tables birdtypes*mon_year/sparse out=temp(drop=percent count);
run;
proc sort data=have;
by birdtypes mon_year;
run;
data want;
merge temp(in=a) have(in=b);
by BIRDTYPES mon_year;
if not(a and b) then do;
RARE=0; NON_RARE=0; TOTAL=0;
end;
run;
Note: I have assumed MON_YEAR as a numeric variable to have the right order. To my mind, that seems to make sense rather than having as a character variable
You can make use of the COMPLETETYPES option available for SUMMARY/MEANS, then turn the nulls into zeroes:
data have ;
input bird_type $ mon_year $ rare non_rare total ;
cards ;
bluebird jan_2018 5 2 7
bluebird feb_2018 10 4 14
bluebird mar_2018 15 6 21
bluebird apr_2019 20 8 28
bluebird may_2019 25 10 35
bluebird jun_2019 30 12 42
bluebird jul_2019 35 14 49
bluebird aug_2019 40 16 56
bluebird sep_2019 45 18 63
bluebird oct_2019 50 20 70
bluebird nov_2019 55 22 77
bluebird dec_2019 60 24 84
bluebird jan_2019 65 26 91
robin jan_2018 5 2 7
robin feb_2018 10 4 14
robin mar_2018 15 6 21
robin apr_2019 20 8 28
robin may_2019 25 10 35
robin jul_2019 35 14 49
robin aug_2019 40 16 56
robin sep_2019 45 18 63
robin oct_2019 50 20 70
robin dec_2019 60 24 84
robin jan_2019 65 26 91
sparrow jan_2018 5 2 7
sparrow feb_2018 10 4 14
sparrow mar_2018 15 6 21
sparrow may_2019 25 10 35
sparrow jun_2019 30 12 42
sparrow jul_2019 35 14 49
sparrow sep_2019 45 18 63
sparrow oct_2019 50 20 70
sparrow nov_2019 55 22 77
sparrow dec_2019 60 24 84
sparrow jan_2019 65 26 91
;
run ;
proc summary data = have nway completetypes ;
class bird_type mon_year ;
var rare non_rare total ;
output out = want (drop = _:) sum= ;
run ;
proc stdize out = want reponly missing = 0 ;
var rare non_rare total ;
run ;
Kind regards
Paul D.
And just for fun here an approach using hash tables.
data have;
input BIRDTYPES :$12. MON_YEAR :monyy9. RARE NON_RARE TOTAL;
format mon_year monyy7.;
cards;
BLUEBIRD JAN_2018 5 2 7
BLUEBIRD FEB_2018 10 4 14
BLUEBIRD MAR_2018 15 6 21
BLUEBIRD APR_2019 20 8 28
BLUEBIRD MAY_2019 25 10 35
BLUEBIRD JUN_2019 30 12 42
BLUEBIRD JUL_2019 35 14 49
BLUEBIRD AUG_2019 40 16 56
BLUEBIRD SEP_2019 45 18 63
BLUEBIRD OCT_2019 50 20 70
BLUEBIRD NOV_2019 55 22 77
BLUEBIRD DEC_2019 60 24 84
BLUEBIRD JAN_2019 65 26 91
ROBIN JAN_2018 5 2 7
ROBIN FEB_2018 10 4 14
ROBIN MAR_2018 15 6 21
ROBIN APR_2019 20 8 28
ROBIN MAY_2019 25 10 35
ROBIN JUL_2019 35 14 49
ROBIN AUG_2019 40 16 56
ROBIN SEP_2019 45 18 63
ROBIN OCT_2019 50 20 70
ROBIN DEC_2019 60 24 84
ROBIN JAN_2019 65 26 91
SPARROW JAN_2018 5 2 7
SPARROW FEB_2018 10 4 14
SPARROW MAR_2018 15 6 21
SPARROW MAY_2019 25 10 35
SPARROW JUN_2019 30 12 42
SPARROW JUL_2019 35 14 49
SPARROW SEP_2019 45 18 63
SPARROW OCT_2019 50 20 70
SPARROW NOV_2019 55 22 77
SPARROW DEC_2019 60 24 84
SPARROW JAN_2019 65 26 91
;
%let start_month=%sysfunc(inputn(JAN_2018,monyy9.));
%let stop_month =%sysfunc(inputn(DEC_2019,monyy9.));
data _null_;
/* map vars from source table */
if 0 then set have;
/* initialize vars to zero */
array _nvars {*} rare non_rare total (3*0);
/* hash table with one entry per birdtypes */
dcl hash birds(dataset:'have', ordered:'y');
dcl hiter hbirds('birds');
birds.defineKey('BIRDTYPES');
birds.defineDone();
/* hash table with all data from source */
dcl hash tbl(dataset:'have', ordered:'y');
tbl.defineKey('BIRDTYPES','MON_YEAR');
tbl.defineData(all:'y');
tbl.defineDone();
/* loop over birdtypes */
if hbirds.first() = 0 then
do until(hbirds.next());
/* loop over dates per birdtypes */
mon_year=&start_month;
do while(mon_year<=&stop_month);
/* add row to hash if no match with key */
if tbl.check() ne 0 then tbl.add();
mon_year=intnx('month',mon_year,1,'b');
end;
end;
/* write hash to SAS table want at end of processing */
tbl.output(dataset:'want');
stop;
run;
proc print data=want;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.