Below is a sample dataset similiar to the data I am working with. I would like to have a month for each birdtype even if that bird does not have any
information for that month. My months are based on a 13 month. For example, robin has no data for the months jun_2019 and nov_2019 and sparrow has no data for
APR_2019 and AUG_2019.I would like to add a row for those months with zeros across. I have also attached a excel sheet with the sample data.
THIS IS THE DATA I HAVE
BIRDTYPES MON_YEAR RARE NON-RARE TOTAL
BLUEBIRD JAN_2018 5 2 7
BLUEBIRD FEB_2018 10 4 14
BLUEBIRD MAR_2018 15 6 21
BLUEBIRD APR_2019 20 8 28
BLUEBIRD MAY_2019 25 10 35
BLUEBIRD JUN_2019 30 12 42
BLUEBIRD JUL_2019 35 14 49
BLUEBIRD AUG_2019 40 16 56
BLUEBIRD SEP_2019 45 18 63
BLUEBIRD OCT_2019 50 20 70
BLUEBIRD NOV_2019 55 22 77
BLUEBIRD DEC_2019 60 24 84
BLUEBIRD JAN_2019 65 26 91
ROBIN JAN_2018 5 2 7
ROBIN FEB_2018 10 4 14
ROBIN MAR_2018 15 6 21
ROBIN APR_2019 20 8 28
ROBIN MAY_2019 25 10 35
ROBIN JUL_2019 35 14 49
ROBIN AUG_2019 40 16 56
ROBIN SEP_2019 45 18 63
ROBIN OCT_2019 50 20 70
ROBIN DEC_2019 60 24 84
ROBIN JAN_2019 65 26 91
SPARROW JAN_2018 5 2 7
SPARROW FEB_2018 10 4 14
SPARROW MAR_2018 15 6 21
SPARROW MAY_2019 25 10 35
SPARROW JUN_2019 30 12 42
SPARROW JUL_2019 35 14 49
SPARROW SEP_2019 45 18 63
SPARROW OCT_2019 50 20 70
SPARROW NOV_2019 55 22 77
SPARROW DEC_2019 60 24 84
SPARROW JAN_2019 65 26 91
DATA WANT
BLUEBIRD JUN_2019 30 12 42
BLUEBIRD JUL_2019 35 14 49
BLUEBIRD AUG_2019 40 16 56
BLUEBIRD SEP_2019 45 18 63
BLUEBIRD OCT_2019 50 20 70
BLUEBIRD NOV_2019 55 22 77
BLUEBIRD DEC_2019 60 24 84
BLUEBIRD JAN_2019 65 26 91
ROBIN JAN_2018 5 2 7
ROBIN FEB_2018 10 4 14
ROBIN MAR_2018 15 6 21
ROBIN APR_2019 20 8 28
ROBIN MAY_2019 25 10 35
ROBIN JUN_2019 0 0 0
ROBIN JUL_2019 35 14 49
ROBIN AUG_2019 40 16 56
ROBIN SEP_2019 45 18 63
ROBIN OCT_2019 50 20 70
ROBIN NOV_2019 0 0 0
ROBIN DEC_2019 60 24 84
ROBIN JAN_2019 65 26 91
SPARROW JAN_2018 5 2 7
SPARROW FEB_2018 10 4 14
SPARROW MAR_2018 15 6 21
SPARROW APR_2019 0 0 0
SPARROW MAY_2019 25 10 35
SPARROW JUN_2019 30 12 42
SPARROW JUL_2019 35 14 49
SPARROW AUG_2019 0 0 0
SPARROW SEP_2019 45 18 63
SPARROW OCT_2019 50 20 70
SPARROW NOV_2019 55 22 77
SPARROW DEC_2019 60 24 84
SPARROW JAN_2019 65 26 91
HI @luvscandy27 Keeping it simple with SPARSE
data have;
input BIRDTYPES :$12. MON_YEAR :monyy9. RARE NON_RARE TOTAL;
format mon_year monyy7.;
cards;
BLUEBIRD JAN_2018 5 2 7
BLUEBIRD FEB_2018 10 4 14
BLUEBIRD MAR_2018 15 6 21
BLUEBIRD APR_2019 20 8 28
BLUEBIRD MAY_2019 25 10 35
BLUEBIRD JUN_2019 30 12 42
BLUEBIRD JUL_2019 35 14 49
BLUEBIRD AUG_2019 40 16 56
BLUEBIRD SEP_2019 45 18 63
BLUEBIRD OCT_2019 50 20 70
BLUEBIRD NOV_2019 55 22 77
BLUEBIRD DEC_2019 60 24 84
BLUEBIRD JAN_2019 65 26 91
ROBIN JAN_2018 5 2 7
ROBIN FEB_2018 10 4 14
ROBIN MAR_2018 15 6 21
ROBIN APR_2019 20 8 28
ROBIN MAY_2019 25 10 35
ROBIN JUL_2019 35 14 49
ROBIN AUG_2019 40 16 56
ROBIN SEP_2019 45 18 63
ROBIN OCT_2019 50 20 70
ROBIN DEC_2019 60 24 84
ROBIN JAN_2019 65 26 91
SPARROW JAN_2018 5 2 7
SPARROW FEB_2018 10 4 14
SPARROW MAR_2018 15 6 21
SPARROW MAY_2019 25 10 35
SPARROW JUN_2019 30 12 42
SPARROW JUL_2019 35 14 49
SPARROW SEP_2019 45 18 63
SPARROW OCT_2019 50 20 70
SPARROW NOV_2019 55 22 77
SPARROW DEC_2019 60 24 84
SPARROW JAN_2019 65 26 91
;
proc freq data=have noprint;
tables birdtypes*mon_year/sparse out=temp(drop=percent count);
run;
proc sort data=have;
by birdtypes mon_year;
run;
data want;
merge temp(in=a) have(in=b);
by BIRDTYPES mon_year;
if not(a and b) then do;
RARE=0; NON_RARE=0; TOTAL=0;
end;
run;
Note: I have assumed MON_YEAR as a numeric variable to have the right order. To my mind, that seems to make sense rather than having as a character variable
You can make use of the COMPLETETYPES option available for SUMMARY/MEANS, then turn the nulls into zeroes:
data have ;
input bird_type $ mon_year $ rare non_rare total ;
cards ;
bluebird jan_2018 5 2 7
bluebird feb_2018 10 4 14
bluebird mar_2018 15 6 21
bluebird apr_2019 20 8 28
bluebird may_2019 25 10 35
bluebird jun_2019 30 12 42
bluebird jul_2019 35 14 49
bluebird aug_2019 40 16 56
bluebird sep_2019 45 18 63
bluebird oct_2019 50 20 70
bluebird nov_2019 55 22 77
bluebird dec_2019 60 24 84
bluebird jan_2019 65 26 91
robin jan_2018 5 2 7
robin feb_2018 10 4 14
robin mar_2018 15 6 21
robin apr_2019 20 8 28
robin may_2019 25 10 35
robin jul_2019 35 14 49
robin aug_2019 40 16 56
robin sep_2019 45 18 63
robin oct_2019 50 20 70
robin dec_2019 60 24 84
robin jan_2019 65 26 91
sparrow jan_2018 5 2 7
sparrow feb_2018 10 4 14
sparrow mar_2018 15 6 21
sparrow may_2019 25 10 35
sparrow jun_2019 30 12 42
sparrow jul_2019 35 14 49
sparrow sep_2019 45 18 63
sparrow oct_2019 50 20 70
sparrow nov_2019 55 22 77
sparrow dec_2019 60 24 84
sparrow jan_2019 65 26 91
;
run ;
proc summary data = have nway completetypes ;
class bird_type mon_year ;
var rare non_rare total ;
output out = want (drop = _:) sum= ;
run ;
proc stdize out = want reponly missing = 0 ;
var rare non_rare total ;
run ;
Kind regards
Paul D.
And just for fun here an approach using hash tables.
data have;
input BIRDTYPES :$12. MON_YEAR :monyy9. RARE NON_RARE TOTAL;
format mon_year monyy7.;
cards;
BLUEBIRD JAN_2018 5 2 7
BLUEBIRD FEB_2018 10 4 14
BLUEBIRD MAR_2018 15 6 21
BLUEBIRD APR_2019 20 8 28
BLUEBIRD MAY_2019 25 10 35
BLUEBIRD JUN_2019 30 12 42
BLUEBIRD JUL_2019 35 14 49
BLUEBIRD AUG_2019 40 16 56
BLUEBIRD SEP_2019 45 18 63
BLUEBIRD OCT_2019 50 20 70
BLUEBIRD NOV_2019 55 22 77
BLUEBIRD DEC_2019 60 24 84
BLUEBIRD JAN_2019 65 26 91
ROBIN JAN_2018 5 2 7
ROBIN FEB_2018 10 4 14
ROBIN MAR_2018 15 6 21
ROBIN APR_2019 20 8 28
ROBIN MAY_2019 25 10 35
ROBIN JUL_2019 35 14 49
ROBIN AUG_2019 40 16 56
ROBIN SEP_2019 45 18 63
ROBIN OCT_2019 50 20 70
ROBIN DEC_2019 60 24 84
ROBIN JAN_2019 65 26 91
SPARROW JAN_2018 5 2 7
SPARROW FEB_2018 10 4 14
SPARROW MAR_2018 15 6 21
SPARROW MAY_2019 25 10 35
SPARROW JUN_2019 30 12 42
SPARROW JUL_2019 35 14 49
SPARROW SEP_2019 45 18 63
SPARROW OCT_2019 50 20 70
SPARROW NOV_2019 55 22 77
SPARROW DEC_2019 60 24 84
SPARROW JAN_2019 65 26 91
;
%let start_month=%sysfunc(inputn(JAN_2018,monyy9.));
%let stop_month =%sysfunc(inputn(DEC_2019,monyy9.));
data _null_;
/* map vars from source table */
if 0 then set have;
/* initialize vars to zero */
array _nvars {*} rare non_rare total (3*0);
/* hash table with one entry per birdtypes */
dcl hash birds(dataset:'have', ordered:'y');
dcl hiter hbirds('birds');
birds.defineKey('BIRDTYPES');
birds.defineDone();
/* hash table with all data from source */
dcl hash tbl(dataset:'have', ordered:'y');
tbl.defineKey('BIRDTYPES','MON_YEAR');
tbl.defineData(all:'y');
tbl.defineDone();
/* loop over birdtypes */
if hbirds.first() = 0 then
do until(hbirds.next());
/* loop over dates per birdtypes */
mon_year=&start_month;
do while(mon_year<=&stop_month);
/* add row to hash if no match with key */
if tbl.check() ne 0 then tbl.add();
mon_year=intnx('month',mon_year,1,'b');
end;
end;
/* write hash to SAS table want at end of processing */
tbl.output(dataset:'want');
stop;
run;
proc print data=want;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.