Hello
I have dataset in following format.
Partner Year Value
Algeria 2010 20.00
Algeria 2011 40.00
Afghanistan 2013 40.00
Afghanistan 2017 30.00
China 2010 20.00
China 2011 20.00
China 2012 40.00
China 2013 25.00
China 2014 30.00
China 2015 20.00
...
---
--
What I am looking for is For each country, I want to have data raw for each year (2010 to 2019) with 0 if the value is missing.
Thanks
data have;
input Partner :$20. Year Value;
datalines;
Algeria 2010 20.00
Algeria 2011 40.00
Afghanistan 2013 40.00
Afghanistan 2017 30.00
China 2010 20.00
China 2011 20.00
China 2012 40.00
China 2013 25.00
China 2014 30.00
China 2015 20.00
;
data want(drop=rc);
if _N_=1 then do;
declare hash h (dataset : 'have');
h.definekey ('Partner', 'Year');
h.definedata ('Value');
h.definedone();
end;
set have;
by Partner notsorted;
if first.Partner then do Year=2010 to 2019;
Value=0;
rc = h.find();
output;
end;
run;
data have;
input Partner $12. Year Value;
cards;
Algeria 2010 20.00
Algeria 2011 40.00
Afghanistan 2013 40.00
Afghanistan 2017 30.00
China 2010 20.00
China 2011 20.00
China 2012 40.00
China 2013 25.00
China 2014 30.00
China 2015 20.00
run;
proc freq data=have noprint;
tables Partner * Year / sparse out=want0(drop=count percent);
run;
proc sort data=want0;
by Partner Year;
run;
proc sort data=have;
by Partner Year;
run;
data want1;
merge have want0;
by Partner Year;
run;
proc stdize data=want1 out=want reponly missing=0;
var Value;
run;
Although I am a great user of hash object, I would suggest a single data step that uses 2 lock-step input data streams:
data have;
input Partner :$20. Year Value ;
datalines;
Algeria 2010 20.00
Algeria 2011 40.00
Afghanistan 2013 40.00
Afghanistan 2017 30.00
China 2010 20.00
China 2011 20.00
China 2012 40.00
China 2013 25.00
China 2014 30.00
China 2015 20.00
;
data want (drop=sentinel: nxt_year );
set have (keep=partner year);
by partner notsorted;
if first.partner=1 and year>2010 then do year=2010 to year-1;
output;
end;
retain sentinel1 .;
merge have have (firstobs=2 keep=year rename=(year=nxt_year));
retain sentinel2 .;
do year=year to ifn(last.partner,2019,nxt_year-1);
output;
call missing(of sentinel1 -- sentinel2);
end;
run;
This program assumes that records within each PARTNER, are sorted by YEAR, but PARTNER need not be sorted.
Note it takes advantage of the fact that by declaring SENTINEL1 and SENTINEL2 variables just before and after the MERGE statement, they are placed just to the left and to the right of all the incoming variables (except PARTNER and YEAR). That in turn makes creating a dummy record (i.e. all variables set to missing EXCEPT partner and year) easy - just by using CALL MISSING(of sentinel1 -- sentinel2).
The IFN statement sets the upper-limit of year for the DO loop. If the record-in-hand is the last for a partner output every year to 2019, otherwise only go up to NXT_YEAR-1.
data have;
input Partner :$20. Year Value;
datalines;
Algeria 2010 20.00
Algeria 2011 40.00
Afghanistan 2013 40.00
Afghanistan 2017 30.00
China 2010 20.00
China 2011 20.00
China 2012 40.00
China 2013 25.00
China 2014 30.00
China 2015 20.00
;
data temp;
do year=2010 to 2019;
output;
end;
run;
proc sql;
select a.*,coalesce(value,0) as value
from (select * from
(select distinct Partner from have),(select distinct year from temp)) as a
left join have as b on a.Partner=b.Partner and a.year=b.year;
quit;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.