Hi,
I have monthly dataset for 6 years, an example is given below. I have all data for each year (june). Each month variable popul increases by a contanst, for example for sex=1 and age_group=1, popul in Jul2010 would be 20+5, in Aug2010 would be 25+5 and so on... How could I do it?
sex age_group popul date
1 1 20 Jun2010
1 2 45 Jun2010
1 3 15 Jun2010
2 1 9 Jun2010
2 2 5 Jun2010
2 3 12 Jun2010
1 1 . Jul2010
1 2 . Jul2010
1 3 . Jul2010
2 1 . Jul2010
2 2 . Jul2010
2 3 . Jul2010
........
1 1 16 Jul2011
1 2 25 Jun2011
1 3 5 Jun2011
2 1 8 Jun2011
2 2 15 Jun2011
2 3 12 Jun2011
..................
1 1 16 Jul2012
1 2 25 Jun2012
1 3 5 Jun2012
2 1 8 Jun2012
2 2 15 Jun2012
2 3 12 Jun2012
...................................
So you need a way to interpolate the monthly values between the years? They most likely won't line up, how to account for that?
Can you clearly show what you expect as output based on the data you've posted. And preferable post in a data step in the future.
Is your date a SAS date or a Character variable?
So I have data similar to this format:
1 1 20 Jun2010
1 2 45 Jun2010
1 3 15 Jun2010
2 1 9 Jun2010
2 2 5 Jun2010
2 3 12 Jun2010
1 1 . Jul2010
1 2 . Jul2010
1 3 . Jul2010
2 1 . Jul2010
2 2 . Jul2010
2 3 . Jul2010
1 1 . Aug2010
1 2 . Aug2010
1 3 . Aug2010
2 1 . Aug2010
2 2 . Aug2010
2 3 . Aug2010
1 1 . Sep2010
1 2 . Sep2010
1 3 . Sep2010
2 1 . Sep2010
2 2 . Sep2010
2 3 . Sep2010
And I want it to be like this:
1 1 20 Jun2010
1 2 45 Jun2010
1 3 15 Jun2010
2 1 9 Jun2010
2 2 5 Jun2010
2 3 12 Jun2010
1 1 25 Jul2010
1 2 46 Jul2010
1 3 17 Jul2010
2 1 8 Jul2010
2 2 6 Jul2010
2 3 17 Jul2010
1 1 30 Aug2010
1 2 47 Aug2010
1 3 19 Aug2010
2 1 7 Aug2010
2 2 7 Aug2010
2 3 22 Aug2010
1 1 35 Sep2010
1 2 47 Sep2010
1 3 21 Sep2010
2 1 6 Sep2010
2 2 8 Sep2010
2 3 27 Sep2010
My date variable is in MONYY7. format
How do you know what constant to use or how many times to loop?
You can use a multiple output statement per record you have but need to know how to change it.
Output;
Do i=1 to 11;
date=INTNX('month', DATE, I, , 'S');
value = value + 5;
output;
end;
If you really have every month in your input data, then this will work:
data have;
input sex agegroup popul monyyyy :$7.;
datalines;
1 1 20 Jun2010
1 2 45 Jun2010
1 3 15 Jun2010
2 1 9 Jun2010
2 2 5 Jun2010
2 3 12 Jun2010
1 1 . Jul2010
1 2 . Jul2010
1 3 . Jul2010
2 1 . Jul2010
2 2 . Jul2010
2 3 . Jul2010
1 1 . Aug2010
1 2 . Aug2010
1 3 . Aug2010
2 1 . Aug2010
2 2 . Aug2010
2 3 . Aug2010
1 1 . Sep2010
1 2 . Sep2010
1 3 . Sep2010
2 1 . Sep2010
2 2 . Sep2010
2 3 . Sep2010
run;
data want (drop=increment);
set have;
by monyyyy notsorted;
if first.monyyyy then increment+5;
array s_by_a {2,3} _temporary_;
if monyyyy=:'Jun' then do;
if first.monyyyy then call missing(of s_by_a{*});
s_by_a{sex,agegroup}=popul;
increment=0;
end;
else popul=s_by_a{sex,agegroup}+increment;
run;
This program creates a 2x3 matrix s_by_a to hold the popul values from each Jun, where SEX is the row dimension and agegroup is the column dimension. Declaring the array as _temporary_ means its value are automatically retained, so can be used for retrieval later. The INCREMENT var starts at 0 for 'Junxxxx', then 5 for 'Julxxxx', 10 for 'Augxxxx', etc.
But the increament for different sex and different age group is different
@viollete wrote:
But the increament for different sex and different age group is different
You didn't say that, you said a constant such as 5.
What are the rules? Where are they stored?
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.