Hello Friends. I have follwing problem. I hope there will be some easy way to do this in sas.
thanks in advance
I want to create a variable on monthly basis but my data have inconsistent intervalas. Like if I have a value in decemebr 2015 as B and next value is on June 2015 as C and then January, 2016 as A. what I want to do is create a file in which I will have all the missing year and month columns with previous available value. Like
Data Type Have:
CompanyName Year month TYPE
Abc 2016 1 A
ABC 2016 10 C
Abc 2015 2 B
WANT TYPE :
abc | 2016 | 1 | A |
abc | 2016 | 2 | A |
abc | 2016 | 3 | A |
abc | 2016 | 4 | A |
abc | 2016 | 5 | A |
abc | 2016 | 6 | A |
abc | 2016 | 7 | A |
abc | 2016 | 8 | A |
abc | 2016 | 9 | A |
abc | 2016 | 10 | C |
abc | 2016 | 11 | C |
abc | 2016 | 12 | C |
abc | 2015 | 2 | B |
abc | 2015 | 3 | B |
abc | 2015 | 4 | B |
abc | 2015 | 5 | B |
abc | 2015 | 6 | B |
abc | 2015 | 7 | B |
abc | 2015 | 8 | B |
ABC | 2015 | 9 | B |
Abc | 2015 | 10 | B |
ABC | 2015 | 11 | B |
abc | 2015 | 12 | B |
There are many ways to do this.
Here's one that's a little long, but the logic is clear.
Tom
/* Set up a reference date of first of the month */
data Have;
set Have;
TestDate = mdy(Month, 1, Year);
drop Year Month;
run;
/* Generate every first of the month from Jan 1 2014 to Dec 31 2017 */
data AllMonths;
do Year = 2014 to 2017;
do Month = 1 to 12;
DateToCheck = mdy(Month, 1, Year);
format DateToCheck date.;
keep DateToCheck;
output;
end;
end;
/* Join the two, to get a full list of months with Type */
proc sql noprint;
create table Want as
select * from Have h right join AllMonths a on(h.TestDate = a.DateToCheck)
order by a.DateToCheck;
quit;
/* Ripple the values of CompanyName and Type */
data Want;
set Want;
length OldCompanyName $8 OldType $1;
retain OldCompanyName OldType;
if ^missing(Type) then
OldType = Type;
else Type = OldType;
if ^missing(CompanyName) then
OldCompanyName = CompanyName;
else CompanyName = OldCompanyName;
drop OldCompanyName OldType TestDate;
run;
Post test data in the form of a datastep.
As such, this is just an example:
proc sort data=have out=years nodupkey; by year; run; data years; set years; do month=1 to 12; output; end; run; proc sql; create table WANT as select A.NAME, A.YEAR, B.MONTH, B.TYPE from HAVE A full join YEARS B on A.YEAR=B.YEAR and A.MONTH >= B.MONTH; quit;
There are many ways to do this.
Here's one that's a little long, but the logic is clear.
Tom
/* Set up a reference date of first of the month */
data Have;
set Have;
TestDate = mdy(Month, 1, Year);
drop Year Month;
run;
/* Generate every first of the month from Jan 1 2014 to Dec 31 2017 */
data AllMonths;
do Year = 2014 to 2017;
do Month = 1 to 12;
DateToCheck = mdy(Month, 1, Year);
format DateToCheck date.;
keep DateToCheck;
output;
end;
end;
/* Join the two, to get a full list of months with Type */
proc sql noprint;
create table Want as
select * from Have h right join AllMonths a on(h.TestDate = a.DateToCheck)
order by a.DateToCheck;
quit;
/* Ripple the values of CompanyName and Type */
data Want;
set Want;
length OldCompanyName $8 OldType $1;
retain OldCompanyName OldType;
if ^missing(Type) then
OldType = Type;
else Type = OldType;
if ^missing(CompanyName) then
OldCompanyName = CompanyName;
else CompanyName = OldCompanyName;
drop OldCompanyName OldType TestDate;
run;
hi.
data have;
input CompanyName $ Year month TYPE $;
datalines;
Abc 2016 1 A
Abc 2016 10 C
Abc 2015 2 B
;
proc sort data=have;
by companyname year month;
run;
data want(rename=(_month=month));
length _type _type1 $1;
do _n_=1 by 1 until(last.year);
set have;
by year ;
if first.year and last.year then
do;
do _month=month to 12;
output;
end;
end;
if _n_=1 and not last.year then
do;
call missing(_month,_type,_month1,_type1);
_month=month;
_type=type;
end;
if _n_>1 then
do;
_type1=type ;
do _month=_month to month-1;
_month1=month;
type=_type;
output;
end;
type=_type1;
output;
end;
if (_n_>1 and last.year and month<=12) or (first.year and last.year) then
do;
do _month=_month+1 to 12;
output;
end;
end;
end;
drop month _type _type1 _month1;
run;
Regards,
Naveen Srinivasan
It's a little trickier than that...
Instead of just having one list of "first of the month" dates, we need to have a list of these dates for each company name. Then, after the merge, we have to modify the ripple to reset for each new company.
I think this is pretty close: (not fully tested)
Tom
/* Set up a reference date of first of the month */
data Have;
set Have;
TestDate = mdy(Month, 1, Year);
format TestDate date9.;
drop Year Month;
run;
/* Get a list of unique company names */
proc sql noprint;
create table Names as
select distinct CompanyName from Have h;
quit;
/* Generate every first of the month from Jan 1 2014 to Dec 31 2017 */
data AllMonths;
do Year = 2014 to 2017;
do Month = 1 to 12;
DateToCheck = mdy(Month, 1, Year);
format DateToCheck date9.;
keep DateToCheck;
output;
end;
end;
/* Cartesian join to get a list of months and company names */
proc sql noprint;
create table AllMonthsNames as
select n.CompanyName, a.DateToCheck from Names n cross join AllMonths a;
quit;
/* Join the two, to get a full list of months with Type */
proc sql noprint;
create table Want as
select h.*, a.DateToCheck from Have h right join AllMonthsNames a on(h.TestDate = a.DateToCheck and h.CompanyName = a.CompanyName)
order by a.CompanyName, a.DateToCheck;
quit;
/* Ripple the values of CompanyName and Type */
data Want;
set Want;
length OldCompanyName $8 OldType $1 OldDate 8;
retain OldCompanyName OldType OldDate;
if _n_ = 1 then
OldDate = DateToCheck;
if DateToCheck < OldDate /* We've started on a new company */
then do;
OldDate = DateToCheck;
call missing(OldType, OldCompanyName);
OldDate = DateToCheck;
end;
else OldDate = DateToCheck;
if ^missing(Type) then
OldType = Type;
else Type = OldType;
if ^missing(CompanyName) then
OldCompanyName = CompanyName;
else CompanyName = OldCompanyName;
drop OldCompanyName OldType OldDate TestDate;
run;
Thank you for your acknowledgement. It 's almost 11:00pm at Chennai and I have had a few beers this evening and feeling so tired and knackered. I will look into your requirement and properly modify the code tomorrow at least for the sake of sharing knowledge. Meanwhile, I hope @TomKari 's code is useful for you. He is a genius and I have paid attention to many of his posts since I watched his interview on youtube few years ago.
Regards,
Naveen Srinivasan
@Kyojik Hi, Sorry for not being able to give you the much needed solution last night as I was awefully tired. Here you go, the below will work for multiple companies. Please do let me know should if it doesn't quite meet your requirement. It's actually easy. So feel free to reach out.
Regards,
Naveen Srinivasan
proc sort data=have;
by companyname year month;
run;
data want(rename=(_month=month));
do until(last.companyname);
do until(last.year);
set have;
by companyname year month;
length _type __type $1;
if first.companyname or first.year then call missing(_month,_type,__type);
if first.year and last.year and month=12 then
do;
_month=month;
output;
end;
else if (first.year and last.year) and month<12 then
do;
do _month=month to 12;
output;
end;
end;
else if (first.year and not last.year) and month<12 then
do;
_month=month;
_type=type;
end;
else if not first.year then
do;
__type=type;
do _month=_month to month-1;
type=_type;
output;
end;
_type=__type;
if last.year then do;
do _month=month to 12;
type=_type;
output;
end;
end;
end;
end;
end;
drop month _type __type;
run;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.