SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

data generation

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 10
Accepted Solution

data generation

[ Edited ]

Hello Friends. I have follwing problem. I hope there will be some easy way to do this in sas.

thanks in advance

 I want to create a variable on monthly basis but my data have inconsistent intervalas. Like if I have a value in decemebr 2015 as B and next value is on June 2015 as C and then January, 2016 as A. what I want to do is create a file in which I will have all the missing year and month columns with previous available value. Like

Data Type Have:

CompanyName Year                      month                 TYPE

Abc                        2016                       1                              A

ABC                        2016                       10                           C

Abc                        2015                       2                           B

 

WANT TYPE :

abc

2016

1

A

abc

2016

2

A

abc

2016

3

A

abc

2016

4

A

abc

2016

5

A

abc

2016

6

A

abc

2016

7

A

abc

2016

8

A

abc

2016

9

A

abc

2016

10

C

abc

2016

11

C

abc

2016

12

C

    

abc

2015

2

B

abc

2015

3

B

abc

2015

4

B

abc

2015

5

B

abc

2015

6

B

abc

2015

7

B

abc

2015

8

B

ABC

2015

9

B

Abc

2015

10

B

ABC

2015

11

B

abc

2015

12

B


Accepted Solutions
Solution
‎04-10-2017 12:07 PM
PROC Star
Posts: 1,090

Re: data generation

There are many ways to do this.

 

Here's one that's a little long, but the logic is clear.

 

Tom

 

/* Set up a reference date of first of the month */

data Have;

set Have;

TestDate = mdy(Month, 1, Year);

drop Year Month;

run;

/* Generate every first of the month from Jan 1 2014 to Dec 31 2017 */

data AllMonths;

do Year = 2014 to 2017;

do Month = 1 to 12;

DateToCheck = mdy(Month, 1, Year);

format DateToCheck date.;

keep DateToCheck;

output;

end;

end;

/* Join the two, to get a full list of months with Type */

proc sql noprint;

create table Want as

select * from Have h right join AllMonths a on(h.TestDate = a.DateToCheck)

order by a.DateToCheck;

quit;

/* Ripple the values of CompanyName and Type */

data Want;

set Want;

length OldCompanyName $8 OldType $1;

retain OldCompanyName OldType;

if ^missing(Type) then

OldType = Type;

else Type = OldType;

if ^missing(CompanyName) then

OldCompanyName = CompanyName;

else CompanyName = OldCompanyName;

drop OldCompanyName OldType TestDate;

run;

View solution in original post


All Replies
Super User
Super User
Posts: 7,392

Re: data generation

Post test data in the form of a datastep.  

 

As such, this is just an example:

proc sort data=have out=years nodupkey;
  by year;
run;

data years;
  set years;
  do month=1 to 12;
    output;
  end;
run;

proc sql;
  create table WANT as
  select A.NAME,
            A.YEAR,
            B.MONTH,
            B.TYPE
  from   HAVE A
  full join YEARS B
  on      A.YEAR=B.YEAR 
  and    A.MONTH >= B.MONTH;
quit;
Occasional Contributor
Posts: 10

Re: data generation

Thanks for your help unfortunately it was not what I was looking for. someone else solved this. Thanks a lot
Solution
‎04-10-2017 12:07 PM
PROC Star
Posts: 1,090

Re: data generation

There are many ways to do this.

 

Here's one that's a little long, but the logic is clear.

 

Tom

 

/* Set up a reference date of first of the month */

data Have;

set Have;

TestDate = mdy(Month, 1, Year);

drop Year Month;

run;

/* Generate every first of the month from Jan 1 2014 to Dec 31 2017 */

data AllMonths;

do Year = 2014 to 2017;

do Month = 1 to 12;

DateToCheck = mdy(Month, 1, Year);

format DateToCheck date.;

keep DateToCheck;

output;

end;

end;

/* Join the two, to get a full list of months with Type */

proc sql noprint;

create table Want as

select * from Have h right join AllMonths a on(h.TestDate = a.DateToCheck)

order by a.DateToCheck;

quit;

/* Ripple the values of CompanyName and Type */

data Want;

set Want;

length OldCompanyName $8 OldType $1;

retain OldCompanyName OldType;

if ^missing(Type) then

OldType = Type;

else Type = OldType;

if ^missing(CompanyName) then

OldCompanyName = CompanyName;

else CompanyName = OldCompanyName;

drop OldCompanyName OldType TestDate;

run;

Occasional Contributor
Posts: 10

Re: data generation

Thanks a lot you ace it.
PROC Star
Posts: 166

Re: data generation

[ Edited ]

 

 

hi.

 

data have;

input CompanyName $ Year                      month                 TYPE $;

datalines;

Abc                        2016                       1                              A

Abc                        2016                       10                           C

Abc                        2015                       2                           B

;

 

proc sort data=have;

by companyname year month;

run;

 

data want(rename=(_month=month));

length     _type _type1 $1;

do _n_=1 by 1 until(last.year);

  set have;

  by year ;

  if first.year and last.year then

   do;

      do _month=month to 12;

         output;

      end;

  end;

  if _n_=1 and not  last.year then

  do;

    call missing(_month,_type,_month1,_type1);

    _month=month;

    _type=type;

  end;

if _n_>1    then

    do;

     _type1=type     ;

        do _month=_month to month-1;

           _month1=month;

         type=_type;

          output;

       end;

         type=_type1;

         output;

    end;

 if  (_n_>1 and last.year and month<=12) or (first.year and last.year)  then

   do;

      do _month=_month+1 to 12;

         output;

      end;

  end;

end; 

drop month _type _type1 _month1;

run;

 

Regards,

Naveen Srinivasan

Occasional Contributor
Posts: 10

Re: data generation

Thank you for your help. This works perfect when there is one company. how can I do this with multiple companies ?
I tried to add company name in by statement but it's not working in this way.
thank you very much again.
PROC Star
Posts: 1,090

Re: data generation

It's a little trickier than that...

 

Instead of just having one list of "first of the month" dates, we need to have a list of these dates for each company name. Then, after the merge, we have to modify the ripple to reset for each new company.

 

I think this is pretty close: (not fully tested)

 

Tom

 

/* Set up a reference date of first of the month */

data Have;

set Have;

TestDate = mdy(Month, 1, Year);

format TestDate date9.;

drop Year Month;

run;

/* Get a list of unique company names */

proc sql noprint;

create table Names as

select distinct CompanyName from Have h;

quit;

/* Generate every first of the month from Jan 1 2014 to Dec 31 2017 */

data AllMonths;

do Year = 2014 to 2017;

do Month = 1 to 12;

DateToCheck = mdy(Month, 1, Year);

format DateToCheck date9.;

keep DateToCheck;

output;

end;

end;

/* Cartesian join to get a list of months and company names */

proc sql noprint;

create table AllMonthsNames as

select n.CompanyName, a.DateToCheck from Names n cross join AllMonths a;

quit;

/* Join the two, to get a full list of months with Type */

proc sql noprint;

create table Want as

select h.*, a.DateToCheck from Have h right join AllMonthsNames a on(h.TestDate = a.DateToCheck and h.CompanyName = a.CompanyName)

order by a.CompanyName, a.DateToCheck;

quit;

/* Ripple the values of CompanyName and Type */

data Want;

set Want;

length OldCompanyName $8 OldType $1 OldDate 8;

retain OldCompanyName OldType OldDate;

if _n_ = 1 then

OldDate = DateToCheck;

if DateToCheck < OldDate /* We've started on a new company */

then do;

OldDate = DateToCheck;

call missing(OldType, OldCompanyName);

OldDate = DateToCheck;

end;

else OldDate = DateToCheck;

if ^missing(Type) then

OldType = Type;

else Type = OldType;

if ^missing(CompanyName) then

OldCompanyName = CompanyName;

else CompanyName = OldCompanyName;

drop OldCompanyName OldType OldDate TestDate;

run;

PROC Star
Posts: 166

Re: data generation

Thank you for your acknowledgement. It 's almost 11:00pm at Chennai and I have had a few beers this evening and feeling so tired and knackered. I will look into your requirement and properly modify the code tomorrow at least for the sake of sharing knowledge. Meanwhile, I hope @TomKari 's code is useful for you. He is a genius and I have paid attention to many of his posts since I watched his interview on youtube few years ago. 

 

Regards,

Naveen Srinivasan

 

PROC Star
Posts: 166

Re: data generation

@Kyojik Hi, Sorry for not being able to give you the much needed solution last night as I was awefully tired. Here you go, the below will work for multiple companies. Please do let me know should if it doesn't quite meet your requirement. It's actually easy. So feel free to reach out.

 

Regards,

Naveen Srinivasan

 

proc sort data=have;

by companyname year month;

run;

 

data want(rename=(_month=month));

do until(last.companyname);

   do until(last.year);

      set have;

       by companyname year month;

       length _type __type $1;

       if first.companyname or first.year then call missing(_month,_type,__type);

       if first.year and last.year and month=12 then

         do;

              _month=month;

              output;

           end;

       else if (first.year and last.year) and month<12 then

         do;

              do _month=month to 12;

              output;

              end;

         end;

      else if (first.year and not last.year) and month<12 then

         do;

               _month=month;

                _type=type;

           end;

      else if not first.year then

         do;

              __type=type;

              do _month=_month to month-1;

              type=_type;

              output;

              end;

              _type=__type;

              if last.year then do;

              do _month=month to 12;

              type=_type;

              output;

              end;

           end;

      end;

  end;

end;

drop month _type __type;

run;

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 9 replies
  • 348 views
  • 4 likes
  • 4 in conversation