SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

Creating dummy variables based on dates

Accepted Solution Solved
Reply
Super Contributor
Posts: 441
Accepted Solution

Creating dummy variables based on dates

Hi,

suppose I have the following data:

Date
1/1/2010
23/6/2013
13/10/2008

What I would like to create are new dummy variables that will have a value of 1 for the corresponding year and 0 otherwise:

datedummy2010dummy2013dummy2008
1/1/2010100
23/6/2013010
13/10/2008001

Thank you


Accepted Solutions
Solution
‎01-13-2015 07:02 PM
Contributor
Posts: 57

Re: Creating dummy variables based on dates

There's a much easier way to do this without much coding.  Let SAS create a design or model matrix usling proc logisitc and the resulting model matrix will have the form you are looking for:

/*Recreate your data*/

data work.temp;

input mydate DDMMYY10.;

myyear=year(mydate);

cards;

1/1/2010

23/6/2013

13/10/2008

;

run;

/*This is all you need to do.  The work.xmatrix will contain your dummy coding*/

   proc logistic data=work.temp outdesign=xmatrix outdesignonly;;

      class myyear / param=glm;

      model myyear=myyear;

      run;

View solution in original post


All Replies
Regular Contributor
Posts: 180

Re: Creating dummy variables based on dates

Try this:

option missing=0;

data have;
  input date ddmmyy10.;
  format date ddmmyy10.;
  cards;
01/01/2010
23/06/2013
13/10/2008
;
run;
data temp;
  set have;
  year=year(date);
  N=1;
run;

proc transpose data=temp out=want(drop=_name_) prefix=dummy;
  by date notsorted;
  id year;
  var N;
run;

CTorres

Super Contributor
Posts: 441

Re: Creating dummy variables based on dates

Hi CTorres,

Thanks a lot for replying, I ran the code and it worked fine !!!

Staying on the same note, I would like to ask a small variation of my initial question please.

Suppose that instead of dates I had some categories and wanted to get the dummies for the categories, like the following:

ID

Category

Dummy A

Dummy B

1

A

1

0

2

B

0

1

Thank you!

SAS Employee
Posts: 340

Re: Creating dummy variables based on dates

Be carefull if you set  options missing=0  .

Those zeros in the "want: table are still missing values.

data want;

  set have;

  dummy2010=(year(date)=2010);

  dummy2013=(year(date)=2013);

  dummy2008=(year(date)=2008);

run;

Super Contributor
Posts: 441

Re: Creating dummy variables based on dates

Posted in reply to gergely_batho

Hi Gergely,

thank you for the reply, I ran the code and it worked good. But what if I have many categories for which I need to create corresponding dummies, and I don't know how many dummies there will be and what are they (and figuring this out and creating the dummies is tedious and time consuming) is there a way for SAS to do this for me instead of me actually creating dummy2010, dummy2013 etc?

Thank you

Super Contributor
Posts: 441

Re: Creating dummy variables based on dates

The code of CTorres actually finds what dummies need to be created, but when I have the same date appearing more than once, the corresponding dummy variable isn't created because I get an error message saying that there are more than one columns of the same name...

SAS Employee
Posts: 340

Re: Creating dummy variables based on dates

And a similar approach on your new question:

data want;

  set have;

  dummyA=(Category='A');

  dummyB=(Category='B');

run;

Regular Contributor
Posts: 180

Re: Creating dummy variables based on dates

Hi Ilikesas,

Although irrelevant at this point, the variation of my answer would be as follows:

option missing=0;

data have;
  input id category $;
  N=1;
  cards;
01 A 
02 B
;
run;

proc transpose data=have out=want(drop=_name_) prefix=dummy;
  by ID;
  id category;
  copy category;
  var N;
run;

CTorres

Super Contributor
Posts: 441

Re: Creating dummy variables based on dates

Hi CTorres, still thank you for replying because any additional knowledge is useful!

There is one problem when I use the transpose: if for example I have a date appearing several times then the transpose will give me an error because it is considering as if I want to create the same dummy column several times

Regular Contributor
Posts: 180

Re: Creating dummy variables based on dates

Hi Ilikesas,

Yes, maybe the PROC TRANSPOSE is not the rigth tool to solve your problem.

Anyway, you can use the LET option to "allows duplicate values of an ID variable. PROC TRANSPOSE transposes the observation that contains the last occurrence of a particular ID value within the data set or BY group.".


Check the following code witch includes two additional dates: one is duplicated and the other belongs to the same year. In this case the dates must be sorted. Maybe this is what you need:


data have;

  input date ddmmyy10.;

  format date ddmmyy10.;

  year=year(date);

  N=1;

  cards;

13/10/2008

01/01/2010

01/01/2010   <--- duplicated

23/06/2013

13/10/2013   <--- same year

;

run;

proc transpose data=have out=want(drop=_name_) let prefix=dummy;

  by date;

  id year;

  var N;

run;


SAS Log:


120  data have;
121    input date ddmmyy10.;
122    format date ddmmyy10.;
123    year=year(date);
124    N=1;
125    cards;

NOTE: The data set WORK.HAVE has 5 observations and 3 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds


131  ;
132  run;
133  proc transpose data=have out=want(drop=_name_) let prefix=dummy;
134    by date;
135    id year;
136    var N;
137  run;

WARNING: The ID value "dummy2010" occurs twice in the same BY group.
NOTE: The above message was for the following BY group:
      date=01/01/2010
NOTE: There were 5 observations read from the data set WORK.HAVE.
NOTE: The data set WORK.WANT has 4 observations and 4 variables.
NOTE: PROCEDURE TRANSPOSE used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

CTorres

Regular Contributor
Posts: 180

Re: Creating dummy variables based on dates

This is the result table:

          

datedummy2008dummy2010

dummy2013

113/10/2008100
201/01/2010010
323/06/2013001
413/10/2013001

CTorres

Super Contributor
Posts: 441

Re: Creating dummy variables based on dates

Hi CTorres,

thanks for still having an interest in the question.

In the code that you put above, the duplicated date was removed and as such the table has 4 observations instead of the initial 5, but the thing is that I still need the duplicated date because it corresponds to a different event, i.e, two events happened on that specific date.

Thank you

Solution
‎01-13-2015 07:02 PM
Contributor
Posts: 57

Re: Creating dummy variables based on dates

There's a much easier way to do this without much coding.  Let SAS create a design or model matrix usling proc logisitc and the resulting model matrix will have the form you are looking for:

/*Recreate your data*/

data work.temp;

input mydate DDMMYY10.;

myyear=year(mydate);

cards;

1/1/2010

23/6/2013

13/10/2008

;

run;

/*This is all you need to do.  The work.xmatrix will contain your dummy coding*/

   proc logistic data=work.temp outdesign=xmatrix outdesignonly;;

      class myyear / param=glm;

      model myyear=myyear;

      run;

Super Contributor
Posts: 441

Re: Creating dummy variables based on dates

Posted in reply to statistician13

Hi Statistician13,

thank you for your code, it combines what the two previous codes separately, namely it can both automatically create the dummy variables and it allows for the same date to be repeated.

In the output I get an intercept value, and I guess that this is due to the proc logistic, but I can drop it right?

Contributor
Posts: 57

Re: Creating dummy variables based on dates

Sure.  You can just use the drop statement:

   proc logistic data=work.temp outdesign=xmatrix(DROP=Intercept) outdesignonly;;

      class myyear / param=glm;

      model myyear=myyear;

      run;

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 14 replies
  • 2005 views
  • 6 likes
  • 4 in conversation