Help using Base SAS procedures

Interesting Query

Reply
Occasional Contributor
Posts: 6

Interesting Query

Hello guys

I have an interesting query. I have a dataset with certain columns named as start date, end date, type, item, For instance the toy data set looks as:

start _date    end_date          type   category

1/1/2011       5/1/2011          A       1945

16/1/2011      30/1/2011        B      1945

2/2/2011        3/2/2011         c        1945

4/2/2011       5/2/2011          A       1945

16/1/2011      30/1/2011        B        1945

17/1/2011        25/1/2011       B        1945

Now basically this information I have for entire year up to December. Now I want to extract the information from the data set in vertical manner so that my output dataset looks like the following:

date                           Type

1/1/2011                     A             1945

2/1/2011                     A            1945

3/1/2011                     A            1945

4/1/2011                     A            1945

5/1/2011                     A            1945

so on

16/1/2011                    B               1945

17/1/2011                    B               1945

18/1/2011                    B               1945

19/1/2011                    B                1945

so on up to

30/1/2011                    B                1945

2/2/2011                       C

3/2/2011                       C

4/2/2011                       A

5/2/2011                       A

So essentially the tasks are two fold:

Firstly the information given in  range(start date :end date) should appear row wise with each date corresponding to one row.

Secondly, corresponding to input data set, I want to get the value of type on that particular day. For instance if for the range 16/1/2011-30/1/2011 type has value B then in the output data set, each row corresponding to one single date should as entry B

I want to automate the above process as the input data set is enormous, so basically if one can help me with any code/macro which would automatically accomplish the above task, that would be highly appreciated.

Thanks

Ravinder

Super User
Posts: 5,518

Re: Interesting Query

This should be straightforward, and won't involve macro language.  Assuming that your date variables are actual SAS dates, you could try:

data want;

  set have;

  do date = start_date to end_date;

      output;

  end;

  keep date type category;

  format date ddmmyy10.;

run;

Good luck.

Occasional Contributor
Posts: 6

Re: Interesting Query

Thanks For your response. That worked perfectly. Now I have one more tricky task. Once we have the data in following format

Date type category

1/2/2011 A  1945

ate                           Type

1/1/2011                     A             1945

2/1/2011                     A            1945

3/1/2011                     A            1945

4/1/2011                     A            1945

5/1/2011                     A            1945

so on

16/1/2011                    B               1945

17/1/2011                    B               1945

18/1/2011                    B               1945

19/1/2011                    B                1945

now for each type we need to create separate variable such as 1 occurs where it is present and zero occurs where it is absent...

date                                  Category              

1/1/2011                                1945   TypeA     TypeB

2/1/2011                                1945    1            0

3/1/2011                                1945     1            0

4/1/2011                                1945    1             0

so on...

16/1/2011                                1945   0    1

17/1/2011                                1945   0     1

So basically for each type column value we need to create new  binary columns having one for that row in which that particular value occurs.

Thanks

Ravinder

Occasional Contributor
Posts: 6

Re: Interesting Query

Also due to size of the dataset we cannot adhere to use of IF/else conditional processing. We need to automate in some other way.

Super User
Posts: 10,046

Re: Interesting Query

Just add two statements into Astounding's code .

data want;

  set have;

  do date = start_date to end_date;

   typeA=(type='A');

typeB=(type='B');

      output;

  end;

  keep date type category;

  format date ddmmyy10.;

run;

Ksharp

Occasional Contributor
Posts: 6

Re: Interesting Query

But in actual data set, there are about 150 type values(that too very long descriptions, type is just toy name to explain here), I cannot include typeA=(type='A') for each of them. It would be very tedious.

Thanks

Ravi

Super User
Posts: 11,343

Re: Interesting Query

Will this list of names change in the future?

Occasional Contributor
Posts: 6

Re: Interesting Query

no for now it is static only...150 different type or descriptor variables

Respected Advisor
Posts: 3,156

Re: Interesting Query

Try this array() approach, it is a  variation base on  Astounding and Ksharp's suggestion.

data have;

infile cards truncover;

input (start_date end_date) (:ddmmyy10.) (type category) (:$);

format start_date end_date ddmmyy10.;

cards;

1/1/2011 5/1/2011 A 1945

16/1/2011 30/1/2011 B 1945

2/2/2011 3/2/2011 c 1945

4/2/2011 5/2/2011 A 1945

16/1/2011 30/1/2011 B 1945

17/1/2011 25/1/2011 B 1945

;

proc sql noprint;

  select distinct cats('type','_', type) into :type separated by ' ' from have;

  select cats(count(distinct type)) into :ct from have;

  quit;

data want;

  set have;

  array tp(&ct) &type.;

  do date = start_date to end_date;

  do i=1 to dim(tp);

tp(i)=(scan(vname(tp(i)),-1,'_')=type);

end;

  output;

  end;

  keep date type: category;

  format date ddmmyy10.;

run;

Haikuo

Super User
Posts: 10,046

Re: Interesting Query

Yeah. HaiKuo have already give you answer.

Here is my version:

data have;
infile cards truncover;
input (start_date end_date) (:ddmmyy10.) (type category) (:$);
format start_date end_date ddmmyy10.;
cards;
1/1/2011 5/1/2011 A 1945
16/1/2011 30/1/2011 B 1945
2/2/2011 3/2/2011 c 1945
4/2/2011 5/2/2011 A 1945
16/1/2011 30/1/2011 B 1945
17/1/2011 25/1/2011 B 1945
;
run;
proc sql noprint;

  select distinct cats('type',type,'=(type="',type,'") ;') into :type separated by ' ' from have;
  quit;

  data want;
  set have;
  do date = start_date to end_date;
      &type 
      output;
  end;
  format date ddmmyy10.;
run;

Ksharp

Ask a Question
Discussion stats
  • 9 replies
  • 459 views
  • 1 like
  • 5 in conversation