DATA Step, Macro, Functions and more

Help in my sql code

Reply
Super Contributor
Posts: 325

Help in my sql code

Dear,

 

Some one already helped me in my code.

 

Link  https://communities.sas.com/t5/Base-SAS-Programming/Help-in-Proc-sql-code/m-p/460346#M117003

 

In order to get the output i need to do proc transpose after proc sql step. After the proc transpose step i had to do a data step to rename the variable name "NAME OF FORMER VARIABLE" to "Param" and I had to delete col 2 to col 4. Is there anything i can do to modify my sql code to get the output i need without additional transpose and data step. Thank you

data one;
input id trt$ date $18.;
DATALINES;
1 A 2014-04-26T13:10
1 A 2014-05-17T11:30
1 A 2014-06-07T11:45
1 A 2014-06-28T13:30
1 B 2014-04-26T13:10
1 B 2014-05-17T11:30
1 B 2014-06-07T11:45
1 B 2014-06-28T13:30
2 A 2014-04-26T13:10
2 A 2014-05-17T11:30
2 A 2014-06-07T11:45
2 A 2014-06-28T13:30
2 B 2014-04-26T13:10
2 B 2014-05-17T11:30
2 B 2014-06-07T11:45
2 B 2014-06-28T13:30
;
PROC SQL;
  CREATE TABLE TWO AS
    SELECT id,trt,
           input(date,is8601dt.) as exstdt format datetime18.,
           count(id) as numberofrecords,
           intck('week',min(datepart(calculated exstdt)),
              max(datepart(calculated exstdt)),'c') as durationinweeks,
           calculated numberofrecords/(round(calculated durationinweeks/3)+1) as compliance
      from one
        group by id,trt         
  ;
QUIT;

proc transpose data=two out=three;
by id trt;
var numberofrecords durationinweeks compliance;
run;

output needed:


id      trt      param                  aval
1      A    numberofrecords       4
1     A      durartioninweeks       9
1      A          compliance           1
1      B    numberofrecords        4
1      B        durartioninweeks     9
1      B         compliance              1
2      A       numberofrecords       4
2     A         durartioninweeks        9
2      A         compliance                 1
2      B          numberofrecords      4
2      B         durartioninweeks         9 
2     B             compliance              1

 

Valued Guide
Posts: 540

Re: Help in my sql code

Posted in reply to knveraraju91

You can add rename and drop options to the output dataset of your proc transpose:

 

proc transpose data=two out=three(rename=(_NAME_=param col1=aval) drop=col2-col4);
	by id trt;
	var numberofrecords durationinweeks compliance;
run;

 

As far as coding transpose in SQL: I wish anyone good luck. Many have tried and most (all?) have failed. You could consider a datastep but that would not necessarily improve your program as a whole. But it can be done.

 

Hope this helps,

- Jan.

Frequent Contributor
Posts: 142

Re: Help in my sql code

Posted in reply to knveraraju91

Hi,

 

I don't think its a bad idea to transpose as you want , but keeping In mind that you are deriving the required data in sql step, it would need additional step to get the required format i.e. transpose . here you can eliminate data step by using options.

 

proc transpose data=two out=three(rename=(_name_=param col1=aval) drop= col2 col3 col4);

by id trt;

var numberofrecords durationinweeks compliance;

run;

 

Thanks,

Vish

Occasional Contributor
Posts: 7

Re: Help in my sql code

[ Edited ]
Posted in reply to knveraraju91

Hi,

 

By "to get the output i need without additional transpose and data step", I assumed you expected to get the output within two steps of Data Step and/or PROC SQL. Basically, to achieve the result similar to that of PROC TRANSPOSE, the statement UNION is recommended in PROC SQL. Please see the example program below:

data one;
   input id trt $ date ymddttm16.;
   format date datetime18.;
   datalines;
1 A 2014-04-26T13:10
1 A 2014-05-17T11:30
1 A 2014-06-07T11:45
1 A 2014-06-28T13:30
1 B 2014-04-26T13:10
1 B 2014-05-17T11:30
1 B 2014-06-07T11:45
1 B 2014-06-28T13:30
2 A 2014-04-26T13:10
2 A 2014-05-17T11:30
2 A 2014-06-07T11:45
2 A 2014-06-28T13:30
2 B 2014-04-26T13:10
2 B 2014-05-17T11:30
2 B 2014-06-07T11:45
2 B 2014-06-28T13:30
   ;
run;

proc sql;
   create table two as
   select distinct id, trt, 'numberofrecords' as param length=32, 1 as paramn,
      count(*) as aval
   from one
   group by id, trt
   union
   select distinct id, trt, 'durationinweeks', 2,
      intck('week', min(datepart(date)), max(datepart(date)), 'c')
   from one
   group by id, trt
   select distinct id, trt, 'compliance', 3,
      count(*)/(round(intck('week', min(datepart(date)), max(datepart(date)), 'c')/3)+1)
   from one
   group by id, trt
   order by id, trt, paramn;
quit;

There is, of course, an obvious downside in this two-step programming, which is redundancy, e.g. the GROUP BY statement; and the CALCULATED statement would be no longer applicable since "number of records" and "duration in weeks" were not defined in the same SELECT statement any more. In addition, if the definition of variable PARAMN was skipped, observations in data set TWO would be sorted by ID, TRT, and PARAM alphabetically.

 

To avoid such redundancy, I'm afraid a third step of PROC TRANSPOSE is necessary. You can apply the NAME= option, along with the data set option RENAME= for the DATA= option in PROC TRANSPOSE statement, something similar to the following, so that a fourth Data Step can be skipped:

proc transpose out=three( rename=(col1=aval)) name=param;
   var numberofrecords durationinweeks compliance;
   by id trt;
quit;

To make sure COL2, COL3, and COL4 will not be generated, it is essential to apply SELECT DISTINCT, instead of SELECT statement in PROC SQL.

 

P.S. I took the liberty to modify the programming of Data Step ONE so that variable DATE is defined as numeric directly.

 

Hope this would Help!

Ask a Question
Discussion stats
  • 3 replies
  • 138 views
  • 4 likes
  • 4 in conversation