Solved: Re: summary proc sql (add a second subquery with different select dist...

michtka · Posted 06-21-2012 08:32 AM

Hi eaveryone, I

've got the next dataset have, my problem is that I want to make a second query to add 2 new variables pla_total and drug_total to my first query (5 variables).

But as difference for the first query (select distinct subno,trt,time) , this second query need to come from different select ( select distinct subno, trt) to summary

the right number of subjects 3, and 2, rather than 6 and 4, but I dont know how to do it.

data have;

length sex race trt time $20;

length subno weight height age 8;

input subno weight height sex race age trt time;

datalines;

                       1      63.7     141.85    Male      Caucasian     19    Drug     present
                       1      63.7     141.85    Male      Caucasian     19    Drug     past
                       2      72.6     149.22    Female    Black         27    Placebo present
                       2      72.6     149.22    Female    Black         27    Placebo past
                       3      60.7     147.64    Female    Caucasian     26    drug     present
                       3      60.7     147.64    Female    Caucasian     26    drug     past
                       4      64.2     150.93    Male      Caucasian     21    Placebo present
                       4      64.2     150.93    Male      Caucasian     21    Placebo past
                       5      60.6     153.56    Male      Asian         41    drug     present
                       5      60.6     153.56    Male      Asian         41    drug     past
                      run;

          proc sql;
          create table want as
          *first query;

          select 'n subjects' as col0,
          sum(upcase(trt)='DRUG' and upcase(time)='PAST') as drug_past,
          sum(upcase(trt)='DRUG' and upcase(time)='PRESENT') as drug_present,
          sum(upcase(trt)='PLACEBO' and upcase(time)='PAST') as pla_past,
          sum(upcase(trt)='PLACEBO' and upcase(time)='PRESENT') as pla_present,
          *second query, where need to be created from different select, not how it is at the moment;

          sum(upcase(trt)='DRUG') as drug_total,
          sum(upcase(trt)='PLACEBO') as pla_total
          from (select distinct subno,trt,time from have);
          quit;

Cheers,

V.

Alpay · Posted 06-21-2012 01:37 PM

You already have first part as a subquery. Do the same type of sub querying for the additional two variables and merge the results.

data have;

length sex race trt time $20;

length subno weight height age 8;

input subno weight height sex race age trt time;

datalines;

1 63.7 141.85 Male Caucasian 19 Drug present

1 63.7 141.85 Male Caucasian 19 Drug past

2 72.6 149.22 Female Black 27 Placebo present

2 72.6 149.22 Female Black 27 Placebo past

3 60.7 147.64 Female Caucasian 26 drug present

3 60.7 147.64 Female Caucasian 26 drug past

4 64.2 150.93 Male Caucasian 21 Placebo present

4 64.2 150.93 Male Caucasian 21 Placebo past

5 60.6 153.56 Male Asian 41 drug present

5 60.6 153.56 Male Asian 41 drug past

;

run;

proc sql;

create table want as

select 'n subjects' as col0,

*

from (

select

sum(upcase(trt)='DRUG' and upcase(time)='PAST') as drug_past,

sum(upcase(trt)='DRUG' and upcase(time)='PRESENT') as drug_present,

sum(upcase(trt)='PLACEBO' and upcase(time)='PAST') as pla_past,

sum(upcase(trt)='PLACEBO' and upcase(time)='PRESENT') as pla_present

from (select distinct subno,trt,time from have) ) ,

(select

sum(upcase(trt)='DRUG') as drug_total,

sum(upcase(trt)='PLACEBO') as pla_total

from (select distinct subno,trt from have)) ;

quit;

View solution in original post

Tom · Posted 06-21-2012 12:09 PM

In this simple case you could just add the two numbers you just calculated.

data have;

length subno weight height 8 sex race $20 age 8 trt time $20;

input subno weight height sex race age trt time;

datalines;

1 63.7 141.85 Male Caucasian 19 Drug present

1 63.7 141.85 Male Caucasian 19 Drug past

2 72.6 149.22 Female Black 27 Placebo present

2 72.6 149.22 Female Black 27 Placebo past

3 60.7 147.64 Female Caucasian 26 drug present

3 60.7 147.64 Female Caucasian 26 drug past

4 64.2 150.93 Male Caucasian 21 Placebo present

4 64.2 150.93 Male Caucasian 21 Placebo past

5 60.6 153.56 Male Asian 41 drug present

5 60.6 153.56 Male Asian 41 drug past

run;

proc sql;

create table want as

select

'n subjects' as col0

,sum(upcase(trt)='DRUG' and upcase(time)='PAST') as drug_past

,sum(upcase(trt)='DRUG' and upcase(time)='PRESENT') as drug_present

,sum(upcase(trt)='PLACEBO' and upcase(time)='PAST') as pla_past

,sum(upcase(trt)='PLACEBO' and upcase(time)='PRESENT') as pla_present

,calculated drug_past + calculated drug_present as drug_total

,calculated pla_past + calculated pla_present as pla_total

from (select distinct subno,trt,time from have)

;

quit;

data _null_;

set ;

put (_all_) (=/);

run;

Results:

col0=n subjects

drug_past=3

drug_present=3

pla_past=2

pla_present=2

drug_total=6

pla_total=4

michtka · Posted 06-21-2012 12:21 PM

Hi Tom, this is exactly like I dont want, I want just subject taking drug=3, and subject taking placebo=2, for this reason I have to play with

select distinct subno, trt, rather than subno, trt, time

Tom · Posted 06-21-2012 12:47 PM

Since you only want one row out you can just join the separate queries.

proc sql;

create table want as

select

'n subjects' as col0

,sum(trt='DRUG' and time='PAST') as drug_past

,sum(trt='DRUG' and time='PRESENT') as drug_present

,sum(trt='PLACEBO' and time='PAST') as pla_past

,sum(trt='PLACEBO' and time='PRESENT') as pla_present

,max(drug_total) as drug_total

,max(pla_total) as pla_total

from (select distinct subno,upcase(trt) as trt,upcase(time) as time from have)

, (select count(distinct subno) as drug_total from have where upcase(trt)='DRUG')

, (select count(distinct subno) as pla_total from have where upcase(trt)='PLACEBO')

;

quit;

col0=n subjects

drug_past=3

drug_present=3

pla_past=2

pla_present=2

drug_total=3

pla_total=2

You can also just skip SQL and use SAS procedures. Then you do not need to hard code values of TRT or TIME.

proc sql noprint ;

create view usubj as select distinct subno,trt,time from have;

quit;

proc summary data=usubj ;

class trt time ;

types trt trt*time;

output out=summary;

run;

proc transpose data=summary out=want1 (drop=_:) delimiter=_ ;

where _type_=3;

id trt time;

var _freq_;

run;

proc transpose data=summary out=want2 (drop=_:) suffix=_total ;

where _type_=2;

id trt;

var _freq_;

run;

data want;

merge want1 want2 ;

run;

michtka · Posted 06-21-2012 05:15 PM

Thank you Tom, I like you sql approach, it works well.

Just only, I am workin out the power of proc sql.

PGStats · Posted 06-21-2012 01:09 PM

How about :

   proc sql;
    create table uhave as
    select "n subjects" as col0, lowcase(trt) as utrt, lowcase(time) as utime, count(distinct subno) as n from have group by utrt, utime
    union all
    select "n subjects" as col0, lowcase(trt) as utrt, "total" as utime, count(distinct subno) from have group by utrt;

    proc transpose data=uhave out=want(drop=_:) delim=_;
    by col0;
    var n;
    id utrt utime;
    run;

PG

Alpay · Posted 06-21-2012 01:37 PM

You already have first part as a subquery. Do the same type of sub querying for the additional two variables and merge the results.

data have;

length sex race trt time $20;

length subno weight height age 8;

input subno weight height sex race age trt time;

datalines;

1 63.7 141.85 Male Caucasian 19 Drug present

1 63.7 141.85 Male Caucasian 19 Drug past

2 72.6 149.22 Female Black 27 Placebo present

2 72.6 149.22 Female Black 27 Placebo past

3 60.7 147.64 Female Caucasian 26 drug present

3 60.7 147.64 Female Caucasian 26 drug past

4 64.2 150.93 Male Caucasian 21 Placebo present

4 64.2 150.93 Male Caucasian 21 Placebo past

5 60.6 153.56 Male Asian 41 drug present

5 60.6 153.56 Male Asian 41 drug past

;

run;

proc sql;

create table want as

select 'n subjects' as col0,

*

from (

select

sum(upcase(trt)='DRUG' and upcase(time)='PAST') as drug_past,

sum(upcase(trt)='DRUG' and upcase(time)='PRESENT') as drug_present,

sum(upcase(trt)='PLACEBO' and upcase(time)='PAST') as pla_past,

sum(upcase(trt)='PLACEBO' and upcase(time)='PRESENT') as pla_present

from (select distinct subno,trt,time from have) ) ,

(select

sum(upcase(trt)='DRUG') as drug_total,

sum(upcase(trt)='PLACEBO') as pla_total

from (select distinct subno,trt from have)) ;

quit;

michtka · Posted 06-21-2012 05:18 PM

Alpay, thnak you very much, this is like I was looking for...the way to select variables from different select distinct, It works well,

but please, can you explain me the function of the asterisk after col0?

I will appreciate this, thanks.

Alpay · Posted 06-22-2012 05:24 AM

Asterisk will select all columns in underlying tables.

In this case, two sub queries in above sql script act as tables. One sub query returns 4 variables, and the other one returns 2 variables.

For more explanation, please refer to PROC SQL documentation in help file.

michtka · Posted 06-22-2012 05:28 AM

Thank you very much.

V.

summary proc sql (add a second subquery with different select distinct)

Re: summary proc sql (add a second subquery with different select distinct)

Re: summary proc sql (add a second subquery with different select distinct)

Re: summary proc sql (add a second subquery with different select distinct)

Re: summary proc sql (add a second subquery with different select distinct)

Re: summary proc sql (add a second subquery with different select distinct)

Re: summary proc sql (add a second subquery with different select distinct)

Re: summary proc sql (add a second subquery with different select distinct)

Re: summary proc sql (add a second subquery with different select distinct)

Re: summary proc sql (add a second subquery with different select distinct)

Re: summary proc sql (add a second subquery with different select distinct)

Registration is open

SAS Training: Just a Click Away