BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
michtka
Fluorite | Level 6

  Hi everyone, I got the next code that generate adv1 (a summary table):

  data new;
  length subjid leg_sort aeterm $20. trtrel 8.;
  input subjid leg_sort aeterm trtrel;
  datalines;
  100 drug drymouth  1
  101 drug drymouth  1
  102 placebo  drymount 0
  ;
  run;

proc sql noprint;
  create table adv1 as
  select count(distinct subjid) as n, leg_sort, 'Subjects evaluable for AE' as text
  from new
  where trtrel=1
  group by leg_sort;
quit;

my problem is that I would like to include the row placebo  with N= .

i.e, I  want to generate a dataset including all the possible values of leg_sort considering the values with missing data (placebo in my example).

Thnaks in advance.

V

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

OK, I understand now.

Then how about a union of the "where trtel"  where you count the subject id's with the "leg_sort eq 'placebo'" case with N specified as missing.  I only tested this SQL code with your 3 observations, but it worked in that case (I prefer proc summary).  I moved the respective WHERE clauses into data set name parameters so that I didn't have to figure out their influence as sql expressions.  SQL mavens might have a much more elegant way to do this.

proc sql;

  select count(distinct subjid) as n, leg_sort, 'Subjects Available for AE' as text

  from new (where=(trtrel eq 1))

  union

  select . as n, leg_sort, 'Subject Available for AE' as text

  from new (where=(leg_sort eq 'placebo'))

  group by leg_sort;

quit;

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

13 REPLIES 13
mkeintz
PROC Star

Dear Mitchka:

You have not demonstrated a problem of dealing with missing data.  You've shown the effect of using "where trtel=1".

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
michtka
Fluorite | Level 6

The option where is neccessary, because i am interested in the records trtrel=1,

I know that there is not record of Placebo, but instead of not showing it, I am interested to consider not missing data (it is wrong, sorry, my mistake in the tittle), I am interested to consider the placebo raw with 0 value.

Something like that:

leg_sort    N

drug         2

Placebo   0

art297
Opal | Level 21

There are more combinations than shown in your desired output.  Will you get what you want by using proc freq?  e.g.:

proc freq data=new;

  tables trtrel*leg_sort*aeterm;

run;

MikeZdeb
Rhodochrosite | Level 12

hi ... three PROCs support preloading a format ... MEANS (SUMMARY), TABULATE, REPORT ... that make groups in your output dependent on a format rather than on your data

so, you can use SUMMARY with one or more CLASS variables to make a data set of counts, for example ...

proc format;

value $leg_sort 'drug' = 'drug' 'placebo' = 'placebo';

run;

data new;

input subjid (leg_sort aeterm) (:$20.) trtrel;

datalines;

100 drug drymouth  1

101 drug drymouth  1

102 placebo  drymount 0

;

proc summary data=new nway completetypes;

class leg_sort / preloadfmt;

output out=counts (drop=_type_ rename=(_freq_ = N));

where trtrel eq 1;

format leg_sort $leg_sort.;

run;


data set COUNTS ...

Obs    leg_sort    N

1     drug        2

2     placebo     0

mkeintz
PROC Star

Could you not then just add "or leg_sort='placebo'" to your where clause?

As in:

   where trtel=1 or leg_sort='placebo'

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
michtka
Fluorite | Level 6

Thnak you mkeintz for making this dataset with proc summary,...but there one way to do the same using proc sql?

the reason I insisted with proc sql is because it is part of a macro, and the macro consider always 2 rows, this the reason i need to create that extra row of placebo with 0 values.

thnakls.

mkeintz
PROC Star

I did not demonstrate the proc summary - that was Mike Zdeb.  My suggestion was explicitly for your PROC SQL which currently has where trtel=1 and could easily be changed to where trtel=1 or leg_sort='placebo'

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
MikeZdeb
Rhodochrosite | Level 12

Hi ... yes, I suggested SUMMARY.  I don't think that the suggestion

where trtel=1 changed to where trtel=1 or leg_sort='placebo'


proc sql;

  select count(distinct subjid) as n, leg_sort, 'Subjects evaluable for AE' as text

  from new

  where trtrel eq 1 or leg_sort eq 'placebo'

  group by leg_sort;

quit;

n  leg_sort              text

2  drug                  Subjects evaluable for AE

1  placebo               Subjects evaluable for AE


does what you want in that you get a count of 1  for placebo since rather than the 0 that you want. 


Given my meager SQL knowledge, I cannot think of an easy SQL approach since you are asking PROC SQL to produce a count for something that is not in your data set.

That is why I suggested SUMMARY so you could have a format rather than your data dictate the groups that appear in your table (including zero counts).

michtka
Fluorite | Level 6

Yers, you right, this "or upcase(leg_sort)=PLACEBO" i am not interested in....

In the way of proc sql you are right too, I was expected someone to help me with some condition to consider all

the possible values of leg_sort.

I dont think there any option can cover it using proc sql....but i think we can create a final dataset considering this extra record placebo with 0 value.

michtka
Fluorite | Level 6

I'm sorry for my confussion...but still not right...because

abviously It generate a record with placebo n=1, and I am interesting in a dataset just as Mike Zdeb did (counts), but using proc sql.

data set COUNTS ...

Obs    leg_sort    N

1     drug        2

2     placebo     0

Astounding
PROC Star

Assuming you change the WHERE clause to add "or upcase(leg_sort)='PLACEBO'" you have two ways to fix the result.

You could always use a DATA step before printing to change N for Placebo to 0.  Alternatively, you could select a more complex statistic that doesn't require a subsequent DATA step.  You'll have to check the syntax on this because SQL is a secondary language for me:

select min( count(distinct subjid), sum(trtrel) ) as N,

Good luck.

mkeintz
PROC Star

OK, I understand now.

Then how about a union of the "where trtel"  where you count the subject id's with the "leg_sort eq 'placebo'" case with N specified as missing.  I only tested this SQL code with your 3 observations, but it worked in that case (I prefer proc summary).  I moved the respective WHERE clauses into data set name parameters so that I didn't have to figure out their influence as sql expressions.  SQL mavens might have a much more elegant way to do this.

proc sql;

  select count(distinct subjid) as n, leg_sort, 'Subjects Available for AE' as text

  from new (where=(trtrel eq 1))

  union

  select . as n, leg_sort, 'Subject Available for AE' as text

  from new (where=(leg_sort eq 'placebo'))

  group by leg_sort;

quit;

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
michtka
Fluorite | Level 6

Thanks mkeintz, this exactly like I wanted.

I need to add an if consition to your code, because not always n(placebo)=.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 13 replies
  • 2856 views
  • 7 likes
  • 5 in conversation