BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
SAS_ter
Calcite | Level 5

Hello All,


Would someone be able to offer advice on how I can do the following:


I would like to take the following multiple records (in red) and make a single record for every study_numb. I would like to do this in order to proc freq how many people have a (1) on day 1, day 2, day 3 etc. I think that I need to make them single records in order not to double count.

Thanks!


      study_  admission  discharge date of     Patient                                                     

   Obs numb   date       date      death       Type      nlos   day1  day2 day3

     1 10     31AUG2009 03SEP2009 05SEP2009       2      3      0      0   0

     2 11     10AUG2009 13AUG2009 13AUG2009       2      3      1      0   0

     3 12     28JUL2009 14AUG2009 28AUG2009       2      17     0      1   1

     4 12     14AUG2009 28AUG2009 28AUG2009       2      14     1      0   0

     5 13     12AUG2009 21AUG2009 28AUG2009       2      9      0      1   0

     6 14     28JUN2009 29JUL2009 29JUL2009       2      31     1      1   1

     7 15     10AUG2009 11AUG2009 17AUG2009       2      1      0      0   0

     8 15     15AUG2009 17AUG2009 17AUG2009       2      2      1      0   0

     9 16     17AUG2009 20AUG2009 20AUG2009       2      3      1      0   0

1 ACCEPTED SOLUTION

Accepted Solutions
Vince28_Statcan
Quartz | Level 8

Hi,

I'm not 100% sure to seize what all you intend to do but assuming day1_hosp, day15_hosp day25_hosp are the only columns you will want summaries by, here's how I would do it:

proc sql;

     create table want as

     select study_numb, sum(day1_hosp) as day1_hosp, sum(day15_hosp) as day15_hosp, sum(day25_hosp) as day25_hosp

     from have

     group by study_numb;

     /* actual counts for day1, day15, day25 */

    select sum(day1_hosp), sum(day15_hosp), sum(day25_hosp)

     from want;

quit;

The second small querry will give you the count for each variable assuming your data has no defunct and the columns only have 0s and 1s.

*Edit - If it is possible for your data to have 3 or more records from a single individual such that the sum > 1 on a given day#_hosp, then you might want to replace the sum function in the first query by a max function so that the later "sum" counts individuals only once and not however often they have records showing as being present on a given day.

Vincent

View solution in original post

10 REPLIES 10
AncaTilea
Pyrite | Level 9

Hi.

If want you want is to count the study_numb you can do this in several ways:

data want;

     set have; *your data set;

     by study_numb;

     if first.study_numb;

run;

OR

proc sql;

     select count(distinct study_numb) as your_count

     from your_data;

quit;

*this should display the count in your output window.

I hope it helps.

Anca.

SAS_ter
Calcite | Level 5

Thank you for the response.

I don't want to count the study_numb. I want to know the total count for day1_hosp for each study_numb. So, how many people were in hospital on day1. does this make sense?

Thanks,

N.

Amir
PROC Star

Hi,

What results would you expect to get for the sample data you have provided. That might help people have a clearer understanding of the requirements.

Regards,

Amir.

Vince28_Statcan
Quartz | Level 8

Hi,

I'm not 100% sure to seize what all you intend to do but assuming day1_hosp, day15_hosp day25_hosp are the only columns you will want summaries by, here's how I would do it:

proc sql;

     create table want as

     select study_numb, sum(day1_hosp) as day1_hosp, sum(day15_hosp) as day15_hosp, sum(day25_hosp) as day25_hosp

     from have

     group by study_numb;

     /* actual counts for day1, day15, day25 */

    select sum(day1_hosp), sum(day15_hosp), sum(day25_hosp)

     from want;

quit;

The second small querry will give you the count for each variable assuming your data has no defunct and the columns only have 0s and 1s.

*Edit - If it is possible for your data to have 3 or more records from a single individual such that the sum > 1 on a given day#_hosp, then you might want to replace the sum function in the first query by a max function so that the later "sum" counts individuals only once and not however often they have records showing as being present on a given day.

Vincent

SAS_ter
Calcite | Level 5

Thanks Vince. Will give this a try.

Yes, I only want summaries for the day variables. For example, how many people were in hospital on day1 etc. and yes, the column's only have 0s and 1s.

SAS_ter
Calcite | Level 5

Hi Vince,

Question: when you say "have" and "want" below (bold),  do I replace those with the same data names ? namely, the data set I am working from? or how does it work?

also, when you mean "first query" are you referring to the "sum" in the first paragraph and by "second query" are you referring to the sum in the second paragraph?

How does using the CREATE TABLE AS statement affect data integrity?

Thanks,

N.

proc sql;

     create table want as

     select study_numb, sum(day1_hosp) as day1_hosp, sum(day15_hosp) as day15_hosp, sum(day25_hosp) as day25_hosp

     from have

     group by study_numb;

     /* actual counts for day1, day15, day25 */

    select sum(day1_hosp), sum(day15_hosp), sum(day25_hosp)

     from want;

quit;

Vince28_Statcan
Quartz | Level 8


Yes, sorry. Have and Want appear to be the commonly used names for datasets examples throughout this forum.

Have would be the libref.dataset of your data

The first SQL query creates the table WANT which is supposed to give you what you've asked. The second query is merely running the count statistics that you also want from the dataset created.

So you will want to replace the one occurence of have by your current dataset with the syntax libref.datasetname and replace BOTH occurences of want with the libref.newdatasetname you want to give it.

Vince

SAS_ter
Calcite | Level 5

Yes, I'm a newbie and didn't realize that.

oh yes, sorry. I renamed WANT in my program and then didn't read it properly when I was writing the message.

Thanks again,

It worked!

AncaTilea
Pyrite | Level 9

sorry, for some reason I dind't read the entire question.

would this work for you?

proc means data = your_data sum;

  var hosp:; **I am assuming that you've coded 0/1 the subject_numb that meet the criteria for being hospitalized on day 1...n;

  ods output summary = sums;

run;

:smileyconfused:

Tom
Super User Tom
Super User

Can't you just brute force it?  I am not sure what you day numbers are, but let's do it for actual date and then you can try modifying it to use study day instead.

data by_day / view=by_day ;

  set have ;

  do day = addate to sepdate ;

    output;

  end;

format day date9.;

keep study_num day ;

run;

proc summary data=by_day nway ;

  class day ;

  output out=want ;

run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 1141 views
  • 0 likes
  • 5 in conversation