cb23_york
Obsidian | Level 7

I'm sure this is very basic, but I'm having a mental block and would appreciate some help.

Suppose I had data like this, which is a list of names, days, and fruit and veg consumption recorded by dummy variables:

Name    Day  Banana  Carrot  Apple
Bob     1    0       0       0
Bob     2    1       0       0
Bob     5    1       1       0
Claire  1    0       0       1
Claire  2    0       0       1
Claire  3    1       0       1
Claire  4    1       0       1

And I wished to produce a summary table like this

Name    Days  Banana  Carrot  Apple
Bob     3     0.67    0.33    0
Claire  4     0.5     0       1

What would be the best way to go about it? (Imagine I had 50 more types of fruit and veg consumption and didn't want to type in all their names.)

Many thanks, Chris

RW9
Diamond | Level 26

What does the summary table mean? Why does Bob have 3 days and 0.67, for instance? If you don't want to type each one, then use arrays and numeric suffix variables:

data tmp;
     array fruit{3} 8.;
     do i=1 to 3;
          ...
     end;
run;

cb23_york
Obsidian | Level 7

To clarify, the table should contain each individual's name, a column counting the number of distinct days for which we have an observation for that individual, and then columns giving the proportion of those days on which the individual consumed each fruit or veg. Bob has 3 data entries (days 1, 2 and 5) and consumed a banana on 66.7% of those days, a carrot on 33.3% of those days and an apple on 0% of those days. Hope that is clearer, thanks.
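Since each food column is a 0/1 dummy, the proportion described above is simply the mean of that column within each name. As a rough sketch only (this is not the approach taken in the replies, and the output name want_means is made up for illustration), PROC MEANS can compute this without listing the food variables, assuming DAY is the only numeric column that is not a food indicator:

```sas
/* Sample data copied from the question */
data have;
  input name $ day banana carrot apple;
  cards;
Bob     1 0 0 0
Bob     2 1 0 0
Bob     5 1 1 0
Claire  1 0 0 1
Claire  2 0 0 1
Claire  3 1 0 1
Claire  4 1 0 1
;
run;

/* Drop DAY so that _numeric_ covers only the food dummies.          */
/* _FREQ_ is the number of rows per name; it is renamed to DAYS.     */
/* mean= with no names keeps the original variable names.            */
proc means data=have(drop=day) noprint nway;
  class name;
  var _numeric_;
  output out=want_means(drop=_type_ rename=(_freq_=days)) mean=;
run;
```

Here DAYS counts rows per name, which equals the number of distinct days only when no day is duplicated for a name.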

Ksharp
Super User (accepted solution)

Are there any missing or duplicated days for a name?


data have;
input name $ day banana carrot apple;
cards;
Bob     1     0     0     0
Bob     2     1     0     0
Bob     5     1     1     0
Claire     1     0     0     1
Claire     2     0     0     1
Claire     3     1     0     1
Claire     4     1     0     1
;
run;
proc sql;
 /* Build one "sum(var)/count(*) as var" term for every column except NAME and DAY */
 select cat('sum(',strip(name),')/count(*) as ',strip(name)) into : list separated by ','
  from dictionary.columns
   where libname='WORK' and memname='HAVE' and upcase(name) not in ('NAME' 'DAY');
 
 /* One row per name: the row count plus the generated proportion columns */
 create table want as
  select name,count(*) as days,&list
   from have
    group by name;
quit;

Xia Keshan
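To make the macro step above easier to follow, here is a hand-written sketch of what the second query should be equivalent to once &list has been substituted for the three-column example. The table name want_expanded is made up for illustration. With 50 more columns, the dictionary.columns query generates these terms instead of you typing them.

```sas
/* Sample data copied from the post above */
data have;
  input name $ day banana carrot apple;
  cards;
Bob     1 0 0 0
Bob     2 1 0 0
Bob     5 1 1 0
Claire  1 0 0 1
Claire  2 0 0 1
Claire  3 1 0 1
Claire  4 1 0 1
;
run;

proc sql;
  /* Each generated term has the form sum(var)/count(*) as var */
  create table want_expanded as
  select name,
         count(*)             as days,
         sum(banana)/count(*) as banana,
         sum(carrot)/count(*) as carrot,
         sum(apple)/count(*)  as apple
    from have
    group by name;
quit;
```

Adding a line such as %put &list; after the first SELECT in the original code writes the generated text to the log, which is a quick way to check it before it is used.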

cb23_york
Obsidian | Level 7

That's fantastic Xia, much appreciated. In answer to your question, there are indeed some missing days, but never any duplication of days for a name.

Ksharp
Super User

So do you want to count the missing days or not?

If you don't want to count missing days, then change it to:

proc sql;
 select cat('sum(',strip(name),')/count(day) as ',strip(name)) into : list separated by ','
  from dictionary.columns
   where libname='WORK' and memname='HAVE' and upcase(name) not in ('NAME' 'DAY');

 create table want as
  select name,count(day) as days,&list
   from have
    group by name;
quit;

Xia Keshan

cb23_york
Obsidian | Level 7

Thanks Xia,

I do indeed not want to count the missing days, but the first code works fine as well. In the example Bob is missing days 3 and 4, but the original code correctly counts that he has 3 days of observed data.

Ksharp
Super User

No, I mean rows where the day value itself is missing, such as:

Bob     .     0     0     0
Bob     .     1     0     0
Bob     5     1     1     0
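For rows like these, a small sketch of how count(*) and count(day) differ. The data set names have_miss, compare and want_filtered are made up for illustration. In a term of the form sum(x)/count(day), the numerator still adds up the rows whose day is missing, so if those rows should be ignored entirely, filtering them out with a WHERE clause is one possible alternative. That is an assumption about the intent, not something stated in the thread.

```sas
/* Hypothetical data: two of Bob's rows have a missing day */
data have_miss;
  input name $ day banana carrot apple;
  cards;
Bob . 0 0 0
Bob . 1 0 0
Bob 5 1 1 0
;
run;

proc sql;
  /* count(*) counts every row; count(day) counts only non-missing days */
  create table compare as
  select name,
         count(*)   as rows_all,
         count(day) as days_nonmissing
    from have_miss
    group by name;

  /* Exclude missing-day rows from numerator and denominator alike */
  create table want_filtered as
  select name,
         count(*)             as days,
         sum(banana)/count(*) as banana,
         sum(carrot)/count(*) as carrot,
         sum(apple)/count(*)  as apple
    from have_miss
    where day is not null
    group by name;
quit;
```

For Bob, compare holds rows_all=3 and days_nonmissing=1, and want_filtered is computed from the single day-5 row only.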

cb23_york
Obsidian | Level 7

OK, that's clear. No, there are no observations with a missing day value. But thank you for the amended code, which would work under such circumstances.
