<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: calculating mean by id and year in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/calculating-mean-by-id-and-year/m-p/914518#M360366</link>
    <description>Hi! &lt;BR /&gt;&lt;BR /&gt;Thank you for your reply. &lt;BR /&gt;&lt;BR /&gt;The reason that I want a wide format is that I need to use proc traj to do trajectory analysis which requires this kind of dataset. In my real data, I only have data for 5 years, so even using wide format my desired dataset won't look too messy..i guess</description>
    <pubDate>Mon, 05 Feb 2024 15:25:06 GMT</pubDate>
    <dc:creator>zihdonv19</dc:creator>
    <dc:date>2024-02-05T15:25:06Z</dc:date>
    <item>
      <title>calculating mean by id and year</title>
      <link>https://communities.sas.com/t5/SAS-Programming/calculating-mean-by-id-and-year/m-p/914042#M360190</link>
      <description>&lt;P&gt;I have a dataset like this:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data your_dataset;
    input id year weight seq;
    datalines;
1 2000 12 0
1 2005 14 1
1 2006 23 2
2 2001 12 0
2 2001 45 1
2 2001 42 2
2 2002 75 3
2 2002 45 4
3 2003 97 0
3 2004 90 1
4 2010 56 0
4 2011 38 1
4 2012 91 2
4 2012 31 3
4 2014 16 4
5 2013 46 0
5 2013 46 1
5 2014 55 2
5 2015 64 3
5 2016 26 4
5 2017 36 5
5 2018 87 6
5 2019 46 7
;
run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;This is what I would like to get:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data your_dataset;
    input id time0 time1 time2 time3 time4 time5 time6;
    datalines;
1 12 14 23 . . . .
2 33 60 . . . . .
3 97 90 . . . . .
4 56 38 61 16 . . .
5 46 55 64 26 36 87 46
;
run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The rule is:&lt;/P&gt;
&lt;P&gt;for people with same id and year, I only want the mean weight of this year. For example, for id2 with year 2001, the value in the desired table is 33, because (12+45+42)/3=33. Due to 2001 is the year that id2 had the first weight record, it is located in the colomn "time0" in the desired table.&lt;/P&gt;</description>
      <pubDate>Thu, 01 Feb 2024 16:28:55 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/calculating-mean-by-id-and-year/m-p/914042#M360190</guid>
      <dc:creator>zihdonv19</dc:creator>
      <dc:date>2024-02-01T16:28:55Z</dc:date>
    </item>
    <item>
      <title>Re: calculating mean by id and year</title>
      <link>https://communities.sas.com/t5/SAS-Programming/calculating-mean-by-id-and-year/m-p/914045#M360193</link>
      <description>&lt;P&gt;A big question is do you actually need a data set? A data set in that "wide" format is hard to use for almost any purpose.&lt;/P&gt;
&lt;P&gt;But a Report is actually fairly easy, and easier to understand.&lt;/P&gt;
&lt;P&gt;One quick example:&lt;/P&gt;
&lt;PRE&gt;proc tabulate data= your_dataset;
   class id year;
   var weight;
   table id,
         year*weight*mean
   ;
run;
&lt;/PRE&gt;
&lt;P&gt;Personally I find your use of Time0 etc very confusing as shown in your desired set as it completely hides details like for Id=1 there is a multiyear gap between time0 and time1.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;There are lots of appearance option changes that can be made to the tabulate code, or alternately Proc Report will do similar.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You would have to convince me that there is a real advantage to that desired data set before actually making one.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 01 Feb 2024 17:20:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/calculating-mean-by-id-and-year/m-p/914045#M360193</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2024-02-01T17:20:58Z</dc:date>
    </item>
    <item>
      <title>Re: calculating mean by id and year</title>
      <link>https://communities.sas.com/t5/SAS-Programming/calculating-mean-by-id-and-year/m-p/914047#M360195</link>
      <description>&lt;P&gt;First, thankk you for providing the data as working SAS data step code. You make our job much easier, and thus you get a faster and more correct answer. Along those lines, there is no point in attaching data in a Microsoft Office document, as many of us will not download this attachment as it could be a security threat.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The first step is to create a sequence number from the original data. This is done in data set HAVE2. Once that is done, PROC REPORT does what PROC REPORT does.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
    input id year weight seq;
    datalines;
1 2000 12 0
1 2005 14 1
1 2006 23 2
2 2001 12 0
2 2001 45 1
2 2001 42 2
2 2002 75 3
2 2002 45 4
3 2003 97 0
3 2004 90 1
4 2010 56 0
4 2011 38 1
4 2012 91 2
4 2012 31 3
4 2014 16 4
5 2013 46 0
5 2013 46 1
5 2014 55 2
5 2015 64 3
5 2016 26 4
5 2017 36 5
5 2018 87 6
5 2019 46 7
;

data have2;
    set have;
    by id year;
    if first.id then sequence=0;
    if first.year and not first.id then sequence+1;
run;

proc report data=have2;
    columns id sequence,weight;
    define id/group;
    define sequence/across 'Time';
    define weight/mean;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 01 Feb 2024 17:25:27 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/calculating-mean-by-id-and-year/m-p/914047#M360195</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2024-02-01T17:25:27Z</dc:date>
    </item>
    <item>
      <title>Re: calculating mean by id and year</title>
      <link>https://communities.sas.com/t5/SAS-Programming/calculating-mean-by-id-and-year/m-p/914140#M360220</link>
      <description>&lt;P&gt;It might be better to number by the number of years since the first year for that ID.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have2;
  set have;
  by id year;
  if first.id then baseyr = year;
  retain baseyr ;
  sequence = year - baseyr;
run;

proc report data=have2;
  columns id baseyr weight,sequence;
  define id/group;
  define baseyr / group;
  define sequence/across 'Time';
  define weight/mean ' ';
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;PRE&gt;                                                  Time
id     baseyr          0          1          2          3          4          5          6
 1       2000         12          .          .          .          .         14         23
 2       2001         33         60          .          .          .          .          .
 3       2003         97         90          .          .          .          .          .
 4       2010         56         38         61          .         16          .          .
 5       2013         46         55         64         26         36         87         46
&lt;/PRE&gt;</description>
      <pubDate>Fri, 02 Feb 2024 00:03:23 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/calculating-mean-by-id-and-year/m-p/914140#M360220</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2024-02-02T00:03:23Z</dc:date>
    </item>
    <item>
      <title>Re: calculating mean by id and year</title>
      <link>https://communities.sas.com/t5/SAS-Programming/calculating-mean-by-id-and-year/m-p/914518#M360366</link>
      <description>Hi! &lt;BR /&gt;&lt;BR /&gt;Thank you for your reply. &lt;BR /&gt;&lt;BR /&gt;The reason that I want a wide format is that I need to use proc traj to do trajectory analysis which requires this kind of dataset. In my real data, I only have data for 5 years, so even using wide format my desired dataset won't look too messy..i guess</description>
      <pubDate>Mon, 05 Feb 2024 15:25:06 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/calculating-mean-by-id-and-year/m-p/914518#M360366</guid>
      <dc:creator>zihdonv19</dc:creator>
      <dc:date>2024-02-05T15:25:06Z</dc:date>
    </item>
    <item>
      <title>Re: calculating mean by id and year</title>
      <link>https://communities.sas.com/t5/SAS-Programming/calculating-mean-by-id-and-year/m-p/914529#M360375</link>
      <description>&lt;P&gt;Once you have the sequence variable you can use it with the ID statement of PROC TRANSPOSE.&amp;nbsp; Use the PREFIX= option to set the base name of the new variables.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc transpose data=have out=want(drop=_name_) prefix=YEAR ;
  by id;
  id sequence ;
  var xxx ;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Mon, 05 Feb 2024 15:56:20 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/calculating-mean-by-id-and-year/m-p/914529#M360375</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2024-02-05T15:56:20Z</dc:date>
    </item>
    <item>
      <title>Re: calculating mean by id and year</title>
      <link>https://communities.sas.com/t5/SAS-Programming/calculating-mean-by-id-and-year/m-p/914533#M360378</link>
      <description>Thanks for your additional comments. but it gives error "ERROR: The ID value "YEAR0" occurs twice in the same BY group." I guess it's because for some person, they  have multiple records in the same year?</description>
      <pubDate>Mon, 05 Feb 2024 16:17:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/calculating-mean-by-id-and-year/m-p/914533#M360378</guid>
      <dc:creator>zihdonv19</dc:creator>
      <dc:date>2024-02-05T16:17:45Z</dc:date>
    </item>
  </channel>
</rss>

