<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Coding a Group Variable in SAS Health and Life Sciences</title>
    <link>https://communities.sas.com/t5/SAS-Health-and-Life-Sciences/Coding-a-Group-Variable/m-p/38874#M1245</link>
    <description></description>
    <pubDate>Fri, 15 Aug 2008 05:20:57 GMT</pubDate>
    <dc:creator>deleted_user</dc:creator>
    <dc:date>2008-08-15T05:20:57Z</dc:date>
    <item>
      <title>Coding a Group Variable</title>
      <link>https://communities.sas.com/t5/SAS-Health-and-Life-Sciences/Coding-a-Group-Variable/m-p/38870#M1241</link>
      <description>I have a dataset with multiple records per ID (patient). For each ID I want to code the "Group" variable as 0, 1, 2 or 9, based on the values of the "Comment" variable. The possible values of "Comment" are:&lt;BR /&gt;
&lt;BR /&gt;
TBI POS&lt;BR /&gt;
CANCELLED BY CLINIC&lt;BR /&gt;
NO-SHOW&lt;BR /&gt;
NO ACTION TAKEN&lt;BR /&gt;
CANCELLED BY PATIENT&lt;BR /&gt;
COMPLETE&lt;BR /&gt;
&lt;BR /&gt;
The dataset is sorted by ID, then by CDATE. TBI POS should always be the first "Comment" value for each ID; occasionally it is not, and those cases should be ignored. The "Group" variable should be coded as follows:&lt;BR /&gt;
&lt;BR /&gt;
For each ID (patient):&lt;BR /&gt;
If TBI POS is the only record then "Group" = 0.&lt;BR /&gt;
If TBI POS is immediately followed by COMPLETE then "Group" = 1.&lt;BR /&gt;
If TBI POS is followed by anything except COMPLETE but COMPLETE eventually appears then "Group" = 2.&lt;BR /&gt;
If TBI POS is followed by anything except COMPLETE and COMPLETE never shows up then "Group" = 0.&lt;BR /&gt;
&lt;BR /&gt;
Once an ID is determined to be in Group 2, all records after the COMPLETE should be coded "Group" = 9. So "Group" 9 is a subset of "Group" 2.&lt;BR /&gt;
 &lt;BR /&gt;
Below is example output&lt;BR /&gt;
ID	Comment	Group&lt;BR /&gt;
875	TBI POS	2&lt;BR /&gt;
875	NO-SHOW	2&lt;BR /&gt;
875	COMPLETE	2&lt;BR /&gt;
886	TBI POS	1&lt;BR /&gt;
886	COMPLETE	1&lt;BR /&gt;
912	TBI POS	1&lt;BR /&gt;
912	COMPLETE	1&lt;BR /&gt;
912	UNSCHEDULED	9&lt;BR /&gt;
931	TBI POS	0&lt;BR /&gt;
946	TBI POS	0&lt;BR /&gt;
946	CANCELLED BY PATIENT	0&lt;BR /&gt;
946	CANCELLED BY CLINIC	0&lt;BR /&gt;
946	NO ACTION TAKEN	0&lt;BR /&gt;
946	NO-SHOW	0&lt;BR /&gt;
1072	TBI POS	2&lt;BR /&gt;
1072	NO-SHOW	2&lt;BR /&gt;
1072	NO-SHOW	2&lt;BR /&gt;
1072	COMPLETE	2&lt;BR /&gt;
1086	TBI POS	1&lt;BR /&gt;
1086	COMPLETE	1&lt;BR /&gt;
1086	COMPLETE	9&lt;BR /&gt;
1086	CANCELLED BY PATIENT	9&lt;BR /&gt;
1086	COMPLETE	9</description>
      <pubDate>Wed, 13 Aug 2008 22:05:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Health-and-Life-Sciences/Coding-a-Group-Variable/m-p/38870#M1241</guid>
      <dc:creator>WAL83</dc:creator>
      <dc:date>2008-08-13T22:05:13Z</dc:date>
    </item>
    <item>
      <title>Re: Coding a Group Variable</title>
      <link>https://communities.sas.com/t5/SAS-Health-and-Life-Sciences/Coding-a-Group-Variable/m-p/38871#M1242</link>
      <description>After reading and sorting your input, use a DATA step to read the sorted data and assign GROUP as you have explained, using a variable named in a RETAIN statement to carry information from one record to the next. If you know all possible COMMENT values, consider using PROC FORMAT to map your input values, which would reduce the IF/THEN logic needed to assign GROUP.&lt;BR /&gt;
&lt;BR /&gt;
Remember that PROC FORMAT can handle character-data ranges, such as your "starts with", but the VALUE statement or CNTLIN= coding is a bit tricky. If you do not have a large number of unique COMMENT strings, the DATA step IF/THEN approach may be sufficient and would save the time required to set up a PROC FORMAT approach.&lt;BR /&gt;
&lt;BR /&gt;
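As a rough, untested sketch of the RETAIN idea: the data set names HAVE, IDLEVEL and WANT and the helper variables POS, IDGROUP and PASTCOMPLETE are made up for illustration. I am assuming TBI POS is the first COMMENT for every ID, and that 9 applies to every record after the first COMPLETE, as in your example output.&lt;BR /&gt;
[pre]&lt;BR /&gt;
/* Pass 1: one row per ID holding the ID-level group */&lt;BR /&gt;
data idlevel (keep=id idgroup);&lt;BR /&gt;
  set have;&lt;BR /&gt;
  by id;&lt;BR /&gt;
  retain pos idgroup;&lt;BR /&gt;
  if first.id then do;&lt;BR /&gt;
    pos = 0;&lt;BR /&gt;
    idgroup = 0;&lt;BR /&gt;
  end;&lt;BR /&gt;
  pos = pos + 1;&lt;BR /&gt;
  /* only the first COMPLETE decides between 1 and 2 */&lt;BR /&gt;
  if comment = "COMPLETE" and idgroup = 0 then do;&lt;BR /&gt;
    if pos = 2 then idgroup = 1;&lt;BR /&gt;
    else idgroup = 2;&lt;BR /&gt;
  end;&lt;BR /&gt;
  if last.id then output;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
/* Pass 2: put the ID-level group on every record, then 9 after the first COMPLETE */&lt;BR /&gt;
data want (drop=idgroup pastcomplete);&lt;BR /&gt;
  merge have idlevel;&lt;BR /&gt;
  by id;&lt;BR /&gt;
  retain pastcomplete;&lt;BR /&gt;
  if first.id then pastcomplete = 0;&lt;BR /&gt;
  if pastcomplete = 1 then group = 9;&lt;BR /&gt;
  else group = idgroup;&lt;BR /&gt;
  if comment = "COMPLETE" then pastcomplete = 1;&lt;BR /&gt;
run;&lt;BR /&gt;
[/pre]&lt;BR /&gt;
Two passes are used because the group of the first record is not known until the later records for that ID have been read.&lt;BR /&gt;
&lt;BR /&gt;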
Scott Barry&lt;BR /&gt;
SBBWorks, Inc.</description>
      <pubDate>Thu, 14 Aug 2008 15:14:14 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Health-and-Life-Sciences/Coding-a-Group-Variable/m-p/38871#M1242</guid>
      <dc:creator>sbb</dc:creator>
      <dc:date>2008-08-14T15:14:14Z</dc:date>
    </item>
    <item>
      <title>Re: Coding a Group Variable</title>
      <link>https://communities.sas.com/t5/SAS-Health-and-Life-Sciences/Coding-a-Group-Variable/m-p/38872#M1243</link>
      <description>Thanks for the info. I'll give it a shot.</description>
      <pubDate>Thu, 14 Aug 2008 21:27:42 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Health-and-Life-Sciences/Coding-a-Group-Variable/m-p/38872#M1243</guid>
      <dc:creator>WAL83</dc:creator>
      <dc:date>2008-08-14T21:27:42Z</dc:date>
    </item>
    <item>
      <title>Re: Coding a Group Variable</title>
      <link>https://communities.sas.com/t5/SAS-Health-and-Life-Sciences/Coding-a-Group-Variable/m-p/38873#M1244</link>
      <description>Hi:&lt;BR /&gt;
  Scott's approach (with retain and data step and probably first. and last. variables) is one way to go.&lt;BR /&gt;
&lt;BR /&gt;
  I'd probably approach this in a different way -- you have long/skinny data. I would be tempted to use PROC TRANSPOSE to "flip" the data into WIDE data -- so that it would look like this:&lt;BR /&gt;
[pre]&lt;BR /&gt;
Obs     id     _NAME_      COL1      COL2                    COL3                   COL4                      COL5&lt;BR /&gt;
             &lt;BR /&gt;
 1      875    Comment    TBI POS    NO-SHOW                 COMPLETE&lt;BR /&gt;
 2      886    Comment    TBI POS    COMPLETE&lt;BR /&gt;
 3      912    Comment    TBI POS    COMPLETE                UNSCHEDULED&lt;BR /&gt;
            &lt;BR /&gt;
[/pre]&lt;BR /&gt;
                   &lt;BR /&gt;
That way, you could use an ARRAY statement in a DATA step program and make COL1-COL? the variables that composed the array. This would lead to the ability to have IF statements something like this:&lt;BR /&gt;
[pre]&lt;BR /&gt;
data setgroup (keep=id comment group)&lt;BR /&gt;
     error_obs;&lt;BR /&gt;
&lt;BR /&gt;
  set transout;  /* output from PROC TRANSPOSE, one row per ID */&lt;BR /&gt;
  array cmnt(*) col:;  /* all transposed comment variables: COL1, COL2, ... */&lt;BR /&gt;
  length comment $ 40;  /* 40 is a guess at the longest comment */&lt;BR /&gt;
&lt;BR /&gt;
  /* find out how many comments this ID has (CNTR) and where the first COMPLETE is */&lt;BR /&gt;
  cntr = 0;&lt;BR /&gt;
  firstcomplete = 0;&lt;BR /&gt;
  do i = 1 to dim(cmnt);&lt;BR /&gt;
    if not missing(cmnt(i)) then cntr = i;&lt;BR /&gt;
    if cmnt(i) = "COMPLETE" and firstcomplete = 0 then firstcomplete = i;&lt;BR /&gt;
  end;&lt;BR /&gt;
  /* indicator for whether there's ever a "COMPLETE" in the comments */&lt;BR /&gt;
  evercomplete = (firstcomplete gt 0);&lt;BR /&gt;
&lt;BR /&gt;
  if cmnt(1) = "TBI POS" then do;&lt;BR /&gt;
    /* If TBI POS is the only field then "Group" = 0 */&lt;BR /&gt;
    if cntr = 1 then group = 0;&lt;BR /&gt;
    /* If TBI POS immediately followed by COMPLETE then "Group" = 1 */&lt;BR /&gt;
    else if cmnt(2) = "COMPLETE" then group = 1;&lt;BR /&gt;
    /* TBI POS followed by anything except COMPLETE: */&lt;BR /&gt;
    /* 2 if COMPLETE eventually shows up, 0 if it never does */&lt;BR /&gt;
    else if evercomplete = 1 then group = 2;&lt;BR /&gt;
    else group = 0;&lt;BR /&gt;
&lt;BR /&gt;
    /* write out the ID, the comment and the group as a long skinny data set again. */&lt;BR /&gt;
    /* The 9 coding is applied here to group 1 and group 2, as in the desired results; */&lt;BR /&gt;
    /* the written rule would limit it to group 2. */&lt;BR /&gt;
    do i = 1 to cntr;&lt;BR /&gt;
      comment = cmnt(i);&lt;BR /&gt;
      if evercomplete = 1 and i gt firstcomplete then group = 9;&lt;BR /&gt;
      output setgroup;&lt;BR /&gt;
    end;&lt;BR /&gt;
  end;&lt;BR /&gt;
  else output error_obs;&lt;BR /&gt;
run;&lt;BR /&gt;
[/pre]&lt;BR /&gt;
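 &lt;BR /&gt;
For reference, the TRANSOUT data set read by the SET statement above could come from a step like this one. It is a sketch only: HAVE is a placeholder for your input data set, which must already be sorted by ID, and PROC TRANSPOSE names the new columns COL1, COL2, ... by default.&lt;BR /&gt;
[pre]&lt;BR /&gt;
proc transpose data=have out=transout;&lt;BR /&gt;
  by id;        /* one output row per ID */&lt;BR /&gt;
  var comment;  /* each comment becomes COL1, COL2, ... in input order */&lt;BR /&gt;
run;&lt;BR /&gt;
[/pre]&lt;BR /&gt;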
 &lt;BR /&gt;
  I did find something inconsistent in your description of the coding and what you showed as your desired results -- you said that if there were comments after the group had been determined to be 2, then those comments should be coded as 9. But in your desired results above, you showed these 2 IDs as having a 9 coded after a GROUP of 1 had been assigned.&lt;BR /&gt;
[pre]&lt;BR /&gt;
ID Comment Group&lt;BR /&gt;
912 TBI POS 1&lt;BR /&gt;
912 COMPLETE 1&lt;B&gt;&lt;BR /&gt;
912 UNSCHEDULED 9&lt;/B&gt;&lt;BR /&gt;
     &lt;BR /&gt;
1086 TBI POS 1&lt;BR /&gt;
1086 COMPLETE 1&lt;B&gt;&lt;BR /&gt;
1086 COMPLETE 9&lt;BR /&gt;
1086 CANCELLED BY PATIENT 9&lt;BR /&gt;
1086 COMPLETE 9 &lt;/B&gt;&lt;BR /&gt;
[/pre]&lt;BR /&gt;
              &lt;BR /&gt;
Also, you do not say whether there would EVER be anything other than TBI POS as the first comment, but if it's possible, then you might want to catch that ID as an error observation.&lt;BR /&gt;
 &lt;BR /&gt;
Something else to think about: you do not know whether to code the FIRST obs for an ID as a 1, a 0 or a 2 until you know the value of the second comment (and sometimes the later ones). There are ways around that, and the transposed data is one possible approach. You could also build an array in the DATA step without ever using PROC TRANSPOSE. It really depends on your comfort level with SAS programming.&lt;BR /&gt;
 &lt;BR /&gt;
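A minimal, untested sketch of that last idea: a single DATA step that loads each ID's comments into a temporary array and then writes the rows back out. HAVE and WANT are placeholder names, the array size of 50 and the length of 40 are guesses, TBI POS is assumed to be the first comment, and the 9 coding follows the desired results shown above rather than the written rule.&lt;BR /&gt;
[pre]&lt;BR /&gt;
data want (keep=id comment group);&lt;BR /&gt;
  array cmnt(50) $ 40 _temporary_;  /* assumes at most 50 comments per ID */&lt;BR /&gt;
  firstcomplete = 0;&lt;BR /&gt;
&lt;BR /&gt;
  /* load every comment for one ID and note where the first COMPLETE is */&lt;BR /&gt;
  do n = 1 by 1 until (last.id);&lt;BR /&gt;
    set have;&lt;BR /&gt;
    by id;&lt;BR /&gt;
    cmnt(n) = comment;&lt;BR /&gt;
    if comment = "COMPLETE" and firstcomplete = 0 then firstcomplete = n;&lt;BR /&gt;
  end;&lt;BR /&gt;
&lt;BR /&gt;
  /* ID-level group, assuming cmnt(1) is TBI POS */&lt;BR /&gt;
  if firstcomplete = 0 then idgroup = 0;&lt;BR /&gt;
  else if firstcomplete = 2 then idgroup = 1;&lt;BR /&gt;
  else idgroup = 2;&lt;BR /&gt;
&lt;BR /&gt;
  /* write the ID back out, one row per comment */&lt;BR /&gt;
  do i = 1 to n;&lt;BR /&gt;
    comment = cmnt(i);&lt;BR /&gt;
    if firstcomplete gt 0 and i gt firstcomplete then group = 9;&lt;BR /&gt;
    else group = idgroup;&lt;BR /&gt;
    output;&lt;BR /&gt;
  end;&lt;BR /&gt;
run;&lt;BR /&gt;
[/pre]&lt;BR /&gt;
 &lt;BR /&gt;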
cynthia</description>
      <pubDate>Fri, 15 Aug 2008 03:32:25 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Health-and-Life-Sciences/Coding-a-Group-Variable/m-p/38873#M1244</guid>
      <dc:creator>Cynthia_sas</dc:creator>
      <dc:date>2008-08-15T03:32:25Z</dc:date>
    </item>
    <item>
      <title>Re: Coding a Group Variable</title>
      <link>https://communities.sas.com/t5/SAS-Health-and-Life-Sciences/Coding-a-Group-Variable/m-p/38874#M1245</link>
      <description>I think it could be done like this:&lt;BR /&gt;
/* Step 1: one row per id holding the id-level group */&lt;BR /&gt;
data one(keep=id group);&lt;BR /&gt;
set yourdata;&lt;BR /&gt;
by id;&lt;BR /&gt;
retain group;&lt;BR /&gt;
/* call LAG on every row, never inside an IF, so its queue stays in step with the data */&lt;BR /&gt;
lagc=lag(comment);&lt;BR /&gt;
if first.id then group=0;&lt;BR /&gt;
/* only the first COMPLETE decides; assumes TBI POS appears only as the first comment */&lt;BR /&gt;
else if comment="COMPLETE" and group=0 then do;&lt;BR /&gt;
if lagc="TBI POS" then group=1;&lt;BR /&gt;
else group=2;&lt;BR /&gt;
end;&lt;BR /&gt;
if last.id then output;&lt;BR /&gt;
run;&lt;BR /&gt;
/* Step 2: put the id-level group back on every row */&lt;BR /&gt;
data next;&lt;BR /&gt;
merge yourdata one;&lt;BR /&gt;
by id;&lt;BR /&gt;
run;&lt;BR /&gt;
/* Step 3: within group 2, recode every row after a COMPLETE to 9 */&lt;BR /&gt;
data final(drop=lagid lagc past);&lt;BR /&gt;
set next;&lt;BR /&gt;
retain past;&lt;BR /&gt;
lagid=lag(id);&lt;BR /&gt;
lagc=lag(comment);&lt;BR /&gt;
if id ne lagid then past=0;&lt;BR /&gt;
else if lagc="COMPLETE" then past=1;&lt;BR /&gt;
/* drop the group=2 test to recode group 1 as well, as in the example output */&lt;BR /&gt;
if group=2 and past=1 then group=9;&lt;BR /&gt;
run;&lt;BR /&gt;
I do not know whether it works; my idea is to use the LAG function.</description>
      <pubDate>Fri, 15 Aug 2008 05:20:57 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Health-and-Life-Sciences/Coding-a-Group-Variable/m-p/38874#M1245</guid>
      <dc:creator>deleted_user</dc:creator>
      <dc:date>2008-08-15T05:20:57Z</dc:date>
    </item>
    <item>
      <title>Re: Coding a Group Variable</title>
      <link>https://communities.sas.com/t5/SAS-Health-and-Life-Sciences/Coding-a-Group-Variable/m-p/38875#M1246</link>
      <description>Thanks for the detailed explanations. I manually coded the groups since my dataset was so small but when I have some extra time I will try the suggested methods because I will run into this problem again.</description>
      <pubDate>Thu, 21 Aug 2008 14:01:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Health-and-Life-Sciences/Coding-a-Group-Variable/m-p/38875#M1246</guid>
      <dc:creator>WAL83</dc:creator>
      <dc:date>2008-08-21T14:01:48Z</dc:date>
    </item>
  </channel>
</rss>

