<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Keep rows of data when one var changes from 0 to 1, but excluding when that var is only 1 or 0 in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Keep-rows-of-data-when-one-var-changes-from-0-to-1-but-excluding/m-p/924377#M363856</link>
    <description>&lt;P&gt;Your sample data shows no reversals (a single ID changing in both directions). If the full data has none either, then you are simply looking for patient_ids that start at sepsislabel zero and end at sepsislabel one:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
input Hour HR Temp Resp SepsisLabel Patient_ID;
datalines;
0 . . . 0 3205
1 76 . 20 0 3205
2 78 . 20 1 3205
3 81 . 16 1 3205
0 . . . 0 3206
1 76 . 20 0 3206
2 78 . 20 0 3206
3 81 . 16 0 3206
0 . . . 1 3207
1 76 . 20 1 3207
2 78 . 20 1 3207
3 81 . 16 1 3207
0 . . . 1 3208
1 76 . 20 1 3208
2 78 . 20 0 3208
3 81 . 16 0 3208
;
data want (drop=_:);
  set have (in=firstpass)
      have (in=secondpass);
  by patient_id;
  retain _start_at_zero _end_at_one ;
  if first.patient_id=1 then _start_at_zero=(sepsislabel=0);
  if firstpass then _end_at_one=(sepsislabel=1);
  if secondpass  and _start_at_zero and _end_at_one;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;But if you also want instances that change from 1 to 0:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want (drop=_:);
  set have (in=firstpass)
      have (in=secondpass);
  by patient_id;
  if first.patient_id then call missing(_n0,_n1);
  _n0+(firstpass=1 and sepsislabel=0);
  _n1+(firstpass=1 and sepsislabel=1);
  if secondpass  and _n0 and _n1;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Mon, 15 Apr 2024 14:18:23 GMT</pubDate>
    <dc:creator>mkeintz</dc:creator>
    <dc:date>2024-04-15T14:18:23Z</dc:date>
    <item>
      <title>Keep rows of data when one var changes from 0 to 1, but excluding when that var is only 1 or 0</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Keep-rows-of-data-when-one-var-changes-from-0-to-1-but-excluding/m-p/924289#M363816</link>
      <description>&lt;P&gt;Hello!&lt;/P&gt;&lt;P&gt;I have a long dataset from the "Early Prediction of Sepsis from Clinical Data: the PhysioNet/Computing in Cardiology Challenge 2019". It has over 1.5 million rows of hourly data, with over 40,000 unique Patient_IDs. There are many variables (such as Hour, HR, Resp, O2Sat, various lab values, etc.) but one outcome variable (SepsisLabel: 0 for no sepsis, 1 for sepsis).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm trying to keep the hourly rows for patients whose SepsisLabel changes from 0 to 1, while excluding patients who are only SepsisLabel=0 and patients who are only SepsisLabel=1. I've included a truncated example for one Patient_ID below, with only Hour, HR, Temp, Resp, SepsisLabel, and Patient_ID:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Hour HR Temp Resp SepsisLabel Patient_ID&lt;BR /&gt;0 . . . 0 3205&lt;BR /&gt;1 76 . 20 0 3205&lt;BR /&gt;2 78 . 20 0 3205&lt;BR /&gt;3 81 . 16 0 3205&lt;BR /&gt;4 79 37.89 12 0 3205&lt;BR /&gt;5 81 . 15.25 0 3205&lt;BR /&gt;6 78 . 12 0 3205&lt;BR /&gt;7 75 . 13.5 1 3205&lt;BR /&gt;8 76 37.5 18.5 1 3205&lt;BR /&gt;9 84 . 14.5 1 3205&lt;BR /&gt;10 82 . 33 1 3205&lt;BR /&gt;11 95 . 28 1 3205&lt;BR /&gt;12 99 37.67 26 1 3205&lt;BR /&gt;13 96 . 21 1 3205&lt;BR /&gt;14 92 . 22 1 3205&lt;BR /&gt;15 85 . 26 1 3205&lt;BR /&gt;16 92 37.33 22 1 3205&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Please help me with this!&lt;/P&gt;</description>
      <pubDate>Sun, 14 Apr 2024 18:37:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Keep-rows-of-data-when-one-var-changes-from-0-to-1-but-excluding/m-p/924289#M363816</guid>
      <dc:creator>pdick2</dc:creator>
      <dc:date>2024-04-14T18:37:39Z</dc:date>
    </item>
    <item>
      <title>Re: Keep rows of data when one var changes from 0 to 1, but excluding when that var is only 1 or 0</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Keep-rows-of-data-when-one-var-changes-from-0-to-1-but-excluding/m-p/924351#M363847</link>
      <description>&lt;P&gt;First, let's make some data robust enough for testing.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
input Hour HR Temp Resp SepsisLabel Patient_ID;
datalines;
0 . . . 0 3205
1 76 . 20 0 3205
2 78 . 20 1 3205
3 81 . 16 1 3205
0 . . . 0 3206
1 76 . 20 0 3206
2 78 . 20 0 3206
3 81 . 16 0 3206
0 . . . 1 3207
1 76 . 20 1 3207
2 78 . 20 1 3207
3 81 . 16 1 3207
0 . . . 1 3208
1 76 . 20 1 3208
2 78 . 20 0 3208
3 81 . 16 0 3208
;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;In this dataset, Patient_ID 3205 and 3208 had a change in&amp;nbsp;SepsisLabel from 0 to 1 or from 1 to 0. Patient_ID 3206 and 3207 had no change in&amp;nbsp;SepsisLabel. As I understand the problem, you want to keep&amp;nbsp;&lt;EM&gt;all rows&amp;nbsp;&lt;/EM&gt;for&amp;nbsp;Patient_ID 3205 and 3208, and &lt;EM&gt;exclude&lt;/EM&gt;&amp;nbsp;&lt;EM&gt;all rows &lt;/EM&gt;for&amp;nbsp;Patient_ID 3206 and 3207. If so, here's one approach:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
select * 
	from have as h
	inner join
	(select Patient_ID
		from(select distinct Patient_id, SepsisLabel
				from have)
		group by Patient_ID
		having count(*) &amp;gt;1) as w
	on h.Patient_ID=w.Patient_ID
;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The SQL subquery identifies those Patient_ID values having more than one value for&amp;nbsp;SepsisLabel. Those Patient_ID values are used to&amp;nbsp;select the desired rows from the original table.&amp;nbsp;&lt;/P&gt;
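&lt;P&gt;A minimal sketch, not part of the original reply: the same selection can be saved as a table instead of only being printed. The output name want_sql is a hypothetical choice, and count(distinct ...) stands in for the nested subquery above.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
  /* want_sql is a hypothetical output name */
  create table want_sql as
  select h.*
    from have as h
    where h.Patient_ID in
      /* IDs with more than one distinct SepsisLabel value */
      (select Patient_ID
         from have
         group by Patient_ID
         having count(distinct SepsisLabel) &amp;gt; 1)
    order by h.Patient_ID, h.Hour;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;With the test data above, this should keep the eight rows for Patient_ID 3205 and 3208.&lt;/P&gt;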
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 15 Apr 2024 12:35:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Keep-rows-of-data-when-one-var-changes-from-0-to-1-but-excluding/m-p/924351#M363847</guid>
      <dc:creator>SASJedi</dc:creator>
      <dc:date>2024-04-15T12:35:32Z</dc:date>
    </item>
    <item>
      <title>Re: Keep rows of data when one var changes from 0 to 1, but excluding when that var is only 1 or 0</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Keep-rows-of-data-when-one-var-changes-from-0-to-1-but-excluding/m-p/924377#M363856</link>
      <description>&lt;P&gt;Your sample data shows no reversals (a single ID changing in both directions). If the full data has none either, then you are simply looking for patient_ids that start at sepsislabel zero and end at sepsislabel one:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
input Hour HR Temp Resp SepsisLabel Patient_ID;
datalines;
0 . . . 0 3205
1 76 . 20 0 3205
2 78 . 20 1 3205
3 81 . 16 1 3205
0 . . . 0 3206
1 76 . 20 0 3206
2 78 . 20 0 3206
3 81 . 16 0 3206
0 . . . 1 3207
1 76 . 20 1 3207
2 78 . 20 1 3207
3 81 . 16 1 3207
0 . . . 1 3208
1 76 . 20 1 3208
2 78 . 20 0 3208
3 81 . 16 0 3208
;
data want (drop=_:);
  set have (in=firstpass)
      have (in=secondpass);
  by patient_id;
  retain _start_at_zero _end_at_one ;
  if first.patient_id=1 then _start_at_zero=(sepsislabel=0);
  if firstpass then _end_at_one=(sepsislabel=1);
  if secondpass  and _start_at_zero and _end_at_one;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;But if you also want instances that change from 1 to 0:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want (drop=_:);
  set have (in=firstpass)
      have (in=secondpass);
  by patient_id;
  if first.patient_id then call missing(_n0,_n1);
  _n0+(firstpass=1 and sepsislabel=0);
  _n1+(firstpass=1 and sepsislabel=1);
  if secondpass  and _n0 and _n1;
run;&lt;/CODE&gt;&lt;/PRE&gt;
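&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Both DATA steps above use a BY statement, so they assume HAVE is already ordered by patient_id, and the first one also relies on rows being in Hour order within each patient. A minimal sketch of the sort that would come first if that is not the case (the output name have_sorted is hypothetical):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* have_sorted is a hypothetical name; HAVE itself is left untouched */
proc sort data=have out=have_sorted;
  by Patient_ID Hour;
run;

/* Then read have_sorted in place of have, for example: */
data want (drop=_:);
  set have_sorted (in=firstpass)
      have_sorted (in=secondpass);
  by patient_id;
  if first.patient_id then call missing(_n0,_n1);
  _n0+(firstpass=1 and sepsislabel=0);  /* count of 0 labels, first pass only */
  _n1+(firstpass=1 and sepsislabel=1);  /* count of 1 labels, first pass only */
  if secondpass and _n0 and _n1;        /* output rows on the second pass */
run;&lt;/CODE&gt;&lt;/PRE&gt;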
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 15 Apr 2024 14:18:23 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Keep-rows-of-data-when-one-var-changes-from-0-to-1-but-excluding/m-p/924377#M363856</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2024-04-15T14:18:23Z</dc:date>
    </item>
    <item>
      <title>Re: Keep rows of data when one var changes from 0 to 1, but excluding when that var is only 1 or 0</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Keep-rows-of-data-when-one-var-changes-from-0-to-1-but-excluding/m-p/924439#M363880</link>
      <description>&lt;P&gt;This worked! It took some tinkering (I had to sort my data by Patient_ID and Hour first), but the result matches the original dataset while keeping only those who change from 0 to 1. There aren't any instances of it going from 1 to 0, but thank you for giving that! The one SASJedi gave was a little memory intensive and I'm not sure why (I'm new to SAS).&lt;/P&gt;</description>
      <pubDate>Tue, 16 Apr 2024 00:47:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Keep-rows-of-data-when-one-var-changes-from-0-to-1-but-excluding/m-p/924439#M363880</guid>
      <dc:creator>pdick2</dc:creator>
      <dc:date>2024-04-16T00:47:08Z</dc:date>
    </item>
    <item>
      <title>Re: Keep rows of data when one var changes from 0 to 1, but excluding when that var is only 1 or 0</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Keep-rows-of-data-when-one-var-changes-from-0-to-1-but-excluding/m-p/924446#M363883</link>
      <description>&lt;P&gt;&lt;SPAN&gt;If "there aren't any instances of it going from 1 to 0", then that would be easy:&lt;/SPAN&gt;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
create table want as
select * from have
 group by Patient_ID
  having count(distinct SepsisLabel)=2
   order by Patient_ID,Hour;
quit;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 16 Apr 2024 02:32:57 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Keep-rows-of-data-when-one-var-changes-from-0-to-1-but-excluding/m-p/924446#M363883</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2024-04-16T02:32:57Z</dc:date>
    </item>
    <item>
      <title>Re: Keep rows of data when one var changes from 0 to 1, but excluding when that var is only 1 or 0</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Keep-rows-of-data-when-one-var-changes-from-0-to-1-but-excluding/m-p/924472#M363894</link>
      <description>&lt;P&gt;While SQL is sometimes simpler to write, the SQL processor does a lot of "behind the scenes" work (sorting, etc.) on your behalf. An individual SQL step will often be more resource-intensive than a PROC SORT / DATA step combo.&amp;nbsp;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 16 Apr 2024 11:38:34 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Keep-rows-of-data-when-one-var-changes-from-0-to-1-but-excluding/m-p/924472#M363894</guid>
      <dc:creator>SASJedi</dc:creator>
      <dc:date>2024-04-16T11:38:34Z</dc:date>
    </item>
    <item>
      <title>Re: Keep rows of data when one var changes from 0 to 1, but excluding when that var is only 1 or 0</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Keep-rows-of-data-when-one-var-changes-from-0-to-1-but-excluding/m-p/925006#M364074</link>
      <description>&lt;P&gt;After looking at my data some more, it turns out I might need to include individuals who are sepsislabel=0 for their whole time in the hospital, as well as those who start at sepsislabel=0 and then transition to sepsislabel=1. How would you do that? (I.e., include patient_ID=3206 (only sepsislabel=0) and 3205 (who goes from sepsislabel=0 to 1) in my dataset.) This will exclude people who are only sepsislabel=1.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm trying to fit a Cox model and need a good comparison group of those who do not develop sepsis. If you have any more advice, please let me know!&lt;/P&gt;</description>
      <pubDate>Fri, 19 Apr 2024 14:22:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Keep-rows-of-data-when-one-var-changes-from-0-to-1-but-excluding/m-p/925006#M364074</guid>
      <dc:creator>pdick2</dc:creator>
      <dc:date>2024-04-19T14:22:46Z</dc:date>
    </item>
    <item>
      <title>Re: Keep rows of data when one var changes from 0 to 1, but excluding when that var is only 1 or 0</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Keep-rows-of-data-when-one-var-changes-from-0-to-1-but-excluding/m-p/925058#M364083</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/465253"&gt;@pdick2&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;You want&amp;nbsp; all cases for ID's that have sepsislabel always at zero, or that change from zero to one.&lt;/LI&gt;
&lt;LI&gt;You earlier said your data has no instance of going from one to zero.&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;If the above are both true, then you simply want all IDs that start with zero.&amp;nbsp; That's a simple task.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If the data are sorted by PATIENT_ID/Hour, then:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want (drop=_:);
  retain _keep ' ';
  set have;
  by patient_id ;
  if first.patient_id=1 then _keep=ifc(sepsislabel=0,'Y','N');
  if _keep='Y';
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;If the data within patient_id are sorted by Hour, and the patient_ids are grouped, but not sorted, then change the BY statement to:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;  by patient_id notsorted;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Now if the data are sorted by Hour only, you don't have to re-sort by patient_id/Hour.&amp;nbsp; And it doesn't matter what the order is within each Hour value.&amp;nbsp; So, if sorting by patient_id/Hour is expensive, then you can use a hash object:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want (drop=_:);
  set have;
  /* No BY statement: the hash object tracks each ID, so HAVE need not be grouped by patient_id */

  length _keep $1;
  if _n_=1 then do;
    declare hash h();
      h.definekey('patient_id');
      h.definedata('_keep');
      h.definedone();
  end;
  if h.find()^=0 then do;             /*If this is first instance for this ID ...*/
    if sepsislabel=0 then _keep='Y';  /*Check the initial sepsislabel value      */
    else _keep='N';
    h.add();                          /*Put it in the hash object */
  end;
  if _keep='Y';
run;&lt;/CODE&gt;&lt;/PRE&gt;
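&lt;P&gt;A quick way to check the result, sketched under the assumption that WANT was built from the 16-row test data posted earlier in the thread: summarize each kept ID. Only IDs whose first SepsisLabel is 0 should appear, which for that test data means 3205 and 3206.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
  /* One summary row per kept patient */
  select Patient_ID,
         count(*)         as n_rows,
         min(SepsisLabel) as min_label,
         max(SepsisLabel) as max_label
    from want
    group by Patient_ID;
quit;&lt;/CODE&gt;&lt;/PRE&gt;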
&lt;P&gt;These are untested.&lt;/P&gt;</description>
      <pubDate>Sat, 20 Apr 2024 02:57:01 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Keep-rows-of-data-when-one-var-changes-from-0-to-1-but-excluding/m-p/925058#M364083</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2024-04-20T02:57:01Z</dc:date>
    </item>
  </channel>
</rss>

