<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Left join using data step in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Left-join-using-data-step/m-p/297567#M62505</link>
    <description>&lt;P&gt;Why do think you can't use SQL?&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Sun, 11 Sep 2016 01:18:04 GMT</pubDate>
    <dc:creator>Reeza</dc:creator>
    <dc:date>2016-09-11T01:18:04Z</dc:date>
    <item>
      <title>Left join using data step</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Left-join-using-data-step/m-p/297559#M62502</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I have a situation where i have to do a left join on two datasets. I cannot use proc sql step as i need to do some processing while doing left join. These processing involves creation of new variables on the basis of values in both the datasets.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;My left join is little different as I am trying to merge one record from the first dataset to all the matching records in second.&amp;nbsp; The key on which join is made is not&amp;nbsp;a primary key in both the datasets. I have added a sample data from both datasets below.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Actual size of datasets is one million and right dataset is 20k.&lt;/P&gt;
&lt;P&gt;I got a sample&amp;nbsp;program from internet but I need some help in optimising the program. Now it loops for 20k times for each record in the first dataset. Is there any way i can subset the second dataset&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Is there any alternative way to do this join more efficiently?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks in advance,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;First dataset.&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#000080" face="Courier New" size="3"&gt;&lt;STRONG&gt;data&lt;/STRONG&gt;&lt;/FONT&gt;&lt;FONT face="Courier New" size="3"&gt; AE;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#0000ff" face="Courier New" size="3"&gt;&lt;FONT color="#0000ff" face="Courier New" size="3"&gt;&lt;FONT color="#0000ff" face="Courier New" size="3"&gt;input&lt;/FONT&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Courier New" size="3"&gt;&lt;FONT face="Courier New" size="3"&gt; SUBJECT &lt;/FONT&gt;&lt;/FONT&gt;&lt;STRONG&gt;&lt;FONT color="#008080" face="Courier New" size="3"&gt;&lt;FONT color="#008080" face="Courier New" size="3"&gt;&lt;FONT color="#008080" face="Courier New" size="3"&gt;3.&lt;/FONT&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/STRONG&gt;&lt;FONT face="Courier New" size="3"&gt;&lt;FONT face="Courier New" size="3"&gt; @5EVENT $;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#0000ff" face="Courier New" size="3"&gt;&lt;FONT color="#0000ff" face="Courier New" size="3"&gt;&lt;FONT color="#0000ff" face="Courier New" size="3"&gt;datalines&lt;/FONT&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Courier New" size="3"&gt;&lt;FONT face="Courier New" size="3"&gt;; &lt;/FONT&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;263 HEADACX&lt;/P&gt;
&lt;P&gt;263 HEADACA&lt;/P&gt;
&lt;P&gt;263 HEADAC2&lt;/P&gt;
&lt;P&gt;263 HEADAC3&lt;/P&gt;
&lt;P&gt;263 HEADAC4&lt;/P&gt;
&lt;P&gt;263 HEADAC5&lt;/P&gt;
&lt;P&gt;263 HEADAC6&lt;/P&gt;
&lt;P&gt;264 xxxxxxx&lt;/P&gt;
&lt;P&gt;264 xxxxxx1&lt;/P&gt;
&lt;P&gt;265 FRACTU0&lt;/P&gt;
&lt;P&gt;265 FRACTU1&lt;/P&gt;
&lt;P&gt;265 FRACTU2&lt;/P&gt;
&lt;P&gt;265 FRACTU3&lt;/P&gt;
&lt;P&gt;265 FRACTU4&lt;/P&gt;
&lt;P&gt;270 NOTPRES&lt;/P&gt;
&lt;P&gt;;&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#000080" face="Courier New" size="3"&gt;&lt;FONT color="#000080" face="Courier New" size="3"&gt;&lt;FONT color="#000080" face="Courier New" size="3"&gt;&lt;STRONG&gt;run&lt;/STRONG&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Courier New" size="3"&gt;&lt;FONT face="Courier New" size="3"&gt;;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Courier New" size="3"&gt;&lt;FONT face="Courier New" size="3"&gt;Second dataset&lt;/FONT&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#000080" face="Courier New" size="3"&gt;&lt;FONT color="#000080" face="Courier New" size="3"&gt;&lt;FONT color="#000080" face="Courier New" size="3"&gt;&lt;STRONG&gt;data&lt;/STRONG&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Courier New" size="3"&gt;&lt;FONT face="Courier New" size="3"&gt; PATDATA;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#0000ff" face="Courier New" size="3"&gt;&lt;FONT color="#0000ff" face="Courier New" size="3"&gt;&lt;FONT color="#0000ff" face="Courier New" size="3"&gt;input&lt;/FONT&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Courier New" size="3"&gt;&lt;FONT face="Courier New" size="3"&gt; SUBJECT &lt;/FONT&gt;&lt;/FONT&gt;&lt;STRONG&gt;&lt;FONT color="#008080" face="Courier New" size="3"&gt;&lt;FONT color="#008080" face="Courier New" size="3"&gt;&lt;FONT color="#008080" face="Courier New" size="3"&gt;3.&lt;/FONT&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/STRONG&gt;&lt;FONT face="Courier New" size="3"&gt;&lt;FONT face="Courier New" size="3"&gt; @&lt;/FONT&gt;&lt;/FONT&gt;&lt;STRONG&gt;&lt;FONT color="#008080" face="Courier New" size="3"&gt;&lt;FONT color="#008080" face="Courier New" size="3"&gt;&lt;FONT color="#008080" face="Courier New" size="3"&gt;5&lt;/FONT&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/STRONG&gt;&lt;FONT face="Courier New" size="3"&gt;&lt;FONT face="Courier New" size="3"&gt;TRT_CODE &lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT color="#008080" face="Courier New" size="3"&gt;&lt;FONT color="#008080" face="Courier New" size="3"&gt;&lt;FONT color="#008080" face="Courier New" size="3"&gt;$1.&lt;/FONT&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Courier New" size="3"&gt;&lt;FONT face="Courier New" size="3"&gt;;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#0000ff" face="Courier New" size="3"&gt;&lt;FONT color="#0000ff" face="Courier New" size="3"&gt;&lt;FONT color="#0000ff" face="Courier New" size="3"&gt;datalines&lt;/FONT&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Courier New" size="3"&gt;&lt;FONT face="Courier New" size="3"&gt;;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;263 A&lt;/P&gt;
&lt;P&gt;263 D&lt;/P&gt;
&lt;P&gt;263 X&lt;/P&gt;
&lt;P&gt;264 A&lt;/P&gt;
&lt;P&gt;265 B&lt;/P&gt;
&lt;P&gt;265 Y&lt;/P&gt;
&lt;P&gt;275 N&lt;/P&gt;
&lt;P&gt;;&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#000080" face="Courier New" size="3"&gt;&lt;FONT color="#000080" face="Courier New" size="3"&gt;&lt;FONT color="#000080" face="Courier New" size="3"&gt;&lt;STRONG&gt;run&lt;/STRONG&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Courier New" size="3"&gt;&lt;FONT face="Courier New" size="3"&gt;;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Courier New" size="3"&gt;&lt;FONT face="Courier New" size="3"&gt;program&lt;/FONT&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#000080" face="Courier New" size="3"&gt;&lt;STRONG&gt;%macro&lt;/STRONG&gt;&lt;/FONT&gt;&lt;FONT face="Courier New" size="3"&gt; &lt;STRONG&gt;&lt;I&gt;temp&lt;/I&gt;&lt;/STRONG&gt;;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;DATA alldata0;&lt;/P&gt;
&lt;P&gt;SET ae;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Courier New" size="3"&gt;&lt;FONT face="Courier New" size="3"&gt; DROP _: match; &lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT color="#008000" face="Courier New" size="3"&gt;&lt;FONT color="#008000" face="Courier New" size="3"&gt;&lt;FONT color="#008000" face="Courier New" size="3"&gt;* Drop temporary variables;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Courier New" size="3"&gt;&lt;FONT face="Courier New" size="3"&gt; match=&lt;/FONT&gt;&lt;/FONT&gt;&lt;STRONG&gt;&lt;FONT color="#008080" face="Courier New" size="3"&gt;&lt;FONT color="#008080" face="Courier New" size="3"&gt;&lt;FONT color="#008080" face="Courier New" size="3"&gt;0&lt;/FONT&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/STRONG&gt;&lt;FONT face="Courier New" size="3"&gt;&lt;FONT face="Courier New" size="3"&gt;;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;/*Below I am looping the second dataset completely. is there any way to subset it?*/&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Courier New" size="3"&gt;&lt;FONT face="Courier New" size="3"&gt; DO i=&lt;/FONT&gt;&lt;/FONT&gt;&lt;STRONG&gt;&lt;FONT color="#008080" face="Courier New" size="3"&gt;&lt;FONT color="#008080" face="Courier New" size="3"&gt;&lt;FONT color="#008080" face="Courier New" size="3"&gt;1&lt;/FONT&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/STRONG&gt;&lt;FONT face="Courier New" size="3"&gt;&lt;FONT face="Courier New" size="3"&gt; TO xnobs;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;SET patdata (rename=(subject=_subject)) NOBS=xnobs POINT=i;&lt;/P&gt;
&lt;P&gt;if subject=_subject THEN&lt;/P&gt;
&lt;P&gt;DO;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Courier New" size="3"&gt;&lt;FONT face="Courier New" size="3"&gt; match=&lt;/FONT&gt;&lt;/FONT&gt;&lt;STRONG&gt;&lt;FONT color="#008080" face="Courier New" size="3"&gt;&lt;FONT color="#008080" face="Courier New" size="3"&gt;&lt;FONT color="#008080" face="Courier New" size="3"&gt;1&lt;/FONT&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/STRONG&gt;&lt;FONT face="Courier New" size="3"&gt;&lt;FONT face="Courier New" size="3"&gt;; &lt;/FONT&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;OUTPUT;&lt;/P&gt;
&lt;P&gt;END;&lt;/P&gt;
&lt;P&gt;END;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Courier New" size="3"&gt;&lt;FONT face="Courier New" size="3"&gt; IF match=&lt;/FONT&gt;&lt;/FONT&gt;&lt;STRONG&gt;&lt;FONT color="#008080" face="Courier New" size="3"&gt;&lt;FONT color="#008080" face="Courier New" size="3"&gt;&lt;FONT color="#008080" face="Courier New" size="3"&gt;0&lt;/FONT&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/STRONG&gt;&lt;FONT face="Courier New" size="3"&gt;&lt;FONT face="Courier New" size="3"&gt; THEN DO; &lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT color="#008000" face="Courier New" size="3"&gt;&lt;FONT color="#008000" face="Courier New" size="3"&gt;&lt;FONT color="#008000" face="Courier New" size="3"&gt;* Output AE record if no match in CM;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;CALL MISSING(trt_code);&lt;/P&gt;
&lt;P&gt;OUTPUT; END;&lt;/P&gt;
&lt;P&gt;RUN;&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#000080" face="Courier New" size="3"&gt;&lt;FONT color="#000080" face="Courier New" size="3"&gt;&lt;FONT color="#000080" face="Courier New" size="3"&gt;&lt;STRONG&gt;%mend&lt;/STRONG&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Courier New" size="3"&gt;&lt;FONT face="Courier New" size="3"&gt; temp;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;%&lt;STRONG&gt;&lt;I&gt;temp&lt;/I&gt;&lt;/STRONG&gt;;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Regards,&lt;/P&gt;
&lt;P&gt;Sheeba Swaminathan&lt;/P&gt;</description>
      <pubDate>Sun, 11 Sep 2016 00:07:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Left-join-using-data-step/m-p/297559#M62502</guid>
      <dc:creator>Sheeba</dc:creator>
      <dc:date>2016-09-11T00:07:44Z</dc:date>
    </item>
    <item>
      <title>Re: Left join using data step</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Left-join-using-data-step/m-p/297566#M62504</link>
      <description>&lt;P&gt;Why do think you can't use SQL?&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 11 Sep 2016 01:18:02 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Left-join-using-data-step/m-p/297566#M62504</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2016-09-11T01:18:02Z</dc:date>
    </item>
    <item>
      <title>Re: Left join using data step</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Left-join-using-data-step/m-p/297567#M62505</link>
      <description>&lt;P&gt;Why do think you can't use SQL?&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 11 Sep 2016 01:18:04 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Left-join-using-data-step/m-p/297567#M62505</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2016-09-11T01:18:04Z</dc:date>
    </item>
    <item>
      <title>Re: Left join using data step</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Left-join-using-data-step/m-p/297568#M62506</link>
      <description>Hi Reeza,  &lt;BR /&gt;&lt;BR /&gt;I cannot use sql as I need to create few variables during the join .. Some of the new variables are created by comparing the variables in both the datasets&lt;BR /&gt;&lt;BR /&gt;Also the join conditions ( here if subject =_subject ) are generated dynamically from a dataset ..&lt;BR /&gt;&lt;BR /&gt;Also I need to get two Data sets o&lt;BR /&gt;&lt;BR /&gt;1. Which satisfies these conditions and&lt;BR /&gt;2 which does not satisfy the matching condition &lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;In order to get all these flexibility I am planning to move forward with data step .&lt;BR /&gt;&lt;BR /&gt;Regards,&lt;BR /&gt;Sheeba</description>
      <pubDate>Sun, 11 Sep 2016 01:33:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Left-join-using-data-step/m-p/297568#M62506</guid>
      <dc:creator>Sheeba</dc:creator>
      <dc:date>2016-09-11T01:33:50Z</dc:date>
    </item>
    <item>
      <title>Re: Left join using data step</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Left-join-using-data-step/m-p/297573#M62511</link>
      <description>&lt;PRE&gt;
You could use Hash Table instead of Left Join .




data AE;
input SUBJECT 3. @5EVENT $;
datalines;
263 HEADACX
263 HEADACA
263 HEADAC2
263 HEADAC3
263 HEADAC4
263 HEADAC5
263 HEADAC6
264 xxxxxxx
264 xxxxxx1
265 FRACTU0
265 FRACTU1
265 FRACTU2
265 FRACTU3
265 FRACTU4
270 NOTPRES
;
run;
 
data PATDATA;
input SUBJECT 3. @5TRT_CODE $1.;
datalines;
263 A
263 D
263 X
264 A
265 B
265 Y
275 N
;
run;
data want;
 if _n_=1 then do;
  if 0 then set patdata;
  declare hash h(dataset:'patdata',multidata:'y');
  h.definekey('subject');
  h.definedata('trt_code');
  h.definedone();
 end;
set ae;
call missing(trt_code);
rc=h.find();
if rc ne 0 then output;
do while(rc=0);
 output;
 rc=h.find_next();
end;
drop rc;
run;

&lt;/PRE&gt;</description>
      <pubDate>Sun, 11 Sep 2016 02:19:01 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Left-join-using-data-step/m-p/297573#M62511</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2016-09-11T02:19:01Z</dc:date>
    </item>
    <item>
      <title>Re: Left join using data step</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Left-join-using-data-step/m-p/297576#M62516</link>
      <description>&lt;P&gt;Use an index to read only matching records:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data AE;
input SUBJECT 3. @5EVENT $;
datalines;
263 HEADACX
263 HEADACA
263 HEADAC2
263 HEADAC3
263 HEADAC4
263 HEADAC5
263 HEADAC6
264 xxxxxxx
264 xxxxxx1
265 FRACTU0
265 FRACTU1
265 FRACTU2
265 FRACTU3
265 FRACTU4
270 NOTPRES
;

data PATDATA /*(index=(subject=(subject)))*/;
input SUBJECT 3. @5TRT_CODE $1.;
datalines;
263 A
263 D
263 X
264 A
265 B
265 Y
275 N
;

proc sql;
create index subject on PATDATA(subject);
quit;

data join;
set AE;
reset = 1;
do i = 1 by 1 while (_iorc_ = 0);
    set PATDATA key=subject keyreset=reset;
    if _error_ then call missing(TRT_CODE);
    if i = 1 or not _error_ then output;
    end;
_iorc_ = 0;
_error_ = 0;
drop i;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Sun, 11 Sep 2016 03:13:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Left-join-using-data-step/m-p/297576#M62516</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2016-09-11T03:13:35Z</dc:date>
    </item>
    <item>
      <title>Re: Left join using data step</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Left-join-using-data-step/m-p/297630#M62531</link>
      <description>Hi ksharp,&lt;BR /&gt;&lt;BR /&gt;Thanks a lot for suggestion.&lt;BR /&gt;&lt;BR /&gt;Hash tables are new to me . I will do a research on that . &lt;BR /&gt;&lt;BR /&gt;Regards,&lt;BR /&gt;Sheeba</description>
      <pubDate>Mon, 12 Sep 2016 00:00:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Left-join-using-data-step/m-p/297630#M62531</guid>
      <dc:creator>Sheeba</dc:creator>
      <dc:date>2016-09-12T00:00:08Z</dc:date>
    </item>
    <item>
      <title>Re: Left join using data step</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Left-join-using-data-step/m-p/297631#M62532</link>
      <description>Hi PGstats,&lt;BR /&gt;&lt;BR /&gt;Thanks for suggesting this method .&lt;BR /&gt;&lt;BR /&gt;I will try out this .&lt;BR /&gt;&lt;BR /&gt;Thanks again,&lt;BR /&gt;Regards,&lt;BR /&gt;Sheeba Swaminathan</description>
      <pubDate>Mon, 12 Sep 2016 00:02:11 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Left-join-using-data-step/m-p/297631#M62532</guid>
      <dc:creator>Sheeba</dc:creator>
      <dc:date>2016-09-12T00:02:11Z</dc:date>
    </item>
    <item>
      <title>Re: Left join using data step</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Left-join-using-data-step/m-p/297751#M62559</link>
      <description>&lt;P&gt;This solution is using Array. The subject is used as index of the array. This helps to decide&lt;BR /&gt;in one probe whether the subject of AE is present in the PATDATA and if so TRT_CODE is harvested.&lt;BR /&gt;This method of using array is called KEY INDEXing. Because it decides in one hit, this method&lt;BR /&gt;is faster than HASH or other solutions. I wish you run all the methods for joy and compare&lt;BR /&gt;run-times.&lt;/P&gt;&lt;P&gt;To be frugal in using the memory, we size the array with say 263:275 from your PATDATA as one argument.&lt;BR /&gt;We also need how many times a SUBJECT repeats. You know that 263 repeats 3 times but others are less.&lt;BR /&gt;So the ecnomical use of array will be:&lt;BR /&gt;array s[263:275, 3] _temporary_;&lt;BR /&gt;You may say that 266 to 274 are not present in PATDATA. Array can't be declared in pieces and so we live&lt;BR /&gt;with that limitation !&lt;/P&gt;&lt;P&gt;We call 263 as minsub, 275 as maxsub and 3 as maxGroup. For general purpose, these 3 values can be&lt;BR /&gt;obtained by scanning the PATDATA data set as:&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
   retain max minsub maxsub;
   do count = 1 by 1 until(last.subject);
      set patdata end = eof;
      by subject;
      if last.subject then do;
         max = max(max, count);
         minsub = min(minsub, subject);
         maxsub = max(maxsub, subject);
      end;
   end;
if eof then do;
   call symputx('maxGroup', max);
   call symputx('minsub', minsub);
   call symputx('maxsub', maxsub);
end;
run;
%put &amp;amp;minsub;
%put &amp;amp;maxsub;
%put &amp;amp;maxGroup;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;The assumption is that the PATDATA, at least be GROUPED by SUBJECT and the AE data set need&lt;BR /&gt;not be sorted by SUBJECT.&lt;/P&gt;&lt;P&gt;The solution consists of two do-loops. When _N_ = 1, the first do-loop loads the TRT_CODE&lt;BR /&gt;from PATDATA into the array. Remember that if a SUBJECT repeats, COUNT takes care of the&lt;BR /&gt;correct cell of the array.&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;
   array s[&amp;amp;minsub:&amp;amp;maxsub,&amp;amp;maxGroup] $1 _temporary_;
   array c[&amp;amp;minsub:&amp;amp;maxsub] _temporary_;
   if _n_ = 1 then do;
   do until(eof);
      do count = 1 by 1 until(last.subject);
         set patdata end = eof;
         by subject ;
         s[subject, count] = trt_code;
      end;
      c[subject] = count;
   end;
   end;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;The second do-loop first ckecks whether the SUBJECT from AE falls between the RANGE(&amp;amp;minsub and &amp;amp;maxsub).&lt;BR /&gt;If it doen't fall, the default missing value for TRT_CODE is written to output data set. If it falls&lt;BR /&gt;then a check is made on the first cell of the SUBJECT whether it holds a non-missing TRT_CODE. Remember&lt;BR /&gt;situation like missing SUBJECTS with 266 to 274, which by default, will hold missing TRT_CODE in the array.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;   do until(last);
      set ae end = last;
      trt_code = ' ';
      if  ^(&amp;amp;minsub &amp;lt;= subject &amp;lt;= &amp;amp;maxsub) then output; 
      else if not missing(s[subject,1]) then 
      do i = 1 to c[subject];
         trt_code = s[subject,i];
         /** Your computations go here ..... **/
         output;
      end;
      else output;
   end;
drop i count;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Hope this is an acceptable solution to you. Don't forget feed us your comparison of solutions.&lt;/P&gt;</description>
      <pubDate>Mon, 12 Sep 2016 13:47:57 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Left-join-using-data-step/m-p/297751#M62559</guid>
      <dc:creator>KachiM</dc:creator>
      <dc:date>2016-09-12T13:47:57Z</dc:date>
    </item>
    <item>
      <title>Re: Left join using data step</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Left-join-using-data-step/m-p/297778#M62567</link>
      <description>&lt;P&gt;Hi datasp,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks a lot for the suggestion.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Sure .. I will run all these methods ,compare it&amp;nbsp; and will let you know.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Regards,&lt;/P&gt;
&lt;P&gt;Sheeba Swaminathan&lt;/P&gt;</description>
      <pubDate>Mon, 12 Sep 2016 15:18:24 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Left-join-using-data-step/m-p/297778#M62567</guid>
      <dc:creator>Sheeba</dc:creator>
      <dc:date>2016-09-12T15:18:24Z</dc:date>
    </item>
  </channel>
</rss>

