<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Merge two datasets in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets/m-p/869527#M343444</link>
    <description>&lt;P&gt;You really must show us what you expect as a result from this example datasets.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;And do you really store your dates like this, and a month as character?&lt;/P&gt;
&lt;P&gt;Or should your data be this:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data data1;
infile datalines truncover;
input ID$ ID2$ DT1 :mmddyy8. X Y Z DT2 :yymmdd8.;
format dt1 dt2 yymmddn8.;
datalines;
l2020642 F223220 20190620 0 0 1 20190620
l2020642 Z799199 20180607 1 0 0
l2020642 Z837743 20180418 0 0 1
l2020642 Z242142 20181221 0 0 1
;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;For your second dataset, I would recommend to combine your month/year into a SAS date:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data data2;
input ID$ MTH YR;
monyear = mdy(mth,1,yr);
format monyear yymmn6.;
datalines;
l2020642 08 2017
l2020642 01 2010
;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;This enables you to use SAS tools for dates (INTCK, INTHX and so on).&lt;/P&gt;</description>
    <pubDate>Thu, 13 Apr 2023 08:51:27 GMT</pubDate>
    <dc:creator>Kurt_Bremser</dc:creator>
    <dc:date>2023-04-13T08:51:27Z</dc:date>
    <item>
      <title>Merge two datasets</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets/m-p/869220#M343362</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I am not sure which is the best way to merge this dataset. I am not pleased with the result I got when I merged it so please I will like to get help on the best type of merge. Of course I have read lots of materials on one-to-one merge, many to many both in the data step and proc sql. I know the question might be what type of outcome do I want? I want an outcome that will provide the best merge.&lt;/P&gt;&lt;P&gt;I have millions of record but I have create a sample code, my result &amp;amp; concern below.&lt;/P&gt;&lt;PRE&gt;data data1;
input ID$ 1-8 ID2$ 9-16 DT1 17-25 X 27-28 Y 29-30 Z 31-32 DT2;
datalines;
l2020642 F223220 20190620 0 0 1 20190620
l2020642 Z799199 20180607 1 0 0
l2020642 Z837743 20180418 0 0 1
l2020642 Z242142 20181221 0 0 1
;
run;

data data2;
input ID$ 1-8 MTH$ 9-11 YR 12-16 ;
datalines;
l2020642 08 2017
l2020642 01 2010
;
run;

proc sort data=data1; by ID;run;
proc sort data=data2; by ID;run;

Data Want;
merge data2 (IN=IN1) data1 (IN=In2);
by ID;
if In1 and In2;
run;&lt;/PRE&gt;&lt;P&gt;Data2 is my main file so I want to join data1 to data2.&lt;/P&gt;&lt;P&gt;The result generate 3 rows but am worried why the MTH and YR 08/2017 came out once and 01/2010 came out twice. I don't know if the right joining should be one-to-many or many-to-many. I think am not using the right joining. Please providing a code will be help. Thanks in advance&lt;/P&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV class=""&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV class=""&gt;&amp;nbsp;&lt;/DIV&gt;&lt;/DIV&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV class=""&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV class=""&gt;&amp;nbsp;&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Tue, 11 Apr 2023 22:06:26 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets/m-p/869220#M343362</guid>
      <dc:creator>CathyVI</dc:creator>
      <dc:date>2023-04-11T22:06:26Z</dc:date>
    </item>
    <item>
      <title>Re: Merge two datasets</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets/m-p/869221#M343363</link>
      <description>What do you want as output? I suspect in this case you need to filter your join by more than just ID.</description>
      <pubDate>Tue, 11 Apr 2023 22:11:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets/m-p/869221#M343363</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2023-04-11T22:11:50Z</dc:date>
    </item>
    <item>
      <title>Re: Merge two datasets</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets/m-p/869409#M343402</link>
      <description>&lt;P&gt;The first data step is not going to work right.&amp;nbsp; Your column numbers are off, but the real issue is the last field you did not list any columns, so SAS will use LIST MODE input.&amp;nbsp; Which means for that it will jump to the next line to find something to read into the DT2 variable.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data data1;
  input ID$ 1-8 ID2$ 10-16 DT1 18-25 X 27 Y 29 Z 31 DT2 33-40;
datalines;
l2020642 F223220 20190620 0 0 1 20190620
l2020642 Z799199 20180607 1 0 0
l2020642 Z837743 20180418 0 0 1
l2020642 Z242142 20181221 0 0 1
;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;If you fix that then you get this as the result.&lt;/P&gt;
&lt;PRE&gt;Obs       ID       MTH     YR       ID2         DT1      X    Y    Z       DT2

 1     l2020642    08     2017    F223220    20190620    0    0    1    20190620
 2     l2020642    01     2010    Z799199    20180607    1    0    0           .
 3     l2020642    01     2010    Z837743    20180418    0    0    1           .
 4     l2020642    01     2010    Z242142    20181221    0    0    1           .
&lt;/PRE&gt;
&lt;P&gt;What output would you like instead?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Also notice that you get are warning message in the log:&lt;/P&gt;
&lt;PRE&gt;NOTE: MERGE statement has more than one data set with repeats of BY values.
NOTE: There were 2 observations read from the data set WORK.DATA2.
NOTE: There were 4 observations read from the data set WORK.DATA1.
&lt;/PRE&gt;
&lt;P&gt;When you MERGE two datasets that both have repeating observations then SAS will match then one for one until one of the datasets runs out of observations.&amp;nbsp; In that case the values contributed by the "short" dataset will stay the same as the values of the last observation in the group.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you want to eliminate those "retained" values you can make the data step a little more complex.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;
  merge data2(in=in1) data1(in=in2);
  by id;
  if in1 and in2 then output;
  call missing(of _all_);
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Results:&lt;/P&gt;
&lt;PRE&gt;Obs       ID       MTH     YR       ID2         DT1      X    Y    Z       DT2

 1     l2020642    08     2017    F223220    20190620    0    0    1    20190620
 2     l2020642    01     2010    Z799199    20180607    1    0    0           .
 3     l2020642              .    Z837743    20180418    0    0    1           .
 4     l2020642              .    Z242142    20181221    0    0    1           .
&lt;/PRE&gt;
&lt;P&gt;If you want something else then explain what you want and how the program can KNOW which observations to output.&lt;/P&gt;</description>
      <pubDate>Wed, 12 Apr 2023 13:32:29 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets/m-p/869409#M343402</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2023-04-12T13:32:29Z</dc:date>
    </item>
    <item>
      <title>Re: Merge two datasets</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets/m-p/869527#M343444</link>
      <description>&lt;P&gt;You really must show us what you expect as a result from this example datasets.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;And do you really store your dates like this, and a month as character?&lt;/P&gt;
&lt;P&gt;Or should your data be this:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data data1;
infile datalines truncover;
input ID$ ID2$ DT1 :mmddyy8. X Y Z DT2 :yymmdd8.;
format dt1 dt2 yymmddn8.;
datalines;
l2020642 F223220 20190620 0 0 1 20190620
l2020642 Z799199 20180607 1 0 0
l2020642 Z837743 20180418 0 0 1
l2020642 Z242142 20181221 0 0 1
;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;For your second dataset, I would recommend to combine your month/year into a SAS date:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data data2;
input ID$ MTH YR;
monyear = mdy(mth,1,yr);
format monyear yymmn6.;
datalines;
l2020642 08 2017
l2020642 01 2010
;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;This enables you to use SAS tools for dates (INTCK, INTHX and so on).&lt;/P&gt;</description>
      <pubDate>Thu, 13 Apr 2023 08:51:27 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets/m-p/869527#M343444</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2023-04-13T08:51:27Z</dc:date>
    </item>
  </channel>
</rss>

