<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Merge datasets with different variables that measure the same thing and consolidate them in one in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Merge-datasets-with-different-variables-that-measure-the-same/m-p/887196#M350537</link>
    <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/226521"&gt;@ANKH1&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;Ok, thanks for the explanation. I shouldn't say merge.&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;AND realize that multiple SET statements is a much more complex issue than stacking (appending) data.&lt;/P&gt;</description>
    <pubDate>Mon, 31 Jul 2023 20:34:41 GMT</pubDate>
    <dc:creator>ballardw</dc:creator>
    <dc:date>2023-07-31T20:34:41Z</dc:date>
    <item>
      <title>Merge datasets with different variables that measure the same thing and consolidate them in one</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merge-datasets-with-different-variables-that-measure-the-same/m-p/887160#M350526</link>
      <description>&lt;P&gt;Hi, I would like to merge multiple datasets by ID. Each dataset has different number of rows. Some datasets have more than one variable of interest, in this case is age, that is, the same dataset might have age1, age2, age3. The end goal is to be able to report the frequencies of variable age per ID and be able to know where each data point comes from (depicted by&amp;nbsp;the variable ds). These are the sample datasets:&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=""&gt;data ds1;
input ID age1;
datalines;
1 12
3 11
4 10
4 15
;
run;

data ds2;
input ID age2 age3 age4;
datalines;
1 13 13 .
1 23 14 .
2 25 . 21
;
run;

data ds3;
input ID age5;
datalines;
3 22
3 32
3 33
4 12
;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;The output that we are aiming for is this before using proc freq by ID:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;TABLE border="0" cellspacing="0" cellpadding="0"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;ID&lt;/TD&gt;&lt;TD&gt;age&lt;/TD&gt;&lt;TD&gt;ds&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;12&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;13&lt;/TD&gt;&lt;TD&gt;2&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;13&lt;/TD&gt;&lt;TD&gt;2&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;23&lt;/TD&gt;&lt;TD&gt;2&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;14&lt;/TD&gt;&lt;TD&gt;2&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;2&lt;/TD&gt;&lt;TD&gt;25&lt;/TD&gt;&lt;TD&gt;2&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;2&lt;/TD&gt;&lt;TD&gt;21&lt;/TD&gt;&lt;TD&gt;2&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;3&lt;/TD&gt;&lt;TD&gt;11&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;3&lt;/TD&gt;&lt;TD&gt;22&lt;/TD&gt;&lt;TD&gt;3&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;3&lt;/TD&gt;&lt;TD&gt;32&lt;/TD&gt;&lt;TD&gt;3&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;3&lt;/TD&gt;&lt;TD&gt;33&lt;/TD&gt;&lt;TD&gt;3&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;4&lt;/TD&gt;&lt;TD&gt;10&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;4&lt;/TD&gt;&lt;TD&gt;15&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;4&lt;/TD&gt;&lt;TD&gt;12&lt;/TD&gt;&lt;TD&gt;3&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I tried this code but it omits the observation from ds2, ID1, age3=13:&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=""&gt;data ages_all;
  /* First dataset */
  set ds1(rename=(age1=age) in=in1 keep=ID age1);
  
  /* Second dataset */
  set ds2(rename=(age2=age age3=age_renamed1 age4=age_renamed2) in=in2 keep=ID age2 age3 age4);
  
  /* Rename age_renamed1 and age_renamed2 to age */
  if in2 then age = coalesce(age_renamed1, age_renamed2);
  
  /* Third dataset */
  set ds3(rename=(age5=age) in=in3 keep=ID age5);

  by ID;
  
  ds = in1 + 2*in2 + 3*in3;
  drop age_renamed1 age_renamed2; /* Drop the temporary variables */
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Mon, 31 Jul 2023 18:26:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merge-datasets-with-different-variables-that-measure-the-same/m-p/887160#M350526</guid>
      <dc:creator>ANKH1</dc:creator>
      <dc:date>2023-07-31T18:26:59Z</dc:date>
    </item>
    <item>
      <title>Re: Merge datasets with different variables that measure the same thing and consolidate them in one</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merge-datasets-with-different-variables-that-measure-the-same/m-p/887161#M350527</link>
      <description>&lt;P&gt;It would be easier to just use an array.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;
  set ds1 ds2 ds3 ;
  by id;
  array x age1-age5;
  do index=1 to dim(x);
     age=x[index];
     if not missing(age) then output;
  end;
  drop age1-age5;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;If you really did need to know which value came from which dataset you use the INDEX variable for that.&amp;nbsp; INDEX=1 is from DS1 and INDEX=5 is from DS3 and the other three are all from DS2.&lt;/P&gt;</description>
      <pubDate>Mon, 31 Jul 2023 18:34:55 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merge-datasets-with-different-variables-that-measure-the-same/m-p/887161#M350527</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2023-07-31T18:34:55Z</dc:date>
    </item>
    <item>
      <title>Re: Merge datasets with different variables that measure the same thing and consolidate them in one</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merge-datasets-with-different-variables-that-measure-the-same/m-p/887175#M350529</link>
      <description>&lt;P&gt;Thanks! This is very helpful. Is there a way to say that age2,age3 and age4 belong to index 2? The problem is that we have more than 20 datasets and each one might or might have multiple age variables.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 31 Jul 2023 19:30:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merge-datasets-with-different-variables-that-measure-the-same/m-p/887175#M350529</guid>
      <dc:creator>ANKH1</dc:creator>
      <dc:date>2023-07-31T19:30:41Z</dc:date>
    </item>
    <item>
      <title>Re: Merge datasets with different variables that measure the same thing and consolidate them in one</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merge-datasets-with-different-variables-that-measure-the-same/m-p/887176#M350530</link>
      <description>&lt;P&gt;Use the INDSNAME= option in the SET statement&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Example:&amp;nbsp;&lt;A href="https://documentation.sas.com/doc/en/pgmmvacdc/9.4/lestmtsref/p00hxg3x8lwivcn1f0e9axziw57y.htm#p00fwyxqcqptpcn10ivxt1r470q1" target="_blank"&gt;https://documentation.sas.com/doc/en/pgmmvacdc/9.4/lestmtsref/p00hxg3x8lwivcn1f0e9axziw57y.htm#p00fwyxqcqptpcn10ivxt1r470q1&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 31 Jul 2023 19:33:27 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merge-datasets-with-different-variables-that-measure-the-same/m-p/887176#M350530</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2023-07-31T19:33:27Z</dc:date>
    </item>
    <item>
      <title>Re: Merge datasets with different variables that measure the same thing and consolidate them in one</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merge-datasets-with-different-variables-that-measure-the-same/m-p/887177#M350531</link>
      <description>&lt;P&gt;Multiple SET statements behave in a way you likely do not expect.&lt;/P&gt;
&lt;P&gt;You really need to watch what the observations notes in the log shows.&lt;/P&gt;
&lt;P&gt;Brief example:&lt;/P&gt;
&lt;PRE&gt;data one;
   input x;
datalines;
1
2
3
;

data two;
   input y;
datalines;
11
22
;

data sets;
   set one;
   set two;
run;&lt;/PRE&gt;
&lt;P&gt;How many observations do you expect?&lt;/P&gt;
&lt;P&gt;Now, read the LOG:&lt;/P&gt;
&lt;PRE&gt;16   data sets;
17      set one;
18      set two;
19   run;

NOTE: There were 3 observations read from the data set WORK.ONE.
NOTE: There were 2 observations read from the data set WORK.TWO.
NOTE: The data set WORK.SETS has 2 observations and 2 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds
&lt;/PRE&gt;
&lt;P&gt;3 observations from the first set but the final output data only has 2 observations.&lt;/P&gt;
&lt;P&gt;Likely your output only includes the number of observations from the set with the smallest number.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;MERGE is a very specific concept in SAS and what you did is not a merge. Especially when you want to combine records according to the value of one or more variables. SAS uses a BY statement with MERGE to align observations on a common value of one or more variables. Caution: Merge does not like, meaning you get unexpected results, if more than one data set contributes multiple values to a Merge.&lt;/P&gt;</description>
      <pubDate>Mon, 31 Jul 2023 19:36:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merge-datasets-with-different-variables-that-measure-the-same/m-p/887177#M350531</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2023-07-31T19:36:50Z</dc:date>
    </item>
    <item>
      <title>Re: Merge datasets with different variables that measure the same thing and consolidate them in one</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merge-datasets-with-different-variables-that-measure-the-same/m-p/887179#M350533</link>
      <description>Ok, thanks for the explanation. I shouldn't say merge.</description>
      <pubDate>Mon, 31 Jul 2023 20:03:07 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merge-datasets-with-different-variables-that-measure-the-same/m-p/887179#M350533</guid>
      <dc:creator>ANKH1</dc:creator>
      <dc:date>2023-07-31T20:03:07Z</dc:date>
    </item>
    <item>
      <title>Re: Merge datasets with different variables that measure the same thing and consolidate them in one</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merge-datasets-with-different-variables-that-measure-the-same/m-p/887182#M350535</link>
      <description>&lt;P&gt;You could do it the same way as in your original program using the IN= dataset option.&lt;/P&gt;
&lt;P&gt;For example to make a numeric variable:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;set ds1(in=in1) ds2(in=in2) ds3(in=in3);
....
ds=in1 + 2*in2 +3*in3;
...&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Or you could use the INDSNAME= option of the SET statement can create a character variable.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;set ds1 ds2 ds3 indsname=indsname;
...
dsname=indsname;
...&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Mon, 31 Jul 2023 20:09:07 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merge-datasets-with-different-variables-that-measure-the-same/m-p/887182#M350535</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2023-07-31T20:09:07Z</dc:date>
    </item>
    <item>
      <title>Re: Merge datasets with different variables that measure the same thing and consolidate them in one</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merge-datasets-with-different-variables-that-measure-the-same/m-p/887183#M350536</link>
      <description>Thanks! I used your code with indsname, it worked! How can I accept two answers as solutions?</description>
      <pubDate>Mon, 31 Jul 2023 20:20:40 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merge-datasets-with-different-variables-that-measure-the-same/m-p/887183#M350536</guid>
      <dc:creator>ANKH1</dc:creator>
      <dc:date>2023-07-31T20:20:40Z</dc:date>
    </item>
    <item>
      <title>Re: Merge datasets with different variables that measure the same thing and consolidate them in one</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merge-datasets-with-different-variables-that-measure-the-same/m-p/887196#M350537</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/226521"&gt;@ANKH1&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;Ok, thanks for the explanation. I shouldn't say merge.&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;AND realize that multiple SET statements is a much more complex issue than stacking (appending) data.&lt;/P&gt;</description>
      <pubDate>Mon, 31 Jul 2023 20:34:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merge-datasets-with-different-variables-that-measure-the-same/m-p/887196#M350537</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2023-07-31T20:34:41Z</dc:date>
    </item>
  </channel>
</rss>

