<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Merging two datasets with missing values in New SAS User</title>
    <link>https://communities.sas.com/t5/New-SAS-User/Merging-two-datasets-with-missing-values/m-p/652367#M22456</link>
    <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/315691"&gt;@AlG&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/31461"&gt;@mkeintz&lt;/a&gt;&amp;nbsp;Thanks so much. I was a bit confused about one aspect of your code. Let me use an example:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Let's say the ABC dataset (sales data) that I mentioned, includes three companies: A, B, C.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The DEF dataset (ad expenditure data) includes two companies: A, B (as I said earlier, the companies in DEF are a subset of those in ABC).&lt;/P&gt;
&lt;P&gt;Both datasets include data on 100 weeks. For company A, ad expenditure and sales data for all 100 weeks are available in the datasets. For company B, 100 weeks of sales data is available in ABC, but only, say, 75 weeks of ad expenditure data is available in DEF (as I said, maybe company B didn't spend anything on advertising during the other 25 weeks and the dataset builder just ignored those weeks, instead of including those weeks and assigning a zero value).&lt;/P&gt;
&lt;P&gt;What I want to have in my merged dataset is:&lt;/P&gt;
&lt;P&gt;--100 rows for company A, which would include sales data for 100 weeks and ad expenditure data for 100 weeks.&lt;/P&gt;
&lt;P&gt;--100 rows for company B, which would include sales data for 100 weeks and ad expenditure data for 75 weeks and assign zero to ad expenditure for the 25 weeks missing in the DEF dataset.&lt;/P&gt;
&lt;P&gt;--0 rows for company C.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Do the codes that you kindly provided give such an output?&lt;/P&gt;
&lt;P&gt;Thanks again!&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;The answer is best determined by running the code. &amp;nbsp; Make a dataset of, say 10 weeks total for 2 companies, with all 10 weeks reported in the sales (say 3 variables) data for each company, and a subset of weeks reported in the ad expenditures, with a couple of variables.&amp;nbsp; Run the code, look at the results.&amp;nbsp; This is half the fun of programming - testing code, then going back if necessary to fix or improve it.&amp;nbsp; Welcome to the club.&lt;/P&gt;</description>
    <pubDate>Mon, 01 Jun 2020 20:50:30 GMT</pubDate>
    <dc:creator>mkeintz</dc:creator>
    <dc:date>2020-06-01T20:50:30Z</dc:date>
    <item>
      <title>Merging two datasets with missing values</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Merging-two-datasets-with-missing-values/m-p/651999#M22441</link>
      <description>&lt;P&gt;Hi friends,&lt;/P&gt;&lt;P&gt;I have two datasets ABC and DEF. They include weekly data about some companies during 2014-2018 (The companies listed in DEF are a subset of companies in ABC). ABC includes sales data and DEF includes some advertising expenditure data.&lt;/P&gt;&lt;P&gt;My problem is that whereas ABC does include all of the weeks of that 4-year period, DEF lacks some of the weeks (probably the company did not have any ad expenditure during that week and the builder of the dataset decided to ignore that week altogether, instead of dedicating a row to that week and listing ad expenditure as zero).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I want to fix this issue in my merged dataset. I want to build a merged file in such a way that it does include those problematic weeks (i.e. it should include the week and list the sales data, and include blank, or preferably zero, for ad expenditure data).&amp;nbsp;&lt;/P&gt;&lt;P&gt;If I did not have this issue, I think the following code would give me the merged file. But how can I modify it to account for those problematic weeks? Thanks so much in advance!&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;PROC SQL;
create table FinalMergedDataset as SELECT *
 FROM DEF, ABC
 WHERE DEF.MergeName=ABC.brand_name and DEF.Year=ABC.year0 and DEF.GovWeek=ABC.week;
QUIT;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 30 May 2020 18:44:20 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Merging-two-datasets-with-missing-values/m-p/651999#M22441</guid>
      <dc:creator>AlG</dc:creator>
      <dc:date>2020-05-30T18:44:20Z</dc:date>
    </item>
    <item>
      <title>Re: Merging two datasets with missing values</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Merging-two-datasets-with-missing-values/m-p/652051#M22443</link>
      <description>&lt;P&gt;Instead of requesting the intersection of ABC&amp;nbsp; and DEF, do a left join:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;PROC SQL;
create table FinalMergedDataset as SELECT *
 FROM ABC  left join DEF
 ON DEF.MergeName=ABC.brand_name and DEF.Year=ABC.year0 and DEF.GovWeek=ABC.week;
QUIT;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Any instance of ABC not matched in DEF will generate missing values for the def-specific variables.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Now if you need zeroes instead of missing, you need to know the def variables names.&amp;nbsp; Say they are AD1 and AD2:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;PROC SQL;
create table FinalMergedDataset as
   SELECT ABC.*,  
     coalesce(def.a1,0) as a1,
     coalesce(def.a2,0) as a2
 FROM ABC left join DEF
 ON DEF.MergeName=ABC.brand_name and DEF.Year=ABC.year0 and DEF.GovWeek=ABC.week;
QUIT;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Sun, 31 May 2020 02:56:25 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Merging-two-datasets-with-missing-values/m-p/652051#M22443</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2020-05-31T02:56:25Z</dc:date>
    </item>
    <item>
      <title>Re: Merging two datasets with missing values</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Merging-two-datasets-with-missing-values/m-p/652109#M22444</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/31461"&gt;@mkeintz&lt;/a&gt;&amp;nbsp;Thanks so much. I was a bit confused about one aspect of your code. Let me use an example:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Let's say the ABC dataset (sales data) that I mentioned, includes three companies: A, B, C.&amp;nbsp;&lt;/P&gt;&lt;P&gt;The DEF dataset (ad expenditure data) includes two companies: A, B (as I said earlier, the companies in DEF are a subset of those in ABC).&lt;/P&gt;&lt;P&gt;Both datasets include data on 100 weeks. For company A, ad expenditure and sales data for all 100 weeks are available in the datasets. For company B, 100 weeks of sales data is available in ABC, but only, say, 75 weeks of ad expenditure data is available in DEF (as I said, maybe company B didn't spend anything on advertising during the other 25 weeks and the dataset builder just ignored those weeks, instead of including those weeks and assigning a zero value).&lt;/P&gt;&lt;P&gt;What I want to have in my merged dataset is:&lt;/P&gt;&lt;P&gt;--100 rows for company A, which would include sales data for 100 weeks and ad expenditure data for 100 weeks.&lt;/P&gt;&lt;P&gt;--100 rows for company B, which would include sales data for 100 weeks and ad expenditure data for 75 weeks and assign zero to ad expenditure for the 25 weeks missing in the DEF dataset.&lt;/P&gt;&lt;P&gt;--0 rows for company C.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Do the codes that you kindly provided give such an output?&lt;/P&gt;&lt;P&gt;Thanks again!&lt;/P&gt;</description>
      <pubDate>Sun, 31 May 2020 18:57:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Merging-two-datasets-with-missing-values/m-p/652109#M22444</guid>
      <dc:creator>AlG</dc:creator>
      <dc:date>2020-05-31T18:57:32Z</dc:date>
    </item>
    <item>
      <title>Re: Merging two datasets with missing values</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Merging-two-datasets-with-missing-values/m-p/652367#M22456</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/315691"&gt;@AlG&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/31461"&gt;@mkeintz&lt;/a&gt;&amp;nbsp;Thanks so much. I was a bit confused about one aspect of your code. Let me use an example:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Let's say the ABC dataset (sales data) that I mentioned, includes three companies: A, B, C.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The DEF dataset (ad expenditure data) includes two companies: A, B (as I said earlier, the companies in DEF are a subset of those in ABC).&lt;/P&gt;
&lt;P&gt;Both datasets include data on 100 weeks. For company A, ad expenditure and sales data for all 100 weeks are available in the datasets. For company B, 100 weeks of sales data is available in ABC, but only, say, 75 weeks of ad expenditure data is available in DEF (as I said, maybe company B didn't spend anything on advertising during the other 25 weeks and the dataset builder just ignored those weeks, instead of including those weeks and assigning a zero value).&lt;/P&gt;
&lt;P&gt;What I want to have in my merged dataset is:&lt;/P&gt;
&lt;P&gt;--100 rows for company A, which would include sales data for 100 weeks and ad expenditure data for 100 weeks.&lt;/P&gt;
&lt;P&gt;--100 rows for company B, which would include sales data for 100 weeks and ad expenditure data for 75 weeks and assign zero to ad expenditure for the 25 weeks missing in the DEF dataset.&lt;/P&gt;
&lt;P&gt;--0 rows for company C.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Do the codes that you kindly provided give such an output?&lt;/P&gt;
&lt;P&gt;Thanks again!&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;The answer is best determined by running the code. &amp;nbsp; Make a dataset of, say 10 weeks total for 2 companies, with all 10 weeks reported in the sales (say 3 variables) data for each company, and a subset of weeks reported in the ad expenditures, with a couple of variables.&amp;nbsp; Run the code, look at the results.&amp;nbsp; This is half the fun of programming - testing code, then going back if necessary to fix or improve it.&amp;nbsp; Welcome to the club.&lt;/P&gt;</description>
      <pubDate>Mon, 01 Jun 2020 20:50:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Merging-two-datasets-with-missing-values/m-p/652367#M22456</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2020-06-01T20:50:30Z</dc:date>
    </item>
    <item>
      <title>Re: Merging two datasets with missing values</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Merging-two-datasets-with-missing-values/m-p/658286#M22642</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/31461"&gt;@mkeintz&lt;/a&gt;&amp;nbsp;Thanks! I tried the following code. The issue that I am facing is that for non-matching observations the value for for week_volume is still missing.&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;PROC SQL;
create table Want as
   SELECT ABC.*, DEF.* 
     COALESCE(DEF.Week_Volume,0) as Week_Volume,
 FROM ABC left join DEF
 ON DEF.MergeName=ABC.brand_name and DEF.Year=ABC.year0 and DEF.GovWeek=ABC.week;
QUIT;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&lt;CODE class=" language-sas"&gt;&lt;/CODE&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;However, when I create a new variable (instead of replacing the missing values in the original variable), it seems to give me what I want (i.e. zero for non-matching observations):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;PROC SQL;
create table Want as
   SELECT ABC.*, DEF.* 
     COALESCE(DEF.Week_Volume,0) as Week_Volume2,
 FROM ABC left join DEF
 ON DEF.MergeName=ABC.brand_name and DEF.Year=ABC.year0 and DEF.GovWeek=ABC.week;
QUIT;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Given that I want to repeat the same thing for a lot of other variables too, it is really tedious to create new variables for each and then drop the original variables. Can you please tell me what the issue is with my first code above? Thanks again!&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 13 Jun 2020 17:09:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Merging-two-datasets-with-missing-values/m-p/658286#M22642</guid>
      <dc:creator>AlG</dc:creator>
      <dc:date>2020-06-13T17:09:13Z</dc:date>
    </item>
    <item>
      <title>Re: Merging two datasets with missing values</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Merging-two-datasets-with-missing-values/m-p/658313#M22646</link>
      <description>&lt;P&gt;There is a comma missing after DEF.*&lt;/P&gt;
&lt;P&gt;You can't have the same named column twice in a Select statement. SAS will not throw an error but it will just pick one of the same named columns.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The moment you need to modify a column you shouldn't use the * syntax anymore but you should list the columns one by one.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;What can help avoid a lot of typing - run your SQL with options FEEDBACK NOEXEC&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql feedback noexec;
  select c.*
  from sashelp.class c
  ;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;This will write the * syntax expanded to the SAS log&lt;/P&gt;
&lt;PRE&gt;NOTE: Statement transforms to:

        select C.Name, C.Sex, C.Age, C.Height, C.Weight
          from SASHELP.CLASS C;&lt;/PRE&gt;
&lt;P&gt;If there are many variables where I have to change something in the same way I normally copy/paste the variable list into Notepad++ and use RegEx search/replace like:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;1. Each variable on its own line&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Patrick_0-1592099757794.png" style="width: 400px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/42913i954F3A4005CD077C/image-size/medium?v=v2&amp;amp;px=400" role="button" title="Patrick_0-1592099757794.png" alt="Patrick_0-1592099757794.png" /&gt;&lt;/span&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;-&amp;gt;&amp;nbsp; &amp;nbsp;&amp;nbsp;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Patrick_1-1592099788591.png" style="width: 400px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/42914iD43ABB210A116B2A/image-size/medium?v=v2&amp;amp;px=400" role="button" title="Patrick_1-1592099788591.png" alt="Patrick_1-1592099788591.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P style="line-height: 1.71429;"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;2. coalesce() around variables&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Patrick_0-1592100811941.png" style="width: 400px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/42916i0DA2524A34401FAE/image-size/medium?v=v2&amp;amp;px=400" role="button" title="Patrick_0-1592100811941.png" alt="Patrick_0-1592100811941.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;3. Remove the alias from the new variables&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Patrick_1-1592100940075.png" style="width: 400px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/42917i69CCCBC61AEB5353/image-size/medium?v=v2&amp;amp;px=400" role="button" title="Patrick_1-1592100940075.png" alt="Patrick_1-1592100940075.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;...and now just copy/paste the result back into your SAS SQL code&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Above was for demonstration and you need some RegEx experience to use this approach efficiently - but you can of course also use it for partial changes and then do the rest manually.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Alternatively for your actual problem: Use post processing like a SAS data step with array processing where you set missing values to zero.&lt;/P&gt;
&lt;P&gt;....and should you only need the zero's for reporting/printing then also consider to just set options missing='0'. This will print missings as zero in reports.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 14 Jun 2020 02:19:03 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Merging-two-datasets-with-missing-values/m-p/658313#M22646</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2020-06-14T02:19:03Z</dc:date>
    </item>
    <item>
      <title>Re: Merging two datasets with missing values</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Merging-two-datasets-with-missing-values/m-p/658332#M22658</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/12447"&gt;@Patrick&lt;/a&gt;&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/31461"&gt;@mkeintz&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thank you so much for your help!&lt;/P&gt;</description>
      <pubDate>Sun, 14 Jun 2020 05:30:02 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Merging-two-datasets-with-missing-values/m-p/658332#M22658</guid>
      <dc:creator>AlG</dc:creator>
      <dc:date>2020-06-14T05:30:02Z</dc:date>
    </item>
  </channel>
</rss>

