<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Issue merging data in Statistical Procedures</title>
    <link>https://communities.sas.com/t5/Statistical-Procedures/Issue-merging-data/m-p/181334#M9424</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;That's right.&amp;nbsp; It is suggested that the two data files are identical except for the new variables.&amp;nbsp; But obviously there is a difference here between what ought to be and what is.&amp;nbsp; Unfortunately, I don't use SAS much and this is turning especially problematic.&amp;nbsp;&amp;nbsp; Is there a way to use proc sql without identifying every one of the 3000-4000 variables.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Mon, 13 Oct 2014 17:42:26 GMT</pubDate>
    <dc:creator>tgm4052</dc:creator>
    <dc:date>2014-10-13T17:42:26Z</dc:date>
    <item>
      <title>Issue merging data</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Issue-merging-data/m-p/181327#M9417</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;For the variables I have checked, they appear to be numeric in both datasets.&amp;nbsp; &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;13678&amp;nbsp; data call.&amp;amp;name;&lt;/P&gt;&lt;P&gt;13679&amp;nbsp; retain RSSD9001 RSSD9999;&lt;/P&gt;&lt;P&gt;13680&amp;nbsp; merge work.&amp;amp;name call.&amp;amp;name (rename=(RCON9999=RSSD9999));&lt;/P&gt;&lt;P&gt;ERROR: Variable RSSD9050 has been defined as both character and numeric.&lt;/P&gt;&lt;P&gt;ERROR: Variable RCON6979 has been defined as both character and numeric.&lt;/P&gt;&lt;P&gt;ERROR: Variable RCON5562 has been defined as both character and numeric.&lt;/P&gt;&lt;P&gt;ERROR: Variable RCON5563 has been defined as both character and numeric.&lt;/P&gt;&lt;P&gt;ERROR: Variable RCON5564 has been defined as both character and numeric.&lt;/P&gt;&lt;P&gt;ERROR: Variable RCON5565 has been defined as both character and numeric.&lt;/P&gt;&lt;P&gt;ERROR: Variable RCON5566 has been defined as both character and numeric.&lt;/P&gt;&lt;P&gt;ERROR: Variable RCON5567 has been defined as both character and numeric.&lt;/P&gt;&lt;P&gt;ERROR: Variable RCON5568 has been defined as both character and numeric.&lt;/P&gt;&lt;P&gt;ERROR: Variable RCON5569 has been defined as both character and numeric.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The instructions for this data merge are here: &lt;A href="http://chicagofed.org/digital_assets/others/banking/financial_institution_reports/CDR_TabDelimitedFileImport.txt" title="http://chicagofed.org/digital_assets/others/banking/financial_institution_reports/CDR_TabDelimitedFileImport.txt"&gt;http://chicagofed.org/digital_assets/others/banking/financial_institution_reports/CDR_TabDelimitedFileImport.txt&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;But I can't seem to resolve the issue.&amp;nbsp; Any help would be greatly appreciated&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 10 Oct 2014 20:14:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Issue-merging-data/m-p/181327#M9417</guid>
      <dc:creator>tgm4052</dc:creator>
      <dc:date>2014-10-10T20:14:31Z</dc:date>
    </item>
    <item>
      <title>Re: Issue merging data</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Issue-merging-data/m-p/181328#M9418</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Run Proc contents on both sets or Proc Compare I bet you will find the variables are different types.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;A very likely cause is importing data with Proc Import, especially from spreadsheets, where the type is being "guessed". Then some values look like numbers but at least one wasn't and the type is character.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;One solution is to import one of the files with proc import and use the code generated to read another file with similar format changing in and output filename/dataset. Or since your link mentions Tab delimited, write code to read the file from scratch.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 10 Oct 2014 20:20:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Issue-merging-data/m-p/181328#M9418</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2014-10-10T20:20:39Z</dc:date>
    </item>
    <item>
      <title>Re: Issue merging data</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Issue-merging-data/m-p/181329#M9419</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I suppose I'll give that a shot.&amp;nbsp; Otherwise, I might try UPDATE instead of MERGE and see if that doesn't yield the same error messages.&amp;nbsp; There would simpler alternative like proc sql but this dataset has roughly 4000 variables. &lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 13 Oct 2014 14:40:11 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Issue-merging-data/m-p/181329#M9419</guid>
      <dc:creator>tgm4052</dc:creator>
      <dc:date>2014-10-13T14:40:11Z</dc:date>
    </item>
    <item>
      <title>Re: Issue merging data</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Issue-merging-data/m-p/181330#M9420</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Well, you haven't specified a keep on either dataset, so you are getting all the variables from the first table and all the variables from the second.&amp;nbsp; Do you need them all?&amp;nbsp; If so consider applying a rename to all the variables from the second table.&amp;nbsp; Or as you mention if you want a merge of all the data together then update in a datastep, or update via SQL would be a good idea, though you do as ballardw mentions above look at the meta data of each table and identify why the variables have different types.&amp;nbsp; It could be many things, proc import "deciding" on its own what the datatype shoul be, maybe on older data it was all numbers, now with later data some text has appeared etc.&amp;nbsp; Maybe those variables don't exist so have been created, and end up with differing types.&amp;nbsp; Step through your program one step at a time and see what is created and what type is given (right click on dataset and view columns).&amp;nbsp; I also note you have two different namings RCON and RSS, make sure they are assigned correctly.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 13 Oct 2014 14:53:57 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Issue-merging-data/m-p/181330#M9420</guid>
      <dc:creator>RW9</dc:creator>
      <dc:date>2014-10-13T14:53:57Z</dc:date>
    </item>
    <item>
      <title>Re: Issue merging data</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Issue-merging-data/m-p/181331#M9421</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;The ERROR messages indicate that variables of the same name are contained in both datasets to be merged. Aside from the variables in the by statement, this is a BAD idea because it will lead to (almost) unpredictable values in the resulting dataset.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 13 Oct 2014 14:59:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Issue-merging-data/m-p/181331#M9421</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2014-10-13T14:59:59Z</dc:date>
    </item>
    <item>
      <title>Re: Issue merging data</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Issue-merging-data/m-p/181332#M9422</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Unfortunately I do need to retain all variables.&amp;nbsp; Basically they added new variables to the Fed's Call Reports after 2011 and to keep the analysis consistent I need to merge the two, the vast majority of the data should be identical with differences arising with the advent of the new variables.&amp;nbsp; &lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 13 Oct 2014 16:33:34 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Issue-merging-data/m-p/181332#M9422</guid>
      <dc:creator>tgm4052</dc:creator>
      <dc:date>2014-10-13T16:33:34Z</dc:date>
    </item>
    <item>
      <title>Re: Issue merging data</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Issue-merging-data/m-p/181333#M9423</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;PRE __jive_macro_name="quote" class="jive_text_macro jive_macro_quote" modifiedtitle="true"&gt;
&lt;P&gt;tgm4052 wrote:&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P&gt;Unfortunately&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="font-size: 10pt; line-height: 1.5em;"&gt; they added new variables &lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="font-size: 10pt; line-height: 1.5em;"&gt;YOU .NEED to keep the analysis consistent&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN style="font-size: 10pt; line-height: 1.5em;"&gt;majority of the data should be identical with differences arising with the advent of the new variables.&amp;nbsp; &lt;/SPAN&gt;&lt;/P&gt;
&lt;/PRE&gt;&lt;P&gt;except thar&lt;/P&gt;&lt;P&gt;there are other differences&amp;nbsp; &lt;/P&gt;&lt;P&gt;- in the data type in columns identified in those SAS error messages.&lt;/P&gt;&lt;P style="font-size: 25px; font-family: 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif; background-color: #ffffff;"&gt;RSSD9050 both character and numeric.&lt;/P&gt;&lt;P style="font-size: 25px; font-family: 'Helvetica Neue', Helvetica, Arial, 'Lucida Grande', sans-serif; background-color: #ffffff;"&gt;RCON6979 and &lt;SPAN style="font-size: 25px; line-height: 1.5em;"&gt;RCON5562&lt;/SPAN&gt;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 13 Oct 2014 16:55:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Issue-merging-data/m-p/181333#M9423</guid>
      <dc:creator>Peter_C</dc:creator>
      <dc:date>2014-10-13T16:55:28Z</dc:date>
    </item>
    <item>
      <title>Re: Issue merging data</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Issue-merging-data/m-p/181334#M9424</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;That's right.&amp;nbsp; It is suggested that the two data files are identical except for the new variables.&amp;nbsp; But obviously there is a difference here between what ought to be and what is.&amp;nbsp; Unfortunately, I don't use SAS much and this is turning especially problematic.&amp;nbsp;&amp;nbsp; Is there a way to use proc sql without identifying every one of the 3000-4000 variables.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 13 Oct 2014 17:42:26 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Issue-merging-data/m-p/181334#M9424</guid>
      <dc:creator>tgm4052</dc:creator>
      <dc:date>2014-10-13T17:42:26Z</dc:date>
    </item>
    <item>
      <title>Re: Issue merging data</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Issue-merging-data/m-p/181335#M9425</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;But you won't keep the values of the data this way. Merging datasets with identical variables leads to bogus data, since which value "trumps" during the merge is quite unpredictable.&lt;/P&gt;&lt;P&gt;You need to rename the variables so that they can exist in parallel, and you can then decide in the data step which values to keep.&lt;/P&gt;&lt;P&gt;Since SAS already complains about different data types, you need to do some serious cleaning before you actually get to the merge.&lt;/P&gt;&lt;P&gt;If the variables in common _should_ contain identical values (I would NEVER trust statements like that without thoroughly checking the data first. I consider all outside data bullshit until proven otherwise; just think of the fact that in your case you have numeric and character mixed up, already), then you do best by dropping them from one of the datasets during the merge, so that you avoid conflict. You can write a preceding step to determine the common variables and set up a drop list in a macro variable.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 14 Oct 2014 05:29:27 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Issue-merging-data/m-p/181335#M9425</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2014-10-14T05:29:27Z</dc:date>
    </item>
    <item>
      <title>Re: Issue merging data</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Issue-merging-data/m-p/181336#M9426</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Kurt&lt;/P&gt;&lt;P&gt;i thought it was documented that non by-variables are used from the most-right table that contributes that variable ( now I must find that doc ).&lt;/P&gt;&lt;P&gt;of course even if we can be that specific it wont fix the problem in that ERROR:&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 14 Oct 2014 05:55:15 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Issue-merging-data/m-p/181336#M9426</guid>
      <dc:creator>Peter_C</dc:creator>
      <dc:date>2014-10-14T05:55:15Z</dc:date>
    </item>
    <item>
      <title>Re: Issue merging data</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Issue-merging-data/m-p/181337#M9427</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Imagine the following situation:&lt;/P&gt;&lt;P&gt;dataset have1:&lt;/P&gt;&lt;P&gt;Key Data&lt;/P&gt;&lt;P&gt;A 1&lt;/P&gt;&lt;P&gt;A 2&lt;/P&gt;&lt;P&gt;A 3&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;dataset have2:&lt;/P&gt;&lt;P&gt;Key Data&lt;/P&gt;&lt;P&gt;A 4&lt;/P&gt;&lt;P&gt;A 5&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;data want;&lt;/P&gt;&lt;P&gt;merge have1 have2;&lt;/P&gt;&lt;P&gt;by Key;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Dataset want will have the values of have2 in the records 1 &amp;amp; 2 of Key 'A', but record 3 will have the values of have1.&lt;/P&gt;&lt;P&gt;I repeat: having non-by variables with identical names in a merge is usually a BAD idea. Luckily, tgm4052's example crashed because of differing data types, resulting in a good learning example about cleaning input data.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;So what needs to be done?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;- Make sure that variables with identical names have identical data types&lt;/P&gt;&lt;P&gt;- Check if identical variables in identical records (as identified by key or position) have correct identical values.&lt;/P&gt;&lt;P&gt;- Make sure that no ill effects arise like what I mentioned at the beginning of this post&lt;/P&gt;&lt;P&gt;- Write a solid merge that deals with everything&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 14 Oct 2014 06:20:14 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Issue-merging-data/m-p/181337#M9427</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2014-10-14T06:20:14Z</dc:date>
    </item>
    <item>
      <title>Re: Issue merging data</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Issue-merging-data/m-p/181338#M9428</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Although agreeing with you, I would make a further recommendation:&lt;/P&gt;&lt;P&gt;When treating data like transactions for a database table probably we should use the db----- options like DBTYPE &lt;/P&gt;&lt;P&gt;and save PROC IMPORT for places where the input structure is unknown. &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Of course it would help if data suppliers provided a definition of the data supplied.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 14 Oct 2014 07:20:21 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Issue-merging-data/m-p/181338#M9428</guid>
      <dc:creator>Peter_C</dc:creator>
      <dc:date>2014-10-14T07:20:21Z</dc:date>
    </item>
    <item>
      <title>Re: Issue merging data</title>
      <link>https://communities.sas.com/t5/Statistical-Procedures/Issue-merging-data/m-p/181339#M9429</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Reviewing the instructions for getting the data they are using proc import.&lt;/P&gt;&lt;P&gt;It could be better to replace that one with an approach that does not guessing column types. &lt;/P&gt;&lt;P&gt;Bu guessing you have most likely the cause the datasets got different types. That is the failure at merging.&lt;/P&gt;&lt;P&gt;Using the datastep merge the first (left) value gets precedence unless missing. That is different when using sql where you have to specify that.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 14 Oct 2014 07:42:02 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Statistical-Procedures/Issue-merging-data/m-p/181339#M9429</guid>
      <dc:creator>jakarman</dc:creator>
      <dc:date>2014-10-14T07:42:02Z</dc:date>
    </item>
  </channel>
</rss>

