<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Merging Datasets - Truncated Values in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Merging-Stacking-Datasets-Truncated-Values/m-p/259152#M50083</link>
    <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/32733"&gt;@FreelanceReinh﻿&lt;/a&gt;&amp;nbsp;Sorry. I should have tried this first. Given my struggles lately, I assumed that it couldn't be that easy...&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;It works, kinda. A new PROC CONTENTS says the length has been changed, however the values are still truncated. Obviously I need to do some more investigating.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The values are not truncated in any of the datasets that go into the final dataset...Hmmm.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Btw, I should have made it clear that I am&amp;nbsp;joining/stacking the 5 merged datasets into one final dataset (i.e. Not merging). I'm going to make the change in the original post.&lt;/P&gt;</description>
    <pubDate>Fri, 25 Mar 2016 20:51:19 GMT</pubDate>
    <dc:creator>_maldini_</dc:creator>
    <dc:date>2016-03-25T20:51:19Z</dc:date>
    <item>
      <title>Merging/Stacking Datasets - Truncated Values</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merging-Stacking-Datasets-Truncated-Values/m-p/259136#M50075</link>
      <description>&lt;P&gt;I am stacking&amp;nbsp;5 datasets into 1 (Time periods 1-5 = Final dataset). Each of these 5 datasets was the result&amp;nbsp;of a merge of&amp;nbsp;5 datasets (e.g. Physical activity, nutrition, personal health, etc.). That's 25 datasets total. There are 500 variables and 2000 observations.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I used PROC IMPORT for each of 25 Excel worksheets&amp;nbsp;(5 files w/ 5 tabs each).&amp;nbsp;I will never do this&amp;nbsp;again!&amp;nbsp;&lt;img id="smileylol" class="emoticon emoticon-smileylol" src="https://communities.sas.com/i/smilies/16x16_smiley-lol.png" alt="Smiley LOL" title="Smiley LOL" /&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I am seeing truncated values in at least 1 character variable - an open text box - in the final dataset only.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Not surprisingly, according to PROC CONTENTS, the length of this variable is different in each of the 5 merged datasets (Time period 1, 2, 3, 4, 5).&amp;nbsp;No truncation occurs in these datasets.&amp;nbsp;The truncation only occurs in the final dataset, after these 5 datasets are joined&amp;nbsp;into&amp;nbsp;one.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Can I assign length, or other attributes, to certain variables in the DATA step before&amp;nbsp;the SET&amp;nbsp;statement for the final dataset?&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;My code:&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;DATA a1_a5_final;

  LENGTH var1 $45;
  SET 
			a1_final 
			a2_final 
			a3_final 
			a4_final 
			a5_final;	
RUN; &lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;There are only a handful of character variables that I would need to be worried about here. In a perfect world, I would go back and read-in the data for each worksheet and set the attributes. It's too late for that now.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Is there a quicker way to solve this problem?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thank you!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 25 Mar 2016 20:53:43 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merging-Stacking-Datasets-Truncated-Values/m-p/259136#M50075</guid>
      <dc:creator>_maldini_</dc:creator>
      <dc:date>2016-03-25T20:53:43Z</dc:date>
    </item>
    <item>
      <title>Re: Merging Datasets - Truncated Values</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merging-Stacking-Datasets-Truncated-Values/m-p/259140#M50077</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/36911"&gt;@_maldini_&lt;/a&gt; wrote:&lt;BR /&gt;
&lt;P&gt;Can I assign length, or other attributes, to certain variables in the DATA step before&amp;nbsp;the MERGE statement for the final dataset?&amp;nbsp;&amp;nbsp;&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Yes, you can. Assigning lengths could look like this:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;
length first_chvar $260 second_chvar $300 ...;
merge ...
...
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Fri, 25 Mar 2016 19:58:26 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merging-Stacking-Datasets-Truncated-Values/m-p/259140#M50077</guid>
      <dc:creator>FreelanceReinh</dc:creator>
      <dc:date>2016-03-25T19:58:26Z</dc:date>
    </item>
    <item>
      <title>Re: Merging Datasets - Truncated Values</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merging-Stacking-Datasets-Truncated-Values/m-p/259148#M50080</link>
      <description>&lt;P&gt;Consider using PROC SQL to join the data sets instead of a DATA step. &amp;nbsp;SQL is pretty good at locating the longest length needed and using it, with no extra work on your part.&lt;/P&gt;</description>
      <pubDate>Fri, 25 Mar 2016 20:26:06 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merging-Stacking-Datasets-Truncated-Values/m-p/259148#M50080</guid>
      <dc:creator>Astounding</dc:creator>
      <dc:date>2016-03-25T20:26:06Z</dc:date>
    </item>
    <item>
      <title>Re: Merging Datasets - Truncated Values</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merging-Stacking-Datasets-Truncated-Values/m-p/259151#M50082</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/4954"&gt;@Astounding﻿&lt;/a&gt;&amp;nbsp;I've never used it. Would it be too much to ask for some example code?&amp;nbsp;&lt;img id="smileysurprised" class="emoticon emoticon-smileysurprised" src="https://communities.sas.com/i/smilies/16x16_smiley-surprised.png" alt="Smiley Surprised" title="Smiley Surprised" /&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Btw, I should have made it clear that was joining/stacking the 5 merged datasets into one final dataset (i.e. Not merging)...&lt;/P&gt;</description>
      <pubDate>Fri, 25 Mar 2016 20:48:52 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merging-Stacking-Datasets-Truncated-Values/m-p/259151#M50082</guid>
      <dc:creator>_maldini_</dc:creator>
      <dc:date>2016-03-25T20:48:52Z</dc:date>
    </item>
    <item>
      <title>Re: Merging Datasets - Truncated Values</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merging-Stacking-Datasets-Truncated-Values/m-p/259152#M50083</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/32733"&gt;@FreelanceReinh﻿&lt;/a&gt;&amp;nbsp;Sorry. I should have tried this first. Given my struggles lately, I assumed that it couldn't be that easy...&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;It works, kinda. A new PROC CONTENTS says the length has been changed, however the values are still truncated. Obviously I need to do some more investigating.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The values are not truncated in any of the datasets that go into the final dataset...Hmmm.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Btw, I should have made it clear that I am&amp;nbsp;joining/stacking the 5 merged datasets into one final dataset (i.e. Not merging). I'm going to make the change in the original post.&lt;/P&gt;</description>
      <pubDate>Fri, 25 Mar 2016 20:51:19 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merging-Stacking-Datasets-Truncated-Values/m-p/259152#M50083</guid>
      <dc:creator>_maldini_</dc:creator>
      <dc:date>2016-03-25T20:51:19Z</dc:date>
    </item>
    <item>
      <title>Re: Merging Datasets - Truncated Values</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merging-Stacking-Datasets-Truncated-Values/m-p/259153#M50084</link>
      <description>&lt;P&gt;You can use the LENGTH statement in the same way before a SET statement (if that's what you used).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The important point&amp;nbsp;is that the LENGTH statement is executed &lt;EM&gt;before&lt;/EM&gt; the SET (or MERGE, ...) statement, so that the specified lengths are set in the PDV (program data vector)&amp;nbsp;and the values from the input datasets are read into the "wide fields" prepared in the PDV.&lt;/P&gt;</description>
      <pubDate>Fri, 25 Mar 2016 20:58:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merging-Stacking-Datasets-Truncated-Values/m-p/259153#M50084</guid>
      <dc:creator>FreelanceReinh</dc:creator>
      <dc:date>2016-03-25T20:58:32Z</dc:date>
    </item>
    <item>
      <title>Re: Merging/Stacking Datasets - Truncated Values</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merging-Stacking-Datasets-Truncated-Values/m-p/259157#M50086</link>
      <description>&lt;P&gt;Also, please make sure that&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;no permanent formats (of insufficient lengths) are associated with the character variables in question (see PROC CONTENTS output)&lt;/LI&gt;
&lt;LI&gt;the truncation is not&amp;nbsp;just caused by displaying the values, e.g., with PROC PRINT (see the log for pertinent warnings).&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;Maybe your variables contain the correct values, but either of the above issues&amp;nbsp;lets them &lt;EM&gt;appear&lt;/EM&gt; truncated.&lt;/P&gt;</description>
      <pubDate>Fri, 25 Mar 2016 21:04:57 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merging-Stacking-Datasets-Truncated-Values/m-p/259157#M50086</guid>
      <dc:creator>FreelanceReinh</dc:creator>
      <dc:date>2016-03-25T21:04:57Z</dc:date>
    </item>
    <item>
      <title>Re: Merging Datasets - Truncated Values</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merging-Stacking-Datasets-Truncated-Values/m-p/259187#M50093</link>
      <description>&lt;P&gt;Here's a version worth trying:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;proc sql;&lt;/P&gt;
&lt;P&gt;create table a1_a5_final as&lt;/P&gt;
&lt;P&gt;select * from a1_final union&lt;/P&gt;
&lt;P&gt;select * from a2_final union&lt;/P&gt;
&lt;P&gt;select * from a3_final union&lt;/P&gt;
&lt;P&gt;select * from a4_final union&lt;/P&gt;
&lt;P&gt;select * from a5_final;&lt;/P&gt;
&lt;P&gt;quit;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Pay careful attention to the number of records coming in vs. going out. &amp;nbsp;SQL UNION has a habit of removing duplicates.&lt;/P&gt;</description>
      <pubDate>Fri, 25 Mar 2016 22:23:06 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merging-Stacking-Datasets-Truncated-Values/m-p/259187#M50093</guid>
      <dc:creator>Astounding</dc:creator>
      <dc:date>2016-03-25T22:23:06Z</dc:date>
    </item>
    <item>
      <title>Re: Merging Datasets - Truncated Values</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merging-Stacking-Datasets-Truncated-Values/m-p/259189#M50095</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/4954"&gt;@Astounding&lt;/a&gt; wrote:&lt;BR /&gt;
&lt;P&gt;SQL UNION has a habit of removing duplicates.&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Indeed, this is the standard behavior. But you can avoid the elimination of duplicates by adding the ALL keyword.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Another "habit" of the UNION operator which is &lt;EM&gt;very&lt;/EM&gt; unfamiliar if you come from data step programming is to align columns by &lt;EM&gt;position&lt;/EM&gt;, not by name. Again, there is a keyword, CORRESPONDING, to change&amp;nbsp;this behavior to the more familiar alignment by variable name. So, you may want to try "&lt;FONT face="courier new,courier"&gt;union all corresponding&lt;/FONT&gt;" if you encounter unwanted effects with "&lt;FONT face="courier new,courier"&gt;union&lt;/FONT&gt;" alone. However, the CORRESPONDING keyword&amp;nbsp;has a (maybe again unwanted) side effect: Non-matching columns are dropped. (So, in this case the UNION &lt;SPAN&gt;CORRESPONDING&amp;nbsp;&lt;/SPAN&gt;operator operates like an "&lt;EM&gt;intersection&lt;/EM&gt; operator" on the sets of variable names ... Maybe the data step is somewhat easier to use.)&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Edit: I forgot to mention that PROC SQL in general has a habit of delivering&amp;nbsp;observations in (what can seem like) random order. (But&amp;nbsp;you can add an ORDER BY clause ... &lt;EM&gt;if&lt;/EM&gt; you can specify sort criteria.)&lt;/P&gt;</description>
      <pubDate>Fri, 25 Mar 2016 22:57:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merging-Stacking-Datasets-Truncated-Values/m-p/259189#M50095</guid>
      <dc:creator>FreelanceReinh</dc:creator>
      <dc:date>2016-03-25T22:57:38Z</dc:date>
    </item>
    <item>
      <title>Re: Merging/Stacking Datasets - Truncated Values</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merging-Stacking-Datasets-Truncated-Values/m-p/259194#M50100</link>
      <description>&lt;P&gt;SAS has an extremely nasty habit when reading from Excel or external databases of permanently attaching formats to character variables.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;format comment $100.;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Now if you use PROC IMPORT to let SAS guess at how to format your data from Excel of text files then you can end up with the variable in one file being of length 10 and in another file of length 20. &amp;nbsp;This causes two problems. &amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The first problem is how to make sure that when you stack them that each variable is assigned the proper length so that no values are truncated. &amp;nbsp;PROC SQL can help with that. &amp;nbsp;Or you can just define the variables BEFORE the MERGE or SET statement.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want ;
   length id $20 age 8 comment $200 ;
   set ds1-ds5 ;
run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The second problem as those annoying permanently attached formats. &amp;nbsp;If the first dataset has COMMENT define as LENGTH=$200 and has no format attached and the second has it defined as LENGTH=100 and has $100. format attached if you set them together you get the right length but the wrong format is attached so when you print the value it is truncated. &amp;nbsp;You fix this by adding a format statement that lists variables but does not have a format specified. This will remove the format from those variables.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want ;
   length id $20 age 8 comment $200 ;
   set ds1-ds5 ;
   format _character_ ;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Fri, 25 Mar 2016 23:29:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merging-Stacking-Datasets-Truncated-Values/m-p/259194#M50100</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2016-03-25T23:29:38Z</dc:date>
    </item>
    <item>
      <title>Re: Merging/Stacking Datasets - Truncated Values</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merging-Stacking-Datasets-Truncated-Values/m-p/259489#M50218</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/159"&gt;@Tom﻿&lt;/a&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This explanation is very helpful. Thank you.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;After playing around w/ these ideas, I noticed that the lengths and formats for many of these variable conflicted, or there was no format assigned, as you mentioned. I also noticed that when the LENGTH and the FORMAT conflict, the LENGTH attribute "wins" in the dataset.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;lt;You fix this by adding a format statement that lists variables but does not have a format specified. This will remove the format from those variables.&amp;gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;In your example you put&amp;nbsp;"_character_" in the FORMAT statement.&amp;nbsp;What is the effect of this? Will it simply strip the format of all character variables?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Final question, if I included an INFORMAT statement in a DATA step like this, where would it go relative to the LENGTH, SET and FORMAT statements? After LENGTH and before SET?&lt;/P&gt;</description>
      <pubDate>Mon, 28 Mar 2016 22:42:42 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merging-Stacking-Datasets-Truncated-Values/m-p/259489#M50218</guid>
      <dc:creator>_maldini_</dc:creator>
      <dc:date>2016-03-28T22:42:42Z</dc:date>
    </item>
    <item>
      <title>Re: Merging/Stacking Datasets - Truncated Values</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merging-Stacking-Datasets-Truncated-Values/m-p/259500#M50220</link>
      <description>&lt;P&gt;_CHARACTER_ is an example of a variable list. &amp;nbsp;It means all character variables. &amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;There is almost no situations where it makes sense to have a $xxx. format permanently attached to a variable. &amp;nbsp;Formats instruct SAS how to display the data and SAS already knows how to display character strings without any special instructions.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The real solution is to avoid using CSV files or other formats for your data where the lengths of variables are not defined. &amp;nbsp;If you have a lot of CSV files then create your own data step&amp;nbsp;to read them so that you can make sure to define the variables properly. &amp;nbsp;This will make combining them easier.&lt;/P&gt;</description>
      <pubDate>Mon, 28 Mar 2016 22:45:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merging-Stacking-Datasets-Truncated-Values/m-p/259500#M50220</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2016-03-28T22:45:46Z</dc:date>
    </item>
  </channel>
</rss>

