<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: merging 2 datasets by variable - multiple lengths in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/merging-2-datasets-by-variable-multiple-lengths/m-p/456207#M284353</link>
    <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/41529"&gt;@Diana_AdventuresinSAS&lt;/a&gt; wrote:&lt;BR /&gt;
&lt;P&gt;Thanks Tom!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The OCCSOC variable looks good, seems to have worked. However, I still have the case of the missing values for all the ACS2016 variables.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Any thoughts on why that is?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;Obs    occsoc    MET2013    PERNUM    cognitiveindex    manuphysindex 
1      1110XX    .          .          70.3800             18.0500 
2      111021    .          .          64.8700             26.2500 &lt;/CODE&gt;&lt;/PRE&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;I would assume it is because those values of the BY variable are not present in that dataset.&lt;/P&gt;
&lt;P&gt;Formats change how the value is printed not how it is stored.&lt;/P&gt;
&lt;P&gt;Check the values for leading spaces as normally SAS will not display leading spaces.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql ;
  select case when occsoc = left(occsoc) then 'LEFT ALIGNED' 
   else 'OTHER' end as check,count(*)
  from ACS2016 
  group by 1
 ;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Sat, 21 Apr 2018 17:19:19 GMT</pubDate>
    <dc:creator>Tom</dc:creator>
    <dc:date>2018-04-21T17:19:19Z</dc:date>
    <item>
      <title>merging 2 datasets by variable - multiple lengths</title>
      <link>https://communities.sas.com/t5/SAS-Programming/merging-2-datasets-by-variable-multiple-lengths/m-p/456199#M284349</link>
      <description>&lt;P&gt;Hello!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I have 2 datasets I would like to merge by variable OCCSOC. In one of them, it is connected to a formats file. (Unfortunately, I can't figure out how to duplicate the dataset by get rid of the formats.)&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I am getting the following statement:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;52   Data onetlib.onetipums;
53   merge IPUMS.ACS2016 onetlib.onetmergedup;
54   length occsoc $10;
WARNING: Length of character variable OCCSOC has already been set.
         Use the LENGTH statement as the very first statement in the DATA STEP to declare the
         length of a character variable.
55   by occsoc;
56   format occsoc $occf.;
57   run;

WARNING: Multiple lengths were specified for the BY variable OCCSOC by input data sets and
         LENGTH, FORMAT, INFORMAT, or ATTRIB statements. This might cause unexpected results.
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Data=IPUMS.ACS2016&lt;/P&gt;
&lt;P&gt;# &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;Variable&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Type &amp;nbsp;&amp;nbsp;&amp;nbsp;Len &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;Format &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;Label&lt;/P&gt;
&lt;P&gt;10 &amp;nbsp;&amp;nbsp;OCCSOC &amp;nbsp;&amp;nbsp;&amp;nbsp;Char &amp;nbsp;&amp;nbsp;&amp;nbsp;6&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Occupation, SOC classification&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Data=ONETLIB.ONETMERGEDUP&lt;/P&gt;
&lt;P&gt;# &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;Variable&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Type &amp;nbsp;&amp;nbsp;&amp;nbsp;Len &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;Format&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Informat&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Label&lt;/P&gt;
&lt;P&gt;1 &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;OCCSOC &amp;nbsp;&amp;nbsp;&amp;nbsp;Char &amp;nbsp;&amp;nbsp;&amp;nbsp;10 &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;$OCCF. &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;$10. &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;OCCSOC&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This is my merge program:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;libname IPUMS "C:\Users\dps24\Documents\D Drive\IPUMS";
libname onetlib 'C:\Users\dps24\Documents\D Drive\ONET\db_21_3_excel\Use for Dissertation';
options fmtsearch=(onetlib.occformat);

proc contents data=IPUMS.ACS2016; run;
proc contents data=onetlib.onetmergedup; run;

*both data are sorted by occsoc;

Data onetlib.onetipums;
merge IPUMS.ACS2016 onetlib.onetmergedup;
length occsoc $10;
by occsoc;
format occsoc $occf.;
run;

proc print data=onetlib.onetipums (obs=10); run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;I get a very funky print of data:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Obs MET2013 PERNUM PERWT SEX &amp;nbsp;&amp;nbsp;OCCSOC &amp;nbsp;&amp;nbsp;broad major minor IND cognitiveindex manuphysindex workvaluesindex&lt;/P&gt;
&lt;P&gt;1&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;. &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;. &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;. &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;. &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;11-101&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;.&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; .&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; .&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; .&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 70.3800 &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;18.0500 &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;22.9967&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The variables belonging to ACS2016 have missing data, while the onetmerdup variables look fine. The OCCSOC values are wrong, like the formats didn't do their job. When I print these two datasets separately, they look great and OCCSOC values match (e.g. 111010 instead of 11-1010 - I did the formats because I needed the OCCSOC values to match to eventually merge them like I'm trying now).&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Is this a length issue or a format issue? Is it possible to duplicate the onetmergedup data without any attached formats, and would that correct this problem?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks so much!!&lt;/P&gt;
&lt;P&gt;Diana&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 21 Apr 2018 16:21:29 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/merging-2-datasets-by-variable-multiple-lengths/m-p/456199#M284349</guid>
      <dc:creator>Diana_AdventuresinSAS</dc:creator>
      <dc:date>2018-04-21T16:21:29Z</dc:date>
    </item>
    <item>
      <title>Re: merging 2 datasets by variable - multiple lengths</title>
      <link>https://communities.sas.com/t5/SAS-Programming/merging-2-datasets-by-variable-multiple-lengths/m-p/456200#M284350</link>
      <description>&lt;P&gt;SAS will define the variable type and storage length for variable used in a data step at the first place it sees the variable. Once the length for character variable is defined it cannot be changed.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Because the first place&amp;nbsp;&lt;SPAN&gt;OCCSOC appears it is defined as length $6 the result is that y&lt;/SPAN&gt;ou are currently trying to stuff data from a 10 character long variable into a 6 character long variable.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Moving the LENGTH statement before the MERGE statement will define the variable as length $10. This will prevent truncating the values that are read from&amp;nbsp;&lt;SPAN&gt;ONETLIB.ONETMERGEDUP.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The format attached does not matter to how the values are matched. Just how they are printed.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 21 Apr 2018 16:29:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/merging-2-datasets-by-variable-multiple-lengths/m-p/456200#M284350</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2018-04-21T16:29:38Z</dc:date>
    </item>
    <item>
      <title>Re: merging 2 datasets by variable - multiple lengths</title>
      <link>https://communities.sas.com/t5/SAS-Programming/merging-2-datasets-by-variable-multiple-lengths/m-p/456203#M284351</link>
      <description>&lt;P&gt;Thanks Tom!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The OCCSOC variable looks good, seems to have worked. However, I still have the case of the missing values for all the ACS2016 variables.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Any thoughts on why that is?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;Obs    occsoc    MET2013    PERNUM    cognitiveindex    manuphysindex 
1      1110XX    .          .          70.3800             18.0500 
2      111021    .          .          64.8700             26.2500 &lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Sat, 21 Apr 2018 16:47:20 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/merging-2-datasets-by-variable-multiple-lengths/m-p/456203#M284351</guid>
      <dc:creator>Diana_AdventuresinSAS</dc:creator>
      <dc:date>2018-04-21T16:47:20Z</dc:date>
    </item>
    <item>
      <title>Re: merging 2 datasets by variable - multiple lengths</title>
      <link>https://communities.sas.com/t5/SAS-Programming/merging-2-datasets-by-variable-multiple-lengths/m-p/456206#M284352</link>
      <description>&lt;P&gt;Sounds like the OCCSOC variable is actually different in one of the datasets. If one dataset contains OCCSOC values that contain a hyphen, something like the following would correct it:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data ACS2016;
  length OCCSOC $6.;
  input OCCSOC x;
  cards;
111010 4
111020 3
;

data ONETMERGEDUP;
  length OCCSOC $10.;
  input OCCSOC y;
  cards;
11-1010 1
11-1020 2
;

data onetmergedup_fixed (drop=_:);
  length OCCSOC $6;
  set ONETMERGEDUP (rename=(OCCSOC=_OCCSOC));
  OCCSOC=strip(compress(_OCCSOC,'-'));
run;

data want;
  merge ACS2016 onetmergedup_fixed;
  by OCCSOC;
run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Art, CEO, AnalystFinder.com&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 21 Apr 2018 17:18:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/merging-2-datasets-by-variable-multiple-lengths/m-p/456206#M284352</guid>
      <dc:creator>art297</dc:creator>
      <dc:date>2018-04-21T17:18:59Z</dc:date>
    </item>
    <item>
      <title>Re: merging 2 datasets by variable - multiple lengths</title>
      <link>https://communities.sas.com/t5/SAS-Programming/merging-2-datasets-by-variable-multiple-lengths/m-p/456207#M284353</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/41529"&gt;@Diana_AdventuresinSAS&lt;/a&gt; wrote:&lt;BR /&gt;
&lt;P&gt;Thanks Tom!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The OCCSOC variable looks good, seems to have worked. However, I still have the case of the missing values for all the ACS2016 variables.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Any thoughts on why that is?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;Obs    occsoc    MET2013    PERNUM    cognitiveindex    manuphysindex 
1      1110XX    .          .          70.3800             18.0500 
2      111021    .          .          64.8700             26.2500 &lt;/CODE&gt;&lt;/PRE&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;I would assume it is because those values of the BY variable are not present in that dataset.&lt;/P&gt;
&lt;P&gt;Formats change how the value is printed not how it is stored.&lt;/P&gt;
&lt;P&gt;Check the values for leading spaces as normally SAS will not display leading spaces.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql ;
  select case when occsoc = left(occsoc) then 'LEFT ALIGNED' 
   else 'OTHER' end as check,count(*)
  from ACS2016 
  group by 1
 ;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 21 Apr 2018 17:19:19 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/merging-2-datasets-by-variable-multiple-lengths/m-p/456207#M284353</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2018-04-21T17:19:19Z</dc:date>
    </item>
    <item>
      <title>Re: merging 2 datasets by variable - multiple lengths</title>
      <link>https://communities.sas.com/t5/SAS-Programming/merging-2-datasets-by-variable-multiple-lengths/m-p/456266#M284354</link>
      <description>&lt;P&gt;A merge is done with raw, unformatted values. Create a new variable with correct, clean values (in the dateset(s) where that conversion is needed) and merge by that.&lt;/P&gt;
&lt;P&gt;Ideally, you should do such cleaning when data is imported into your data warehouse.&lt;/P&gt;</description>
      <pubDate>Sun, 22 Apr 2018 06:15:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/merging-2-datasets-by-variable-multiple-lengths/m-p/456266#M284354</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2018-04-22T06:15:48Z</dc:date>
    </item>
  </channel>
</rss>

