<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Create unique identifier to link rows with duplicates across different variables in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Create-unique-identifier-to-link-rows-with-duplicates-across/m-p/385444#M92195</link>
    <description>&lt;P&gt;Some of the rules before this gets too krazy ...&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;There is no such thing as "second." in SAS.&amp;nbsp; You have "first." and "last." and that's it.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;To be allowed to use "first." and "last." there must be a BY statement earlier in the DATA step, using the same variable name.&amp;nbsp; So if your BY statement reads "by din;" then you have first.din and last.din to work with, but there is no last.var to work with.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;To be allowed to use a BY statement, your data must be sorted by the variable(s) in the BY statement.&amp;nbsp; There are exceptions to this if you specify the word NOTSORTED, but save that for another day.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you want help fixing your application, that might be possible.&amp;nbsp; I would advise skipping macro language entirely, however, until you get a working version of your program.&amp;nbsp; Then&amp;nbsp;you can begin to think about how to add macro language to generate the same working program.&lt;/P&gt;</description>
    <pubDate>Thu, 03 Aug 2017 18:59:20 GMT</pubDate>
    <dc:creator>Astounding</dc:creator>
    <dc:date>2017-08-03T18:59:20Z</dc:date>
    <item>
      <title>Create unique identifier to link rows with duplicates across different variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Create-unique-identifier-to-link-rows-with-duplicates-across/m-p/385420#M92188</link>
      <description>&lt;P&gt;Hi all-&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I could really use some help with Base SAS programming.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have a dataset with a unique identifier called clt_id. There are also 50 variables var1-var50 that have&amp;nbsp;unique identifiers for property ownership.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;So I have 51 collums and&amp;nbsp;I would like to create an identifier that links the clt_id's together when there is a match between any of the 50 property identifiers.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;So the collums look like this&lt;/P&gt;&lt;P&gt;&lt;U&gt;clt_id&amp;nbsp;&lt;/U&gt;&amp;nbsp;&lt;U&gt; var1&lt;/U&gt; ... &lt;U&gt;var50&lt;/U&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;and I would like to have:&lt;/P&gt;&lt;P&gt;&lt;U&gt;clt_id&lt;/U&gt; &lt;U&gt;var1&lt;/U&gt; ...&amp;nbsp;&lt;U&gt;var50&lt;/U&gt; &lt;U&gt;varm&lt;/U&gt;&lt;/P&gt;&lt;P&gt;where varm would be the same for two clt_ids&amp;nbsp;if any var from&amp;nbsp;one clt_id&amp;nbsp;matched with any other var under another clt_id. There&amp;nbsp;are a maximum of 2 occurances for each var so varm should only contain pairs or unique values.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I would like the&amp;nbsp;identifier to be a concatanation of the two relevant clt_ids.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;ie.&lt;/P&gt;&lt;P&gt;if first.clt_id &amp;gt;&amp;nbsp;second.clt_id then&amp;nbsp;ident = cats(of&amp;nbsp;second.clt_id first.clt_id);&lt;BR /&gt;if&amp;nbsp;first.clt_id &amp;lt;&amp;nbsp;second.clt_id then&amp;nbsp;ident = cats(of first.clt_id second.clt_id);&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I tried stacking the collums into one called var, sort ascending var clt_id&amp;nbsp;, then using the property ownership identifier on second.var as the identifier.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Then i sorted descending and did the same to apply the ownership identifier to the other match. This should work since I am only interested in matching in twos.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Heres what I have so far:&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;%macro stack();
	data stacked;
	set
	%do i=1 %to 50;
		have (keep=var&amp;amp;i clt_id rename=(var&amp;amp;i=var))
	%end;
;
	run;
%mend stack;
%stack();

proc sort data=stacked;
	by ascending var clt_id;
run;

data stacked;
	do _n_=1 by 1 until(last.var);
	set stacked;
	by var;
	if second.var then varm=var;
	end;
run;

data sort data=stacked;
	by descending var clt_id;
run;

data stacked;
	do _n_=1 by 1 until(last.var);
	set stacked;
	by var;
	if second.var then dinm=var;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;I received the following error on my last data step:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;ERROR 650-185: DATA STEP Component Object failure.&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Aborted during the COMPILATION phase.&lt;BR /&gt;ERROR 557-185: Variable second is not an object.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I thought I could refer to first and second when using by after the set. Anyone know why I am getting this error?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks so much,&lt;/P&gt;&lt;P&gt;Krazee&lt;/P&gt;</description>
      <pubDate>Thu, 03 Aug 2017 19:40:43 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Create-unique-identifier-to-link-rows-with-duplicates-across/m-p/385420#M92188</guid>
      <dc:creator>krazee_koder</dc:creator>
      <dc:date>2017-08-03T19:40:43Z</dc:date>
    </item>
    <item>
      <title>Re: Create unique identifier to link rows with duplicates across different variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Create-unique-identifier-to-link-rows-with-duplicates-across/m-p/385434#M92190</link>
      <description>&lt;P&gt;Hi, What is second.var? That seems interesting. Has SAS introduced an automatic variable second.var recently?&lt;/P&gt;&lt;P&gt;Do you mind charting out a sample table of your HAVE dataset and a WANT dataset with a few records to help me visualise better please. Thank you!&lt;/P&gt;</description>
      <pubDate>Thu, 03 Aug 2017 18:43:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Create-unique-identifier-to-link-rows-with-duplicates-across/m-p/385434#M92190</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2017-08-03T18:43:00Z</dc:date>
    </item>
    <item>
      <title>Re: Create unique identifier to link rows with duplicates across different variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Create-unique-identifier-to-link-rows-with-duplicates-across/m-p/385444#M92195</link>
      <description>&lt;P&gt;Some of the rules before this gets too krazy ...&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;There is no such thing as "second." in SAS.&amp;nbsp; You have "first." and "last." and that's it.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;To be allowed to use "first." and "last." there must be a BY statement earlier in the DATA step, using the same variable name.&amp;nbsp; So if your BY statement reads "by din;" then you have first.din and last.din to work with, but there is no last.var to work with.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;To be allowed to use a BY statement, your data must be sorted by the variable(s) in the BY statement.&amp;nbsp; There are exceptions to this if you specify the word NOTSORTED, but save that for another day.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you want help fixing your application, that might be possible.&amp;nbsp; I would advise skipping macro language entirely, however, until you get a working version of your program.&amp;nbsp; Then&amp;nbsp;you can begin to think about how to add macro language to generate the same working program.&lt;/P&gt;</description>
      <pubDate>Thu, 03 Aug 2017 18:59:20 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Create-unique-identifier-to-link-rows-with-duplicates-across/m-p/385444#M92195</guid>
      <dc:creator>Astounding</dc:creator>
      <dc:date>2017-08-03T18:59:20Z</dc:date>
    </item>
    <item>
      <title>Re: Create unique identifier to link rows with duplicates across different variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Create-unique-identifier-to-link-rows-with-duplicates-across/m-p/385445#M92196</link>
      <description>&lt;P&gt;It is pitty you havn't post the full log. It is difficult to know which line made the ERROR without the full log.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You cannot use SECOND.VAR - what did you mean by it?&lt;/P&gt;
&lt;P&gt;You can use first.var meaning first row of a value when sorted by var.&lt;/P&gt;
&lt;P&gt;You can get the previous row value of the variable VAR using LAG function. (like: prev = lag(var); )&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I cannot guess what you meant by&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data stacked;
	do _n_=1 by 1 until(last.var);
	set stacked;
	by var;
	if second.var then dinm=var;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;What is DINM ?&lt;/P&gt;
&lt;P&gt;Where do you check that vars match?&lt;/P&gt;
&lt;P&gt;If you have both CLT_IDs in memory and they are &lt;STRONG&gt;numeric&lt;/STRONG&gt; you can conctenate them by:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;varm = min(clt_id1, clt_id2) || max(clt_id1, clt_id2);&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;and no need to sort twice (ascending and descending).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 03 Aug 2017 19:01:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Create-unique-identifier-to-link-rows-with-duplicates-across/m-p/385445#M92196</guid>
      <dc:creator>Shmuel</dc:creator>
      <dc:date>2017-08-03T19:01:13Z</dc:date>
    </item>
    <item>
      <title>Re: Create unique identifier to link rows with duplicates across different variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Create-unique-identifier-to-link-rows-with-duplicates-across/m-p/385741#M92320</link>
      <description>&lt;P&gt;Here's an idea.&lt;/P&gt;
&lt;P&gt;Provide some example data. You need not provide all 50 variables but 3 or 4 should be sufficient to show the procees.&lt;/P&gt;
&lt;P&gt;Then show what the final result should look like for that data.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I have a feeling that you are making this way more complicated than it should be.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;NOTE that First. and Last. refer to the ROW order in the data set. Also the value of First. is either 1 or 0 for true (the record is the first fo the specified group) or false (the record is not the first of the group).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 04 Aug 2017 18:07:19 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Create-unique-identifier-to-link-rows-with-duplicates-across/m-p/385741#M92320</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2017-08-04T18:07:19Z</dc:date>
    </item>
  </channel>
</rss>

