<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: SAS NOVICE NEED TO DEDUPLICATE! in SAS Procedures</title>
    <link>https://communities.sas.com/t5/SAS-Procedures/SAS-NOVICE-NEED-TO-DEDUPLICATE/m-p/20744#M4314</link>
    <description>Also consider the merit of using SAS PROC TRANSPOSE to "go horizontal" with your file letting SAS generate your COMMENTn variables, to be followed by a simple SAS DATA step used to consolidate and count, as shown below:&lt;BR /&gt;
&lt;BR /&gt;
%LET MAXLEN = 1000;&lt;BR /&gt;
DATA consolidated_file;&lt;BR /&gt;
KEEP COMMENT: ALL_COMMENTS COUNT_COMMENT;&lt;BR /&gt;
* Generate some SAS variables to mimic TRANSPOSE output. ;&lt;BR /&gt;
COMMENT1 = 'This is comment #1';&lt;BR /&gt;
COMMENT2 = 'This is comment #2';&lt;BR /&gt;
COMMENT3 = ' ';&lt;BR /&gt;
COMMENT4 = ' ';&lt;BR /&gt;
* SET transposed_file;&lt;BR /&gt;
ARRAY ACOMMENTS (*) $ COMMENT: ;&lt;BR /&gt;
* Combine individual comment variables into one. ;&lt;BR /&gt;
LENGTH ALL_COMMENTS $ &amp;amp;maxlen;&lt;BR /&gt;
DO I=1 TO DIM(ACOMMENTS);&lt;BR /&gt;
  IF ACOMMENTS(I) NE ' ' THEN DO;&lt;BR /&gt;
    * GIVE UP IF WE WILL TRUNCATE THE COMBINED COMMENT VAR. ;&lt;BR /&gt;
    LEN = LENGTH(ALL_COMMENTS!!ACOMMENTS(I));&lt;BR /&gt;
    IF LENGTH( catt( ALL_COMMENTS,ACOMMENTS(I) ) ) GT &amp;amp;MAXLEN THEN ABORT ;&lt;BR /&gt;
    ALL_COMMENTS = CATX(' ',ALL_COMMENTS,ACOMMENTS(I) );&lt;BR /&gt;
  END;&lt;BR /&gt;
  ELSE LEAVE;&lt;BR /&gt;
END;&lt;BR /&gt;
COUNT_COMMENT = I-1;&lt;BR /&gt;
RUN;&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
Scott Barry&lt;BR /&gt;
SBBWorks, Inc.</description>
    <pubDate>Fri, 17 Apr 2009 17:20:04 GMT</pubDate>
    <dc:creator>sbb</dc:creator>
    <dc:date>2009-04-17T17:20:04Z</dc:date>
    <item>
      <title>SAS NOVICE NEED TO DEDUPLICATE!</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/SAS-NOVICE-NEED-TO-DEDUPLICATE/m-p/20741#M4311</link>
      <description>Hi there,&lt;BR /&gt;
&lt;BR /&gt;
I am a 110% novice in SAS and need to merge two big datasets. My problem is that I have multiple observations in multiple tables that use the same variable. It is unfortunately the only variable that all the tables share. When I merge them initially I get of course, duplicate observations, many with much of the same information but 1 or 2 variables have different information.&lt;BR /&gt;
Would I create a dataset using a new variable to deal with this? In the end, I'd like to just have 1 'ID' per person, not multiple records per person.&lt;BR /&gt;
&lt;BR /&gt;
To be more precise, going off of another's post, my data looks somewhat like this:&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
StudyID         &lt;BR /&gt;
1356           Comment1&lt;BR /&gt;
1356           Comment2&lt;BR /&gt;
1356           Comment3&lt;BR /&gt;
1444           Comment1&lt;BR /&gt;
1444           Comment2&lt;BR /&gt;
1667           Comment1&lt;BR /&gt;
1551           Comment1&lt;BR /&gt;
1551           Comment2&lt;BR /&gt;
1551           Comment3&lt;BR /&gt;
1551           Comment4&lt;BR /&gt;
&lt;BR /&gt;
There can be an infinite number of comments per ID (however, there usually aren't more than 3 or 4)  I want to get all 3 comments for ID 1356 (for example) into multiple variables like this:&lt;BR /&gt;
&lt;BR /&gt;
ID      COMM_Multiple  Comm_Count  Comm1   Comm 2 &lt;BR /&gt;
1356         Y              3        " "      " "&lt;BR /&gt;
&lt;BR /&gt;
I can't get it to even look right on here. But I want the Y to match with Comm_multiple and the 3 to be under Comm_count&lt;BR /&gt;
&lt;BR /&gt;
Etc etc.... does this make any sense? I don't want them concatenated into one variable, but I want each comment to&lt;BR /&gt;
remain its own variable. I just want ID to be a single, unique identifier so that there is only 1 per person.&lt;BR /&gt;
&lt;BR /&gt;
I don't know what to do!&lt;BR /&gt;
Help &lt;span class="lia-unicode-emoji" title=":disappointed_face:"&gt;😞&lt;/span&gt;&lt;BR /&gt;
&lt;BR /&gt;
Thanks :)

Message was edited by: Nov_girl</description>
      <pubDate>Fri, 17 Apr 2009 14:02:56 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/SAS-NOVICE-NEED-TO-DEDUPLICATE/m-p/20741#M4311</guid>
      <dc:creator>deleted_user</dc:creator>
      <dc:date>2009-04-17T14:02:56Z</dc:date>
    </item>
    <item>
      <title>Re: SAS NOVICE NEED TO DEDUPLICATE!</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/SAS-NOVICE-NEED-TO-DEDUPLICATE/m-p/20742#M4312</link>
      <description>Firstly, if you are going to be a SAS programmer you need to know that 110% indicates that there is probably some sort of error with your data.&lt;BR /&gt;
&lt;BR /&gt;
Aside from that, you need to do this.....&lt;BR /&gt;
&lt;BR /&gt;
Although your data looks sorted already, sort it.&lt;BR /&gt;
&lt;BR /&gt;
proc sort data=study;&lt;BR /&gt;
by id;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
Then flatten the data down to one line per id........&lt;BR /&gt;
&lt;BR /&gt;
data flat;&lt;BR /&gt;
*give the concatenated comment anough characters to take all comments;&lt;BR /&gt;
*and give the multiple comments flag a length of one;&lt;BR /&gt;
length comments $500 comm_multiple $1;&lt;BR /&gt;
*retain the value of the comments variable from record to record;&lt;BR /&gt;
retain comments;&lt;BR /&gt;
set study;&lt;BR /&gt;
by id;&lt;BR /&gt;
*if its the first occurence of id then reset the count and the concatenated field;&lt;BR /&gt;
if first.id then do;&lt;BR /&gt;
 comments='';&lt;BR /&gt;
 comm_count=0;&lt;BR /&gt;
end;&lt;BR /&gt;
*add one to the count for each record;&lt;BR /&gt;
comm_count+1;&lt;BR /&gt;
*and concatenate the comment onto the overall comments;&lt;BR /&gt;
comments=left(trim(comments))!!' '!!comment;&lt;BR /&gt;
*when we process the last record for the id;&lt;BR /&gt;
if last.id then do;&lt;BR /&gt;
*of we have more than one comment set the flag to Y;&lt;BR /&gt;
if comm_count&amp;gt;1 then comm_multiple='Y';&lt;BR /&gt;
else comm_multiple='N';&lt;BR /&gt;
*output the record, one per id;&lt;BR /&gt;
output;&lt;BR /&gt;
end;&lt;BR /&gt;
run;</description>
      <pubDate>Fri, 17 Apr 2009 15:04:34 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/SAS-NOVICE-NEED-TO-DEDUPLICATE/m-p/20742#M4312</guid>
      <dc:creator>deleted_user</dc:creator>
      <dc:date>2009-04-17T15:04:34Z</dc:date>
    </item>
    <item>
      <title>Re: SAS NOVICE NEED TO DEDUPLICATE!</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/SAS-NOVICE-NEED-TO-DEDUPLICATE/m-p/20743#M4313</link>
      <description>Thanks for the response. Did I mention that I'm a novice &lt;span class="lia-unicode-emoji" title=":winking_face:"&gt;😉&lt;/span&gt;&lt;BR /&gt;
&lt;BR /&gt;
So...I haven't actually created the variables for multiple comments (yes/no) or multiple comments/count.. Can that be done within this step?</description>
      <pubDate>Fri, 17 Apr 2009 15:53:25 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/SAS-NOVICE-NEED-TO-DEDUPLICATE/m-p/20743#M4313</guid>
      <dc:creator>deleted_user</dc:creator>
      <dc:date>2009-04-17T15:53:25Z</dc:date>
    </item>
    <item>
      <title>Re: SAS NOVICE NEED TO DEDUPLICATE!</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/SAS-NOVICE-NEED-TO-DEDUPLICATE/m-p/20744#M4314</link>
      <description>Also consider the merit of using SAS PROC TRANSPOSE to "go horizontal" with your file letting SAS generate your COMMENTn variables, to be followed by a simple SAS DATA step used to consolidate and count, as shown below:&lt;BR /&gt;
&lt;BR /&gt;
%LET MAXLEN = 1000;&lt;BR /&gt;
DATA consolidated_file;&lt;BR /&gt;
KEEP COMMENT: ALL_COMMENTS COUNT_COMMENT;&lt;BR /&gt;
* Generate some SAS variables to mimic TRANSPOSE output. ;&lt;BR /&gt;
COMMENT1 = 'This is comment #1';&lt;BR /&gt;
COMMENT2 = 'This is comment #2';&lt;BR /&gt;
COMMENT3 = ' ';&lt;BR /&gt;
COMMENT4 = ' ';&lt;BR /&gt;
* SET transposed_file;&lt;BR /&gt;
ARRAY ACOMMENTS (*) $ COMMENT: ;&lt;BR /&gt;
* Combine individual comment variables into one. ;&lt;BR /&gt;
LENGTH ALL_COMMENTS $ &amp;amp;maxlen;&lt;BR /&gt;
DO I=1 TO DIM(ACOMMENTS);&lt;BR /&gt;
  IF ACOMMENTS(I) NE ' ' THEN DO;&lt;BR /&gt;
    * GIVE UP IF WE WILL TRUNCATE THE COMBINED COMMENT VAR. ;&lt;BR /&gt;
    LEN = LENGTH(ALL_COMMENTS!!ACOMMENTS(I));&lt;BR /&gt;
    IF LENGTH( catt( ALL_COMMENTS,ACOMMENTS(I) ) ) GT &amp;amp;MAXLEN THEN ABORT ;&lt;BR /&gt;
    ALL_COMMENTS = CATX(' ',ALL_COMMENTS,ACOMMENTS(I) );&lt;BR /&gt;
  END;&lt;BR /&gt;
  ELSE LEAVE;&lt;BR /&gt;
END;&lt;BR /&gt;
COUNT_COMMENT = I-1;&lt;BR /&gt;
RUN;&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
Scott Barry&lt;BR /&gt;
SBBWorks, Inc.</description>
      <pubDate>Fri, 17 Apr 2009 17:20:04 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/SAS-NOVICE-NEED-TO-DEDUPLICATE/m-p/20744#M4314</guid>
      <dc:creator>sbb</dc:creator>
      <dc:date>2009-04-17T17:20:04Z</dc:date>
    </item>
  </channel>
</rss>

