BookmarkSubscribeRSS Feed
deleted_user
Not applicable
Hi there,

I am a 110% novice in SAS and need to merge two big datasets. My problem is that I have multiple observations in multiple tables that use the same variable. It is unfortunately the only variable that all the tables share. When I merge them initially I get of course, duplicate observations, many with much of the same information but 1 or 2 variables have different information.
Would I create a dataset using a new variable to deal with this? In the end, I'd like to just have 1 'ID' per person, not multiple records per person.

To be more precise, going off of another's post, my data looks somewhat like this:


StudyID
1356 Comment1
1356 Comment2
1356 Comment3
1444 Comment1
1444 Comment2
1667 Comment1
1551 Comment1
1551 Comment2
1551 Comment3
1551 Comment4

There can be an infinite number of comments per ID (however, there usually aren't more than 3 or 4) I want to get all 3 comments for ID 1356 (for example) into multiple variables like this:

ID COMM_Multiple Comm_Count Comm1 Comm 2
1356 Y 3 " " " "

I can't get it to even look right on here. But I want the Y to match with Comm_multiple and the 3 to be under Comm_count

Etc etc.... does this make any sense? I don't want them concatenated into one variable, but I want each comment to
remain its own variable. I just want ID to be a single, unique identifier so that there is only 1 per person.

I don't know what to do!
Help 😞

Thanks :) Message was edited by: Nov_girl
3 REPLIES 3
deleted_user
Not applicable
Firstly, if you are going to be a SAS programmer you need to know that 110% indicates that there is probably some sort of error with your data.

Aside from that, you need to do this.....

Although your data looks sorted already, sort it.

proc sort data=study;
by id;
run;

Then flatten the data down to one line per id........

data flat;
*give the concatenated comment anough characters to take all comments;
*and give the multiple comments flag a length of one;
length comments $500 comm_multiple $1;
*retain the value of the comments variable from record to record;
retain comments;
set study;
by id;
*if its the first occurence of id then reset the count and the concatenated field;
if first.id then do;
comments='';
comm_count=0;
end;
*add one to the count for each record;
comm_count+1;
*and concatenate the comment onto the overall comments;
comments=left(trim(comments))!!' '!!comment;
*when we process the last record for the id;
if last.id then do;
*of we have more than one comment set the flag to Y;
if comm_count>1 then comm_multiple='Y';
else comm_multiple='N';
*output the record, one per id;
output;
end;
run;
deleted_user
Not applicable
Thanks for the response. Did I mention that I'm a novice 😉

So...I haven't actually created the variables for multiple comments (yes/no) or multiple comments/count.. Can that be done within this step?
sbb
Lapis Lazuli | Level 10 sbb
Lapis Lazuli | Level 10
Also consider the merit of using SAS PROC TRANSPOSE to "go horizontal" with your file letting SAS generate your COMMENTn variables, to be followed by a simple SAS DATA step used to consolidate and count, as shown below:

%LET MAXLEN = 1000;
DATA consolidated_file;
KEEP COMMENT: ALL_COMMENTS COUNT_COMMENT;
* Generate some SAS variables to mimic TRANSPOSE output. ;
COMMENT1 = 'This is comment #1';
COMMENT2 = 'This is comment #2';
COMMENT3 = ' ';
COMMENT4 = ' ';
* SET transposed_file;
ARRAY ACOMMENTS (*) $ COMMENT: ;
* Combine individual comment variables into one. ;
LENGTH ALL_COMMENTS $ &maxlen;
DO I=1 TO DIM(ACOMMENTS);
IF ACOMMENTS(I) NE ' ' THEN DO;
* GIVE UP IF WE WILL TRUNCATE THE COMBINED COMMENT VAR. ;
LEN = LENGTH(ALL_COMMENTS!!ACOMMENTS(I));
IF LENGTH( catt( ALL_COMMENTS,ACOMMENTS(I) ) ) GT &MAXLEN THEN ABORT ;
ALL_COMMENTS = CATX(' ',ALL_COMMENTS,ACOMMENTS(I) );
END;
ELSE LEAVE;
END;
COUNT_COMMENT = I-1;
RUN;


Scott Barry
SBBWorks, Inc.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 1288 views
  • 0 likes
  • 2 in conversation