DATA Step, Macro, Functions and more

Merging long and wide datasets

Reply
Contributor
Posts: 28

Merging long and wide datasets

Hi,

I'm trying to merge three long (meaning the each ID has multiple rows of data) datasets but I know that about 150 of the IDs scattered throughout these datasets are people I want to exclude from my analysis. I have another dataset that contains only the IDs I want. How can I create a final long dataset that has all three of the long datasets, but that includes only the IDs that I want? I have tried to use in=a in the datastep but it didn't work. I tried something like this:

data merged;

set long1 long2 long3 idfile (in=a) ;

if a;

by id;

run;

Thanks!

Laurie

PROC Star
Posts: 7,487

Re: Merging long and wide datasets

One way might be something like the following untested code:

data _long1;

  merge long1(in=a) idfile (in=b) ;

  if a and b;

  by id;

run;

data _long2;

  merge long2(in=a) idfile (in=b) ;

  if a and b;

  by id;

run;

data _long3;

  merge long3(in=a) idfile (in=b) ;

  if a and b;

  by id;

run;

data merged;

  set _long1 _long2 _long3;

  by id;

run;

Contributor
Posts: 28

Re: Merging long and wide datasets

Thanks so much for your thoughts. I tried it, and it didn't work, but it's because I wasn't very clear in my question. For the 3 long datasets, I don't want to merge them by the ID, I want them to go on top of one another (concatenate?) so each ID still has many rows of data. So it works to combine them by using the set command:

data mergedlong;

set long1 long2 long3;

run;

Then I used (essentially) your code to do the merge of combined long datasets plus the ID file which kicked out the IDs I didn't want:

proc sort data=mergedlong; by id;

data merged;

merge mergedlong (in=a) idfile (in=b);

if a and b;

by id; run;

Worked great! Thanks!!

PROC Star
Posts: 7,487

Re: Merging long and wide datasets

Glad to hear you solved your problem.  However, for others facing a similar situation, both Tom's and my suggested code were designed to concatenate, not merge, the 3 files.  I really liked Tom's suggested method.

My guess, as to why it didn't initially work, was that the files weren't all first sorted by id.

However, the method you selected definitely accomplishes the task, too, and only required the one sort (I presume that the idfile was already sorted).

Super User
Super User
Posts: 7,071

Re: Merging long and wide datasets

Put the IDFILE first in the list and remember the setting of the IN= variable from the first observation for the current ID value.

data merged;

  set idfile(in=a) long1 long2 long3 ;

  by id;

  retain in_idfile;

  if first.id then in_idfile=a;

  if in_idfile;

run;


If you do not want to include extra records caused by the IDFILE then you might want to change your subsetting IF condition.

data merged;

  set idfile(in=a) long1 long2 long3 ;

  by id;

  retain in_idfile;

  if first.id then in_idfile=a;

  if in_idfile and NOT a ;

run;

Ask a Question
Discussion stats
  • 4 replies
  • 644 views
  • 3 likes
  • 3 in conversation