Solved: How to merge a dataset with duplicate identifiers to a panel data?

xyxu · Posted 01-20-2018 11:55 AM

I have two datasets, call them dataset1 and dataset2. The first is a list of identifiers with character variable name "symbol", and there are duplicates in it. I use another variable "id" to order these symbols. The second is a panel data set with symbol and "year". For example,

data dataset1;
input id symbol $;
format id 1. symbol $2.;
datalines;
1 A
2 B
3 B
4 C
5 D
6 D
7 D
8 E
;
run;

and

data dataset2;
length symbol $2 date 8;
informat symbol $2. date yymmdd.;
format symbol $2. date yymmddn8.;
input symbol $ date;
datalines;
A 20120101
A 20120102
A 20120103
B 20120101
B 20120102
C 20120102
C 20120103
C 20120104
D 20120101
D 20120102
D 20120103
D 20120104
E 20120102
E 20120103
;
run;

My purpose is to merge the two datasets by "symbol", and each "id" in dataset1 should be merged with all corresponding symbol-date observations in dataset2. The merged data set should contain 24 observations. However, if I use

data want;
	merge dataset1 (in = a) dataset2 (in = b);
	by symbol;
	if a and b;
run;

I got only 14 observations. The reason is that, duplicate symbols in dataset1 are not treated as different IDs. How can I achieve the result I want?

Astounding · Posted 01-20-2018 12:15 PM

SQL does this fairly easily:

proc sql;

create table want as

select dataset1.*, dataset2.date

from dataset1, dataset2

where dataset1.symbol = dataset2.symbol;

quit;

View solution in original post

Astounding · Posted 01-20-2018 12:15 PM

SQL does this fairly easily:

proc sql;

create table want as

select dataset1.*, dataset2.date

from dataset1, dataset2

where dataset1.symbol = dataset2.symbol;

quit;

How to merge a dataset with duplicate identifiers to a panel data?

Re: How to merge a dataset with duplicate identifiers to a panel data?

Re: How to merge a dataset with duplicate identifiers to a panel data?

Catch up on SAS Innovate 2026

How to merge a dataset with duplicate identifiers to a panel data?

Re: How to merge a dataset with duplicate identifiers to a panel data?

Re: How to merge a dataset with duplicate identifiers to a panel data?

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away