BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
NonSleeper
Quartz | Level 8

Maybe the title is too vague, but here's the actual circumstance. Before merging, I have file number 1 with the size of N1 and file number 2 N2.

For the most part, N2 is a subset of N1. Some observations in N2 may not be in N1 as seen later.

After merging, the size becomes N, N > N1.

When I run a crosstab of index variables, it indicates that all observations in N come from file number 1. How can that be possible when N > N1?

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

If dataset 2 is a subset of dataset 1 in terms of the by variable, but has multiple observations for one or more ID's present in dataset 1, then those ID's will be multiplied.

Example:

dataset A

ID var1

1 x

2 y

3 z

dataset B

ID var2

1 a

1 b

2 c

result:

ID var1 var2

1 x a

1 x b

2 y c

3 z

View solution in original post

6 REPLIES 6
NonSleeper
Quartz | Level 8

Like this:

data merge;

merge file1 (in=x) file2 (in=y);

by ID;

index1=x;

index2=y;

run;

Then:

proc freq data=merge;

table index1*index2;

run;

NonSleeper
Quartz | Level 8

But we can always filter out the observations after merging.

Yet I think the problem here is that after merging, all observations were indicated to be from file 1; that is, in the crosstab, there were no cells where index1=0. That's OK if all observations in file 2 were subset of file 1, but then we expect that the sample size did not increase after merging, which however did increase (N > N1).

Kurt_Bremser
Super User

If dataset 2 is a subset of dataset 1 in terms of the by variable, but has multiple observations for one or more ID's present in dataset 1, then those ID's will be multiplied.

Example:

dataset A

ID var1

1 x

2 y

3 z

dataset B

ID var2

1 a

1 b

2 c

result:

ID var1 var2

1 x a

1 x b

2 y c

3 z

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 6 replies
  • 2312 views
  • 0 likes
  • 2 in conversation