FINDING THE MISSING observation

04-25-2008 06:58 PM

I can't find the missing school: LOS_ANGELES has 2140 observations and KALA has 2141 observations. I tried the following code to determine which observation is missing, but it didn't find it.

DATA MATCH LOS_ANGELES KALA;
  MERGE LOS_ANGELES (IN=L)
        KALA (IN=K);
  BY SCHNAME;
  IF L AND NOT K THEN OUTPUT LOS_ANGELES;
  IF NOT L AND K THEN OUTPUT KALA;
  IF L AND K THEN OUTPUT MATCH;
RUN;

04-28-2008 03:45 AM

Teresa,

You have made one mistake, and there is a second remark to think about.

Andre

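1) Your DATA statement names LOS_ANGELES and KALA as output data sets, so the step overwrites your source data sets with its own output!

See how to avoid this by giving the output data sets different names: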

[PRE]
/* Output names (LOS, K) differ from the input names, */
/* so the source data sets are not overwritten.       */
DATA MATCH LOS K;
  MERGE LOS_ANGELES (IN=L)
        KALA (IN=K);
  BY SCHNAME;
  IF L AND NOT K THEN OUTPUT LOS;
  IF NOT L AND K THEN OUTPUT K;
  IF L AND K THEN OUTPUT MATCH;
RUN;

INFO: The variable one on data set WORK.LOS_ANGELES will be overwritten by data set WORK.KALA.
INFO: The variable two on data set WORK.LOS_ANGELES will be overwritten by data set WORK.KALA.
NOTE: There were 2 observations read from the data set WORK.LOS_ANGELES.
NOTE: There were 3 observations read from the data set WORK.KALA.
NOTE: The data set WORK.MATCH has 2 observations and 4 variables.
NOTE: The data set WORK.LOS has 0 observations and 4 variables.
NOTE: The data set WORK.K has 1 observations and 4 variables.
NOTE: DATA statement used (Total process time):
      real time 0.06 seconds
      cpu time 0.01 seconds
[/PRE]
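
A variation on the step above, as a minimal sketch only: if just the non-matching schools are of interest, they can be collected in a single data set with a flag saying which side they came from. The names UNMATCHED and SOURCE are made up for illustration; LOS_ANGELES, KALA and SCHNAME are taken from the original post. The sketch assumes both inputs are already sorted by SCHNAME and that neither contains a variable called SOURCE.

```sas
/* Sketch: keep only the schools found in one data set but not the other. */
/* UNMATCHED and SOURCE are illustrative names, not from the original post. */
data unmatched;
  length source $12;
  merge los_angeles (in=l)
        kala        (in=k);
  by schname;
  if l and not k then source = 'LOS_ANGELES';   /* only in LOS_ANGELES */
  else if k and not l then source = 'KALA';     /* only in KALA */
  else delete;                                  /* in both: not of interest here */
  keep schname source;
run;

proc print data=unmatched;
run;
```

Because the output is a new data set, neither source is replaced, and the printed list shows each unmatched school once per observation together with its origin.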

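2) Second point:

How can you be sure that only one record differs? The observation counts alone do not tell you that.

Counts that differ by one can also come from a starting point like this, where no school matches at all: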

[PRE]
Data los_angeles;
  input schname $1. one two;
  datalines;
A 12 36
B 15 25
;
data Kala;
  input schname $1. one two three;
  datalines;
D . 36 45
E 75 27 .
C 1 2 3
;
run;

/* KALA is not in SCHNAME order, so sort it before the merge */
Proc sort data=Kala; by schname; run;

/* Log from rerunning the merge step shown above on these data: */
INFO: The variable one on data set WORK.LOS_ANGELES will be overwritten by data set WORK.KALA.
INFO: The variable two on data set WORK.LOS_ANGELES will be overwritten by data set WORK.KALA.
NOTE: There were 2 observations read from the data set WORK.LOS_ANGELES.
NOTE: There were 3 observations read from the data set WORK.KALA.
NOTE: The data set WORK.MATCH has 0 observations and 4 variables.
NOTE: The data set WORK.LOS has 2 observations and 4 variables.
NOTE: The data set WORK.K has 3 observations and 4 variables.
NOTE: DATA statement used (Total process time):
      real time 0.01 seconds
      cpu time 0.01 seconds
[/PRE]
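
One more case the counts cannot rule out, sketched under the assumption that SCHNAME is meant to be unique within each data set: if a school name occurs twice in KALA, every BY group still counts as a match and the extra observation never lands in the unmatched output. A quick check for repeated names might look like this; KALA_DUPS and N_OBS are hypothetical names chosen for the example.

```sas
/* Sketch: list school names that occur more than once in KALA. */
/* KALA_DUPS and N_OBS are illustrative names, not from the original post. */
proc sql;
  create table kala_dups as
  select schname, count(*) as n_obs
  from kala
  group by schname
  having count(*) > 1;
quit;

proc print data=kala_dups;
run;
```

If KALA_DUPS comes back empty, repeated names are not the cause and the unmatched output of the merge is the place to look. The same check can be run against LOS_ANGELES by changing the data set name.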

04-29-2008 01:31 PM

Thanks Andre! I appreciate your help.