## Output second record

Solved
Occasional Contributor
Posts: 14

# Output second record

Hello~

I need help finding specific records in a dataset:

I have one dataset called CRASH with contains auto crash records.  The dataset has crashid, driverid, and atfaultYN.  The crashid is unique for two records: the at fault driver and the not at fault driver.  Each record has the driverid as well as the atfaultYN variable which = 0 for atfault and 1 for not at fault.  Of course, one driver can have many crashes so driverid is repeated throughout the dataset but the crashID is unique to the two records of each crash.  Because the analysis was previously on drivers with high frequency of not at fault crashes to determine fraud, we have another dataset REAREND which has crash records for the drivers who have high frequency of not at fault crashes.

So for example, I have:

CRASH

Crashid          driverID               AtfaultYN

110                2                              0

110                91                            1

118                6                              1

118                210                          0

122                91                            1

122                13                            0

256                87                            0

256                123                          1

REAREND

Crashid          driverID               AtfaultYN

110                    91                         1

118                    6                           1

122                    91                         1

Now our analysis is on the drivers at fault.  I need to extract the crash record of the at fault drivers (atfaultYN=0) from CRASH for each crash for the not at fault drivers in REAREND.

So I need:

VICTIM

Crashid          driverID               atfaultYN

110                    2                         0

118                    210                     0

122                    13                       0

There are 3 million crash records and only 796k crash records that I need, so I don't want the entire dataset, just the at fault crash records for the driverid in REAREND.

-C

Accepted Solutions
Solution
‎04-07-2014 12:24 PM
Super User
Posts: 23,295

## Re: Output second record

Use a SQL query to select records with the same crashID AND the atfaultYN is 0.

proc sql;

create table want as

select * from crash

where atfaultyn=0 and crashid in (Select crashId from rearend);

quit;

All Replies
Solution
‎04-07-2014 12:24 PM
Super User
Posts: 23,295

## Re: Output second record

Use a SQL query to select records with the same crashID AND the atfaultYN is 0.

proc sql;

create table want as

select * from crash

where atfaultyn=0 and crashid in (Select crashId from rearend);

quit;

Posts: 4,674

## Re: Output second record

If these are SAS tables then you could also use a SAS hash table for this.

data CRASH;
input  Crashid driverID AtfaultYN;
datalines;
110 2 0
110 91 1
118 6 1
118 210 0
122 91 1
122 13 0
256 87 0
256 123 1
;
run;

data REAREND;
input Crashid driverID AtfaultYN;
datalines;
110 91 1
118 6 1
122 91 1
;
run;

data want(drop=_;
set crash(where=(atfaultYN=0));
if _n_=1 then do;
dcl hash rearend (dataset:'rearend(where=(AtfaultYN=1))');
_rc=rearend.defineKey('Crashid');
_rc=rearend.defineDone();
end;
if rearend.check()=0 then output;
run;

Super User
Posts: 9,890

## Re: Output second record

data want;

merge

CRASH (

in=a

where=(atfaultYN=0)

)

REAREND (

in=b

keep=crashid

)

;

by crashid;

if a and b;

run;

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Occasional Contributor
Posts: 14

## Re: Output second record

Thank you all very much!

🔒 This topic is solved and locked.