BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
kbhagat
Fluorite | Level 6

Hello Everyone,

I have crash data in which I have information on what type of person is involved in the crash: Driver (D), Occupant (O), and Passenger (P). I also have the report number per crash and person Id per person which means if there are 5 people involved in a crash then all 5 will have the same report number but different person IDs. I also have an age variable in the data. I am trying to create sample data where drivers should be 18 years or younger and passengers should be more than one and less than 20 years.

 

Here are my codes:

/*D- Driver

P- Pedestrian

O- Occupant*/

proc freq data = young.person21;

table PERSON_TYPE;

run;

 

proc contents data= young.person21;

run;

 

/*Finding number of drivers >= 18 years and more than one Passengers under age 20 involved in a crash

257981 observations and 6 variables*/

data person21;

retain INJ_SEVER_CODE REPORT_NO VEHICLE_ID PERSON_ID age;

length INJ_SEVER_CODE $5. Report_no $10. age 8.;

format INJ_SEVER_CODE $5. Report_no $10. age 8.;

informat INJ_SEVER_CODE $5. Report_no $10. age 8.;

set young.person21;

keep INJ_SEVER_CODE REPORT_NO VEHICLE_ID PERSON_ID AGE person_type ACC_DATE LIC_NUM;

run;

 

 

data young_crash21;

length type $15.;

set person21;

if age <= 18 and person_type = 'D' then Type= "Young Driver";

if age <= 20 and person_type = 'O' then Type= "Passenger";

if Type in ('Young Driver' ,'Passenger');

run;

 

/*30241 observations and 7 variable*/

data young_crash21_1;

set person21;

if age <= 18 and person_type = 'D' then Type1= "Young Driver";

if age <= 20 and person_type = 'O' then Type2= "Passenger";

if Type1= 'Young Driver' or Type2= 'Passenger';

run;

 

proc freq data = young_crash21_1;

table Lic_num;

run;

 

/*First method*/

proc sort data = young_crash21;

by report_no;

run;

 

 

data young_crash21_count;

set young_crash21;

count+1;

by report_no;

if first.report_no then count=1;

run;

 

/*Second method*/

 

proc sort data= young_crash21 out= young_crash21_count1;

by type report_no;

run;

 

data young_crash21_count2;;

set young_crash21_count1;

count + 1;

by type report_no;

if first.type or first.report_no then count = 1;

run;

I am stuck here. I don't know how to capture only that crash where the driver is less than and equal to 18 years. That particular crash has more than one passenger who is less than 20 years old. In the final data by using my codes I do see a few cases where the driver is older than 18 years and I don't know how to get rid of it.

 

Please help

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

The variables names don't match and the coding won't match for comparisons, eg type = "D" vs person_type="D" and/or person_type = "Driver" all require the appropriate data values and variable names. So it does matter.  

 

Based on the last input data this works as expected:

 

data test;
input Report_no$ Vehicle_id Person_id age Person_type$;
datalines;
123456 101075 1541 22 D
123456 101076 1542 40 D
123456 101076 1543 16 O
123456 101076 1544 54 O
A05656 101077 1545 25 O
A07651 100918 1546 24 D
A07651 100919 1547 18 O
A07651 100919 1548 17 O
A0074 8840 1336 18 D
A0085 8853 1337 21 D
A0085 8853 1338 3 O
A075 1009 528 17 D
A075 1009 529 19 O
A075 1009 530 18 O
;

run;

data young_crash21_new;
    set test;
    if person_type='D' and age<=21 then flag1=1; else flag1=0;
    if person_type='O' and age<=20 then flag2=1; else flag2=0;
run;
proc sql;
    create table want as select * from young_crash21_new
    group by report_no 
    having max(flag1)=1 and max(flag2)=1;
quit;

View solution in original post

18 REPLIES 18
PaigeMiller
Diamond | Level 26

I am trying to create sample data where drivers should be 18 years or younger and passengers should be more than one and less than 20 years.

 

UNTESTED CODE (because no data has been provided)

 

data young_crash21_new;
    set young_crash21;
    if type='D' and age<=21 then flag1=1; else flag1=0;
    if type='O' and age<=20 then flag2=1; else flag2=0;
run;
proc sql;
    create table want as select * from young_crash21_new
    group by report_no 
    having max(flag1)=1 and max(flag2)=1;
quit;
--
Paige Miller
kbhagat
Fluorite | Level 6

Thank you for the response. Let me try these codes thank you

 

kbhagat
Fluorite | Level 6
I am getting error:
201 data young_crash21_new;
202 set young_crash21;
203 if type='D' and age<=21 then flag1=1; else flag1=0;
204 if type='O' and age<=20 then flag2=1; else flag2=0;
205 run;

NOTE: There were 30241 observations read from the data set WORK.YOUNG_CRASH21.
NOTE: The data set WORK.YOUNG_CRASH21_NEW has 30241 observations and 11 variables.
NOTE: DATA statement used (Total process time):
real time 0.05 seconds
cpu time 0.03 seconds


206 proc sql;
207 create table want as select * from young_crash21_new
208 group by report_no
209 having max(flag1)=1 and max(flag2)=1;
NOTE: The query requires remerging summary statistics back with the original data.
NOTE: Table WORK.WANT created, with 0 rows and 11 columns.

210 quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 0.07 seconds
cpu time 0.04 seconds

PaigeMiller
Diamond | Level 26

I don't see an error. 🤷‍♂️

--
Paige Miller
kbhagat
Fluorite | Level 6

It is not an error but a note and it is giving 0 observation in the final data set (want)

NOTE: The query requires remerging summary statistics back with the original data.
NOTE: Table WORK.WANT created, with 0 rows and 11 columns.

PaigeMiller
Diamond | Level 26

Since you haven't shared any data, maybe that's the correct answer.

 

If you can share a portion of your actual data, please do so. If you can't share it, please make up some data.


Data MUST be presented as working SAS data step code, which you can type in yourself or by following these instructions. Do not provide data other ways, I will not be able to use it.

--
Paige Miller
kbhagat
Fluorite | Level 6

Will this data work?

 

PaigeMiller
Diamond | Level 26

No, it does not work. Repeating my earlier request:

 

Data MUST be presented as working SAS data step code, which you can type in yourself or by following these instructions. Do not provide data other ways, I will not be able to use it.

This code should be included in your reply by copying as text and then pasting it into the window that appears when you click on the "little running man" icon.

--
Paige Miller
kbhagat
Fluorite | Level 6

data test;
input Report_no$ Vehicle_id Person_id age Person_type$;
datalines;
123456 101075 1541 22 Driver
123456 101076 1542 40 Driver
123456 101076 1543 16 Passenger
123456 101076 1544 54 Passenger
A05656 101077 1545 25 Passenger
A07651 100918 1546 24 Driver
A07651 100919 1547 18 Passenger
A07651 100919 1548 17 Passenger
A0074 8840 1336 18 Driver
A0085 8853 1337 21 Driver
A0085 8853 1338 3 Passenger
A075 1009 528 17 Driver
A075 1009 529 19 Passenger
A075 1009 530 18 Passenger
;

run;

PaigeMiller
Diamond | Level 26

I want you to do the programming that you can do, and I'll do the part that you are having trouble with. We have code where the variable PERSON_TYPE has values D or O, but that's not the data you provided. Please straighten that out, either by code or by adjusting the data.

--
Paige Miller
kbhagat
Fluorite | Level 6

D is Driver and O is Passenger. Does it matter?

kbhagat
Fluorite | Level 6

data test;
input Report_no$ Vehicle_id Person_id age Person_type$;
datalines;
123456 101075 1541 22 D
123456 101076 1542 40 D
123456 101076 1543 16 O
123456 101076 1544 54 O
A05656 101077 1545 25 O
A07651 100918 1546 24 D
A07651 100919 1547 18 O
A07651 100919 1548 17 O
A0074 8840 1336 18 D
A0085 8853 1337 21 D
A0085 8853 1338 3 O
A075 1009 528 17 D
A075 1009 529 19 O
A075 1009 530 18 O
;

run;

Reeza
Super User

The variables names don't match and the coding won't match for comparisons, eg type = "D" vs person_type="D" and/or person_type = "Driver" all require the appropriate data values and variable names. So it does matter.  

 

Based on the last input data this works as expected:

 

data test;
input Report_no$ Vehicle_id Person_id age Person_type$;
datalines;
123456 101075 1541 22 D
123456 101076 1542 40 D
123456 101076 1543 16 O
123456 101076 1544 54 O
A05656 101077 1545 25 O
A07651 100918 1546 24 D
A07651 100919 1547 18 O
A07651 100919 1548 17 O
A0074 8840 1336 18 D
A0085 8853 1337 21 D
A0085 8853 1338 3 O
A075 1009 528 17 D
A075 1009 529 19 O
A075 1009 530 18 O
;

run;

data young_crash21_new;
    set test;
    if person_type='D' and age<=21 then flag1=1; else flag1=0;
    if person_type='O' and age<=20 then flag2=1; else flag2=0;
run;
proc sql;
    create table want as select * from young_crash21_new
    group by report_no 
    having max(flag1)=1 and max(flag2)=1;
quit;
kbhagat
Fluorite | Level 6

Thank you so much. This actually worked!

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 18 replies
  • 4243 views
  • 1 like
  • 3 in conversation