BookmarkSubscribeRSS Feed
e_______c
Calcite | Level 5

Hi 🙂

 

I want to create a conditional variable (outcome) based on accident_id and road_user_type:

 

- if anyone in an accident was a vulnerable road user > then outcome = 1;

- else if everyone in an accident was a MVO > then outcome = 2;

- else outcome = 3.

 

Please help 🙂

 

dataset have;

Accident_IDRoad_user_type
1Vulnerable
1MVO
1.
2.
2MVO
3MVO
3MVO
3MVO

 

dataset want;

Accident_IDRoad_user_typeOutcome
1Vulnerable1
1MVO1
1.1
2..
2MVO.
3MVO2
3MVO2
3MVO2

 

 

 

10 REPLIES 10
sdhilip
Quartz | Level 8
Data Want;
Set Have;
If Road_user_type = "Vulnerable" then Outcome = 1;
If Road_user_type = "MVO" then Outcome = 2;
Else Outcome = 3;
Run;

e_______c
Calcite | Level 5

Hey Chris,

 

This is road crash data. The three one means the three people (rows) are each in the same accident (accident_id). Does that make sense?

ChrisNZ
Tourmaline | Level 20

Like this?

proc sql;
  select ACCIDENT_ID
       , ROAD_USER_TYPE 
       , case when sum(ROAD_USER_TYPE='Vulnerable')   then 1
              when sum(ROAD_USER_TYPE='MVO')=count(*) then 2
              else                                         3 end as OUTCOME
  from HAVE
  group by ACCIDENT_ID;
 
ACCIDENT_ID ROAD_USER_TYPE OUTCOME
1   1
1 Vulnerable 1
1 MVO 1
2 MVO 3
2   3
3 MVO 2
3 MVO 2
3 MVO 2

 

kelxxx
Quartz | Level 8
data want;
set have;
by accident_id;
retain outcome;
if first.accident_id then do;
if road_user_type eq "Vulnerable" then outcome=1;
else if road_user_type eq "MVO" then outcome=2;
else if road_user_type eq "" then outcome=.;
else outcome=3;
end;
run;

Hello, it s my proposition.

I saw in the case accident_id = 2, road_user_type =. >>> outcome =. >>> then I add a case where outcome =.
So we have 4 cases: outcome =., 1,2,3

Have a nice day.

Shmuel
Garnet | Level 18
data have;
 infile datalines;
 input Accident_ID Road_user_type $10.;
datalines;
1 Vulnerable
1 MVO
1 .
2 .
2 MVO
3 MVO
3 MVO
3 MVO
3 MVO
; run;
 
 data temp1;
 set have;
       if road_user_type = 'Vulnerable' then road_user_code=1; else
       if road_user_type = 'MVO' then road_user_code=2; else
         road_user_code=0;
run;

proc sort data=temp1; by Accident_ID road_user_code; run;

data temp2;
 set temp1;
    by Accident_ID ;
       retain outcome;
       if first.Accident_ID then outcome=.;
       if road_user_code = 1 then outcome=1;
       if outcome ne 1 then do;
          if first.Accident_ID then do;
             if road_user_code = 2
                then outcome = 2;
                else outcome = .;
          end; 
          else do;
             if road_user_code = 2 and
                outcome = 2 then;
             else outcome = .;
          end;
      end;
      if last.accident_id then output;
run;

data want;
 merge have temp2;
  by Accident_ID ;
run;
ed_sas_member
Meteorite | Level 14

Hi @e_______c 

 

Here is another approach using a PROC SQL:

proc sql;

	/* if anyone in an accident was a vulnerable road user > then outcome = 1 */
	(select *, 1 as outcome
	from have
	group by Accident_ID
	having sum(Road_user_type="Vulnerable") > 0)
	
	union all
	
	/* else if everyone in an accident was a MVO > then outcome = 2 */
	(select *, 2 as outcome
	from have
	group by Accident_ID
	having count(Accident_ID) = sum(Road_user_type="MVO"))

	union all
	
	/* else outcome = 3 */
	(select *, 3 as outcome
	from have
	group by Accident_ID
	having sum(Road_user_type="Vulnerable") = 0 and count(Accident_ID) > sum(Road_user_type="MVO"))
	
	order by Accident_ID;
quit;
Kurt_Bremser
Super User

Per your rule, accident_id = 2 should have outcome 3.

This code achieves that:

data have;
input Accident_ID Road_user_type :$10.;
datalines;
1 Vulnerable
1 MVO
1 .
2 .
2 MVO
3 MVO
3 MVO
3 MVO
;

data acc_type;
set have;
by accident_id;
retain flag_mvo flag_vuln;
if first.accident_id
then do;
  flag_mvo = 1;
  flag_vuln = 0;
end;
if road_user_type ne 'MVO' then flag_mvo = 0;
if road_user_type = 'Vulnerable' than flag_vuln = 1;
if last.accident_id;
if flag_vuln then outcome = 1;
else if flag_mvo then outcome = 2;
else outcome = 3;
keep accident_id outcome;
run;

data want;
merge
  have
  acc_type
;
by accident_id;
run;
mkeintz
PROC Star

You have a nice compact assignment rule, whose data needs can be satisfied with a self-merge, where each argument (except the last) of the merge statement selects a needed data subset:

 

data have;
input Accident_ID Road_user_type :$10.;
datalines;
1 Vulnerable
1 MVO
1 .
2 .
2 MVO
3 MVO
3 MVO
3 MVO
;

data want;
  merge 
      have (where=(road_user_type='Vulnerable') in=invul)
      have (where=(road_user_type=' ') in=inmiss)
      have (where=(road_user_type^='MVO') in=non_mvo)
      have ;
  by accident_id;
  if invul=1   then outcome=1; else
  if inmiss=1  then outcome=.; else
  if non_mvo=0 then outcome=2; else
  outcome=3;
run;

You might ordinarily be worried about mixing data from differing observations in the merge.  But because the last argument of the merge is the complete set of records in HAVE, the individual values from those records will overwrite the value obtained from the matching subsets preceding it.

 

edit: Let me add some further notes on the MERGE statement with BY.  Although the merged subsets for each ID can have differing numbers of records from each other and from the full dataset, the IN= dummies will be constant for the entire BY-group.  I.e. if you have one record with road_user_type=' ' (and therefore inmiss=1) vs 3 records in the entire BY group, you'll have 3 iterations of the data step (i.e. 3 output records in this case), and for ALL of those iterations, the INMISS will equal 1.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
e_______c
Calcite | Level 5

Thanks mkeintz > that worked perfectly! Such a simple thing but I couldn't figure it out. Thanks again!

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 1558 views
  • 3 likes
  • 8 in conversation