DATA Step, Macro, Functions and more

How to group observations with logic?

Accepted Solution Solved
Reply
Occasional Contributor hx
Occasional Contributor
Posts: 17
Accepted Solution

How to group observations with logic?

Hi, I have a dataset as follows, the variable 'w' has already been sorted, and I want to group them according to the values of 'value' and generate a new variable 'nw', and 'nw' should be given as the last w whose value is less than or equal to 200, and for those values large than 200, stay the same. It's really hard to describe here.

wvalue
abate200
abbreviations50
abdomen20
abdominal110
abelcet100
aberrant100
abiraterone100
abl1100
ablation210
abnormal30
abo780
abscission260
absence100
absenteeism140
absorption130
abstinence150
abstracting110
academia210
academic30
academy600
acceleration180
access130
accessibility100
accident200
accidents230
accountability50
accounting100
accreditation230
acculturation30
ace600
acetates920
acetazolamide310
acetic110
acetone210
acetyl50
acetylaminofluorene200
acetylatio230
acetylation200
bupropion550
burden560
burkitt's230
burn300
burns140
businesses550
butanoic800
butterflies500
butyrates100
butyric450
butyryl800
bypass330
bystander300
c300
c2130
c2h2150
c3h250
c57bl720


That is to say, I want to have something like:

wvaluenw
abate200abl1
abbreviations50abl1
abdomen20abl1
abdominal110abl1
abelcet100abl1
aberrant100abl1
abiraterone100abl1
abl1100abl1
ablation210ablation
abnormal30abnormal
abo780abo
abscission260abscission
absence100abstracting
absenteeism140abstracting
absorption130abstracting
abstinence150abstracting
abstracting110abstracting
academia210academia
academic30academic
academy600academy
acceleration180accident
access130accident
accessibility100accident
accident200accident
accidents230accidents
accountability50accounting
accounting100accounting
accreditation230accreditation
acculturation30acculturation
ace600ace
acetates920acetates
acetazolamide310acetazolamide
acetic110acetic
acetone210acetone
acetyl50acetylaminofluorene
acetylaminofluorene200acetylaminofluorene
acetylatio230acetylatio
acetylation200acetylation
bupropion550bupropion
burden560burden
burkitt's230burkitt's
burn300burn
burns140burns
businesses550businesses
butanoic800butanoic
butterflies500butterflies
butyrates100butyrates
butyric450butyric
butyryl800butyryl
bypass330bypass
bystander300bystander
c300c
c2130c2h2
c2h2150c2h2
c3h250c3h
c57bl720c57bl

All the things that came into my mind is to use sort but I don't think it worked. It seems like I cannot sort by the 'values' since I have sorted the data by'w', can someone help me out? Really appreciated.

 


Accepted Solutions
Solution
‎12-02-2016 05:05 PM
Super User
Posts: 19,862

Re: How to group observations with logic?

This should get you close, I haven't fully verified it works though.

PS - in the future please post your same data as a data step - like the first step in my proc. 

 

data have;
informat w $30. value 8.;
input w	$ value;
cards;
abate   200
abbreviations   50
abdomen 20
abdominal   110
abelcet 100
aberrant    100
abiraterone 100
abl1    100
ablation    210
abnormal    30
abo 780
abscission  260
absence 100
absenteeism 140
absorption  130
abstinence  150
abstracting 110
academia    210
academic    30
academy 600
acceleration    180
access  130
accessibility   100
accident    200
accidents   230
accountability  50
accounting  100
accreditation   230
acculturation   30
ace 600
acetates    920
acetazolamide   310
acetic  110
acetone 210
acetyl  50
acetylaminofluorene 200
acetylatio  230
acetylation 200
bupropion   550
burden  560
burkitt's   230
burn    300
burns   140
businesses  550
butanoic    800
butterflies 500
butyrates   100
butyric 450
butyryl 800
bypass  330
bystander   300
c   300
c2  130
c2h2    150
c3h 250
c57bl   720
;
run;

data add_order;
set have;
order=_n_;
run;

proc sort data=add_order;
by descending order;
run;

data want;
set add_order;
retain prev_group group;
if value>=200 then prev_group=1;
if prev_group=1 and value<200 then do;
	group = w;
	prev_group=0;
end;
else if prev_group=1 and value>=200 then group=w;

run;

proc sort data=want;
by order;
run;

View solution in original post


All Replies
Trusted Advisor
Posts: 1,022

Re: How to group observations with logic?

Why does w=accident/value=200 get nw=accident,    yet   w=abate/value=200 get nw=abl1?

Occasional Contributor hx
Occasional Contributor
Posts: 17

Re: How to group observations with logic?

Thank you for your reply, mkeintz!
w=abate/value=200 get nw=abl1 because I wanted to get the nw="the last w of the values that are <=200", for instance w=accident/value=200 get nw=accident because accidents/value=230 which is larger than 200. Does that make sense? sorry I didn't describe it clearly.
Super User
Posts: 19,862

Re: How to group observations with logic?

Can you reverse the order of your data? So you're not looking forward but looking backward and reversing your logic. Its much easier and cleaner to lag in SAS. 

 

 

Occasional Contributor hx
Occasional Contributor
Posts: 17

Re: How to group observations with logic?

Do you mean reverse them by w alphabetically? But I still want to get the new variable based on w/values, how does that work by reversing?
Super User
Posts: 19,862

Re: How to group observations with logic?

I don't know how you created your order in the first place.

 

But if you reverse your entire data you can reverse your logic. 

 

So it becomes if the value is over 200 then assign the value from column 1. Use retain to keep the value from row to row.

data want;
set have;
retain group;
if X > 200 then group=column1;
run;

 

Super User
Posts: 19,862

Re: How to group observations with logic?

Nevermind, I didn't understand your logic. You're handling the values greater than 200 differently than I thought.

Occasional Contributor hx
Occasional Contributor
Posts: 17

Re: How to group observations with logic?

Yeah, but your code did give me some thinking, by using the group did give me something like what I want, yet still can't. Thank you anyway, I'm still trying
Super User
Posts: 19,862

Re: How to group observations with logic?

You'd have to set a flag and then check the value of the flag so you need two retained variables.

 

 

Occasional Contributor hx
Occasional Contributor
Posts: 17

Re: How to group observations with logic?

How can I set the value of the flag? Should I use dummies or 'w'?
Solution
‎12-02-2016 05:05 PM
Super User
Posts: 19,862

Re: How to group observations with logic?

This should get you close, I haven't fully verified it works though.

PS - in the future please post your same data as a data step - like the first step in my proc. 

 

data have;
informat w $30. value 8.;
input w	$ value;
cards;
abate   200
abbreviations   50
abdomen 20
abdominal   110
abelcet 100
aberrant    100
abiraterone 100
abl1    100
ablation    210
abnormal    30
abo 780
abscission  260
absence 100
absenteeism 140
absorption  130
abstinence  150
abstracting 110
academia    210
academic    30
academy 600
acceleration    180
access  130
accessibility   100
accident    200
accidents   230
accountability  50
accounting  100
accreditation   230
acculturation   30
ace 600
acetates    920
acetazolamide   310
acetic  110
acetone 210
acetyl  50
acetylaminofluorene 200
acetylatio  230
acetylation 200
bupropion   550
burden  560
burkitt's   230
burn    300
burns   140
businesses  550
butanoic    800
butterflies 500
butyrates   100
butyric 450
butyryl 800
bypass  330
bystander   300
c   300
c2  130
c2h2    150
c3h 250
c57bl   720
;
run;

data add_order;
set have;
order=_n_;
run;

proc sort data=add_order;
by descending order;
run;

data want;
set add_order;
retain prev_group group;
if value>=200 then prev_group=1;
if prev_group=1 and value<200 then do;
	group = w;
	prev_group=0;
end;
else if prev_group=1 and value>=200 then group=w;

run;

proc sort data=want;
by order;
run;
Occasional Contributor hx
Occasional Contributor
Posts: 17

Re: How to group observations with logic?

Thank you so much Reeza! it did work! I got to look deep into your code to get better understanding.
Super User
Posts: 5,516

Re: How to group observations with logic?

This is untested, but should be at least most of the way there:

 

data want;

do until (done1 or value > 200);

   set have end=done1;

   if value <= 200 then nw = w;

end;

do until (done2 or value > 200);

   set have end=done2;

   if value > 200 then nw = w;

   output;

end;

run;

 

 

Occasional Contributor hx
Occasional Contributor
Posts: 17

Re: How to group observations with logic?

Posted in reply to Astounding
I tried but it seemed that new variable nw only has one value which is the last w in the set.
Super User
Posts: 5,516

Re: How to group observations with logic?

Sounds like something is a little off.  While I can't see your log, try checking "done1" and "done2" to make sure they appear in the proper places.

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 22 replies
  • 384 views
  • 1 like
  • 5 in conversation