BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ger15xxhcker
Quartz | Level 8

Hello!

 

I would like to produce 3 sets under the following conditions. How could this be done?

 

Thanks!

 

data test;
 input id amt;
 datalines;
 1 5
 1 5
 1 106
 2 90
 2 100
 3 111
 3 111
 3 111
 3 111
 4 837
 4 838
 4 839
 ;
run;

data want01; /*01 after the first two occurrences, the difference is greater than 20% compared to the third one*/
 input id flg$;
 datalines;
 1 Y
 ;
run;
data want02; /*02 where the amt field is constant*/
 input id flg$;
 datalines;
 3 Y
 ;
run;
data want03; /*03 where the amt field always changeing*/
 input id flg$;
 datalines;
 2 Y
 4 Y
 ;
run;
1 ACCEPTED SOLUTION

Accepted Solutions
ScottBass
Rhodochrosite | Level 12

I'm sure there are data step approaches but I work with RDBMS's so much that I often think in SQL.

 

If your real datasets are huge this might not perform well.

 

Hope this helps...

 

* source data ;
data have;
 input id amt;
 datalines;
 1 5
 1 5
 1 106
 2 90
 2 100
 3 111
 3 111
 3 111
 3 111
 4 837
 4 838
 4 839
 ;
run;

* create two keys - this could be a view if your data is large ;
data have;
   format sk gk;
   sk+1;
   set have;
   by id;
   if first.id then gk=0;
   gk+1;
run;

/*01 after the first two occurrences, the difference is greater than 20% compared to the third one*/
* this is unclear but here is a start ;
proc sql _method;
   * for debugging ;
   create table want1_debug as
   select a.id, a.amt, b.amt as amt_3
   from have a
   left join have b
   on a.id=b.id
   and b.gk=3
   order by a.sk;

   * for real ;
   create table want1 as
   select distinct a.id
   from have a
   left join have b
   on a.id=b.id
   and b.gk=3
   where b.amt > a.amt*1.2;
quit;

/*02 where the amt field is constant*/
proc sql _method;
   * for debugging ;
   create table want2_debug as
   select id,min(amt) as min_amt,max(amt) as max_amt
   from have
   group by id;

   * for real ;
   create table want2 as
   select distinct id 
   from (
   select id,min(amt) as min_amt,max(amt) as max_amt
   from have
   group by id
   ) x
   where x.min_amt = x.max_amt;
quit;

/*03 where the amt field always changeing*/
proc sql _method;
   * for debugging ;
   create table want3_debug as
   select a.id,a.amt,b.amt as next_amt,(a.amt ^= b.amt or b.amt is missing) as flag
   from have a
   left join have b
   on a.id=b.id
   and a.sk+1=b.sk
   order by a.sk;

   * for real ;
   create table want3 as
   select distinct id
   from (
   select id,min(flag) as min_flag 
   from (
   select a.id,a.amt,b.amt as next_amt,(a.amt ^= b.amt or b.amt is missing) as flag
   from have a
   left join have b
   on a.id=b.id
   and a.sk+1=b.sk
   ) x
   group by id
   having min_flag ^= 0
   ) x;
quit;

Please post your question as a self-contained data step in the form of "have" (source) and "want" (desired results).
I won't contribute to your post if I can't cut-and-paste your syntactically correct code into SAS.

View solution in original post

4 REPLIES 4
ChrisNZ
Tourmaline | Level 20

What does this mean?

 after the first two occurrences, the difference is greater than 20% compared to the third one
ger15xxhcker
Quartz | Level 8

Sorry for the wrong wording. An example:

 

1. row id =1 amt = 10

2. row id =1 amt = 11

3. row id =1 amt = 13 (this one is 18,12% higher then the second one)

4. row id=2  amt = 10

5. row id=2  amt = 11

6. row id=2  amt = 14 this third one is 27,2% higher then the second one (which is for id = 2)

 

i'd like to compare this values. If the third one is >= 20% then take a flag.

 

Thanks for your help!

 

 

 

ScottBass
Rhodochrosite | Level 12

I'm sure there are data step approaches but I work with RDBMS's so much that I often think in SQL.

 

If your real datasets are huge this might not perform well.

 

Hope this helps...

 

* source data ;
data have;
 input id amt;
 datalines;
 1 5
 1 5
 1 106
 2 90
 2 100
 3 111
 3 111
 3 111
 3 111
 4 837
 4 838
 4 839
 ;
run;

* create two keys - this could be a view if your data is large ;
data have;
   format sk gk;
   sk+1;
   set have;
   by id;
   if first.id then gk=0;
   gk+1;
run;

/*01 after the first two occurrences, the difference is greater than 20% compared to the third one*/
* this is unclear but here is a start ;
proc sql _method;
   * for debugging ;
   create table want1_debug as
   select a.id, a.amt, b.amt as amt_3
   from have a
   left join have b
   on a.id=b.id
   and b.gk=3
   order by a.sk;

   * for real ;
   create table want1 as
   select distinct a.id
   from have a
   left join have b
   on a.id=b.id
   and b.gk=3
   where b.amt > a.amt*1.2;
quit;

/*02 where the amt field is constant*/
proc sql _method;
   * for debugging ;
   create table want2_debug as
   select id,min(amt) as min_amt,max(amt) as max_amt
   from have
   group by id;

   * for real ;
   create table want2 as
   select distinct id 
   from (
   select id,min(amt) as min_amt,max(amt) as max_amt
   from have
   group by id
   ) x
   where x.min_amt = x.max_amt;
quit;

/*03 where the amt field always changeing*/
proc sql _method;
   * for debugging ;
   create table want3_debug as
   select a.id,a.amt,b.amt as next_amt,(a.amt ^= b.amt or b.amt is missing) as flag
   from have a
   left join have b
   on a.id=b.id
   and a.sk+1=b.sk
   order by a.sk;

   * for real ;
   create table want3 as
   select distinct id
   from (
   select id,min(flag) as min_flag 
   from (
   select a.id,a.amt,b.amt as next_amt,(a.amt ^= b.amt or b.amt is missing) as flag
   from have a
   left join have b
   on a.id=b.id
   and a.sk+1=b.sk
   ) x
   group by id
   having min_flag ^= 0
   ) x;
quit;

Please post your question as a self-contained data step in the form of "have" (source) and "want" (desired results).
I won't contribute to your post if I can't cut-and-paste your syntactically correct code into SAS.
ger15xxhcker
Quartz | Level 8

Thank, you are right full of name conventions next time i will do so! Your code is perfect for me. Only the first method need some changes. How can i do if I only look at the  more then 20% increase, not the decrease. So i modify my question: If the first 2 record is the same and the third one is more then 20 increase.

 

I really appreciate it!

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1125 views
  • 1 like
  • 3 in conversation