BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Mathis1
Quartz | Level 8

I'm re-uploading this topic because the previous one was considered as solved i don't know why...

 

I have this dataset :

 

Data Table;
Input ID Y VAR $;
datalines;
1 -1762.05 A
2 -776.07 A
3 -670 A
4 -535 A
6 -531 A
7 -528 B
8 -294 B
9 -281 B
10 -227 B
;
run;

 

I'm now creating variables to get the difference of values between previous and next values, for each observation :

 

%MACRO COMPARE(table= , var= );
Data &table.;
retain id;
set &table.;
id=_N_;
run;

data &table._COMPARE;
set &table.;
by ID;
set &table. ( firstobs = 2 keep = &var. rename = (&var. = Next_Measure) )
&table. ( obs = 1 drop = _all_ );
Label Next_Measure=Next_Measure;
Prev_Measure = lag(&var.) ;
Diff_prev= (&var.-lag(&var.))/lag(&var.) ;
Diff_Next= (Next_Measure-&var.)/&var. ;
run;
%MEND;

 

%COMPARE(table=Table, var=Y);

 

data TABLE_COMPARE ;
Set TABLE_COMPARE;
If abs(Diff_prev) < 0.01 or abs(Diff_Next) < 0.01 then Group= 1;
run;

 

What I'm looking to get is :

 

Data TABLE_COMPARE_WANT;
Set TABLE_COMPARE;
Input Group_WANT;
datalines;
1
.
.
2
2
1
.
.
2
;
run;

 

So i'm looking to have for each variable, several groups that must contain very close observations (a difference of 0.01 with the previous OR the next observation).

 

So in my example i would have two group 1 and group 2 for VAR A and Group1 and group 2 for VAR B. 

 

Thank You for your answer 

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26

I still don't think your description is clear enough, as the very first record seems to be assigned to group 1, yet it does not meet the condition. This needs further explanation, as do several other points in your explanation.

 

Nevertheless, the way to get any macro code to work is to first create code that works without macros and without macro variables. It does not seem that you have done this, and I would strongly recommend you do that. Here is my attempt to get this to work without macros and without macro variables, although as I said, I don't see why you say the first record is in group 1; my code assigns it to the missing group.

 

Also, please state why macros are needed here at all. Thanks.

 

proc expand data=table out=table1;
	by var;
	convert y=lag_y/transformout=(lag 1);
	convert y=lead_y/transformout=(lead 1);
run;
data want;
	set table1;
	by var;
	retain prev_group;
	prev_group=lag(group);
	diff_prev=abs((y-lag_y)/lag_y);
	diff_next=abs((lead_y-y)/y);
	condition=(diff_prev<0.01 and not missing(diff_prev)) or (diff_next<0.01 and not missing(diff_next));
	if first.var then counter=0;
	counter+1;
	if counter=1 and condition then group=1; else group=.; 
	if counter=2 and condition then group=1; else group=.;
	if counter>2 then do;
		if condition and (prev_group=1 or prev_group=.) then group=1;
		if condition and missing(prev_group) then group=2;
		else group=.;
	end;
	prev_group=lag(group);
run;
--
Paige Miller

View solution in original post

6 REPLIES 6
PaigeMiller
Diamond | Level 26

You haven't stated what criterion is used for assigning observations to group 2.

 

--
Paige Miller
Mathis1
Quartz | Level 8

The criterion is :

The first time the variable is encountered, if the condition "abs(Diff_prev) < 0.01 or abs(Diff_Next) < 0.01" is met, then Group=1.

For the second observation of this variable :

    -  if the condition is met, then group is still equal to 1 : indeed, it is in the same group as the previous observation, as the difference with first observation's value of Y is small (<0.01)). 

  - If the condition is not met, then group = "."  (don't do anything)

 

For the third observation 

      - If the condition is met :

             - if previous value of Group variable was 1 then, Group = 1 for thirs observation

             - if previous value of Group variable was "." then Group = 2 : indeed, a new group is created as this thirs observation has close value with the next observation ("Next diff"<0.01).

 

    -  If the condition is not met : then group ="."

 

 

PaigeMiller
Diamond | Level 26

@Mathis1 wrote:

The criterion is :

The first time the variable is encountered


Which variable? What do you mean by "encountered"? Variables are either present or not present, they are not "encountered" as I understand the word.

--
Paige Miller
Mathis1
Quartz | Level 8

The variable is the "VAR" column in my example, and takes the value"A" or "B" in the table Table.

 

PaigeMiller
Diamond | Level 26

I still don't think your description is clear enough, as the very first record seems to be assigned to group 1, yet it does not meet the condition. This needs further explanation, as do several other points in your explanation.

 

Nevertheless, the way to get any macro code to work is to first create code that works without macros and without macro variables. It does not seem that you have done this, and I would strongly recommend you do that. Here is my attempt to get this to work without macros and without macro variables, although as I said, I don't see why you say the first record is in group 1; my code assigns it to the missing group.

 

Also, please state why macros are needed here at all. Thanks.

 

proc expand data=table out=table1;
	by var;
	convert y=lag_y/transformout=(lag 1);
	convert y=lead_y/transformout=(lead 1);
run;
data want;
	set table1;
	by var;
	retain prev_group;
	prev_group=lag(group);
	diff_prev=abs((y-lag_y)/lag_y);
	diff_next=abs((lead_y-y)/y);
	condition=(diff_prev<0.01 and not missing(diff_prev)) or (diff_next<0.01 and not missing(diff_next));
	if first.var then counter=0;
	counter+1;
	if counter=1 and condition then group=1; else group=.; 
	if counter=2 and condition then group=1; else group=.;
	if counter>2 then do;
		if condition and (prev_group=1 or prev_group=.) then group=1;
		if condition and missing(prev_group) then group=2;
		else group=.;
	end;
	prev_group=lag(group);
run;
--
Paige Miller
ballardw
Super User

You should seriously reconsider use of this programming construct:

Data &table.;
retain id;
set &table.;

When you use the same name for the output data set as the source data you completely replace the source data. So if there is any logic issue you may not have the data you need at all. This is further compounded by burying it in a macro. You may have already so altered the source data with testing that you can't get what you want.

 

Did you have something that worked as needed before attempting the macro? If not, get a version working without any macro calls or variables.

hackathon24-white-horiz.png

The 2025 SAS Hackathon Kicks Off on June 11!

Watch the live Hackathon Kickoff to get all the essential information about the SAS Hackathon—including how to join, how to participate, and expert tips for success.

YouTube LinkedIn

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1260 views
  • 0 likes
  • 3 in conversation