BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Mathis1
Quartz | Level 8

I'm re-uploading this topic because the previous one was considered as solved i don't know why...

 

I have this dataset :

 

Data Table;
Input ID Y VAR $;
datalines;
1 -1762.05 A
2 -776.07 A
3 -670 A
4 -535 A
6 -531 A
7 -528 B
8 -294 B
9 -281 B
10 -227 B
;
run;

 

I'm now creating variables to get the difference of values between previous and next values, for each observation :

 

%MACRO COMPARE(table= , var= );
Data &table.;
retain id;
set &table.;
id=_N_;
run;

data &table._COMPARE;
set &table.;
by ID;
set &table. ( firstobs = 2 keep = &var. rename = (&var. = Next_Measure) )
&table. ( obs = 1 drop = _all_ );
Label Next_Measure=Next_Measure;
Prev_Measure = lag(&var.) ;
Diff_prev= (&var.-lag(&var.))/lag(&var.) ;
Diff_Next= (Next_Measure-&var.)/&var. ;
run;
%MEND;

 

%COMPARE(table=Table, var=Y);

 

data TABLE_COMPARE ;
Set TABLE_COMPARE;
If abs(Diff_prev) < 0.01 or abs(Diff_Next) < 0.01 then Group= 1;
run;

 

What I'm looking to get is :

 

Data TABLE_COMPARE_WANT;
Set TABLE_COMPARE;
Input Group_WANT;
datalines;
1
.
.
2
2
1
.
.
2
;
run;

 

So i'm looking to have for each variable, several groups that must contain very close observations (a difference of 0.01 with the previous OR the next observation).

 

So in my example i would have two group 1 and group 2 for VAR A and Group1 and group 2 for VAR B. 

 

Thank You for your answer 

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26

I still don't think your description is clear enough, as the very first record seems to be assigned to group 1, yet it does not meet the condition. This needs further explanation, as do several other points in your explanation.

 

Nevertheless, the way to get any macro code to work is to first create code that works without macros and without macro variables. It does not seem that you have done this, and I would strongly recommend you do that. Here is my attempt to get this to work without macros and without macro variables, although as I said, I don't see why you say the first record is in group 1; my code assigns it to the missing group.

 

Also, please state why macros are needed here at all. Thanks.

 

proc expand data=table out=table1;
	by var;
	convert y=lag_y/transformout=(lag 1);
	convert y=lead_y/transformout=(lead 1);
run;
data want;
	set table1;
	by var;
	retain prev_group;
	prev_group=lag(group);
	diff_prev=abs((y-lag_y)/lag_y);
	diff_next=abs((lead_y-y)/y);
	condition=(diff_prev<0.01 and not missing(diff_prev)) or (diff_next<0.01 and not missing(diff_next));
	if first.var then counter=0;
	counter+1;
	if counter=1 and condition then group=1; else group=.; 
	if counter=2 and condition then group=1; else group=.;
	if counter>2 then do;
		if condition and (prev_group=1 or prev_group=.) then group=1;
		if condition and missing(prev_group) then group=2;
		else group=.;
	end;
	prev_group=lag(group);
run;
--
Paige Miller

View solution in original post

6 REPLIES 6
PaigeMiller
Diamond | Level 26

You haven't stated what criterion is used for assigning observations to group 2.

 

--
Paige Miller
Mathis1
Quartz | Level 8

The criterion is :

The first time the variable is encountered, if the condition "abs(Diff_prev) < 0.01 or abs(Diff_Next) < 0.01" is met, then Group=1.

For the second observation of this variable :

    -  if the condition is met, then group is still equal to 1 : indeed, it is in the same group as the previous observation, as the difference with first observation's value of Y is small (<0.01)). 

  - If the condition is not met, then group = "."  (don't do anything)

 

For the third observation 

      - If the condition is met :

             - if previous value of Group variable was 1 then, Group = 1 for thirs observation

             - if previous value of Group variable was "." then Group = 2 : indeed, a new group is created as this thirs observation has close value with the next observation ("Next diff"<0.01).

 

    -  If the condition is not met : then group ="."

 

 

PaigeMiller
Diamond | Level 26

@Mathis1 wrote:

The criterion is :

The first time the variable is encountered


Which variable? What do you mean by "encountered"? Variables are either present or not present, they are not "encountered" as I understand the word.

--
Paige Miller
Mathis1
Quartz | Level 8

The variable is the "VAR" column in my example, and takes the value"A" or "B" in the table Table.

 

PaigeMiller
Diamond | Level 26

I still don't think your description is clear enough, as the very first record seems to be assigned to group 1, yet it does not meet the condition. This needs further explanation, as do several other points in your explanation.

 

Nevertheless, the way to get any macro code to work is to first create code that works without macros and without macro variables. It does not seem that you have done this, and I would strongly recommend you do that. Here is my attempt to get this to work without macros and without macro variables, although as I said, I don't see why you say the first record is in group 1; my code assigns it to the missing group.

 

Also, please state why macros are needed here at all. Thanks.

 

proc expand data=table out=table1;
	by var;
	convert y=lag_y/transformout=(lag 1);
	convert y=lead_y/transformout=(lead 1);
run;
data want;
	set table1;
	by var;
	retain prev_group;
	prev_group=lag(group);
	diff_prev=abs((y-lag_y)/lag_y);
	diff_next=abs((lead_y-y)/y);
	condition=(diff_prev<0.01 and not missing(diff_prev)) or (diff_next<0.01 and not missing(diff_next));
	if first.var then counter=0;
	counter+1;
	if counter=1 and condition then group=1; else group=.; 
	if counter=2 and condition then group=1; else group=.;
	if counter>2 then do;
		if condition and (prev_group=1 or prev_group=.) then group=1;
		if condition and missing(prev_group) then group=2;
		else group=.;
	end;
	prev_group=lag(group);
run;
--
Paige Miller
ballardw
Super User

You should seriously reconsider use of this programming construct:

Data &table.;
retain id;
set &table.;

When you use the same name for the output data set as the source data you completely replace the source data. So if there is any logic issue you may not have the data you need at all. This is further compounded by burying it in a macro. You may have already so altered the source data with testing that you can't get what you want.

 

Did you have something that worked as needed before attempting the macro? If not, get a version working without any macro calls or variables.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 673 views
  • 0 likes
  • 3 in conversation