BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
yaoyao
Fluorite | Level 6

Hi,

I'd like to search a column in a dataset based on some conditions. Then replace them with a sequence with range [0,1] and 1/n as increment (n is the number of data found based on the condition). For example, search odd numbers in the column j in the Test dataset below. Then replace '3, 5, 7, 9, 11' with '0.2, 0.4, 0.6, 0.8, 0.1'.

 

 

data test;
 do i=1 to 10 by 1;
 j=i+1;
 output;
 end;
run;

 

Many thanks in advance

 

 

 

 

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
ChrisNZ
Tourmaline | Level 20
data WANT; 
  do until(LASTOBS1);
    set HAVE(where=(mod(J,2)=1)) end=LASTOBS1;
    DENOM+1;
  end;
  N=START_RANGE;
  do until(LASTOBS2);
    set HAVE end=LASTOBS2;
    if mod(J,2)=1 then do;
      J=N;
      N+(END_RANGE-START_RANGE)/DENOM;
    end;
   output;
  end;
run;

View solution in original post

14 REPLIES 14
ChrisNZ
Tourmaline | Level 20

Like this?

data WANT; 
  set HAVE;
  if mod(J,2)=1 then do;
    N+0.2;
    J=N;
  end;
run;

 

yaoyao
Fluorite | Level 6
Thank you for your help. It is great! Have a further question. Since we know there are five odd numbers, we set 1/5 as increment. How can we have a dynamic method to set the increment.
ChrisNZ
Tourmaline | Level 20

Do you want 2 steps?  1) Count the number of odd number  2) Apply the derived ratio?

yaoyao
Fluorite | Level 6

Yep, 2 steps. Also is it possible we can do it with the dynamic range defined in the columns like column start_range and end_range as below?

 

data test;
 do i=1 to 10 by 1;
 j=i+1;
 start_range=0.5;
 end_range=1;
 output;
 end;
run;
ChrisNZ
Tourmaline | Level 20

> Also is it possible we can do it with the dynamic range 

Just change the line

N+1/DENOM;

 

ChrisNZ
Tourmaline | Level 20
data WANT; 
  do until(LASTOBS1);
    set HAVE(where=(mod(J,2)=1)) end=LASTOBS1;
    DENOM+1;
  end;
  N=START_RANGE;
  do until(LASTOBS2);
    set HAVE end=LASTOBS2;
    if mod(J,2)=1 then do;
      J=N;
      N+(END_RANGE-START_RANGE)/DENOM;
    end;
   output;
  end;
run;
yaoyao
Fluorite | Level 6

Many thanks. It is exactly what I wanted. Can I have the last request? How can I calculate N by group? For example, the test dataset has a group column. I want to count the number of the odd numbers by the group. Then set the two subgroups with the different increments. Appreciate your help.

 

data test;
 do i=1 to 10 by 1;
 j=i+1;
 if j<=5 then
		group = 0;
	else
		group = 1;

 start_range=0.5;
 end_range=1;
 output;
 end;
run;
ChrisNZ
Tourmaline | Level 20

> How can I calculate N by group? 

Like this ?

data WANT; 
  do until(LASTOBS1);
    set HAVE(where=(mod(J,2)=1)) end=LASTOBS1;
    DENOM0+(GROUP=0); 
    DENOM1+(GROUP=1);
  end;
  do until(LASTOBS2);
    set HAVE end=LASTOBS2;
    if mod(J,2)=1 then do;
      if GROUP=0 then do;
        N0+1/DENOM0; 
        J=N0;
      end;
      if GROUP=1 then do;
        N1+1/DENOM1; 
        J=N1;
      end;
    end;
    output;
  end;
run;

Assuming the data is not sorted by GROUP.

 

yaoyao
Fluorite | Level 6
Thank you for your code. Is it possible we can dynamically deal with the group? In my real code, the dataset is grouped by date. It expands over time.
ChrisNZ
Tourmaline | Level 20

Like this?

data WANT; 
  call missing(DENOM,N);
  do until(last.DATE);
    set HAVE(where=(mod(J,2)=1));
    by DATE; 
    DENOM+1;
  end;
  do until(last.DATE);
    set HAVE ;
    by DATE;
    if mod(J,2)=1 then do;
      N+1/DENOM; 
      J=N;
    end;
    output;
  end;
run;

Untested as I don't have access to SAS atm.

This won't work if some groups contain no odd numbers.

ChrisNZ
Tourmaline | Level 20

Like this?

data WANT; 
  do until(LASTOBS1);
    set HAVE(where=(mod(J,2)=1)) end=LASTOBS1;
    DENOM+1;
  end;
  do until(LASTOBS2);
    set HAVE end=LASTOBS2;
    if mod(J,2)=1 then do;
      N+1/DENOM;
      J=N;
    end;
    output;
  end;
run;

 

mkeintz
PROC Star

As you have seen, you need to pass through the data twice: first time to establish a denominator, and the second time to increment the cumulative fraction and output.  You can do this with a pair of loops each with a SET statement, or a single SET with "in=" parameters, and a couple of "summing statements":

 

data want (drop=_:);
  set have (in=firstpass)
      have (in=secondpass) ;
  where mod(j,2)=1;
  if firstpass then _denom+1;  /*summing statement for counting*/
  if secondpass ;  /*"subsetting if" */
  n+1/_denom ;     /* summing statement for incrementing */
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
yaoyao
Fluorite | Level 6
Thank you for your suggestion. Can you help to write a sample code? I am an R user for several years. But still new to SAS. In my real code, there are thousands of groups. Wondered whether we can do the grouping calculation efficiently.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 14 replies
  • 868 views
  • 2 likes
  • 3 in conversation