Solved: Re: comparing 2rows

Anita_n · Posted 10-10-2021 11:23 AM

Dear all,

I have a question concerning how to compare two rows in SAS.

I have the following dataset. I wish to compare row1 with row2 and if var1, var2, var3 in row1 is the same as the corresponding variables in row2 then the value of the first var3name should be outputted in a new variable called var_new for all corresponding rows

data sample_data;
input var1 var2 var3 var3name $;
datalines; 
	01 0 01 ABN 
	01 0 01 ABS
	01 0 01 ABE
	01 0 02 CGM
	01 0 02 CGN
	01 0 02 CGM
	01 0 02 CGR
	02 1 03 DFJ
	02 1 03 DFK
	02 1 03 DFL
;
run;

The output should look like this below:

data want;
input var1 $ var2 $ var3 $ var3name $ var_new $;
datalines; 
	01 0 01 ABN ABN
	01 0 01 ABS ABN
	01 0 01 ABE ABN
	01 0 02 CGM CGM
	01 0 02 CGN CGM
	01 0 02 CGM CGM
	02 1 03 DFJ DFJ
	02 1 03 DFK DFJ
	02 1 03 DFL DFJ
;
run;

I will appreciate any help

FreelanceReinh · Posted 10-10-2021 11:47 AM

Hello @Anita_n,

Use BY-group processing and retain variable var_new:

data want;
set sample_data;
by var1-var3;
if first.var3 then var_new=var3name;
retain var_new;
run;

You haven't specified the value of var_new in the case that, e.g., the first and second row differ in var1, var2 or var3. My suggested code treats the first row in this case as a group comprised of only one observation, i.e., the value of var3name is copied to var_new, as always in the first observation of a group.

(Note that your WANT and SAMPLE_DATA are not exactly consistent.)

Edit:

If your real variable names do not have a common prefix followed by "1", "2", "3," write them out in the BY statement, e.g.,

by red green blue;

If your real data are not sorted by var1 var2 var3, but you still want to compare only consecutive observations, add the NOTSORTED option to the BY statement:

by var1-var3 notsorted;

View solution in original post

FreelanceReinh · Posted 10-10-2021 11:47 AM

Hello @Anita_n,

Use BY-group processing and retain variable var_new:

data want;
set sample_data;
by var1-var3;
if first.var3 then var_new=var3name;
retain var_new;
run;

You haven't specified the value of var_new in the case that, e.g., the first and second row differ in var1, var2 or var3. My suggested code treats the first row in this case as a group comprised of only one observation, i.e., the value of var3name is copied to var_new, as always in the first observation of a group.

(Note that your WANT and SAMPLE_DATA are not exactly consistent.)

Edit:

If your real variable names do not have a common prefix followed by "1", "2", "3," write them out in the BY statement, e.g.,

by red green blue;

If your real data are not sorted by var1 var2 var3, but you still want to compare only consecutive observations, add the NOTSORTED option to the BY statement:

by var1-var3 notsorted;

Anita_n · Posted 10-10-2021 12:22 PM

@FreelanceReinh thanks for the quick reply. Yes, I will forgot to specify the case where row1 and row2 differs in var1, var2, and var3, could you suggest how I should go about such cases please. Thanks a lot

FreelanceReinh · Posted 10-10-2021 12:38 PM

You can define the value of var_new in this case as appropriate for your application, e.g., use the value of var3name (as my suggested code does), leave it missing, set it to a constant value (e.g., "unique") or to a value indicating which of the three variables var1, var2 and var3 differ, etc. The code follows the specifications.

Anita_n · Posted 10-10-2021 12:41 PM

thanks, your code worked

comparing 2rows

Re: comparing 2rows

Re: comparing 2rows

Re: comparing 2rows

Re: comparing 2rows

Re: comparing 2rows

Registration is open

SAS Training: Just a Click Away