BookmarkSubscribeRSS Feed
lpy0521
Fluorite | Level 6

Hi,

I have question that how to conditional perform vertical sum as following: 

Data have:

IDVAR1VAR2COE1
A240.2
ANA/.NA/.0.3
A130.5
B460.1
B790.2
B130.7
C560.5
C230.5

 

I want to do: if there is NA or dot  in var1 or var2, then sum the rest vertically by ID, otherwise calculate the sum product between Var1, Var2 and COE 1.

 

Data want:

IDVAR1VAR2
A37
B2.54.5
C3.54.5

 

where in data want,

Var1(A) = 2+1, Var2(A) =4+3;

Var1(B) =4*0.1+7*0.2+1*0.7=2.5 Var2(B) = 6*0.1+9*0.2+3*0.7=4.5;

Var1(C) = 5*0.5+2*0.5 = 3.5, Var2(C) = 6*0.5+3*0.5 = 4.5

 

Thanks!

7 REPLIES 7
VDD
Ammonite | Level 13 VDD
Ammonite | Level 13

here is a data step that gives your requested results

 

 

data have;
input ID $ var1 var2 coe1;
cards;

A 2 4 0.2
A NA/. NA/. 0.3
A 1 3 0.5
B 4 6 0.1
B 7 9 0.2
B 1 3 0.7
C 5 6 0.5
C 2 3 0.5
;
proc sort data=have out=have2;
	by ID;
run;

data want(drop = var1_total var2_total coe1);
	set have2;
	by id;
	if first.ID then
		do;
			var1_total = 0;
			var2_total = 0;
		end;
	var1_total + var1;
	var2_total + var2;
	if last.ID then
		do;
			var1 = var1_total;
			var2 = var2_total;
			output;
		end;
run;

lpy0521
Fluorite | Level 6

Thanks for your suggestion. But it gives the sum for B C, instead of the sum product. By sum product I mean:

Var1(B)*COE(B) + VAR2(B)*COE(B).

 

Thanks!

andreas_lds
Jade | Level 19

Lets solve the issue with the input data first: the string "NA/." can't be stored in a numeric variable. var1 and var2 must be numeric to be used in calculations. So please post the example data as data-step using the data-types you have in your original dataset.

 

Except for the special treatment when var1 or var2 are missing in a group, this looks like weighted sums.

lpy0521
Fluorite | Level 6

Sorry for the confusion here. What I mean by "NA/." is actually missing values and there are almost half of my var1 and var2 are missing. So basically I want to perform:

 

For each ID, if there is missing value in Var1 or Var2, then sum those non-missing value up, if there are no missing value, then calculate the weighted sum with corresponding COE1.

 

Thanks for your comment!

andreas_lds
Jade | Level 19

Not 100% that this works, but it seems to provide the requested results:

 

proc sql;
  /* find all IDs having var1 and/or var2 missing */
   create table work.WithMissing as
      select distinct id
         from work.have2
            where missing(var1) or missing(var2)
   ;
  /* set coe1 to 1 for the IDs */
   update work.have2 
      set coe1 = 1 
         where id in (select id from work.WithMissing)
   ;
quit;

proc summary data=work.have2 nway;
   class id;
   var var1 var2;
   weight coe1;
   output out=work.want(drop= _type_ _freq_) sum=;
run;
s_lassen
Meteorite | Level 14

It can be done using SQL:

proc sql;
  create table want as select
    ID,
    case nmiss(var1)+nmiss(var2)
      when 0 then sum(var1*coe1)
      else sum(var1)
    end as var1,  
    case nmiss(var1)+nmiss(var2)
      when 0 then sum(var2*coe1)
      else sum(var2)
    end as var2
  from have
  group by id;
quit; 

 Or you can do it with datasteps like this:

data miss;
  set have(where=(var1 is null or var2 is null))
  by id;
  if first.id;
  keep id;
run;

data want;
  sum1=0;
  sum2=0;
  do until(last.id);
    merge miss(in=miss) have;
    by id; 
    if miss then do;
      sum1+var1;
      sum2+var2;
      end;
    else do; 
      sum1+var1*coe1;
      sum2+var2*coe1;
      end;
    end;
  drop sum1 sum2;
  var1=sum1;
  var2=sum2;
run;

 

Astounding
PROC Star

Assuming that your variables are actually numeric, and that your data is sorted ...

 

Why not just compute both and track which one you want to use.  For example:

 

data want;

var1_sum=0;

var1_weighted=0;

var2_sum=0;

var2_weighted=0;

do until (last.id);

   set have;

   by id;

   var1_sum + var1;

   var2_sum + var2;

   if var1 > . then var1_weighted + var1*coe1;

   else var1_missing=1;

   if var2 > . then var2_weighted + var2*coe1;

   else var2_missing=1;

end;

if var1_missing then var1=var1_sum;

else var1= var1_weighted;

if var2_missing then var2=var2_sum;

else var2=var2_weighted;

drop var1_: var2_: ;

run;

   

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1817 views
  • 4 likes
  • 5 in conversation