BookmarkSubscribeRSS Feed
MiniRadde
Obsidian | Level 7

Dear all,

 

I am in deep need of some help with something that I think is quite basic and simple, but which I cannot figure out myself. Suppose I have a dataset with a large number of variables and four of them are of concern for this data step:

 

HAVE = 

Var1Var2Var3Var4
A1C1
B1C2
TOT1C.
A2C4
B2C5
TOT2C.
A3C7
B3C8
TOT3C.
A1D90
B1D80
TOT1D.
A2D60
B2D50
TOT2D.
A3D30
B3D20
TOT3D.

 

 

Now I wish to fill out the missing values for the "TOT"-observation, by summing the observations of "A" and "B" in which Var2 and Var3 are equal. I.e., I need the red numbers in:

 

WANT = 

Var1Var2Var3Var4
A1C1
B1C2
TOT1C3
A2C4
B2C5
TOT2C9
A3C7
B3C8
TOT3C15
A1D90
B1D80
TOT1D170
A2D60
B2D50
TOT2D110
A3D30
B3D20
TOT3D50

 

Is there anyone who knows how to solve this in a simple data step? 

 

 

Thank you in advance!

 

 

 

 

 

 

 

 

 

 

 

 

 

 

8 REPLIES 8
novinosrin
Tourmaline | Level 20

Hi @MiniRadde Do you really have TOT obs in your HAVE dataset following every set of of A and B obs. Not a big deal as you rightly mentioned in datastep, but i find the requirement more alike a proc print/report. Can you clarify? Oh well

MiniRadde
Obsidian | Level 7

Hi @novinosrin! yes, I do have it already inte the HAVE dataset. a smentioned there are other variables in there where TOT have values. They are however not sorted as illustrated. 

Kurt_Bremser
Super User

Providing meaningful example data in a readily usable fashion (data step with datalines) enhances the quality of answers and reduces the need to go back-and-forth several times.

novinosrin
Tourmaline | Level 20

Thank you @MiniRadde for clarifying. So assuming your data is a good representative, the sort is the most important part.

 


data have;
input Var1 $	Var2	Var3 $	Var4;
cards;
A	1	C	1
B	1	C	2
TOT	1	C	.
A	2	C	4
B	2	C	5
TOT	2	C	.
A	3	C	7
B	3	C	8
TOT	3	C	.
A	1	D	90
B	1	D	80
TOT	1	D	.
A	2	D	60
B	2	D	50
TOT	2	D	.
A	3	D	30
B	3	D	20
TOT	3	D	.
;

proc sort data=have out=_have;
by var3 var2 var1;
run;

data want;
set _have;
by  var3 var2 var1;
if first.var2 then s=0;
s+var4;
if var1='TOT' and missing(var4) then var4=s;
drop s;
run;
PaigeMiller
Diamond | Level 26

Assuming your data is sorted properly by var3 and var2:

 

data want;
    set have;
    by var3 var2;
    if first.var2 then sum=0;
    sum+var4;
    if var1='TOT' then var4=sum;
run;

 

--
Paige Miller
PeterClemmensen
Tourmaline | Level 20

Do something like this

 

data have;
input Var1 $ Var2 Var3 $ Var4;
datalines;
A 1 C 1
B 1 C 2
TOT 1 C .
A 2 C 4
B 2 C 5
TOT 2 C .
A 3 C 7
B 3 C 8
TOT 3 C .
A 1 D 90
B 1 D 80
TOT 1 D .
A 2 D 60
B 2 D 50
TOT 2 D .
A 3 D 30
B 3 D 20
TOT 3 D .
;

data want(drop=c);
   set have;
   c+Var4;
   if missing(Var4) then do;
      Var4=c;
      c=0;
   end;
run;
Kurt_Bremser
Super User

by-group processing, and retaining a helper variable:

data want;
set have;
by var2 var3;
retain _var4;
if first.var3
then _var4 = var4;
else if not last.var3
then _var4 + var4;
else var4 = _var4;
drop _var4;
run;
MiniRadde
Obsidian | Level 7

Thank you all for all the help you provided. The solutions seemed to be in a similar setting from all of you. I could not however make it work, as best it just copied the value from A or B but did not generate a sum. I solved the task by adding some data steps instead and could from there merge them back and make the sumation over variables instead. Ugly code - YES - but it solved my problem. Again, thanks for all the effort!

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 1302 views
  • 0 likes
  • 5 in conversation