- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi Community,
Just consider the dataset as an example.
w - wood; p- plastic ; s - steel ; Items 1 2 3 4 5....(wooditem1 w1; plasticitem5 p5; steelitem3 s3 and so on)....
Here these 3 stuffs have a value. I want to sum up the value by category.
data xxx;
input aa$ bb$ cc$ dd$ c1 c2 c3 c4;
cards;
w1 p2 s4 w3 10 45 14 5
s1 s3 w2 p4 20 14 15 84
p2 w4 w5 w4 12 18 45 7
s1 w3 p5 s2 23 56 87 44
run;
proc print data = xxx;
run;
This is how my output should be as a result
W | S | P |
15 | 14 | 45 |
34 | 15 | 84 |
12 | 0 | 70 |
… | … | … |
Could any one suggest some code by SAS or Proc SQL to resolve it.
Thanks in advance!
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
data xxx;
input aa$ bb$ cc$ dd$ c1 c2 c3 c4;
cards;
w1 p2 s4 w3 10 45 14 5
s1 s3 w2 p4 20 14 15 84
p2 w4 w5 w4 12 18 45 7
s1 w3 p5 s2 23 56 87 44
;
run;
data want;
set xxx;
w=0;p=0;s=0;
array x{*} $ aa bb cc dd;
array y{*} c1-c4;
do i=1 to dim(x);
if x{i}=: 'w' then w+y{i};
else if x{i}=: 'p' then p+y{i};
else if x{i}=: 's' then s+y{i};
end;
keep w p s;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Try below code:
%let name_list =aa bb cc dd;
%macro loop();
data xxx2;
set xxx;
w = 0;
s = 0;
p = 0;
%let i=1;
%do %while (%scan(&name_list, &i) ne );
%let next_name = %scan(&name_list, &i);
if substr(&next_name.,1,1) = 'w' then w = w + c&i.;
else if substr(&next_name.,1,1) = 's' then s = s + c&i.;
else if substr(&next_name.,1,1) = 'p' then p = p + c&i.;
%let i = %eval(&i + 1);
%end;
run;
%mend;
%loop();
If the variables your using to identify each material are easy to loop through (i.e. mat1,mat2,mat3.......) then the above can be simplified.
Regards,
Andy
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
data xxx;
input aa$ bb$ cc$ dd$ c1 c2 c3 c4;
cards;
w1 p2 s4 w3 10 45 14 5
s1 s3 w2 p4 20 14 15 84
p2 w4 w5 w4 12 18 45 7
s1 w3 p5 s2 23 56 87 44
;
run;
data want;
set xxx;
w=0;p=0;s=0;
array x{*} $ aa bb cc dd;
array y{*} c1-c4;
do i=1 to dim(x);
if x{i}=: 'w' then w+y{i};
else if x{i}=: 'p' then p+y{i};
else if x{i}=: 's' then s+y{i};
end;
keep w p s;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Note that in the rows 2-3 of your proposed output the values of W and S are swapped. Assuming that it's a typo, you've already been offered a solution. One problem with it is that the output variables W, S, P are hard coded. It's fine when you know that is all you have. But in the real world with the data structured like yours - when the information about output metadata is embedded into the data, the aggregate variable names cannot be determined without a preliminary pass through the input: Doing otherwise would mean making assumption about the data.
One way of avoiding this rather precarious practice is to analyze the input data set first by reading it in full, find out what the distinct prefixes in the values of the variables aa--dd are and then hard code them into a conditional structure, such as if-then-else. Another, more dynamic, approach is to let SAS do the job as shown below. This way, you don't care what the prefixes are: They are discovered in the first pass, and in the ensuing aggregating pass, the compiler knows the output variable names and types automatically.
data have ;
input aa$ bb$ cc$ dd$ c1 c2 c3 c4 ;
cards ;
w1 p2 s4 w3 10 45 14 5
s1 s3 w2 p4 20 14 15 84
p2 w4 w5 w4 12 18 45 7
s1 w3 p5 s2 23 56 87 44
run ;
/* 1st pass: Find output vnames for 2nd pass */
data _null_ ;
do until (z) ;
set have end = z ;
array ch _char_ ;
length vv $ 32767 ;
do over ch ;
v = compress (ch, , "d") ;
if not findw (vv, cats (v)) then vv = catx (" ", vv, v) ;
end ;
end ;
call symputx ("vv", vv) ;
run ;
/* 2nd pass: Aggregate into var list &vv */
data want (keep = &vv) ;
set have ;
array ch _char_ ;
array nn _numeric_ ;
array vv (j) &vv ;
do over vv ; vv = 0 ;
do over ch ;
if ch =: vname (vv) then vv = sum (vv, nn) ;
end ;
end ;
run ;
The correct output will look as follows:
W P S ---------- 15 45 14 15 84 34 70 12 0 56 87 67
Paul D.