- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I have the following data file
For each variable a, b, c I have 3 period (p1 p2 p3) reporting the value.
data have;
input time p1a p2a p3a p1b p2b p3b p1c p2c p3c;
datalines;
1 1 0 -1 6 9 5 3 -9 6
2 2 3 6 0 0 0 -2 -2 -9
;
for each row, I want to count the number of positive period for A, B, C seperately.
The summary column for a , b, c is:
1 3 2
3 0 0
In my data, there are 100 of variables, so I cannot array 1 by 1.
Thanks for your help.
HHC
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Using arrays, the key to making this a short program is whether you can abbreviate the list of variables in the array. For example, would this ARRAY statement define all 78 variables in the proper order?
array all {78} p1a -- p3z;
If so, the task becomes straightforward:
data want;
set have;
array all {78} p1a -- p3z;
array sums {26} sum1-sum26;
do _n_=1 to 26;
sums{_n_}=0;
end;
do _n_=1 to 78;
if all{_n_} > 0 then sums{ceil(_n_/3)} + 1;
end;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
I don't have access to SAS at the moment, but something like:
data want; set have; array dat{9} p1a--p3c; array res{9} 1; do i=1 to 9; if dat{i} > 0 then res{i}=1; end; a=sum(res1-res3); b=sum(res4-res6); c=sum(res7-res9); run;
or:
data want; set have; a=sum(ifn(p1a>0,1,0),ifn(p2a>0,1,0),ifn(p3a>0,1,0)); b=sum(ifn(p1b>0,1,0),ifn(p2b>0,1,0),ifn(p3b>0,1,0)); c=sum(ifn(p1c>0,1,0),ifn(p2c>0,1,0),ifn(p3c>0,1,0)); run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I would question the decision to name the variables p1a p2a and p3a if I knew that I was going to process them as a group. Names of ap1 ap2 and ap3 would then work better. You may need to provide more examples of the "100 of variables" involved as a solution for your example may not work with longer or variable length names or suffixes.
One approach is to create variables with the suffix as a prefix so that arrays are possible. This will do that but if the actual variables have varying length names and suffixes it will not work.
data have;
input time p1a p2a p3a p1b p2b p3b p1c p2c p3c;
datalines;
1 1 0 -1 6 9 5 3 -9 6
2 2 3 6 0 0 0 -2 -2 -9
;
run;
proc transpose data=have out=trans;
run;
data temp;
set trans;
if _name_ ne 'time' then _name_ = cats(substr(_name_,3),_name_);
run;
proc transpose data=temp name=_name_ out=havetrans (drop=_name_);
run;
One suspects that a dataset with "hundreds" of similarly named variables may be the wrong structure in general and would be better served with a Time Group (values of A B C in your example) P1 P2 P3 structure in general.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Oh, I didn't see the 100's of variables. I agree totally, why have hundreds of variables. Doesn't make any sense. Any processing you do on that type of strcuture is going to be far harder. Normalise your data, process it using a simple structure, then at teh end transpose it.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi @hhchenfx,
How about this?
proc transpose data=have out=trans;
by time;
run;
data trans;
do _n_=1 to 3;
set trans;
_name_=substr(_name_,3);
npos=sum(npos,col1>0);
end;
drop col1;
run;
proc transpose data=trans out=want(drop=_name_);
by time;
run;
Edit: Slightly simplified the DOW loop by using an iterative DO statement rather than a DO UNTIL statement and a counter variable.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Using arrays, the key to making this a short program is whether you can abbreviate the list of variables in the array. For example, would this ARRAY statement define all 78 variables in the proper order?
array all {78} p1a -- p3z;
If so, the task becomes straightforward:
data want;
set have;
array all {78} p1a -- p3z;
array sums {26} sum1-sum26;
do _n_=1 to 26;
sums{_n_}=0;
end;
do _n_=1 to 78;
if all{_n_} > 0 then sums{ceil(_n_/3)} + 1;
end;
run;