DATA Step, Macro, Functions and more

Summary

Reply
Super Contributor
Posts: 673

Summary


data test;
input id jan feb mar apr;
cards;
1 732 750 748 790
2 760   0   0   0
3 750   0  765  0
4 666   0    0  0
5 777 779  880  0
6 850   0  779  0
;
run;

How to get a summary  that tells how many id's have scores present in

All months (id 1)
only first month (id 2 and 4)
two months (id 3 and 6)
consecutive months (id 1 and 5)

Super User
Posts: 11,343

Re: Summary

If existence is important then this may be one of those times where you do not want 0 but missing values as NMISS would simplify the logic.

Is your "two months" required to start with Jan or any two months, and does it have to be EXACTLY two months?

Is your consecutive required to start with Jan? Does it count for consecutive if there are only two months?

Are these consecutive or two months?

7 850    779   0 0

8 0  850   779  0

9 0   850   779  110

Super Contributor
Posts: 673

Re: Summary

well.. it deosnt have to start in Jan or feb

Super User
Super User
Posts: 7,988

Re: Summary

IMO having dates of any type as columns is a bad idea.  First thing I would do with this data is normalise it:

ID     MONTH     RESULT

1       Jan            1234

1       Feb            234

...

Its simple then to do sums and such like across the data using SQL or means procedure.  If you absolutely have to have them as columns, then name the columns a standard with suffix approach and put other data in the label:

COL1     COL2     ...

"Jan"      "Feb"

As then you can use array processing on them.  However, with 'data' as columns you may find you have thousands of columns if you have several years data.

Trusted Advisor
Posts: 2,116

Re: Summary

It appears that "0" is not a valid score.  So the first step is to change all the "0" values to missings.

Then you can do the first and third with the N function, as

all=(N(of jan--apr)=4);

two=(N(of jan--apr)=2);

The second follows as

first=((jan>.) & (N(of jan-apr)=1)));

The last one will require some if-the-else logic or a Perl expression.  My Perl is too rusty to provide useful thoughts.

Super Contributor
Posts: 673

Re: Summary

can the values be defined under one variable?

Super User
Posts: 19,869

Re: Summary

Post an example of how you'd want that to look please

Super Contributor
Posts: 275

Re: Summary

data test;

input id jan feb mar apr;

array month Jan--Apr;

length flag $15.;

if nmiss(of month(*))=0 then flag='All';

else if nmiss(of month(*))=3 then flag='First';

else if nmiss(of month(*))=1 then do;

   if whichc(.,of month(*)) in (1,4) then flag='Consecutive';

   else flag='Two Months';

end;

else if nmiss(of month(*))=2 then flag='Two Months';

cards;

1 732 750 748 790

2 760   .   .   .

3 750   .  765  .

4 666   .    .  .

5 777 779  880  .

6 850   .  779  .

;

run;

Contributor
Posts: 27

Re: Summary


data test;
input id jan feb mar apr;
cards;
1 732 750 748 790
2 760   0   0   0
3 750   0  765  0
4 666   0    0  0
5 777 779  880  0
6 850   0  779  0
;
run;

proc transposrt data=test out=new;
by id;
run;
proc print data=new;
run;


proc sql;
select count(*) from new
where col1>0
group by id;
quit;

Ask a Question
Discussion stats
  • 8 replies
  • 328 views
  • 0 likes
  • 7 in conversation