question about array coding

Contributor
Posts: 44

Hello,

 

I'm using SAS 9.4 full edition. I have an array LM1-LM23 in which I'd like to find the most frequent value overall (in the test data below, it's "Done"). The fields in the real data are character variables.

 

Case  LM1       LM2        LM3       LM4   ...  LM23
1     Done      .          .         .
2     Not Done  .          .         .
3     Done      Finishing  Not Done  Done
4     Done      Started    .         .
5     Done      Done       Done      Done
6     Done      Done       Done      .

 

Goal output

Done       11/15
Not Done    2/15
Finishing   1/15
Started     1/15

 

Will this array coding only pull the first instance of the field?  Someone is getting different numbers from me so I don't think this array is coded correctly.

array Contrib [23] $ LM1-LM23;   
do p=1 to 23;
   if Contribution [p] > ' ' then do;
      factor = Contribution [p];
end; end;

proc freq order=freq; tables factor; run;

 

Thanks for the help!


All Replies
Super User
Posts: 6,534

Re: question about array coding

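Your program is very close, but it's actually getting the last nonblank value in each row (not the first). Every assignment overwrites FACTOR, and the DATA step writes only one observation per case when it reaches the end of the step. To write one observation per value, you need to add this statement: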

 

output;

 

It goes right after assigning a value to FACTOR.
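
To make the difference concrete, here is a minimal sketch on the test data from the question, assuming only four LM columns and with "Not Done" capitalized the same way in every row. The data set names have and long and the array name vals are placeholders, not names from the original program. Because OUTPUT sits inside the loop, each nonblank value becomes its own observation.

```sas
/* Test data from the question, cut down to four LM columns (assumption). */
data have;
   length LM1-LM4 $ 9;              /* "Finishing" needs 9 characters */
   infile datalines dsd dlm=',' truncover;
   input Case LM1-LM4;
   datalines;
1,Done,,,
2,Not Done,,,
3,Done,Finishing,Not Done,Done
4,Done,Started,,
5,Done,Done,Done,Done
6,Done,Done,Done,
;

/* OUTPUT inside the loop writes one row per nonblank value. */
/* Without it, only the last nonblank value per case is kept. */
data long(keep=Case factor);
   set have;
   array vals[4] $ LM1-LM4;
   do p = 1 to dim(vals);
      if vals[p] > ' ' then do;
         factor = vals[p];
         output;
      end;
   end;
run;

proc freq data=long order=freq;
   tables factor;
run;
```

With these six cases, long should contain 15 observations, and PROC FREQ should list Done first with a count of 11.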

Contributor
Posts: 44

Re: question about array coding

Posted in reply to Astounding

That's very helpful - thanks.

 

I tried naming the output (code below) but it doesn't come up as a separate data file. Does it result in a temporary file "contributing" that I can reference in future procs?

 

array ContribFactor [23] $ CF1-CF23;   
do p=1 to 23;
   if ContribFactor [p] > ' ' then do;
      factor = ContribFactor [p];
	  output=contributing;
end; end;
Super User
Posts: 6,534

Re: question about array coding

This code doesn't stand alone.  It needs to be part of a DATA step.

 

The name on the DATA statement is the name of the SAS data set you are creating.

 

The name on the SET statement is the name of the SAS data set you are reading in.

 

Following the SET statement, add the rest of the code.

 

Following that, add a RUN statement.  (Don't forget to follow that with PROC FREQ.)

 

Finally, the OUTPUT statement stands alone, as a one-word statement:

 

output;

Contributor
Posts: 44

Re: question about array coding

Posted in reply to Astounding

Yes - I have the array code within a data step with run at the end, along with a proc freq after that. Is there a way to make whatever output this creates into a datafile on its own and not the original finalfinalfile?  I want to keep the original file wide but make a new file that is long to answer one particular question. When I tried changing the set name to a new name, it gave me an error that said that file doesn't exist.

 

Thanks

Laura

 

data finalfinalfile; set finalfinalfile;
array ContribFactor [23] $ CF1-CF23;   
do p=1 to 23;
   if ContribFactor [p] > ' ' then do;
      factor = ContribFactor [p];
	  output;
end; end;
run;

proc freq order=freq; tables factor / missing; run;
Solution
01-28-2018 04:16 PM
Super User
Posts: 7,845

Re: question about array coding

"When I tried changing the set name to a new name, it gave me an error that said that file doesn't exist."

The SET statement is where you tell SAS which data set to read in, so of course that data set needs to exist already.

If you want to make a new data set, put the new name on the DATA statement, since that is the statement that names the data set(s) you are creating.

 

Super User
Posts: 6,534

Re: question about array coding

The DATA statement names the SAS data set you are creating.  So the DATA statement (not the SET statement) requires a new name, to hold the long version instead of the wide version.

 

However ... once you ran this step, it changed the wide data set, replacing it.  Do you still have access to the wide version in its original form?
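
A minimal sketch of that naming rule, assuming a small stand-in for finalfinalfile with three CF columns instead of 23 and a Case identifier (both made up for illustration). The output name contributing is borrowed from the earlier attempt in this thread. The new name goes on DATA, the existing name stays on SET, so the wide file is only read and never replaced.

```sas
/* Hypothetical stand-in for the wide file: three CF columns instead of 23. */
data finalfinalfile;
   length CF1-CF3 $ 9;
   infile datalines dsd dlm=',' truncover;
   input Case CF1-CF3;
   datalines;
1,Done,,
2,Done,Started,
3,Done,Finishing,Not Done
;

/* New name on DATA, existing name on SET: finalfinalfile is read, not rewritten. */
data contributing(keep=Case factor);
   set finalfinalfile;
   array ContribFactor[3] $ CF1-CF3;
   do p = 1 to dim(ContribFactor);
      if ContribFactor[p] > ' ' then do;
         factor = ContribFactor[p];
         output;
      end;
   end;
run;

/* Name the data set explicitly instead of relying on "most recently created". */
proc freq data=contributing order=freq;
   tables factor;
run;
```

Since contributing is a one-level name, it should land in the temporary WORK library by default and can be referenced by later procs in the same session, while finalfinalfile stays wide.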

PROC Star
Posts: 8,104

Re: question about array coding

@Astounding already answered your question. What you are trying to do is make a wide file long, and then analyze the resulting file.

 

A group of us just wrote a paper and macro that does just that, and you may find it useful if you have to do such tasks in the future. The macro and paper (still in draft, but tested) can be found at: http://www.sascommunity.org/wiki/An_Easier_and_Faster_Way_to_Untranspose_a_Wide_File

 

After running the macro, the following code would accomplish the task:

%untranspose(data=have, out=need (rename=(LM=factor)), var=LM, by=case,id=testnum)

proc freq order=freq;
  tables factor;
run;

Art, CEO, AnalystFinder.com
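
For anyone who would rather not install a macro, the built-in PROC TRANSPOSE can also turn the wide file long. This is a different technique from the %untranspose macro, sketched here assuming the cut-down test data with four LM columns, already sorted by Case. The data set names have and need are placeholders. COL1 is the default name PROC TRANSPOSE gives the transposed column when each BY group has a single observation.

```sas
/* Test data from the question, cut down to four LM columns (assumption). */
data have;
   length LM1-LM4 $ 9;
   infile datalines dsd dlm=',' truncover;
   input Case LM1-LM4;
   datalines;
1,Done,,,
2,Not Done,,,
3,Done,Finishing,Not Done,Done
4,Done,Started,,
5,Done,Done,Done,Done
6,Done,Done,Done,
;

/* One output row per Case and LM variable; blank values are filtered out. */
/* The VAR statement is required because the LM variables are character.   */
proc transpose data=have out=need(where=(col1 > ' '));
   by Case;
   var LM1-LM4;
run;

proc freq data=need order=freq;
   tables col1;
run;
```

The _NAME_ column in need records which LM variable each value came from, which can help when checking counts against someone else's numbers.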

 

Valued Guide
Posts: 653

Re: question about array coding

As you finalize your code, be sure to make the name of the array consistent.

Super User
Posts: 10,610

Re: question about array coding

This is really easy in IML.

 

data have;
length LM1-LM4 $ 9;  /* "Finishing" has 9 characters; the default length of 8 would truncate it */
input Case  LM1     & $       LM2   $       LM3    & $ LM4 $;   
cards;
1        Done           .                 .               .
2        Not Done    .                  .               .
3        Done          Finishing    Not Done  Done
4        Done          Started           .              .
5        Done          Done         Done        Done
6        Done          Done          Done       .
;

proc iml;
use have(keep=lm:);
read all var _char_ into x;
close;
levels=t(setdif(unique(x),' ')); 
percent=j(nrow(levels),1);
total=sum(x^=' ');
do i=1 to nrow(levels);
  percent[i]=sum(x=levels[i])/total;
end;

create want var{levels percent};
append;
close;
quit;
Super User
Posts: 10,610

Re: question about array coding

Actually, it could be even simpler.

 

data have;
length LM1-LM4 $ 9;  /* "Finishing" has 9 characters; the default length of 8 would truncate it */
input Case  LM1     & $       LM2   $       LM3    & $ LM4 $;   
cards;
1        Done           .                 .               .
2        Not Done    .                  .               .
3        Done          Finishing    Not Done  Done
4        Done          Started           .              .
5        Done          Done         Done        Done
6        Done          Done          Done       .
;

proc iml;
use have;
read all var _char_ into x;
close;
call tabulate(level,freq,x);
percent=freq/countn(x);
level=t(level);
create want var {level percent};
append;
close;
quit;
☑ This topic is solved.
