Hello,
I'm using SAS 9.4 full edition. I have an array LM1-LM23 in which I'd like to find the most frequent value overall (in the test data below, it's "Done"). The fields in the real data are character variables.
Case LM1 LM2 LM3 LM4....LM23
1 Done . . .
2 Not done . . .
3 Done Finishing Not Done Done
4 Done Started . .
5 Done Done Done Done
6 Done Done Done .
Goal output
Done 11/15
Not Done 2/15
Finishing 1/15
Started 1/15
Will this array coding only pull the first instance of the field? Someone is getting different numbers from me so I don't think this array is coded correctly.
array Contrib [23] $ LM1-LM23;
do p=1 to 23;
if Contribution [p] > ' ' then do;
factor = Contribution [p];
end; end;
proc freq order=freq; tables factor; run;
Thanks for the help!
While your program is very close, it's actually getting the last value (not the first). You need to add this statement:
output;
It goes right after assigning a value to FACTOR.
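To see why the missing OUTPUT matters, here is a minimal sketch on a single made-up case. The data set names (one_case, last_only, every_value) and the array name lm_vals are placeholders I made up, and I'm using only four LM variables instead of 23.

```sas
/* One hypothetical case with four non-blank values */
data one_case;
  length LM1-LM4 $ 10;
  LM1 = 'Done'; LM2 = 'Finishing'; LM3 = 'Not Done'; LM4 = 'Done';
run;

/* No OUTPUT statement: SAS writes one row per input row at the end  */
/* of the step, so FACTOR holds whatever was assigned last (LM4).    */
data last_only;
  set one_case;
  array lm_vals [4] $ LM1-LM4;
  do p = 1 to dim(lm_vals);
    if lm_vals[p] > ' ' then factor = lm_vals[p];
  end;
run;

/* Explicit OUTPUT inside the loop: one row per non-blank value.     */
data every_value;
  set one_case;
  array lm_vals [4] $ LM1-LM4;
  do p = 1 to dim(lm_vals);
    if lm_vals[p] > ' ' then do;
      factor = lm_vals[p];
      output;
    end;
  end;
run;
```

LAST_ONLY ends up with 1 row and EVERY_VALUE with 4 rows, which is the difference PROC FREQ would then be counting.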
That's very helpful - thanks.
I tried naming the output (code below) but it doesn't come up as a separate data file. Does it result in a temporary file "contributing" that I can reference in future procs?
array ContribFactor [23] $ CF1-CF23;
do p=1 to 23;
if ContribFactor [p] > ' ' then do;
factor = ContribFactor [p];
output=contributing;
end; end;
This code doesn't stand alone. It needs to be part of a DATA step.
The name on the DATA statement is the name of the SAS data set you are creating.
The name on the SET statement is the name of the SAS data set you are reading in.
Following the SET statement, add the rest of the code.
Following that, add a RUN statement. (Don't forget to follow that with PROC FREQ.)
Finally, the OUTPUT statement stands alone, as a one-word statement:
output;
Yes - I have the array code within a data step with run at the end, along with a proc freq after that. Is there a way to make whatever output this creates into a datafile on its own and not the original finalfinalfile? I want to keep the original file wide but make a new file that is long to answer one particular question. When I tried changing the set name to a new name, it gave me an error that said that file doesn't exist.
Thanks
Laura
data finalfinalfile; set finalfinalfile;
array ContribFactor [23] $ CF1-CF23;
do p=1 to 23;
if ContribFactor [p] > ' ' then do;
factor = ContribFactor [p];
output;
end; end;
run;
proc freq order=freq; tables factor / missing; run;
When I tried changing the set name to a new name, it gave me an error that said that file doesn't exist.
The SET statement is where you tell what data to read in. So of course it needs to exist.
If you want to make a new data set then put the new name on the DATA statement as that is the one that defines the dataset(s) that you are creating.
The DATA statement names the SAS data set you are creating. So the DATA statement (not the SET statement) requires a new name, to hold the long version instead of the wide version.
However ... once you ran this step, it changed the wide data set, replacing it. Do you still have access to the wide version in its original form?
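Putting those pieces together, a sketch of the whole thing might look like the following. The names WIDE and LONG and the array name lm_vals are assumptions for illustration (substitute your own data set and CF1-CF23), and the test data is the four-column sample from the first post, typed as comma-delimited so the embedded blank in "Not Done" is unambiguous.

```sas
/* Hypothetical wide test data, one row per case */
data wide;
  infile datalines dsd truncover;
  length LM1-LM4 $ 10;
  input Case LM1 $ LM2 $ LM3 $ LM4 $;
  datalines;
1,Done,,,
2,Not Done,,,
3,Done,Finishing,Not Done,Done
4,Done,Started,,
5,Done,Done,Done,Done
6,Done,Done,Done,
;

/* New name on DATA, existing name on SET: WIDE is read, not replaced */
data long;
  set wide;
  array lm_vals [4] $ LM1-LM4;
  do p = 1 to dim(lm_vals);
    if lm_vals[p] > ' ' then do;
      factor = lm_vals[p];
      output;            /* one row per non-blank value */
    end;
  end;
  keep Case factor;      /* the long file only needs these two */
run;

proc freq data=long order=freq;
  tables factor;
run;
```

With this sample, LONG has 15 rows (Done 11, Not Done 2, Finishing 1, Started 1), and WIDE still has its original 6 rows afterwards. Naming the data set on PROC FREQ (data=long) also avoids depending on whichever data set happened to be created last.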
@Astounding already answered your question. What you are trying to do is make a wide file long, and then analyze the resulting file.
A group of us just wrote a paper and macro that does just that, and you may find it useful if you have to do such tasks in the future. The macro and paper are still in draft (but tested) and can be found at: http://www.sascommunity.org/wiki/An_Easier_and_Faster_Way_to_Untranspose_a_Wide_File
After running the macro, the following code would accomplish the task:
%untranspose(data=have, out=need (rename=(LM=factor)), var=LM, by=case, id=testnum)
proc freq order=freq; tables factor; run;
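If you don't have the macro downloaded, a similar wide-to-long reshape can be sketched with plain PROC TRANSPOSE. To be clear, this is not the %untranspose macro, just a base SAS stand-in. It assumes the data are sorted by Case with one row per case; the two test rows are made up from the sample in the first post.

```sas
/* Two hypothetical cases, comma-delimited so "Not Done" keeps its blank */
data have;
  infile datalines dsd truncover;
  length LM1-LM4 $ 10;
  input Case LM1 $ LM2 $ LM3 $ LM4 $;
  datalines;
3,Done,Finishing,Not Done,Done
4,Done,Started,,
;

/* One output row per case per LM variable; the value lands in COL1, */
/* which is renamed to FACTOR on the way out.                        */
proc transpose data=have out=need(rename=(col1=factor));
  by Case;
  var LM1-LM4;
run;

/* Skip the blank cells when counting */
proc freq data=need order=freq;
  where factor > ' ';
  tables factor;
run;
```

NEED keeps the blank cells as rows (8 rows here), so the WHERE statement does the filtering that the IF test did in the DATA step version.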
Art, CEO, AnalystFinder.com
As you finalize your code, be sure to make the name of the array consistent.
This is really easy with SAS/IML.
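data have;
/* default length for list input is 8, which would truncate "Finishing" */
length LM1-LM4 $ 10;
input Case LM1 & $ LM2 $ LM3 & $ LM4 $;
/* the & modifier needs two blanks after each LM1 and LM3 value */
cards;
1 Done  . .  .
2 Not Done  . .  .
3 Done  Finishing Not Done  Done
4 Done  Started .  .
5 Done  Done Done  Done
6 Done  Done Done  .
;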
proc iml;
use have(keep=lm:);
read all var _char_ into x;
close;
levels=t(setdif(unique(x),' '));
percent=j(nrow(levels),1);
total=sum(x^=' ');
do i=1 to nrow(levels);
percent[i]=sum(x=levels[i])/total;
end;
create want var{levels percent};
append;
close;
quit;
Actually, it could be even simpler.
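data have;
/* default length for list input is 8, which would truncate "Finishing" */
length LM1-LM4 $ 10;
input Case LM1 & $ LM2 $ LM3 & $ LM4 $;
/* the & modifier needs two blanks after each LM1 and LM3 value */
cards;
1 Done  . .  .
2 Not Done  . .  .
3 Done  Finishing Not Done  Done
4 Done  Started .  .
5 Done  Done Done  Done
6 Done  Done Done  .
;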
proc iml;
use have;
read all var _char_ into x;
close;
call tabulate(level,freq,x);
percent=freq/countn(x);
level=t(level);
create want var {level percent};
append;
close;
quit;