Help using Base SAS procedures

PROC Transpose or Array?

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 7
Accepted Solution

PROC Transpose or Array?

Hello,

 

I have a long dataset that has 20,000 rows. These rows represent test scores by grade and subject for various schools. So this is what the file layout looks like:

 

SCHOOL, GRADE, SUBJECT, SCORE

 

In the 20,000 rows there are about 550 schools.

 

I want to be able to list every score combination based on grade and subject by school on the same row. So, in essence I want to have about 550 rows showing all the test score combinations for each school in wide format. I think that I could do this in a PROC Transpose statement but would there be a way to do this in an array or a macro statement?

 

Thanks!


Accepted Solutions
Solution
‎08-23-2016 12:23 PM
Super User
Posts: 17,912

Re: PROC Transpose or Array?

[ Edited ]

One key difference between an array method and transpose is the dynamic nature. 

Proc Transpose doesn't require any prior knowledge about the number of levels while an array method will need the number of variables ahead of time. 

A work around for this is to create a macro variable with the number of variables requires if this information is not easily available. 

 

To implement this workaround you can use a SQL step prior to get the number of vars required.

 

 

proc sql noprint;
create table n_subjects as
select count(subject) as n_subject
from have
group by school;
select max(n_subject) into :n_subject
from n_subject;
quit;

%put &n_subject;

data want;
set have;

array subjects(&n_subject) $ 50 subject1-subject&n_subject.;
...

run;

 

View solution in original post


All Replies
Super User
Posts: 17,912

Re: PROC Transpose or Array?

Ahhhh....why? 

 

Your data at is in a useable format now, the wide format will be cumbersome. 

What are you expecting to gain from transposing the data?

Occasional Contributor
Posts: 7

Re: PROC Transpose or Array?

We are required to provide a file where all school information is on one row. Currently one school, if they have 30 test and grade combinations would have 30 rows - of course some of these are different based on how many tests and grades a school has.

Super User
Posts: 17,912

Re: PROC Transpose or Array?

Since your transposing it you can use either an array or transpose. An array method is faster since it is one step but proc transpose is three steps, one transpose for each of subject, grade, score and then merge by school. 

The transpose is easier to program though

Occasional Contributor
Posts: 7

Re: PROC Transpose or Array?

Thank you for your response. I think I would know how to do the Transpose but as I am trying to broaden my SAS programming skills, I want to understand how I could do this in an array?

Super User
Posts: 17,912

Re: PROC Transpose or Array?

Here's a write up on the various scenarios. 

 

http://www.ats.ucla.edu/stat/sas/modules/longtowide_data.htm

Solution
‎08-23-2016 12:23 PM
Super User
Posts: 17,912

Re: PROC Transpose or Array?

[ Edited ]

One key difference between an array method and transpose is the dynamic nature. 

Proc Transpose doesn't require any prior knowledge about the number of levels while an array method will need the number of variables ahead of time. 

A work around for this is to create a macro variable with the number of variables requires if this information is not easily available. 

 

To implement this workaround you can use a SQL step prior to get the number of vars required.

 

 

proc sql noprint;
create table n_subjects as
select count(subject) as n_subject
from have
group by school;
select max(n_subject) into :n_subject
from n_subject;
quit;

%put &n_subject;

data want;
set have;

array subjects(&n_subject) $ 50 subject1-subject&n_subject.;
...

run;

 

PROC Star
Posts: 1,099

Re: PROC Transpose or Array?

On the one hand, I admire and support your desire to expand your SAS programming skills. However, I've been in your type of situation a number of times, and I am absolutely convinced that the PROC TRANSPOSE approach is easier, cleaner, and more supportable.

 

If you were working for me, I'd insist on you doing it that way, as I know that down the road I would have fewer problems. It's just an approach that naturally mirrors the nature of the data. And we're in the business to solve problems cost-effectively.

 

I guarantee, you'll have the opportunity to learn arrays some day, when THEY'RE the optimal approach!

 

Tom

Respected Advisor
Posts: 4,659

Re: PROC Transpose or Array?

Are you required to produce a dataset or a report table? If the latter, look at proc tabulate output:

 

proc tabulate data=myData format=5.2;
class school grade subject;
var score;
table school, grade*subject*mean=""*score="";
run;
PG
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 374 views
  • 4 likes
  • 4 in conversation