BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
hkim3677
Calcite | Level 5

Hi masters!

 

I wonder how I can reformat my datafile.

 

My current format is like:

Year        Student1     Student2    

2015          1                     0             

2016          0                     1        

2017          1                     1      

 

My wanted format is like:

Student       Year      Var         

Student1     2015      1 

Student1     2016      0

Student1     2017      1

Student2     2015      0

Student2     2016      1

Student2     2017      1

 

Thank you for your help Masters!

 

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

One way.

Best is to provide example data in the form of a data step as shown below. This is critical because if for some reason Student1 has numeric values and Student2 has character values this will not work and a different approach would be needed.

data have;
   input Year  Student1 Student2;    
datalines;
2015          1                     0             
2016          0                     1        
2017          1                     1      
;
/* may need to sort have by YEAR if not actually so*/
proc transpose data=have out=want(rename=(col1=var))
   name=Student ;
   by year;
   var student: ;
run;

Proc sort data=want;
   by student year;
run;

Proc Transpose is designed for making columns into rows or rows into columns. Simple data like this is pretty easy. BY varaible(s) would be any group that keeps like values together but the data does need to be sorted by those.

The VAR statement lists specific variables to transpose. In this case I used a list with a Colon to indicate all variables whose names start with "Student". I am guessing that you very likely have more than 2 this gets you out of listing all of them.

Proc Transpose will have a variable by default on each row with the name of the transposed variable from the VAR statement. The Name= procedure option allows you to rename that variable. The default behavior is to create variables named Col1, Col2, Col3 etc. if needed. This would happen if Student1 had more than one value per year. Here I use a data set option on the set written by the procedure to rename the Col1 to Var.

Then sort as the order would be different than desired.

View solution in original post

5 REPLIES 5
sbxkoenk
SAS Super FREQ

"Thousands" of possibilities of course.
The below one is probably not the most performant one, but a good and clear one.

data have;
input Year Student1 Student2;
cards;
2015 1 0
2016 0 1
2017 1 1
;
run;

data Student1(rename=(Student1=Var));
 set have(keep=Year Student1);
 Student="Student1";
run;
data Student2(rename=(Student2=Var));
 set have(keep=Year Student2);
 Student="Student2";
run;

data want;
 set Student: ;
run;
/* end of program */

Regards,

Koen

hkim3677
Calcite | Level 5
Thank you Koen for your suggestion!
ballardw
Super User

One way.

Best is to provide example data in the form of a data step as shown below. This is critical because if for some reason Student1 has numeric values and Student2 has character values this will not work and a different approach would be needed.

data have;
   input Year  Student1 Student2;    
datalines;
2015          1                     0             
2016          0                     1        
2017          1                     1      
;
/* may need to sort have by YEAR if not actually so*/
proc transpose data=have out=want(rename=(col1=var))
   name=Student ;
   by year;
   var student: ;
run;

Proc sort data=want;
   by student year;
run;

Proc Transpose is designed for making columns into rows or rows into columns. Simple data like this is pretty easy. BY varaible(s) would be any group that keeps like values together but the data does need to be sorted by those.

The VAR statement lists specific variables to transpose. In this case I used a list with a Colon to indicate all variables whose names start with "Student". I am guessing that you very likely have more than 2 this gets you out of listing all of them.

Proc Transpose will have a variable by default on each row with the name of the transposed variable from the VAR statement. The Name= procedure option allows you to rename that variable. The default behavior is to create variables named Col1, Col2, Col3 etc. if needed. This would happen if Student1 had more than one value per year. Here I use a data set option on the set written by the procedure to rename the Col1 to Var.

Then sort as the order would be different than desired.

hkim3677
Calcite | Level 5
Thank you! I applied this code to my original data file. and it worked out!

Thank you ballardw!
mkeintz
PROC Star

I don't think you

 

/* may need to sort have by YEAR if not actually so*/

Just use

   by year notsorted;

in the TRANSPOSE procedue.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 672 views
  • 0 likes
  • 4 in conversation