BookmarkSubscribeRSS Feed
Chris_C
Calcite | Level 5

 While working with an excel document I want to condense the data into one row by adding variables Q1_Hours and Q1_Percent through Q6_Hours and Q6_Percent. I am confused because the normal ways to create variables through the INPUT after the data step is not working. One section of the data looks as follows (In the excel document). Anyone have any advice or tips?

 

YearExamDayTimeTypeQ1Q2Q3Q4Q5Q6Overall
2B4MidHours0.013.37.50.00.07.528.3
    Percent0.098.638.90.00.057.3

32.5

5 REPLIES 5
r_behata
Barite | Level 11

Hi @Chris_C what is the format of your input file (xlsx or csv?) . If it is in excel native format you may have to use Proc Import to create a sas dataset and use the datastep for further processing.

 

Post the code that that you have already tried .

Chris_C
Calcite | Level 5

The file is .xlsx and here is the code I am using @r_behata.

 

PROC IMPORT OUT= WORK.ExamBDay4_0
            DATAFILE= "\\Client\D$\SASDATA\Year_2_Exam_B_Day_4_0-morning.xlsx"
            DBMS=EXCELCS REPLACE;
     RANGE="Sheet1$";
RUN;
DATA ExamBDay4_0;

SET ExamBDay4_0;
INPUT Q1_Hours Q1 Percent Q2_Hours Q2_Percent Q3_Hours Q3_Percent Q4_Hours Q4_Percent Q5_Hours Q5_Percent Q6_Hours Q6_Percent Overall_Hours Overall_Percent;
RUN;
PROC PRINT DATA=ExamBDay4_0;
RUN;QUIT;

Tom
Super User Tom
Super User

The INPUT statement is for reading values from text (file or in-line data lines).  So it will not be of any help in this case since you do not have any text to read. The PROC IMPORT step will have already created a dataset for you to use.

 

To change/create variables just use assignment statements.

 

Show an example of what the data that PROC IMPORT created looks like.

Your picture of the layout of the Excel makes it look like you do not have key values on all observations so you might first need to fill those in before attempting to transpose the data.

 

Also why do you want to put the data into such a wide format? It probably will be easier to work with if you created a tall format instead.

Something like:

data want ;
  input Year Exam $ Day Time $ Period $ Hours Percent ;
cards;
2 B 4 Mid Q1 0 0
2 B 4 Mid Q2 13.3 98.6
...
;

 

Chris_C
Calcite | Level 5

The Data is not mine I am just doing some volunteer work to help an old high school teacher out so I am hesitant to alter the data that was provided to me. There are multiple excel sheets for classes so I planned on stacking them together and the easiest way to read would be if they were in the wide rows.   @Tom

 

Obs Year Exam Day Time Type   Q1 Q2   Q3  Q4  Q5  Q6  Overall 
1 .   .               Percent 0  29.7 0    0   0  0    4.9 
2     2  B   2    Mid   Hours 0  5.8  0    0   0  0    5.8 

 

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Thats typical garbage Excel data unfortunately.  You will need to do some post processing to get it into a working state.  So you have this:

proc import out=work.exambday4_0
            datafile="\\Client\D$\SASDATA\Year_2_Exam_B_Day_4_0-morning.xlsx"
            bdms=excelcs replace;
     range="Sheet1$";
run;

And note how I use the code window - its the {i} above post area - and don't code all in shouting.

To get the data you show there.  First you need to clear up the missing year/exam/day/time data (done by coalesce() function below) and merge the two rows so you have one row.

proc sql;
  create table want as 
  select coalesce(a.year,b.year) as year,
         coalesce(a.exam,b.exam) as exam,
         coalesce(a.day,b.day) as day,
         coalesce(a.time,b.time) as time,
         a.q1 as p_q1,
         a.q2 as p_q2,
         a.q3 as p_q3,
         a.q4 as p_q4,
         a.q5 as p_q5,
         a.q6 as p_q6,
         a.overall as overall_p,
         b.q1 as h_q1,
         b.q2 as h_q2,
         b.q3 as h_q3,
         b.q4 as h_q4,
         b.q5 as h_q5,
         b.q6 as h_q6,
         b.overall as overall_h
  from   (select * from have where type="Percent") a
  full join (select * from have where type="Hours") b
  on     1=1;
quit;

You can then do further processsing on this data, such as stacking all sheets (with which you have done the above) and summing something like:

data want;
  set want;
  hours=sum(of h_:);
perc=sum(of p_:);
run;

Take it as a good example of why not to use Excel for data storage! 

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 5319 views
  • 0 likes
  • 4 in conversation