BookmarkSubscribeRSS Feed
daisy6
Quartz | Level 8

Hi SAS experts,

 

I need to import the data file from survey monkey into SAS. The first two rows of the survey monkey data excel file are the headers which are more than 1000 columns. The most first row headers are the questions such as what is your highest education? How important are these mathematics knowledge topics for you to take a CFA Exam? How important are these probability and statistics knowledge topics for you to ake CFA Exam? Rate them from 0 (not important) to 4 (very important)? The belows are the first two row headers I transposed.

 

For example:

col1                                                                                                            Col2

What is your highest education degree attained?                                     Reponse

1. MathematicsHow importantare these mathematics                              1.A Analytic geometry

knowledge topics for you to take a CFA Exam?

                                                                                                                  1.B Roots of equations

                                                                                                                  1.C Calculus

                                                                                                                  1.D Differential equations

                                                                                                                   ........

                                                                                                                  1.H Linear algebra

What important topics in mathematics, if any, were                                   Topic 1

not included but should have been?

                                                                                                                   Topic 2

                                                                                                                   Topic 3

..........................

I need the combination of two columns value and set variable names to calculate the survey results. Any suggestions are really appreciated.

 

                                                                    

3 REPLIES 3
Reeza
Super User

1. Read the file from the second row which has good names. You can use GETNAMES or RANGE (if XLSX) to specify this.

2. Read the first and second row as its own data

3. Transpose the data from #2

4. Use that to create LABELS for your data, from a data set as specified here:

https://gist.github.com/statgeek/f18931085f6a0009185c

 

5.I eventually transpose my data so that the questions go down the rows and the responses are a single column. This really helps me build summary tables very quickly using PROC MEANS and PROC FREQ. 

 

data have;
input ID QUestion Answer;
cards;
1 1 2
1 2 3
1 3 1
2 1 3
2 2 4
2 3 5
...

;
run;

proc means data=have n mean std median stackods;
class question;
var answer;
ods output summary=want;
run;
ballardw
Super User

Pay close attention to the columns if you want to use proc import to bring the data into SAS. Survey monkey sometimes duplicates column headings, especially if you only use one of the header rows.

 

Generally I convert the files to CSV where I can use a data step to import the data. I would tend to use Proc import to generate a basic data step that I could modify. Which will take time for 1000 variables.

 

I have been known to copy the column headings out of the spreadsheet and paste transposed into a separate sheet. That sheet would let me assign nicer variable names and then use the remaining text to make variable labels and possibly informat/format statements.

 

I have attached (hopefully) a brief example where I add a much shorter variable name and some formula cells to create SAS syntax.

The columns labeled informat input and label can, if you add a row at the end of your data with a ; in each cell, can be copied, one column at a time and pasted into a data step to read the data. The informat should come before the input.

You can see where you can drag formulae down and the assigned variable name would create lists of common variables such as Q1_1 Q1_2. I do not use A B or C as suffixes as they are much harder to work with in SAS. The numeric suffixes work well for array processing. It would be your choice whether to use Q01_01 or Q1_1 names for sorting purposes. A "missing topic" question system for your example might name the variables Q1m_1 (or Q01m_01). The underscores may be very important if you have enough questions so that you can tell the difference between Q2_11 and Q21_1. The naming conventions allowing either Array processing or variable lists Q1m: being shorthand for all the Q1 missing topics.

 

You could make a LONGER label using both Row1 and Row2 text but since the Row 1 is your original full question text I have seen the length exceed the number of characters a SAS label will hold. Which is a secondary reason for the naming convention. I could use a title with the row1 text and then run proc freq for just the Q1_: variables.

daisy6
Quartz | Level 8

Thank you very much for answering my question. I will try your way once I have time to do it. 

sas-innovate-2024.png

Today is the last day to save with the early bird rate! Register today for just $695 - $100 off the standard rate.

 

Plus, pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 2516 views
  • 2 likes
  • 3 in conversation