04-30-2018 04:26 PM
Hi SAS experts,
I need to import the data file from survey monkey into SAS. The first two rows of the survey monkey data excel file are the headers which are more than 1000 columns. The most first row headers are the questions such as what is your highest education? How important are these mathematics knowledge topics for you to take a CFA Exam? How important are these probability and statistics knowledge topics for you to ake CFA Exam? Rate them from 0 (not important) to 4 (very important)? The belows are the first two row headers I transposed.
What is your highest education degree attained? Reponse
1. MathematicsHow importantare these mathematics 1.A Analytic geometry
knowledge topics for you to take a CFA Exam?
1.B Roots of equations
1.D Differential equations
1.H Linear algebra
What important topics in mathematics, if any, were Topic 1
not included but should have been?
I need the combination of two columns value and set variable names to calculate the survey results. Any suggestions are really appreciated.
04-30-2018 04:53 PM
1. Read the file from the second row which has good names. You can use GETNAMES or RANGE (if XLSX) to specify this.
2. Read the first and second row as its own data
3. Transpose the data from #2
4. Use that to create LABELS for your data, from a data set as specified here:
5.I eventually transpose my data so that the questions go down the rows and the responses are a single column. This really helps me build summary tables very quickly using PROC MEANS and PROC FREQ.
data have; input ID QUestion Answer; cards; 1 1 2 1 2 3 1 3 1 2 1 3 2 2 4 2 3 5 ... ; run; proc means data=have n mean std median stackods; class question; var answer; ods output summary=want; run;
04-30-2018 06:15 PM
Pay close attention to the columns if you want to use proc import to bring the data into SAS. Survey monkey sometimes duplicates column headings, especially if you only use one of the header rows.
Generally I convert the files to CSV where I can use a data step to import the data. I would tend to use Proc import to generate a basic data step that I could modify. Which will take time for 1000 variables.
I have been known to copy the column headings out of the spreadsheet and paste transposed into a separate sheet. That sheet would let me assign nicer variable names and then use the remaining text to make variable labels and possibly informat/format statements.
I have attached (hopefully) a brief example where I add a much shorter variable name and some formula cells to create SAS syntax.
The columns labeled informat input and label can, if you add a row at the end of your data with a ; in each cell, can be copied, one column at a time and pasted into a data step to read the data. The informat should come before the input.
You can see where you can drag formulae down and the assigned variable name would create lists of common variables such as Q1_1 Q1_2. I do not use A B or C as suffixes as they are much harder to work with in SAS. The numeric suffixes work well for array processing. It would be your choice whether to use Q01_01 or Q1_1 names for sorting purposes. A "missing topic" question system for your example might name the variables Q1m_1 (or Q01m_01). The underscores may be very important if you have enough questions so that you can tell the difference between Q2_11 and Q21_1. The naming conventions allowing either Array processing or variable lists Q1m: being shorthand for all the Q1 missing topics.
You could make a LONGER label using both Row1 and Row2 text but since the Row 1 is your original full question text I have seen the length exceed the number of characters a SAS label will hold. Which is a secondary reason for the naming convention. I could use a title with the row1 text and then run proc freq for just the Q1_: variables.