I've got data in excel files that looks like this:
_recordId | QID122_TEXT | QID124_TEXT | ||
Response ID | Number of Responses | Q122 - What is the name of this item? | Q124 - In the space below, list as many creative uses as you can think of for this... | Category |
R_3imYisBaKj1DltD | 1 | bar | stretching | 4 |
R_2q7PvEtAwpPZf9F | 2 | barre | ballet | 2 |
clothes | 0 | |||
R_3QVKnlbMIE7CHZO | 4 | a bar | to hold while squatting | 1 |
to balance your leg | 1 | |||
to steady your balance | 1 | |||
to use as a prop | 9 | |||
R_eRGndjYN0wEIFkR | 3 | barre | horse tie station | 10 |
clothes rack, | 5 | |||
barricade, | 8 | |||
R_2zBKJc4IOSw1bhb | 2 | dance bar | stretching | 4 |
dancing | 2 | |||
R_1CaC6q2dVvHAtyN | 4 | balance bean | do stretches | 4 |
squats | 4 | |||
help hold balance | 1 | |||
clothes rack | 5 |
The number of responses to the stimulus is a variable.
Based on the number of responses each subject has that many rows of data.
Two sheets per excel file with the same subjects responding to different stimuli.
What is the easiest way to get this data from excel to SAS?
SAS 9.4
What is the result supposed to look like? Does each Excel file represent a different topic? Does each only have the response identification, number of responses, two text fields and a category or do they change content layout between files?
Are you planning on combining this data together? Are the response IDs the same across the data sets?
If yes to all of those then it may be possible to generate a similar program to read each of those files but the first step is going to be to save them as CSV files so you control reading them. The Number of responses and categores may need a different variable name in each to differentiate when combined. You will also want to identify the longest (most characters used) in each of the columns response id, and the two text columns. The second row would provide LABELS for the variables.
The generic plan would be to read all of the files and then merge or join on the response ID field.
data QID122; infile "path/qid122Page1.csv" dlm=',' lrecl=500 obs=3; informat _recordid $25. QID122_Num best5. QID122_text $50. /* the 50 and 100 are "guesses" as to possible maximum lengths*/ QID124_text $100. /* for these to values, set as needed */ QID122_Cat best5. ; input _recordid QID122_Num QID122_text QID124_text QID122_Cat ; label _recordid ='Response Id' QID122_Num ="Number of Q122 Responses" QID122_text ="Q122 - What is the name of this item?" QID124_text ="Q124 - In the space below, list as many creative uses as you can think of for this..." QID122_Cat ="Q122 category" ; run;
Replace the dataset name, infile name to reflect the saved CSV file and various 122 and 124 pieces as needed to read each file. This is the part that is simplified by identifying the longest value across the files.
One way to combine the data (if the response Ids are matching between the sheets):
Proc sort data=QID122; by responseId; run; /* repeat sort for each data set*/ data combined; merge QID122 QID125 /* place all of the data set names here*/ ; by responseid; run;
What is the result supposed to look like? Does each Excel file represent a different topic? Does each only have the response identification, number of responses, two text fields and a category or do they change content layout between files?
Are you planning on combining this data together? Are the response IDs the same across the data sets?
If yes to all of those then it may be possible to generate a similar program to read each of those files but the first step is going to be to save them as CSV files so you control reading them. The Number of responses and categores may need a different variable name in each to differentiate when combined. You will also want to identify the longest (most characters used) in each of the columns response id, and the two text columns. The second row would provide LABELS for the variables.
The generic plan would be to read all of the files and then merge or join on the response ID field.
data QID122; infile "path/qid122Page1.csv" dlm=',' lrecl=500 obs=3; informat _recordid $25. QID122_Num best5. QID122_text $50. /* the 50 and 100 are "guesses" as to possible maximum lengths*/ QID124_text $100. /* for these to values, set as needed */ QID122_Cat best5. ; input _recordid QID122_Num QID122_text QID124_text QID122_Cat ; label _recordid ='Response Id' QID122_Num ="Number of Q122 Responses" QID122_text ="Q122 - What is the name of this item?" QID124_text ="Q124 - In the space below, list as many creative uses as you can think of for this..." QID122_Cat ="Q122 category" ; run;
Replace the dataset name, infile name to reflect the saved CSV file and various 122 and 124 pieces as needed to read each file. This is the part that is simplified by identifying the longest value across the files.
One way to combine the data (if the response Ids are matching between the sheets):
Proc sort data=QID122; by responseId; run; /* repeat sort for each data set*/ data combined; merge QID122 QID125 /* place all of the data set names here*/ ; by responseid; run;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.