DATA Step, Macro, Functions and more

[University Edition] Beginner: Using input to create multiple character columns

New Contributor
Posts: 3
[ Edited ]

I'm trying to read in a .csv file with 51 rows and 151 columns:

  • The first column is Student ID #
  • The remaining columns are the questions of an exam
  • The first row is the key to the exam
  • The remaining rows are the answers submitted by each student

Here is the first portion of the data in Excel.

[image: image.png]

Is there a way to import the data with every column read as a character variable, without having to declare each individual column as character?

When I try to read in the data with the code below, only Student and Q150 are character variables. The cells of Q1 through Q149 are blank (to be more accurate, they're periods). [Edit: They are blank because they are still numeric]

data Exam_Results;
	infile "filepath.csv" dlm="," dsd;
	input Student $ Q1-Q150 $;
run;

 

What I'm trying to avoid is something obnoxious like

 

data Exam_Results;
	infile "filepath.csv" dlm="," dsd;
	input Student $ Q1 $ Q2 $ Q3 $ Q4 $ Q5 $ Q6 $ Q7 $ Q8 $ Q9 $ Q10 $ ... Q149 $ Q150 $;
run;

 

Is there some easier, more concise way to get all of the columns imported as characters?

 

Thanks in advance, everyone.


Accepted Solutions
Solution
‎10-13-2017 04:00 PM
Super User
Posts: 24,004

Re: [University Edition] Beginner: Using input to create multiple character columns

Posted in reply to thorifyer

Lazy import method.

1. Use PROC IMPORT, set GUESSINGROWS to MAX. 

2. Copy the generated code from the log, remove the line numbers, and modify that code as required.
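As a rough sketch of step 1, assuming the file has no header row (the first row is the answer key, so GETNAMES=NO is used and the columns come in as VAR1-VAR151) and reusing the placeholder path from the question:

```sas
/* Let PROC IMPORT guess the column types from the whole file. */
/* "filepath.csv" is the placeholder path from the question.   */
proc import datafile="filepath.csv"
    out=Exam_Results
    dbms=csv
    replace;
  getnames=no;       /* assumption: row 1 is the answer key, not column names */
  guessingrows=max;  /* scan every row before deciding character vs numeric   */
run;
```

The DATA step that PROC IMPORT generates is written to the log, which is the code referred to in step 2.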

 

If you want to manually specify a bunch of columns you can do so with the following:

 

input Student $ (Q1-Q150) $ ;

Personally, I would consider doing two reads: the first would read just the first row, which holds the key, and the second would read the students' answers. You can use the FIRSTOBS option along with the OBS option on the INFILE statement to limit which rows you read.
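A minimal sketch of the two-read idea. The data set names Exam_Key and Exam_Answers and the lengths $20 and $1 are assumptions chosen for illustration, and "filepath.csv" is the placeholder path from the question:

```sas
/* First read: only row 1, the answer key. */
data Exam_Key;
  length Student $20 Q1-Q150 $1;   /* assumed lengths */
  infile "filepath.csv" dsd truncover obs=1;
  input Student Q1-Q150;
run;

/* Second read: the student rows, starting at row 2. */
data Exam_Answers;
  length Student $20 Q1-Q150 $1;   /* assumed lengths */
  infile "filepath.csv" dsd truncover firstobs=2;
  input Student Q1-Q150;
run;
```

OBS=1 stops after the first record and FIRSTOBS=2 starts at the second, so the key and the answers end up in separate data sets.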


All Replies

New Contributor
Posts: 3

Re: [University Edition] Beginner: Using input to create multiple character columns

When I run

 

data Exam_Results;
	infile "/folders/myfolders/124/Final Project/FormA.csv" dlm=",";
	input Student_ID $ (Q1-Q150) $ ;
run;

I get this error in my log.

1          OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 61         
 62         data Exam_Results;
 63         infile "/folders/myfolders/124/Final Project/FormA.csv" dlm=",";
 64         input Student_ID $ (Q1-Q150) $ ;
                                          _
                                          79
                                          76
 ERROR 79-322: Expecting a (.
 
 ERROR 76-322: Syntax error, statement will be ignored.
 
 65         run;

Am I just missing something here?

 

New Contributor
Posts: 3

Re: [University Edition] Beginner: Using input to create multiple character columns

[ Edited ]

Oh wait, it just needs to be

input Student $ (Q1-Q150) ($) ;

Thank you Reeza!
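For reference, a sketch of the complete step with the corrected INPUT statement. The DSD option is carried over from the original question and TRUNCOVER is an added assumption to guard against short rows; "filepath.csv" is the placeholder path:

```sas
data Exam_Results;
  infile "filepath.csv" dlm="," dsd truncover;
  /* The variable list and the $ modifier each go in their own parentheses. */
  input Student $ (Q1-Q150) ($);
run;
```

Variables created this way get the default character length of 8.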

Super User
Posts: 8,278

Re: [University Edition] Beginner: Using input to create multiple character columns

[ Edited ]
Posted in reply to thorifyer

Just define your variables before trying to use them in the INPUT statement. 

data Exam_Results;
  length Student $20 Q1-Q150 $1 ;
  infile "filepath.csv" dsd truncover;
  input Student Q1-Q150 ;
run;

In most cases it is better to define your variables yourself instead of forcing SAS to guess what you wanted based on how you first reference each variable in some other statement, whether that is an INPUT, FORMAT, INFORMAT, or assignment statement. Plus, once the variables are defined, your INPUT statement is much easier to write.
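One way to confirm how the variables ended up being defined, sketched here on the assumption that the data set is named Exam_Results as in the step above, is to list their types and lengths:

```sas
/* List each variable with its type and length, in creation order. */
proc contents data=Exam_Results varnum;
run;
```

If the LENGTH statement took effect, Student and Q1 through Q150 should all be reported as character.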

 
