BookmarkSubscribeRSS Feed
CeciliaE
Fluorite | Level 6

Hi fellow SAS EG users!

 

I've been researching this problem for a while, and haven't found a solution yet. I'm hoping you can help.

 

I work with data from external agencies quite often, and each time they send us a csv file, it's in the exact same format. Same columns, same order, same field names. 

 

I've been using the Import Wizard in SAS EG to import these files, but as there are 33 columns and I have to manually specify the datatype of each field, this is tedious and time consuming. Here's the desired format of each column:

Name

Type

Length

Format

Informat

Label

Your_Unique_Identifier

Character

1

$CHAR1.

$CHAR1.

Your Unique Identifier

First_Name

Character

16

$CHAR16.

$CHAR16.

First Name

Middle_Initial

Character

1

$CHAR1.

$CHAR1.

Middle Initial

Last_Name

Character

20

$CHAR20.

$CHAR20.

Last Name

Name_Suffix

Character

1

$CHAR1.

$CHAR1.

Name Suffix

Requester_Return_Field

Character

20

$CHAR20.

$CHAR20.

Requester Return Field

Record_Found_Y_N

Character

1

$CHAR1.

$CHAR1.

Record Found Y/N

Search_Date

Date

8

YYMMDD10.

YYMMDD10.

Search Date

College_Code_Branch

Character

9

$CHAR9.

$CHAR9.

College Code/Branch

College_Name

Character

50

$CHAR50.

$CHAR50.

College Name

College_State

Character

2

$CHAR2.

$CHAR2.

College State

_2_year___4_year

Character

1

$CHAR1.

$CHAR1.

2-year / 4-year

Public___Private

Character

7

$CHAR7.

$CHAR7.

Public / Private

Enrollment_Begin

Date

8

YYMMDD10.

YYMMDD10.

Enrollment Begin

Enrollment_End

Date

8

YYMMDD10.

YYMMDD10.

Enrollment End

Enrollment_Status

Character

1

$CHAR1.

$CHAR1.

Enrollment Status

Class_Level

Character

1

$CHAR1.

$CHAR1.

Class Level

Enrollment_Major_1

Character

78

$CHAR78.

$CHAR78.

Enrollment Major 1

Enrollment_CIP_1

Numeric

8

BEST6.

BEST6.

Enrollment CIP 1

Enrollment_Major_2

Character

60

$CHAR60.

$CHAR60.

Enrollment Major 2

Enrollment_CIP_2

Numeric

8

BEST6.

BEST6.

Enrollment CIP 2

Graduated_

Character

1

$CHAR1.

$CHAR1.

Graduated?

Graduation_Date

Date

8

YYMMDD10.

YYMMDD10.

Graduation Date

Degree_Title

Character

69

$CHAR69.

$CHAR69.

Degree Title

Degree_Major_1

Character

80

$CHAR80.

$CHAR80.

Degree Major 1

Degree_CIP_1

Numeric

8

BEST6.

BEST6.

Degree CIP 1

Degree_Major_2

Character

49

$CHAR49.

$CHAR49.

Degree Major 2

Degree_CIP_2

Numeric

8

BEST6.

BEST6.

Degree CIP 2

Degree_Major_3

Character

34

$CHAR34.

$CHAR34.

Degree Major 3

Degree_CIP_3

Numeric

8

BEST6.

BEST6.

Degree CIP 3

Degree_Major_4

Character

20

$CHAR20.

$CHAR20.

Degree Major 4

Degree_CIP_4

Numeric

8

BEST6.

BEST6.

Degree CIP 4

College_Sequence

Numeric

8

BEST2.

BEST2.

College Sequence

 

Is there a way to create a template for importing these kinds of files so I don't have to manually import them each time? I've tried using PROC IMPORT:

PROC IMPORT DATAFILE= "C:\MyUsername\Desktop\filename.csv"
OUT= outdata
DBMS=csv
REPLACE;
GETNAMES=YES;
RUN;

But got the following error message:

NOTE: Unable to open parameter catalog: SASUSER.PARMS.PARMS.SLIST in update mode. Temporary parameter values will be saved to WORK.PARMS.PARMS.SLIST.
ERROR: Physical file does not exist, C:\MyUsername\Desktop\filename.csv.

 

I found this post explaining how to fix this error, but it seems like my SASUSER Directory is locked to me.

I've also tried using INFILE based on this page  I found on the SAS Help Center,  but got the same error as before ("Physical file does not exist"):

/*infile*/
data output;                   
   length file2read $60;                         /**/
   input file2read $;                            /**/
   infile datalines;                             /**/
   infile dummy filevar =  file2read end = done; /**/
                                                
   do while(not done);                           /**/
      input "Your Unique Identifier"n $ "First Name"n;                        
      output;                                    /**/
   end;                                         
datalines;                                       
C:\MyUsername\Desktop\filename.csv
;
proc print data=output; run;

Let me know of something else I can try. Anything helps!

8 REPLIES 8
Reeza
Super User

Use excel to build the data import steps. 

Copy/paste into Excel. 

Build formulas using:

`=CONCATENATE("Informat ", A2, " ", E2, ";")

Repeat for format/label/length.

 

data importFile;

infile ......;

Informat Your_Unique_Identifier $CHAR1.;	format Your_Unique_Identifier $CHAR1.;	label Your_Unique_Identifier 'Your Unique Identifier';
Informat First_Name $CHAR16.;	format First_Name $CHAR16.;	label First_Name 'First Name';
Informat Middle_Initial $CHAR1.;	format Middle_Initial $CHAR1.;	label Middle_Initial 'Middle Initial';
Informat Last_Name $CHAR20.;	format Last_Name $CHAR20.;	label Last_Name 'Last Name';
Informat Name_Suffix $CHAR1.;	format Name_Suffix $CHAR1.;	label Name_Suffix 'Name Suffix';
Informat Requester_Return_Field $CHAR20.;	format Requester_Return_Field $CHAR20.;	label Requester_Return_Field 'Requester Return Field';
Informat Record_Found_Y_N $CHAR1.;	format Record_Found_Y_N $CHAR1.;	label Record_Found_Y_N 'Record Found Y/N';
Informat Search_Date YYMMDD10.;	format Search_Date YYMMDD10.;	label Search_Date 'Search Date';
Informat College_Code_Branch $CHAR9.;	format College_Code_Branch $CHAR9.;	label College_Code_Branch 'College Code/Branch';
Informat College_Name $CHAR50.;	format College_Name $CHAR50.;	label College_Name 'College Name';
Informat College_State $CHAR2.;	format College_State $CHAR2.;	label College_State 'College State';
Informat _2_year___4_year $CHAR1.;	format _2_year___4_year $CHAR1.;	label _2_year___4_year '2-year / 4-year';
Informat Public___Private $CHAR7.;	format Public___Private $CHAR7.;	label Public___Private 'Public / Private';
Informat Enrollment_Begin YYMMDD10.;	format Enrollment_Begin YYMMDD10.;	label Enrollment_Begin 'Enrollment Begin';
Informat Enrollment_End YYMMDD10.;	format Enrollment_End YYMMDD10.;	label Enrollment_End 'Enrollment End';
Informat Enrollment_Status $CHAR1.;	format Enrollment_Status $CHAR1.;	label Enrollment_Status 'Enrollment Status';
Informat Class_Level $CHAR1.;	format Class_Level $CHAR1.;	label Class_Level 'Class Level';
Informat Enrollment_Major_1 $CHAR78.;	format Enrollment_Major_1 $CHAR78.;	label Enrollment_Major_1 'Enrollment Major 1';
Informat Enrollment_CIP_1 BEST6.;	format Enrollment_CIP_1 BEST6.;	label Enrollment_CIP_1 'Enrollment CIP 1';
Informat Enrollment_Major_2 $CHAR60.;	format Enrollment_Major_2 $CHAR60.;	label Enrollment_Major_2 'Enrollment Major 2';
Informat Enrollment_CIP_2 BEST6.;	format Enrollment_CIP_2 BEST6.;	label Enrollment_CIP_2 'Enrollment CIP 2';
Informat Graduated_ $CHAR1.;	format Graduated_ $CHAR1.;	label Graduated_ 'Graduated?';
Informat Graduation_Date YYMMDD10.;	format Graduation_Date YYMMDD10.;	label Graduation_Date 'Graduation Date';
Informat Degree_Title $CHAR69.;	format Degree_Title $CHAR69.;	label Degree_Title 'Degree Title';
Informat Degree_Major_1 $CHAR80.;	format Degree_Major_1 $CHAR80.;	label Degree_Major_1 'Degree Major 1';
Informat Degree_CIP_1 BEST6.;	format Degree_CIP_1 BEST6.;	label Degree_CIP_1 'Degree CIP 1';
Informat Degree_Major_2 $CHAR49.;	format Degree_Major_2 $CHAR49.;	label Degree_Major_2 'Degree Major 2';
Informat Degree_CIP_2 BEST6.;	format Degree_CIP_2 BEST6.;	label Degree_CIP_2 'Degree CIP 2';
Informat Degree_Major_3 $CHAR34.;	format Degree_Major_3 $CHAR34.;	label Degree_Major_3 'Degree Major 3';
Informat Degree_CIP_3 BEST6.;	format Degree_CIP_3 BEST6.;	label Degree_CIP_3 'Degree CIP 3';
Informat Degree_Major_4 $CHAR20.;	format Degree_Major_4 $CHAR20.;	label Degree_Major_4 'Degree Major 4';
Informat Degree_CIP_4 BEST6.;	format Degree_CIP_4 BEST6.;	label Degree_CIP_4 'Degree CIP 4';
Informat College_Sequence BEST2.;	format College_Sequence BEST2.;	label College_Sequence 'College Sequence';

input your_unique_identifier -- college_sequence;

run;

Add INPUT statement. 

 

Less than 5 minutes total.

 

Alternatively you could build a SAS program that read the file as shown and created the data import step, but @Tom is better at that than I am 🙂

 

CeciliaE
Fluorite | Level 6

Thanks @Reeza! I appreciate you taking the time to get back to me.

Tom
Super User Tom
Super User

it's in the exact same format. Same columns, same order, same field names. 

This is a trivial problem.  Write the code to read ONE of the files.  It will looks something like this:

data want;
  infile "myfile.csv" firstobs=2 dsd truncover ;
  length
   Your_Unique_Identifier $1
  ...
   Search_Date 8
   College_Code_Branch $9
  ...
   Enrollment_CIP_1 8
  ...
  ;
  format Search_Date Enrollment_Begin Enrollment_End Graduation_Date yymmdd10.;
  informat Search_Date Enrollment_Begin Enrollment_End Graduation_Date yymmdd.;
  label
    Your_Unique_Identifier='Your Unique Identifier'
    ...
  ;
  input Your_Unique_Identifier -- College_Sequence;
run;

You could probably read the descriptor file and generate the code, but why bother since you only have to create it ONCE.

 

Now to run this for some other file just change the name of the file being read and perhaps the name of the dataset being written.  For example by using macro variables.

%let dsname=want;
%let filename=myfile.csv;
data &dsname;
  infile "&filename" firstobs=2 dsd truncover ;
...

 

CeciliaE
Fluorite | Level 6

Thanks, @Tom!

I tried the code you sent, and a dataset with the correct field names and datatypes prints out, but it's empty. I'm getting the following error:

ERROR: Physical file does not exist, C:\Users\celhaddad\Desktop\OM_EA_FY15_NSC_V20220118.csv.

I know the filepath is correct because I'm copying and pasting it.

I've Googled the error and browsed through about half a dozen articles, and it seems like the issue is that SAS can't connect directly to my C: Drive. (I did find one solution, but I'm not sure how I'd get this to work in my situation.) Do you have any ideas about how I can get around this?

 

Thanks again for your help!

Reeza
Super User
If you're on EG on a server you'll need to upload the CSV unfortunately and then use the network path on EG. There should be an upload task or you may need to talk to someone at your company about how to do this.
Tom
Super User Tom
Super User

If you are using Enterprise Guide as a front end for creating and running SAS code then SAS itself is probably running on some other machine.  There should a Copy Files type TASK in EG that you can use to upload the file from your C drive to some place on the server were SAS is running so that SAS can read it.

ballardw
Super User

@CeciliaE wrote:

Thanks, @Tom!

I tried the code you sent, and a dataset with the correct field names and datatypes prints out, but it's empty. I'm getting the following error:

ERROR: Physical file does not exist, C:\Users\celhaddad\Desktop\OM_EA_FY15_NSC_V20220118.csv.

I know the filepath is correct because I'm copying and pasting it.

I've Googled the error and browsed through about half a dozen articles, and it seems like the issue is that SAS can't connect directly to my C: Drive. (I did find one solution, but I'm not sure how I'd get this to work in my situation.) Do you have any ideas about how I can get around this?

 

Thanks again for your help!


If your SAS is running on a server then the file needs to be someplace that the server can see the file.

That could be

1) move the file to a location the server can see using any of a number of operating system tools

2) have your PC set up so that the server can read from the folder(s) where you store the text files. This would involve your organizations IT staff.

3) create a new storage location that both your PC and the Server can use. Your PC would need write privileges, the server would only need read privileges. Move or copy the files to this new location. A cloud storage service like OneDrive should work.

4) With EG there is a task that will upload a file to the server, but I am not sure if this is the best or not as I think it would still be an added manual step.

CeciliaE
Fluorite | Level 6

Thank you! This is really helpful.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 1475 views
  • 4 likes
  • 4 in conversation