Help using Base SAS procedures

Using PROC IMPORT and define field names and variable types

Accepted Solution Solved
Reply
Contributor
Posts: 26
Accepted Solution

Using PROC IMPORT and define field names and variable types

I'm trying to convert some code to use SAS on demand so I need to use PROC IMPORT.  I had this in my old statement, that defined column names, and the type of variables.  How do I do this with PROC IMPORT?

old code:

DATA rawdata;

FILENAME DATA1 DDE "EXCEL|[rawdatav2-new-line-names-checks-forSAS-code-segs.xlsx]forsas!R2C1:R6301C13";

INFILE DATA1 NOTAB DLM= '09'X DSD MISSOVER lrecl = 10240;

INPUT straw_wt panicle_wt DateEntered $ notes $ height DateEntered2 $ rep block row col plant line $ plot_ID_final;

CARDS;

proc print;

RUN;


new code:

PROC IMPORT

  OUT= WORK.rawdata

  DATAFILE= "/home/paul.tanger140/RILpop2013phenotypes/rawdatav2-new-line-names-checks-forSAS-code-segs.xlsx"

    DBMS=xlsx REPLACE;

    SHEET="forsas";

    GETNAMES=YES; * use names in file.. bad if names don't conform to stupid SAS rules (alphanumeric and underscores ONLY)

RUN;




Accepted Solutions
Solution
‎03-16-2015 08:28 PM
Super User
Posts: 3,254

Re: Using PROC IMPORT and define field names and variable types

The problem with SAS guessing rows is removed because your code becomes something like this for reading a tab-delimited file (much the same as DDE):

DATA rawdata;

FILENAME DATA1 "c:\MyFolder\rawdatav2-new-line-names-checks-forSAS-code-segs.csv";

INFILE DATA1 NOTAB DLM= '09'X DSD MISSOVER lrecl = 10240;

INPUT straw_wt panicle_wt DateEntered $ notes $ height DateEntered2 $ rep block row col plant line $ plot_ID_final;

run;

View solution in original post


All Replies
Respected Advisor
Posts: 3,799

Re: Using PROC IMPORT and define field names and variable types

EXCEL is NOT a database.

Contributor
Posts: 26

Re: Using PROC IMPORT and define field names and variable types

Posted in reply to data_null__

I have no clue how your comment helps solve my issue. I did not create the old code. It was given to me, and it works, thats all that I care about.  Unfortunately it doesn't work in SAS on demand.

I don't have much time or patience to learn the painful intricacies and nuances of SAS.  The help pages are impossible to navigate compared to other tools like R.  So can someone just tell me what I need to do to make SAS treat my variables correctly and allow me to rename them with "SAS valid" names?

Respected Advisor
Posts: 3,799

Re: Using PROC IMPORT and define field names and variable types

Did you try the PROC IMPORT that you posted?  What was the result? 

Contributor
Posts: 26

Re: Using PROC IMPORT and define field names and variable types

Posted in reply to data_null__

Yes, it works, except it defines variables as numeric or char, and I need to rename some fields since they don't have "SAS approved" names.

Respected Advisor
Posts: 3,799

Re: Using PROC IMPORT and define field names and variable types

Variables in SAS are either numeric or character.  PROC IMPORT guesses at the data-type based on the data contained in the cells.  If you don't like the guess you can convert.  Perhaps adding OPTIONS validvarname=any; will help with the names, you can also use the RENAME statement after the data are IMPORTed..  SAS has rules for variable names you can't change that.  EXCEL doesn't even have variables.  Just because a SAS data set has some resemblance to certain EXCEL sheets doesn't really mean anything.

Contributor
Posts: 26

Re: Using PROC IMPORT and define field names and variable types

Posted in reply to data_null__

All I know is this line accomplished what I needed:

INPUT straw_wt panicle_wt DateEntered $ notes $ height DateEntered2 $ rep block row col plant line $ plot_ID_final;


I was able to name the fields in SAS regardless of what the first row in excel was, and I was able to force categorical variables even though SAS wanted them to be numeric, and would treat them as continuous in later steps.  For example, the 12th field, "line" is line numbers, but need to be treated as categorical not continuous.

Thank you for your patience with me, but I'm still not quite sure how to replicate this one line of code with PROC IMPORT which I need to use now.  So I guess my questions are:

  1. How do I convert from SAS guess to whatever I want for a field type?
  2. How do I use the RENAME statement?  Ideally with a simple example so I don't have to navigate the SAS help pages which I can never do.  I tried googling, and found the MODIFY statement but couldn't figure out how to use it.
Super User
Super User
Posts: 7,050

Re: Using PROC IMPORT and define field names and variable types

Save you data in a text file instead.  You can save from Excel to Tab delimited or CSV delimited file.

Contributor
Posts: 26

Re: Using PROC IMPORT and define field names and variable types

Ok, and then what? What does the import statement become? and don't I still have an issue with SAS guessing incorrect field types?

Super User
Super User
Posts: 7,050

Re: Using PROC IMPORT and define field names and variable types

Then you can write a DATA step to read the data and have complete control.  For example something like this.

DATA rawdata;

  infile 'myfile.csv' dsd truncover lrecl=30000 firstobs=2 ;

  LENGTH straw_wt panicle_wt DateEntered 8  notes $200 height DateEntered2  rep block row col plant 8 line $20 plot_ID_final 8;

  informat DateEntered DateEntered2 YYMMDD10.;

  format DateEntered DateEntered2 YYMMDD10.;

  INPUT straw_wt -- plot_ID_final;

RUN;

Contributor
Posts: 26

Re: Using PROC IMPORT and define field names and variable types

Using this code, it doesn't actually import any data.  Does it matter than the field names in the file are different?

Here is the code with notes:

1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;

57 

58 DATA rawdata;

59 infile '/home/paul.tanger140/RILpop2013phenotypes/rawdatav2-new-line-names-checks-forSAS-code-segs.csv' dsd truncover

59 ! lrecl=30000 firstobs=2 ;

60 LENGTH straw_wt panicle_wt DateEntered 8 notes $200 height DateEntered2 rep block row col plant 8 line $20

60 ! plot_ID_final 8;

61 informat DateEntered DateEntered2 YYMMDD10.;

62 format DateEntered DateEntered2 YYMMDD10.;

63 INPUT straw_wt panicle_wt DateEntered $ notes $ height DateEntered2 $ rep block row col plant line $ plot_ID_final;

64 RUN;

NOTE: The infile '/home/paul.tanger140/RILpop2013phenotypes/rawdatav2-new-line-names-checks-forSAS-code-segs.csv' is:

  Filename=/home/paul.tanger140/RILpop2013phenotypes/rawdatav2-new-line-names-checks-forSAS-code-segs.csv,

  Owner Name=paul.tanger140,Group Name=oda,

  Access Permission=-rw-r--r--,

  Last Modified=17Mar2015:18:22:12,

  File Size (bytes)=317061

NOTE: 0 records were read from the infile

  '/home/paul.tanger140/RILpop2013phenotypes/rawdatav2-new-line-names-checks-forSAS-code-segs.csv'.

NOTE: The data set WORK.RAWDATA has 0 observations and 13 variables.

Solution
‎03-16-2015 08:28 PM
Super User
Posts: 3,254

Re: Using PROC IMPORT and define field names and variable types

The problem with SAS guessing rows is removed because your code becomes something like this for reading a tab-delimited file (much the same as DDE):

DATA rawdata;

FILENAME DATA1 "c:\MyFolder\rawdatav2-new-line-names-checks-forSAS-code-segs.csv";

INFILE DATA1 NOTAB DLM= '09'X DSD MISSOVER lrecl = 10240;

INPUT straw_wt panicle_wt DateEntered $ notes $ height DateEntered2 $ rep block row col plant line $ plot_ID_final;

run;

Contributor
Posts: 26

Re: Using PROC IMPORT and define field names and variable types

Thanks!  This is helpful.  I'm getting an error though:

1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;

57 

58 DATA rawdata;

59 FILENAME DATA1 "/home/paul.tanger140/RILpop2013phenotypes/rawdatav2-new-line-names-checks-forSAS-code-segs.csv";

60 INFILE DATA1 NOTAB DLM= '09'X DSD MISSOVER lrecl = 10240;

  _____

  23

ERROR 23-2: Invalid option name NOTAB.

here is my code:

DATA rawdata;

FILENAME DATA1 "/home/paul.tanger140/RILpop2013phenotypes/rawdatav2-new-line-names-checks-forSAS-code-segs.csv";

INFILE DATA1 NOTAB DLM= '09'X DSD MISSOVER lrecl = 10240;

INPUT straw_wt panicle_wt DateEntered $ notes $ height DateEntered2 $ rep block row col plant line $ plot_ID_final;

run;

Respected Advisor
Posts: 3,799

Re: Using PROC IMPORT and define field names and variable types

NOTAB is option for DDE access method only.  Just remove it.

Contributor
Posts: 26

Re: Using PROC IMPORT and define field names and variable types

Posted in reply to data_null__

Getting closer!  Now it says this:

NOTE: Invalid data for straw_wt in line 1 1-10240.

RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0 

1 CHAR straw_wt,panicle_wt,BiomassDate,notes,height,HeightDate,rep,block,row,col,plant,line,plot_ID-final.2

  ZONE 7776757727666666577246666774676266767266666724666674676276726666627672666276667266662766754426666603

  NUMR 34217F74C01E93C5F74C29FD1334145CEF453C859784C8597844145C250C2CF3BC2F7C3FCC0C1E4CC9E5C0CF4F94D69E1CD2

and there is a bunch more of this..

These are the first two lines of the file:

straw_wt,panicle_wt,BiomassDate,notes,height,HeightDate,rep,block,row,col,plant,line,plot_ID-final

245.7,75.2,4/13/2012,,162,4/12/2012,1,1,1,1,1,845,113

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 24 replies
  • 14008 views
  • 6 likes
  • 6 in conversation