Desktop productivity for business analysts and programmers

Data truncation while Importing from CSV file.

Accepted Solution Solved
Reply
New Contributor
Posts: 4
Accepted Solution

Data truncation while Importing from CSV file.

DATA DL.Alex;
    INFILE "/programs/ICOPSTUDY_DISC/input/Ops_Study_Query.csv" 
    DELIMITER = "," 
    MISSOVER DSD 
    FIRSTOBS=2 
    LRECL=32760;

	INPUT PRODUCT_NAME_ABBR $ SIMS_UPC $ WIC_NBR $ OPS_STUDY_DEPT_NBR $ BILLING_NDC $ PACKAGE_SIZE $ PACKAGE_QTY $ Total_Pack_Size $ DRUG_WAREHOUSE_IND $ DRUG_TYPE_CD
$ DRUG_CLASS $ ;

    
RUN;

HI,

 

I have recently started working on SAS, so my appologies if I am asking a very simple query. I tried searching the solution but seems couldn't figure out on my own.

 

I am importing a comma seprated file into a SAS data set. The import is sucessful but when I check the data, it truncates the data of all the columns to length = 8 and makes them character. Type is not my problem but truncated data is.

 

I tried setting the Length Default = 20, but is giving me error message as Numeric Field can be of length 3 - 8.

Later I tried statement like INPUT PRODUCT_NAME_ABBR $20. SIMS_UPC $.... but it then treates it as flat file and columns are no longer seprated by ','.

 

Could you please suggest how may I import data without truncating it for each column.

If they remain as Character, then it is not an issue.

Thanks!!


Accepted Solutions
Solution
2 weeks ago
Grand Advisor
Posts: 17,325

Re: Data truncation while Importing from CSV file.

You haven't specified any lengths and SAS automatically sets characters to 8. So set the lengths BEFORE the INPUT statement to read it correctly.

 

length product_name_abbr $20. sims_ups $20.;

input .......;

View solution in original post


All Replies
Regular Contributor
Posts: 213

Re: Data truncation while Importing from CSV file.

Trusted Advisor
Posts: 1,051

Re: Data truncation while Importing from CSV file.

When you use $ after each variable name you're telling SAS to create character variables. Assuming you want numeric variables, use:

 

INPUT PRODUCT_NAME_ABBR SIMS_UPC WIC_NBR OPS_STUDY_DEPT_NBR BILLING_NDC PACKAGE_SIZE PACKAGE_QTY Total_Pack_Size DRUG_WAREHOUSE_IND DRUG_TYPE_CD
DRUG_CLASS ;

 

Tom

Solution
2 weeks ago
Grand Advisor
Posts: 17,325

Re: Data truncation while Importing from CSV file.

You haven't specified any lengths and SAS automatically sets characters to 8. So set the lengths BEFORE the INPUT statement to read it correctly.

 

length product_name_abbr $20. sims_ups $20.;

input .......;
New Contributor
Posts: 4

Re: Data truncation while Importing from CSV file.

Hi Wong/Tom/Reeza,

 

 

Thanks for the response.

You gave me very good learnig information.

Reeza's response have worked for me as well.

 

Thank you all!!

 

Tom,

 

My problem is not the data type conversion from Numeric to Character, but it is the length of the columns.

 

For example, the first column is PRODUCT_NAME_ABBR. Now, in this column only 8 characters from the input file is displayed i.e. if the value in the input file is "ORUDIS 50MG CAPSULES", the value stored in the SAS data set is "ORUDIS 5".

 

This (8) is the default value set for all the columns when running through SAS Enterprise guide.

Later, I checked the SAS enterprise guide and it had an import data function.

When I used it, it generated code for me as well and it is also a solution for this issue.

 

ONE QUESTION THOUGH, is this correct to have both the Informat and Format defined for same variable?

 

Thanks for your help and time.

Smiley Happy

 

DATA WORK.Ops_study_Query;

LENGTH

 PRODUCT_NAME_ABBR $ 39

 SIMS_UPC $ 27

 WIC_NBR $ 12

 OPS_STUDY_DEPT_NBR $ 12

 BILLING_NDC $ 13

 PACKAGE_SIZE $ 13

 PACKAGE_QTY $ 13

 Total_Pack_Size $ 13

 DRUG_WAREHOUSE_IND $ 5

 DRUG_TYPE_CD $ 5

 DRUG_CLASS $ 3

;

FORMAT

PRODUCT_NAME_ABBR $CHAR39.

 SIMS_UPC $CHAR27.

 WIC_NBR $CHAR12.

OPS_STUDY_DEPT_NBR $CHAR12.

 BILLING_NDC $CHAR13.

 PACKAGE_SIZE $CHAR13.

 PACKAGE_QTY $CHAR13.

 Total_Pack_Size $CHAR13.

 DRUG_WAREHOUSE_IND $CHAR5.

 DRUG_TYPE_CD $CHAR5.

 DRUG_CLASS $CHAR3.

 ;

INFORMAT

 PRODUCT_NAME_ABBR $CHAR39.

 SIMS_UPC $CHAR27.

 WIC_NBR $CHAR12.

 OPS_STUDY_DEPT_NBR $CHAR12.

 BILLING_NDC $CHAR13.

 PACKAGE_SIZE $CHAR13.

 PACKAGE_QTY $CHAR13.

 Total_Pack_Size $CHAR13.

 DRUG_WAREHOUSE_IND $CHAR5.

 DRUG_TYPE_CD $CHAR5.

 DRUG_CLASS $CHAR3.

 ;

INFILE '/usr/local/#LN00015'

LRECL=32767

FIRSTOBS=2

ENCODING="LATIN1"

DLM='2c'x

MISSOVER

DSD ;

INPUT

 PRODUCT_NAME_ABBR : $CHAR39.

 SIMS_UPC : $CHAR27.

 WIC_NBR : $CHAR12.

 OPS_STUDY_DEPT_NBR : $CHAR12.

 BILLING_NDC : $CHAR13.

 PACKAGE_SIZE : $CHAR13.

PACKAGE_QTY : $CHAR13.

 Total_Pack_Size : $CHAR13.

 DRUG_WAREHOUSE_IND : $CHAR5.

 DRUG_TYPE_CD : $CHAR5.

 DRUG_CLASS : $CHAR3.

;


RUN;

 

 

 

Grand Advisor
Posts: 17,325

Re: Data truncation while Importing from CSV file.


ABBJ wrote:

Hi Wong/Tom/Reeza,

 

 

Thanks for the response.

You gave me very good learnig information.

Reeza's response have worked for me as well.

 

Thank you all!!

 

Tom,

 

My problem is not the data type conversion from Numeric to Character, but it is the length of the columns.

 

For example, the first column is PRODUCT_NAME_ABBR. Now, in this column only 8 characters from the input file is displayed i.e. if the value in the input file is "ORUDIS 50MG CAPSULES", the value stored in the SAS data set is "ORUDIS 5".

 

This (8) is the default value set for all the columns when running through SAS Enterprise guide.

Later, I checked the SAS enterprise guide and it had an import data function.

When I used it, it generated code for me as well and it is also a solution for this issue.

 

ONE QUESTION THOUGH, is this correct to have both the Informat and Format defined for same variable?

 

Thanks for your help and time.

Smiley Happy

 

DATA WORK.Ops_study_Query;

LENGTH

 PRODUCT_NAME_ABBR $ 39

 SIMS_UPC $ 27

 WIC_NBR $ 12

 OPS_STUDY_DEPT_NBR $ 12

 BILLING_NDC $ 13

 PACKAGE_SIZE $ 13

 PACKAGE_QTY $ 13

 Total_Pack_Size $ 13

 DRUG_WAREHOUSE_IND $ 5

 DRUG_TYPE_CD $ 5

 DRUG_CLASS $ 3

;

FORMAT

PRODUCT_NAME_ABBR $CHAR39.

 SIMS_UPC $CHAR27.

 WIC_NBR $CHAR12.

OPS_STUDY_DEPT_NBR $CHAR12.

 BILLING_NDC $CHAR13.

 PACKAGE_SIZE $CHAR13.

 PACKAGE_QTY $CHAR13.

 Total_Pack_Size $CHAR13.

 DRUG_WAREHOUSE_IND $CHAR5.

 DRUG_TYPE_CD $CHAR5.

 DRUG_CLASS $CHAR3.

 ;

INFORMAT

 PRODUCT_NAME_ABBR $CHAR39.

 SIMS_UPC $CHAR27.

 WIC_NBR $CHAR12.

 OPS_STUDY_DEPT_NBR $CHAR12.

 BILLING_NDC $CHAR13.

 PACKAGE_SIZE $CHAR13.

 PACKAGE_QTY $CHAR13.

 Total_Pack_Size $CHAR13.

 DRUG_WAREHOUSE_IND $CHAR5.

 DRUG_TYPE_CD $CHAR5.

 DRUG_CLASS $CHAR3.

 ;

INFILE '/usr/local/#LN00015'

LRECL=32767

FIRSTOBS=2

ENCODING="LATIN1"

DLM='2c'x

MISSOVER

DSD ;

INPUT

 PRODUCT_NAME_ABBR : $CHAR39.

 SIMS_UPC : $CHAR27.

 WIC_NBR : $CHAR12.

 OPS_STUDY_DEPT_NBR : $CHAR12.

 BILLING_NDC : $CHAR13.

 PACKAGE_SIZE : $CHAR13.

PACKAGE_QTY : $CHAR13.

 Total_Pack_Size : $CHAR13.

 DRUG_WAREHOUSE_IND : $CHAR5.

 DRUG_TYPE_CD : $CHAR5.

 DRUG_CLASS : $CHAR3.

;


RUN;

 

 

 


Yes, in fact, it's more 'correct' to have both an informat and format. An Informat tell SAS what format the data is in and how to read it. The Format tells SAS how to display data. They have different purposes. 

 

This is useful, because if you read data that has a mmddyy format for example, but you want to display your data as Date9 you simply apply an informat to tell SAS what format to read it in and then use a format to tell SAS how to display the data.

 

data example;
	informat myDate mmddyy10.;
	format myDate date9.;
	input myDate;
	cards;
01/01/2017
03/12/2017
12/13/2017
13/12/2017
;
run;

proc print data=example;
run;
Super User
Super User
Posts: 6,320

Re: Data truncation while Importing from CSV file.

[ Edited ]

You are using belt and suspenders and perhaps additional way to hold you pants up with that code.

 

In most cases there is no real need to attach formats or informats to character variables. If you wanted to preserve leading spaces then you could attach $CHAR format to the variable.  But since you are reading using the DSD option the leading spaces will not be in the source data anyway.

 

Also there is no need to specify an INFORMAT on the INPUT statement if you have already defined the variables type and length.  You do not even need to add the $ after character variables since SAS already knows that your variables are character because you defined them that way by using the LENGTH statement.  If your variable is a DATE or other variable that does need an INFORMAT you can attach it with an INFORMAT (or ATTRIB) statement and then you do not need to also list the informat on the INPUT statement.

 

So the INPUT statement just needs to list the variable names. And if you have already defined them in the order that they will appear in the source data file then you can use a location based variable list to save even more typing.

 

DATA WORK.Ops_study_Query;
LENGTH
 PRODUCT_NAME_ABBR $ 39
 SIMS_UPC $ 27
 WIC_NBR $ 12
 OPS_STUDY_DEPT_NBR $ 12
 BILLING_NDC $ 13
 PACKAGE_SIZE $ 13
 PACKAGE_QTY $ 13
 Total_Pack_Size $ 13
 DRUG_WAREHOUSE_IND $ 5
 DRUG_TYPE_CD $ 5
 DRUG_CLASS $ 3
;
INFILE '/usr/local/#LN00015'
  LRECL=32767
  FIRSTOBS=2
  ENCODING="LATIN1"
  DSD
  DLM='2c'x
  TRUNCOVER
;
INPUT PRODUCT_NAME_ABBR -- DRUG_CLASS ;
RUN;

 

New Contributor
Posts: 4

Re: Data truncation while Importing from CSV file.

Dear Tom/Reeza,

 

Thank you so much for this valuable information.

As I mentioned, I am pretty new to the SAS coding, thus am sure this is going to help me a lot in future as well during my SAS tasks.

 

Thanks a lot for all your time and help with this query.

 

Thank you!!

 

 

Grand Advisor
Posts: 17,325

Re: Data truncation while Importing from CSV file.

Do make sure to change the MISSOVER to TRUNCOVER as in @Tom correct solution and post above.

New Contributor
Posts: 4

Re: Data truncation while Importing from CSV file.

Thanks Reeza.

 

I did that, though I didn't find difference in output but I am now going to study the difference between MISSOVER, FLOWOVER and TRUNCOVER.

 

Thank you!!

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 9 replies
  • 256 views
  • 0 likes
  • 5 in conversation