BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
MickyTee
Fluorite | Level 6

Hi guys, I want to create a dataset with the datalines statement that includes Canadian bankruptcy data by NAICS (straight from https://www.ic.gc.ca/eic/site/bsf-osb.nsf/eng/br02234.html). 

 

My datastep looks a little like this:

data industry_BK;
length Industry $15 NAICS $2;
input Industry $ NAICS $ BK_rate 3.1 Ins_Rate 3.1;
datalines;
Agriculture 11 0.2 0.3
Mining 21 1.3 2.1
Utilities 22 1.2 1.2
Construction 23 1.3 1.8
;
quit;

However, when I run this, I get the following:

 

SAS_Table.png

 

It appears to grab the first digit, but loses everything after the decimal.

 

I've read through a few documentation pages, but clearly I'm just not getting it. Can someone help me to understand why this doesn't work and how I can fix it?

 

Many thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
sv_sas
Calcite | Level 5

Hi,

try this, I used the format best12. for the numeric variables.

 

 

data industry_BK;
length Industry $15 NAICS $2;
format BK_rate best12. Ins_Rate best12.;
input Industry $ NAICS $ BK_rate Ins_Rate;
datalines;
Agriculture 11 0.2 0.3
Mining 21 1.3 2.1
Utilities 22 1.2 1.2
Construction 23 1.3 1.8
;
quit;

View solution in original post

7 REPLIES 7
sv_sas
Calcite | Level 5

Hi,

try this, I used the format best12. for the numeric variables.

 

 

data industry_BK;
length Industry $15 NAICS $2;
format BK_rate best12. Ins_Rate best12.;
input Industry $ NAICS $ BK_rate Ins_Rate;
datalines;
Agriculture 11 0.2 0.3
Mining 21 1.3 2.1
Utilities 22 1.2 1.2
Construction 23 1.3 1.8
;
quit;
AlokR
Fluorite | Level 6

Try this. It should work:

data industry_BK;
length Industry $15 NAICS $2;
input Industry $ NAICS $ BK_rate Ins_Rate 3.1;
datalines;
Agriculture 11 0.2 0.3 
Mining 21 1.3 2.1 
Utilities 22 1.2 1.2 
Construction 23 1.3 1.8 
;
quit;
MickyTee
Fluorite | Level 6

Both of those approaches did the trick, thanks very much both of you.

 

Any ideas why specifying "3.1" after the input statement on BK_rate seems to ruin things?

Kurt_Bremser
Super User

If you specify a format without the colon modifier, the length of the format overrides the delimiters. So the second 3.1 format starts to read immediately after the first, and reads a blank, a digit, and the period.

Tom
Super User Tom
Super User

DO NOT use formatted input with delimited data.  You do not need to specify ANY informats in your INPUT statement for that data as SAS already knows how to read numbers, but if you do add informat specificatons then you must proceed them with the colon modifier.  That will prevent SAS from reading EXACTLY that many characters and messing up how the delimiter processing works.

 

Also DO NOT add a decimal part to an informat unless you know that the decimal points were purposely removed from the raw text values.  Otherwise any values that do not contain a period will be divided by the specified power of ten.

MickyTee
Fluorite | Level 6

Thanks Kurt and Tom for the detailed answers!

Peter_C
Rhodochrosite | Level 12
Another feature might be worth learning for reading fixed layouts.
Should the last field on a line be short the INPUT parser would find not enough to fill the informat. and skip onto the next line to fill the field from there. Usually that is not what is wanted.
INFILE whatever TRUNCOVER;
The TRUNCOVER option directs the INPUT parser to accept a shortened last field.
Alternatives are available

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 5513 views
  • 3 likes
  • 6 in conversation