BookmarkSubscribeRSS Feed
Nipun22
Obsidian | Level 7

a.
Region&State&Month&Expenses&Revenue
Southern&GA&JAN2001&2000&8000
Southern&GA&FEB2001&1200&6000
Southern&FL&FEB2001&8500&11000
Northern&NY&FEB2001&3000&4000
Northern&NY&MAR2001&6000&5000
Southern&FL&MAR2001&9800&13500
Northern&MA&MAR2001&1500&1000

 

b.
"Africa","Boot","Addis Ababa","12","$29,761","$191,821","$769"
"Asia","Boot","Bangkok","1","$1,996","$9,576","$80"
"Canada","Boot","Calgary","8","$17,720","$63,280","$472"
"Eastern Europe","Boot","Budapest","22","$74,102","$317,515","$3,341"
"Middle East","Boot","Al-Khobar","10","$15,062","$44,658","$765"
"Pacific","Boot","Auckland","12","$20,141","$97,919","$962"
"South America","Boot","Bogota","19","$15,312","$35,805","$1,229"
"United States","Boot","Chicago","16","$82,483","$305,061","$3,735"
"Western Europe","Boot","Copenhagen","2","$1,663","$4,657","$129"

 

c.
Region State Capital Bird
South Georgia Atlanta 'Brown Thrasher'
South 'North Carolina' Raleigh Cardinal
North Connecticut Hartford Robin
West Washington Olympia 'American Goldfinch'
Midwest Illinois Springfield Cardinal

 

d.
all of the above

10 REPLIES 10
Tom
Super User Tom
Super User

Are you asking us to do your homework?

 

Please show what you have tried and if you still have questions explain them more completely.

Nipun22
Obsidian | Level 7
there's nothing sort of a homework here bro. Also I think it's A but my friend says the correct answer is d
Tom
Super User Tom
Super User

Since the question says which of them COULD you import with PROC IMPORT than D is the right answer.

 

For all of them you can use the DELIMITER= statement to set the appropriate delimiter.

And for B you can use the GETNAMES= statement to tell there is no header line to use for making guesses about what names to use for the variables.

 

But as to SHOULD you use PROC IMPORT to read them?  I would say NO since none of them have so many variables or observations that would require you to use a guessing procedure like PROC IMPORT.    Instead you could just write the data step to read them yourself.

 

And for B the guesses that PROC IMPORT makes

proc import file=b dbms=csv out=b replace;
  getnames=no;
run;

for how to write the data step to read it

    data WORK.B    ;
    infile B delimiter = ',' MISSOVER DSD lrecl=32767 ;
       informat VAR1 $16. ;
       informat VAR2 $6. ;
       informat VAR3 $13. ;
       informat VAR4 $4. ;
       informat VAR5 $9. ;
       informat VAR6 $10. ;
       informat VAR7 $8. ;
       format VAR1 $16. ;
       format VAR2 $6. ;
       format VAR3 $13. ;
       format VAR4 $4. ;
       format VAR5 $9. ;
       format VAR6 $10. ;
       format VAR7 $8. ;
    input
                VAR1  $
                VAR2  $
                VAR3  $
                VAR4  $
                VAR5  $
                VAR6  $
                VAR7  $
    ;
    run;

will be much worse than what you would do.

data b;
  infile b dsd truncover ;
  input Region :$20. Product :$20. Country :$20. Count (Price1-Price3) (:comma.);
  format price1-price3 dollar10.;
run;

 Not only did PROC IMPORT not make any attempt to make meaningful names for the variables it mistakenly made every variable character even though the last four clearly look like nunbers.

 

And the code it writes does silly things like:

  • using the side effect of having the names first appear in INFORMAT statements to force the data step compiler to guess that the variable's type and length should match the informat type and informat width.
  • adding the $ modifier in the INPUT statement for variables that have already been defined as character.
  • attaching informats and formats to variables that do not need them.

Also note that the code you need to write to read the file is almost smaller than the code you need to write to have PROC IMPORT guess how to write code to read the file.

Spoiler
data a;
  infile cards dsd dlm='&' firstobs=2 truncover ;
  input Region :$8. State :$2. Month :anydtdte. Expenses Revenue;
  format Month monyy7. ;
cards;
Region&State&Month&Expenses&Revenue
Southern&GA&JAN2001&2000&8000
Southern&GA&FEB2001&1200&6000
Southern&FL&FEB2001&8500&11000
Northern&NY&FEB2001&3000&4000
Northern&NY&MAR2001&6000&5000
Southern&FL&MAR2001&9800&13500
Northern&MA&MAR2001&1500&1000
;

data b;
  infile cards dsd truncover;
  input Region :$20. Product :$20. Country :$20. Count (Price1-Price3) (:comma.);
  format price1-price3 dollar10.;
cards;
"Africa","Boot","Addis Ababa","12","$29,761","$191,821","$769"
"Asia","Boot","Bangkok","1","$1,996","$9,576","$80"
"Canada","Boot","Calgary","8","$17,720","$63,280","$472"
"Eastern Europe","Boot","Budapest","22","$74,102","$317,515","$3,341"
"Middle East","Boot","Al-Khobar","10","$15,062","$44,658","$765"
"Pacific","Boot","Auckland","12","$20,141","$97,919","$962"
"South America","Boot","Bogota","19","$15,312","$35,805","$1,229"
"United States","Boot","Chicago","16","$82,483","$305,061","$3,735"
"Western Europe","Boot","Copenhagen","2","$1,663","$4,657","$129"
;

data c;
  infile cards dsd dlm=' ' firstobs=2 truncover ;
  input Region :$7. State :$20. Capital :$30. Bird :$30.;
cards;
Region State Capital Bird
South Georgia Atlanta 'Brown Thrasher'
South 'North Carolina' Raleigh Cardinal
North Connecticut Hartford Robin
West Washington Olympia 'American Goldfinch'
Midwest Illinois Springfield Cardinal
;

 

Nipun22
Obsidian | Level 7
but my doubt is that you can import the delimited sensitive data with the data step using the DSD option but how'd do that with the PROC IMPORT?
take for example that option B and C are delimited sensitive data.
How'd you import those using PROC IMPORT?
Kurt_Bremser
Super User

What did you try, and what have your tests revealed?

 

Hints:

DBMS=DLM option for the PROC IMPORT statement.

DELIMITER statement.

GETNAMES statement.

 

BTW you would find all these by following Maxim 1 and reading the documentation for the IMPORT Procedure 

Tom
Super User Tom
Super User

@Nipun22 wrote:
but my doubt is that you can import the delimited sensitive data with the data step using the DSD option but how'd do that with the PROC IMPORT?
take for example that option B and C are delimited sensitive data.
How'd you import those using PROC IMPORT?

I answered that already.

 

For all of them you can use the DELIMITER= statement to set the appropriate delimiter.

And for B you can use the GETNAMES= statement to tell there is no header line to use for making guesses about what names to use for the variables.

 

If you are still having trouble click below to see example code:

Spoiler
proc import dbms=csv file=A out=A replace;
  delimiter='&';
run;
proc import dbms=csv file=B out=B replace;
  getnames=no;
run;
proc import dbms=csv file=C out=C replace;
  delimiter=' ';
run;
Kurt_Bremser
Super User

Maxim 4.

Try It.

 

For a professional environment, the only true answer would be

e.

You do not use PROC IMPORT for delimited files, but write the DATA step yourself.

ballardw
Super User

I generally agree with the statements about "write your own". 

 

However Proc Import does have uses when the data sources will not document the files or document when they change files. I worked for years on projects where the contractor preparing our data would routinely without notice change the 1) column orders of variables 2) add or remove variables 3) change the lengths of variables 4) change the number of header rows. Not to mention changing how to access the data in general.

 

When the program I used for the previous month files generated errors or ended up with garbage values I often used Proc Import to get an idea of where the changes were so I could tweak my code that included things like usable variable names (instead of 200+ length character column headings, special characters ), variable labels, use of custom INFORMATS to validate supposedly fixed lists of expected values.

 

Every month the mantra was "Job security, job security, job security". 

Retired now and not missing such things.

ballardw
Super User

@Kurt_Bremser wrote:

I specifically mentioned "professional environment". You were forced to work for bungling idiots.


Government actually, which doesn't preclude the "bungling idiots". Even after several contract renewals I couldn't get the the contracts with the external organization providing the services and data to include any of the data requirements.

 

And CDC, yes the Centers for Disease Control, data for one project was worse. They supplied code to "read" the data files that at one point included converting all the CSV files, about a hundred of them, to XLSX and used Proc Import to read them. Including code that replaced the SAS system options such as FMTSEARCH path and some data manipulation that was just plain wrong. Not the only instance of poor CDC data manipulation I ran into in more than 20 years.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 10 replies
  • 319 views
  • 0 likes
  • 4 in conversation