BookmarkSubscribeRSS Feed
thisisneeraj
Fluorite | Level 6

Hello Everyone,

 

Warm greetings for the day.

I have written a proc import to read a delimted file whose number of columns were constant.

 

Now, the source provider has conveyed that the new columns will keep adding/deleting every month in this delimited file.

So, could you please advise that how do I approach this problem where the number of columns keeps increasing/decreasing every month?

 

I have following approach in mind :

Guess, I can start to maintain a metadata excel, which will store the column names and its SAS data type with length and format .

This excel will act as source to identify the number of columns in the delimited file.

Then writing a macro enabled proc import with column names coming from the excel file ,could solve the issue.

I will work on this approach this weekend.

 

but ,does anyone has any better solution please ?

I will be thankful

Thanks a lot .

kind regards,

Neeraj singh

3 REPLIES 3
Reeza
Super User

You say Excel but delimited file. Those are two very different things, which one are you dealing with?

 

Your process sounds pretty much exactly what's required. 


Assuming delimited file:

 

1. Use PROC IMPORT, setting GUESSINGROWS=MAX to get better results. 

2. Maintain a table of fields. Add a 'check' process that alerts you if there are new/different columns

3. Add a step to convert field types

4. Politely chastise at data provider explaining the amount of work they've added for you - this should really be step 1. And then you go back and do 1 to 3 because you don't really have a choice 😉

 


@thisisneeraj wrote:

Hello Everyone,

 

Warm greetings for the day.

I have written a proc import to read a delimted file whose number of columns were constant.

 

Now, the source provider has conveyed that the new columns will keep adding/deleting every month in this delimited file.

So, could you please advise that how do I approach this problem where the number of columns keeps increasing/decreasing every month?

 

I have following approach in mind :

Guess, I can start to maintain a metadata excel, which will store the column names and its SAS data type with length and format .

This excel will act as source to identify the number of columns in the delimited file.

Then writing a macro enabled proc import with column names coming from the excel file ,could solve the issue.

I will work on this approach this weekend.

 

but ,does anyone has any better solution please ?

I will be thankful

Thanks a lot .

kind regards,

Neeraj singh


 

RW9
Diamond | Level 26 RW9
Diamond | Level 26

"Now, the source provider has conveyed that the new columns will keep adding/deleting every month in this delimited file." - what does your import agreement contain?  You can code around, or re-do it each import, however it will cause problems all over the place.  Why are the columns changing?  I would bet a £1 that the provider is an Excel user, and due to zero control are just doing it themselves, or that they are using the Excel thinking of putting data across the page, i.e. years going to the right.

Fix the structure of the import data or you will never stop coding/re-coding your process.  Putting the year (in my example) going down the page means that observations change but structure does not - this is how it is done in DB's/SAS and various other tools.

So not this:

ID   2013  2014   2015  2016 2017

1     a        b         c        d       e

 

but this:

ID  YEAR  RESULT

1    2013    a

1    2014    b

...

Tom
Super User Tom
Super User

PROC IMPORT does not care how many columns are in the source file. It looks at the file and guesses how many columns there are, what to call the variables and what type length to use for the variables. So if you give a different file the PROC IMPORT code doesn't change but the generated dataset reflects the data that is in it.

 

Did you mean that you had created a data step to read the delimited file?

If the changes are predictable then perhaps your data step can automatically adjust.  For example they might mean that the data has repeated measures and some rows will have more repeats than others.  If so a simple data step can do that automatically if the structure is clear.

 

Did you mean that you had created other steps after the PROC IMPORT step that are assuming that certain variable exist?

If you you might be able to make that code more flexible also.  Perhaps by running PROC CONTENTS on the dataset that PROC IMPORT generated.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 923 views
  • 3 likes
  • 4 in conversation