BookmarkSubscribeRSS Feed
kjohnsonm
Lapis Lazuli | Level 10

Hello all,
I have a less than optimal situation, where a new system has gone live at my work and we do not have our reports nailed down, and no longer have ODBC/SQL access to raw data. I am stuck with doing the best I can with a range of totally different reports used for totally different tasks (reports-A - J).  However, the real issue is since these reports were not nailed down to mutually 'accepted' standards I have a slew of different data set versions per report that might have one more column or missing a column per report run and 11 months and counting of these reports to reconcile. The order of the appearance of columns can change in these reports. Sometimes the column shows-up but the field was 100% empty and SAS csv reader is assuming a different data type than the column actually has say on the next report run when it was corrected. In order to do the best I can to get SAS to correctly read these CSV files I have been using the ‘Proc import options’ getnames=YES; and GUESSINGROWS=MAX;
I work with metadata* all the time and can write SAS code to produce simple fixes like:
“ALTER TABLE [my_db_name].[dbo].[my_table_A] ALTER COLUMN Site_Location [varchar](70);”

*both SAS and MS SQL metadata. (to pass for example to my SQL server were all this data will end up in the end) …if for example the next data run the metadata width for a column max changes from say 65 characters to 70. (it's 100% working now and has zero results when no sizing is needed)  I can find the reports that have missing columns and am working on a solution to code in a for example:
proc sql;

Create table … as

Select

‘’ as Site_Location format = $70.,

 

Etc…

from (csv data source)

;quit;

But at this time doing this last PROC SQL seems very challenging for each add or drop of a column.


Okay end of the background, this seems like a game of endless wack-a-mole. 

Is there a better way?  My ask is not for code, but maybe links to SAS presentation documents that might help me understand and manage this tangled web to align my columns to the best of my ability without having to hard code every single report read to a yet moving standard? or SAS tools (and or other tools) that might already available, that might help with this?   (I code in base SAS)
I am not very sure how to go about these missing columns or convert to different data type, from money to float etc.

I cannot toss the data sets with missing columns we basically have to do the best we can and build a data warehouse out of this information.

Thanks in advance, sorry if this is a highly covered topic, I ran out of search ideas for discovery.  -KJ  😎


PS as a follow up for this request is if a report fails to run for any reason the system if run by hand creates excel files, and when opening them excel also makes assumptions and the data can be come dirty at that point. So a long term funneling here is likely for the excel side of this task.

oh how I miss my ODBC connections + effective dates and dislike object oriented systems. 😎 ...but it's a living.

4 REPLIES 4
kjohnsonm
Lapis Lazuli | Level 10
So per report type 'A' example the long term goal is to vertically stack all snap_shot(s) to the same relative column names. I did not say it but it's also possible that a names of a column can change. A co-worker describes this as alignment of the data columns but when I research this on line alignment seems mostly popping for left/right/center justification of data within a cell for example. I am not concerned with cell justification. if 'alignment' is a better description sorry for not including it in the original post. -KJ
ballardw
Super User

A generic comment: Time to get management involved and let them know about the costs, man-hours is money, involved with dealing with "a moving standard" for reports. I used to work as a contractor using data extracts from a client. The asked why they were getting billed for additional programming every cycle. Once we explained that the cost was because the people preparing the data 1) moved columns, 2) changed column headings so we had to verify what was in each column, 3) changed numbers of characters in fields and 4) changed numbers of columns the management team for the project got involved to standardize the files to the original file descriptions we had been provided.

 

Second, I have seen very few report formats that I would expect Proc Import to deal with properly and typically reports do end up requiring some (and frequently nasty) custom coding.

 

Perhaps you can arrange, probably through your management, additional files that are not in the report format but designed for actual data interchange that Proc Import might manage better.

 

Are those "Excel files" actually created as XLSX or CSV that some lets Excel corrupt? Don't let Excel open/save CSV or similar text files as it will corrupt data and may your work harder.

 

A somewhat comic aside: The solution to such issues is repeated lashing with wet spaghetti noodles until someone gets a clue.

kjohnsonm
Lapis Lazuli | Level 10
funny
"A somewhat comic aside: The solution to such issues is repeated lashing with wet spaghetti noodles until someone gets a clue."

I have been telling them about the costs/time. I don't expect this to last forever, but I seem to be the lowest of priorities. But the excel side of this is concerning
ballardw
Super User

@kjohnsonm wrote:
funny
"A somewhat comic aside: The solution to such issues is repeated lashing with wet spaghetti noodles until someone gets a clue."

I have been telling them about the costs/time. I don't expect this to last forever, but I seem to be the lowest of priorities. But the excel side of this is concerning

I agree about the Excel concerns, which is why I asked if they are making native XLSX files or converting something else. If you can get the "something else", if it exists, your bit might be a little less of a problem.

 

Maybe you can at least get someone involved enough to address one issue, such as the changing order. That I would typically lay at the hands of "lazy programmer; no cookie" kind of activity. Like someone letting another program generate the reports with no attempt to control the order or not knowing the software well enough to avoid that.

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 785 views
  • 0 likes
  • 2 in conversation