BookmarkSubscribeRSS Feed
TBanky
Calcite | Level 5

Hello

I have a rather annoying problem I need to deal with, namely - I have a few datasets, some of them are template dataset with only variable names and no actual data. I need to compare the structure of these templates with the actual datasets. I know how to do the comparison itself but I have a problem with something else. These templates have some suffixes and prefixes, which are dates but they aren't written like 201412 but YYYYMM so the names are for example YYYYMMCARD1. Real dataset have names with dates in a normal format - 201412CARD1. So there's a problem when comparing them because in order to compare variables the tables need to have the same names. I've tried using substr but templates have lots of different prefixes and suffixes, some are shorter like YYYY or YYMM and some of these 'date formats' aren't even at the beginng or at the end of a name. I'm trying to figure out how to do that without doing dozens of different comparisons and substrings for every single possibility. Does anyone have any idea? Maybe there's a way to make SAS treat these parts of a name like numbers (some kind of format maybe? or a join with LIKE condition?) so that a normal join could be possible?


Thank you in advance
3 REPLIES 3
ballardw
Super User

Are these supposedly SAS datasets or external files, perhaps in another DBMS?

I ask as 2014CARD1 is not typically a valid SAS variable name.

If you are concerned about the data set structures i.e. variable type and size by position that may be possible.

Provide two small examples of what your have for structure and what you want as a report.

TBanky
Calcite | Level 5

201412CARD1 isn't a variable name, it's a table name. Every single table is created as a new version of a previous dataset and these versions are presented as dates like 201412. Im other words, I have many different dataset:

- customerdata

- products

- etc.

And each of these dataset have different versions, so I he sth like:

- 201411customerdata

- 201412customerdata

- products201309

- etc.

I also have a list of these tables with variables that are associated with them but the names of these tables are written like:

- YYYYMMcustomerdata

- productsYYYYMM

So it's a template name that shows that each tables needs to have a version. And what I want to do is to check whether some variables were added, deleted or maybe the strucute itself wasn't even affected when making a change.

My way of doing that was to make a dataset that would list each version and it's variables in a similiar way to the list of templates I was writing about above.

After that the comparison itself didn't seem like a problem and

as I said before, comparing the structure isn't really what I have problem with. The main issue is that in odrder for this to work I need to join this two dataset (one with table templates and the other one with actual tables) and it's not that straightforward because the names are different. I thought of joinig them on table names using substr but since I have lots of possibilities to cover it makes my code look terribly. Maybe there's a way to treat these version templates as numbers? Or maybe there's a smart way to create a join?

Reeza
Super User

A Valid table name in sas starts with a character or _.

Assuming your data is actually the way you say it is, consider looking as dictionary.column. Parse out the name, so that you have two fields - YYYYMM and table name. You can look at the compress function for this with the ka/kd options to keep alpha/numeric.

Then  transpose the data based on table name/ so the fields of interest become variables and you'll have your comparison/results.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 902 views
  • 0 likes
  • 3 in conversation