Hi,
Is it possible to do Data Quality work using only SAS/Base, SAS/STAT, SAS/Enterprise Miner and SAS/Enterprise Guide? In other words, is it possible to replace DataFlux with other SAS applications? As far as I know, DataFlux is really expensive to get.
What are your requirements?
I wanna make a Data Quality project, but I don't have the budget to buy DataFlux. I do have SAS/Base, SAS/Enterprise Guide and SAS/Enterprise Miner. Is it possible to do it with those? In other words, is DataFlux strictly required for this kind of project?
It would depend on which specific DataFlux functions you are concerned with.
I use Base SAS to produce a lot of data quality reports, such as invalid values (dates out of range, end date before begin date, codes or identification values outside the expected set), missing values for required fields, inconsistent data (change of gender, race or ethnicity across multiple visits), and values outside the expected range (low-income clients with reported income exceeding $50,000 per month, monthly income of less than $10, family size exceeding 20 members, parents under age 10, relative humidity instruments with values over 100%). I can also check things like actual versus expected frequency of client visits or instrument reports.
Some of the decision might be based on how rapidly you need things to work and how variable your data may be. If your data variables change frequently, then a lot of time can be spent on code maintenance and keeping all dependencies correct when using Base SAS code.
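As a rough illustration of the kinds of checks described above, here is a minimal Base SAS sketch. The dataset `visits`, its variables, the sample rows, and the thresholds in the flags are all made up for this example; a real project would substitute its own tables and business rules.

```sas
/* Hypothetical visit-level data; names and values are illustrative only */
data visits;
    input client_id visit_no gender $ begin_date :yymmdd10.
          end_date :yymmdd10. monthly_income;
    format begin_date end_date yymmdd10.;
    datalines;
1 1 F 2020-01-05 2020-01-10 1200
1 2 M 2020-02-01 2020-02-03 1250
2 1 M 2020-03-10 2020-03-01 60000
3 1 F 2020-04-01 2020-04-02 5
4 1 F 2020-05-01 2020-05-04 .
;
run;

/* Row-level flags: 1 = rule violated, 0 = passed */
data visit_checks;
    set visits;
    /* End date before begin date */
    flag_date_order  = (not missing(begin_date) and not missing(end_date)
                        and end_date < begin_date);
    /* Required field is missing */
    flag_income_miss = missing(monthly_income);
    /* Out of expected range (assumed limits taken from the examples above) */
    flag_income_high = (monthly_income > 50000);
    /* Guard against missing, which SAS sorts below every number */
    flag_income_low  = (not missing(monthly_income) and monthly_income < 10);
run;

/* Count of violations per rule */
proc means data=visit_checks sum;
    var flag_:;
run;

/* Inconsistent data: clients reported with more than one gender across visits */
proc sql;
    create table gender_conflicts as
    select client_id, count(distinct gender) as n_genders
    from visits
    group by client_id
    having count(distinct gender) > 1;
quit;
```

Keeping each rule as its own 0/1 flag makes it easy to add or drop rules later, although, as noted, the maintenance cost grows if the variables change often.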
I was thinking about using regular expressions, and as far as I know I can use the functions that SAS/Base has for that. For me, DataFlux is good software for saving a lot of programming time, but I think the big help is the methodology, which can be implemented using only the other SAS applications. Am I correct?
Regular expressions are a way to work with individual variables. They are available in a DATA step. But that is only one way to deal with data quality. For instance, custom informats can validate ranges or specific values of numerics or code values when reading the data.
How big is the project you are working on, and what are your actual data concerns? It is hard to even guess what trade-offs might be involved and which way to go. It may well be that contacting SAS Institute with very detailed requirements for the project could give you some idea of the scope of the alternatives and the resources needed.
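To make the two approaches concrete, here is a small sketch that combines a custom informat with a regular expression in one DATA step. The informat name `humid_chk`, the station ID rule (two uppercase letters followed by three digits) and the sample rows are assumptions invented for this example.

```sas
/* Custom numeric informat: keep values from 0 to 100, set anything else to missing */
proc format;
    invalue humid_chk
        0 - 100 = _same_
        other   = .;
run;

data readings;
    /* The informat validates the humidity range while the data are read */
    input station_id :$8. humidity :humid_chk.;
    /* Hypothetical ID rule checked with PRXMATCH; 0 means no match */
    flag_bad_id   = (prxmatch('/^[A-Z]{2}\d{3}$/', strip(station_id)) = 0);
    /* Out-of-range values were turned into missing by the informat */
    flag_humidity = missing(humidity);
    datalines;
AB123 45.5
ab123 101
XY9 80
;
run;

proc print data=readings;
run;
```

One limitation of this sketch is that mapping out-of-range values to an ordinary missing value makes them indistinguishable from values that were missing in the raw data; if that distinction matters, the informat could use `_error_` for the `other` range so that SAS reports the offending lines in the log.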
"I wanna make a data quality project" isn't a requirement. What problems should your project solve? How do you measure whether you have succeeded? Which other persons/functions are involved? Who is the business owner of the data quality problem?
Regular expressions are a solution, but for what? They are free in Linux, so you need to say why you need to use SAS. There must be more to it...
DataFlux is an enterprise-level solution for data quality management, hence its pricing. It all depends on your requirements. Remember that DataFlux can come with country-specific knowledge bases that deal with local spelling and can include country address data. I'd hate to have to replicate that.
For example, if you are trying to scrub and geocode address data, doing that yourself in Base SAS would be just as expensive as getting DataFlux and having a solution in a few weeks, and that is assuming you have the expertise to begin with.