Hi everyone.
So I’ve got code and data that I’d like to share.
What I’ve done is going to sound nuts, but I hope the madness makes sense.
And frankly, it will make sense to the right people - the ones who understand the nature and the reasons behind all this.
Long Term Purpose:
Help people work with data and code for the VRDC.
One Needed Element:
We need "local" data that looks like what we experience on the VRDC.
The best way to show the usefulness of code meant for the VRDC is to use it locally, on our own machines, with data that "looks" like the VRDC.
The Problem or Challenges:
Often I hear about how teams are working like crazy to get "good" code ready for running on the VRDC.
Or they code one version for the VRDC and another version for something "not on the VRDC".
Teams inadvertently spend too much time maintaining different versions of CODE, and those code differences can cause issues. A "minor" difference in our code on the VRDC can cost us time finding out "why?".
A part of the Challenge:
This leads to teams working on the same code and project, but in different locations. That makes it hard to use GIT and version control.
It also makes it hard to have teams use more than one syntax.
Long Term Benefit:
I've got a method built around crafting code locally - prep, QA, and code review - and then running proven code on the VRDC.
For that to work, we need a local copy of some useful data to mechanically test our code against.
This release is that copy of data: a way to ensure what we've built and tested among "us" locally is in fact ready for the VRDC.
What we Want:
1. We want to code "local" to ensure MORE of our team tests and vets our code.
2. We want to code "local" to ensure our code CAN work before we hit large data on VRDC.
3. We want to test and sample "local" as much as we can and prepare novice SAS users for real VRDC work.
This method of "code local" and "run proven code on the VRDC" ensures higher quality.
So in order to accomplish that, we need a LOCAL copy of data that emulates the VRDC data as much as possible.
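Here's a minimal sketch of the shape of that workflow in SAS. The librefs and paths are placeholders for illustration, not my released code - one flag decides whether the librefs point at the local synthetic copies or at the real VRDC libraries, so the analysis code underneath never has to change:

    %let env = LOCAL;   /* flip to VRDC once the code is proven */

    %macro set_libs;
       %if %upcase(&env) = LOCAL %then %do;
          /* Local synthetic copies of the data (placeholder paths) */
          libname mbsf "C:\vrdc_synth\mbsf";
          libname rif  "C:\vrdc_synth\rif";
       %end;
       %else %do;
          /* Real VRDC libraries (placeholder paths) */
          libname mbsf "/vrdc/placeholder/mbsf";
          libname rif  "/vrdc/placeholder/rif";
       %end;
    %mend set_libs;
    %set_libs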
More To Share:
I will share more of how we can use this overall process and GIT to improve our team workflow.
Again - free. I just need to see that people want this, and I'm happy to share what I have and know - for free.
What I've Done with PUF Data:
I’ve taken the CMS PUF data and spun it into VRDC-style versions for everyone to use.
I’ll release the actual code that produces this sample data later on; it’s not passing my OCD perfection standards just yet.
For now, let’s start with this data and what it has to offer.
Details on what I've done with PUF to VRDC Data:
I’ve got that PUF synthetic data covering 2008 thru 2010.
I spun it up thru 2016. Yep - anyone with knowledge of the PUF data will have questions.
I'll write up more of the "what was done" as release notes in my GIT repo.
PUF was turned into: MAX - by year, by state
PUF was turned into: RIF - by year, by month
PUF was turned into: MBSF - by year
If a "field" was compatible as PUF - i flipped it to a VRDC configuration - as much as possible.
Heck, I even crafted some synthetic ICD10 values to pepper in there. Again - more details soon.
I've also dealt with components like HCPCS, and peppered in some values from the PUF pharma table.
Again, the goal is for more people to have an analog way of working with code and data that matches the VRDC.
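To make the "by year, by state" layout concrete, here's the kind of loop that partitioning enables. The libref and dataset names below are placeholders I'm using for illustration - the real names will be in the release notes:

    %macro stack_max(first=2008, last=2016, state=IL);
       /* Stack one state's yearly MAX files into a single table */
       data work.max_&state;
          set
          %do y = &first %to &last;
             maxlib.max_&state._&y
          %end;
          ;
       run;
    %mend stack_max;

    %stack_max(first=2008, last=2016, state=IL)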
Why?
- Because the code I’m releasing alongside it is really helpful. Well, IMHO.
- Macros to create views and spin thru the years and the numerous data tables found in the crazy VRDC libs.
- Macros to create rate ratios.
- Macros for summary cubes.
- A method of code that lets you toggle between, say, MAX and RIF (see the sketch just below).
To me, that "toggle" code - the ability to quickly switch between versions like RIF, LDS, and MAX - is very important. Again: all the code I'm going to share needs a place where people can test and play with it. Hence the data I'm sharing here. Not to mention the need for people to learn how to use RSUBMIT to really get performance improvements from the VRDC.
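Roughly, the toggle idea looks like this. One macro variable picks the flavor, and one macro resolves the flavor-specific table name, so the downstream analysis code never changes. The librefs and table names here are placeholders for illustration:

    %let flavor = MAX;   /* flip to RIF to swap data sources */

    %macro claims_src(year);
       %if %upcase(&flavor) = MAX %then %do;
          maxlib.max_ip_&year      /* placeholder MAX inpatient table */
       %end;
       %else %do;
          riflib.inpatient_&year   /* placeholder RIF inpatient table */
       %end;
    %mend claims_src;

    data work.ip_claims;
       set %claims_src(2016);
    run;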
I feel like some of these methods cannot accurately be tested or demonstrated without a useful local environment.
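And since RSUBMIT came up - for anyone who hasn't used SAS/CONNECT yet, the pattern is roughly this. The session name, librefs, and table names are placeholders, and sign-on specifics vary by site:

    signon vrdc;   /* connect to the remote session; options vary by site */

    rsubmit vrdc;
       /* Everything in this block runs on the server, next to the data */
       proc sql;
          create table work.bene_counts as
          select bene_id, count(*) as n_claims
          from rif.inpatient_2016   /* placeholder table name */
          group by bene_id;
       quit;

       /* Ship only the small summary back to the local session */
       proc download data=work.bene_counts out=work.bene_counts;
       run;
    endrsubmit;

    signoff vrdc;

The point: do the heavy lifting server-side, then download only the summarized result instead of dragging big claims tables across the wire.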
My GIT VRDC repo will be the place to start to find the Google Drive link.
I'll update the repo I've listed below with Google Drive links to the data this Monday, 2/24.
MAX data: it's ready. RIF data: it's ready. So you can download those zip files and get set up.
This way, as I release macros and other code for you to test and play with, there is a useful place to test that as well.
If anyone is interested in a walk-thru overview - contact me.
I'll see about putting on a short webinar tutorial in early March.
I’ve also drafted, and am building, a class on GIT for teams who work with various syntax languages and SAS for the VRDC. That will be for the fall SAS conferences. But if time permits, I might be able to sneak in a “practice” class in the summer, before the conferences, so I can rehearse. So I might ask for one or two people to take part in it for free.
I'm also attending SAS Global Forum in DC next month.
If anyone else is attending - let's gather, and I'm happy to cover these concepts in an ad-hoc presentation.
So I hope all of this is of value to people.
Code great things!
Best
Zeke Torres
#PUF #CMS #VRDC #LDS #RIF #Medicare #Medicaid #Claims #Healthcare #HealthCareAnalytics #Resdac #AdvancedMacros #Macros #Rsubmit #MBSF #GIT #ICD9 #ICD10 #Pharma #NPI #Physician #HCPCS