SASPhile
Quartz | Level 8
Problem Description:
We receive monthly data in Excel files from various vendors. The data types of the variables are not consistent from one pharmacy to the next. I would like to define the data type for every variable and check it before any analysis is done. How can I achieve this?
For instance:
I want patient_birth_year and zip to be numeric so that I can apply the z5. format to zip later in my program. Some vendors supply zip as character and some as numeric. I would like to check the variables at the very beginning.
2 REPLIES
ChrisNZ
Tourmaline | Level 20
The proper way is to analyse the variable types one by one (using vtypex() or proc contents) and fix the bad ones.
A faster way is to redefine them all in bulk and let SAS do the conversion, like:

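[pre]
data clean_up;
  /* Declare the target types first: varc* character, varn* numeric */
  length varc1 varc2 $8 varn1 varn2 8;
  keep varc1 varc2 varn1 varn2;
  set excel_import(keep=tempvarc1 tempvarc2 tempvarn1 tempvarn2);
  /* SAS converts automatically on assignment and writes a note to the log; */
  /* character values that are not valid numbers become missing.            */
  varc1 = tempvarc1;
  varc2 = tempvarc2;
  varn1 = tempvarn1;
  varn2 = tempvarn2;
run;
[/pre]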
Depending on how much clean-up is required, the second way might be sufficient.
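
For the first approach (checking the types before doing anything else), a minimal sketch could query DICTIONARY.COLUMNS, which holds the same information that proc contents reports. It assumes the import landed in WORK.EXCEL_IMPORT and that patient_birth_year and zip are the variables expected to be numeric; the table name type_check is only illustrative.

```sas
/* Collect the actual type of each variable we care about.        */
/* Assumption: the imported table is WORK.EXCEL_IMPORT.           */
proc sql;
  create table type_check as
  select name, type, length
  from dictionary.columns
  where libname = 'WORK'
    and memname = 'EXCEL_IMPORT'
    and upcase(name) in ('PATIENT_BIRTH_YEAR', 'ZIP');
quit;

/* TYPE is 'num' or 'char'; flag anything that is not numeric.    */
data _null_;
  set type_check;
  if type ne 'num' then
    put 'WARNING: ' name 'is character; expected numeric.';
run;
```

Running this right after each import makes it obvious from the log which vendor file needs fixing before the analysis steps start.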
sbb
Lapis Lazuli | Level 10
I would say that permitting any variability in the input data is much like a crapshoot.

The example with ZIPCODE is a good one - do you intend to truncate a ZIP+4 (character string) to its first 5 characters when one is submitted, just so you can treat the data as numeric? Why not take the data as the sender intended and convert it properly to character, preserving the leading zeros that certain zip codes have?
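
One way this could look, as a rough sketch rather than a definitive routine: CATS accepts either a numeric or a character argument, so the same step runs whichever type the vendor sent. The data set name excel_import is borrowed from the earlier reply, and the names zip_clean and zip_c are made up for illustration.

```sas
data zip_clean;
  set excel_import;
  length zip_c $10;                 /* room for ZIP+4, e.g. 02134-1234 */
  zip_c = cats(zip);                /* works for numeric or character  */
  if zip_c = '.' then zip_c = ' ';  /* a missing numeric zip           */
  /* An all-digit value shorter than 5 has lost its leading zeros,  */
  /* which happens when the vendor stored the zip as a number.      */
  else if not missing(zip_c)
      and notdigit(trim(zip_c)) = 0
      and length(zip_c) < 5 then
    zip_c = put(input(zip_c, 5.), z5.);
run;
```

Values that already arrive as character, including ZIP+4 strings, pass through unchanged, so nothing the sender supplied is truncated.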

If you intend to maintain job security by allowing uncontrolled input formats, so be it -- and SAS can handle the task, presuming the individual supporting the application can maintain the input/decoding SAS program(s) needed to ensure data quality along the way.

Scott Barry
SBBWorks, Inc.
