BookmarkSubscribeRSS Feed
LisaYIN9309
Obsidian | Level 7

I've come across this task that requires me to concatenate patient information from 30 different sites/30 different tables, each table has 20 columns, I've rearranged them so that all the column orders and names are the same across 30 dataset.

 

However, each table was submitted by different sites, so there were lots of messy formats going on. i.e., patient_Ids are charter values in some sites' data (they were unique identifier, but can be anything, like UDWGER15581DE2) but are numeric values in others (1,2,3....). Same case for other variables, like drug_ID. For the same reason, this also means the length of the values, if there were character, were different, some Patient_ID are $15, some are $49.....

 

To sum up, I was wondering if there is any easy way to systematically unify them? Instead of changing 20*30=600 variables by hand?

 

Any suggestion will be greatly appreciated.

Thank you

4 REPLIES 4
ballardw
Super User

If by this:  I've rearranged them so that all the column orders and names are the same across 30 dataset. You mean that you have arranged some SAS datasets with the variables in a common order I would be very tempted to:

 

1) Proc Export all of the sets to CSV files, may want to use the option Putnames=no so it is easier in the next step

2) combine all the csv files into a single one

3) Proc import the large csv file with a large value for guessing rows.

4) either modify the data step generated by proc import to use your desired variable names and formats or use proc datasets to do so.

 

 

Reeza
Super User

You'd never do it by hand, I like @ballardw solution the best, but if you want to keep it in SAS you should set up a master data set that has the formats you want and then use that to check variables one by one and convert. You'd design it for one data set and then automate but it's a pain. The writing it to a CSV file and then importing is a good idea. 

 

 

PGStats
Opal | Level 21

If the problem is that some columns sometimes come up as numeric but you would like them to be character, you can convert the problematic columns with something like:

 

length tmp1 $nn;

tmp1 = cats(myProblematicColumn);

drop myProblematicColumn;

rename tmp1=myProblematicColumn;

 

This doesn't affect the column if it is already character.

PG
s_lassen
Meteorite | Level 14

@LisaYIN9309:

If you can define an output format where all variables are character, this method may work: use the VVALUEX function.

Given a couple of tables with different variable types, but same variable names:

data a;
  length x 8 y $30 z $14;
  input x y z;
cards;
1 2 3
5 fdsfgh 2
3 c 31254
;run;

data b;
  length x $3 y 8 z $4;
  input x y z;
cards;
1rr 2 3
25 2 gg
33 31254 f
;run;

You then define a model table where all variables are character, and long enough to hold all values, e.g.:

data want;
  length x $8 y $30 z $20;
  stop;
run;

You can then submit the same datastep for each input to add your data to the model dataset - you may want to put that in a macro:

%macro add_table(table);
data want;
  if 0 then modify want;
  array out _character_;
  set &table(rename=(x=_1 y=_2 z=_3));
  do _N_=1 to dim(out);
    out(_N_)=left(vvaluex(cats('_',_N_)));
    end;
  output;
run;
%mend;

Just make sure that the variables in the RENAME option do not collide with the real variable names.

Now you need just one line of code for each input table:

%add_table(a);
%add_table(b);

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 832 views
  • 0 likes
  • 5 in conversation