BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
robertrao
Quartz | Level 8

Hi,

I have 4 datasets created after running a macro.

All 4 have 2 variables.

category   value

I want to merge by CATEGORY.and rename the value in each of the 4 datasets so that all of them are side by side??

Regards

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

My guess is you're using the macro to create a table of characteristics for different groups, ie control and intervention if it was a clinical trial.

My suggestion is to build that capability into the macro rather than modify the end results. Though what I've done, when in a rush is output it all to excel and merge it manually (eyeballing) it.

It needs to get added in to a manuscript usually so does require doing some manual work anyways.

View solution in original post

15 REPLIES 15
Haikuo
Onyx | Level 15

how about :

data want;

  merge  h1(rename=value=value1)  h2(rename=value=value2)  h3(rename=value=value3)  h4(rename=value=value4);

by category;

run;

Haikuo

robertrao
Quartz | Level 8

That was a quick response. I got the answer I wanted because i was wanting to it 4 different steps.

But I have a small issue . I donot want to disturb the order of the category while I merge. So I merge with out the sorting of the cotegorical variable

I get this error!!!1

3093  data combine;
3094  merge STATC_OUT(rename=value=value1)  STATQ_OUT(rename=value=value2)
3094! STATCQ_OUT(rename=value=value3)  statCQN_OUT(rename=value=value4);
3095  by categorical;
3096  run;

ERROR: BY variables are not properly sorted on data set WORK.STATC_OUT.
categorical=Range value1=16.0 - 77.0 value2=16.0 - 72.0 value3=16.0 - 77.0 value4=2.0 - 86.0
FIRST.categorical=1 LAST.categorical=1 _ERROR_=1 _N_=5

robertrao
Quartz | Level 8

i have something like this(showing the first dataset here)

categorical                   value

AGE

count(missing)           800

mean                         20.3

Range                       13

SEX

Female                      500

Male                           300

RACE

aSIAN                         400                    

HISPANIC                   400

Haikuo
Onyx | Level 15

You have to sort them before using 'merge by'.  proc sql does not require presorting, but 1.) it is different to data step merge, so it may not produce what you want. 2) It is still not guarantee the order unless you using monotonic() in one of the inputing datasets.

So for your problem, if you are certain that all 4 of your incoming tables have exact structure row by row, same amount of total obs and same amount of obs for every category, and at the same order, simply put they are the same tables except the value contents, then you can use:

1. merge without by variable

2. set.

Haikuo

Astounding
PROC Star

Yes, or to put it another way ...

If thy BY offends thee, pluck it out.

It sounds like all 4 data sets have the same categories in the same order.  So suggestion #1 seems appropriate.

robertrao
Quartz | Level 8

Hiiiiiiiiii

I went for it without the BY.

Unfortunately, for example all 106 had no AIDS, so AIDS+ row is eliminated in one dataset.Maybe in the other some people had AIDS and that adds to an extra row which is messing up things!! Any suggestions????

Categorical    value

AIDS

AIDS-              106

AIDS+       

In other words 3 data sets of the 4 are having 55 observations but the odd one has 49.     Thats messing up the merge                 

Astounding
PROC Star

Unfortunately, you have theoretically defined a problem without a unique solution.  Look at this possiblity for CATEGORICAL:

dataset 1:

AIDS

AIDS-

dataset 2:

AIDS-

AIDS+

dataset 3:

AIDS+

What would the proper order be, when you combine them?  There are a few possibilities, which means the computer shouldn't decide ...  you will have to.  One approach would create a format, such as:

value $order

'AIDS'='01'

'AIDS+'='02'

...

Typing this out would be tedious, but you don't need to type it out if you would like to designate one of your "complete" data sets as having the "proper" order.  Use that data set to define the order:

data proper_order;

   set dataset_1;

   retain fmtname '$order';

   start=categorical;

   label = put(_n_, z3.);   /* just in case you have more than 100 categories one day */

run;

proc format cntlin=proper_order;

run;

Then apply the format, assigning an order variable to each data set.  Illustrating with just dataset_4:

data dataset_4;
   set dataset_4;

   order = put(categorical, $order.);

run;

Finally, sort and merge BY ORDER.

Good luck.

Reeza
Super User

Look into adding the sparse option in the proc freq, that way you should have an option for each group

Or look at changing the macro to accommodate by groups on a different level, ie with each result before it stacks them together. I'd started that a bunch of times, but never had a chance to complete it.

robertrao
Quartz | Level 8

BTW I am using the macro you provided me.  I will look into the sparse option of the proc freq and see

I dint understand your second sentence.

Regards

Reeza
Super User

My guess is you're using the macro to create a table of characteristics for different groups, ie control and intervention if it was a clinical trial.

My suggestion is to build that capability into the macro rather than modify the end results. Though what I've done, when in a rush is output it all to excel and merge it manually (eyeballing) it.

It needs to get added in to a manuscript usually so does require doing some manual work anyways.

robertrao
Quartz | Level 8

Hi,

In the macro for proc freq I tried to incorporate the SPARSE option and it does nt work for me...

Reeza
Super User

What version of SAS do you have?

robertrao
Quartz | Level 8

9.1

robertrao
Quartz | Level 8

Hi,

I went to export each of the 4 datasets and merged manually in Excel and then imported Back.

Thanks for the help

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 15 replies
  • 2188 views
  • 6 likes
  • 4 in conversation