BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Skillside
Calcite | Level 5

Hi,

I have data with 4 variables: drugs A, B, C, D. They are in separate columns. Now, I would like to merge them into one variable called drugs. So variable drug = observations from A, then after A, observations from B etc. And of course, I will need second variable (a grouping variable) named groups where A = 1, then B=2, C=3.

So now it looks like this:

Dataset1:

 

dose1 dose2

1.23        2.25

1.25        3.50

 

and i want to have:

 

Dataset1:

 

dose

 

1.23

1.25

2.25

3.50

 

and for specific values i would like to create grouping variable so for 1.23 and 1.25 group A, for 2.25 and 3.50 group B

I used to do it manually but i am tired of this 🙂

I couldn't find a code for this problem, sorry!

Thank in advance for any resposne.

Regards,

Skillside

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

You're close, but need a couple of changes to use the code I posted.  Try it this way:

 

data want;
group = 'A';
do until (done1);
set work.a (rename=(lekr=dose)) end=done1;
output;
end;
group = 'B';
do until (done2);
set work.a (rename=(drar=dose)) end=done2;
output;
end;
keep group dose;
run;

proc print;
run;

 

This assumes that your data set work.a contains both LEKR and DRAR.

View solution in original post

13 REPLIES 13
Reeza
Super User
TRANSPOSE.

How to deal with your groups depends on your data structure, but for now PROC TRANSPOSE is all you need or a data step.

Google "UCLA SAS wide to long + either PROC TRANSPOSE or DATA STEP" and you'll find two really good tutorials on how to do this step by step.
PaigeMiller
Diamond | Level 26

Are we able to assume that the first column of numbers that you call dose1 (even if there are more than two) always are A, and the second column of numbers that you call dose2 (even if there are more than two) are always B? Where is C and D that you mention?

 

Can you give us a slightly larger and more realistic data set (that matches your description) and desired outcome?

--
Paige Miller
Skillside
Calcite | Level 5

Thank you for the all answers. I know UCLA website, however, it doesn't solve the problem. I attached a little part of my database.

I would like to merge two variables (even more) and create another one which will be a grouping variable. For first variable from attachment it should give values “1” and for second variable should give value “2”.

Tom
Super User Tom
Super User

It is really not at all clear what you what to do. Especially because you are using the verb MERGE that has very specific meaning in SAS syntax for something that does not appear to be at all related to a merge operation.

 

Sound like you want to just tranpose your data, but it will be hard to use PROC TRANPOSE if you don't actually have "real" dataset that has some type of unique identifier of the observations.

 

Why not just combine the two "columns" into one "column". 

 

So if your original dataset is named HAVE and has two variable named VAR1 and VAR2 that are of compatible type you can create a new dataset named WANT that has one variable named NEWVAR with all of the values from both variables in the original dataset.

data want ;
  set have(keep=var1 rename=(var1=newvar))
      have(keep=var2 rename=(var2=newvar))
  ;
run;

 

PaigeMiller
Diamond | Level 26

@Skillside wrote:

Thank you for the all answers. I know UCLA website, however, it doesn't solve the problem. I attached a little part of my database.

I would like to merge two variables (even more) and create another one which will be a grouping variable. For first variable from attachment it should give values “1” and for second variable should give value “2”.


Many people cannot or will not download Microsoft Office documents because they are a security threat. Whatever explanation or data that you provided in the .docx file can be pasted right here into this window as text (not as a picture). Thanks

--
Paige Miller
Astounding
PROC Star

Here's one approach:

 

data want;
   group = 'A';
   do until (done1);
      set have (rename=(dose1=dose)) end=done1;
      output;
   end;
   group = 'B';
   do until (done2);
      set have (rename=(dose2=dose)) end=done2;
      output;
   end;
   keep group dose;
run;
   

That should accomplish what you illustrated.  If you have other conditions to add, the program will have to be adapted to match.

Skillside
Calcite | Level 5

It seems like your suggestions is fine, however, I use data which is in library "work". So when i run your code it says that there are no observations.

Here is the code:

data want;
set work.a;
group = 'A';
do until (done1);
set have (rename=(lekr=dose)) end=done1;
output;
end;
group = 'B';
do until (done2);
set have (rename=(drar=dose)) end=done2;
output;
end;
keep group dose;
run;

proc print;
run;

 

Sorry for docx file.

Tom
Super User Tom
Super User

Is the dataset with your data named A (you can also call that as WORK.A) or HAVE (you can also call that WORK.HAVE).

Either way why are you trying to read both of them?

Skillside
Calcite | Level 5

Probably the problem is the way I learnt it.

So firstly i imported data from xls to sasuser then I create virtual working data called a which exsits in library WORK.

So i was sure that statement "data.name" is for creating virtual database, while set statement is about: let's read data from source that i want (in this case sasuser.name).

So I don't understand why you mention that i want to read both of them?

Thanks for your answer!

Tom
Super User Tom
Super User

@Skillside wrote:

Probably the problem is the way I learnt it.

So firstly i imported data from xls to sasuser then I create virtual working data called a which exsits in library WORK.

So i was sure that statement "data.name" is for creating virtual database, while set statement is about: let's read data from source that i want (in this case sasuser.name).

So I don't understand why you mention that i want to read both of them?

Thanks for your answer!


The DATA statement is the start of a data step.  You are right that you put the name(s) of the dataset(s) that you want to create with this data step on that statement following the keyword DATA.  Note there is a space between the keyword data and the dataset name (and if you are creating multiple output datasets in the same step a space between each of them).  Perhaps you are mixing up the two level naming convention that you should use to refer to datasets?  LIBREF.MEMNAME where LIBREF is the name you created with you LIBNAME statement (or one of those automatically created by SAS at startup like WORK or SASUSER) and MEMNAME is the name of the dataset you are referencing that is stored in that library.  If you don't use the two level name than that is the same using WORK as the libref.

 

The SET statement is for reading in datasets. You can also use MERGE statement and UPDATE statement to specify datasets to be read.

 

I asked which dataset you wanted to read because YOUR code was reading from two different datasets in the three SET statements that it contained.

...
set work.a;
...
set have (rename=(lekr=dose)) end=done1;
...
set have (rename=(drar=dose)) end=done2;
...

 

WORK datasets are not virtual.  It is just that SAS will delete them when you are done.  That is one of the bad habits that I see in a lot of SAS novices. They are always adding data steps to make copies of their data for no good reason.  It just makes their code more complex and take longer.  So if your real source dataset is in SASUSER then use that name in your SET statement. No need to first copy it to a work dataset.

 

Perhaps it is a side effect of how they are taught?

Skillside
Calcite | Level 5

Dear Tom, now I understand what You meant. Thank you for little lesson that you gave me. However, please, keep in mind that with your knowledge it is probably easier to understand me then the other way around. Of course there might be some problems with definitions but i hope we can quickly sort it out! Thank you

Astounding
PROC Star

You're close, but need a couple of changes to use the code I posted.  Try it this way:

 

data want;
group = 'A';
do until (done1);
set work.a (rename=(lekr=dose)) end=done1;
output;
end;
group = 'B';
do until (done2);
set work.a (rename=(drar=dose)) end=done2;
output;
end;
keep group dose;
run;

proc print;
run;

 

This assumes that your data set work.a contains both LEKR and DRAR.

Skillside
Calcite | Level 5

This is it!. Thank you for help and all interesting comments guys. One day I would like to have your SAS knowledge/skills! 

Regards,

Skillside

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 13 replies
  • 18070 views
  • 0 likes
  • 5 in conversation