Help using Base SAS procedures

Merging data in a long format

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 15
Accepted Solution

Merging data in a long format

Greetings, I hope to get help on the following issue:

I have a dataset arranged in a long format, with repeated obervations for each subject; something like this:

PATIENT ID    VISIT         DATE          VAR X

1                      1         01/01/2005         ...

1                      2         01/07/2005         ...

1                      3               ...                ...

2                      1               ...                ...

2                      2               :..                ...

2                      3               ...                ...

Then I need to merge it with another dataset which has a similar structure but contains variable Y. I can't use the "merge" statement in a data step using the patient ID as the key because it is repeated and I can't use VISIT as a key neither because it's not unique. Does anyone has any idea on how to merge them?

Thank you,

Manuel


Accepted Solutions
Solution
‎10-16-2011 07:39 PM
Esteemed Advisor
Posts: 7,072

Merging data in a long format

Can't you just use "patient_id visit" as the by variable?

Otherwise, it would help if you provide examples of both datasets, the code you are trying, and what you want to accomplish.

View solution in original post


All Replies
Solution
‎10-16-2011 07:39 PM
Esteemed Advisor
Posts: 7,072

Merging data in a long format

Can't you just use "patient_id visit" as the by variable?

Otherwise, it would help if you provide examples of both datasets, the code you are trying, and what you want to accomplish.

Occasional Contributor
Posts: 15

Re: Merging data in a long format

I didn't know that I could use more than one variable in the BY statement. Indeed the "unique" key is the combination of both PATIENT_ID and VISIT variables.

I will try it to see if it works. Otherwise, I will give more specific info on my datasets and code.

EDIT: it's working. Thank you so much.

Manuel

Esteemed Advisor
Posts: 7,072

Merging data in a long format

Yes you definitely can include multiple variables in a by statement.  Let the Forum know if that corrects your problem.

Occasional Contributor
Posts: 15

Re: Merging data in a long format

Thank you.

Is there a way to create a variable that ranks each observation in order of time? In example, from the following dataset:

patient_id     date_of_measurement     value     rank

1                         40608                      1.5        1

1                         40707                      1.8        2

2                         31609                      2.0        1

2                         32705                      2.0        2

2                         40606                      2.2        3

The "rank" variable measures the temporal order of measurements within each patient; it is absent in the original dataset and should be created.

Many thanks.

Esteemed Advisor
Posts: 7,072

Re: Merging data in a long format

I think you are only asking for something like:

data have;

  input patient_id date_of_measurement  value;

  cards;

1   40608      1.5

1   40707      1.8

2   31609      2.0

2   32705      2.0

2   40606      2.2

;

proc sort data=have;

  by patient_id date_of_measurement;

run;

data want;

  set have;

  by patient_id;

  if first.patient_id then rank=1;

  else rank+1;

run;

Occasional Contributor
Posts: 15

Re: Merging data in a long format

Exactly. Another way would be PROC RANK, but yours can be embedded in my data step, so I will go for this.

Thank you.

Esteemed Advisor
Posts: 7,072

Re: Merging data in a long format

proc rank would give you ties on same date entries.  Would you want that?

Occasional Contributor
Posts: 15

Re: Merging data in a long format

I shouldn't have such data in my dataset. But then again, it might serve as a double check if I run PROC RANK and then look for ties within the same subject.

Even if performing the procedure inside the data step just seems more elegant to me (not a big issue, I know).

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 181 views
  • 6 likes
  • 2 in conversation