BookmarkSubscribeRSS Feed
NonSleeper
Quartz | Level 8

Suppose the hypothetical data looks like this:

data temp;

input ID Rec1 OrderRec1 Rec2 OrderRec2;

datalines;

001     10     1     .         1

001      .       2      10     2

001     11     3      11     3

001     12     4     .         4

001     .        5     12      5

;

The set-up is: For each cluster of ID (here I give only ID=001), there always exists pairs of Rec1 and Rec2 such that Rec1=Rec2. These variables can be non-missing for each observation (as in the third row) or one can be missing in one observation and appears in the other. The order variables are always available, they have certain categories (here 5) and indicate the ordering of Rec1 and Rec2.

I now want to create a data that contrast Rec1 and Rec2 for each cluster of ID, and if they are equal, create a new variable (say OrderRec2New) that equals the value of OrderRec2.  In other words, I want to make a new data with records that look like row 3. Unnecessary information from Rec2 and OrderRec2 can be deleted, it looks like this:

001     10     1     10     2

001     11     3     11     3

001     12     4     12     5

4 REPLIES 4
Fugue
Quartz | Level 8

Let's see if I've got this right . . . .

If REC1 = REC2 then keep the record (e.g. 3rd record in your sample input data).

If REC2 is missing, then combine the record with the next sequential record (e.g. first two records of your input data become one row, with REC1, ORDERREC1 from the first row and REC2, ORDERREC2 from the second row).

Assuming I understand this correctly, here's a PROC SQL approach:

proc sql;

     create table want as

     /* get records that already have matches */

     ( select *

     from temp

     where rec1 = rec2 )

    UNION

     /* merge non-matches */

     ( select t1.*, t2.rec2, t2.orderrec2

     from temp t1

          /* get the missing values from the next obs */

          inner join

          ( select t2.*

            from temp t2

            where t2.rec1 is missing and t2.rec2 is not missing ) r2

     on r2.id = t1.id

          and r2.recno = t1.recno + 1

     where t1.rec1 is not missing and t1.rec2 is missing )

;

quit;

RichardinOz
Quartz | Level 8

A datastep merge might be simpler, assuming the data is ordered as shown:

Data want ;

     Merge    have (where = (Rec1 is not Null)

                           drop = Rec2 orderRec2

                          )

                  have (where = (Rec2 is not Null)                        

                           drop = Rec1 orderRec1

                          )

                  ;

     By     ID ;

Run ;

Richard

LinusH
Tourmaline | Level 20

Even a simple SQL inner join would work, and does not require that the data is presorted.

Data never sleeps
AnnaBrown
Community Manager

I'm glad you received a couple of helpful answers, NonSleeper. I wanted to let you know that there is a SAS Support Data Management Community where you can post questions like these. You may find other useful resources there as well.

Anna


Join us for SAS Community Trivia
SAS Bowl XXIX, The SAS Hackathon
Wednesday, March 8, 2023, at 10 AM ET | #SASBowl

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 620 views
  • 10 likes
  • 5 in conversation