BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
hjjijkkl
Pyrite | Level 9

I have a big data where the ID repeats multiple times (table 1) in a column and I want to reduce into 1/observation (table 2). Ones the IDs are reduces to 1 per obs, I want to merge table 2 with table 3.

Table 1

ID

1

1

2

3

3

3

4

4

 

I want the output to be

Table 2

ID

1

2

3

4

 

I want to merge table 2 with table 3 below to have an output like table 4

Table 3

ID

Visit1

Visit2

Visit3

1

1/4/18

6/9/18

10/12/18

3

7/8/16

7/30/18

9/6/18

 

I want the output to be

Table 4

ID

Visit1

Visit2

Visit3

1

1/4/18

6/9/18

10/12/18

2

.

.

.

3

7/8/16

7/30/18

9/6/18

4

.

.

.

 

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

If your data is sorted by id then one way:

data table_2;
   set table_1;
   by id;
   if first.id;
run;

When the data step uses a BY statement each variable on the statement gets two automatic temporary variables that have values 1 (true) and 0 (false) to indicate if the current observation is the First or Last of the by group. You access those with  the  First.variablename or Last.variablename construct.

The IF is a subsetting if and only keeps records for the output data set where the condition is true, i.e. the first record with that value of Id.

View solution in original post

3 REPLIES 3
Astounding
PROC Star

It might be easier to skip creating Table_2 entirely:

data table_4;
   merge table_1 table_3;
   by ID;
   if first.id;
run;

Of course this assumes that you don't really need Table_2 as long as you can get Table_4.

 

It requires both data sets to be in sorted order as a starting point.

hjjijkkl
Pyrite | Level 9
Thanks! But, I still need table 2? Please show me how to generate table 2.
ballardw
Super User

If your data is sorted by id then one way:

data table_2;
   set table_1;
   by id;
   if first.id;
run;

When the data step uses a BY statement each variable on the statement gets two automatic temporary variables that have values 1 (true) and 0 (false) to indicate if the current observation is the First or Last of the by group. You access those with  the  First.variablename or Last.variablename construct.

The IF is a subsetting if and only keeps records for the output data set where the condition is true, i.e. the first record with that value of Id.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 3 replies
  • 582 views
  • 2 likes
  • 3 in conversation