DATA Step, Macro, Functions and more

Grouping of Ids

Reply
N/A
Posts: 0

Grouping of Ids

Hello Sir/Madam,

Here is the sample data.I want to group together id's together based
on the value of id2 and finally have an idnew which will have the value
of the latest year's id as the final id value.I have tried to
illustrate with an example below.I have just shown the grouping for
one case it has to be the same everywhere and the id pertaining to the
latest year should be populated elsewhere (like 93693 is the id for
2003 that is why the new id has this value everywhere).Hope I will get
answers to this unique problem.

data ttt;
infile cards truncover;
input year id $ id1 $;
cards;
1999 8888
2000 66789 96876
2003 93693 42541
1996 78965 78545
2002 42541
2000 7796 8888
2001 42541 7796
2005 7798
2005 7798 1245
2004 4472 6668
2005 63373 737373
;

This is the sample data
Year id id1
1999 8888
2000 66789 96876
2003 93693 42541
2003 65574 64647
1996 78965 78545
2004 64647 63373
2002 42541
2000 7796 8888
2001 42541 7796
2005 7798
2005 7798 1245
2004 4472 6668
2005 63373 737373

Final Output

Year id id1 idnew

2003 93693 42541 93693
2002 42541 93693
2001 42541 7796 93693
2000 7796 8888 93693
1999 8888 93693

2005 737373 63373 737373
2004 63373 64647 737373
2003 64647 65574 737373

so on and so forth.........

Similarly other groups should be created in the same data.
Super Contributor
Super Contributor
Posts: 3,174

Re: Grouping of Ids

A DATA step will work here - after sorting your data file first in sequence using a BY variable list of your choice.

To "RETAIN" a running list of related variable values, you must declare a new SAS character variable of suitable "LENGTH" and then in the DATA step, use SAS FIRST. (the last variable in your BY list) to initialize your value-list variable and also LAST. to output your new SAS file/observation.

Since you must concatenate each related input observation to your "running list" variable, you will want to use a DATA step function like CATT / CATX / TRIM as you build your string of values.

The SAS support http://support.sas.com/ website has SAS-hosted documentation (HTML-format and PDF documents) and supplemental technical / conference papers. Also you can use the SEARCH facility to locate a related topic-oriented paper on the subject. A few are provided below for reference and review.

For what it's worth, here is a Google advanced search argument which can help you locate documents and other references on the SAS.COM domain/site:

data step programming site:sas.com


Scott Barry
SBBWorks, Inc.

DATA Step Processing
http://support.sas.com/documentation/cdl/en/lrcon/61722/HTML/default/a001281588.htm

How the DATA Step Identifies BY Groups
Processing Observations in a BY Group
http://support.sas.com/documentation/cdl/en/lrcon/61722/HTML/default/a000761931.htm


A Hands-On Introduction to SAS® DATA Step Programming
Debbie Buck, D. B. & P. Associates, Houston, TX
http://www2.sas.com/proceedings/sugi30/134-30.pdf


Advanced DATA Step Topics
Neil Howard, Independent Consultant, Charlottesville, VA
http://www2.sas.com/proceedings/sugi24/Advtutor/p47-24.pdf


http://www.sas.com/offices/NA/canada/newsletter/insights/Sept08/SUGI99.pdf
The RETAIN Statement: One Window into the SASâ Data Step
Paul Gorrell, Westat, Rockville, MD
N/A
Posts: 0

Re: Grouping of Ids

Sir,

Please read the post carefully it is beyond retaining a value.I want to co-relate the id based on a old id value and then group them together and then finally have a variable which will have the latest years value as its value through out.


Sachin
Super Contributor
Super Contributor
Posts: 3,174

Re: Grouping of Ids

Your description mentions "id2" but there is no such variable/column in your input or output/final data. Focus on explaining with SAS variable names as you have provided what should be input data and output results. I still believe that if you do some reading on DATA step programming and the references provided, it may help with your personal SAS programming development to learn and compose a SAS solution to your problem. Ideally this is a forum for learning, not just getting answers from others in the form of program code.

Scott Barry
SBBWorks, Inc.
Super Contributor
Posts: 474

Re: Grouping of Ids

Hello snair1981.

Could you please clarify the problem?
Because what you described don't match with the sample data.

> data ttt;
> infile cards truncover;
> input year id $ id1 $;
> cards;
> 1999 8888
> 2000 66789 96876
> 2003 93693 42541
> 1996 78965 78545
> 2002 42541
> 2000 7796 8888
> 2001 42541 7796
> 2005 7798
> 2005 7798 1245
> 2004 4472 6668
> 2005 63373 737373
> ;

First, the above code will not produce the data bellow (imissing some rows).

> This is the sample data
> Year id id1
> 1999 8888
> 2000 66789 96876
> 2003 93693 42541
> 2003 65574 64647
> 1996 78965 78545
> 2004 64647 63373
> 2002 42541
> 2000 7796 8888
> 2001 42541 7796
> 2005 7798
> 2005 7798 1245
> 2004 4472 6668
> 2005 63373 737373

Then...

> Final Output
>
> Year id id1 idnew
>
> 2003 93693 42541 93693
> 2002 42541 93693
> 2001 42541 7796 93693
> 2000 7796 8888 93693
> 1999 8888 93693

OK for the above group, but...

> 2005 737373 63373 737373
> 2004 63373 64647 737373
> 2003 64647 65574 737373

Here id's are swapped from original sample...

So...
Should the data be paired then pairs should be swapped and paired again?!?

Could you please be more specific about your task?

Cheers from Portugal.

Daniel Santos @ www.cgd.pt.
Ask a Question
Discussion stats
  • 4 replies
  • 125 views
  • 0 likes
  • 3 in conversation