Combine data between rows

afernandezj · Posted 08-19-2015 08:41 AM

Hi,

I'm looking for a way to compare data between rows, and only show the distinct values, the idea is to create a repot that shows the all the possible contact numbers. I have the data from multiple sources and the same number is used as another contact number. The idea is to have all the unique contact numbers.

Is there a way to achieve this?, any help would be appreciated

Example of the data:

id	ph1	ph2	ph3	ph4	ph5
1	111			111
1			222
1		222
2				333
2		444	555
2					555
3	666
3				777
3	888	888

Desired Output

id	ph1	ph2	ph3	ph4	ph5
1	111		222
2		444		333	555
3	666	888		777

RW9 · Posted 08-19-2015 09:05 AM

Mmm, seems to be a lot of these questions at the moment. You are struggling due to the nature of the thinking. The data does not need to be transposed in this way, you will find life much easier in a normalised table. Consider this, the second datastep creates a normalised structure with the results going down the page. This can then easily be nodupkey sorted to give the unique values.

data have;

retain id ph1-ph5;

id=1;

ph1=111;

ph4=111;

output;

ph1=.;

ph4=.;

ph3=222;

output;

ph2=222;

output;

run;

data want (keep=id res);

set have;

array ph{5};

do i=1 to 5;

if ph{i} ne . then do;

res=ph{i};

output;

end;

run;

proc sort data=want nodupkey;

by id res;

run;

afernandezj · Posted 08-19-2015 09:26 AM

Hi, thank you for your response, I had that though and I was able to convert the data to use the same column for the phone number, and another column to display the type of phone number:

Example:

id	number	type
1	111	ph1
1	222	ph2
1	111	ph5
2	333	ph3
2	444	ph4

My problem here is removing the duplicate numbers, and since every ID has at least 3 numbers, I have a lot of data. The result was would not fit on an excel spreadsheet.

Is there a way to "clean" the data in this format?

RW9 · Posted 08-19-2015 09:42 AM

Why does the type matter? In your example in the first post you take 222 from ph3, why not from ph2? Doesn't seem to make sense to me. You basically just want to find a list of unique numbers per id, so drop type and nodupkey sort it, i.e. id=1 only has 2, 1 is 111, the other 222.

On the second point, you don't really want to use Excel, I mean what reason would you have. PDF for output, CSV or XML for transfer.

afernandezj · Posted 08-19-2015 09:49 AM

I'm being asked to display the type of contact information, for example if it is the house number, a cell, fax, etc, etc.

If there is a duplicate number, I should take just one, it does not matter witch type I choose.

RW9 · Posted 08-19-2015 10:06 AM

So, just proc sort, nodupkey, by id and number. This will give you 1 row per unique id+number, with a ype (probably the first which appears in the dataset.

afernandezj · Posted 08-19-2015 10:16 AM

Thank you, I'm generating the information, so I can try what you suggested. I will let you know of the results.

afernandezj · Posted 08-20-2015 09:12 AM

Using the proc sort nodupkey by id and number, i was able to get 1 row, per id and number, Thanks!

I'm still removing numbers the users type just to fill in the fields, then I will need to transpose the result.

Thank you for your help!

Combine data between rows

Re: Combine data between rows

Re: Combine data between rows

Re: Combine data between rows

Re: Combine data between rows

Re: Combine data between rows

Re: Combine data between rows

Re: Combine data between rows

SAS Innovate 2025: Call for Content

Classroom Training Available!