Hi All,
I need help with below case. Below is input [first_name,last_name,addr]
Tony;Greg;apt#711;
Tony;Greg;1276 pan ave;
Tony;Greg;TUCSON AZ 85705;
Tony;Greg;usa;
I need to group by first_name and Last_name and put address value per each record.
Output: [first_name,last_name,addr1,addr2,addr3,addr4,addr5,addr6]
Tony;Greg;apt#711;1276 pan ave;TUCSON AZ 85705;usa; ; ;
can anyone please help me how I can achieve this.
Best Regards,
The simplest way is using proc means + idgroup .
Or MERGE skill . Me ,Matt, Arthur.T have talked about at this paper .
http://support.sas.com/resources/papers/proceedings15/2785-2015.pdf
data have;
infile cards truncover expandtabs;
input (Firstname Lastname NewAddr OldAddr) (: $40.);
cards;
Tony Greg apt#1276 apt#909
Tony Greg DRAGRAM DRAGRAM
Tony Greg TUCSON AZ 85705 TUCSON AZ 85705
Tony Greg USA USA
Ricky Pointing apt#711 300
Ricky Pointing Jugle ave East BOYLSTON AVE E
Ricky Pointing PHOENIX SEATTLE
Ricky Pointing AZ 85123 WA 98102
Ricky Pointing USA USA
arjuna ranatunga 100 MAIN ST apt#1276
arjuna ranatunga PO BOX 1022 100 Islington rd west
arjuna ranatunga SEATTLE WA 98104 Seattle
arjuna ranatunga USA WA 93102
arjuna ranatunga USA
;
run;
proc sql;
select max(n) into : n
from (select count(*) as n from have group by Firstname,Lastname);
quit;
proc summary data=have nway;
class Firstname Lastname;
output out=want idgroup(out[&n] (NewAddr OldAddr)=);
run;
Xia Keshan
Message was edited by: xia keshan
Assuming your data are sorted:
proc transpose data=have prefix='addr' out=want;
by first_name last_name;
var addr;
run;
Hi,
Yes, I tried this. But I need 6 variables for address.
In the example, there are only 4 occurance; therefore, addr5, addr6 should be blank.
Best Regards,
any help.
Best Regards,
You need to provide better sample data and sample required output. Please also include some more details in your question.
Reeza wrote:
You need to provide better sample data and sample required output. Please also include some more details in your question.
Sorry, I will try to be more clear.
I have input data as shown below. I need to group based on first and last name and populate address(s) across [horizontal] like shown below [max 6 address variables each].
Input:
First name | Last name | New Addr | Old Addr |
Tony | Greg | apt#1276 | apt#909 |
Tony | Greg | DRAGRAM | DRAGRAM |
Tony | Greg | TUCSON AZ 85705 | TUCSON AZ 85705 |
Tony | Greg | USA | USA |
Ricky | Pointing | apt#711 | 300 |
Ricky | Pointing | Jugle ave East | BOYLSTON AVE E |
Ricky | Pointing | PHOENIX | SEATTLE |
Ricky | Pointing | AZ 85123 | WA 98102 |
Ricky | Pointing | USA | USA |
arjuna | ranatunga | 100 MAIN ST | apt#1276 |
arjuna | ranatunga | PO BOX 1022 | 100 Islington rd west |
arjuna | ranatunga | SEATTLE WA 98104 | Seattle |
arjuna | ranatunga | USA | WA 93102 |
arjuna | ranatunga | USA |
Output:
First name | Last name | New addr1 | New addr2 | New addr3 | New addr4 | New addr5 | New addr6 | Old addr1 | Old addr2 | Old addr3 | Old addr4 | Old addr5 | Old addr6 |
Tony | Greg | apt#1276 | DRAGRAM | TUCSON AZ 85705 | USA | apt#909 | DRAGRAM | TUCSON AZ 85705 | USA | ||||
Ricky | Pointing | apt#711 | Jugle ave East | PHOENIX | AZ 85123 | USA | 300 | BOYLSTON AVE E | SEATTLE | WA 98102 | USA | ||
arjuna | ranatunga | 100 MAIN ST | PO BOX 1022 | SEATTLE WA 98104 | USA | apt#1276 | 100 Islington rd west | Seattle | WA 93102 | USA |
Best Regards,
The simplest way is using proc means + idgroup .
Or MERGE skill . Me ,Matt, Arthur.T have talked about at this paper .
http://support.sas.com/resources/papers/proceedings15/2785-2015.pdf
data have;
infile cards truncover expandtabs;
input (Firstname Lastname NewAddr OldAddr) (: $40.);
cards;
Tony Greg apt#1276 apt#909
Tony Greg DRAGRAM DRAGRAM
Tony Greg TUCSON AZ 85705 TUCSON AZ 85705
Tony Greg USA USA
Ricky Pointing apt#711 300
Ricky Pointing Jugle ave East BOYLSTON AVE E
Ricky Pointing PHOENIX SEATTLE
Ricky Pointing AZ 85123 WA 98102
Ricky Pointing USA USA
arjuna ranatunga 100 MAIN ST apt#1276
arjuna ranatunga PO BOX 1022 100 Islington rd west
arjuna ranatunga SEATTLE WA 98104 Seattle
arjuna ranatunga USA WA 93102
arjuna ranatunga USA
;
run;
proc sql;
select max(n) into : n
from (select count(*) as n from have group by Firstname,Lastname);
quit;
proc summary data=have nway;
class Firstname Lastname;
output out=want idgroup(out[&n] (NewAddr OldAddr)=);
run;
Xia Keshan
Message was edited by: xia keshan
Hi Xia,
Thank you. I used Transpose + merge and it worked as expected.
Best Regards,
PROC TRANSPOSE will create as many variables are are actually needed by your data. If you are only getting four address variables, it's because that is as many as is required to transpose your data. You can easily convert that to six variables with two blank ones. Operating on the output of the PROC TRANSPOSE:
data want;
set want;
length addr5 addr6 $ 100;
run;
Pick whatever length you feel is appropriate for the extra two variables.
Hi Astounding,
But it is not always that first 4 records will have address.
There is a case where address is only one record i.e. addr1 has entire data and addr2 thru addr6 should be space.
There is a case where address is in 6 records i.e. addr1 thru addr6 has value [complete address].
Best Regards,
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.