Hello,
I have the below sample data that contain phone numbers of employees. Each phone number is flagged with status which could be invalid OR bad OR null.
Uniq ID | Phone | phone_code |
123 | 3438748789 | INVL |
123 | 2349854087 | BAD |
123 | 3986540912 | |
333 | 9835273309 | INVL |
333 | 7635876123 | BAD |
333 | 3897640987 | |
333 | 4589076678 | BAD |
333 | 3322998761 | |
532 | 4758572343 | |
532 | 4298128908 | INVL |
532 | 2892483040 | BAD |
532 | 4989812908 | |
532 | 8830981234 | INVL |
532 | 7812965432 | |
532 | 6098345785 | INVL |
987 | 4898735275 | INVL |
987 | 9612433890 | INVL |
987 | 5412985671 | BAD |
765 | 8745454353 | BAD |
765 | 4834953934 | |
765 | 9834623483 | |
765 | 9823349855 | |
654 | 7686767866 | |
438 | 3498375095 | BAD |
438 | 2953058303 | BAD |
I am looking to transpose the data to one record per uniq ID and create the below flags(all_good, all_bad & GEONE_bad).
All_good = Y when there is no INVL OR BAD status flagged to any of the non-missing phone numbers.
GEONE_BAD = Y when INVL OR BAD status flagged to atleast one phone number.
All_bad = Y when there is INVL OR BAD status flagged to all of the non-missing phone numbers
Below is my desired output:
Uniq ID | phone1 | phone2 | phone3 | phone4 | phone5 | phone6 | phone7 | phone_code1 | phone_code2 | phone_code3 | phone_code4 | phone_code5 | phone_code6 | phone_code7 | All_good | GEONE_Bad | All_bad |
123 | 3438748789 | 2349854087 | 3986540912 | INVL | BAD | N | Y | N | |||||||||
333 | 9835273309 | 7635876123 | 3897640987 | 4589076678 | 3322998761 | INVL | BAD | BAD | N | Y | N | ||||||
532 | 4758572343 | 4298128908 | 2892483040 | 4989812908 | 8830981234 | 7812965432 | 6098345785 | INVL | BAD | INVL | INVL | N | Y | N | |||
987 | 4898735275 | 9612433890 | 5412985671 | INVL | INVL | BAD | N | Y | Y | ||||||||
765 | 8745454353 | 4834953934 | 9834623483 | 9823349855 | Y | N | N | ||||||||||
654 | 7686767866 | Y | N | N | |||||||||||||
438 | 3498375095 | 2953058303 | BAD | BAD | N | Y | Y |
Any help is much appreciated. Thank you!!.
Data aa;
infile cards truncover;
input uniq_id $ phone phone_code $;
cards;
123 3438748789 INVL
123 2349854087 BAD
123 3986540912
333 9835273309 INVL
333 7635876123 BAD
333 3897640987
333 4589076678 BAD
333 3322998761
532 4758572343
532 4298128908 INVL
532 2892483040 BAD
532 4989812908
532 8830981234 INVL
532 7812965432
532 6098345785 INVL
987 4898735275 INVL
987 9612433890 INVL
987 5412985671 BAD
765 8745454353 BAD
765 4834953934
765 9834623483
765 9823349855
654 7686767866
438 3498375095 BAD
438 2953058303 BAD
;
Quit;
proc sql noprint;
select max(n) into : n
from (select count(*) as n from aa group by uniq_id);
quit;
proc summary data=aa nway;
class uniq_id;
output out=temp(drop=_:) idgroup(out[&n] (phone phone_code)=);
run;
proc sql;
create table key as
select uniq_id,
ifc(sum(phone_code in ('INVL' 'BAD')), 'N','Y') as All_good,
ifc(sum(phone_code in ('INVL' 'BAD')), 'Y','N') as GEONE_BAD,
ifc(sum(phone_code in ('INVL' 'BAD'))=count(*), 'Y','N') as All_bad
from aa
group by uniq_id;
create table want as
select a.*,All_good,GEONE_BAD,All_bad
from temp as a left join key as b
on a.uniq_id=b.uniq_id;
quit;
proc print noobs;run;
Please post data in a data step with datalines, so we have something easily usable for testing.
Sorry, here you go:
Data aa;
infile cards;
input uniq_id $3. @5 phone $10. @17 phone_code $4.;
cards;
123 3438748789 INVL
123 2349854087 BAD
123 3986540912
333 9835273309 INVL
333 7635876123 BAD
333 3897640987
333 4589076678 BAD
333 3322998761
532 4758572343
532 4298128908 INVL
532 2892483040 BAD
532 4989812908
532 8830981234 INVL
532 7812965432
532 6098345785 INVL
987 4898735275 INVL
987 9612433890 INVL
987 5412985671 BAD
765 8745454353 BAD
765 4834953934
765 9834623483
765 9823349855
654 7686767866
438 3498375095 BAD
438 2953058303 BAD
;
Quit;
See this:
Data aa;
infile cards truncover;
input uniq_id :$3. phone :$10. phone_code :$4.;
cards;
123 3438748789 INVL
123 2349854087 BAD
123 3986540912
333 9835273309 INVL
333 7635876123 BAD
333 3897640987
333 4589076678 BAD
333 3322998761
532 4758572343
532 4298128908 INVL
532 2892483040 BAD
532 4989812908
532 8830981234 INVL
532 7812965432
532 6098345785 INVL
987 4898735275 INVL
987 9612433890 INVL
987 5412985671 BAD
765 8745454353 BAD
765 4834953934
765 9834623483
765 9823349855
654 7686767866
438 3498375095 BAD
438 2953058303 BAD
;
data valid;
set aa;
by uniq_id notsorted;
if first.uniq_id
then do;
count = 0;
bad_count = 0;
end;
count + 1;
if phone_code > '' then bad_count + 1;
if last.uniq_id;
if bad_count = 0
then do;
all_good = 'Y';
geone_bad = 'N';
all_bad = 'N';
end;
else do;
all_good = 'N';
geone_bad = 'Y';
if bad_count = count
then all_bad = 'Y';
else all_bad = 'N';
end;
keep uniq_id all_good geone_bad all_bad;
run;
proc transpose
data=aa
out=trans1
;
by uniq_id notsorted;
var phone phone_code;
run;
proc transpose
data=trans1
out=trans2
;
var col:;
by uniq_id notsorted _name_ notsorted;
run;
data trans3;
set trans2;
by uniq_id notsorted _name_ notsorted;
length name $32;
if first._name_
then count = 1;
else count + 1;
name = cats(_name_,count);
run;
proc transpose
data=trans3
out=trans4 (drop=_name_)
;
by uniq_id notsorted;
id name;
var col1;
run;
data want;
merge
trans4
valid
;
run;
Thank You!
Data aa;
infile cards truncover;
input uniq_id $ phone phone_code $;
cards;
123 3438748789 INVL
123 2349854087 BAD
123 3986540912
333 9835273309 INVL
333 7635876123 BAD
333 3897640987
333 4589076678 BAD
333 3322998761
532 4758572343
532 4298128908 INVL
532 2892483040 BAD
532 4989812908
532 8830981234 INVL
532 7812965432
532 6098345785 INVL
987 4898735275 INVL
987 9612433890 INVL
987 5412985671 BAD
765 8745454353 BAD
765 4834953934
765 9834623483
765 9823349855
654 7686767866
438 3498375095 BAD
438 2953058303 BAD
;
Quit;
proc sql noprint;
select max(n) into : n
from (select count(*) as n from aa group by uniq_id);
quit;
proc summary data=aa nway;
class uniq_id;
output out=temp(drop=_:) idgroup(out[&n] (phone phone_code)=);
run;
proc sql;
create table key as
select uniq_id,
ifc(sum(phone_code in ('INVL' 'BAD')), 'N','Y') as All_good,
ifc(sum(phone_code in ('INVL' 'BAD')), 'Y','N') as GEONE_BAD,
ifc(sum(phone_code in ('INVL' 'BAD'))=count(*), 'Y','N') as All_bad
from aa
group by uniq_id;
create table want as
select a.*,All_good,GEONE_BAD,All_bad
from temp as a left join key as b
on a.uniq_id=b.uniq_id;
quit;
proc print noobs;run;
@vicky07 wrote:
Uniq_id should be actually numeric variable. Would it change anything in the code.
No, no and triple no. ID's are not used for calculations, so store them as character.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.