BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Giovani
Obsidian | Level 7

Hi SAS experts!

I am facing a situation and I am not sure how to overpass this.

I have a database organized as the data Current. Does anyone know how I can remove the empty cells and create new information as the Desired data?

Thanks

 

data Current;
   input ID name $ Number Street $ City $;
   datalines;
101 Riley 1132 . . 
101 Riley . MappleAve . 
101 Riley . . Downtown
101 Riley . . Dream
101 Riley . 5thStreet .
101 Riley 1012 . .
222 Apple 132 . . 
222 Apple . . . 
222 Apple . . Greenville
222 Apple . . York
222 Apple . ElevenAve .
222 Apple 112 . .
;
run;

 

data Desired;

   input ID name $ Number Street $ City $;

   datalines;

101 Riley 1132 MappleAve Downtown

101 Riley 1012 5thStreet Dream

222 Apple 132 . Greenville

222 Apple 112 ElevenAve York

;

run;

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

Almost nothing to change. Can you spot the differences? Make sure you understand the code before using it.

 

data Current;
input ID name $ Number $ Street $ City $;
datalines;
101 Riley 1132A . .
101 Riley . MappleAve .
101 Riley . . Downtown
101 Riley . . Dream
101 Riley . 5thStreet .
101 Riley 1012A . .
101 Riley . . Dreams
101 Riley . 4thStreet .
101 Riley 1013B . .
101 Riley 1014CB . .
101 Riley . . Life
101 Riley . 3thStreet .
222 Apple B132 . .
222 Apple . . .
222 Apple . . Greenville
222 Apple . . York
222 Apple . ElevenAve .
222 Apple A112 . .
;

data want;
set current; by id name;
length _N _S _C $24;
retain _N _S _C;
if  not missing(_N) and not missing(Number) or
    not missing(_S) and not missing(Street) or
    not missing(_C) and not missing(City) then do;
    output;
    call missing(_N, _S, _C);
    end;
_N = coalescec(_N, Number);
_S = coalescec(_S, Street);
_C = coalescec(_C, City);
if last.name then do;
    output;
    call missing(_N, _S, _C);
    end;
keep ID name _N _S _C;
rename _N=Number _S=Street _C=City;
run;

proc print data=want noobs;
run;
PG

View solution in original post

7 REPLIES 7
novinosrin
Tourmaline | Level 20

hello @Giovani 

 

Taking a chance

data Current;
   input ID name $ Number Street :$12. City :$12.;
   datalines;
101 Riley 1132 . . 
101 Riley . MappleAve . 
101 Riley . . Downtown
101 Riley . . Dream
101 Riley . 5thStreet .
101 Riley 1012 . .
222 Apple 132 . . 
222 Apple . . . 
222 Apple . . Greenville
222 Apple . . York
222 Apple . ElevenAve .
222 Apple 112 . .
;
run;

data temp;
update current(obs=0) current;
by id;
if city>' ' then output;
run;


data want;
set temp;
by id name city	notsorted;
if last.city;
run;




Giovani
Obsidian | Level 7

@novinosrin 

 

Your solution was not able to handle when I had more them 2 information's per ID like in the case bellow

data Current;
input ID name $ Number $ Street $ City $;
datalines;
101 Riley 1132A . .
101 Riley . MappleAve .
101 Riley . . Downtown
101 Riley . . Dream
101 Riley . 5thStreet .
101 Riley 1012A . .
101 Riley . . Dreams
101 Riley . 4thStreet .
101 Riley 1013B . .
101 Riley 1014CB . .
101 Riley . . Life
101 Riley . 3thStreet .
222 Apple B132 . .
222 Apple . . .
222 Apple . . Greenville
222 Apple . . York
222 Apple . ElevenAve .
222 Apple A112 . .
;
run;

data Desired;
input ID name $ Number $ Street $ City $;
datalines;
101 Riley 1132A MappleAve Downtown
101 Riley 1012A 5thStreet Dream
101 Riley 1013B 4thStreet Dream
101 Riley 1014C 3thStreet Life
222 Apple B132 . Greenville
222 Apple A112 ElevenAve York
;
run;

PGStats
Opal | Level 21

Assuming you want a new address when a new id, name, number, street or city is encountered:

 

data want;
set current; by id name;
length _N 8 _S _C $24;
retain _N _S _C;
if  not missing(_N) and not missing(Number) or
    not missing(_S) and not missing(Street) or
    not missing(_C) and not missing(City) then do;
    output;
    call missing(_N, _S, _C);
    end;
_N = coalesce(_N, Number);
_S = coalescec(_S, Street);
_C = coalescec(_C, City);
if last.name then do;
    output;
    call missing(_N, _S, _C);
    end;
keep ID name _N _S _C;
rename _N=Number _S=Street _C=City;
run;
PG
Giovani
Obsidian | Level 7

@PGStats 

Thanks for your solutions it seem to be close to handle the issue. I release that I have a information's for my variable named as "Number" as character and not numeric. Any idea how overpass this error "Invalid numeric data, Number='1132A'" ?

 

data Current;
input ID name $ Number $ Street $ City $;
datalines;
101 Riley 1132A . .
101 Riley . MappleAve .
101 Riley . . Downtown
101 Riley . . Dream
101 Riley . 5thStreet .
101 Riley 1012A . .
101 Riley . . Dreams
101 Riley . 4thStreet .
101 Riley 1013B . .
101 Riley 1014CB . .
101 Riley . . Life
101 Riley . 3thStreet .
222 Apple B132 . .
222 Apple . . .
222 Apple . . Greenville
222 Apple . . York
222 Apple . ElevenAve .
222 Apple A112 . .
;
run;

 


data want2;
set current; by id name;
length _N 8 _S _C $24;
retain _N _S _C;
if not missing(_N) and not missing(Number) or
not missing(_S) and not missing(Street) or
not missing(_C) and not missing(City) then do;
output;
call missing(_N, _S, _C);
end;
_N = coalesce(_N, Number);
_S = coalescec(_S, Street);
_C = coalescec(_C, City);
if last.name then do;
output;
call missing(_N, _S, _C);
end;
keep ID name _N _S _C;
rename _N=Number _S=Street _C=City;
run;

PGStats
Opal | Level 21

Almost nothing to change. Can you spot the differences? Make sure you understand the code before using it.

 

data Current;
input ID name $ Number $ Street $ City $;
datalines;
101 Riley 1132A . .
101 Riley . MappleAve .
101 Riley . . Downtown
101 Riley . . Dream
101 Riley . 5thStreet .
101 Riley 1012A . .
101 Riley . . Dreams
101 Riley . 4thStreet .
101 Riley 1013B . .
101 Riley 1014CB . .
101 Riley . . Life
101 Riley . 3thStreet .
222 Apple B132 . .
222 Apple . . .
222 Apple . . Greenville
222 Apple . . York
222 Apple . ElevenAve .
222 Apple A112 . .
;

data want;
set current; by id name;
length _N _S _C $24;
retain _N _S _C;
if  not missing(_N) and not missing(Number) or
    not missing(_S) and not missing(Street) or
    not missing(_C) and not missing(City) then do;
    output;
    call missing(_N, _S, _C);
    end;
_N = coalescec(_N, Number);
_S = coalescec(_S, Street);
_C = coalescec(_C, City);
if last.name then do;
    output;
    call missing(_N, _S, _C);
    end;
keep ID name _N _S _C;
rename _N=Number _S=Street _C=City;
run;

proc print data=want noobs;
run;
PG
Tom
Super User Tom
Super User

What exactly is the logic for creating those particular rows to output?

Giovani
Obsidian | Level 7
Organizing data from a vertical to horizontal format for further graphic visualization

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 7 replies
  • 1466 views
  • 1 like
  • 4 in conversation