BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Giovani
Obsidian | Level 7

Hi SAS experts!

I am facing a situation and I am not sure how to overpass this.

I have a database organized as the data Current. Does anyone know how I can remove the empty cells and create new information as the Desired data?

Thanks

 

data Current;
   input ID name $ Number Street $ City $;
   datalines;
101 Riley 1132 . . 
101 Riley . MappleAve . 
101 Riley . . Downtown
101 Riley . . Dream
101 Riley . 5thStreet .
101 Riley 1012 . .
222 Apple 132 . . 
222 Apple . . . 
222 Apple . . Greenville
222 Apple . . York
222 Apple . ElevenAve .
222 Apple 112 . .
;
run;

 

data Desired;

   input ID name $ Number Street $ City $;

   datalines;

101 Riley 1132 MappleAve Downtown

101 Riley 1012 5thStreet Dream

222 Apple 132 . Greenville

222 Apple 112 ElevenAve York

;

run;

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

Almost nothing to change. Can you spot the differences? Make sure you understand the code before using it.

 

data Current;
input ID name $ Number $ Street $ City $;
datalines;
101 Riley 1132A . .
101 Riley . MappleAve .
101 Riley . . Downtown
101 Riley . . Dream
101 Riley . 5thStreet .
101 Riley 1012A . .
101 Riley . . Dreams
101 Riley . 4thStreet .
101 Riley 1013B . .
101 Riley 1014CB . .
101 Riley . . Life
101 Riley . 3thStreet .
222 Apple B132 . .
222 Apple . . .
222 Apple . . Greenville
222 Apple . . York
222 Apple . ElevenAve .
222 Apple A112 . .
;

data want;
set current; by id name;
length _N _S _C $24;
retain _N _S _C;
if  not missing(_N) and not missing(Number) or
    not missing(_S) and not missing(Street) or
    not missing(_C) and not missing(City) then do;
    output;
    call missing(_N, _S, _C);
    end;
_N = coalescec(_N, Number);
_S = coalescec(_S, Street);
_C = coalescec(_C, City);
if last.name then do;
    output;
    call missing(_N, _S, _C);
    end;
keep ID name _N _S _C;
rename _N=Number _S=Street _C=City;
run;

proc print data=want noobs;
run;
PG

View solution in original post

7 REPLIES 7
novinosrin
Tourmaline | Level 20

hello @Giovani 

 

Taking a chance

data Current;
   input ID name $ Number Street :$12. City :$12.;
   datalines;
101 Riley 1132 . . 
101 Riley . MappleAve . 
101 Riley . . Downtown
101 Riley . . Dream
101 Riley . 5thStreet .
101 Riley 1012 . .
222 Apple 132 . . 
222 Apple . . . 
222 Apple . . Greenville
222 Apple . . York
222 Apple . ElevenAve .
222 Apple 112 . .
;
run;

data temp;
update current(obs=0) current;
by id;
if city>' ' then output;
run;


data want;
set temp;
by id name city	notsorted;
if last.city;
run;




Giovani
Obsidian | Level 7

@novinosrin 

 

Your solution was not able to handle when I had more them 2 information's per ID like in the case bellow

data Current;
input ID name $ Number $ Street $ City $;
datalines;
101 Riley 1132A . .
101 Riley . MappleAve .
101 Riley . . Downtown
101 Riley . . Dream
101 Riley . 5thStreet .
101 Riley 1012A . .
101 Riley . . Dreams
101 Riley . 4thStreet .
101 Riley 1013B . .
101 Riley 1014CB . .
101 Riley . . Life
101 Riley . 3thStreet .
222 Apple B132 . .
222 Apple . . .
222 Apple . . Greenville
222 Apple . . York
222 Apple . ElevenAve .
222 Apple A112 . .
;
run;

data Desired;
input ID name $ Number $ Street $ City $;
datalines;
101 Riley 1132A MappleAve Downtown
101 Riley 1012A 5thStreet Dream
101 Riley 1013B 4thStreet Dream
101 Riley 1014C 3thStreet Life
222 Apple B132 . Greenville
222 Apple A112 ElevenAve York
;
run;

PGStats
Opal | Level 21

Assuming you want a new address when a new id, name, number, street or city is encountered:

 

data want;
set current; by id name;
length _N 8 _S _C $24;
retain _N _S _C;
if  not missing(_N) and not missing(Number) or
    not missing(_S) and not missing(Street) or
    not missing(_C) and not missing(City) then do;
    output;
    call missing(_N, _S, _C);
    end;
_N = coalesce(_N, Number);
_S = coalescec(_S, Street);
_C = coalescec(_C, City);
if last.name then do;
    output;
    call missing(_N, _S, _C);
    end;
keep ID name _N _S _C;
rename _N=Number _S=Street _C=City;
run;
PG
Giovani
Obsidian | Level 7

@PGStats 

Thanks for your solutions it seem to be close to handle the issue. I release that I have a information's for my variable named as "Number" as character and not numeric. Any idea how overpass this error "Invalid numeric data, Number='1132A'" ?

 

data Current;
input ID name $ Number $ Street $ City $;
datalines;
101 Riley 1132A . .
101 Riley . MappleAve .
101 Riley . . Downtown
101 Riley . . Dream
101 Riley . 5thStreet .
101 Riley 1012A . .
101 Riley . . Dreams
101 Riley . 4thStreet .
101 Riley 1013B . .
101 Riley 1014CB . .
101 Riley . . Life
101 Riley . 3thStreet .
222 Apple B132 . .
222 Apple . . .
222 Apple . . Greenville
222 Apple . . York
222 Apple . ElevenAve .
222 Apple A112 . .
;
run;

 


data want2;
set current; by id name;
length _N 8 _S _C $24;
retain _N _S _C;
if not missing(_N) and not missing(Number) or
not missing(_S) and not missing(Street) or
not missing(_C) and not missing(City) then do;
output;
call missing(_N, _S, _C);
end;
_N = coalesce(_N, Number);
_S = coalescec(_S, Street);
_C = coalescec(_C, City);
if last.name then do;
output;
call missing(_N, _S, _C);
end;
keep ID name _N _S _C;
rename _N=Number _S=Street _C=City;
run;

PGStats
Opal | Level 21

Almost nothing to change. Can you spot the differences? Make sure you understand the code before using it.

 

data Current;
input ID name $ Number $ Street $ City $;
datalines;
101 Riley 1132A . .
101 Riley . MappleAve .
101 Riley . . Downtown
101 Riley . . Dream
101 Riley . 5thStreet .
101 Riley 1012A . .
101 Riley . . Dreams
101 Riley . 4thStreet .
101 Riley 1013B . .
101 Riley 1014CB . .
101 Riley . . Life
101 Riley . 3thStreet .
222 Apple B132 . .
222 Apple . . .
222 Apple . . Greenville
222 Apple . . York
222 Apple . ElevenAve .
222 Apple A112 . .
;

data want;
set current; by id name;
length _N _S _C $24;
retain _N _S _C;
if  not missing(_N) and not missing(Number) or
    not missing(_S) and not missing(Street) or
    not missing(_C) and not missing(City) then do;
    output;
    call missing(_N, _S, _C);
    end;
_N = coalescec(_N, Number);
_S = coalescec(_S, Street);
_C = coalescec(_C, City);
if last.name then do;
    output;
    call missing(_N, _S, _C);
    end;
keep ID name _N _S _C;
rename _N=Number _S=Street _C=City;
run;

proc print data=want noobs;
run;
PG
Tom
Super User Tom
Super User

What exactly is the logic for creating those particular rows to output?

Giovani
Obsidian | Level 7
Organizing data from a vertical to horizontal format for further graphic visualization

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 7 replies
  • 806 views
  • 1 like
  • 4 in conversation