i have data as below
data class;
input Name$ Sex$ Age Height Weight;
infile datalines dlm=' ' dsd;
datalines;
Alfred M 14 69.0 112.5
Alice F 13 56.5 84.0
Barbara F 13 65.3 98.0
Carol F 14 62.8 102.5
Henry M 14 63.5 102.5
;
data zipcodes;
input name $ zipcode;
infile datalines dlm=' ' dsd;
datalines;
Alfred 91940
Alice 75008
Barbara 92330
Carol 94150
Henry 93140
;
data cities;
input zipcode City :$30. Department :$30.;
infile datalines dlm=' ' dsd;
datalines;
75008 Paris 8è Paris
91940 Les Ulis Essonne
92330 Sceaux Hauts-de-Seine
93140 Bondy Seine-Saint-Denis
94150 Rungis Val-de-Marne
;
i wand the output as
Name | ZipCode | City | Department |
Alfred | 91940 | Les Ulis | Essonne |
Alice | 75008 | Paris 8è | Paris |
Barbara | 92330 | Sceaux | Hauts-de-Seine |
Carol | 94150 | Rungis | Val-de-Marne |
Henry | 93140 | Bondy | Seine-Saint-Denis |
i have coded as below but not able to make it happen:
data need(keep=name zipcode city department);
if 0 then set zipcodes cities;
if _n_=1 then do;
dcl hash z(dataset:'zipcodes');
z.definekey('name','zipcode');
z.definedata(all:'Y');
z.definedone();
dcl hash c(dataset:'cities');
c.definekey('zipcode');
c.definedata(all:'Y');
c.definedone();
end;
do until(eof);
set class end=eof;
if z.find(key:name,key:zipcode)=0 and c.find(key:zipcode)=0;
end;
stop;
Hello @subhani4
First make the cities table comma delimited. I believe your objective is to join the datasets to obtain an output as you desired.
Second it is easy to use proc sql to make the join. Please see the code below .
Information from zipcode and cities table is sufficient to produce the desired output.
data class;
input Name$ Sex$ Age Height Weight;
infile datalines dlm=' ' dsd;
datalines;
Alfred M 14 69.0 112.5
Alice F 13 56.5 84.0
Barbara F 13 65.3 98.0
Carol F 14 62.8 102.5
Henry M 14 63.5 102.5
;
data zipcodes;
input name $ zipcode;
infile datalines dlm=' ' dsd;
datalines;
Alfred 91940
Alice 75008
Barbara 92330
Carol 94150
Henry 93140
;
data cities;
input zipcode City :$30. Department :$30.;
infile datalines dlm=',' dsd;
datalines;
75008, Paris 8è ,Paris
91940, Les Ulis, Essonne
92330,Sceaux, Hauts-de-Seine
93140, Bondy , Seine-Saint-Denis
94150, Rungis, Val-de-Marne
;
proc sql;
create table want as
select z.name, z.zipcode, ci.City Format $30. length=30 , Ci.Department Format $30. length=30
from zipcodes z
Left join Cities ci ON z.zipcode=ci.zipcode
order by Name;
quit;
proc print;
run;
The output will be as follows
do until(eof);
set class end=eof;
if z.find(key:name,key:zipcode)=0 and c.find(key:zipcode)=0;
end;
stop;
Since there is no OUTPUT statement in your loop, and the STOP statement terminates the data step before the implicit output is done, nothing is written to the dataset.
Use the "natural loop" of the data step instead:
data need(keep=name zipcode city department);
set class;
if 0 then set zipcodes cities;
if _n_=1 then do;
length zipcode 8 city department $30;
dcl hash z(dataset:'zipcodes');
z.definekey('name');
z.definedata('zipcode');
z.definedone();
dcl hash c(dataset:'cities');
c.definekey('zipcode');
c.definedata('city','department');
c.definedone();
end;
if z.find()=0;
if c.find()=0;
run;
Note that I adapted the hash definitions, as dataset class does not have a zipcode variable. Therefore it is also necessary to split the subsetting IFs, to make sure that the second FIND() is executed after the first one that retrieves the zipcode.
The code is not tested in any way, as I am posting from my tablet.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.