I am trying to combine three data sets: Src_iowaresidents, Src_ms_citizens and Src_utah_records into one data set called Work.Contacts. Each data set by itself is working fine, but when I combine them together something goes wrong. Each data set looks like this:
PS. Sorry parts of it are in all caps. My professor prefers it that way.
OPTIONS FMTSEARCH = (HypTabs.HypFormats WORK LIBRARY);
DATA WORK.Contact_IA;
RETAIN SSN Inits City StateCd ZipCd;
SET HypTabs.Src_IowaResidents (RENAME =(ZipCd = ZipCdSrc));
Inits = SUBSTR(Initials, 3, 2)||SUBSTR(Initials, 1, 1);
City = PROPCASE(City);
StateCd = "IA";
ZipCd = PUT(ZipCdSrc, 5.);
KEEP SSN Inits City StateCd ZipCd;
LABEL SSN = "Social Security Number"
Inits = "Subject Initials"
City = "City"
StateCd = "State Code"
ZipCd = "Zip Code";
FORMAT StateCd $StateCd.;
RUN;
PROC SORT;
BY SSN;
RUN;
DATA WORK.Contact_MS;
RETAIN SSN Inits City StateCd ZipCd;
SET HypTabs.Src_MS_Citizens (RENAME =(SocSecNum = SSN));
Inits = COMPRESS(CATS(FirstInit, MiddleInit, LastInit), '.');
City = SCAN(CityState, 1, ',');
StateCd = "MS";
KEEP SSN Inits City StateCd ZipCd;
LABEL SSN = "Social Security Number"
Inits = "Subject Initials"
City = "City"
StateCd = "State Code"
ZipCd = "Zip Code";
FORMAT StateCd $StateCd.;
RUN;
PROC SORT;
BY SSN;
RUN;
DATA WORK.Contact_UT;
RETAIN SSN Inits City StateCd ZipCd;
SET HypTabs.Src_UT_Records ( RENAME = ( Inits = InitsTemp
ZipCode = ZipCd )
);
SSN1 = INPUT(ID, 12.);
SSN = PUT(SSN1, SSN11.);
Inits = COMPRESS(InitsTemp, '.');
City = SCAN(CitySt, 1, ',');
StateCd = "UT";
KEEP SSN Inits City StateCd ZipCd;
LABEL SSN = "Social Security Number"
Inits = "Subject Initials"
City = "City"
StateCd = "State Code"
ZipCd = "Zip Code";
FORMAT StateCd $StateCd.;
RUN;
PROC SORT;
BY SSN;
RUN;
DATA WORK.Contact (LABEL = "Contact Information");
SET HypTabs.Src_IowaResidents
HypTabs.Src_MS_Citizens
HypTabs.Src_UT_Records;
KEEP SSN Inits City StateCd ZipCd;
RUN;
You created 3 datasets named with prefix: work.contact_
but you use the original source datasets:
DATA WORK.Contact (LABEL = "Contact Information");
SET HypTabs.Src_IowaResidents
HypTabs.Src_MS_Citizens
HypTabs.Src_UT_Records;
KEEP SSN Inits City StateCd ZipCd;
RUN;
Replace the 3 input datasets to those you created .
@marianhabesland wrote:
I am trying to combine three data sets: Src_iowaresidents, Src_ms_citizens and Src_utah_records into one data set called Work.Contacts. Each data set by itself is working fine, but when I combine them together something goes wrong.
It helps if you show what you tried and what went wrong.
Nothing shown in the code here combines data and you don’t say what’s 'wrong'.
Sorry about that. Just realized part of the code got cut off. It is now marked in red. As for what goes wrong the data set WORK.Contacts does not even show up in the output. The only error message I get is
ERROR: Variable ZipCd has been defined as both character and numeric.
This does not really make sense to me because when I look at the data sets ZipCd is always a character variable.
Run proc contents with each of the 3 datasets and check type of ZIPCD.
You will find which are numeric and which are not.
The given code itself is not enough to guess the type.
Thanks for your reply. I have checked and it is a character variable in all three data sets. I have even put in a statement to ensure this. This is the full message I get if that gives any more clues to the problem:
ERROR: Variable ZipCd has been defined as both character and numeric.
Post the proc contents output from the three runs please.
The CONTENTS Procedure
HYPTABS.SRC_IOWARESIDENTS | 196 |
DATA | 9 |
V9 | 0 |
11:17:29 15.09.2017 | 80 |
11:17:29 15.09.2017 | 0 |
NO | |
NO | |
WINDOWS_64 | |
wlatin1 Western (Windows) |
65536 |
1 |
1 |
817 |
196 |
0 |
YES |
/folders/myfolders/BIOS6680/Hypertension Study/1 - Data/2 - Tabulations/src_iowaresidents.sas7bdat |
9.0401M2 |
X64_8PRO |
77 |
rwxrwx--- |
root |
128KB |
131072 |
BirthDt | Num | 8 |
City | Char | 20 |
Ethnicity | Char | 12 |
Initials | Char | 4 |
Race | Char | 5 |
SSN | Char | 11 |
Sex | Char | 6 |
State | Char | 4 |
ZipCd | Num | 8 |
The CONTENTS Procedure
HYPTABS.SRC_MS_CITIZENS | 207 |
DATA | 10 |
V9 | 0 |
12:46:45 15.09.2017 | 110 |
12:46:45 15.09.2017 | 0 |
NO | |
NO | |
WINDOWS_64 | |
wlatin1 Western (Windows) |
65536 |
1 |
1 |
594 |
207 |
0 |
YES |
/folders/myfolders/BIOS6680/Hypertension Study/1 - Data/2 - Tabulations/src_ms_citizens.sas7bdat |
9.0401M2 |
X64_8PRO |
82 |
rwxrwx--- |
root |
128KB |
131072 |
CityState | Char | 35 |
DOB | Char | 9 |
Eth | Char | 22 |
FirstInit | Char | 2 |
Gender | Char | 6 |
LastInit | Char | 2 |
MiddleInit | Char | 2 |
Racial | Char | 16 |
SocSecNum | Char | 11 |
ZipCd | Char | 5 |
The CONTENTS Procedure
HYPTABS.SRC_UT_RECORDS | 195 |
DATA | 10 |
V9 | 0 |
09:08:11 15.09.2017 | 72 |
09:08:11 15.09.2017 | 0 |
NO | |
NO | |
WINDOWS_64 | |
wlatin1 Western (Windows) |
65536 |
1 |
1 |
908 |
195 |
0 |
YES |
/folders/myfolders/BIOS6680/Hypertension Study/1 - Data/2 - Tabulations/src_ut_records.sas7bdat |
9.0401M2 |
X64_8PRO |
83 |
rwxrwx--- |
root |
128KB |
131072 |
BirthDay | Num | 8 |
BirthMonth | Num | 8 |
BirthYear | Num | 8 |
CitySt | Char | 24 |
EthnicityCode | Char | 1 |
GenderCode | Char | 1 |
ID | Num | 8 |
Inits | Char | 6 |
RaceCode | Char | 1 |
ZipCode | Char | 5 |
Add to your code, starting with:
DATA WORK.Contact_IA;
RETAIN SSN Inits City StateCd ZipCd;
SET HypTabs.Src_IowaResidents (RENAME =(ZipCd = ZipCdSrc));
a line:
DATA WORK.Contact_IA;
LENGTH zipcd $5; /*** line to add ***/
RETAIN SSN Inits City StateCd ZipCd;
SET HypTabs.Src_IowaResidents (RENAME =(ZipCd = ZipCdSrc));
I hope it will sove your problem.
PLEASE edit your post to remove any displayed SSN. I may be paranoid about such things but better safe than sorry.
Initials + residence+ zip code and SSN could compromise an individual's SSN for exploitation.
@ballardw I sort of assumed with the word professor there it would be fake data...but they are valid SSNs and do map to people. And it's also weird that US even has online systems that show me that....
You created 3 datasets named with prefix: work.contact_
but you use the original source datasets:
DATA WORK.Contact (LABEL = "Contact Information");
SET HypTabs.Src_IowaResidents
HypTabs.Src_MS_Citizens
HypTabs.Src_UT_Records;
KEEP SSN Inits City StateCd ZipCd;
RUN;
Replace the 3 input datasets to those you created .
Yes, that makes sense. Silly mistake. Thank you!!
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.