Here is something that should help you get started.
data example;
informat n fn $25. org $40. title $25. phonework phonehome phonecell phonefax $20.
adrwork $100.
;
input @;
/* see if we have found the beginning of a record*/
if _infile_=:"BEGIN:VCARD" then do;
input /
/@'N:' n
/@'FN:' fn
/@'ORG:' org
/@'TITLE:' title
;
/* no into likely conditional reads*/
do until (_infile_=:'TEL');
input ;
end;
do while (_infile_=:'TEL');
if index(_infile_,'WORK')>0 then do;
if index(_infile_,'VOICE')>0 THEN phonework= substr(_infile_,index(_infile_,'('));
ELSE IF index(_infile_,'FAX')>0 THEN phonefax= substr(_infile_,index(_infile_,'('));
end;
if index(_infile_,'HOME')>0 then phonehome= substr(_infile_,index(_infile_,'('));
if index(_infile_,'CELL')>0 then phonecell= substr(_infile_,index(_infile_,'('));
input;
end;
/* at this point you do something similar to reading the phones to
read address(es) */
/* only output after reading the fields needed*/
output;
end;
else input;
datalines4;
BEGIN:VCARD
VERSION:2.1
N:Doe;John;;;
FN:John Doe
ORG:Doe Company, The;
TITLE: President
NOTE;ENCODING=QUOTED-PRINTABLE: This is a note associated with this
contact=0D=0A
TEL;WORK;VOICE:(987) 123-4567
TEL;HOME;VOICE:(987) 765-4321
TEL;CELL;VOICE:(987) 135-8642
TEL;WORK;FAX:(987) 246-1357
ADR;WORK:;;1234 North Street;Anytown;TX 751234;;United States of America
LABEL;WORK;ENCODING=QUOTED-PRINTABLE:1234 North Street=0D=0AAnytown, TX
751234 =0D=0AUnited States of America
URL:
URL:<WWLINK TYPE="GENERIC"
VALUE="http://www.doeweb.com">http://www.doeweb.com</WWLINK>
EMAIL;PREF;INTERNET:jdoe@nowhere.com
REV:19980114T170559Z
END:VCARD
;;;;
Key elements:
INput is executeable, you can request an input any time you want while going through a file.
The @ holds a current line.
The /@ on an input says advance a line in the file (the /) and at the position the text is found, @'N:' for example start reading the variable name that follows.
_INFILE_ is an automatic variable that holds the value of the current line form the file being read. So you can parse it using any text search functions.
I did not do anything to parse out the address because I am afraid that you are likely to found to may addresses that have different things missing or poorly formatted. I suspect your example has an error of a Zip code in the state field. Each of those ; in the ADR probably indicate a different value but I found reference to at least 3 different VCARD formats and the addresses are one of the things that changes. So you may need to actually read that VERSION information to know how to parse a specific card.
The serious logic goes into identifying which things are SKIPPED. I do not know what one of these might look like without an EMAIL for instance. I suspect you may not even have a line that starts with EMAIL.
The forum has multiple examples of reading many similar files in a single data step. Search for one that uses the FILENAME option on an INFILE statement. That is used to restart reading from the top of a new file.
... View more