FriedEgg's recent thread on ciphering, and his revealing that he is familiar with Perl, led me to post this question here.
JPG files (at least the ones that I'm interested in) follow a set of specifications known as Exif (see, e.g., http://www.exif.org/Exif2-2.PDF ).
The entire task has already been implemented in Perl (see, e.g., http://www.sno.phy.queensu.ca/~phil/exiftool/ ), which, by the way, does everything I want to do and more, and the Perl source code is available at that site.
Unfortunately, I don't know Perl, don't think the specifications were designed so that a psychologist would know what they mean, and I can't figure out where to begin.
What I would like to do, using SAS, is simply parse out the date a picture was taken, the GPS coordinates (if they exist), and the subject (if it exists).
Anyone up for a challenge?
Ksharp, Please try it again with the following code. There was definitely something wrong with the code, particularly how it identified a second set of tags:
%let path=c:\art\;
options datestyle=ymd;
proc format;
value ttwo 18761='pibr2.'
19789='s370fpib2.';
value tfour 18761='pibr4.'
19789='s370fpib4.';
run;
filename indata pipe "dir &path.*.jpg /b";
data want (keep=picture dt_taken coordinates title
width height);
length fil2read title $80;
retain _exif_pattern_num;
format dt_taken datetime19.;
format lat lon $1.;
length coordinates $35;
infile indata truncover;
if _n_ = 1 then _exif_pattern_num=
PRXPARSE("/\x45\x78\x69\x66/");
informat f2r $50.;
input f2r;
fil2read="&path."||f2r;
done=0;
infile dummy filevar=fil2read RECFM=n lrecl=12000
end=done;
picture=fil2read;
do while(not done);
input VAR1 $char12000.;
POSITIONX = PRXMATCH(_EXIF_PATTERN_NUM,var1)+6;
endian2=put(input(substr(var1,positionx,2),pibr2.),ttwo.);
endian4=put(input(substr(var1,positionx,2),pibr2.),tfour.);
numberx=inputn(substr(var1,positionx+inputn(substr(
var1,positionx+4,4),endian4),2),endian2);
offset=positionx+inputn(substr(var1,positionx+4,4),
endian4)+2;
gps_offset=0;
subject_offset=0;
date_offset=0;
last_offset=0;
do i=0 to numberx-1;
xtag=inputn(substr(var1,offset+i*12,2),endian2);
if xtag eq 34665 then do;
numberx2=positionx+inputn(substr(
var1,offset+i*12+8,4),endian4);
last_offset=positionx+inputn(substr(
var1,offset+i*12+8,4),endian4)+2;
end;
else if xtag eq 34853 then do;
gps_bytes=inputn(substr(var1,offset+i*12+2,2),
endian2);
if gps_bytes eq 2 then gps_offset=positionx+
inputn(substr(var1,offset+i*12+10,2),endian2);
else gps_offset=positionx+
inputn(substr(var1,offset+i*12+8,4),endian4);
end;
else if xtag eq 40091 then do;
title_offset=positionx+inputn(substr(
var1,offset+i*12+8,4),endian4);
title_length=inputn(substr(
var1,offset+i*12+4,4),endian4);
end;
end;
if last_offset then do;
do i=0 to numberx2-1;
xtag=inputn(substr(var1,last_offset+i*12,2),endian2);
if xtag eq 36867 then date_offset=
inputn(substr(
var1,last_offset+i*12+8,4),endian4);
else if xtag eq 40962 then width=
inputn(substr(
var1,last_offset+i*12+8,4),endian4);
else if xtag eq 40963 then height=
inputn(substr(
var1,last_offset+i*12+8,4),endian4);
end;
end;
if title_offset then title=compress(input(substr(
var1,title_offset,title_length),$50.),
"/ ?+#$%&","knp");
else title="No Title";
if gps_offset then do;
numberx=inputn(substr(var1,gps_offset,2),endian2);
offset=gps_offset+2;
do i=0 to numberx-1;
xtag=inputn(substr(var1,offset+i*12,2),endian2);
if xtag eq 1 then lat=input(substr(
var1,offset+i*12+8,1),$1.);
else if xtag eq 2 then do;
lat_offset=positionx+inputn(substr(var1,
offset+i*12+8,4),endian4);
latdeg=inputn(substr(var1,lat_offset+ 0,4),endian4)/
inputn(substr(var1,lat_offset+ 4,4),endian4);
latmin=inputn(substr(var1,lat_offset+ 8,4),endian4)/
inputn(substr(var1,lat_offset+12,4),endian4);
latsec=inputn(substr(var1,lat_offset+16,4),endian4)/
inputn(substr(var1,lat_offset+20,4),endian4);
end;
else if xtag eq 3 then lon=input(substr(
var1,offset+i*12+8,1),$1.);
else if xtag eq 4 then do;
lon_offset=positionx+inputn(substr(var1,
offset+i*12+8,4),endian4);
londeg=inputn(substr(var1,lon_offset+ 0,4),endian4)/
inputn(substr(var1,lon_offset+ 4,4),endian4);
lonmin=inputn(substr(var1,lon_offset+ 8,4),endian4)/
inputn(substr(var1,lon_offset+12,4),endian4);
lonsec=inputn(substr(var1,lon_offset+16,4),endian4)/
inputn(substr(var1,lon_offset+20,4),endian4);
end;
end;
coordinates=strip(put(latdeg,best12.))||" "||
strip(put(latmin,best12.))||"' "||
strip(put(latsec,best12.))||'" '||
strip(lat)||","||
strip(put(londeg,best12.))||" "||
strip(put(lonmin,best12.))||"' "||
strip(put(lonsec,best12.))||'" '||
strip(lon);
end;
else coordinates="No GPS";
if date_offset then dt_taken=
input(substr(var1,positionx+date_offset,19),anydtdtm19.);
else dt_taken=0;
output;
done=1;
end;
run;
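For anyone following the offset arithmetic in the code above: each directory entry is 12 bytes (a 2-byte tag, a 2-byte type, a 4-byte count, and a 4-byte value or offset). Here is a minimal sketch of decoding one entry. It uses the twelve bytes found at offset 0x8E in the hex dump of the unmodified sample file posted in this thread and assumes little-endian order. The variable names are only for illustration.

```sas
data _null_;
  length entry $12;
  /* 12 bytes copied from the hex dump at offset 0x8E */
  entry = input('69870400010000001C010000', $hex24.);
  tag   = input(substr(entry, 1, 2), pibr2.);  /* bytes 1-2:  tag id          */
  type  = input(substr(entry, 3, 2), pibr2.);  /* bytes 3-4:  data type       */
  count = input(substr(entry, 5, 4), pibr4.);  /* bytes 5-8:  number of items */
  value = input(substr(entry, 9, 4), pibr4.);  /* bytes 9-12: value or offset */
  put tag= type= count= value=;
run;
```

For those bytes this should report tag=34665 (0x8769), type=4, count=1 and value=284, which agrees with the ExifInfoOffset of 284 in the listing for that file.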
Can I cheat and just make SAS use pipes from perl?
I will look into it, I always like a good challenge!
Update: I am going to be working with this sample file: http://www.exif.org/samples/sanyo-vpcsx550.jpg
Has to be able to work on SAS on all operating systems and nothing 3rd party can be brought in.
Looks like this file may not contain all the information you are looking for, but it does have the datetime of exposure and a comment, so I will work on extracting those two pieces of information:
Filename : sanyo-vpcsx550[1].jpg
JFIF_APP1 : Exif
Comment : Et voila. For starters we have warmed goats cheese in breadcrumbs on a herb salad base...
Main Information
ImageDescription : SANYO DIGITAL CAMERA
Make : SANYO Electric Co.,Ltd.
Model : SX113
Orientation : left-hand side
XResolution : 72/1
YResolution : 72/1
ResolutionUnit : Inch
Software : V113p-73
DateTime : 2000:11:18 21:14:19
YCbCrPositioning : co-sited
ExifInfoOffset : 284
Sub Information
ExposureTime : 1/48.3Sec
FNumber : F2.4
ISOSpeedRatings : 400
ExifVersion : 0210
DateTimeOriginal : 2000:11:18 21:14:19
DateTimeDigitized : 2000:11:18 21:14:19
ComponentConfiguration : YCbCr
CompressedBitsPerPixel : 17/10 (bit/pixel)
ExposureBiasValue : EV0.0
MaxApertureValue : F2.8
MeteringMode : CenterWeightedAverage
LightSource : Unidentified
Flash : Not fired
FocalLength : 6.00(mm)
MakerNote : SANYO Format : 178Bytes (Offset:904)
UserComment :
FlashPixVersion : 0100
ColorSpace : sRGB
ExifImageWidth : 640
ExifImageHeight : 480
ExifInteroperabilityOffset : 862
FileSource : DSC
Vendor Original Information
Unknown (0200)4,3 : 0,0,0
Unknown (0201)3,1 : 258
Macro mode : Off
Unknown (0203)3,1 : 0
Digital Zoom : Off
Unknown (0F00)4,18 : 0,0,1411907584,-1038942208,-134021120,0,0,-1995767808,-524338785,1562902528,72532304,0,0,0,-1051707323,0,0,-805288192
ExifR98
ExifR : R98
Version : 0100
Thumbnail Information
Compression : OLDJPEG
XResolution : 72/1
YResolution : 72/1
ResolutionUnit : Inch
JPEGInterchangeFormat : 1070
JPEGInterchangeFormatLength : 13234
I'm not sure if I can post a jpg here, but I'll add some coordinates and subject and try to post it here.
Here is the modified file:
To confirm, I now have your file with GPS Information included. Here is the information:
JFIF_APP1 : Exif
Comment : Et voila. For starters we have warmed goats cheese in breadcrumbs on a herb salad base...
Main Information
Make : SANYO Electric Co.,Ltd.
Model : SX113
Orientation : left-hand side
XResolution : 72/1
YResolution : 72/1
ResolutionUnit : Inch
Software : V113p-73
DateTime : 2000:11:18 21:14:19
YCbCrPositioning : co-sited
ExifInfoOffset : 322
GPSInfoOffset : 1280
Unknown (9C9B)1,30 : 530061006D0070006C006500200050006900630074007500720065000000
Unknown (9C9F)1,44 : 410020004E0069006300650020004C006F006F006B0069006E0067002000440069006E006E00650072000000
Sub Information
ExposureTime : 1/48.3Sec
FNumber : F2.4
ISOSpeedRatings : 400
ExifVersion : 0210
DateTimeOriginal : 2000:11:18 21:14:19
DateTimeDigitized : 2000:11:18 21:14:19
ComponentConfiguration : YCbCr
CompressedBitsPerPixel : 17/10 (bit/pixel)
ExposureBiasValue : EV0.0
MaxApertureValue : F2.8
MeteringMode : CenterWeightedAverage
LightSource : Unidentified
Flash : Not fired
FocalLength : 6.00(mm)
MakerNote : SANYO Format : 178Bytes (Offset:716)
UserComment : Et voila. For starters we have warmed goats cheese in breadcrumbs on a herb salad base...
UserComment : Et voila. For starters we have warmed goats cheese in breadcrumbs on a herb salad base...
UserComment : Et voila. For starters we have warmed goats cheese in breadcrumbs on a herb salad base...
UserComment :
FlashPixVersion : 0100
ColorSpace : sRGB
ExifImageWidth : 640
ExifImageHeight : 480
FileSource : DSC
GPS Informtion
GPSLatitudeRef : N
GPSLatitude : 28 2153.07 [DMS]
GPSLongitudeRef : W
GPSLongitude : 81 3330.43 [DMS]
Vendor Original Information
Unknown (0200)4,3 : 1634494831,1866866734,1953702002
Unknown (0201)3,1 : 258
Macro mode : Off
Unknown (0203)3,1 : 0
Digital Zoom : 0.85x
Unknown (0F00)4,18 : 1634213989,1998611830,1701671521,1869029476,544437345,1701144675,1763730803,1919033454,1667522917,1651340658,1852776563,1746952480,543322725,1634492787,1633820772,774792563,1950679086,1768912416
Thumbnail Information
Compression : OLDJPEG
XResolution : 72/1
YResolution : 72/1
ResolutionUnit : Inch
JPEGInterchangeFormat : 1476
JPEGInterchangeFormatLength : 13234
Correct, sort of, except your coordinates are off and you left off one of the critical fields: subject,
which is A Nice Looking Dinner.
The coordinates actually are 28 21' 53.07" N and 81 33' 30.43" W.
You also left off the title, Sample Picture, but I'm not interested in that, only Subject, Date Original and GPS Coordinates.
I also didn't notice a byte-order tag in your example, which will be needed for reading the 2-byte numbers. The file is little endian.
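As a rough sketch of what the byte-order tag implies for SAS: the two bytes right after "Exif" plus two nulls are either 'II' (little endian) or 'MM' (big endian), and that decides which informat to use for every later number. The snippet below is not taken from any of the posted programs. It just reads the TIFF magic number 42, which follows the byte-order mark, in both layouts to show that the chosen informat recovers the same value.

```sas
data _null_;
  length bom magic $2 fmt2 fmt4 $12;
  do bom = 'II', 'MM';
    if bom = 'II' then do;        /* 0x4949: little endian */
      fmt2 = 'pibr2.';
      fmt4 = 'pibr4.';
      magic = '2A00'x;            /* 42 stored low byte first */
    end;
    else do;                      /* 0x4D4D: big endian */
      fmt2 = 's370fpib2.';
      fmt4 = 's370fpib4.';
      magic = '002A'x;            /* 42 stored high byte first */
    end;
    check = inputn(magic, fmt2);  /* should be 42 in both cases */
    put bom= fmt2= fmt4= check=;
  end;
run;
```

The s370fpib informats are used for the big-endian case because plain pib follows the byte order of the machine SAS runs on, which would not be portable across operating systems.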
Art, can you produce a 1x1 pixel image with the EXIF data you wish to capture present? I seem to be having a little trouble reducing the amount of noise as I am figuring out my approach.
Any luck with creating this file, Art? The issue I am finding is that there are a number of varieties based on manufacturer and Exif version, and I do not seem to have a camera or program that allows me to create the subject tag that you describe. A simple 1x1 pixel image in your desired format and Exif version would provide the simplest file for me to work with.
These are my findings at this point:
Every JPEG marker begins with the hexadecimal byte 0xFF, followed by a second byte that identifies the marker.
0xD8 = start of image, making the 16-bit hexadecimal word 0xFFD8. It appears at the start of the file and again where the embedded thumbnail begins, which is where the header ends.
0xD9 = end of image (0xFFD9). In this file it first appears at the end of the thumbnail -- not really relevant to us.
This header block contains the date information you are looking for. I am not sure of an exact method to target the dates at this time. The first date is 'DateTime', then 'DateTimeOriginal', then 'DateTimeDigitized'. I am thinking of just extracting the values from the header block with a regex.
00000000 FF D8 FF E1 37 E8 45 78 69 66 00 00 49 49 2A 00 08 00 00 00 0B 00 0E 01 02 00 15 00 00 00 92 00 ÿØÿá7èExif..II*...............’.
00000020 00 00 0F 01 02 00 18 00 00 00 B2 00 00 00 10 01 02 00 07 00 00 00 CA 00 00 00 12 01 03 00 01 00 ..........²...........Ê.........
00000040 00 00 01 00 00 00 1A 01 05 00 01 00 00 00 D8 00 00 00 1B 01 05 00 01 00 00 00 E0 00 00 00 28 01 ..............Ø...........à...(.
00000060 03 00 01 00 00 00 02 00 00 00 31 01 02 00 09 00 00 00 E8 00 00 00 32 01 02 00 14 00 00 00 08 01 ..........1.......è...2.........
00000080 00 00 13 02 03 00 01 00 00 00 02 00 00 00 69 87 04 00 01 00 00 00 1C 01 00 00 00 03 00 00 53 41 ..............i‡..............SA
000000A0 4E 59 4F 20 44 49 47 49 54 41 4C 20 43 41 4D 45 52 41 00 00 00 00 00 00 00 00 00 00 00 00 53 41 NYO DIGITAL CAMERA............SA
000000C0 4E 59 4F 20 45 6C 65 63 74 72 69 63 20 43 6F 2E 2C 4C 74 64 2E 00 53 58 31 31 33 20 00 00 00 00 NYO Electric Co.,Ltd..SX113 ....
000000E0 00 00 00 00 48 00 00 00 01 00 00 00 48 00 00 00 01 00 00 00 56 31 31 33 70 2D 37 33 00 00 00 00 ....H.......H.......V113p-73....
00000100 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 32 30 30 30 3A 31 31 3A 31 38 20 32 ....................2000:11:18 2
00000120 31 3A 31 34 3A 31 39 00 16 00 9A 82 05 00 01 00 00 00 2A 02 00 00 9D 82 05 00 01 00 00 00 32 02 1:14:19...š‚......*...‚......2.
00000140 00 00 27 88 03 00 01 00 00 00 90 01 00 00 00 90 07 00 04 00 00 00 30 32 31 30 03 90 02 00 14 00 ..'ˆ................0210.....
00000160 00 00 3A 02 00 00 04 90 02 00 14 00 00 00 4E 02 00 00 01 91 07 00 04 00 00 00 01 02 03 00 02 91 ..:..........N....‘...........‘
00000180 05 00 01 00 00 00 62 02 00 00 04 92 0A 00 01 00 00 00 6A 02 00 00 05 92 05 00 01 00 00 00 72 02 ......b....’......j....’......r.
000001A0 00 00 07 92 03 00 01 00 00 00 02 00 00 00 08 92 03 00 01 00 00 00 00 00 00 00 09 92 03 00 01 00 ...’...........’...........’....
000001C0 00 00 00 00 00 00 0A 92 05 00 01 00 00 00 7A 02 00 00 7C 92 07 00 B2 00 00 00 7C 03 00 00 86 92 .......’......z...|’..²...|...†’
000001E0 07 00 7D 00 00 00 82 02 00 00 00 A0 07 00 04 00 00 00 30 31 30 30 01 A0 03 00 01 00 00 00 01 00 ..}...‚.... ......0100. ........
00000200 00 00 02 A0 04 00 01 00 00 00 80 02 00 00 03 A0 04 00 01 00 00 00 E0 01 00 00 05 A0 04 00 01 00 ... ......€.... ......à.... ....
00000220 00 00 5E 03 00 00 00 A3 07 00 01 00 00 00 03 00 00 00 00 00 00 00 0A 00 00 00 E3 01 00 00 18 00 ..^....£..................ã.....
00000240 00 00 0A 00 00 00 32 30 30 30 3A 31 31 3A 31 38 20 32 31 3A 31 34 3A 31 39 00 32 30 30 30 3A 31 ......2000:11:18 21:14:19.2000:1
00000260 31 3A 31 38 20 32 31 3A 31 34 3A 31 39 00 11 00 00 00 0A 00 00 00 00 00 00 00 0A 00 00 00 03 00 1:18 21:14:19...................
00000280 00 00 01 00 00 00 3C 00 00 00 0A 00 00 00 00 00 00 00 00 00 00 00 20 20 20 20 20 20 20 20 20 20 ......<...............
000002A0 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
000002C0 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
000002E0 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
00000300 20 20 20 20 20 20 20 20 20 20 20 00 06 00 03 01 03 00 01 00 00 00 06 00 00 00 1A 01 05 00 01 00 .....................
00000320 00 00 4E 03 00 00 1B 01 05 00 01 00 00 00 56 03 00 00 28 01 03 00 01 00 00 00 02 00 00 00 01 02 ..N...........V...(.............
00000340 04 00 01 00 00 00 2E 04 00 00 02 02 04 00 01 00 00 00 B2 33 00 00 00 00 00 00 48 00 00 00 01 00 ..................²3......H.....
00000360 00 00 48 00 00 00 01 00 00 00 02 00 01 00 02 00 04 00 00 00 52 39 38 00 02 00 07 00 04 00 00 00 ..H.................R98.........
00000380 30 31 30 30 00 00 00 00 53 41 4E 59 4F 00 01 00 06 00 00 02 04 00 03 00 00 00 D2 03 00 00 01 02 0100....SANYO.............Ò.....
000003A0 03 00 01 00 00 00 02 01 00 00 02 02 03 00 01 00 00 00 00 00 00 00 03 02 03 00 01 00 00 00 00 00 ................................
000003C0 00 00 04 02 05 00 01 00 00 00 DE 03 00 00 00 0F 04 00 12 00 00 00 E6 03 00 00 00 00 00 00 00 00 ..........Þ...........æ.........
000003E0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0A 00 00 00 00 00 00 00 00 00 00 00 00 00 28 54 00 00 ............................(T..
00000400 13 C2 00 00 03 F8 00 00 00 00 00 00 00 00 00 00 0B 89 9F 39 BF E0 00 00 28 5D 50 C1 52 04 00 00 .Â...ø...........‰Ÿ9¿à..(]PÁR...
00000420 00 00 00 00 00 00 00 00 00 00 45 38 50 C1 00 00 00 00 00 00 00 00 00 47 00 D0 FF D8 FF DB 00 C5 ..........E8PÁ.........G.ÐÿØÿÛ.Å
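The regex idea could be sketched roughly as follows. The string here is only a small stand-in built from text visible in the dump above (it is not read from the file), and the variable names are made up for the example. Because Exif writes the date part with colons, the sketch builds the datetime from its pieces rather than relying on an informat to guess the layout.

```sas
data _null_;
  /* stand-in for a chunk of header bytes: text, null, timestamp, null */
  header = 'V113p-73' || '00'x || '2000:11:18 21:14:19' || '00'x;
  re  = prxparse('/\d{4}:\d\d:\d\d \d\d:\d\d:\d\d/');
  pos = prxmatch(re, header);     /* position of the first timestamp, 0 if none */
  if pos then do;
    stamp = substr(header, pos, 19);
    dt = dhms(mdy(input(substr(stamp, 6, 2), 2.),
                  input(substr(stamp, 9, 2), 2.),
                  input(substr(stamp, 1, 4), 4.)),
              input(substr(stamp, 12, 2), 2.),
              input(substr(stamp, 15, 2), 2.),
              input(substr(stamp, 18, 2), 2.));
    put stamp= dt= datetime19.;
  end;
run;
```

Note that a plain pattern search returns the first timestamp it meets, which in this file is DateTime rather than DateTimeOriginal. The two happen to be equal here but need not be in an edited picture.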
The comment block is signified by the marker byte 0xFE, making the 16-bit hexadecimal word for a comment 0xFFFE.
Here is the example file's comment information from the JPEG file:
000037E0 D7 AD 4A 0D 3D 4F 9F C4 4E 2C FF D9 FF FE 00 5B 45 74 20 76 6F 69 6C 61 2E 20 46 6F 72 20 73 74 ×J.=OŸÄN,ÿÙÿþ.[Et voila. For st
00003800 61 72 74 65 72 73 20 77 65 20 68 61 76 65 20 77 61 72 6D 65 64 20 67 6F 61 74 73 20 63 68 65 65 arters we have warmed goats chee
00003820 73 65 20 69 6E 20 62 72 65 61 64 63 72 75 6D 62 73 20 6F 6E 20 61 20 68 65 72 62 20 73 61 6C 61 se in breadcrumbs on a herb sala
00003840 64 20 62 61 73 65 2E 2E 2E FF DB 00 43 00 03 03 03 03 03 02 03 03 03 03 04 04 03 04 05 08 05 05 d base...ÿÛ.C...................
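In the dump above, the two bytes after FF FE are 00 5B, i.e. 91, and that length counts the two length bytes themselves, leaving 89 bytes of text. A small sketch of pulling the comment out, assuming a naive search for the marker and using a string rebuilt from the dump rather than the real file:

```sas
data _null_;
  length seg $93 comment $100;
  /* marker FF FE, big-endian length 00 5B (91), then the comment text */
  seg = 'FFFE005B'x ||
        'Et voila. For starters we have warmed goats cheese in breadcrumbs on a herb salad base...';
  pos = index(seg, 'FFFE'x);                           /* naive marker search */
  if pos then do;
    len = input(substr(seg, pos + 2, 2), s370fpib2.);  /* includes the 2 length bytes */
    comment = substr(seg, pos + 4, len - 2);
    put len= comment=;
  end;
run;
```

JPEG segment lengths are always stored high byte first, independent of the II/MM byte order used inside the Exif block, which is why s370fpib2. is used here. On a real file the same two bytes could also occur inside another segment's payload, so a more careful version would walk the segments in order instead of searching.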
I have yet to locate information on how the GPS data is stored.
I've posted my own attempt, thus far (even though it is WRONG), below. I changed my mind and decided that the three fields of interest are title, gps info, and date picture was taken. Date picture was taken is easy since it is the first datetime field that exists in the file. Most of my code is an attempt to parse the exif information from the file. However, I'm posting a secondary thread because I can't figure out how to get the code to correctly differentiate between big and little endian representations.
%let path=c:\art\;
options datestyle=ymd;
proc format;
value endian 18761='pibr4.'
19789='pib4.';
run;
filename indata pipe "dir &path.*.jpg /b";
data want (drop=var1);
length fil2read title $80;
retain _exif_pattern_num _dt_pattern_num;
format dt_taken datetime19.;
format lat lon $1.;
length coordinates $35;
infile indata truncover;
if _n_ = 1 then do;
_exif_pattern_num =PRXPARSE("/\x45\x78\x69\x66/");
_dt_pattern_num=PRXPARSE(
"/\d\d\d\d\:\d\d\:\d\d\ \d\d\:\d\d\:\d\d/");
end;
informat f2r $50.;
input f2r;
fil2read="&path."||f2r;
done=0;
infile dummy filevar=fil2read RECFM=n lrecl=12000
end=done;
picture=fil2read;
do while(not done);
input VAR1 $char12000.;
POSITIONX = PRXMATCH(_EXIF_PATTERN_NUM,var1)+6;
type=put(input(substr(var1,positionx,2),pib4.), endian.);
numberx=inputn(put(substr(var1,positionx+8,1),$hex2.)||
put(substr(var1,positionx+9,1),$hex2.),type);
numberx=inputn(substr(var1,positionx+8,2),type);
gps_offset=0;
title_offset=0;
do i=0 to (numberx-1);
xtag=inputn(substr(var1,positionx+10+i*12,2),type);
if xtag eq 34853 then do;
gps_offset=inputn(substr(var1,positionx+10+i*12+8,2),
type)+positionx;
end;
else if xtag eq 40091 then do;
title_offset=inputn(substr(var1,positionx+10+i*12+8,2),
type)+positionx;
title_length=inputn(substr(var1,positionx+10+i*12+4,2),
type)+positionx;
end;
end;
if title_offset then do;
title=compress(input(substr(var1,title_offset,title_length),$50.),
"/ ?+#$%&","knp");
end;
else title="No Title";
if gps_offset then do;
do i=0 to (inputn(substr(var1,gps_offset,2),type)-1);
gps_tag=inputn(substr(var1,gps_offset+2+i*12,2),type);
if gps_tag eq 1 then lat=
input(substr(var1,gps_offset+2+i*12+8,1),$1.);
else if gps_tag eq 2 then do;
lat_offset=inputn(substr(var1,gps_offset+2+i*12+8,2),type);
latdeg=inputn(substr(var1,lat_offset,4),type)/
inputn(substr(var1,lat_offset+4,4),type);
latmin=inputn(substr(var1,lat_offset+8,4),type)/
inputn(substr(var1,lat_offset+12,4),type);
latsec=inputn(substr(var1,lat_offset+16,4),type)/
inputn(substr(var1,lat_offset+20,4),type);
end;
else if gps_tag eq 3 then lon=
input(substr(var1,gps_offset+2+i*12+8,1),$1.);
else if gps_tag eq 4 then do;
lon_offset=inputn(substr(var1,gps_offset+2+i*12+8,2),type);
londeg=inputn(substr(var1,lat_offset,4),type)/
inputn(substr(var1,lat_offset+4,4),type);
lonmin=inputn(substr(var1,lat_offset+8,4),type)/
inputn(substr(var1,lat_offset+12,4),type);
lonsec=inputn(substr(var1,lat_offset+16,4),type)/
inputn(substr(var1,lat_offset+20,4),type);
end;
end;
coordinates=strip(put(latdeg,best12.))||" "||
strip(put(latmin,best12.))||"' "||
strip(put(latsec,best12.))||'" '||
strip(lat)||","||
strip(put(londeg,best12.))||" "||
strip(put(lonmin,best12.))||"' "||
strip(put(lonsec,best12.))||'" '||
strip(lon);
end;
else coordinates="No GPS";
POSITION = PRXMATCH(_DT_PATTERN_NUM,var1);
dt_taken=input(substr(var1,position,19),anydtdtm19.);
output;
done=1;
end;
run;
With some invaluable assistance from DLing and Tom I was able to solve the problem. The "almost" final code is shown below. The code searches a directory for all JPG files and then outputs a file called "want" that contains the file names, the coordinates where each picture was taken, the dates the pictures were taken, the titles that were assigned to the pictures, and the pictures' heights and widths:
%let path=c:\art\;
options datestyle=ymd;
proc format;
value ttwo 18761='pibr2.'
19789='s370fpib2.';
value tfour 18761='pibr4.'
19789='s370fpib4.';
run;
filename indata pipe "dir &path.*.jpg /b";
data want (keep=picture dt_taken coordinates title
width height);
length fil2read title $80;
retain _exif_pattern_num _dt_pattern_num;
format dt_taken datetime19.;
format lat lon $1.;
length coordinates $35;
infile indata truncover;
if _n_ = 1 then _exif_pattern_num=
PRXPARSE("/\x45\x78\x69\x66/");
informat f2r $50.;
input f2r;
fil2read="&path."||f2r;
done=0;
infile dummy filevar=fil2read RECFM=n lrecl=12000
end=done;
picture=fil2read;
do while(not done);
input VAR1 $char12000.;
POSITIONX = PRXMATCH(_EXIF_PATTERN_NUM,var1)+6;
endian2=put(input(substr(var1,positionx,2),pibr2.),ttwo.);
endian4=put(input(substr(var1,positionx,2),pibr2.),tfour.);
numberx=inputn(substr(var1,positionx+inputn(substr(
var1,positionx+4,4),endian4),2),endian2);
offset=positionx+inputn(substr(var1,positionx+4,4),
endian4)+2;
gps_offset=0;
subject_offset=0;
date_offset=0;
last_offset=0;
do i=0 to numberx-1;
xtag=inputn(substr(var1,offset+i*12,2),endian2);
if xtag eq 34853 then do;
gps_bytes=inputn(substr(var1,offset+i*12+2,2),
endian2);
if gps_bytes eq 2 then gps_offset=positionx+
inputn(substr(var1,offset+i*12+10,2),endian2);
else gps_offset=positionx+
inputn(substr(var1,offset+i*12+8,4),endian4);
end;
else if xtag eq 40091 then do;
title_offset=positionx+inputn(substr(
var1,offset+i*12+8,4),endian4);
title_length=inputn(substr(
var1,offset+i*12+4,4),endian4);
end;
if i eq numberx-1 then do;
last_offset=positionx+inputn(substr(
var1,offset+i*12+8,4),endian4);
last_length=inputn(substr(
var1,offset+i*12+4,4),endian4);
numberx2=inputn(substr(var1,last_offset+last_length,
2),endian2);
last_offset+(last_length+2);
end;
end;
if last_offset then do;
do i=0 to numberx2-1;
xtag=inputn(substr(var1,last_offset+i*12,2),endian2);
if xtag eq 36867 then date_offset=
inputn(substr(
var1,last_offset+i*12+8,4),endian4);
else if xtag eq 40962 then width=
inputn(substr(
var1,last_offset+i*12+8,4),endian4);
else if xtag eq 40963 then height=
inputn(substr(
var1,last_offset+i*12+8,4),endian4);
end;
end;
if title_offset then title=compress(input(substr(
var1,title_offset,title_length),$50.),
"/ ?+#$%&","knp");
else title="No Title";
if gps_offset then do;
numberx=inputn(substr(var1,gps_offset,2),endian2);
offset=gps_offset+2;
do i=0 to numberx-1;
xtag=inputn(substr(var1,offset+i*12,2),endian2);
if xtag eq 1 then lat=input(substr(
var1,offset+i*12+8,1),$1.);
else if xtag eq 2 then do;
lat_offset=positionx+inputn(substr(var1,
offset+i*12+8,4),endian4);
latdeg=inputn(substr(var1,lat_offset+ 0,4),endian4)/
inputn(substr(var1,lat_offset+ 4,4),endian4);
latmin=inputn(substr(var1,lat_offset+ 8,4),endian4)/
inputn(substr(var1,lat_offset+12,4),endian4);
latsec=inputn(substr(var1,lat_offset+16,4),endian4)/
inputn(substr(var1,lat_offset+20,4),endian4);
end;
else if xtag eq 3 then lon=input(substr(
var1,offset+i*12+8,1),$1.);
else if xtag eq 4 then do;
lon_offset=positionx+inputn(substr(var1,
offset+i*12+8,4),endian4);
londeg=inputn(substr(var1,lon_offset+ 0,4),endian4)/
inputn(substr(var1,lon_offset+ 4,4),endian4);
lonmin=inputn(substr(var1,lon_offset+ 8,4),endian4)/
inputn(substr(var1,lon_offset+12,4),endian4);
lonsec=inputn(substr(var1,lon_offset+16,4),endian4)/
inputn(substr(var1,lon_offset+20,4),endian4);
end;
end;
coordinates=strip(put(latdeg,best12.))||" "||
strip(put(latmin,best12.))||"' "||
strip(put(latsec,best12.))||'" '||
strip(lat)||","||
strip(put(londeg,best12.))||" "||
strip(put(lonmin,best12.))||"' "||
strip(put(lonsec,best12.))||'" '||
strip(lon);
end;
else coordinates="No GPS";
if date_offset then dt_taken=
input(substr(var1,positionx+date_offset,19),anydtdtm19.);
else dt_taken=0;
output;
done=1;
end;
run;
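The latitude and longitude arithmetic in the code above relies on each value being stored as three rationals, each a 4-byte numerator followed by a 4-byte denominator. The sketch below illustrates that layout in isolation. The 24 bytes are an assumption: they were built by hand from the latitude quoted in this thread (28 21' 53.07") as 28/1, 21/1 and 5307/100 in little-endian order, and the real file may encode the same value with different denominators.

```sas
data _null_;
  length raw $24;
  /* hypothetical bytes: 28/1, 21/1, 5307/100, little endian */
  raw = input('1C000000010000001500000001000000BB14000064000000', $hex48.);
  array part[3] dd mm ss;
  do k = 1 to 3;
    num = input(substr(raw, (k - 1) * 8 + 1, 4), pibr4.);
    den = input(substr(raw, (k - 1) * 8 + 5, 4), pibr4.);
    if den then part[k] = num / den;   /* skip a zero denominator */
    else part[k] = .;
  end;
  decimal = dd + mm / 60 + ss / 3600;  /* decimal degrees */
  put dd= mm= ss= decimal= 12.6;
run;
```

With these bytes the decimal value should come out at roughly 28.3647 degrees. A W or S reference would then flip the sign.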
Excellent work Art, very impressive! Wish I had a little more time to dedicate to this one.
Art,
I tested it, but it does not look right.
The photo I tested is the one FriedEgg offered:
Update: I am going to be working with this sample file: http://www.exif.org/samples/sanyo-vpcsx550.jpg
Your code gives the date taken as 01jan1960:00:00:00.
But when I look at this photo, it is 2000-11-18:21:14:00.
I also noticed that if the photo has been modified, your code generates an error.
Ksharp