
Deciphering a jpg file

PROC Star
Posts: 7,358

Deciphering a jpg file

FriedEgg's recent thread on ciphering, and his revealing that he is familiar with Perl, led me to post this question here.

JPG files (at least the ones I'm interested in) follow a set of specifications known as Exif (see, e.g., http://www.exif.org/Exif2-2.PDF ).

And the entire task has already been implemented in Perl (see, e.g., http://www.sno.phy.queensu.ca/~phil/exiftool/ , which, by the way, does everything I want to do and more; the Perl source code is available at that site).

Unfortunately, I don't know Perl, don't think the specifications were designed so that a psychologist would know what they mean, and I can't figure out where to begin.

What I would like to do, using SAS, is simply parse out the date a picture was taken, the GPS coordinates (if they exist), and the subject (if it exists).

Anyone up for a challenge?


Accepted Solutions
Solution
‎10-12-2011 09:18 AM
PROC Star
Posts: 7,358

Re: Deciphering a jpg file

Ksharp, please try it again with the following code. There was definitely something wrong with the earlier code, particularly in how it identified the second set of tags:

%let path=c:\art\;

options datestyle=ymd;

proc format;

  value ttwo  18761='pibr2.'

              19789='s370fpib2.';

  value tfour 18761='pibr4.'

              19789='s370fpib4.';

run;

filename indata pipe "dir &path.*.jpg /b";

data want (keep=picture dt_taken coordinates title

                width height);

  length fil2read title $80;

  retain _exif_pattern_num;

  format dt_taken datetime19.;

  format lat lon $1.;

  length coordinates $35;

  infile indata truncover;

  if _n_ = 1 then _exif_pattern_num=

   PRXPARSE("/\x45\x78\x69\x66/");

  informat f2r $50.;

  input f2r;

  fil2read="&path."||f2r;

  done=0;

  infile dummy filevar=fil2read RECFM=n lrecl=12000

    end=done;

  picture=fil2read;

  do while(not done);

    input VAR1 $char12000.;

    POSITIONX = PRXMATCH(_EXIF_PATTERN_NUM,var1)+6;

    endian2=put(input(substr(var1,positionx,2),pibr2.),ttwo.);

    endian4=put(input(substr(var1,positionx,2),pibr2.),tfour.);

    numberx=inputn(substr(var1,positionx+inputn(substr(

     var1,positionx+4,4),endian4),2),endian2);

    offset=positionx+inputn(substr(var1,positionx+4,4),

     endian4)+2;

    gps_offset=0;

    subject_offset=0;

    date_offset=0;

    last_offset=0;

    do i=0 to numberx-1;

      xtag=inputn(substr(var1,offset+i*12,2),endian2);

      if xtag eq 34665 then do;

        /* entry count of the Exif sub-directory (2 bytes at its start) */

        numberx2=inputn(substr(var1,positionx+inputn(substr(

         var1,offset+i*12+8,4),endian4),2),endian2);

        last_offset=positionx+inputn(substr(

         var1,offset+i*12+8,4),endian4)+2;

      end;

      else if xtag eq 34853 then do;

        gps_bytes=inputn(substr(var1,offset+i*12+2,2),

         endian2);

              if gps_bytes eq 2 then gps_offset=positionx+

         inputn(substr(var1,offset+i*12+10,2),endian2);

              else gps_offset=positionx+

         inputn(substr(var1,offset+i*12+8,4),endian4);

      end;

      else if xtag eq 40091 then do;

        title_offset=positionx+inputn(substr(

         var1,offset+i*12+8,4),endian4);

        title_length=inputn(substr(

         var1,offset+i*12+4,4),endian4);

      end;

    end;

    if last_offset then do;

      do i=0 to numberx2-1;

        xtag=inputn(substr(var1,last_offset+i*12,2),endian2);

        if xtag eq 36867 then date_offset=

          inputn(substr(

           var1,last_offset+i*12+8,4),endian4);

        else if xtag eq 40962 then width=

          inputn(substr(

           var1,last_offset+i*12+8,4),endian4);

        else if xtag eq 40963 then height=

          inputn(substr(

           var1,last_offset+i*12+8,4),endian4);

     end;

    end;

    if title_offset then title=compress(input(substr(

     var1,title_offset,title_length),$50.),

     "/ ?+#$%&","knp");

    else title="No Title";

    if gps_offset then do;

      numberx=inputn(substr(var1,gps_offset,2),endian2);

      offset=gps_offset+2;

      do i=0 to numberx-1;

        xtag=inputn(substr(var1,offset+i*12,2),endian2);

        if xtag eq 1 then lat=input(substr(

         var1,offset+i*12+8,1),$1.);

        else if xtag eq 2 then do;

          lat_offset=positionx+inputn(substr(var1,

           offset+i*12+8,4),endian4);

          latdeg=inputn(substr(var1,lat_offset+ 0,4),endian4)/

                 inputn(substr(var1,lat_offset+ 4,4),endian4);

          latmin=inputn(substr(var1,lat_offset+ 8,4),endian4)/

                 inputn(substr(var1,lat_offset+12,4),endian4);

          latsec=inputn(substr(var1,lat_offset+16,4),endian4)/

                 inputn(substr(var1,lat_offset+20,4),endian4);

        end;

        else if xtag eq 3 then lon=input(substr(

         var1,offset+i*12+8,1),$1.);

        else if xtag eq 4 then do;

          lon_offset=positionx+inputn(substr(var1,

           offset+i*12+8,4),endian4);

          londeg=inputn(substr(var1,lon_offset+ 0,4),endian4)/

                 inputn(substr(var1,lon_offset+ 4,4),endian4);

          lonmin=inputn(substr(var1,lon_offset+ 8,4),endian4)/

                 inputn(substr(var1,lon_offset+12,4),endian4);

          lonsec=inputn(substr(var1,lon_offset+16,4),endian4)/

                 inputn(substr(var1,lon_offset+20,4),endian4);

        end;

      end;

      coordinates=strip(put(latdeg,best12.))||" "||

      strip(put(latmin,best12.))||"' "||

      strip(put(latsec,best12.))||'" '||

      strip(lat)||","||

      strip(put(londeg,best12.))||" "||

      strip(put(lonmin,best12.))||"' "||

      strip(put(lonsec,best12.))||'" '||

      strip(lon);

    end;

    else coordinates="No GPS";

    if date_offset then dt_taken=

     input(substr(var1,positionx+date_offset,19),anydtdtm19.);

    else dt_taken=0;

    output;

    done=1;

  end;

run;



All Replies
Trusted Advisor
Posts: 1,300

Re: Deciphering a jpg file

Can I cheat and just make SAS use pipes from Perl?

I will look into it. I always like a good challenge!

Update: I am going to be working with this sample file: http://www.exif.org/samples/sanyo-vpcsx550.jpg

PROC Star
Posts: 7,358

Deciphering a jpg file

Has to be able to work on SAS on all operating systems and nothing 3rd party can be brought in.

Trusted Advisor
Posts: 1,300

Re: Deciphering a jpg file

Looks like this file may not contain all the information you are looking for, but it does have the datetime of exposure and a comment, so I will work on extracting those two pieces of information:

Filename : sanyo-vpcsx550[1].jpg

JFIF_APP1 : Exif

Comment : Et voila. For starters we have warmed goats cheese in breadcrumbs on a herb salad base...

Main Information

ImageDescription : SANYO DIGITAL CAMERA

Make : SANYO Electric Co.,Ltd.

Model : SX113

Orientation : left-hand side

XResolution : 72/1

YResolution : 72/1

ResolutionUnit : Inch

Software : V113p-73

DateTime : 2000:11:18 21:14:19

YCbCrPositioning : co-sited

ExifInfoOffset : 284

Sub Information

ExposureTime : 1/48.3Sec

FNumber : F2.4

ISOSpeedRatings : 400

ExifVersion : 0210

DateTimeOriginal : 2000:11:18 21:14:19

DateTimeDigitized : 2000:11:18 21:14:19

ComponentConfiguration : YCbCr

CompressedBitsPerPixel : 17/10 (bit/pixel)

ExposureBiasValue : EV0.0

MaxApertureValue : F2.8

MeteringMode : CenterWeightedAverage

LightSource : Unidentified

Flash : Not fired

FocalLength : 6.00(mm)

MakerNote : SANYO Format : 178Bytes (Offset:904)

UserComment :                                                                                                                     

FlashPixVersion : 0100

ColorSpace : sRGB

ExifImageWidth : 640

ExifImageHeight : 480

ExifInteroperabilityOffset : 862

FileSource : DSC

Vendor Original Information

Unknown (0200)4,3 : 0,0,0

Unknown (0201)3,1 : 258

Macro mode : Off

Unknown (0203)3,1 : 0

Digital Zoom : Off

Unknown (0F00)4,18 : 0,0,1411907584,-1038942208,-134021120,0,0,-1995767808,-524338785,1562902528,72532304,0,0,0,-1051707323,0,0,-805288192

ExifR98

ExifR : R98

Version : 0100

Thumbnail Information

Compression : OLDJPEG

XResolution : 72/1

YResolution : 72/1

ResolutionUnit : Inch

JPEGInterchangeFormat : 1070

JPEGInterchangeFormatLength : 13234

PROC Star
Posts: 7,358

Re: Deciphering a jpg file

I'm not sure if I can post a jpg here, but I'll add some coordinates and subject and try to post it here.

PROC Star
Posts: 7,358

Re: Deciphering a jpg file

Here is the modified file:


pic550.jpg
Trusted Advisor
Posts: 1,300

Re: Deciphering a jpg file

To confirm, I now have your file with GPS Information included.  Here is the information:

JFIF_APP1 : Exif

Comment : Et voila. For starters we have warmed goats cheese in breadcrumbs on a herb salad base...

Main Information

Make : SANYO Electric Co.,Ltd.

Model : SX113

Orientation : left-hand side

XResolution : 72/1

YResolution : 72/1

ResolutionUnit : Inch

Software : V113p-73

DateTime : 2000:11:18 21:14:19

YCbCrPositioning : co-sited

ExifInfoOffset : 322

GPSInfoOffset : 1280

Unknown (9C9B)1,30 : 530061006D0070006C006500200050006900630074007500720065000000

Unknown (9C9F)1,44 : 410020004E0069006300650020004C006F006F006B0069006E0067002000440069006E006E00650072000000

Sub Information

ExposureTime : 1/48.3Sec

FNumber : F2.4

ISOSpeedRatings : 400

ExifVersion : 0210

DateTimeOriginal : 2000:11:18 21:14:19

DateTimeDigitized : 2000:11:18 21:14:19

ComponentConfiguration : YCbCr

CompressedBitsPerPixel : 17/10 (bit/pixel)

ExposureBiasValue : EV0.0

MaxApertureValue : F2.8

MeteringMode : CenterWeightedAverage

LightSource : Unidentified

Flash : Not fired

FocalLength : 6.00(mm)

MakerNote : SANYO Format : 178Bytes (Offset:716)

UserComment : Et voila. For starters we have warmed goats cheese in breadcrumbs on a herb salad base...

UserComment : Et voila. For starters we have warmed goats cheese in breadcrumbs on a herb salad base...

UserComment : Et voila. For starters we have warmed goats cheese in breadcrumbs on a herb salad base...

UserComment :                                                                                                                     

FlashPixVersion : 0100

ColorSpace : sRGB

ExifImageWidth : 640

ExifImageHeight : 480

FileSource : DSC

GPS Informtion

GPSLatitudeRef : N

GPSLatitude : 28 2153.07 [DMS]

GPSLongitudeRef : W

GPSLongitude : 81 3330.43 [DMS]

Vendor Original Information

Unknown (0200)4,3 : 1634494831,1866866734,1953702002

Unknown (0201)3,1 : 258

Macro mode : Off

Unknown (0203)3,1 : 0

Digital Zoom : 0.85x

Unknown (0F00)4,18 : 1634213989,1998611830,1701671521,1869029476,544437345,1701144675,1763730803,1919033454,1667522917,1651340658,1852776563,1746952480,543322725,1634492787,1633820772,774792563,1950679086,1768912416

Thumbnail Information

Compression : OLDJPEG

XResolution : 72/1

YResolution : 72/1

ResolutionUnit : Inch

JPEGInterchangeFormat : 1476

JPEGInterchangeFormatLength : 13234

PROC Star
Posts: 7,358

Re: Deciphering a jpg file

Correct, sort of, except your coordinates are off and you left off one of the critical fields: subject

which is A Nice Looking Dinner

The coordinates actually are 28 21' 53.07" N and 81 33' 30.43" W

You also left off the title, Sample Picture, but I'm not interested in that, only Subject, Date Original and GPS Coordinates. 

I also didn't notice a byte-order tag in your listing, which will be needed to decode the 2-byte numbers.  This file is little endian.
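
The two "Unknown" entries in the listing above, (9C9B) and (9C9F), appear to be the title and subject stored as 2-byte little-endian characters, which is why every other byte is 00. The following is only a minimal sketch, assuming the text contains nothing beyond plain Latin characters (so the second byte of each pair is always 00). The hex string is copied from the (9C9F) line of the listing; the variable names are illustrative.

```sas
data _null_;
  length subject $40;
  /* Raw value of tag 9C9F, copied from the listing above (44 bytes) */
  raw = '410020004E0069006300650020004C006F006F006B0069006E0067002000440069006E006E00650072000000'x;
  /* Drop the 00 bytes; valid only while every character fits in one byte */
  subject = compress(raw, '00'x);
  put subject=;
run;
```

For this value the log should show the subject quoted above, A Nice Looking Dinner. Text with characters outside that range would need a real transcoding step rather than byte removal.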

Trusted Advisor
Posts: 1,300

Deciphering a jpg file

Art, can you produce a 1x1 pixel image with the EXIF data you wish to capture present?  I seem to be having a little trouble reducing the amount of noise as I am figuring out my approach.

Trusted Advisor
Posts: 1,300

Deciphering a jpg file

Any luck with creating this file, Art?  The issue I am finding is that there are a number of varieties based on manufacturer and Exif version, and I do not seem to have a camera or program that allows me to create the subject tag that you describe.  A simple 1x1 pixel image in your desired format and Exif version would provide the simplest file for me to work with.

Trusted Advisor
Posts: 1,300

Re: Deciphering a jpg file

These are my findings at this point:

Every JPEG marker begins with the hexadecimal byte 0xFF, and the byte that follows identifies the marker.

0xD8 = start of image, making the 16-bit hexadecimal word that opens the file 0xFFD8 (it also opens the embedded thumbnail that follows the header data)

0xD9 = end of image, making the 16-bit hexadecimal word 0xFFD9 (in the dumps below it closes the embedded thumbnail) -- not really relevant to us

The header block after 0xFFD8 contains the date information you are looking for.  I'm not sure of an exact method to target the dates at this time.  The first date is 'datetime', then 'datetimeoriginal', then 'datetimedigitized'.  I am thinking of just extracting the values from the header block with a regex.

00000000  FF D8 FF E1 37 E8 45 78 69 66 00 00 49 49 2A 00 08 00 00 00 0B 00 0E 01 02 00 15 00 00 00 92 00    ÿØÿá7èExif..II*...............’.

00000020  00 00 0F 01 02 00 18 00 00 00 B2 00 00 00 10 01 02 00 07 00 00 00 CA 00 00 00 12 01 03 00 01 00    ..........²...........Ê.........

00000040  00 00 01 00 00 00 1A 01 05 00 01 00 00 00 D8 00 00 00 1B 01 05 00 01 00 00 00 E0 00 00 00 28 01    ..............Ø...........à...(.

00000060  03 00 01 00 00 00 02 00 00 00 31 01 02 00 09 00 00 00 E8 00 00 00 32 01 02 00 14 00 00 00 08 01    ..........1.......è...2.........

00000080  00 00 13 02 03 00 01 00 00 00 02 00 00 00 69 87 04 00 01 00 00 00 1C 01 00 00 00 03 00 00 53 41    ..............i‡..............SA

000000A0  4E 59 4F 20 44 49 47 49 54 41 4C 20 43 41 4D 45 52 41 00 00 00 00 00 00 00 00 00 00 00 00 53 41    NYO DIGITAL CAMERA............SA

000000C0  4E 59 4F 20 45 6C 65 63 74 72 69 63 20 43 6F 2E 2C 4C 74 64 2E 00 53 58 31 31 33 20 00 00 00 00    NYO Electric Co.,Ltd..SX113 ....

000000E0  00 00 00 00 48 00 00 00 01 00 00 00 48 00 00 00 01 00 00 00 56 31 31 33 70 2D 37 33 00 00 00 00    ....H.......H.......V113p-73....

00000100  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 32 30 30 30 3A 31 31 3A 31 38 20 32    ....................2000:11:18 2

00000120  31 3A 31 34 3A 31 39 00 16 00 9A 82 05 00 01 00 00 00 2A 02 00 00 9D 82 05 00 01 00 00 00 32 02    1:14:19...š‚......*...‚......2.

00000140  00 00 27 88 03 00 01 00 00 00 90 01 00 00 00 90 07 00 04 00 00 00 30 32 31 30 03 90 02 00 14 00    ..'ˆ................0210.....

00000160  00 00 3A 02 00 00 04 90 02 00 14 00 00 00 4E 02 00 00 01 91 07 00 04 00 00 00 01 02 03 00 02 91    ..:..........N....‘...........‘

00000180  05 00 01 00 00 00 62 02 00 00 04 92 0A 00 01 00 00 00 6A 02 00 00 05 92 05 00 01 00 00 00 72 02    ......b....’......j....’......r.

000001A0  00 00 07 92 03 00 01 00 00 00 02 00 00 00 08 92 03 00 01 00 00 00 00 00 00 00 09 92 03 00 01 00    ...’...........’...........’....

000001C0  00 00 00 00 00 00 0A 92 05 00 01 00 00 00 7A 02 00 00 7C 92 07 00 B2 00 00 00 7C 03 00 00 86 92    .......’......z...|’..²...|...†’

000001E0  07 00 7D 00 00 00 82 02 00 00 00 A0 07 00 04 00 00 00 30 31 30 30 01 A0 03 00 01 00 00 00 01 00    ..}...‚.... ......0100. ........

00000200  00 00 02 A0 04 00 01 00 00 00 80 02 00 00 03 A0 04 00 01 00 00 00 E0 01 00 00 05 A0 04 00 01 00    ... ......€.... ......à.... ....

00000220  00 00 5E 03 00 00 00 A3 07 00 01 00 00 00 03 00 00 00 00 00 00 00 0A 00 00 00 E3 01 00 00 18 00    ..^....£..................ã.....

00000240  00 00 0A 00 00 00 32 30 30 30 3A 31 31 3A 31 38 20 32 31 3A 31 34 3A 31 39 00 32 30 30 30 3A 31    ......2000:11:18 21:14:19.2000:1

00000260  31 3A 31 38 20 32 31 3A 31 34 3A 31 39 00 11 00 00 00 0A 00 00 00 00 00 00 00 0A 00 00 00 03 00    1:18 21:14:19...................

00000280  00 00 01 00 00 00 3C 00 00 00 0A 00 00 00 00 00 00 00 00 00 00 00 20 20 20 20 20 20 20 20 20 20    ......<...............         

000002A0  20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20                                   

000002C0  20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20                                   

000002E0  20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20                                   

00000300  20 20 20 20 20 20 20 20 20 20 20 00 06 00 03 01 03 00 01 00 00 00 06 00 00 00 1A 01 05 00 01 00               .....................

00000320  00 00 4E 03 00 00 1B 01 05 00 01 00 00 00 56 03 00 00 28 01 03 00 01 00 00 00 02 00 00 00 01 02    ..N...........V...(.............

00000340  04 00 01 00 00 00 2E 04 00 00 02 02 04 00 01 00 00 00 B2 33 00 00 00 00 00 00 48 00 00 00 01 00    ..................²3......H.....

00000360  00 00 48 00 00 00 01 00 00 00 02 00 01 00 02 00 04 00 00 00 52 39 38 00 02 00 07 00 04 00 00 00    ..H.................R98.........

00000380  30 31 30 30 00 00 00 00 53 41 4E 59 4F 00 01 00 06 00 00 02 04 00 03 00 00 00 D2 03 00 00 01 02    0100....SANYO.............Ò.....

000003A0  03 00 01 00 00 00 02 01 00 00 02 02 03 00 01 00 00 00 00 00 00 00 03 02 03 00 01 00 00 00 00 00    ................................

000003C0  00 00 04 02 05 00 01 00 00 00 DE 03 00 00 00 0F 04 00 12 00 00 00 E6 03 00 00 00 00 00 00 00 00    ..........Þ...........æ.........

000003E0  00 00 00 00 00 00 00 00 00 00 00 00 00 00 0A 00 00 00 00 00 00 00 00 00 00 00 00 00 28 54 00 00    ............................(T..

00000400  13 C2 00 00 03 F8 00 00 00 00 00 00 00 00 00 00 0B 89 9F 39 BF E0 00 00 28 5D 50 C1 52 04 00 00    .Â...ø...........‰Ÿ9¿à..(]PÁR...

00000420  00 00 00 00 00 00 00 00 00 00 45 38 50 C1 00 00 00 00 00 00 00 00 00 47 00 D0 FF D8 FF DB 00 C5    ..........E8PÁ.........G.ÐÿØÿÛ.Å
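
As a worked example of the layout in the dump above: right after the 6-byte "Exif" identifier comes the byte-order mark (49 49, "II", little endian), the value 42 (2A 00), the offset of the first directory (08 00 00 00), and a 2-byte entry count (0B 00, 11 entries). Each entry is 12 bytes: tag, type, count, then either the value itself or an offset measured from the byte-order mark. The sketch below decodes only the first entry, with its bytes copied from the dump; the variable names are illustrative.

```sas
data _null_;
  /* First 12-byte directory entry from the dump (starts at file offset 0x16) */
  entry = '0E0102001500000092000000'x;
  tag   = input(substr(entry, 1, 2), pibr2.);  /* 2-byte tag, little endian   */
  type  = input(substr(entry, 3, 2), pibr2.);  /* 2-byte type (2 = ASCII)     */
  count = input(substr(entry, 5, 4), pibr4.);  /* 4-byte number of components */
  value = input(substr(entry, 9, 4), pibr4.);  /* 4-byte value or offset      */
  put tag= type= count= value=;
run;
```

This writes tag=270 type=2 count=21 value=146 to the log. Tag 270 (0x010E) is ImageDescription, and offset 146 plus the 12 bytes that precede the byte-order mark gives file offset 0x9E, where "SANYO DIGITAL CAMERA" begins in the dump.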

The comment block is signified by the subvalue 0xFE, making the 16-bit hexadecimal word for a comment 0xFFFE

Here is the example file's comment information from the jpeg file:

000037E0  D7 AD 4A 0D 3D 4F 9F C4 4E 2C FF D9 FF FE 00 5B 45 74 20 76 6F 69 6C 61 2E 20 46 6F 72 20 73 74    ×­J.=OŸÄN,ÿÙÿþ.[Et voila. For st

00003800  61 72 74 65 72 73 20 77 65 20 68 61 76 65 20 77 61 72 6D 65 64 20 67 6F 61 74 73 20 63 68 65 65    arters we have warmed goats chee

00003820  73 65 20 69 6E 20 62 72 65 61 64 63 72 75 6D 62 73 20 6F 6E 20 61 20 68 65 72 62 20 73 61 6C 61    se in breadcrumbs on a herb sala

00003840  64 20 62 61 73 65 2E 2E 2E FF DB 00 43 00 03 03 03 03 03 02 03 03 03 03 04 04 03 04 05 08 05 05    d base...ÿÛ.C...................

I have yet to locate information on how the GPS data is stored.
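
For what it's worth, the Exif specification stores a GPS latitude or longitude as three RATIONAL values (degrees, minutes, seconds), each a 4-byte numerator followed by a 4-byte denominator, with the hemisphere letter held in a separate reference tag. The sketch below is only an illustration under those assumptions: the 24 bytes are built by hand for 28 21' 53.07" N rather than read from the sample file, the denominators are guesses, and a real reader should also guard against a zero denominator.

```sas
data _null_;
  /* Hypothetical 24-byte latitude value: three little-endian RATIONALs */
  raw = put(28,pibr4.)   || put(1,pibr4.)      /* degrees = 28/1     */
     || put(21,pibr4.)   || put(1,pibr4.)      /* minutes = 21/1     */
     || put(5307,pibr4.) || put(100,pibr4.);   /* seconds = 5307/100 */
  ref = 'N';                                   /* GPSLatitudeRef     */

  deg  = input(substr(raw, 1,4),pibr4.) / input(substr(raw, 5,4),pibr4.);
  mins = input(substr(raw, 9,4),pibr4.) / input(substr(raw,13,4),pibr4.);
  secs = input(substr(raw,17,4),pibr4.) / input(substr(raw,21,4),pibr4.);

  /* Decimal degrees; south and west are conventionally negative */
  decimal = deg + mins/60 + secs/3600;
  if ref in ('S','W') then decimal = -decimal;
  put deg= mins= secs= decimal= 10.6;
run;
```

With these inputs the decimal value comes out at roughly 28.3647 degrees. A big-endian file would need s370fpib4. in place of pibr4.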

PROC Star
Posts: 7,358

Deciphering a jpg file

I've posted my own attempt, thus far (even though it is WRONG), below.  I changed my mind and decided that the three fields of interest are title, gps info, and date picture was taken.  Date picture was taken is easy since it is the first datetime field that exists in the file.  Most of my code is an attempt to parse the exif information from the file.  However, I'm posting a secondary thread because I can't figure out how to get the code to correctly differentiate between big and little endian representations.

%let path=c:\art\;

options datestyle=ymd;

proc format;

  value endian 18761='pibr4.'

               19789='pib4.';

   run;

filename indata pipe "dir &path.*.jpg /b";

data want (drop=var1);

  length fil2read title $80;

  retain _exif_pattern_num _dt_pattern_num;

  format dt_taken datetime19.;

  format lat lon $1.;

  length coordinates $35;

  infile indata truncover;

  if _n_ = 1 then do;

    _exif_pattern_num =PRXPARSE("/\x45\x78\x69\x66/");

    _dt_pattern_num=PRXPARSE(

     "/\d\d\d\d\:\d\d\:\d\d\ \d\d\:\d\d\:\d\d/");

  end;

  informat f2r $50.;

  input f2r;

  fil2read="&path."||f2r;

  done=0;

  infile dummy filevar=fil2read RECFM=n lrecl=12000

    end=done;

  picture=fil2read;

  do while(not done);

    input VAR1 $char12000.;

    POSITIONX = PRXMATCH(_EXIF_PATTERN_NUM,var1)+6;

    type=put(input(substr(var1,positionx,2),pib4.), endian.);

    numberx=inputn(put(substr(var1,positionx+8,1),$hex2.)||

     put(substr(var1,positionx+9,1),$hex2.),type);

    numberx=inputn(substr(var1,positionx+8,2),type);

    gps_offset=0;

    title_offset=0;

    do i=0 to (numberx-1);

      xtag=inputn(substr(var1,positionx+10+i*12,2),type);

      if xtag eq 34853 then do;

        gps_offset=inputn(substr(var1,positionx+10+i*12+8,2),

         type)+positionx;

      end;

      else if xtag eq 40091 then do;

        title_offset=inputn(substr(var1,positionx+10+i*12+8,2),

         type)+positionx;

        title_length=inputn(substr(var1,positionx+10+i*12+4,2),

         type)+positionx;

      end;

    end;

    if title_offset then do;

      title=compress(input(substr(var1,title_offset,title_length),$50.),

            "/ ?+#$%&","knp");

    end;

    else title="No Title";

    if gps_offset then do;

      do i=0 to (inputn(substr(var1,gps_offset,2),type)-1);

        gps_tag=inputn(substr(var1,gps_offset+2+i*12,2),type);

        if gps_tag eq 1 then lat=

         input(substr(var1,gps_offset+2+i*12+8,1),$1.);

        else if gps_tag eq 2 then do;

          lat_offset=inputn(substr(var1,gps_offset+2+i*12+8,2),type);

          latdeg=inputn(substr(var1,lat_offset,4),type)/

           inputn(substr(var1,lat_offset+4,4),type);

          latmin=inputn(substr(var1,lat_offset+8,4),type)/

           inputn(substr(var1,lat_offset+12,4),type);

          latsec=inputn(substr(var1,lat_offset+16,4),type)/

           inputn(substr(var1,lat_offset+20,4),type);

        end;

        else if gps_tag eq 3 then lon=

         input(substr(var1,gps_offset+2+i*12+8,1),$1.);

        else if gps_tag eq 4 then do;

          lon_offset=inputn(substr(var1,gps_offset+2+i*12+8,2),type);

          londeg=inputn(substr(var1,lat_offset,4),type)/

           inputn(substr(var1,lat_offset+4,4),type);

          lonmin=inputn(substr(var1,lat_offset+8,4),type)/

           inputn(substr(var1,lat_offset+12,4),type);

          lonsec=inputn(substr(var1,lat_offset+16,4),type)/

           inputn(substr(var1,lat_offset+20,4),type);

        end;

      end;

      coordinates=strip(put(latdeg,best12.))||" "||

       strip(put(latmin,best12.))||"' "||

       strip(put(latsec,best12.))||'" '||

       strip(lat)||","||

       strip(put(londeg,best12.))||" "||

       strip(put(lonmin,best12.))||"' "||

       strip(put(lonsec,best12.))||'" '||

       strip(lon);

    end;

    else coordinates="No GPS";

    POSITION = PRXMATCH(_DT_PATTERN_NUM,var1);

    dt_taken=input(substr(var1,position,19),anydtdtm19.);

    output;

    done=1;

  end;

run;
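
The byte-order problem mentioned above can be isolated from the rest of the parsing. This is a minimal sketch, not the code from this thread: it assumes the two bytes that start the TIFF header are either "II" (little endian) or "MM" (big endian) and maps them to an informat name through a character format. The format names $bo2f and $bo4f and the two 8-byte test headers are made up for the illustration.

```sas
proc format;
  /* Map the byte-order mark to an informat name */
  value $bo2f 'II'='pibr2.' 'MM'='s370fpib2.';
  value $bo4f 'II'='pibr4.' 'MM'='s370fpib4.';
run;

data _null_;
  length fmt2 fmt4 $12;
  /* Two hand-built TIFF headers: byte-order mark, the number 42, offset 8 */
  do header = '49492A0008000000'x, '4D4D002A00000008'x;
    bom   = substr(header, 1, 2);
    fmt2  = put(bom, $bo2f.);
    fmt4  = put(bom, $bo4f.);
    magic = inputn(substr(header, 3, 2), fmt2);  /* 42 in either byte order  */
    ifd0  = inputn(substr(header, 5, 4), fmt4);  /* offset of the first IFD  */
    put bom= magic= ifd0=;
  end;
run;
```

Both headers should decode to magic=42 and ifd0=8. Reading 42 back correctly is a cheap check that the right informat was chosen.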

PROC Star
Posts: 7,358

Deciphering a jpg file

With some invaluable assistance from DLing and Tom I was able to solve the problem.  The "almost" final code is shown below.  The code searches a directory for all JPG files and then outputs a data set called "want" that contains the file names, the coordinates where each picture was taken, the dates the pictures were taken, the titles that were assigned to the pictures, and the pictures' heights and widths:

%let path=c:\art\;

options datestyle=ymd;

proc format;

  value ttwo  18761='pibr2.'

              19789='s370fpib2.';

  value tfour 18761='pibr4.'

              19789='s370fpib4.';

run;

filename indata pipe "dir &path.*.jpg /b";

data want (keep=picture dt_taken coordinates title

                width height);

  length fil2read title $80;

  retain _exif_pattern_num _dt_pattern_num;

  format dt_taken datetime19.;

  format lat lon $1.;

  length coordinates $35;

  infile indata truncover;

  if _n_ = 1 then _exif_pattern_num=

   PRXPARSE("/\x45\x78\x69\x66/");

  informat f2r $50.;

  input f2r;

  fil2read="&path."||f2r;

  done=0;

  infile dummy filevar=fil2read RECFM=n lrecl=12000

    end=done;

  picture=fil2read;

  do while(not done);

    input VAR1 $char12000.;

    POSITIONX = PRXMATCH(_EXIF_PATTERN_NUM,var1)+6;

    endian2=put(input(substr(var1,positionx,2),pibr2.),ttwo.);

    endian4=put(input(substr(var1,positionx,2),pibr2.),tfour.);

    numberx=inputn(substr(var1,positionx+inputn(substr(

     var1,positionx+4,4),endian4),2),endian2);

    offset=positionx+inputn(substr(var1,positionx+4,4),

     endian4)+2;

    gps_offset=0;

    subject_offset=0;

    date_offset=0;

    last_offset=0;

    do i=0 to numberx-1;

      xtag=inputn(substr(var1,offset+i*12,2),endian2);

      if xtag eq 34853 then do;

        gps_bytes=inputn(substr(var1,offset+i*12+2,2),

         endian2);

              if gps_bytes eq 2 then gps_offset=positionx+

         inputn(substr(var1,offset+i*12+10,2),endian2);

              else gps_offset=positionx+

         inputn(substr(var1,offset+i*12+8,4),endian4);

      end;

      else if xtag eq 40091 then do;

        title_offset=positionx+inputn(substr(

         var1,offset+i*12+8,4),endian4);

        title_length=inputn(substr(

         var1,offset+i*12+4,4),endian4);

      end;

      if i eq numberx-1 then do;

        last_offset=positionx+inputn(substr(

         var1,offset+i*12+8,4),endian4);

        last_length=inputn(substr(

         var1,offset+i*12+4,4),endian4);

        numberx2=inputn(substr(var1,last_offset+last_length,

         2),endian2);

        last_offset+(last_length+2);

      end;

    end;

    if last_offset then do;

      do i=0 to numberx2-1;

        xtag=inputn(substr(var1,last_offset+i*12,2),endian2);

        if xtag eq 36867 then date_offset=

          inputn(substr(

           var1,last_offset+i*12+8,4),endian4);

        else if xtag eq 40962 then width=

          inputn(substr(

           var1,last_offset+i*12+8,4),endian4);

        else if xtag eq 40963 then height=

          inputn(substr(

           var1,last_offset+i*12+8,4),endian4);

      end;

    end;

    if title_offset then title=compress(input(substr(

     var1,title_offset,title_length),$50.),

     "/ ?+#$%&","knp");

    else title="No Title";

    if gps_offset then do;

      numberx=inputn(substr(var1,gps_offset,2),endian2);

      offset=gps_offset+2;

      do i=0 to numberx-1;

        xtag=inputn(substr(var1,offset+i*12,2),endian2);

        if xtag eq 1 then lat=input(substr(

         var1,offset+i*12+8,1),$1.);

        else if xtag eq 2 then do;

          lat_offset=positionx+inputn(substr(var1,

           offset+i*12+8,4),endian4);

          latdeg=inputn(substr(var1,lat_offset+ 0,4),endian4)/

                 inputn(substr(var1,lat_offset+ 4,4),endian4);

          latmin=inputn(substr(var1,lat_offset+ 8,4),endian4)/

                 inputn(substr(var1,lat_offset+12,4),endian4);

          latsec=inputn(substr(var1,lat_offset+16,4),endian4)/

                 inputn(substr(var1,lat_offset+20,4),endian4);

        end;

        else if xtag eq 3 then lon=input(substr(

         var1,offset+i*12+8,1),$1.);

        else if xtag eq 4 then do;

          lon_offset=positionx+inputn(substr(var1,

           offset+i*12+8,4),endian4);

          londeg=inputn(substr(var1,lon_offset+ 0,4),endian4)/

                 inputn(substr(var1,lon_offset+ 4,4),endian4);

          lonmin=inputn(substr(var1,lon_offset+ 8,4),endian4)/

                 inputn(substr(var1,lon_offset+12,4),endian4);

          lonsec=inputn(substr(var1,lon_offset+16,4),endian4)/

                 inputn(substr(var1,lon_offset+20,4),endian4);

        end;

      end;

      coordinates=strip(put(latdeg,best12.))||" "||

      strip(put(latmin,best12.))||"' "||

      strip(put(latsec,best12.))||'" '||

      strip(lat)||","||

      strip(put(londeg,best12.))||" "||

      strip(put(lonmin,best12.))||"' "||

      strip(put(lonsec,best12.))||'" '||

      strip(lon);

    end;

    else coordinates="No GPS";

    if date_offset then dt_taken=

     input(substr(var1,positionx+date_offset,19),anydtdtm19.);

    else dt_taken=0;

    output;

    done=1;

  end;

run;

Trusted Advisor
Posts: 1,300

Deciphering a jpg file

Excellent work, Art, very impressive!  I wish I had a little more time to dedicate to this one.

Super User
Posts: 9,671

Re: Deciphering a jpg file

Art,

I tested it, and it doesn't look right.

I tested with the photo FriedEgg offered:

Update: I am going to be working with this sample file: http://www.exif.org/samples/sanyo-vpcsx550.jpg

Your code returns a date taken of 01jan1960:00:00:00.

But when I look at this photo, it is 2000-11-18:21:14:00.

I also noticed that if the photo has been modified, your code generates an error.

Ksharp
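
A note on the 01jan1960:00:00:00 result: that is how SAS displays a datetime value of 0, which the code assigns whenever it does not find tag 36867 (DateTimeOriginal). The sketch below shows one possible alternative, not the approach used in this thread: it parses the Exif text "YYYY:MM:DD HH:MM:SS" explicitly and leaves the result missing when the text does not match. The sample string is the DateTimeOriginal value from the listings above.

```sas
data _null_;
  format dt_taken datetime19.;
  raw = '2000:11:18 21:14:19';   /* DateTimeOriginal as stored in the file */
  if prxmatch('/^\d{4}:\d\d:\d\d \d\d:\d\d:\d\d$/', raw) then
    /* turn the date part into 2000-11-18, then combine date and time */
    dt_taken = dhms(input(translate(substr(raw,1,10),'-',':'),yymmdd10.), 0, 0,
                    input(substr(raw,12,8),time8.));
  else dt_taken = .;             /* missing, rather than 0 = 01JAN1960 */
  put dt_taken=;
run;
```

A missing value makes it obvious that no date was found, whereas 0 looks like a real, if implausible, timestamp.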

☑ This topic is SOLVED.
