Although everything started out as one big character column, what we did was work out what kind of variable each one was going to be: free text, a free-ranging number like weight, a date, or a numeric variable with a format (like 1, 2, and 3 formatted as Yes, No, and Don't Know), and so on. Then we dealt with each kind separately. It actually got more complicated than that for various reasons. One was that some dates required an exact date, for other dates the month and year were enough and we could infer the 15th of the month, and for yet other dates just the year was good enough and we could infer July 2 as the month and day.
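For anyone following along, here is a rough sketch of the date rule described above. It is not our actual code. The dataset name `dates_imputed`, the variables `raw_date` and `clean_date`, and the three input layouts (yyyy-mm-dd, yyyy-mm, yyyy) are all assumptions made up for illustration, and it picks the rule from the shape of the value, whereas in our data the rule depends on which variable it is.

```sas
/* Sketch only: raw_date is a hypothetical character value that is either
   a full date (yyyy-mm-dd), month and year (yyyy-mm), or year only (yyyy). */
data dates_imputed;
  length raw_date $10;
  input raw_date $;
  format clean_date date9.;
  select (length(raw_date));
    /* exact date supplied: read it as is */
    when (10) clean_date = input(raw_date, ?? yymmdd10.);
    /* month and year only: assume the 15th of the month */
    when (7)  clean_date = mdy(input(substr(raw_date, 6, 2), ?? 2.),
                               15,
                               input(substr(raw_date, 1, 4), ?? 4.));
    /* year only: assume July 2, roughly the middle of the year */
    when (4)  clean_date = mdy(7, 2, input(raw_date, ?? 4.));
    /* anything else is left missing */
    otherwise clean_date = .;
  end;
  datalines;
2019-03-27
2019-03
2019
;
run;
```

The `??` modifier just keeps invalid values from filling the log with notes; those values end up missing.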
Maybe there is a more efficient way of doing it, but at this point there is no turning back, since we have it all working (I think) except for the formats. I think we're at the point where all the variables are in a nice SAS dataset and the format names are stored elsewhere, keyed by the same SAS variable names, so we can merge the two and end up with the variables and their formats in the same dataset. Of course, every variable will have a format of one kind or another, but I'm talking about formats like 1=Yes, 2=No, 3=Don't Know; the rest of the formats will already be assigned. If so, I may be able to do that with a data _null_.
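Something like the following is what I have in mind for the data _null_ step. I haven't run it against our data, so treat it as a sketch. The names `survey`, `fmt_lookup`, `varname`, `fmtname`, and the `ynd` format are invented for the example. It uses `call execute` to generate a `proc datasets` step that attaches each format to its variable.

```sas
/* Hypothetical value format for the coded answers */
proc format;
  value ynd 1 = 'Yes' 2 = 'No' 3 = "Don't Know";
run;

/* Hypothetical data: q1 and q2 are coded, weight is a plain number */
data survey;
  input q1 q2 weight;
  datalines;
1 3 150
2 1 180
;
run;

/* Hypothetical lookup: one row per variable that needs a value format */
data fmt_lookup;
  length varname $32 fmtname $32;
  input varname $ fmtname $;
  datalines;
q1 ynd
q2 ynd
;
run;

/* Build one PROC DATASETS step with a FORMAT statement per lookup row.
   The generated code runs after this data step finishes. */
data _null_;
  set fmt_lookup end=last;
  if _n_ = 1 then
    call execute('proc datasets library=work nolist; modify survey;');
  call execute(catx(' ', 'format', varname, cats(fmtname, '.'), ';'));
  if last then call execute('run; quit;');
run;
```

Going through `proc datasets` changes only the variable attributes, so the data itself is not rewritten, which seemed safer than re-creating the dataset.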
I've been busy with other stuff so I haven't tried it yet but I just wanted to post this so you know I haven't vanished.