BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
AG_Stats
Quartz | Level 8

Hi Everyone,

 

Below are the codes for numeric type datalines.

 

data exposure;
	retain Type 'I' Hlo 'U';

	do Year=1944 to 1949;
		Fmtname=cats('exp', Year, 'fmt');

		do Start='A', 'B', 'C', 'D', 'E';
			End=Start;
			input Label : $3. @;
			output;
		end;
	end;
	drop Year;
datalines;
220 180 210 110 90
202 170 208 100 85
150 110 150 60 50
105 56 88 40 30
60 30 40 20 10
45 22 22 10 8
;

title "Creating the Exposure InFormat";
proc format cntlin=exposure fmtlib;
run;

data read_exp;
	input Worker $ Year JobCode $;
	Exposure = inputn(JobCode,cats('exp',Year,'fmt8.'));
datalines;
001 1944 B
002 1948 E
003 1947 C
005 1945 A
006 1948 d
;

Can anyone assist me if I have datalines a mixture of both numeric and char: (use the following datalines).

 

datalines;
Greater than 210 180 210 110 90
202 170 208 100 .
150 110 greater than 210 60 50
105 . 88 40 Less than 10
60 30 40 20 Less than 10
45 22 22 . Less than 10
;

 

Where, Greater than 210 is one value (character).  For simplicity if required you can replace it to Greater_210

similarly, for less than 10.

And, . denotes numeric missing.

 

However, it might be easy if we proceed by considering all values (in datalines) as character.

But, my concern is can we do for mixture of both type values.  If yes, then please give me codes.

 

Regards,
AG_Stats
1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

Thanks for the image, which is not a huge help, though, because it essentially reflects the information contained in your existing dataset EXPOSURE. More interesting would be the corresponding table of what you want to achieve.

 

Currently, your informats assign 'A', 'B', ..., 'E' to certain numbers. So, it would make sense indeed to add an "other" category (HLO='O') in order to assign "unknown" input values such as 'F' or missing to something like "Not found." However, as mentioned in my first reply, a label like "Not found" would change the type of the informat from numeric to character. To preserve the type of the existing informats you could assign a special missing value, e.g. .N, to indicate the same. The text label "Not found" could then be assigned through a format.

 

So, the question is: Would you like to use the numeric EXPOSURE values (220, 180, etc.) obtained with your informats as numbers (e.g. perform calculations with them)? If so, I suggest to insert the following four lines of code between the two END statements in your first data step:

  hlo='O';
  label='.N';
  output;
  hlo='U';

I'm not quite sure how to interpret your suggestions involving HLO='H' and HLO='L', though. With HLO='H' you could define a label for an alphabetical range of letters such as 'F', 'G', ... But it seems odd to me that all these should be assigned the label "Greater_220." Similarly, HLO='L' could be used to define a label for characters which precede 'A' in the ASCII collating sequence (including digits or a blank). But why should, e.g., '9' be labeled 'Less_8'?

View solution in original post

5 REPLIES 5
ballardw
Super User

Please be more specific. If you are attempting to use a data set as shown below the posted code to build a Cntlin data set please show the equivalent Format value or invalue bits you would explect.

It is not obvious what your "greater than" or "less than" are to be compared to.

For example you have

Greater than 210 180 210 110 90

 do you want something like this?

210< - high ='A' 

180 = 'B'

210 = 'C'

110 = 'D'

90 = 'E'

 

Since you apparently have an abitrary location for the range indicator your input data set will have to be a bit more complicated so that

each "value" also has a space for a rule. It may be doable if you have something like:

>, 210,=, 180,=, 210,=,110,=, 90 and read the "rule" as character and then use If statements to assign the correct start, end and comparisons

 

And are you ever going to want a range such as 20 - 30 = 'A'?

 

 

FreelanceReinh
Jade | Level 19

Hi @AG_Stats,

 

I take it that you just want to create a new set of informats which read single letters (A - E) as values such as 202 or strings like "Greater than 210". Of course, this is possible. (Reading such datalines would be facilitated by separating the different values by two blanks or non-blank delimiters.) However, these new informats would be character informats, not numeric informats as the others. As a consequence, variable EXPOSURE would become a character variable, which might not be ideal for your purposes. To avoid this, one could replace the "Greater than ..." and "Less than ..." labels by numbers (or special missing values if this was more appropriate) and then apply a format to display these particular numbers (or special missing values, resp.) as "Greater than ..." etc.

AG_Stats
Quartz | Level 8

Hi,

 

Please find an image below which might help you: 

 

Untitled.png

 

The codes that I have given in my first post (question) are working well.

Now, again consider my question as I explained below (forgot my previous datalines given below codes - where I have mestioned greater than 200 etc.):

 

Only I want to add some more categories:-

 

Start = '  ';
End= Start;
Hlo = 'o'; /* Or you can change it if required & can add type= I or C also */
Label = 'Not Found';

 

Start = '  ';
End= Start;
Hlo = 'H'; /* Or you can change it if required & can add type= I or C also */
Label = 'Greater_220';

 

Start = '  ';
End= Start;
Hlo = 'L'; /* Or you can change it if required & can add type= I or C also */
Label = 'Less_8';

 

Regards,
AG_Stats
FreelanceReinh
Jade | Level 19

Thanks for the image, which is not a huge help, though, because it essentially reflects the information contained in your existing dataset EXPOSURE. More interesting would be the corresponding table of what you want to achieve.

 

Currently, your informats assign 'A', 'B', ..., 'E' to certain numbers. So, it would make sense indeed to add an "other" category (HLO='O') in order to assign "unknown" input values such as 'F' or missing to something like "Not found." However, as mentioned in my first reply, a label like "Not found" would change the type of the informat from numeric to character. To preserve the type of the existing informats you could assign a special missing value, e.g. .N, to indicate the same. The text label "Not found" could then be assigned through a format.

 

So, the question is: Would you like to use the numeric EXPOSURE values (220, 180, etc.) obtained with your informats as numbers (e.g. perform calculations with them)? If so, I suggest to insert the following four lines of code between the two END statements in your first data step:

  hlo='O';
  label='.N';
  output;
  hlo='U';

I'm not quite sure how to interpret your suggestions involving HLO='H' and HLO='L', though. With HLO='H' you could define a label for an alphabetical range of letters such as 'F', 'G', ... But it seems odd to me that all these should be assigned the label "Greater_220." Similarly, HLO='L' could be used to define a label for characters which precede 'A' in the ASCII collating sequence (including digits or a blank). But why should, e.g., '9' be labeled 'Less_8'?

AG_Stats
Quartz | Level 8

Quite useful, Thanks!

Regards,
AG_Stats

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 889 views
  • 3 likes
  • 3 in conversation