I have an input file and i am trying to group it and create a text file in a particular format.
Input data
vnumber key1 suffx source amount product cust
102XX 2948S X Axis 100 prod1 cust1
102XX 2948S X Axis 50 prod1 cust1
102XX 2948S X Axis 200 prod2 cust2
245TX 1801 Y Base 250 prod1 cust1
245TX 1801 Y Base 320 prod2 cust1
809TX 2341 Z Code 313 prod1 cust1
810TX 2341 Z Code -826.88 prod1 cust1
I am trying to create the following output from above
Record type vnumber key1 suffx source Number of lines amount
V 102XX 2948S X Axis 350
L 102XX 2948S X Axis 3 350
V 245TX 1801 Y Base 570
L 245TX 1801 Y Base 2 570
V 809TX 2341 Z Code 313
L 809TX 2341 Z Code 1 313
V 810TX 2341 Z Code -413.44
L 810TX 2341 Z Code 1 -413.44
I am trying to sort the input data by vnumber, key1, suffx, source
and extract first observation from each by group as record of the type "V" (header)
and the same record as Line and remaining records of the same group as L and would like to get the L record count in a variable. I am not sure how to extract the count, get the sum of amounts for each group.
proc sort data=test out=test1;
by vnumber key_1 suffx source;
run;
data flags_v;
set test1;
by vnumber key_1 suffx source;
if first.vnumber and first.key_1 and first.suffx =1 and first.source = 1 then
do;
rec_type="V";
end;
else
do;
if vnumber = lag(vnumber ) and key_1 = lag(key_1) and suffx = lag(suffx)
and source = lag(source) then
rec_type2 = "L";
else rec_type = "V";
end;
run;
************************************************************************************************************************************
Added the below lines to ask an additional questions.
*************************************************************************************************************************************
I have an input file and i am trying to group it and create a text file in a particular format.
Input data
vnumber key1 suffx source amount product cust
102XX 2948S X Axis 100 prod1 cust1
102XX 2948S X Axis 50 prod1 cust1
102XX 2948S X Axis 200 prod2 cust2
245TX 1801 Y Base 250 prod1 cust1
245TX 1801 Y Base 320 prod2 cust1
809TX 2341 Z Code 313 prod1 cust1
810TX 2341 Z Code -826.88 prod1 cust1
I am trying to create the following output from above
Record type | vnumber | key1 | suffx | source | Number of lines | amount | Number of accounting lines | Accounting amount | VAR1 |
V | 102XX | 2948S | X | Axis | 350 | ||||
L | 102XX | 2948S | X | Axis | 3 | 350 | 2 | ||
A | 102XX | 2948S | X | Axis | 2 | 150 | |||
A | 102XX | 2948S | X | Axis | 2 | 200 | |||
P | Payment | ||||||||
V | 245TX | 1801 | Y | Base | 570 | ||||
L | 245TX | 1801 | Y | Base | 2 | 570 | 2 | 250 | |
A | 245TX | 1801 | Y | Base | 2 | 320 | |||
A | 245TX | 1801 | Y | Base | 2 | ||||
P | Payment | ||||||||
V | 809TX | 2341 | Z | Code | 313 | ||||
L | 809TX | 2341 | Z | Code | 1 | 313 | 1 | ||
A | 809TX | 2341 | Z | Code | 1 | 313 | 1 | 313 | |
P | Payment | ||||||||
V | 810TX | 2341 | Z | Code | -413.44 | ||||
L | 810TX | 2341 | Z | Code | 1 | -413.44 | 1 | ||
A | 810TX | 2341 | Z | Code | 1 | -413.44 | 1 | -413.44 | |
P | Payment |
I am trying to sort the input data by vnumber, key1, suffx, source
and extract first observation from each by group as record of the type "V" (header)
and the same record as Line and remaining records of the same group as L and would like to get the L record count in a variable
Once I am able to classify the records as V, L ..i need to check the records within L to see if prod and cust are same as prod and cust for previous records.
if they are same , then it can be treated as 1 'A'. If either of it is different, it has to be treated as different A. In the above example the first by group has distinct values for prod and cust
so it has two accounting lines and amounts also get added according to the accounting lines.
Finally I need to add a record 'P' for each group with an additional variable .
Appreciate any help with this.
Appreciate any help on this.
Regards,
Sheeba
I assume that amount for the last two records in your example was wrong. If so, I think that the following does what you want:
data test;
input (vnumber key1 suffx source) ($) amount (product cust) ($);
cards;
102XX 2948S X Axis 100 prod1 cust1
102XX 2948S X Axis 50 prod1 cust1
102XX 2948S X Axis 200 prod2 cust2
245TX 1801 Y Base 250 prod1 cust1
245TX 1801 Y Base 320 prod2 cust1
809TX 2341 Z Code 313 prod1 cust1
810TX 2341 Z Code -826.88 prod1 cust1
;
proc sort data=test out=test1;
by vnumber key1 suffx source;
run;
data flags_v (drop=cust nlines sum);
set test1;
by vnumber key1 suffx source;
if first.source then do;
nLines=1;
sum=amount;
end;
else do;
sum+amount;
nLines+1;
end;
if last.source then do;
amount=sum;
rec_type = "V";
output;
rec_type = "L";
Number_of_Lines=nLines;
output;
end;
run;
Art, CEO, AnalystFinder.com
I assume that amount for the last two records in your example was wrong. If so, I think that the following does what you want:
data test;
input (vnumber key1 suffx source) ($) amount (product cust) ($);
cards;
102XX 2948S X Axis 100 prod1 cust1
102XX 2948S X Axis 50 prod1 cust1
102XX 2948S X Axis 200 prod2 cust2
245TX 1801 Y Base 250 prod1 cust1
245TX 1801 Y Base 320 prod2 cust1
809TX 2341 Z Code 313 prod1 cust1
810TX 2341 Z Code -826.88 prod1 cust1
;
proc sort data=test out=test1;
by vnumber key1 suffx source;
run;
data flags_v (drop=cust nlines sum);
set test1;
by vnumber key1 suffx source;
if first.source then do;
nLines=1;
sum=amount;
end;
else do;
sum+amount;
nLines+1;
end;
if last.source then do;
amount=sum;
rec_type = "V";
output;
rec_type = "L";
Number_of_Lines=nLines;
output;
end;
run;
Art, CEO, AnalystFinder.com
Hi Art,
Thanks a lot for the response.
This was exactly what I was looking for.
Amounts can be negative and it can have decimal places too... I will try modifying the code to introduce additional level of records...just like your code calculates number of L within V...I need to calculate number of A within L depending on prod and cust. I am thinking of adding the variables to the sort and trying it out without impacting the existing functionality.
Thanks a lot.
Regards,
Sheeba
Hi Art297,
Thanks a lot for the response.
This was exactly what I was looking for.
Amounts can be negative and it can have decimal places too... I will try modifying the code to introduce additional level of records...just like your code calculates number of L within V...I need to calculate number of A within L depending on prod and cust. I am thinking of adding the variables to the sort and trying it out without impacting the existing functionality.
Thanks a lot.
Regards,
Sheeba
You didn't mention them before but, given the example data you provided, I presume that your additionally looking for changes/counts by either product, customer or both. However, you'd have to provide an example in order for anyone to help.
Art, CEO, AnalystFinder.com
Hi Art297,
I assumed it might cause a confusion. it was my mistake.
This is a great code and really helps.
At this point , I am not sure if I am allowed to ask additional questions on this thread....
Regards,
Sheeba
You can either mark this thread answered and start another thread, or ask continue using this thread. Your choice!
I, personally, would mark this one as answered and start another thread. That way, you'd likely get more responses to the followup question.
Art, CEO, AnalystFinder.com
Thanks a lot for the directions.
I will continue using this thread and edit the post.
I will keep this mind from now on.
Thanks,
Regards,
Sheeba
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.