BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Kp1234
Fluorite | Level 6

Hello, 

My Dataset which is a .csv file (Comma Seperated):

Header: ID,"Folder Path","File Name","Log Time",Action,"Login Name",IPAddress,"Transfer Size",Duration

Row1:1344624,"/Distribution/Projects/WUXI-EDO/Input","C12345_xxx_wxp_img_20221013_114200_349.cum",,"file_download",mft2sftp,"172.24.147.36",46438, Row2:13445945,"/Distribution/Projects/WUXI-EDO/Input","C12345_xxx_wxp_img_20221013_114200_349.cum",,"file_upload",hud8,"204.134.192.105",46438,

Row3:13442235,"/Distribution/Projects/WUXI-EDO/Input","C12345_xxx_npn_img_20221014_113344_600.cum",,"file_upload",kp123,"204.134.192.105",46438,

 

Basically, I want to flag any File Name upload which does not have a corresponding download.  In the above example, The expected output is ROW 3 since it does not contain a "file_download" row

 

My code: 

data alldata_status_1;
set alldata;

do;
if File_Name^= ' ' ;
if action in ('file_upload','file_download')
then download_flag = 'yes';
else download_flag = 'no';

end;
run;

 

I am using Download_flag since It will help me subset the datasets. 

 

Thank you for your help.

 

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

@Kp1234 wrote:

Thank you for the solution! Perhaps i was not clear with the solution I was looking for. 

 

In the final output, I would like row 3 to have download_flag= 'No' since for that file_name action upload does not have action download. 

 

Basically, 

If file_name matches, check Action variable has both Upload and Download for that file_name. If true, Download_flag = "Yes" else "NO"


So the only way that makes any sense is if what you mean is that you want to check if the same file has records with both upload and download actions. Which means it is not something you can do by checking just one observation.

 

proc sql ;
create table check as 
  select file_name
       , sum( action = 'file_upload') as uploads
       , sum( action = 'file_download') as downloads
  from have
  group by file_name
;
quit;

Now with that level of information you can check if a file had both upload and download records.

data want;
  set check;
  if uploads and downloads then want='YES';
  else want='NO';
run;

View solution in original post

4 REPLIES 4
SASKiwi
PROC Star

Check your CSV file once imported. I strongly suspect your "Action" column isn't populated because it coincides with double comma delimiters (,,) in your data rows.

Tom
Super User Tom
Super User

Does your file really have that extra stuff at the start of each line?  Why is ROW2 in the middle of ROW1?

Header: ID,"Folder Path","File Name","Log Time",Action,"Login Name",IPAddress,"Transfer Size",Duration
Row1:1344624,"/Distribution/Projects/WUXI-EDO/Input","C12345_xxx_wxp_img_20221013_114200_349.cum",,"file_download",mft2sftp,"172.24.147.36",46438, Row2:13445945,"/Distribution/Projects/WUXI-EDO/Input","C12345_xxx_wxp_img_20221013_114200_349.cum",,"file_upload",hud8,"204.134.192.105",46438,
Row3:13442235,"/Distribution/Projects/WUXI-EDO/Input","C12345_xxx_npn_img_20221014_113344_600.cum",,"file_upload",kp123,"204.134.192.105",46438,

Or is all of that an artifact of typing/pasting the lines into the comment window instead of the pop-up window that would preserve the formatting?

ID,"Folder Path","File Name","Log Time",Action,"Login Name",IPAddress,"Transfer Size",Duration
1344624,"/Distribution/Projects/WUXI-EDO/Input","C12345_xxx_wxp_img_20221013_114200_349.cum",,"file_download",mft2sftp,"172.24.147.36",46438, 
13445945,"/Distribution/Projects/WUXI-EDO/Input","C12345_xxx_wxp_img_20221013_114200_349.cum",,"file_upload",hud8,"204.134.192.105",46438,
13442235,"/Distribution/Projects/WUXI-EDO/Input","C12345_xxx_npn_img_20221014_113344_600.cum",,"file_upload",kp123,"204.134.192.105",46438,

If the file looks like the latter then your code should work fine although you should remove the unnecessary DO and matching END as they add nothing at all to the logic, just confusion for the programmer.

 


proc import dbms=csv file=sample out=want replace ;
run;

data want2;
  set want;
  if File_Name^= ' ' ;
  if action in ('file_upload','file_download') then download_flag = 'yes';
  else download_flag = 'no';
run;

proc print;
run;

Tom_0-1670297084747.png

WARNING:  If you somehow have accidentally set the VALIDVARNAME option to ANY before running the PROC IMPORT step then the third columns variable name will be "File Name"n instead of File_Name.

Or if you wrote your own data step to read the file then use the variable names you created.

 

Kp1234
Fluorite | Level 6

Thank you for the solution! Perhaps i was not clear with the solution I was looking for. 

 

In the final output, I would like row 3 to have download_flag= 'No' since for that file_name action upload does not have action download. 

 

Basically, 

If file_name matches, check Action variable has both Upload and Download for that file_name. If true, Download_flag = "Yes" else "NO"

Tom
Super User Tom
Super User

@Kp1234 wrote:

Thank you for the solution! Perhaps i was not clear with the solution I was looking for. 

 

In the final output, I would like row 3 to have download_flag= 'No' since for that file_name action upload does not have action download. 

 

Basically, 

If file_name matches, check Action variable has both Upload and Download for that file_name. If true, Download_flag = "Yes" else "NO"


So the only way that makes any sense is if what you mean is that you want to check if the same file has records with both upload and download actions. Which means it is not something you can do by checking just one observation.

 

proc sql ;
create table check as 
  select file_name
       , sum( action = 'file_upload') as uploads
       , sum( action = 'file_download') as downloads
  from have
  group by file_name
;
quit;

Now with that level of information you can check if a file had both upload and download records.

data want;
  set check;
  if uploads and downloads then want='YES';
  else want='NO';
run;

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 857 views
  • 0 likes
  • 3 in conversation