Tuesday, February 07, 2023

Removing Unwanted Lines From A File

Intro

 This is a bit of a "back to basics" post.  I find it is good to revisit, now and then, as it keeps you sharp.  It also helps as a reminder when you find yourself needing to perform this task.  One never know when they need to remove lines from something like a log file. ( for say, hiding one's interactions on a server )

So, lets say you have a file ( we will call it logfile.txt) on a system, and you need to remove all the lines that contain the text foobar (I know, pretty standard, but you understand the usage).  I will preface this with the fact that this is being done on Linux.  And as with anything on Linux, there are a plethora of ways to do things.  So long as the end result is what you require, then the method was correct, even if there are quicker or more efficient ways.    These are just a couple of the quick ways that you could achieve this goal.  So without further ado, let's jump right in.

NOTE: Please keep in mind that all of these methods will achieve the same exact thing, just in different ways.



Method 1

So the first method is more my preferred method.  Why my preferred?  Well, because its a one-liner and it doesn't require me to do any moving of files to different names.  Its just quick and direct.  But also, you need to make sure you are absolutely positive that it is doing what you expect, as it is immediately effecting of your file and not reversible.

 

The Method:

$ sed -i '/sometext/d' logfile.txt

 In the above, the '-i' tells sed to edit the file in place (that's what makes this a more dangerous and immediately effecting version).  The /sometext/ is the text that you want to match on each line in the file.  (yes, the file will be read, line by line, matching against 'sometext' to check for a match).  The 'd' option says to delete that line if a match is found.  'logfile.txt' is the file to search in. 

This tends to work quick and is efficient.  It would help your situation to search the file, say with grep, first, and ensure of what you will be matching.  Caution is always a good thing to err on the side of, but that's just me. 


A Note Before Continuing:  The above method is the only immediate method I am presenting.  The other two methods I show, will involve the use of temporary files.  That said, they are the safer options, unless of course, you decide to script them, in which case, you make them more immediate.  Your choice.



Method 2

This method involves using grep and using the '-v' option, which will ignore the lines that match.  It will take the lines that do NOT match, and output them to the temporary file.  You will then take the temporary file, after grep runs over your file, and move it back to the original filename.  Or, you could take and name it something different, the option is yours.

 

The Method:

$ grep -v "sometext" logfile.txt > tempfile && mv tempfile logfile.txt

 

Again, you could move the tempfile to another filename.  Or, even better, after purging into the tempfile, rename the original file to a backup name, and then rename the tempfile to the original name.

 

 

Method 3

This is the last method I will cover here.  This involves using the awk utility.  Just as the others, it will do a match, but its method is to use a '!', which means DON'T match, which tells awk you want to match all lines that do not contain the matching text.  So any lines that do not have the 'sometext' matching text, will be output to the tempfile.  

 

The Method:

$ awk '!/sometext/' logfile.txt > tempfile && mv tempfile logfile.txt

 

And as noted previously, you can rename as you wish.  Either back to the original file name, or backing up the original name first, and then renaming.  Its totally up to you.  And also, totally scriptable.

 

Conclusion

I hope that this helps you manage this basic of tasks.  Enjoy!

 

 


No comments:

 
Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.