Chapter 8 File reading and processing
There are many ways to show the contents of a file. Below are a few examples.
The files for the examples are in the directory "/pub14/tea/nsc2xx/Linux/5_reading_files/" (replace the x characters with your user number).
8.1 Print out a file
The cat command prints the entire contents of a file to the screen. This is useful for small text files and for pipelines (pipelines are not covered here).
Example commands are below (remember to replace the x characters in the path with your user number):
Note: remember to use tab completion and the arrow keys.
Print contents of "short_file.txt" to screen
Print contents of "Scientist.txt" to screen
Print contents of "ecoli.gbk" to screen
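The three tasks above all follow the same pattern: cat followed by a file name. A minimal sketch is below. It assumes nothing about the course files: "short_file.txt" here is a hypothetical stand-in created in a temporary directory so the commands run anywhere, and the real file will have different contents.

```shell
# Create a temporary directory with a stand-in file (hypothetical contents,
# not the course's real short_file.txt)
demo_dir=$(mktemp -d)
cd "$demo_dir"
printf 'line one\nline two\nline three\n' > short_file.txt

# Print the entire contents of the file to the screen
cat short_file.txt
```

In the course directory the same form applies to the other files, e.g. cat followed by Scientist.txt or ecoli.gbk.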
Remember: the clear command clears the screen.
8.2 head and tail
The head command prints the top n lines of a file to the screen.
The tail command prints the bottom n lines of a file to the screen.
The default value of n is 10. The -n option can be used to indicate how many lines to print.
Carry out the below commands in the directory "/pub14/tea/nsc2xx/Linux/5_reading_files/"
Print out the top 10 lines of "ecoli.gbk"
Print out the bottom 10 lines of "ecoli.gbk"
Print out the top 25 lines of "ecoli.gbk"
Print out the bottom 2 lines of "ecoli.gbk"
Print out all but the bottom 2 lines of "Scientist.txt"
Print out all lines starting from the 2nd top line of "Scientist.txt"
Print out all but the bottom 5 lines of "Scientist.txt"
Print out all lines starting from the 3rd top line of "Scientist.txt"
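The tasks above could be carried out along the following lines. This is a sketch under assumptions: "lines.txt" is a hypothetical stand-in file of 30 numbered lines (so it is easy to see which lines are printed), and the negative count for head (-n -2) relies on the GNU version of head found on most Linux systems.

```shell
# Stand-in file: 30 lines containing the numbers 1 to 30 (hypothetical,
# used instead of the course files so the sketch runs anywhere)
demo_dir=$(mktemp -d)
cd "$demo_dir"
seq 1 30 > lines.txt

head lines.txt          # top 10 lines (the default)
tail lines.txt          # bottom 10 lines (the default)
head -n 25 lines.txt    # top 25 lines
tail -n 2 lines.txt     # bottom 2 lines

# A minus sign before the number makes head print all BUT the last n lines
head -n -2 lines.txt    # all but the bottom 2 lines (GNU head)
head -n -5 lines.txt    # all but the bottom 5 lines (GNU head)

# A plus sign before the number makes tail START at line n
tail -n +2 lines.txt    # everything from the 2nd line onwards
tail -n +3 lines.txt    # everything from the 3rd line onwards
```

Replacing lines.txt with ecoli.gbk or Scientist.txt gives the commands for the course files.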
8.3 File viewing with less
The less command displays a file’s contents one page at a time. Various keys allow you to navigate the contents of the file. The keys below work the same way when viewing a manual page with the man command.
- q : Exit
- up and down arrow keys : Will move up/down 1 line at a time
- space : Move down one page
- b : Move up one page
- / : Follow this with a term to search for it in the file’s contents
- n : Find the next occurrence of the term last searched for
- N : Find the previous occurrence of the term last searched for
- g : Jump to the first line of the file
- G : Jump to the bottom line of the file
Use the less command to view the contents of the "ecoli.gbk" file. Then find the 3rd occurrence of the word ‘ribosome’. Afterwards move around the file.
Look at the manual for less (using the man command) and search for the first occurrence of the string ‘percent’. Afterwards look around the manual page.
8.4 Word count
The wc command counts the contents of files. It displays the line, word and byte counts for each file, in that order.
Use wc to see the line, word and byte counts of the "short_file.txt", "Scientist.txt" and "ecoli.gbk" files. As you can see, you can carry this out on multiple files at once.
Count the number of characters in the "short_file.txt" file
Count the number of lines in the "ecoli.gbk" file
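A sketch of the wc tasks above is shown below. The two files are hypothetical stand-ins created on the spot (the course files will give different counts). The -l, -w, -c and -m options restrict the output to lines, words, bytes and characters respectively.

```shell
# Stand-in files with hypothetical contents
demo_dir=$(mktemp -d)
cd "$demo_dir"
printf 'one two three\nfour five\n' > short_file.txt
printf 'Ada Lovelace\t1815\nAlan Turing\t1912\n' > Scientist.txt

# Line, word and byte counts, in that order
# (for this stand-in short_file.txt: 2 lines, 5 words, 24 bytes)
wc short_file.txt

# Several files at once: one row per file plus a total row
wc short_file.txt Scientist.txt

wc -m short_file.txt    # number of characters only
wc -l Scientist.txt     # number of lines only
```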
8.5 Pattern searching
The grep command searches for a pattern in a text file and outputs all the lines containing the pattern.
Print out the lines from "Scientist.txt" that have the number 18 in them. In this particular file that happens to be every scientist born in the 1800s. This will not always hold, as grep matches 18 anywhere in a line (for example, it would also match the year 1918).
Print out the lines which have the string "Ada" in them.
Print out the lines which have the string "ada" in them. There should be none, as grep is case sensitive.
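The three searches above can be sketched as follows. The "Scientist.txt" created here is a hypothetical stand-in (a name, one tab, then a birth year) and is not the course file. It deliberately includes someone born in 1918 to show that searching for 18 can match more than the 1800s. The -i option is not covered in this chapter and is included only as an aside.

```shell
# Stand-in file: name, a tab, then year of birth (hypothetical selection)
demo_dir=$(mktemp -d)
cd "$demo_dir"
printf 'Ada Lovelace\t1815\nCharles Darwin\t1809\nMarie Curie\t1867\n' > Scientist.txt
printf 'Alan Turing\t1912\nFrederick Sanger\t1918\nRosalind Franklin\t1920\n' >> Scientist.txt

# Lines containing 18: three people born in the 1800s, plus the 1918 line
grep '18' Scientist.txt

# Lines containing the string Ada
grep 'Ada' Scientist.txt

# grep is case sensitive, so lower-case ada matches nothing in this file.
# grep exits with a non-zero status when there is no match, hence the || part.
grep 'ada' Scientist.txt || echo "no lines matched"

# Aside: -i makes the search case insensitive
grep -i 'ada' Scientist.txt
```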
Type in the following command.
The above command will appear stuck because grep has not been given a file to search, so it waits for input typed at the keyboard. To cancel the command press Ctrl+C.
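The behaviour above can be illustrated with a short sketch. It assumes the stuck command was grep given a pattern but no file name. The waiting command itself is left commented out so the sketch does not hang. The file is a hypothetical stand-in, and the < redirection is not covered in this chapter.

```shell
# Stand-in file (hypothetical contents)
demo_dir=$(mktemp -d)
cd "$demo_dir"
printf 'Ada Lovelace\t1815\nAlan Turing\t1912\n' > Scientist.txt

# With a pattern but no file name, grep reads from standard input.
# In a terminal that means it waits for you to type lines, so it looks stuck.
# grep 'Ada'        # commented out: would wait for keyboard input (Ctrl+C cancels)

# Giving grep a file name means there is nothing to wait for
grep 'Ada' Scientist.txt

# Feeding the file to standard input with < has the same effect
grep 'Ada' < Scientist.txt
```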
8.6 Text editor
Three of the most popular text editors are vim, gedit and nano. Below is a quick introduction to nano.
nano is the easiest to learn but is quite limited. vim and gedit are quite similar in power, with different people preferring one or the other.
If you are interested in learning vim in the future, you can find a quick guide in the appendix.
8.6.1 nano
To enter the nano text editor, use the nano command. The command structure is: nano file.txt
nano can be run with the name of an existing file, which you can then edit, or with a new file name, in which case a new file is created.
Once you are in the editor you can type characters and move around with the arrow keys.
To carry out specific functions you will need to use Ctrl or Alt with another key. At the bottom of the editor are a few examples, where ^ indicates Ctrl. For example, ^G Get Help means you need to press Ctrl+G to get help. When you use letters this way in nano they are case insensitive (i.e. Caps Lock can be on or off and you will get the same result).
After you carry out a function, look at the bottom of the editor again, as it may ask you to type something or show a new set of functions you can use.
Below are some important examples:
- Ctrl+X - Exit nano
- Ctrl+S - Save file
- Ctrl+O - Save file as
- Ctrl+A - Jump to the start of a line
- Ctrl+E - Jump to the end of a line
- Ctrl+W - Start search (Where is). Note: this is unfortunately also the shortcut to close a tab in internet browsers, so it can't be used within our webVNC.
- Alt+W - Continue search forward (find next occurrence forward)
- Alt+Q - Find next occurrence backward
- Ctrl+K - Cut current line
- Alt+\ - Go to the first line
- Alt+/ - Go to the last line
8.6.2 Tasks
Carry out the following tasks in the directory: "/pub14/tea/nscxx/Linux/5_reading_files/"
Using a text editor (nano or vim), add an entry for the scientist Mae Jemison (born 1956) to the file "Scientist.txt". The name and date are separated by a single tab.
Using your text editor of choice, delete all the scientists born before the year 1000 from the "Scientist.txt" file and save the result as "Scientist_post_1000.txt".
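Once the first task is done, the commands from this chapter can be used to check the result. The sketch below is an illustration only: the "Scientist.txt" it creates is a hypothetical stand-in, and the printf line merely stands in for the edit you would make in nano or vim. The -A option of cat (GNU cat) displays each tab as ^I and each line end as $.

```shell
# Stand-in file (hypothetical contents, not the course file)
demo_dir=$(mktemp -d)
cd "$demo_dir"
printf 'Ada Lovelace\t1815\nAlan Turing\t1912\n' > Scientist.txt

# Stand-in for the edit made in the text editor:
# the name, a single tab, then the year
printf 'Mae Jemison\t1956\n' >> Scientist.txt

# Check that the new entry is present
grep 'Mae Jemison' Scientist.txt

# Check the last line of the file
tail -n 1 Scientist.txt

# Show tabs as ^I to confirm exactly one tab separates name and year (GNU cat)
cat -A Scientist.txt
```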
8.7 MCQs: File reading and processing
Please attempt to answer the Multiple-Choice Questions below to reinforce what you have learnt in this chapter.
- What command searches for a pattern?
- What command word counts files?
- What command prints the contents of a file?
- What command displays a file's contents one page at a time and allows keyboard navigation?
- What command prints out the top n lines of a file?
- What command prints out the bottom n lines of a file?