This is best explained by way of an example. Suppose the file data contains the lines:

    x    y    z    v    u    w      i
    -    -    -    -    -    -      -
    0    1    2    3    4    *      6
    1    2    3    4    5    **     7
    2    3    4    5    6    ***    8
    3    4    5    6    7    v      9
    4    5    6    7    8    vi     10

where exactly one tab character separates the columns.
The command select 'z == 2' < data (where == is the usual Unix notation for 'equal') yields

    x    y    z    v    u    w      i
    -    -    -    -    -    -      -
    0    1    2    3    4    *      6

on the standard output.
Other numerical comparison operators include <, >, >= and <=, and comparisons can be combined with && (and) and || (or), using parentheses as needed.
The command select 'z > 2 && v <= 5' < data will give

    x    y    z    v    u    w      i
    -    -    -    -    -    -      -
    1    2    3    4    5    **     7
    2    3    4    5    6    ***    8

This last command should be read: select those lines where the z column is (strictly) larger than 2 and (at the same time) the v column is less than or equal to 5.
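
The operators and the grouping with parentheses follow awk's expression syntax, so a grouped condition can be tried out without select at all. The sketch below is only an illustration: it rebuilds the example file in a temporary location and runs plain awk with positional fields, on the assumption that z, u are the third and fifth columns. The helper name row and the temporary file are inventions for this example. The condition corresponds to the select expression '(z == 2 || z >= 5) && u <= 7'.

```shell
# row is a hypothetical helper: print its arguments joined by single tabs.
row() {
    printf '%s' "$1"
    shift
    printf '\t%s' "$@"
    printf '\n'
}

# Rebuild the example file (assumed layout: header, dash line, data rows).
data=$(mktemp)
{
    row x y z v u w i
    row - - - - - - -
    row 0 1 2 3 4 '*' 6
    row 1 2 3 4 5 '**' 7
    row 2 3 4 5 6 '***' 8
    row 3 4 5 6 7 v 9
    row 4 5 6 7 8 vi 10
} > "$data"

# Keep the two header lines, then apply the grouped condition.
# $3 is the z column and $5 is the u column in this file.
out=$(awk -F'\t' 'NR <= 2 || (($3 == 2 || $3 >= 5) && $5 <= 7)' "$data")
printf '%s\n' "$out"
rm -f "$data"
```

With the example data this keeps the rows where z is 2 or 5. The row with z equal to 6 is dropped because its u value is 8.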
Select simply calls upon awk to do the work. Any bugs in awk will be reflected in select.
Since the column names are translated into awk variable names, column names are bound by the same rules as variable names in awk. To be on the safe side, one should only use alphabetic characters and underscores. International characters (e.g. from the upper half of ISO 8859-1) will probably not work. Other symbols, such as the hyphen (minus) and the asterisk, will not work.
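
To make the translation of column names into awk variables more concrete, here is a minimal sketch of how such a translation could be done. It is not the actual select script: the function name select_sketch, the helper row and the generated program text are assumptions made for illustration, and the input is assumed to consist of a header line, a dash line and tab-separated rows.

```shell
# row is a hypothetical helper: print its arguments joined by single tabs.
row() {
    printf '%s' "$1"
    shift
    printf '\t%s' "$@"
    printf '\n'
}

# select_sketch is a hypothetical stand-in for select, not the real script.
select_sketch() {
    cond=$1
    # Pass the header line and the dash line through unchanged.
    IFS= read -r header
    IFS= read -r dashes
    printf '%s\n%s\n' "$header" "$dashes"
    # Turn the header into awk assignments: "x = $1; y = $2; ..."
    assign=$(printf '%s\n' "$header" |
        awk -F'\t' '{ for (n = 1; n <= NF; n++) printf "%s = $%d; ", $n, n }')
    # The first rule binds the names on every line; the second rule is the
    # selection itself, with awk's default action of printing the line.
    awk -F'\t' "{ $assign }
$cond"
}

out=$(
    {
        row x y z v u w i
        row - - - - - - -
        row 0 1 2 3 4 '*' 6
        row 1 2 3 4 5 '**' 7
        row 2 3 4 5 6 '***' 8
        row 3 4 5 6 7 v 9
        row 4 5 6 7 8 vi 10
    } | select_sketch 'z > 2 && v <= 5'
)
printf '%s\n' "$out"
```

Because each column name is pasted into the generated awk program as an identifier, a name such as total-sum would be parsed by awk as a subtraction, which is why the restrictions above apply.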
Some names are reserved awk symbols and should not be used for column names (e.g. if, for, length).
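
A header line can be checked against these rules before select is used. The following is a rough sketch under stated assumptions: check_names is a hypothetical helper, the reserved list is only a partial sample of awk keywords and built-in functions, and the check is the conservative "alphabetic characters and underscores" rule given above rather than awk's full identifier syntax.

```shell
# check_names is a hypothetical helper: it reads a tab-separated header line
# on standard input and reports column names that are likely to cause trouble.
check_names() {
    awk -F'\t' '
        BEGIN {
            # Partial sample of reserved awk words, not a complete list.
            n = split("if else while for do break continue next exit return delete in function getline print printf length substr index split", words, " ")
            for (k = 1; k <= n; k++)
                reserved[words[k]] = 1
        }
        NR == 1 {
            for (k = 1; k <= NF; k++) {
                if ($k !~ /^[A-Za-z_]+$/) {
                    print $k ": unsafe characters"
                } else if ($k in reserved) {
                    print $k ": reserved in awk"
                }
            }
            exit
        }'
}

# Example header with two unsafe names and one reserved name.
out=$(printf 'x\ttotal-sum\tlength\tw*\tok_name\n' | check_names)
printf '%s\n' "$out"
```

Names that pass both tests, such as x and ok_name in the example, produce no output.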