If you’ve ever needed to handle a large CSV file at the command line, Ruby is a good option. Its built-in CSV library understands quoted fields and embedded commas, so it is more reliable at CSV parsing than splitting on commas in awk. I found this necessary when trying to manipulate a CSV with more than 65k lines, which my available spreadsheet program (Apple’s Numbers) didn’t like.
This is the base command:
ruby -rcsv -ne 'puts $_.parse_csv'
-rcsv loads (“requires”) the built-in CSV library
-e provides a script to execute (rather than a filename)
-n wraps a while gets loop around the script provided with -e, making ruby act a little like awk or sed -n
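To make the effect of these flags concrete, here is a rough sketch of the same command written out as an ordinary Ruby script. The helper name print_fields and the sample data are made up for illustration, and the helper reads from any IO object rather than standard input so it can be tried without a file.

```ruby
require "csv"       # what -rcsv does on the command line
require "stringio"

# Hypothetical helper: roughly what `ruby -rcsv -ne 'puts $_.parse_csv'` does,
# but with the input and output passed in explicitly.
def print_fields(input, output = $stdout)
  # -n wraps the script in a loop like this one; on the command line
  # each line is also assigned to $_ rather than a local variable.
  while (line = input.gets)
    output.puts line.parse_csv   # the script that was given with -e
  end
end

# Made-up sample data with a quoted field containing a comma.
sample = StringIO.new(%(id,name\n1,"Smith, Jane"\n))
print_fields(sample)
# prints each field on its own line:
#   id
#   name
#   1
#   Smith, Jane
```

Note that puts prints each element of an Array on its own line, which is why the fields come out one per line.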
The $_ variable holds the current line, and each line is processed separately as it streams through. Once parsed, it’s a Ruby Array, so you can use any Enumerable or Array methods. For example, to get the very last column:
ruby -rcsv -ne 'puts $_.parse_csv.last'
# or
ruby -rcsv -ne 'puts $_.parse_csv[-1]'
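As a small illustration of what the parsed row looks like, the sketch below parses one made-up line and applies a few Array methods to it. The sample data is an assumption for the example. Every field comes back as a String, so numeric columns need an explicit conversion.

```ruby
require "csv"

# A made-up sample line, including the trailing newline that $_ would carry.
line = %(2024-01-05,"Widget, large",3,19.99\n)

row = line.parse_csv
# row is ["2024-01-05", "Widget, large", "3", "19.99"]
# The quoted comma stays inside the second field.

last_column = row.last            # "19.99", same as row[-1]
as_number   = row.last.to_f       # 19.99, fields are Strings until converted
picked      = row.values_at(0, 2) # ["2024-01-05", "3"]

# Array#to_csv (also from the csv library) turns a row back into a CSV line,
# re-quoting fields where needed.
csv_again = row.values_at(0, 1).to_csv

puts last_column
puts picked.inspect
puts csv_again
```

Any of these expressions can be dropped into the one-liner in place of .last, for example $_.parse_csv.values_at(0, 2).to_csv to keep only the first and third columns.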
You can pipe it like you’d expect in Unix as well. As an example, here’s how to get the maximum value for the last column of a big CSV file.
cat big.csv | ruby -rcsv -ne 'puts $_.parse_csv.last' | sort -n | tail -n1
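If you would rather stay inside Ruby than pipe through sort and tail, the same result can be sketched with CSV.foreach, which reads the file one row at a time. This is only one possible approach, not a replacement for the pipeline above. The method name max_last_column is hypothetical, and it assumes that non-numeric values (such as a header row) should simply be skipped, which differs slightly from how sort -n treats them.

```ruby
require "csv"

# Hypothetical helper: largest numeric value in the last column of a CSV file.
# Rows whose last field is not a number (headers, blanks) are skipped.
def max_last_column(path)
  max = nil
  CSV.foreach(path) do |row|
    # Float(..., exception: false) returns nil instead of raising
    # when the text is not a valid number.
    value = Float(row.last.to_s, exception: false)
    next if value.nil?
    max = value if max.nil? || value > max
  end
  max
end

# Usage when saved as a script: ruby max_last_column.rb big.csv
puts max_last_column(ARGV.first) if ARGV.first
```

Because CSV.foreach parses the file as a whole rather than line by line, this version also copes with quoted fields that contain line breaks, which the -n one-liner would split.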