Linux: Splitting files in two

Here are two Bash scripts that split the lines of a file into two files according to a given ratio. The first keeps the original line order; the second distributes the lines randomly.

#!/bin/bash
 
# This script writes the first part of the lines of the given input file to one output file and the remaining lines to another output file.
# The first output file (with the postfix ".ratio") will contain a number of lines corresponding to the given ratio.
# The second output file (with the postfix ".rest") will contain the remaining lines.
# Both output files will have the given prefix.
 
# Arguments:
# 1: file name
# 2: split ratio, a number between 0 and 1 (exclusive)
# 3: output prefix
 
lines=$(wc -l < "$1") # reading from stdin makes wc print only the count, without the file name
 
split=$(echo "$lines * $2" | bc -l)
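split=${split%.*} # drop the fractional part to floor the number of lines for the first file
split=${split:-0} # bc prints values below 1 as ".5", which would leave an empty string here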
 
echo "$lines" # total number of lines in the input file
echo "$split" # number of lines written to the ".ratio" file
 
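# Pass the shell values in with -v instead of pasting them into the awk program text.
awk -v n="$split" -v first="$3.ratio" -v rest="$3.rest" '{ if (NR <= n) print > first; else print > rest }' "$1"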
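The same order-preserving split can also be sketched with head and tail instead of a single awk pass. This is only an illustration, not the script above: the function name split_head_tail is hypothetical, and awk's int() replaces bc for the flooring step, so bc is not needed.

```shell
# Hypothetical helper (not part of the original script): writes the first
# floor(lines * ratio) lines to PREFIX.ratio and the remaining lines to PREFIX.rest.
split_head_tail() {
    local file=$1 ratio=$2 prefix=$3
    local lines split

    lines=$(wc -l < "$file")
    # awk does the multiplication; int() truncates the result toward zero
    split=$(awk -v n="$lines" -v r="$ratio" 'BEGIN { print int(n * r) }')

    # first $split lines
    head -n "$split" "$file" > "$prefix.ratio"
    # everything from line $split + 1 onwards
    tail -n +"$((split + 1))" "$file" > "$prefix.rest"
}
```

Called as "split_head_tail data.txt 0.8 data", this would write data.ratio and data.rest. The input file is read three times here instead of once, which is usually acceptable for files that are not huge.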
#!/bin/bash
 
# Based on the given ratio, this script randomly distributes the lines of the given input file into two output files. Empty lines are skipped.
# The first output file (with the postfix ".ratio") will contain a number of lines roughly corresponding to the given ratio.
# The second output file (with the postfix ".rest") will contain the remaining lines.
# Both output files will have the given prefix.
 
# Arguments:
# 1: file name
# 2: split ratio, a number between 0 and 1 (exclusive)
# 3: output prefix
 
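# srand() without an argument seeds from the time of day, so two runs within the same second produce the same split.
# rand() returns a value in [0, 1), so "rand() < ratio" is true with probability ratio.
awk -v ratio="$2" -v first="$3.ratio" -v rest="$3.rest" 'BEGIN { srand() } !/^$/ { if (rand() < ratio) print > first; else print > rest }' "$1"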
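When a random split has to be repeatable, for example to recreate the same two files later, the seed can be passed in explicitly. The following is a minimal sketch under that assumption: the function name split_random_seeded and the fourth argument (the seed) are hypothetical additions, not part of the script above. Unlike the script, it also creates both output files up front so that they exist even if no line lands in one of them.

```shell
# Hypothetical helper (not part of the original script): random split with an
# explicit seed, so that the same seed reproduces the same two output files.
split_random_seeded() {
    local file=$1 ratio=$2 prefix=$3 seed=$4

    # Assumption: empty output files are preferable to missing ones.
    : > "$prefix.ratio"
    : > "$prefix.rest"

    awk -v ratio="$ratio" -v seed="$seed" \
        -v first="$prefix.ratio" -v rest="$prefix.rest" '
        BEGIN { srand(seed) }
        # empty lines are skipped, as in the script above
        !/^$/ {
            if (rand() < ratio) {
                print > first
            } else {
                print > rest
            }
        }' "$file"
}
```

Called as "split_random_seeded data.txt 0.8 data 42", this should give the same data.ratio and data.rest on every run with the same awk implementation. Different awk implementations may use different random number generators, so the split is not necessarily identical across machines.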