On Thu, 31 May 2007, Jan Schaumann wrote:
> > If you change this line in split3() from
> >     split1(sb.st_size/chunks, chunks);
> > to
> >     split1((sb.st_size + chunks - 1)/chunks, chunks);
> >
> > then the last file will never be larger than the others, there won't
> > be any "spillover", and you can remove all the new special cases in
> > split1().
>
> If by "all the new special cases", you mean the counting of the files
> created,

Yes...

> then I'm not sure if using your approach does the right thing.
>
> Consider a file of 100 bytes size that you want to split into 11 files:

It's just impossible to do that without violating either (A) the
principle that all files should be the same size except that the last
file may be smaller, or (B) the principle that exactly the requested
number of files should be created and none should be empty.

If you choose 9 bytes per file, you have one leftover byte that doesn't
fit in the last file (but your special-case code deals with that by
making the last file larger than the others, violating principle A
above).

If you choose 10 bytes per file, then the first 10 files use up all the
input data, so either the 11th file won't exist at all, or it will be
empty, violating principle B above.

> My approach says "split bytewise into files with 100/11 = 9 bytes, no
> more than 11 files" (ie 10 files with 9 bytes each, one file is 10
> bytes, total # of files is 11).
>
> Your approach says "split bytewise into files with (100 + 11 - 1)/11 =
> 10 bytes" (ie 10 files with 10 bytes each).

Yes, that's right.  In this case (and in other cases where the input
file size is less than (N-1)*(N-1)+1, where N is the number of files),
my way results in the last file being empty or nonexistent.  Your way
always results in the last file being larger than the others (unless the
input size is an exact multiple of the number of files).

Another pathological case is where the input size is smaller than the
requested number of files.  My way will put 1 byte in each of the first
few files, and will then stop.  Your way will create empty files for all
except the last, and then place all the input data in the last file.

I think that my way is better in all these cases, but of course others
may disagree.

--apb (Alan Barrett)
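
For anyone who wants to check the arithmetic, the two policies compared
in this message can be modelled roughly as follows.  This is only a
sketch, not the code in split.c: the function names floor_split() and
ceil_split() are made up for illustration, and each returns the size
that file number idx (counting from 0) would get under the policy as
described above.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model of the "round down" policy: every file gets
 * size/nfiles bytes and the last file also takes the leftover bytes.
 */
size_t
floor_split(size_t size, size_t nfiles, size_t idx)
{
	size_t chunk;

	assert(nfiles > 0 && idx < nfiles);
	chunk = size / nfiles;
	if (idx == nfiles - 1)
		return size - chunk * (nfiles - 1);	/* chunk + spillover */
	return chunk;
}

/*
 * Hypothetical model of the "round up" policy: every file gets
 * (size + nfiles - 1)/nfiles bytes until the input runs out, so the
 * last file may be short, and trailing files may be empty (0).
 */
size_t
ceil_split(size_t size, size_t nfiles, size_t idx)
{
	size_t chunk, start;

	assert(nfiles > 0 && idx < nfiles);
	chunk = (size + nfiles - 1) / nfiles;
	start = chunk * idx;
	if (start >= size)
		return 0;			/* input already used up */
	return (size - start < chunk) ? size - start : chunk;
}
```

With size 100 and nfiles 11, floor_split() gives 9 bytes for files 0
through 9 and 10 bytes for file 10, while ceil_split() gives 10 bytes
for files 0 through 9 and 0 for file 10.  With size 3 and nfiles 5,
floor_split() gives 0 for the first four files and 3 for the last, and
ceil_split() gives 1 byte to each of the first three files and 0 to the
rest.  In this model, for nfiles 11, a size of 100 is the largest input
for which ceil_split() leaves the last file empty; a smaller size such
as 91 still puts 1 byte in file 10.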