How to split a csv file

If you’re on a PC, the instructions we posted here: http://www.themacroscope.org/?page_id=418 work. It’s a macro, in visual basic, for excel. But after a long back and forth yesterday with @thomasgpadilla we worked out that it would break in various Mac versions of Excel. Why? I do not know. But there’s a pretty simple command line sequence on a Mac that’ll do the trick. We split it into big chunks, then split the big chunks into smaller chunks. So, grab a moderately large CSV (like John Adams’ diary, 1025 entries, 1 per row) and fire up your terminal thusly:

$ split -l 200 johnadams-diary.csv 200out
$ split -l 1 200outaa aa
$ split -l 1 200outab ab
$ split -l 1 200outac ac
$ split -l 1 200outad ad
$ split -l 1 200outae ae
$ split -l 1 200outaf af

and so we get a series of files,







until all 1025 rows are their own file. But they don’t have extensions. So:

find . -type f -exec mv '{}' '{}’.csv \;

will recursively go through the current folder, finding files, and appending .csv to them.

Ta da!

[edit] sometimes you should read the manual. The error message I was getting when I split the original file one line at a time was ‘file too big’. But of course, that’s because of the default names a-z by a-z, so only 676 combinations, which means that yes, 1025 lines *is* too big… but, if you tell split to use a three-letter prefix, you can split up to 17576 lines. The -a flag lets you make this change, like so:

split -a 3 -l 1 johnadams-diary.csv

which is what I should’ve done in the first place. D’oh! Ah well: you don’t have to know everything in digital history. Work it out in public, and somebody is sure to let you know how you could’ve done it better 😉 Happily, I caught this fairly quickly after I made my first post – but how much more elegant if I’d gone for the best solution right away? Well, sometimes, the best solution, is the one that works when you need it to.