Create a CSV from other CSVs

That’s another reason why I would use Python to do all this sort of stuff and not expect to use DEVONthink as my “hammer”.

Curly quotes could simply be treated as ordinary characters, since they’re not part of the CSV specification AFAIK. So „Funny header“ stays just as it is, quotes and all. The only problem would be an embedded comma: „Funny, header“ would be split into „Funny and header“, effectively creating two columns.
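To make that concrete, here is a minimal sketch using Python's standard `csv` module with its default settings (an assumption on my part; the thread itself uses pandas further down). The sample strings are made up for illustration. The typographic quotes pass through as plain data, and a comma between them still splits the field.

```python
import csv
import io

# Two hypothetical rows: one without and one with a comma inside the curly quotes.
sample = "„Funny header“,second\n„Funny, header“,second\n"

# Default dialect: only the ASCII double quote (") counts as a quote character.
rows = list(csv.reader(io.StringIO(sample)))

for row in rows:
    print(len(row), row)
```

The first row comes back as two fields with the curly quotes intact; the second comes back as three, because the comma is not protected by anything the parser recognizes as a quote.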

I now see what you mean: the quotes on the header rows of my dummy files are a different type than in the rest of the file. In my world that’s a detail the computer overcomes on import without even bothering to tell me. :wink:

Would a Python compiler accept «this» as a string? The programming languages I know are quite narrow-minded about acceptable quotes. As is CSV.

We are getting a little off-topic here, but it’s interesting nonetheless.

It’s not the “compiler” that is relevant here. It’s the function I’m using.

I’m not quite sure what you are asking. If you are asking whether « or » can be a delimiter, the answer is “yes”. But as far as I can tell, one can’t use a different start and end character for one text element. You can have different delimiters spread through the file, but the opening and closing character for an element has to be the same.

For my example, I used the pandas.read_csv function from the pandas library, which is fully documented in the pandas 1.4.4 documentation for pandas.read_csv. As you can see, there are lots of knobs to tweak to support your data munging.

See the sep= parameter, which can also accept a regex, as in

import pandas as pd
df = pd.read_csv(fn, sep="«|»|:", engine="python")
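As a rough illustration of what a regex separator means, here is a standard-library sketch that splits each line wherever the pattern matches. It is only meant to show the idea and is not a description of how pandas implements it; the helper `split_line` and the sample lines are hypothetical.

```python
import re

# Any one of these three characters ends a field.
SEP = re.compile("«|»|:")

def split_line(line):
    """Split one line of text on the regex separator (hypothetical helper)."""
    return SEP.split(line.rstrip("\n"))

header = split_line("name«city:age\n")
row = split_line("Ada»London:36\n")

print(header)
print(row)
```

Note that every alternative in the pattern should match at least one character; an empty alternative would match between every pair of characters.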

Well, I guess that is a bit outside the CSV specifications (if there even is a specification). At least according to what I found on Wikipedia, the only permissible (or widely used) quote characters are the simple double quotes from ASCII times (which makes sense, given the age of CSV). But then there were other attempts at standardizing, and I didn’t bother to read them all.

Yep. That’s more of a “read whatever calls itself CSV that I throw at you” function, a bit like what the import GUIs in Excel and Libre/StarOffice offer. And probably a lot more than a “usual” CSV reader would provide for, I guess.

Which means that we’re going way, way off from the usual CSV stuff. Not to say that it isn’t useful, of course. Just that it reaches out far beyond CSV.


In real life, most people and computers creating CSV files have no clue about, or loyalty to, the CSV spec. Hence pandas gives us tremendous power to deal with that complexity so we can get on to the hard work of analysis.

I see now that you were talking about the quote character, not the delimiter. The pandas.read_csv() function has options for that too, but the opening and closing character probably have to be the same.
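The "same character for open and close" restriction can at least be checked with Python's standard `csv` module, which also takes a `quotechar` option. This is a sketch under the assumption that pandas behaves similarly (its documentation describes `quotechar` as a string of length 1); the sample data is made up.

```python
import csv
import io

# The same character opens and closes the quoted field here.
data = "«Funny, header«,second\n"
rows = list(csv.reader(io.StringIO(data), quotechar="«"))
print(rows)

# A two-character quotechar (distinct open and close) is rejected outright.
try:
    csv.reader(io.StringIO(data), quotechar="«»")
    pair_accepted = True
except TypeError:
    pair_accepted = False
print("open/close pair accepted:", pair_accepted)
```

So a non-ASCII quote character works as long as it is used on both sides of the field, while a distinct opening and closing pair cannot even be configured.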

Exactly. Which would exclude the fancy stuff like “French” and “German” quotes («», „“, etc.)
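One possible workaround, which nobody in this thread has proposed and which is offered purely as an assumption, is to normalize the typographic quotes to ASCII double quotes before parsing. The names `QUOTE_MAP` and `read_with_normalized_quotes` are hypothetical, and the sketch uses the standard `csv` module rather than pandas.

```python
import csv
import io

# Map typographic opening and closing quotes to the ASCII double quote.
QUOTE_MAP = str.maketrans({"«": '"', "»": '"', "„": '"', "“": '"', "”": '"'})

def read_with_normalized_quotes(text):
    """Replace typographic quotes, then parse as ordinary CSV (hypothetical helper)."""
    normalized = text.translate(QUOTE_MAP)
    return list(csv.reader(io.StringIO(normalized)))

rows = read_with_normalized_quotes("«Funny, header»,second\n„a, b“,c\n")
print(rows)
```

The obvious caveat is that this also rewrites any of those characters that appear inside the data itself, so it only suits files where they are used purely as field quotes.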

Give it a try…