Convert CSV to utf-8?
Question:
We’ve purchased a few of your products in the past and they work great. Here’s a question on another one: I’m creating a .csv file within a VFP program and I need it to be in UTF-8 format. I’m not finding anywhere how to do that programmatically. Do you have a product that will convert?
Answer:
For those who may not be familiar with the term, a “character encoding” defines the byte representation of characters. For example:
Consider the character: É
- In the iso-8859-1 character encoding, it is represented by a single byte: 0xC9
- In the utf-8 character encoding, it is represented by two bytes: 0xC3 0x89
- In the ucs-2 character encoding, it is represented by two bytes: 0x00 0xC9 (big-endian byte order)
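The byte values listed above can be checked with a few lines of Python, shown here only as an illustration and using nothing from Chilkat. One assumption to note: Python has no codec named "ucs-2", so this sketch uses "utf-16-be", which yields the same bytes as big-endian ucs-2 for characters in the Basic Multilingual Plane such as É. The helper name hex_bytes is made up for this example.

```python
# Illustration only: show how the same character maps to different bytes
# under different character encodings.

def hex_bytes(data: bytes) -> str:
    """Format bytes as space-separated hex values, e.g. '0xC3 0x89'."""
    return " ".join(f"0x{b:02X}" for b in data)

ch = "\u00C9"  # the character É

latin1 = ch.encode("iso-8859-1")
utf8 = ch.encode("utf-8")
# Assumption: utf-16-be stands in for big-endian ucs-2 (identical for É).
ucs2_be = ch.encode("utf-16-be")

print("iso-8859-1:", hex_bytes(latin1))
print("utf-8:     ", hex_bytes(utf8))
print("ucs-2 (BE):", hex_bytes(ucs2_be))
```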
To save text (such as the CSV text) to a file using any desired character encoding, load the string into a Chilkat StringBuilder object, then call StringBuilder.WriteFile.
The arguments to WriteFile are:
- localFilePath — the local path of the file to create.
- charset — such as “utf-8”.
- emitBom — true/false (or 1/0) to indicate whether a Byte Order Mark (BOM) is desired. BOMs (also known as preambles) exist for utf-8, utf-16, and utf-32, including the big-endian variants of utf-16 and utf-32. If you don’t know whether you want one, choose false. A BOM is a sequence of 2 to 4 bytes at the start of a file that indicates its charset. For example, if the first three bytes of a text file are 0xEF, 0xBB, 0xBF, then you have a utf-8 file.
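To make the effect of the three arguments concrete, here is a rough sketch in plain Python. It is not Chilkat's implementation and does not call the Chilkat API. The function write_text_file, the BOMS table, and the file names are hypothetical and exist only for this example. The sketch encodes the text with the requested charset, optionally prepends the matching BOM, and writes the bytes to the given path.

```python
import codecs
import os
import tempfile

# Hypothetical lookup table: BOM bytes for the charsets that have one.
BOMS = {
    "utf-8": codecs.BOM_UTF8,          # 0xEF 0xBB 0xBF
    "utf-16-le": codecs.BOM_UTF16_LE,  # 0xFF 0xFE
    "utf-16-be": codecs.BOM_UTF16_BE,  # 0xFE 0xFF
    "utf-32-le": codecs.BOM_UTF32_LE,  # 0xFF 0xFE 0x00 0x00
    "utf-32-be": codecs.BOM_UTF32_BE,  # 0x00 0x00 0xFE 0xFF
}

def write_text_file(local_file_path, text, charset, emit_bom):
    """Hypothetical helper mirroring the three arguments described above."""
    data = text.encode(charset)
    if emit_bom:
        # Raises KeyError for a charset that has no BOM in the table.
        data = BOMS[charset.lower()] + data
    with open(local_file_path, "wb") as f:
        f.write(data)

# Sample CSV text containing a non-ASCII character.
csv_text = "id,name\r\n1,\u00C9cole\r\n"

out_dir = tempfile.mkdtemp()
with_bom = os.path.join(out_dir, "with_bom.csv")
without_bom = os.path.join(out_dir, "without_bom.csv")

write_text_file(with_bom, csv_text, "utf-8", True)
write_text_file(without_bom, csv_text, "utf-8", False)

# Check whether the first three bytes of the first file are the utf-8 BOM.
with open(with_bom, "rb") as f:
    first_three = f.read(3)
print(first_three == codecs.BOM_UTF8)
```

The two output files differ only in the three leading BOM bytes; the encoded CSV text that follows is identical.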
Note: In some programming languages, such as VB6, VB.NET, C#, FoxPro, and many others, a “string” is an object. You never deal with the actual byte representation of individual characters. Therefore, if you have a string (such as in Visual FoxPro or C#), you can pass it to StringBuilder.WriteFile without worry. Chilkat will write the characters using the byte representation (i.e. character encoding) desired for the file.
Also, everywhere in the Chilkat API where a charset can be specified, any of the charsets listed at https://cknotes.com/chilkat-charsets-character-encodings-supported/ may be used.