Understanding EncryptStringENC and DecryptStringENC in Python and C/C++
Chilkat provides APIs that are identical across a variety of programming languages. One difficulty in doing this is handling strings, because different programming languages pass strings in different ways. In some programming languages, such as Python or C/C++, a “string” is simply a sequence of bytes terminated by a null. (I’m referring to “multibyte” strings, not Unicode (utf-16) strings. The term “multibyte” means any charset in which each letter or symbol is represented by one or more bytes without using nulls.) A Python or C/C++ application must indicate how the bytes are to be interpreted. There are two choices: ANSI or utf-8. Each Chilkat class has a “Utf8” property that controls whether the bytes are interpreted as ANSI or utf-8.
Note: The Utf8 property only exists in programming languages where strings are passed as a sequence of bytes. For example, in .NET strings are objects and are always passed and returned as objects. If the ActiveX is used, strings are always passed as utf-16. In Python or C/C++, however, strings are simply sequences of bytes, and some additional mechanism must be used to indicate how the bytes are to be interpreted.
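To see why the interpretation has to be stated explicitly, here is a small sketch that uses only the Python standard library and does not call Chilkat. It assumes that "ANSI" means Windows-1252 (cp1252), which is true on many Western-European Windows systems but not everywhere. The variable names are illustrative.

```python
# The same character has different byte representations in utf-8 and ANSI,
# so a receiver must be told which one it is getting.
# Assumption: "ANSI" is Windows-1252 (cp1252); other systems may differ.
text = "\u00e9"  # the letter e with an acute accent

utf8_bytes = text.encode("utf-8")    # two bytes: 0xC3 0xA9
ansi_bytes = text.encode("cp1252")   # one byte:  0xE9

# Reading utf-8 bytes as if they were ANSI silently yields the wrong text.
misread = utf8_bytes.decode("cp1252")

print(utf8_bytes.hex(), ansi_bytes.hex(), ascii(misread))
```

The misread value is two characters long instead of one, which is the kind of corruption a wrong Utf8 setting would be expected to produce.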
To encrypt a string, we must specify the exact byte representation of the string that is to be encrypted. This is achieved via the Charset property. For example, it may be the ANSI byte representation that is to be encrypted, or the utf-16 byte representation, or utf-8, or anything else. The mechanism that specifies the byte representation to be encrypted must be entirely separate from the mechanism used to unambiguously pass the string to the Chilkat method. These are two separate things. Therefore, string encryption/decryption happens in these steps:
Encrypting a String (EncryptStringENC)
1) Unambiguously pass the string to the EncryptStringENC method.
2) (Internal to the Chilkat method) Convert the string to the byte representation specified by the Charset property.
3) Encrypt the bytes.
4) Encode the binary encrypted bytes according to the EncodingMode property (which can be base64, hex, etc.) and return this string.
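The encryption steps above can be sketched in plain Python. This is not Chilkat code: the function names, the parameters, and the XOR "cipher" are all hypothetical stand-ins chosen so the example runs with the standard library alone. The XOR step is a placeholder for a real algorithm and provides no security.

```python
import base64


def toy_cipher(data: bytes, key: bytes) -> bytes:
    """Placeholder for a real cipher: XOR with a repeating key. NOT secure."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def encrypt_string_enc(text: str, charset: str, encoding_mode: str, key: bytes) -> str:
    """Hypothetical model of steps 2 to 4; 'text' is the already-unambiguous string."""
    # Step 2: convert the string to the byte representation named by charset.
    plain = text.encode(charset)
    # Step 3: encrypt the bytes (toy XOR stands in for the real algorithm).
    cipher = toy_cipher(plain, key)
    # Step 4: encode the binary result as a us-ascii string.
    if encoding_mode == "base64":
        return base64.b64encode(cipher).decode("ascii")
    if encoding_mode == "hex":
        return cipher.hex()
    raise ValueError("unsupported encoding mode: " + encoding_mode)


key = b"k"
text = "\u00e9"  # the letter e with an acute accent

# The same string encrypts differently depending on the chosen charset,
# because different bytes are fed to the cipher.
as_utf8 = encrypt_string_enc(text, "utf-8", "hex", key)
as_ansi = encrypt_string_enc(text, "cp1252", "hex", key)
as_utf16 = encrypt_string_enc(text, "utf-16-le", "hex", key)
print(as_utf8, as_ansi, as_utf16)
```

The three results differ in both length and content, which is why the charset used when encrypting has to be known when decrypting.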
Decrypting a String (DecryptStringENC)
1) Pass the encoded string to the DecryptStringENC method. Note that all possible encodings (base64, hex, etc.) use only us-ascii chars. In all multibyte charsets, it is only the non-us-ascii chars that are represented differently; us-ascii chars are always represented by a single byte that is less than 0x80. Therefore, it does not matter whether the Utf8 property is true or false here, because us-ascii chars have the same byte representation in both utf-8 and ANSI.
2) (Internal to the Chilkat method) Decode the base64/hex/etc. to get the binary encrypted bytes.
3) Decrypt to get the string in the byte representation that was indicated by the Charset property when encrypting. (The Charset property must be set to this same value when decrypting.)
4) Unambiguously return the string. For languages such as Python or C/C++, this means examining the Utf8 property setting and performing whatever conversion is necessary (if any) to convert from the charset indicated by the Charset property to the ANSI or utf-8 encoding in which the string is returned. (For languages such as C#, Chilkat will convert as appropriate to return a string object to the .NET language.)
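The decryption steps can be modeled the same way. Again, this is a standard-library sketch and not Chilkat's implementation: the function names, the `utf8` flag, and the XOR placeholder cipher are hypothetical, and "ANSI" is assumed to be Windows-1252.

```python
import base64


def toy_cipher(data: bytes, key: bytes) -> bytes:
    """Placeholder for a real cipher: XOR with a repeating key. NOT secure."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def decrypt_string_enc(encoded: str, charset: str, key: bytes, utf8: bool) -> bytes:
    """Hypothetical model of steps 2 to 4, returning bytes as a C-style caller would see them."""
    # Step 2: decode the base64 text to get the binary encrypted bytes.
    cipher = base64.b64decode(encoded)
    # Step 3: decrypt, then interpret the result using the charset that was
    # in effect when the string was encrypted.
    text = toy_cipher(cipher, key).decode(charset)
    # Step 4: return the string as utf-8 or ANSI bytes, depending on the flag.
    # Assumption: ANSI is Windows-1252 (cp1252).
    return text.encode("utf-8" if utf8 else "cp1252")


key = b"k"
original = "\u00e9"  # the letter e with an acute accent

# Build an encoded ciphertext whose plaintext bytes are utf-16 (little endian).
encoded = base64.b64encode(toy_cipher(original.encode("utf-16-le"), key)).decode("ascii")

# Step 1 note: the encoded text is pure us-ascii, so its bytes are the same
# whether the caller passes it as utf-8 or as ANSI.
same_bytes = encoded.encode("utf-8") == encoded.encode("cp1252")

returned_utf8 = decrypt_string_enc(encoded, "utf-16-le", key, utf8=True)
returned_ansi = decrypt_string_enc(encoded, "utf-16-le", key, utf8=False)
print(same_bytes, returned_utf8.hex(), returned_ansi.hex())
```

Both calls recover the same character. Only the bytes handed back to the caller differ, which mirrors the separation between the Charset property (what was encrypted) and the Utf8 property (how strings cross the API boundary).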