Downloading Super-Huge Files via FTP, SFTP, HTTP, etc.

More and more these days, Chilkat customers need to download gigantic files (many gigabytes) using protocols such as SFTP, FTP, or HTTP. These downloads can take a long time, even at a very good transfer rate, and the longer the download runs, the higher the probability that something goes wrong: network congestion, connectivity issues, server problems, etc.

Chilkat provides resume functionality for downloads in all of these protocols (HTTP, SFTP, and FTP). This allows a failed download to restart at the point where it failed: the byte count of the local file is examined, and the download of the remote file resumes at that byte offset.
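Chilkat's classes handle this internally, but the underlying idea is easy to see in the HTTP case. The following is a minimal sketch using only Python's standard library (not Chilkat's API), assuming the server honors HTTP Range requests: the size of the partial local file becomes the starting offset of the next request, and the new bytes are appended.

```python
import os
import urllib.request

def resume_headers(local_path):
    """Build the HTTP Range header needed to resume a partial download.

    Returns an empty dict when no partial file exists, so the request
    fetches the file from the beginning.
    """
    if not os.path.exists(local_path):
        return {}
    offset = os.path.getsize(local_path)
    return {"Range": "bytes=%d-" % offset}

def download_with_resume(url, local_path, chunk_size=64 * 1024):
    """Append remote bytes starting at the local file's current size."""
    headers = resume_headers(local_path)
    req = urllib.request.Request(url, headers=headers)
    # Open in append mode so already-downloaded bytes are kept.
    with urllib.request.urlopen(req) as resp, open(local_path, "ab") as f:
        while True:
            chunk = resp.read(chunk_size)
            if not chunk:
                break
            f.write(chunk)
```

A production version would also check that the server actually replied with a 206 Partial Content status before appending; a server that ignores the Range header replies 200 and re-sends the whole file.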

It is also important to plan how, after the download, you will verify that the local file contains exactly the same bytes as the remote file. Many cloud storage APIs (S3, Glacier, and others) incorporate checksums (or hash values, such as SHA256). The hash of the large file on the server is made available; after the download, the local file can be hashed and the two hash values compared to ensure your local file is identical to the remote file.
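For files this large, the local hash should be computed in chunks rather than by reading the whole file into memory. A sketch using Python's standard hashlib module:

```python
import hashlib

def sha256_of_file(path, chunk_size=1024 * 1024):
    """Compute the SHA256 hex digest of a file, reading it in 1 MB
    chunks so even multi-gigabyte files never need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            h.update(chunk)
    return h.hexdigest()
```

The resulting hex digest can then be compared against the value published by the server or cloud storage API.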

If you are hosting large files on your own FTP, SSH/SFTP, or HTTP server, I would recommend including a hash value in your system design/architecture. For example, let’s say you store a huge file on your FTP server. You might also store a small companion file alongside it containing the huge file’s SHA256 hash. Whatever system or application originally uploads the huge file would also compute its SHA256 hash and upload the companion file. An application that downloads the huge file would also download the companion file, and then verify that the SHA256 hash of the local file matches the downloaded SHA256 hash. This is just one possible implementation idea. The point is: You want to be sure that what was downloaded, especially if the download completed only after several restarts, is an exact duplicate of the remote file.
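The companion-file idea above can be sketched in a few lines. This is one illustrative convention, not a standard: the sidecar is assumed to be named by appending ".sha256" to the data file's name and to contain the hex digest as text.

```python
import hashlib

def _sha256(path):
    """Chunked SHA256 hex digest of a file."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            h.update(chunk)
    return h.hexdigest()

def write_sidecar_hash(path):
    """Uploader side: store the file's SHA256 in a '.sha256' sidecar
    file next to it, and return the sidecar's path."""
    sidecar = path + ".sha256"
    with open(sidecar, "w") as f:
        f.write(_sha256(path))
    return sidecar

def verify_against_sidecar(path):
    """Downloader side: True if the file's current SHA256 matches the
    value stored in its sidecar file."""
    with open(path + ".sha256") as f:
        expected = f.read().strip()
    return _sha256(path) == expected
```

The uploader would transfer both files; the downloader fetches both and calls the verify step after the (possibly restarted) download completes.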