Automating a Website Login using Chilkat HTTP

Question:

I am trying to write a program that will sign on to a web page. It is a sign-on page where I have to enter the user ID and then the password. If it is valid, it then goes to the web page.

Do you have any examples like that which I can use?

Answer:

First, it’s important to understand the difference between HTTP authentication and website login schemes. Chilkat implements various HTTP authentication methods: Basic, Digest, NTLM, and Negotiate. (See http://www.ietf.org/rfc/rfc2617.txt for information about the Basic and Digest authentication methods. See http://en.wikipedia.org/wiki/NTLM for information about NTLM.) HTTP authentication methods work by adding header fields to an HTTP request that carry the authentication credentials. Some methods, such as Digest, NTLM, and Negotiate, involve a short back-and-forth handshake that establishes the authorization parameters in such a way that the actual password is never sent over the connection. Chilkat HTTP provides easy-to-use implementations of these authentication methods. It’s just a matter of setting the Http.Login and Http.Password properties and then letting the component handle the rest. (The back-and-forth handshake for Digest, NTLM, and Negotiate is handled transparently behind the scenes.)
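To make "adding HTTP header fields that provide authentication credentials" concrete, here is a minimal sketch of how the header for Basic authentication is formed. It uses plain Python rather than the Chilkat API, and the function name basic_auth_header is only an illustration. The sample credentials are the ones used in the example in RFC 2617.

```python
import base64


def basic_auth_header(login: str, password: str) -> str:
    """Return the value of the Authorization header for HTTP Basic auth.

    Illustrative helper (not part of Chilkat): Basic authentication joins
    the login and password with a colon and Base64-encodes the result.
    """
    credentials = f"{login}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


# Sample credentials taken from the example in RFC 2617.
print("Authorization:", basic_auth_header("Aladdin", "open sesame"))
# prints: Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
```

Base64 is an encoding, not encryption, so Basic sends the password in an easily reversible form. That is why the handshake-based methods exist.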

A website “login”, however, is something completely different. A typical scenario is a page that provides an HTML form with fields for entering a login and password. The form data is posted to the server and, if the credentials are valid, the HTTP response contains one or more Set-Cookie header fields carrying authorization data. These cookies are then sent with subsequent HTTP requests to identify the authenticated session.
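The typical scenario described above can be sketched in plain Python without any network access. The form field names (username, password), the cookie names, and both helper functions are hypothetical. A real site chooses its own names, and this is not the Chilkat API.

```python
from http.cookies import SimpleCookie
from urllib.parse import urlencode


def build_login_body(login: str, password: str) -> str:
    """URL-encode the form fields the way a browser would for a form POST.

    The field names "username" and "password" are assumptions; use the
    names that actually appear in the site's login form.
    """
    return urlencode({"username": login, "password": password})


def cookie_header_from_response(set_cookie_values: list[str]) -> str:
    """Turn the Set-Cookie values of a login response into a Cookie header.

    Only the name=value pairs are sent back; attributes such as Path and
    HttpOnly are instructions to the client and are not echoed.
    """
    jar = SimpleCookie()
    for value in set_cookie_values:
        jar.load(value)
    return "; ".join(f"{name}={morsel.value}" for name, morsel in jar.items())


# Body of the login POST (hypothetical credentials).
print(build_login_body("alice", "p@ss word"))
# prints: username=alice&password=p%40ss+word

# Set-Cookie values as they might appear in a successful login response.
response_cookies = [
    "sessionid=abc123; Path=/; HttpOnly",
    "csrftoken=xyz789; Path=/",
]
print("Cookie:", cookie_header_from_response(response_cookies))
# prints: Cookie: sessionid=abc123; csrftoken=xyz789
```

An HTTP client that keeps a cookie store performs the second step automatically. The sketch only shows what travels over the wire.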

A login process using cookies may be implemented in any way: it’s entirely up to the web programmer. For example, the submit button for the HTML form might run client-side JavaScript that reads the login/password form fields, performs some computations, sets some cookies dynamically, and posts the HTTP request to the server. In a case such as this, it would be impossible to automate the login using Chilkat HTTP. The reason is that a browser loads the HTML into a DOM (Document Object Model) and has a JavaScript engine to run the JavaScript code, which may reference the DOM. Chilkat HTTP builds no such DOM, and it has no JavaScript engine.

In summary, to automate a forms-based website login, you must fully understand what happens during the login process: 1) What happens when the submit button is pressed? 2) Does any client-side JavaScript run? 3) What is submitted to the server? 4) What cookies are received back? 5) Does the response contain JavaScript that runs, and if so, what does it do? And so on.
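As a starting point for questions 2 and 3, the login page's HTML can be inspected programmatically. The following is a rough sketch in plain Python. The FormInspector class and the sample HTML are hypothetical, and it handles only a single simple form. It records where the form posts, which named fields it submits (including hidden ones), and whether the page contains any script tags that call for closer investigation.

```python
from html.parser import HTMLParser


class FormInspector(HTMLParser):
    """Collect the action, method, and named input fields of a login form."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.method = None
        self.fields = {}         # field name -> default value
        self.has_script = False  # True if the page contains a <script> tag

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self.action = attrs.get("action")
            # Forms default to GET when no method is given.
            self.method = (attrs.get("method") or "get").lower()
        elif tag == "input" and attrs.get("name"):
            # Inputs without a name are not submitted with the form.
            self.fields[attrs["name"]] = attrs.get("value") or ""
        elif tag == "script":
            self.has_script = True


# Hypothetical login page used only for illustration.
SAMPLE_HTML = """
<form action="/login" method="POST">
  <input type="hidden" name="token" value="t0k3n">
  <input type="text" name="user">
  <input type="password" name="pass">
  <input type="submit" value="Sign in">
</form>
"""

inspector = FormInspector()
inspector.feed(SAMPLE_HTML)
print(inspector.method, inspector.action)  # prints: post /login
print(inspector.fields)
print("JavaScript present:", inspector.has_script)
```

Static inspection like this cannot show what JavaScript does at submit time. For that, watching the actual requests in the browser's developer tools (or a similar traffic inspector) remains necessary.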

It is not feasible for Chilkat support to analyze the millions of websites in existence, each of which can implement its forms-based login in an arbitrary, custom way (because there is no standard). It is the application programmer’s task to investigate and understand the login process.