
Facilitating the Analysis of Data Sent Via Web Forms

After writing three consecutive entries here about Greasemonkey’s past and present security issues, I wanted to write about a different topic. I would have been fine with making Greasemonkey the topic once again, provided that this time I emphasized the reasons to use it rather than the reasons one may want to avoid it. Still, the percentage of posts about Greasemonkey on this blog is currently greater than I would like it to be, and I aspire to make this blog about a somewhat more diverse group of topics. So I decided to make this entry about webbots and automated form submission, and to take a break from posts about Greasemonkey. Or so I thought.

I have recently been reading a book titled “Webbots, Spiders, and Screen Scrapers: A Guide to Developing Internet Agents with PHP/CURL” by Michael Schrenk. As I read through it, I came up with a few ideas for PHP/CURL projects. None of those ideas currently involve automated form submission, but I find the topic interesting, so I read through the chapter on it. The examples of how to use the code available through the book’s website were quite interesting. What I found more interesting and more important, however, were the explanations of what needs to be considered when writing code that automates form submission.

That chapter stresses that the data sent by form-submitting software must be what the receiving server expects. One can look through the page source to discover what data is sent via a form. However, this approach is inefficient: it is time-consuming, and it may not reveal all of the data that gets sent to the web server through the form. Fortunately, the website for this book has a page that displays any form data sent to it through the HTTP GET or POST methods. Because this page shows the variable names and values it receives, among other useful data, it gives developers a very efficient way of getting this information.
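To make concrete what "variable names and values" means here, the following is a minimal sketch of how a URL-encoded form payload breaks down into name/value pairs. It is not the code behind the book's page; the function name `decodeFormData` and the sample field names are invented for illustration, and it assumes the form uses the default `application/x-www-form-urlencoded` encoding (multipart uploads are not handled).

```javascript
// Decode an application/x-www-form-urlencoded payload (a GET query string
// or a POST body) into a list of [name, value] pairs, in the order sent.
function decodeFormData(payload) {
  var params = new URLSearchParams(payload);
  var pairs = [];
  params.forEach(function (value, name) {
    pairs.push([name, value]);
  });
  return pairs;
}

// Hypothetical payload; the field names are invented for illustration.
// "session" stands in for a hidden field that is easy to miss in page source.
var pairs = decodeFormData("user=alice&comment=hello+world&session=abc123");
pairs.forEach(function (pair) {
  console.log(pair[0] + " = " + pair[1]);
});
```

A webbot that submits the same form would need to send every one of these pairs, including any that come from hidden fields.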

Although the data displayed on this page is useful for developers of form submission webbots, the suggested method of using the page is certainly not convenient. To use it, the action attribute of each <form> tag to be analyzed needs to be replaced with the URL of the page that displays the form data. Schrenk therefore suggests saving a copy of the web page in question to the local hard drive and then editing its source code to change these attributes. After that is done, one needs to open the saved and modified page in a web browser. Only after all of these steps are completed can one enter data in the form and send it to the page that displays it.

There must be a better way, a more efficient way, an automated way, of updating attributes in the page than the process previously outlined. And doesn't automating attribute changes sound like a job for a Greasemonkey script?

As anyone with knowledge of what Greasemonkey can do knows, a Greasemonkey script consisting of only a few lines of JavaScript code can change the attributes of each form in a page automatically. So I decided to take the few minutes it would take to write a Greasemonkey script that makes these changes. I have it set to work with all web pages, and so I keep it disabled most of the time, as I usually intend to have form data go to its intended destination. One could also simply reconfigure the script so that it only works with certain pages, as one would likely only be interested in the form data sent from a few different pages. This script is unusual in that it almost certainly requires some configuration on the part of the user. It is also unusual in that it may be disabled most of the time. However, it performs its intended task where and when it needs to be performed.
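For readers curious about what such a script involves, here is a minimal sketch under stated assumptions; it is not necessarily the script offered in this post. The analyzer URL is a hypothetical placeholder to be replaced with the address of the page that displays form data, and the function name `redirectForms` is invented for this example.

```javascript
// ==UserScript==
// @name        Form action redirector (sketch)
// @namespace   http://example.com/hypothetical
// @include     *
// @grant       none
// ==/UserScript==

// Hypothetical placeholder: replace with the URL of the page that
// displays the form data sent to it.
var ANALYZER_URL = "http://example.com/form_analyzer.php";

// Point every form in the given document at the analyzer page
// and return the number of forms that were changed.
function redirectForms(doc, analyzerUrl) {
  var forms = doc.getElementsByTagName("form");
  for (var i = 0; i < forms.length; i++) {
    forms[i].setAttribute("action", analyzerUrl);
  }
  return forms.length;
}

// Run automatically only when a browser document is available.
if (typeof document !== "undefined") {
  redirectForms(document, ANALYZER_URL);
}
```

Narrowing the `@include` line from `*` to specific addresses is one way to limit the script to the few pages whose forms are of interest, so that it does not need to be disabled the rest of the time.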

If you have Greasemonkey installed, you can install this script by clicking here. One might measure the success and usefulness of a Greasemonkey script by the number of times it is installed from a site such as Userscripts.org. By that measure, this script may not be considered very successful or useful at all. However, it is there for those who would like to use it, and some might be interested in it. After focusing on malicious scripts, I thought it would be appropriate to focus on a script that is beneficial and cannot be malicious, even if it is not the most useful script. And because I want to emphasize Greasemonkey’s benefits and usefulness in this post, I’ll end it with an amusing and entertaining video that explains how Greasemonkey can be useful.