Creating an Intelligent Agent with ColdFusion
The dot-com gold rush is over. To survive in this economy, businesses are less willing to depend on advertising as their main revenue stream. As they turn away from advertising, they look increasingly to partnerships to provide them with stability and flow. One way to build such partnerships is content syndication. You may have heard of television shows that are "syndicated." TV networks resell the right to broadcast certain popular shows, which is why you might see the same show on different channels in the same week. I have not watched television in several years, so I hope that I am not abusing this concept. The idea is the same on the Web. Content syndication is the process of serializing the content of your site and making it available to others in a form that they can retrieve automatically. Businesses can then enter into meaningful partnerships that automate the process of sharing their content.
In this article, you will learn how to leverage ColdFusion to create intelligent agents. We will make an application that goes out to an external Web site, retrieves the headlines in WDDX format, and processes the result for display on your Web site.
WDDX stands for Web Distributed Data eXchange. It is an application of XML that was created by Allaire in 1997 as an open, nonproprietary project. Using WDDX, you can serialize and deserialize data easily for exchange with applications written in other languages, such as Perl or ASP. You can learn more about WDDX at http://www.openwddx.org.
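To make serialization and deserialization concrete, here is a minimal sketch of a round trip using the <CFWDDX> tag. The structure keys, the sample values, and the variable names (headline, packet, restored) are assumptions chosen for illustration; they are not part of any particular syndication feed.

```cfml
<!--- Build a small structure to serialize. Keys and values are illustrative only. --->
<cfset headline = StructNew()>
<cfset headline.title = "Sample headline">
<cfset headline.link = "http://www.example.com/story.cfm">

<!--- Serialize the CFML structure into a WDDX packet (an XML string). --->
<cfwddx action="cfml2wddx" input="#headline#" output="packet">

<!--- Deserialize the packet back into a CFML structure. --->
<cfwddx action="wddx2cfml" input="#packet#" output="restored">

<cfoutput>
  <!--- HTMLEditFormat escapes the XML so the browser shows the raw packet. --->
  <p>Packet: #HTMLEditFormat(packet)#</p>
  <p>Restored title: #restored.title#</p>
</cfoutput>
```

The same packet could be handed to a WDDX library in another language, which is what makes the format useful for syndication between sites that do not share a platform.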
CFHTTP
The <CFHTTP> tag allows you to get data from and post data to any Web site, which makes it one of the more powerful tags in ColdFusion. Using the POST method, you can send information to an external Web site; using the GET method, you can retrieve the entire contents of a Web page into a local variable or a file.
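As a rough sketch of both methods, the following shows one GET request and one POST request. The URLs, the form field name keyword, and its value are placeholders, not real endpoints. The cfhttp.StatusCode variable is not discussed in this article; it is included here on the assumption that you want to check the response status before using the content.

```cfml
<!--- GET: retrieve a page into the cfhttp.FileContent variable. The URL is a placeholder. --->
<cfhttp url="http://www.example.com/index.html" method="GET" resolveurl="yes">

<cfoutput>
  <p>Status: #cfhttp.StatusCode#</p>
  #cfhttp.FileContent#
</cfoutput>

<!--- POST: send one form field to a (hypothetical) search page. --->
<cfhttp url="http://www.example.com/search.cfm" method="POST">
  <cfhttpparam type="FormField" name="keyword" value="coldfusion">
</cfhttp>

<cfoutput>
  <!--- The response to the POST replaces the earlier cfhttp.FileContent value. --->
  #cfhttp.FileContent#
</cfoutput>
```

Because each call overwrites the cfhttp variables, copy cfhttp.FileContent into your own variable if you need the result of an earlier request later in the page.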
Table 1 shows the attributes for using the <CFHTTP> tag.
Table 1 <CFHTTP> Tag Attributes
Attribute | Description |
---|---|
URL | Required. Full URL of the hostname or IP address of the server that hosts the file. The URL must be absolute and must include the protocol and hostname. You may include a port number, if applicable; a port specified in the URL overrides any value in the Port attribute. |
Port | Optional. The port number on the server from which the object is requested. Default is 80. |
Method | Required. GET or POST. Use GET to download a text or binary file or to create a query from the contents of a text file. Use POST to send information to a server for processing. POST requires use of the CFHTTPPARAM tag. |
Username | Optional. A valid username when the server requires it. |
Password | Optional. A valid password when the server requires it. |
Name | Optional. The name to use for a query if you create a query from a file. |
Columns | Optional. Specifies the column names for a query when creating a query as a result of a CFHTTP GET. By default, the first row of a text file is interpreted as column headings. If the text already contains column headings, do not specify this attribute unless you want to overwrite them. If there are no column headings in the text file, or if you want to overwrite them, specify the Columns attribute. ColdFusion never treats the first row of a file as data. |
Path | Optional. Path to the directory in which a file is to be stored. If a path is not specified in a POST or GET operation, a variable is created that you can use to display the results of the POST operation using CFOUTPUT. That variable is cfhttp.FileContent. |
File | Required for Method=POST if a path is specified. The filename to be used for the file that is accessed. For GET operations, this defaults to the name specified in the URL attribute. |
Delimiter | Required for creating a query. The default is a comma. The other option is TAB. |
TextQualifier | Required for creating a query. Indicates the start and end of a column. Should be escaped when embedded in a column. The default is a double quotation mark ("). |
ResolveURL | Optional. YES or NO. For GET and POST operations, if YES, the page reference returned into the FileContent variable has its internal URLs fully resolved so that links remain intact. This includes images, form actions, hrefs, frame sources, and other objects that can contain links. |
ProxyServer | Optional. Name or IP address of a proxy server. |
ProxyPort | Optional. The port number on the proxy server from which the object is requested. When used with ResolveURL, the URLs of retrieved documents that specify a port number are automatically resolved to preserve links in the retrieved document. The default is 80. |
UserAgent | Optional. User agent information to write to the request header. |
ThrowOnError | Optional. Boolean indicating whether to throw an exception that can be caught by using the CFTRY and CFCATCH tags. The default is NO. |
Redirect | Optional. Boolean indicating whether to redirect execution or stop execution. The default is YES. |
Timeout | Optional. Value in seconds. When a URL timeout is specified in the browser, the Timeout attribute setting takes precedence over the ColdFusion administrator timeout. The ColdFusion server then uses the lesser of the URL timeout and the value passed in the Timeout attribute, so that the request always times out at the same time as, or before, the page times out. If no timeout is set on the URL in the browser, in the administrator, or in this attribute, ColdFusion waits indefinitely for the request. |
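Several of these attributes work together when you build a query from a remote text file. The following is a minimal sketch, assuming a hypothetical comma-delimited file at http://www.example.com/headlines.txt whose rows hold a title and a link. The file, the query name headlines, and the column names are all assumptions made for illustration.

```cfml
<cftry>
  <!--- GET a delimited text file and turn it into a query named "headlines".
        Columns supplies our own column names; per Table 1, the first row of
        the file is still not treated as data.
        TextQualifier is omitted, so the default double quotation mark applies. --->
  <cfhttp url="http://www.example.com/headlines.txt"
          method="GET"
          name="headlines"
          columns="title,link"
          delimiter=","
          throwonerror="yes"
          timeout="20">

  <!--- Loop over the query rows and print each headline as a link. --->
  <cfoutput query="headlines">
    <a href="#headlines.link#">#headlines.title#</a><br>
  </cfoutput>

  <!--- With ThrowOnError set to YES, a failed request raises an exception
        that lands here instead of leaving the query undefined. --->
  <cfcatch type="any">
    <p>The headlines could not be retrieved.</p>
  </cfcatch>
</cftry>
```

Setting an explicit Timeout is a deliberate choice here: an agent that calls an external site should not be allowed to wait indefinitely when the remote server is slow or unavailable.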