Grouper: Convert news searches and other web pages to RSS newsfeeds

Grouper: RSS Generator - Documentation

     Version 1.4.3 (7/15/2005)

This documentation has been replaced. The new documentation is available here.

Configuration
Configuration is done by changing the values in the array $grouperconf. You may change the defaults by modifying grouper.php itself. To override the defaults for a particular RSS feed, insert lines like "GrouperConf('cachepath','/my/private/cache/path');" after the line where you "require_once" Grouper and before the call to GrouperShow().
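
As a rough sketch of a per-feed override: the include path below is an assumption (it presumes grouper.php sits next to the calling script), and GrouperShow() is written without arguments, as in the text above. Its actual parameters are covered in the Functions section.

```php
<?php
// Assumed location of Grouper; adjust to wherever grouper.php is installed.
require_once 'grouper.php';

// Override the default cache path for this feed only.
GrouperConf('cachepath', '/my/private/cache/path');

// Generate the feed. Arguments are omitted here; see the Functions section.
GrouperShow();
```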

The following options are defined in $grouperconf.

Global Options: Cache Control | Miscellaneous | Network Connection
News Source Specific Options: Miscellaneous | Search & Result Parsing

Global Options
Use the function "GrouperConf" to set these options.

Cache Control:

  • cachepath: The path to the directory where your cached files will be stored. Important notes:
    1. If you wish to store the cache files in the same directory as the document being loaded by the web browser, change the value of cacherelative to 0, and set cachepath to '.', not ''. If you set it to '', Grouper will attempt to use your root directory to hold the cache files.
    2. To store the cache files in a location not relative to the location of Grouper, change the value of cacherelative to 0 and then change cachepath to the absolute path to where you want to store cache files.
    3. Storing the cache files in the very same directory as Grouper is not recommended.
  • cacherelative: This setting controls how Grouper interprets the value of cachepath. If cacherelative is "0", then cachepath must either be an absolute path or a path relative to the location of the PHP file being loaded by the web browser. If cacherelative is "1", then cachepath is relative to the location of Grouper itself.
  • cacheinterval: The number of minutes before the cached file expires and the Google News search is repeated.
  • cacheerrorwait: The number of minutes to wait before refreshing the cache if an error occurs while attempting to search Google News.
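
The interaction between cachepath, cacherelative and cacheinterval can be sketched as follows. This is an illustration only, not Grouper's actual code: the helper names resolveCacheDir, cacheFile and cacheExpired are hypothetical, and the way the file name is appended is an assumption. It does suggest one plausible reason why an empty cachepath ends up pointing at the root directory.

```php
<?php
// Hypothetical helper: pick the cache directory as the notes above describe.
function resolveCacheDir(string $cachepath, int $cacherelative, string $grouperDir): string
{
    if ($cacherelative === 1) {
        // Relative to the directory that holds grouper.php.
        return rtrim($grouperDir, '/') . '/' . $cachepath;
    }
    // Absolute path, or relative to the script loaded by the browser.
    return $cachepath;
}

// Hypothetical helper: build the full path of a cache file (assumed scheme).
function cacheFile(string $dir, string $name): string
{
    return $dir . '/' . $name;
}

// Hypothetical helper: a cache file is stale once it is older than
// the configured number of minutes.
function cacheExpired(int $fileMtime, int $now, int $minutes): bool
{
    return ($now - $fileMtime) > $minutes * 60;
}

echo resolveCacheDir('cache', 1, '/var/www/grouper'), "\n"; // relative to Grouper
echo cacheFile(resolveCacheDir('.', 0, ''), 'feed.xml'), "\n";  // next to the calling script
echo cacheFile(resolveCacheDir('', 0, ''), 'feed.xml'), "\n";   // lands in the root directory
var_dump(cacheExpired(1000, 1000 + 61 * 60, 60));           // 61 minutes old, 60-minute interval
```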
Miscellaneous:
  • contenttype: The MIME type for Grouper's output. This applies only when calling GrouperShow with $showit set to 1 (the default value). If you do not want Grouper to output a Content-Type header, set this to '' (blank).
  • groupererrors: 1 to show Grouper's error messages, 0 to suppress them. Note that this setting does not affect PHP errors; use phperrors for that.
  • phperrors: -1 to leave PHP's error reporting setting where it is. Otherwise, indicate the value you desire for error reporting. When using predefined PHP constants, be sure NOT to put quote marks around the values. See the PHP documentation for more details and valid values.
  • source: Selects the news source to search. Valid values are: 'google', 'yahoo', and for Grouper Evolution: 'feedster', 'daypop', and 'blog'. With Grouper Evolution, GrouperLoadPlugin('desired source name.php'); does the same thing.
  • maxitems: The maximum number of news items to display. Google's search results generally include 10 items, and at present, setting this value higher than that will not produce more results.
  • skipdups: Skip duplicate news items based on the headline (commercial version only). This feature is not supported with Feedster and Daypop.
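
A short sketch of the error and output settings, assuming grouper.php is in the include path. The option names come from the list above; the chosen values are only examples.

```php
<?php
require_once 'grouper.php'; // assumed install location

// Correct: E_ALL is a predefined PHP constant, passed without quotes.
GrouperConf('phperrors', E_ALL);

// Incorrect: this would pass the string "E_ALL", not the constant's value.
// GrouperConf('phperrors', 'E_ALL');

// Hide Grouper's own error messages.
GrouperConf('groupererrors', 0);

// Do not send a Content-Type header.
GrouperConf('contenttype', '');

// Cap the feed at five items.
GrouperConf('maxitems', 5);
```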
Network Connection:
  • proxyauth: If you are using a proxy server which requires authorization and accepts "Basic" authorization, enter your credentials here as "username:password".
  • timeout: The number of seconds to wait during the "CONNECT" phase of setting up a TCP/IP connection before aborting the attempt to execute the search.
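
For reference, "Basic" proxy authorization sends the "username:password" string Base64-encoded in a Proxy-Authorization header. The sketch below shows that encoding only; basicProxyAuthHeader is a hypothetical helper, the credentials are placeholders, and this is not a claim about how Grouper builds its requests internally.

```php
<?php
// Hypothetical helper: turn "username:password" into a Basic
// Proxy-Authorization header line.
function basicProxyAuthHeader(string $credentials): string
{
    return 'Proxy-Authorization: Basic ' . base64_encode($credentials);
}

// Placeholder credentials, in the same form the proxyauth option expects.
echo basicProxyAuthHeader('username:password'), "\n";
```

Because Base64 is an encoding and not encryption, the credentials are readable by anyone who can see the request.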
News Source Specific Options
Use the function "GrouperSourceConf" to set these options. If you are not using the default news source, be sure to call GrouperConf('source',desired news source); before calling GrouperSourceConf. The letters in parentheses after each option name indicate which news sources it applies to: G = Google, Y = Yahoo!, F = Feedster, D = Daypop.
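
A minimal sketch of the call order. The (name, value) argument form of GrouperSourceConf is assumed here to mirror GrouperConf; this section does not show its signature, so check the Functions section before relying on it. The channel title is a made-up example.

```php
<?php
require_once 'grouper.php'; // assumed install location

// Select the news source first...
GrouperConf('source', 'yahoo');

// ...then set options that belong to that source.
// Argument order is an assumption modelled on GrouperConf.
GrouperSourceConf('channeltitle', 'My Yahoo! News Feed');

// Sort by date; sortby is described under Search & Result Parsing.
GrouperSourceConf('sortby', 2);
```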

Miscellaneous:

  • channeltitle (G, Y): The channel title for the RSS feed.
Search & Result Parsing:
Some of these settings may need to be changed someday if Google changes the way searches are performed or how it outputs its results.
  • searchdomain (G, Y, F, D): The domain name of the server to send the search query to.
  • querystart (G, Y, F, D): The first part of the search query.
  • language (G, F, D): The language code for the search.
    NOTES:
    1. If you change the language code when searching Google News, you will also need to change the value of querystart to "Sort by date" in the language you select. Refer to the Google News page, doing a search in that language, to see exactly how they phrase it.
    2. This feature does not appear to work with Feedster at this time. We have verified that we generate the search query correctly, but it does not appear to affect the results.
  • edition (G): The edition code for the search.
  • encoding (G, Y, F): Specify the character encoding of the RSS feed. Valid values depend on what each site will accept. NOTE: This feature does not appear to work with Feedster at this time. We have verified that we generate the search query correctly, but it does not appear to affect the results.
  • sortby and sortby# (G, Y): sortby1 and sortby2 hold the extra code that is added to the search query to control the sort order, and sortby selects which of the two is used. To sort by relevance to the search terms, nothing is added. To sort strictly by date, some extra code is added. If you wish to sort by date, change sortby to 2.
  • sortby (F): Valid values are "relevance", "date" and "blogrank".
  • tossbefore (G, Y): Text or HTML to look for in the search results which appears just before the actual results. If you change the language of the search (not applicable for all news sources), you may need to change this value.
  • tossafter (G, Y): Text or HTML to look for in the search results which appears just after the actual results. If you change the language of the search (not applicable for all news sources), you may need to change this value.
  • extractionpattern (G, Y): A "regular expression" to use to chop up the search results into individual news stories and pull out the various parts of each story. If you don't have experience with regular expression matching, you probably shouldn't touch this! If Google changes the format of their results, we will provide an updated script to fix this.
  • extractionorder (G, Y): The order of the fields as they come out of the extraction pattern.
  • sourcetypes (D): Specifies what types of news sources to search. Valid values are: a (all), n (news), w (weblogs) and h (rss headlines).
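
To show how tossbefore, tossafter, extractionpattern and extractionorder could fit together, here is a self-contained sketch. It is an illustration only: extractItems is a hypothetical function, and the markers, pattern and HTML are made up. None of it matches any real news site or Grouper's shipped defaults.

```php
<?php
// Hypothetical helper: trim the page to the results area, then split it
// into stories with a regular expression.
function extractItems(string $html, string $tossbefore, string $tossafter,
                      string $pattern, array $order): array
{
    // Discard everything up to and including the "before" marker.
    $start = strpos($html, $tossbefore);
    if ($start !== false) {
        $html = substr($html, $start + strlen($tossbefore));
    }
    // Discard everything from the "after" marker onward.
    $end = strpos($html, $tossafter);
    if ($end !== false) {
        $html = substr($html, 0, $end);
    }
    // Each match is one story; $order names the capture groups in sequence.
    preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);
    $items = [];
    foreach ($matches as $m) {
        $item = [];
        foreach ($order as $i => $field) {
            $item[$field] = $m[$i + 1];
        }
        $items[] = $item;
    }
    return $items;
}

// Made-up page: two stories between the markers, one stray link after them.
$html = '<p>Header</p><!--results-->'
      . '<a href="http://example.com/a">First story</a>'
      . '<a href="http://example.com/b">Second story</a>'
      . '<!--end--><p>Footer <a href="/about">About</a></p>';

$items = extractItems($html, '<!--results-->', '<!--end-->',
                      '/<a href="([^"]+)">([^<]+)<\/a>/', ['link', 'title']);
print_r($items);
```

Trimming the page first keeps the pattern from matching links in the header or footer, which is why a change in the surrounding page text (for example, a different search language) can require new tossbefore and tossafter values even when the pattern itself still works.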