CaRP: Caching RSS Parser - Manual
     Version 3.5.4 (10/7/2004)

NOTE: This version of CaRP is obsolete. This documentation is being left here for those who have not yet upgraded to the current version.


When displaying newsfeeds from slow servers or displaying many newsfeeds on the same page, your own page will load slowly whenever the cache is refreshed. One way to speed up the loading of multiple newsfeeds on a page is to set a different cache interval for each one using the function CarpConf('cacheinterval',60); (where 60 is the cache interval in minutes). If visitors come to your page often enough, usually only one cache will be refreshed at a time, providing acceptable performance. However, if your page is only rarely accessed, several caches are likely to have expired by the time the next visitor arrives, so using different cache intervals may not solve the problem.

Another solution is to set up background refreshing of your cache files. This is done by setting up a "cron job" to periodically load a web page containing the newsfeeds. When setting up such a system, note the following points:

  1. The page loaded by the cron job should not be the same page that is loaded by your website visitors. The reason for this is that the cache interval for the page loaded by the cron job needs to be shorter than the interval for the page loaded by your visitors to ensure that the cron job always handles the refresh.
  2. Another reason for having a different page is that the page loaded by the cron job doesn't need to contain anything but the newsfeed, but the page being shown to visitors will usually contain various other content. See "update_news.php" below for an example.
  3. It may be best to have a separate web page (a separate file like "update_news.php") and cron job (a separate "update_news.pl", or a way to pass the address of "update_news.php" in to "update_news.pl") for each newsfeed to avoid problems with timing out on slow or temporarily unavailable newsfeeds.

Here's an example of how to set up a background refreshing system:

update_news.php
This file is loaded by "update_news.pl", shown below, to cache a copy of the newsfeed on your server for quick access:

<?php
require_once "/path/to/carp.php";

/* set the cache interval to 1 minute (you'll control how often the cache is actually refreshed with your cron settings, not here) */
CarpConf('cacheinterval',1);

CarpCache('http://www.some.where.com/path/to/newsfeed.rdf',
     'cache.file.name');
?>

your_webpage.php
Now that you have the newsfeed cached on your server, you can access it on the page where you want it displayed like this:

<?php
require_once "/path/to/carp.php";

/* use the CarpConf() function to set up any desired formatting - see the manual and other examples for details */

CarpShow(CarpCachePath().'cache.file.name');
?>

update_news.pl
Depending on your PHP setup, cron jobs may not be able to load and execute PHP files directly. Even if they can, it is usually preferable to have the cron job execute a script which loads the PHP file to ensure that there are no problems with file access and that you don't get an unnecessary email every time the cron job runs.

The Perl script below is executed by cron to connect to the web server and load update_news.php. Refer to the documentation for cron for instructions on how to set up the cron job on your server.

Note that you'll need to change "www.yourwebsite.com" (near the top) and "/path/to/update_news.php" (near the bottom) to the appropriate values for your setup. IMPORTANT: The path to update_news.php should NOT be the full path on the server, but the path that a web browser would specify. For example, if a web browser would load the page as "http://www.mouken.com/rss/update_news.php", the appropriate path would be "/rss/update_news.php".

The cron job can run on a different computer from the web server. In that case, enter the hostname of the computer where the cron job is running in the "$hostname=" line.

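#!/usr/bin/perl

use strict;
use warnings;
use Socket;

# Hostname of the computer where this script (the cron job) runs
my $hostname='www.yourwebsite.com';
# Hostname of the web server that serves update_news.php
my $remote_host='www.yourwebsite.com';
my $proto=getprotobyname('tcp');

# Resolve both hostnames to packed IPv4 addresses
my $thisaddr = inet_aton($hostname) || die "cannot resolve $hostname";
my $thataddr = inet_aton($remote_host) || die "cannot resolve $remote_host";

# Local endpoint (any free port) and remote endpoint (port 80)
my $this = pack_sockaddr_in(0, $thisaddr);
my $that = pack_sockaddr_in(80, $thataddr);

socket(S, PF_INET, SOCK_STREAM, $proto)||die "socket: $!";
bind(S,$this)||die "bind: $!";
connect(S,$that)||die "connect: $!";
# Turn off output buffering on the socket
select(S); $|=1; select(STDOUT);

# Request the page; HTTP header lines end with CRLF
print S "GET /path/to/update_news.php HTTP/1.0\r\nHost: $remote_host\r\n\r\n";
# Read and discard the response so the PHP script runs to completion
while (my $line=<S>) { }
close(S);
exit 0;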
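
As a rough illustration of the cron side, a crontab entry along the following lines would run the script twice an hour. This is only a sketch: the 30-minute schedule and the location "/path/to/update_news.pl" are assumptions chosen for the example, and the script must be marked executable (for instance with chmod +x). Check your own system's cron documentation for the exact syntax it accepts.

```crontab
# minute  hour  day-of-month  month  day-of-week  command
# Hypothetical schedule: run at minute 0 and minute 30 of every hour.
# Replace /path/to/update_news.pl with the actual location of the script.
0,30      *     *             *      *            /path/to/update_news.pl
```

Whatever schedule you choose, keep the cache interval used by the page your visitors load longer than the time between cron runs, so that the cron job, not a visitor, triggers each refresh.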
