jPHPsetistats is a tool written in PHP to get SETI@home user statistics and pretty print them on websites. For example, it's used to generate the following image:
Data is read from the SETI@home website as an XML file and cached, then the image is generated from it.
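The read-then-cache flow can be sketched roughly as follows. This is a minimal illustration, not the code shipped in jPHPsetistats: the function name `readStats`, the `$fetch` callable and the two interval parameters are assumptions made for the example, and the cache is written immediately rather than after parsing.

```php
<?php
declare(strict_types=1);

/**
 * Return [xml, fromCache], or null when nothing usable is available.
 * $fetch is a hypothetical callable that returns the XML string,
 * or null when the remote site cannot be reached.
 */
function readStats(string $cacheFile, callable $fetch, int $minWait, int $flushAfter): ?array
{
    clearstatcache(true, $cacheFile);
    $age = is_file($cacheFile) ? time() - filemtime($cacheFile) : null;

    // A recent cache entry is served without contacting the remote site.
    if ($age !== null && $age < $minWait) {
        return [file_get_contents($cacheFile), true];
    }

    // Otherwise try the web and refresh the cache on success.
    $xml = $fetch();
    if (is_string($xml) && $xml !== '') {
        file_put_contents($cacheFile, $xml);
        return [$xml, false];
    }

    // The web failed: fall back to an older entry that has not expired yet.
    if ($age !== null && $age < $flushAfter) {
        return [file_get_contents($cacheFile), true];
    }

    return null;
}
```

The image generation step would then work from the returned XML string, whichever source it came from.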
We were using Royale's script to get our SETI@home stats in our online signatures, but it was too slow, because of:
At that time, I had just learned how to use XML in PHP, and I was looking for an application I could hack on. The old script was annoying because it slowed down some forums for every reader. Then I found that the SETI@home team had added XML access to user statistics, so I thought it was a good idea to redo the script from scratch, using the XML and handling its own cache, then put it on a fast server, resolving the three issues in one step.
setipng.php
script in this directory.
Have a look at the various parameters at the top of the main script class;
several features can be controlled from there.
You can also use the one I put online (see the link just above), but if it generates
too much traffic I'll have to remove it.

I think I successfully resolved the issues described in the “About ..the goal” paragraph. However, I think I can still add some interesting features and improvements. I won't code them all; it's only a wishlist.
jPHPsetistats in Tar/Gz format (6 125 bytes).
jPHPsetistats in Zip format (6 626 bytes).
This project is subject to the GNU Lesser General Public License, the text of which can be found at http://www.gnu.org/licenses/lgpl.txt. Feel free to propose your ideas and contributions (send me an email); I'll try to incorporate them.
I worked with the XML DTD provided by the folks at the SETI@home website. Unfortunately, when you check their XML output against the DTD, you can see that something is wrong.
First, the rankinfo element, though declared in the header comment, is not defined in the DTD body.
Second, the groupinfo founder element is defined, but seems to be missing from the XML output. I can't add the missing groupinfo founder element myself, but I can correct the DTD to include the rankinfo element: the modified DTD is here (userstats-corrected.dtd).
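This kind of mismatch can be reproduced with PHP's DOM extension, which validates a document against its DTD through `DOMDocument::validate()`. The sketch below is only an illustration: the root element name `userstats`, the content models and the helper `validatesAgainstInlineDtd` are assumptions made for the example, not the actual SETI@home DTD. Only the element name rankinfo comes from the text above.

```php
<?php
declare(strict_types=1);

/**
 * Validate a tiny sample document against a DTD given as an internal subset.
 * The sample document and its root element name are hypothetical.
 */
function validatesAgainstInlineDtd(string $dtdBody): bool
{
    $xml = "<?xml version=\"1.0\"?>\n"
         . "<!DOCTYPE userstats [\n" . $dtdBody . "\n]>\n"
         . "<userstats><rankinfo>42</rankinfo></userstats>";

    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $ok = $doc->loadXML($xml) && $doc->validate();
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $ok;
}

// Hypothetical DTD where rankinfo is used but never declared.
$original = '<!ELEMENT userstats (rankinfo)>';

// Same DTD with the missing declaration added.
$corrected = $original . "\n" . '<!ELEMENT rankinfo (#PCDATA)>';

var_dump(validatesAgainstInlineDtd($original));
var_dump(validatesAgainstInlineDtd($corrected));
```

Under these assumptions the first call should report the document as invalid because rankinfo has no declaration, and the second should accept it.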
In the current release, the scripts run pretty well, astonishingly fast compared to Royale's original script. The bottlenecks were retrieving the data (HTML is larger than XML) and then processing it (parsing XML is faster than grabbing values from HTML with regular expressions).
But when you look at the way it's used (inside forum signatures), it's not unusual for the same image to be requested many times on the same page. In this case, the script does the same work for every image; it is multithread-safe, but it isn't the most efficient way to do it.
We can save server resources by doing the work on the first request, then handing the ready-to-go result directly to the simultaneous requests that follow. So I tried to formalise (in a non-formalised language, but it should be easy to understand) how the current process works, and I crafted a new process that is multirequest-safe (MR-safe). I think the new process is well designed (I hand-checked it for safety, liveness, fairness, and freedom from deadlock and starvation; I'm used to this kind of work, see my SemServ project), but there is no formal proof of it, and I'll have to code it to see whether I'm right.
========
= Current process (not MR-safe):
========
START.
READ.
  READ1. If file found in cache and not too old (min wait interval), go to PARSE.
  READ2. If web is available and data has been retrieved, go to PARSE.
  READ3. If file found in cache and not too old (flush cache interval), go to PARSE.
  READ4. FAILED.
PARSE. Parse XML data.
WRITE. If XML data is not from cache (i.e. is from web), write into cache.
END.

========
= New process (MR-safe):
========
START.
READ.
  READ1. If named temporary file not found, go to READ5.
  READ2. If named temporary file too old, remove it then go to READ5.
  READ3. Wait (yielding CPU) while named temporary file is still there (and time not elapsed).
  READ4. If time elapsed, remove named temporary file.
  READ5. If file found in cache and not too old (min wait interval), go to PARSE.
  READ6. If web is not available or data has not been retrieved, go to READ8.
  READ7. Create exclusive named temporary file, if failed then set data like coming from cache and go to PARSE.
  READ8. If file found in cache and not too old (flush cache interval), go to PARSE.
  READ9. FAILED.
PARSE. Parse XML data.
WRITE.
  WRITE1. If XML data is from cache (or exclusive named temporary file has not been created by us), go to END.
  WRITE2. Fill the temporary named file with XML data.
  WRITE3. Rename (or copy then remove) the temporary named file to the cache file.
END.
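
The new process could be coded along these lines. This is an untested-in-production sketch under stated assumptions, not the project's implementation: the function names, the `$fetch` callable and the option keys (`minWait`, `flushAfter`, `lockMaxAge`, `lockWait`, all in seconds) are hypothetical, and the PARSE step is left out, so WRITE2 and WRITE3 follow READ7 directly. The exclusive temporary file relies on `fopen()` mode `'x'`, which fails when the file already exists.

```php
<?php
declare(strict_types=1);

/** Age of a file in seconds, or null when it does not exist. */
function fileAge(string $path): ?int
{
    clearstatcache(true, $path);
    return is_file($path) ? time() - filemtime($path) : null;
}

/** Read a file, returning null instead of false on failure. */
function readOrNull(string $path): ?string
{
    $data = @file_get_contents($path);
    return $data === false ? null : $data;
}

/**
 * Return the XML string, or null on failure (READ9).
 * $fetch is a hypothetical callable returning the XML string or null.
 */
function readStatsMrSafe(string $cacheFile, string $tmpFile, callable $fetch, array $opt): ?string
{
    // READ1-READ4: deal with a temporary file left by another request.
    $lockAge = fileAge($tmpFile);
    if ($lockAge !== null) {
        if ($lockAge > $opt['lockMaxAge']) {
            @unlink($tmpFile);                      // READ2: too old, remove it
        } else {
            $deadline = microtime(true) + $opt['lockWait'];
            while (fileAge($tmpFile) !== null && microtime(true) < $deadline) {
                usleep(20000);                      // READ3: wait, yielding the CPU
            }
            if (fileAge($tmpFile) !== null) {
                @unlink($tmpFile);                  // READ4: time elapsed
            }
        }
    }

    // READ5: a recent cache entry is used as-is.
    $cacheAge = fileAge($cacheFile);
    if ($cacheAge !== null && $cacheAge < $opt['minWait']) {
        return readOrNull($cacheFile);
    }

    // READ6: try the web.
    $xml = $fetch();
    if (is_string($xml) && $xml !== '') {
        // READ7: only one request manages to create the file exclusively.
        $handle = @fopen($tmpFile, 'x');
        if ($handle !== false) {
            fwrite($handle, $xml);                  // WRITE2
            fclose($handle);
            rename($tmpFile, $cacheFile);           // WRITE3
        }
        // If creation failed, another request is writing: skip the write.
        return $xml;
    }

    // READ8: fall back to an older cache entry that has not been flushed.
    if ($cacheAge !== null && $cacheAge < $opt['flushAfter']) {
        return readOrNull($cacheFile);
    }

    return null;                                    // READ9
}
```

Writing to the temporary file and then renaming it means a reader never sees a half-written cache file, which is the point of WRITE2 and WRITE3.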