FastRWeb.Rd
FastRWeb is not just a package, but an entire infrastructure allowing the use of R scripts to create web pages and graphics.
The basic idea is that a URL of the form
http://server/cgi-bin/R/foo?bar=value
is processed by FastRWeb by sourcing the script foo.R and calling
the function run(bar="value"), which is expected to be defined in that
script. The result of a script can be anything from an HTML page to
bitmap graphics or a PDF document.
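As a minimal sketch, a script foo.R serving the URL above could look like the following. It assumes the FastRWeb package is installed on the server. The page content, the default value of bar and the esc() helper are illustrative and not part of FastRWeb.

```r
# web.R/foo.R -- requested as http://server/cgi-bin/R/foo?bar=value
# Assumes the FastRWeb package is installed; the page content is illustrative.

# Hypothetical helper: escape "&" and "<" before echoing user input into HTML.
esc <- function(x) gsub("<", "&lt;", gsub("&", "&amp;", x, fixed = TRUE), fixed = TRUE)

run <- function(bar = "nothing", ...) {
  # out() appends text to the HTML output collected for this request
  FastRWeb::out("<h2>Hello from foo.R</h2>")
  FastRWeb::out("<p>bar = ", esc(bar), "</p>")
  # done() returns the collected output as an HTML result
  FastRWeb::done()
}
```

The trailing ... lets the script tolerate query parameters it does not use.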
FastRWeb uses CGI or PHP as the front-end and an Rserve server as the
back-end. For details see Urbanek, S. (2008)
FastRWeb: Fast Interactive Web Framework for Data Mining Using R,
IASC 2008.
The R code in the package itself provides R-side tools that
facilitate the delivery of results to a browser, such as WebResult,
WebPlot, out and done, which are described in more detail below.
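To illustrate how these tools are typically used, here is a small sketch of a script that returns a plot instead of HTML. The file name plot.R, the parameter n and the plot itself are assumptions made for the example. It also assumes that FastRWeb and its graphics dependencies are installed on the server.

```r
# Hypothetical web.R/plot.R; assumes FastRWeb and its graphics dependencies
# are installed on the server.
run <- function(n = 100, ...) {
  n <- as.integer(n)                # query-string values arrive as strings
  p <- FastRWeb::WebPlot(800, 600)  # open an 800x600 bitmap device
  plot(rnorm(n), main = "Random normal sample")
  p                                 # returning the plot object delivers the image
}
```

A request such as http://server/cgi-bin/R/plot?n=500 would then be expected to return the image directly.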
The default configuration of FastRWeb assumes that the project root
will be in /var/FastRWeb and that the server is a Unix
machine. It is possible to install FastRWeb in other settings, but it
will require modification of the configuration.
First, the FastRWeb package should be installed (typically using
install.packages("FastRWeb") in R). The installed package contains a
shell script that sets up the environment in /var/FastRWeb. To run the
script, use
system(paste("cd",system.file(package="FastRWeb"),"&& install.sh"))
For the anatomy of the /var/FastRWeb project root see below.
Once the project root has been created, you can inspect the Rserve
configuration file /var/FastRWeb/code/rserve.conf and adjust it to your
needs if necessary. You can also look at the Rserve initialization
script /var/FastRWeb/code/rserve.R, which is used to pre-load data,
packages etc. into Rserve. If you are happy with it, you can start
Rserve using /var/FastRWeb/code/start
To tell your web server to use FastRWeb, you have two options: a CGI
script or a PHP script. The former is more common as it works with
any web server. The FastRWeb R package builds the Rcgi script as part
of its installation process and installs it into the cgi-bin directory
of the package. It has no way of knowing the location of your
server's cgi-bin directory, so it is left to the user to copy the
script to the proper location.
Use system.file("cgi-bin", package="FastRWeb") in R to locate the
package directory. It contains an executable Rcgi (or Rcgi.exe on
Windows). Copy that executable into your server's cgi-bin directory
(on Debian/Ubuntu this is typically /usr/lib/cgi-bin, on Mac OS X it is
/Library/WebServer/CGI-Executables). Most examples in FastRWeb assume
that you have renamed the script to R instead of Rcgi, but you can
choose any name.
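The copy step can be scripted from R. The following is a sketch only: install_rcgi is a hypothetical helper, not a FastRWeb function, and the target directory has to be adjusted to your web server.

```r
# Copy the Rcgi executable shipped with FastRWeb into the web server's
# cgi-bin directory under a new name. install_rcgi() is a hypothetical helper,
# not part of FastRWeb.
install_rcgi <- function(cgi_dir, name = "R",
                         src_dir = system.file("cgi-bin", package = "FastRWeb")) {
  windows <- .Platform$OS.type == "windows"
  exe <- if (windows) "Rcgi.exe" else "Rcgi"
  if (windows) name <- paste0(name, ".exe")  # keep the .exe suffix on Windows
  src <- file.path(src_dir, exe)
  if (!file.exists(src)) stop("cannot find ", exe, " in '", src_dir, "'")
  dest <- file.path(cgi_dir, name)
  if (!file.copy(src, dest, overwrite = TRUE)) stop("failed to copy to ", dest)
  Sys.chmod(dest, mode = "0755")  # make sure the web server can execute it
  invisible(dest)
}

# Example (Debian/Ubuntu location from the text; usually requires root):
# install_rcgi("/usr/lib/cgi-bin")
```

The src_dir argument defaults to the location reported by system.file(), so it only needs to be given when the executable lives elsewhere.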
With Rserve started and the CGI script in place, you should be able to
open a browser and run your first script. The URL will probably look
something like http://my.server/cgi-bin/R/main.
This will invoke the script /var/FastRWeb/web.R/main.R by sourcing it
and running the run() function.
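The mapping from URL to script can be pictured with the following base R sketch. It is a simplified stand-in written for illustration, not FastRWeb's actual implementation, and the names parse_query and dispatch are hypothetical.

```r
# Illustrative only: a simplified stand-in for the dispatch described above,
# not FastRWeb's actual implementation. All function names are hypothetical.

# Turn "a=1&b=x%20y" into list(a = "1", b = "x y").
parse_query <- function(query) {
  if (is.null(query) || !nzchar(query)) return(list())
  decode <- function(x) {
    vapply(gsub("+", " ", x, fixed = TRUE), utils::URLdecode,
           character(1), USE.NAMES = FALSE)
  }
  kv   <- strsplit(query, "&", fixed = TRUE)[[1]]
  keys <- sub("=.*$", "", kv)       # text before the first "="
  vals <- sub("^[^=]*=?", "", kv)   # text after the first "="
  as.list(stats::setNames(decode(vals), decode(keys)))
}

# Source <root>/web.R/<path>.R and call its run() with the query arguments.
dispatch <- function(path, query = "", root = "/var/FastRWeb") {
  if (grepl("..", path, fixed = TRUE)) stop("invalid script path")
  script <- file.path(root, "web.R", paste0(path, ".R"))
  env <- new.env()
  source(script, local = env)           # defines run() in a private environment
  do.call(env$run, parse_query(query))  # e.g. run(bar = "value")
}
```

With this sketch, dispatch("main") corresponds to the request for /cgi-bin/R/main, and dispatch("foo", "bar=value") corresponds to calling run(bar="value") in foo.R.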
For advanced topics, please see the Rserve documentation. For
production systems we encourage the use of the gid, uid, sockmod and
umask configuration directives to secure access to Rserve according
to your web server configuration.
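As a rough sketch, the lines below show what such directives might look like in rserve.conf, written out from R to a scratch file. The directive names come from the text above. Every value, including the numeric ids, is a placeholder and must be checked against the Rserve documentation and your own system before use.

```r
# Sketch of access-related lines for rserve.conf. Directive names are taken
# from the text; all values are placeholders, not recommendations.
conf <- c(
  "socket /var/FastRWeb/socket",  # local socket used by the default setup
  "sockmod 0660",                 # placeholder: permissions of the socket file
  "uid 33",                       # placeholder: numeric id of the web server user
  "gid 33",                       # placeholder: numeric id of the web server group
  "umask 0007"                    # placeholder: mask for files created by Rserve
)

# Write to a scratch file for review rather than touching the live configuration.
example_file <- file.path(tempdir(), "rserve.conf.example")
writeLines(conf, example_file)
```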
The project root (typically /var/FastRWeb) contains various
directories:
web.R - this directory contains the R scripts that will
be served by FastRWeb. The URL is parsed such that the path part
after the CGI binary is taken and .R is appended to it; the result is
used to locate the file in the web.R directory. Once located, it is
sourced and the run() function is called with the query string
parsed into its arguments. The default installation also sources
common.R in addition to the specified script (see code/rserve.R and
the init() function for details on how this is achieved; you can
modify the behavior as you please).
web - this directory can contain static content that
can be referenced using the "file" command in WebResult.
code - this directory contains supporting infrastructure and
configuration files associated with the Rserve back-end. If the start
script in this directory is used, it loads the rserve.conf
configuration file and sources rserve.R as initialization of the
Rserve master. The init() function (if present, e.g., defined in
rserve.R) is run on every request.
tmp - this directory is used for temporary files. It should be purged
occasionally to prevent accumulation of temporary files. FastRWeb
provides ways of cleaning up (e.g., see the "tmpfile" command in
WebResult), but crashed or aborted requests may still leave temporary
files around. Only files from this directory can be served using the
"tmpfile" WebResult command.
logs - this directory is optional. If present, the Rcgi script will
log requests in the cgi.log file in this directory. It records the
request time, duration, IP address, WebResult command, payload,
optional cookie filter and the user-agent. To enable logging, simply
create the logs directory with sufficient permissions to allow the
Rcgi script to write to it.
run - this directory is optional as well and is used for
run-time systems such as global login authorization etc. It is not
populated or used in the CRAN version of FastRWeb, but we encourage
this structure for any user-defined subsystems.
In addition, the default configuration uses a local socket named
socket to communicate with the Rserve instance. Note that this lets
you use regular Unix permissions to limit access to Rserve.
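The directory layout described above can be checked from R with a few lines. The helper below is a hypothetical sketch, not a FastRWeb function. It only reports which of the named directories exist under a given project root and treats logs and run as optional, as described in the text.

```r
# Hypothetical helper (not part of FastRWeb): report which directories of the
# layout described above exist under a given project root.
check_fastrweb_root <- function(root = "/var/FastRWeb") {
  dirs     <- c("web.R", "web", "code", "tmp", "logs", "run")
  optional <- c("logs", "run")  # described as optional in the text
  data.frame(directory = dirs,
             optional  = dirs %in% optional,
             exists    = dir.exists(file.path(root, dirs)),
             stringsAsFactors = FALSE)
}

# Example: check_fastrweb_root("/var/FastRWeb")
```

Creating a missing logs directory, with permissions that let the CGI script write to it, is then enough to enable request logging.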