This vignette demonstrates how to use the gs.engine
function provided by the gsengine package to run Genstat
code chunks from within an R Markdown document.
It assumes:
gsengine with Genstat on your own machine

If you have Genstat installed locally, i.e. on the machine on which you
are writing your R Markdown document, then you can create a
knitr engine for interacting with Genstat by using the
gs.engine function from the gsengine package
with the knitr::knit_engines$set function. That is, you
create an R chunk that calls the knitr::knit_engines$set
function:
```{r}
knitr::knit_engines$set(gs = gsengine::gs.engine())
```
Note that there is nothing special about the name gs.
All it does is tell R Markdown that the chunk
should talk to Genstat in order to interpret the code. That is, if you
specify a chunk with
```{gs}
```
then R Markdown knows to talk to the Genstat engine. We could have
used the name Genstat, g, or any other valid
variable name and it would work in the same way. We will use
gs throughout this document.
One of the cool things about how David Baird and Simon Urbanek wrote the Genstat Messenger client is that it allows you to have Genstat running remotely on a network-contactable machine. You can configure the Genstat server host and port via environment variables:
```{r}
Sys.setenv(GENSTAT_HOST = "sc389508.UoA.auckland.ac.nz")
Sys.setenv(GENSTAT_PORT = "8085")
knitr::knit_engines$set(gs = gsengine::gs.engine())
```
or you can specify the host and the port in the call to
gs.engine
```{r tidy=FALSE}
knitr::knit_engines$set(gs = gsengine::gs.engine(host = "sc389508.UoA.auckland.ac.nz",
port = 8085))
```
Note that the default host is localhost and the default
port is 8085 so we did not really need to specify the
port.
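The two configuration routes above can also be combined in your own setup chunk. The following is a minimal sketch, not gsengine's actual implementation: `resolve_genstat_server()` is a hypothetical helper that reads `GENSTAT_HOST` and `GENSTAT_PORT` and falls back to the defaults stated above when they are unset or empty.

```r
# Hypothetical helper (not part of gsengine): read the server settings from
# the environment, falling back to the documented defaults.
resolve_genstat_server <- function() {
  host <- Sys.getenv("GENSTAT_HOST", unset = "localhost")
  port <- Sys.getenv("GENSTAT_PORT", unset = "8085")
  # Treat an empty string the same as an unset variable.
  if (!nzchar(host)) host <- "localhost"
  if (!nzchar(port)) port <- "8085"
  list(host = host, port = as.integer(port))
}

server <- resolve_genstat_server()
str(server)
```

The resulting values could then be passed on explicitly, for example as `gsengine::gs.engine(host = server$host, port = server$port)`.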
To run Genstat code in your R Markdown file, you
need to mark your chunk with the same name you used to set the engine
for knitr. We used the name gs, and that
means we can mark Genstat chunks with
```{gs}
```
PRINT statements

We can ask Genstat to print something for us:

| 1.000 |
This code will be sent to the Genstat server, and any output will be captured and rendered in the final document. That of course is very exciting. We can make it next level exciting by declaring a variable and printing it.
Example:
| Foo |
|---|
| 1.000 |
| 2.000 |
| 3.000 |
We are unlikely to want to use a programme with Genstat’s capabilities simply to print things out. It is much more likely that we want to do some statistical analysis and capture the output in our report. As an example we will analyze the built-in Wheat Trials data set. It contains information from a crop trial in New Zealand with 13 different cultivars, each grown in four replicate blocks.
IMPORT [PRINT=*] '%DATA%/WheatTrials.xlsx'; SHEET='Trial A1'
TREATMENTS Cultivar
BLOCKS Rep
ANOVA [FPROB=yes; PSE=LSD] Mean_Population_m2

| Source of variation | d.f. | s.s. | m.s. | v.r. | F pr. |
|---|---|---|---|---|---|
| Rep stratum | 3 | 344.2 | 114.7 | 0.40 | |
| Rep.Units stratum | | | | | |
| Cultivar | 12 | 4354.8 | 362.9 | 1.26 | 0.283 |
| Residual | 36 | 10368.3 | 288.0 | | |
| Total | 51 | 15067.3 | | | |

| Cultivar | Blaze | CAW722 | Encore | Envy | Excess | Flame | Monarch |
|---|---|---|---|---|---|---|---|
| | 200.0 | 191.2 | 211.2 | 196.2 | 191.2 | 187.5 | 198.8 |
| Cultivar | Nelson | Resolve | Sirus | Victor | Wanaka | WWT1695 | |
| | 181.2 | 183.8 | 183.8 | 181.2 | 195.0 | 206.2 | |

| Table | Cultivar |
|---|---|
| rep. | 4 |
| d.f. | 36 |
| l.s.d. | 24.34 |
Any reproducible research of course needs to be able to reproduce statistical graphics and, if you are working with Genstat, then it is highly likely you will want to be able to use Genstat’s graphics system.
Calls to Genstat’s graphing procedures produce PNG images, which are displayed directly in the output document. These images are saved locally with filenames like ‘img_1.png’, ‘img_2.png’, etc.
Example:
Readers who are familiar with R Markdown will be aware that chunk
options can be used to control how R Markdown behaves. For example,
there are options to suppress evaluation, suppress code, change plot
dimensions and so on. Many of these will function in the same way for
Genstat code, but some may not. We give examples of some common options
here, and then provide examples of some which are specific to the
gsengine package.
If you want to display the Genstat code without evaluating—for
example if you are interested in displaying the code for instructive
purposes—then the eval chunk option works as it does with R
Markdown. That is, setting eval=FALSE has this
consequence:
```{gs, eval=FALSE}
```
Example:
```{gs, eval=FALSE}
PRINT 'This should not run.'
```
By default, the ‘gs.engine’ shows the Genstat commands submitted in
each chunk. There will be times when you want to show the results of an
analysis in Genstat, but do not necessarily need the reader to see the
Genstat commands. To hide Genstat commands in the rendered output, we
use the echo=FALSE option in the chunk header, in the same
way we do in R Markdown:
```{gs, echo=FALSE}
```
Example:
```{gs, echo=FALSE}
PRINT 'This should run.'
```
This will only include the Genstat result or tables in the final document output.
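A lot of Genstat output is in tabular format. Equally, it is very
common in reproducible research to want to include figures from tables,
for example a sum-of-squares value or an \(F\)-statistic. The current implementation
in gsengine allows for three different modes of table
retrieval, all controlled by the saveTables chunk option. The package can
save all of the tables from a chunk in a list with a default name
(saveTables=TRUE), save them in a list with a name you choose
(saveTables="wheat_anova"), or save individual tables under names you
supply (saveTables=c("anovatbl", "meanstbl")).
Here are a few examples.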
```{gs, saveTables=TRUE}
IMPORT [PRINT=*] '%DATA%/WheatTrials.xlsx'; SHEET='Trial A1'
TREATMENTS Cultivar
BLOCKS Rep
ANOVA [FPROB=yes; PSE=LSD] Mean_Population_m2
```
Note: we will hide the results of the Genstat code here because you have seen them plenty of times so far and it is both repetitious and unhelpful to display them time and time again.
Notice that in this example there is no label for the chunk. In this
situation gsengine will return a list called
gs_tables_last. This will, of course, get overwritten if
you ask to save the tables in a subsequent chunk.
str(gs_tables_last)
#> List of 3
#> $ :'data.frame': 10 obs. of 6 variables:
#> ..$ X1: chr [1:10] "Source of variation" "" "Rep stratum" "" ...
#> ..$ X2: chr [1:10] "d.f." "" "3" "" ...
#> ..$ X3: chr [1:10] "s.s." "" "344.2" "" ...
#> ..$ X4: chr [1:10] "m.s." "" "114.7" "" ...
#> ..$ X5: chr [1:10] "v.r." "" "0.40" "" ...
#> ..$ X6: chr [1:10] "F pr." "" "" "" ...
#> ..- attr(*, "gen_head_major")= chr "Analysis of variance"
#> ..- attr(*, "gen_head_minor")= logi NA
#> ..- attr(*, "gen_table_id")= chr "gs-table-7"
#> $ :'data.frame': 6 obs. of 8 variables:
#> ..$ X1: chr [1:6] "Cultivar" "" "" "Cultivar" ...
#> ..$ X2: chr [1:6] "Blaze" "200.0" "" "Nelson" ...
#> ..$ X3: chr [1:6] "CAW722" "191.2" "" "Resolve" ...
#> ..$ X4: chr [1:6] "Encore" "211.2" "" "Sirus" ...
#> ..$ X5: chr [1:6] "Envy" "196.2" "" "Victor" ...
#> ..$ X6: chr [1:6] "Excess" "191.2" "" "Wanaka" ...
#> ..$ X7: chr [1:6] "Flame" "187.5" "" "WWT1695" ...
#> ..$ X8: chr [1:6] "Monarch" "198.8" "" "" ...
#> ..- attr(*, "gen_head_major")= chr "Tables of means"
#> ..- attr(*, "gen_head_minor")= logi NA
#> ..- attr(*, "gen_table_id")= chr "gs-table-9"
#> $ :'data.frame': 5 obs. of 3 variables:
#> ..$ X1: chr [1:5] "Table" "rep." "d.f." "l.s.d." ...
#> ..$ X2: chr [1:5] "Cultivar" "4" "36" "24.34" ...
#> ..$ X3: logi [1:5] NA NA NA NA NA
#> ..- attr(*, "gen_head_major")= chr "Tables of means"
#> ..- attr(*, "gen_head_minor")= chr "Least significant differences of means (5% level)"
#> ..- attr(*, "gen_table_id")= chr "gs-table-10"

As you can see, there are still a few unresolved issues here, in that
we have recovered the data, but it is not necessarily laid out in a
manner we might expect. For example, you might expect to be able to access the Mean
Squares in the ANOVA table using a column name of Mean.Sq or
something similar. We will return to this problem later.
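In the meantime, a saved table can be tidied by hand in R. The following is a minimal sketch, not a gsengine function: `tidy_genstat_table()` is a hypothetical helper, and `raw` is a hand-built stand-in for the ANOVA table (with the stratum row left out) so that the code runs without Genstat. It promotes the first row to column names, drops blank rows, and converts the remaining columns to numeric.

```r
# Hand-built stand-in for the saved ANOVA table, so no Genstat is needed.
rows <- list(
  c("Source of variation", "d.f.", "s.s.", "m.s.", "v.r.", "F pr."),
  c("", "", "", "", "", ""),
  c("Rep stratum", "3", "344.2", "114.7", "0.40", ""),
  c("Cultivar", "12", "4354.8", "362.9", "1.26", "0.283"),
  c("Residual", "36", "10368.3", "288.0", "", ""),
  c("Total", "51", "15067.3", "", "", "")
)
raw <- as.data.frame(do.call(rbind, rows), stringsAsFactors = FALSE)
names(raw) <- paste0("X", seq_along(raw))

# Hypothetical helper: first row becomes the header, blank rows are dropped,
# and every column except the first is converted to numeric.
tidy_genstat_table <- function(tbl) {
  header <- make.names(unlist(tbl[1, ], use.names = FALSE))
  body <- tbl[-1, , drop = FALSE]
  # Keep only rows with at least one non-empty, non-missing cell.
  keep <- apply(body, 1, function(r) any(!is.na(r) & nzchar(r)))
  body <- body[keep, , drop = FALSE]
  names(body) <- header
  # Empty strings become NA before the numeric conversion.
  body[-1] <- lapply(body[-1], function(x) as.numeric(ifelse(nzchar(x), x, NA)))
  rownames(body) <- NULL
  body
}

tidy <- tidy_genstat_table(raw)
tidy$m.s.
```

Rows that repeat a label across every column, such as the stratum row in the real output, are not handled here; they would be converted to NA with a warning and may need to be dropped first.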
We will repeat part of the last example, but using a chunk label. This is useful when you want to preserve tables from earlier chunks.
```{gs, label="wheat_anova", saveTables=TRUE}
IMPORT [PRINT=*] '%DATA%/WheatTrials.xlsx'; SHEET='Trial A1'
TREATMENTS Cultivar
BLOCKS Rep
ANOVA [FPROB=yes; PSE=LSD] Mean_Population_m2
```
There is a chunk label for this chunk, wheat_anova. In
this situation gsengine will return a list called
gs_tables_<label>, or specifically in this example
gs_tables_wheat_anova. This variable will last for all of
the knitr document because you cannot duplicate chunk labels.
gs_tables_wheat_anova[[1]]
#> X1 X2 X3
#> 1 Source of variation d.f. s.s.
#> 2
#> 3 Rep stratum 3 344.2
#> 4
#> 5 Rep.*Units* stratum Rep.*Units* stratum Rep.*Units* stratum
#> 6 Cultivar 12 4354.8
#> 7 Residual 36 10368.3
#> 8
#> 9 Total 51 15067.3
#> 10 <NA>
#> X4 X5 X6
#> 1 m.s. v.r. F pr.
#> 2
#> 3 114.7 0.40
#> 4
#> 5 Rep.*Units* stratum Rep.*Units* stratum Rep.*Units* stratum
#> 6 362.9 1.26 0.283
#> 7 288.0
#> 8
#> 9
#> 10 <NA> <NA> <NA>

There are two other variants. Firstly we can just specify the name of
the list. For example if we use saveTables="wheat_anova"
then we can refer to the results with the variable
wheat_anova. Be warned though that this is like every other
variable, and can be overwritten in future chunks.
```{gs, saveTables="wheat_anova"}
IMPORT [PRINT=*] '%DATA%/WheatTrials.xlsx'; SHEET='Trial A1'
TREATMENTS Cultivar
BLOCKS Rep
ANOVA [FPROB=yes; PSE=LSD] Mean_Population_m2
```
wheat_anova[[1]]
#> X1 X2 X3
#> 1 Source of variation d.f. s.s.
#> 2
#> 3 Rep stratum 3 344.2
#> 4
#> 5 Rep.*Units* stratum Rep.*Units* stratum Rep.*Units* stratum
#> 6 Cultivar 12 4354.8
#> 7 Residual 36 10368.3
#> 8
#> 9 Total 51 15067.3
#> 10 <NA>
#> X4 X5 X6
#> 1 m.s. v.r. F pr.
#> 2
#> 3 114.7 0.40
#> 4
#> 5 Rep.*Units* stratum Rep.*Units* stratum Rep.*Units* stratum
#> 6 362.9 1.26 0.283
#> 7 288.0
#> 8
#> 9
#> 10 <NA> <NA> <NA>

If there is only a single table in the output, then the variable will
not be a list of data.frames but rather a single data
frame. For example
```{gs, saveTables="foo"}
VARIATE [VALUES=1,2,3] Foo
PRINT Foo
```
Now we can access the variable foo directly as a single data frame.
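Because the saved object may be either a single data frame or a list of data frames, R code that consumes it may want to treat both cases uniformly. The sketch below is illustrative only: `as_table_list()` and `tables_with_heading()` are hypothetical helpers, and the objects are small stand-ins built by hand. The second helper relies on the `gen_head_major` attribute that appears in the `str()` output shown earlier.

```r
# Hypothetical helper: always return a list of data frames, whether the
# saved object is a single data frame or already a list of them.
as_table_list <- function(x) {
  if (is.data.frame(x)) list(x) else x
}

# Hypothetical helper: select tables by their "gen_head_major" attribute.
tables_with_heading <- function(x, heading) {
  tables <- as_table_list(x)
  Filter(function(tbl) identical(attr(tbl, "gen_head_major"), heading), tables)
}

# Stand-in objects so the sketch runs without Genstat.
single <- data.frame(X1 = c("Foo", "1.000", "2.000", "3.000"))
several <- list(
  structure(data.frame(X1 = "Source of variation"),
            gen_head_major = "Analysis of variance"),
  structure(data.frame(X1 = "Cultivar"),
            gen_head_major = "Tables of means")
)

length(as_table_list(single))
length(tables_with_heading(several, "Tables of means"))
```

The `is.data.frame()` check matters because a data frame is itself a list, so testing with `is.list()` alone would not distinguish the two cases.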
The final mode of use is that we can provide a list/vector of table names for the results.
```{gs, saveTables=c("anovatbl", "meanstbl")}
IMPORT [PRINT=*] '%DATA%/WheatTrials.xlsx'; SHEET='Trial A1'
TREATMENTS Cultivar
BLOCKS Rep
ANOVA [FPROB=yes; PSE=LSD] Mean_Population_m2
```
str(anovatbl)
#> 'data.frame': 10 obs. of 6 variables:
#> $ X1: chr "Source of variation" "" "Rep stratum" "" ...
#> $ X2: chr "d.f." "" "3" "" ...
#> $ X3: chr "s.s." "" "344.2" "" ...
#> $ X4: chr "m.s." "" "114.7" "" ...
#> $ X5: chr "v.r." "" "0.40" "" ...
#> $ X6: chr "F pr." "" "" "" ...
#> - attr(*, "gen_head_major")= chr "Analysis of variance"
#> - attr(*, "gen_head_minor")= logi NA
#> - attr(*, "gen_table_id")= chr "gs-table-20"
str(meanstbl)
#> 'data.frame': 6 obs. of 8 variables:
#> $ X1: chr "Cultivar" "" "" "Cultivar" ...
#> $ X2: chr "Blaze" "200.0" "" "Nelson" ...
#> $ X3: chr "CAW722" "191.2" "" "Resolve" ...
#> $ X4: chr "Encore" "211.2" "" "Sirus" ...
#> $ X5: chr "Envy" "196.2" "" "Victor" ...
#> $ X6: chr "Excess" "191.2" "" "Wanaka" ...
#> $ X7: chr "Flame" "187.5" "" "WWT1695" ...
#> $ X8: chr "Monarch" "198.8" "" "" ...
#> - attr(*, "gen_head_major")= chr "Tables of means"
#> - attr(*, "gen_head_minor")= logi NA
#> - attr(*, "gen_table_id")= chr "gs-table-22"

Whilst we can see all the information in an ANOVA table that comes from Genstat, it might be nice to convert this into something R understands as an ANOVA table.
anovatbl = gsengine:::genstatAnovaToAnova(anovatbl)
anovatbl
#> Analysis of Variance Table
#> Df Sum Sq Mean Sq F value Pr(>F)
#> Rep stratum 3 344.2 114.7 0.40
#> Cultivar 12 4354.8 362.9 1.26 0.283
#> Residual 36 10368.3 288.0
#> Total 51 15067.3

This is useful for more than just pretty printing. We can actually extract elements out of it to use in our text. For example we can get the \(P\)-value
And then we can embed that in our text by including
`r pval`
in our text. For example the R Markdown sentence
The *P*-value is `r pval`.
gets rendered into HTML as:
The P-value is 0.283.
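The assignment to `pval` referred to above could be written along the following lines. This is a sketch under assumptions: `anova_standin` is a plain data frame built by hand with the row and column names from the printed table, standing in for the converted `anovatbl`, which is not reproduced here.

```r
# Stand-in for the converted table, using the values printed above.
anova_standin <- data.frame(
  Df = c(3, 12, 36, 51),
  `Sum Sq` = c(344.2, 4354.8, 10368.3, 15067.3),
  `Mean Sq` = c(114.7, 362.9, 288.0, NA),
  `F value` = c(0.40, 1.26, NA, NA),
  `Pr(>F)` = c(NA, 0.283, NA, NA),
  row.names = c("Rep stratum", "Cultivar", "Residual", "Total"),
  check.names = FALSE
)

# Index by row and column name rather than by position, so the code keeps
# working if the order of the rows changes.
pval <- anova_standin["Cultivar", "Pr(>F)"]
pval
```

Setting `check.names = FALSE` keeps column names such as `Pr(>F)` exactly as they are printed, which is what makes the name-based lookup possible.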
If you experience connection issues:
Ensure your Genstat socket server is running and listening on the correct port.
Test using ‘telnet
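A similar reachability check can be made from within R. The sketch below is an illustration rather than part of gsengine: `can_reach()` is a hypothetical helper that tries to open a TCP connection with base R's `socketConnection()` and reports whether it succeeded. The default host and port are the ones quoted earlier in this document.

```r
# Hypothetical helper: TRUE if a TCP connection to host:port can be opened
# within `timeout` seconds, FALSE otherwise.
can_reach <- function(host = "localhost", port = 8085, timeout = 2) {
  tryCatch({
    con <- suppressWarnings(
      socketConnection(host = host, port = port, open = "r+",
                       blocking = TRUE, timeout = timeout)
    )
    close(con)
    TRUE
  }, error = function(e) FALSE)
}

can_reach("localhost", 8085)
```

A result of FALSE only says that nothing accepted the connection; it does not distinguish between a server that is not running, a wrong port, and a firewall in between.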
sessionInfo()
#> R version 4.5.1 (2025-06-13)
#> Platform: aarch64-apple-darwin20
#> Running under: macOS Tahoe 26.1
#>
#> Matrix products: default
#> BLAS: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRblas.0.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.1
#>
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#>
#> time zone: Pacific/Auckland
#> tzcode source: internal
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> loaded via a namespace (and not attached):
#> [1] vctrs_0.6.5 httr_1.4.7 cli_3.6.5 knitr_1.50
#> [5] rlang_1.1.6 xfun_0.54 jsonlite_2.0.0 glue_1.8.0
#> [9] gsengine_2.3-0 htmltools_0.5.8.1 sass_0.4.10 rmarkdown_2.30
#> [13] evaluate_1.0.5 jquerylib_0.1.4 tibble_3.3.0 fastmap_1.2.0
#> [17] base64enc_0.1-3 yaml_2.3.10 lifecycle_1.0.4 compiler_4.5.1
#> [21] rvest_1.0.5 pkgconfig_2.0.3 rstudioapi_0.17.1 digest_0.6.37
#> [25] R6_2.6.1 pillar_1.11.1 magrittr_2.0.4 bslib_0.9.0
#> [29] tools_4.5.1 xml2_1.4.1 cachem_1.1.0