Using gs.engine(): Genstat Socket Engine

Yuxiao Wang, James Curran, Simon Urbanek

2025-11-12

Overview

This vignette demonstrates how to use the gs.engine function provided by the gsengine package to run Genstat code chunks from within an R Markdown document.

It assumes:

Using gsengine with Genstat on your own machine

If you have Genstat installed locally, i.e. on the machine on which you are writing your R Markdown document, then you can create a knitr engine for interacting with Genstat by passing the result of the gs.engine function from the gsengine package to the knitr::knit_engines$set function. That is, you create an R chunk that calls knitr::knit_engines$set:

```{r}
knitr::knit_engines$set(gs = gsengine::gs.engine())
```

Note that there is nothing special about the name gs. All it does is tell R Markdown that a chunk should be sent to Genstat for interpretation. That is, if you specify a chunk with

```{gs}
```

then R Markdown knows to talk to the Genstat engine. We could have used the name Genstat, g, or any other valid variable name and it would work in the same way. We will use gs throughout this document.

Interacting with Genstat remotely

One of the cool things about how David Baird and Simon Urbanek wrote the Genstat Messenger client is that it allows you to have Genstat running remotely on any machine that is reachable over the network. You can configure the Genstat server host and port via environment variables:

```{r}
Sys.setenv(GENSTAT_HOST = "sc389508.UoA.auckland.ac.nz")
Sys.setenv(GENSTAT_PORT = "8085") 
knitr::knit_engines$set(gs = gsengine::gs.engine())
```

or you can specify the host and the port in the call to gs.engine:

```{r tidy=FALSE}
knitr::knit_engines$set(gs = gsengine::gs.engine(host = "sc389508.UoA.auckland.ac.nz", 
                                                 port = 8085))
```

Note that the default host is localhost and the default port is 8085, so we did not really need to specify the port.
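The fallback to these defaults can be pictured with a small base R sketch. The helper name resolve_genstat_endpoint is hypothetical and is not part of gsengine. It only illustrates reading GENSTAT_HOST and GENSTAT_PORT with the documented defaults, and it says nothing about how gs.engine itself resolves them.

```r
# Hypothetical helper (not part of gsengine): read the two environment
# variables and fall back to the documented defaults when they are unset.
resolve_genstat_endpoint <- function(
    host = Sys.getenv("GENSTAT_HOST", unset = "localhost"),
    port = Sys.getenv("GENSTAT_PORT", unset = "8085")) {
  list(host = host, port = as.integer(port))
}

# With neither variable set, the defaults are returned.
Sys.unsetenv(c("GENSTAT_HOST", "GENSTAT_PORT"))
endpoint <- resolve_genstat_endpoint()
str(endpoint)
```

Printing the resolved values in a setup chunk is a cheap way to confirm which server a document is about to talk to.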

Writing Genstat chunks

To run Genstat code in your R Markdown file, you need to mark your chunk with the same name you used when setting the engine for knitr. We used the name gs, which means we can mark Genstat chunks with

```{gs}
```

Doing some simple things with Genstat

Simple PRINT statements

We can ask Genstat to print something for us:

PRINT 1.0

1.000

This code will be sent to the Genstat server, and any output will be captured and rendered in the final document. That of course is very exciting. We can make it next level exciting by declaring a variable and printing it.

Example:

VARIATE [VALUES=1,2,3] Foo
PRINT Foo

Foo
1.000
2.000
3.000

More complex code execution

We are unlikely to want to use a programme with Genstat’s capabilities simply to print things out. It is much more likely that we want to do some statistical analysis and capture the output in our report. As an example we will analyze the built-in Wheat Trials data set. It contains information from a crop trial in New Zealand with 13 different cultivars, each replicated four times.

IMPORT [PRINT=*] '%DATA%/WheatTrials.xlsx'; SHEET='Trial A1'
TREATMENTS Cultivar
BLOCKS Rep
ANOVA [FPROB=yes; PSE=LSD] Mean_Population_m2

Analysis of variance

Source of variation     d.f.       s.s.      m.s.    v.r.   F pr.
Rep stratum                3      344.2     114.7    0.40
Rep.*Units* stratum
Cultivar                  12     4354.8     362.9    1.26   0.283
Residual                  36    10368.3     288.0
Total                     51    15067.3

Message: the following units have large residuals.
Rep 3 units 8   34.5 s.e. 14.1

Tables of means

Cultivar     Blaze   CAW722   Encore     Envy   Excess    Flame  Monarch
             200.0    191.2    211.2    196.2    191.2    187.5    198.8

Cultivar    Nelson  Resolve    Sirus   Victor   Wanaka  WWT1695
             181.2    183.8    183.8    181.2    195.0    206.2

Least significant differences of means (5% level)

Table     Cultivar
rep.             4
d.f.            36
l.s.d.       24.34

Displaying Genstat Graphics

Any reproducible research of course needs to be able to reproduce statistical graphics and, if you are working with Genstat, then it is highly likely you will want to be able to use Genstat’s graphics system.

Calls to Genstat’s graphing procedures result in PNG images, which are displayed directly in the output document. These images are saved locally with filenames like ‘img_1.png’, ‘img_2.png’, etc.

Example:

DGRAPH !(1,2,3,4,5); !(10,15,7,20,12)
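Because the images are saved as numbered PNG files, it can be handy to list them from R in the order they were produced. The sketch below is an illustration only. The helper list_genstat_images is hypothetical, and the directory the files are written to is an assumption that you pass in yourself. Only the ‘img_<n>.png’ naming pattern comes from the text above.

```r
# Hypothetical helper (not part of gsengine): list files named img_<n>.png
# in a directory and sort them by their numeric index, so that img_10.png
# comes after img_2.png.
list_genstat_images <- function(dir = ".") {
  files <- list.files(dir, pattern = "^img_[0-9]+\\.png$")
  index <- as.integer(sub("^img_([0-9]+)\\.png$", "\\1", files))
  files[order(index)]
}

# Demonstration on a temporary directory with empty placeholder files.
demo_dir <- tempfile("gs_images_")
dir.create(demo_dir)
invisible(file.create(file.path(demo_dir, c("img_10.png", "img_2.png", "img_1.png", "notes.txt"))))
list_genstat_images(demo_dir)
```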

Chunk options

Readers who are familiar with R Markdown will be aware that chunk options can be used to control how R Markdown behaves. For example, there are options to suppress evaluation, suppress code, change plot dimensions and so on. Many of these will function in the same way for Genstat code, but some may not. We give examples of some common options here, and then provide examples of some which are specific to the gsengine package.

Displaying Genstat code without evaluation

If you want to display the Genstat code without evaluating it, for example for instructive purposes, then the eval chunk option works as it does for R chunks. That is, set eval=FALSE in the chunk header:

```{gs, eval=FALSE}
```

Example:

```{gs, eval=FALSE}
PRINT 'This should not run.'
```

Hiding Genstat code

By default, gs.engine shows the Genstat commands submitted in each chunk. There will be times when you want to show the results of an analysis in Genstat, but do not necessarily need the reader to see the Genstat commands. To hide Genstat commands in the rendered output, we use the echo=FALSE option in the chunk header, in the same way as we do for R chunks:

```{gs, echo=FALSE}
```

Example:

```{gs, echo=FALSE}
PRINT 'This should run.'
```

This will include only the Genstat results or tables in the final document.

Saving Genstat tables for further manipulation

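A lot of Genstat output is in tabular format. Equally, it is very common in reproducible research to want to include figures from tables in the text, for example a sum-of-squares value or an \(F\)-statistic. The current implementation in gsengine allows for three different modes of table retrieval, all controlled by the saveTables chunk option. The package can save every table from a chunk into an automatically named list (saveTables=TRUE), save them under a single name that you supply, or save individual tables under a vector of names that you supply.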

Here are a few examples.

```{gs, saveTables=TRUE}
IMPORT [PRINT=*] '%DATA%/WheatTrials.xlsx'; SHEET='Trial A1'
TREATMENTS Cultivar
BLOCKS Rep
ANOVA [FPROB=yes; PSE=LSD] Mean_Population_m2
```

Note: we will hide the results of the Genstat code here because you have already seen them several times, and it is both repetitious and unhelpful to display them again and again.

Notice that in this example there is no label for the chunk. In this situation gsengine will return a list called gs_tables_last. This will, of course, get overwritten if you ask to save the tables in a subsequent chunk.

str(gs_tables_last)
#> List of 3
#>  $ :'data.frame':    10 obs. of  6 variables:
#>   ..$ X1: chr [1:10] "Source of variation" "" "Rep stratum" "" ...
#>   ..$ X2: chr [1:10] "d.f." "" "3" "" ...
#>   ..$ X3: chr [1:10] "s.s." "" "344.2" "" ...
#>   ..$ X4: chr [1:10] "m.s." "" "114.7" "" ...
#>   ..$ X5: chr [1:10] "v.r." "" "0.40" "" ...
#>   ..$ X6: chr [1:10] "F pr." "" "" "" ...
#>   ..- attr(*, "gen_head_major")= chr "Analysis of variance"
#>   ..- attr(*, "gen_head_minor")= logi NA
#>   ..- attr(*, "gen_table_id")= chr "gs-table-7"
#>  $ :'data.frame':    6 obs. of  8 variables:
#>   ..$ X1: chr [1:6] "Cultivar" "" "" "Cultivar" ...
#>   ..$ X2: chr [1:6] "Blaze" "200.0" "" "Nelson" ...
#>   ..$ X3: chr [1:6] "CAW722" "191.2" "" "Resolve" ...
#>   ..$ X4: chr [1:6] "Encore" "211.2" "" "Sirus" ...
#>   ..$ X5: chr [1:6] "Envy" "196.2" "" "Victor" ...
#>   ..$ X6: chr [1:6] "Excess" "191.2" "" "Wanaka" ...
#>   ..$ X7: chr [1:6] "Flame" "187.5" "" "WWT1695" ...
#>   ..$ X8: chr [1:6] "Monarch" "198.8" "" "" ...
#>   ..- attr(*, "gen_head_major")= chr "Tables of means"
#>   ..- attr(*, "gen_head_minor")= logi NA
#>   ..- attr(*, "gen_table_id")= chr "gs-table-9"
#>  $ :'data.frame':    5 obs. of  3 variables:
#>   ..$ X1: chr [1:5] "Table" "rep." "d.f." "l.s.d." ...
#>   ..$ X2: chr [1:5] "Cultivar" "4" "36" "24.34" ...
#>   ..$ X3: logi [1:5] NA NA NA NA NA
#>   ..- attr(*, "gen_head_major")= chr "Tables of means"
#>   ..- attr(*, "gen_head_minor")= chr "Least significant differences of means (5% level)"
#>   ..- attr(*, "gen_table_id")= chr "gs-table-10"

As you can see, there are still a few unresolved issues here: we have recovered the data, but it is not necessarily laid out in the manner we might expect. For example, you might expect to be able to access the mean squares in the ANOVA table using a column name of Mean.Sq or something similar. We will return to this problem later.
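In the meantime, the raw layout can be tidied by hand. The following is a minimal base R sketch, not something gsengine provides. The helper tidy_genstat_table is hypothetical, and the data frame raw is a hand-built, trimmed stand-in that mirrors the first four columns of the ANOVA table shown above. It promotes the first row to column names, drops empty rows, and converts the remaining columns to numeric.

```r
# Hand-built stand-in mirroring part of the structure printed above.
raw <- data.frame(
  X1 = c("Source of variation", "", "Rep stratum", "Cultivar", "Residual", "Total"),
  X2 = c("d.f.", "", "3", "12", "36", "51"),
  X3 = c("s.s.", "", "344.2", "4354.8", "10368.3", "15067.3"),
  X4 = c("m.s.", "", "114.7", "362.9", "288.0", ""),
  stringsAsFactors = FALSE
)

# Hypothetical helper: use the first row as the header and tidy the rest.
tidy_genstat_table <- function(tbl) {
  header <- unlist(tbl[1, ], use.names = FALSE)
  body <- tbl[-1, , drop = FALSE]
  # Keep only rows that contain at least one non-empty, non-missing cell.
  keep <- apply(body, 1, function(row) any(!is.na(row) & nzchar(row)))
  body <- body[keep, , drop = FALSE]
  names(body) <- make.names(header)
  # Every column except the first holds numbers stored as text;
  # empty cells become NA.
  body[-1] <- lapply(body[-1], function(x) suppressWarnings(as.numeric(x)))
  rownames(body) <- NULL
  body
}

tidy <- tidy_genstat_table(raw)
tidy$m.s.[tidy$Source.of.variation == "Cultivar"]
```

Note that make.names turns "Source of variation" into Source.of.variation and leaves d.f., s.s. and m.s. unchanged, so the columns can be reached with the usual $ syntax.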

We will repeat part of the last example, but using a chunk label. This is useful when you want to preserve tables from earlier chunks.

```{gs, label="wheat_anova", saveTables=TRUE}
IMPORT [PRINT=*] '%DATA%/WheatTrials.xlsx'; SHEET='Trial A1'
TREATMENTS Cultivar
BLOCKS Rep
ANOVA [FPROB=yes; PSE=LSD] Mean_Population_m2
```

There is a chunk label for this chunk, wheat_anova. In this situation gsengine will return a list called gs_tables_<label>, or specifically in this example gs_tables_wheat_anova. This variable will last for all of the knitr document because you cannot duplicate chunk labels.

gs_tables_wheat_anova[[1]]
#>                     X1                  X2                  X3
#> 1  Source of variation                d.f.                s.s.
#> 2                                                             
#> 3          Rep stratum                   3               344.2
#> 4                                                             
#> 5  Rep.*Units* stratum Rep.*Units* stratum Rep.*Units* stratum
#> 6             Cultivar                  12              4354.8
#> 7             Residual                  36             10368.3
#> 8                                                             
#> 9                Total                  51             15067.3
#> 10                                                        <NA>
#>                     X4                  X5                  X6
#> 1                 m.s.                v.r.               F pr.
#> 2                                                             
#> 3                114.7                0.40                    
#> 4                                                             
#> 5  Rep.*Units* stratum Rep.*Units* stratum Rep.*Units* stratum
#> 6                362.9                1.26               0.283
#> 7                288.0                                        
#> 8                                                             
#> 9                                                             
#> 10                <NA>                <NA>                <NA>
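The naming rule described so far (gs_tables_last without a label, gs_tables_<label> with one) can be wrapped in a small lookup helper. This is a sketch only. Both function names are hypothetical, the default of looking in the global environment is an assumption about where the variables are created, and no attempt is made to guess how unusual characters in a label are handled.

```r
# Hypothetical helpers (not part of gsengine).
# Build the variable name that the text above describes.
saved_tables_name <- function(label = NULL) {
  if (is.null(label) || !nzchar(label)) {
    "gs_tables_last"
  } else {
    paste0("gs_tables_", label)
  }
}

# Fetch the saved tables, failing with a clear message if they are missing.
# ASSUMPTION: the variables live in `envir`; globalenv() is only a guess.
get_saved_tables <- function(label = NULL, envir = globalenv()) {
  name <- saved_tables_name(label)
  if (!exists(name, envir = envir, inherits = FALSE)) {
    stop("No saved tables named '", name, "'", call. = FALSE)
  }
  get(name, envir = envir, inherits = FALSE)
}

saved_tables_name("wheat_anova")
```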

There are two other variants. Firstly, we can just specify the name of the list. For example, if we use saveTables="wheat_anova" then we can refer to the results with the variable wheat_anova. Be warned, though, that this is like every other variable and can be overwritten in future chunks.

```{gs, saveTables="wheat_anova"}
IMPORT [PRINT=*] '%DATA%/WheatTrials.xlsx'; SHEET='Trial A1'
TREATMENTS Cultivar
BLOCKS Rep
ANOVA [FPROB=yes; PSE=LSD] Mean_Population_m2
```

wheat_anova[[1]]
#>                     X1                  X2                  X3
#> 1  Source of variation                d.f.                s.s.
#> 2                                                             
#> 3          Rep stratum                   3               344.2
#> 4                                                             
#> 5  Rep.*Units* stratum Rep.*Units* stratum Rep.*Units* stratum
#> 6             Cultivar                  12              4354.8
#> 7             Residual                  36             10368.3
#> 8                                                             
#> 9                Total                  51             15067.3
#> 10                                                        <NA>
#>                     X4                  X5                  X6
#> 1                 m.s.                v.r.               F pr.
#> 2                                                             
#> 3                114.7                0.40                    
#> 4                                                             
#> 5  Rep.*Units* stratum Rep.*Units* stratum Rep.*Units* stratum
#> 6                362.9                1.26               0.283
#> 7                288.0                                        
#> 8                                                             
#> 9                                                             
#> 10                <NA>                <NA>                <NA>

If there is only a single table in the output, then the variable will not be a list of data frames but rather a single data frame. For example:

```{gs, saveTables="foo"}
VARIATE [VALUES=1,2,3] Foo
PRINT Foo
```

Now we can access the variable foo:

foo
#>      X1 X2
#> 1   Foo NA
#> 2 1.000 NA
#> 3 2.000 NA
#> 4 3.000 NA
#> 5       NA
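The values come back as text, with the variate name in the first row and an empty trailing row. A minimal sketch of recovering a numeric vector follows. The data frame is a hand-built copy of the output printed above, so the example runs without a Genstat server.

```r
# Hand-built copy of the data frame printed above.
foo <- data.frame(
  X1 = c("Foo", "1.000", "2.000", "3.000", ""),
  X2 = NA,
  stringsAsFactors = FALSE
)

# Drop the header row, discard empty cells, then convert to numeric.
cells <- foo$X1[-1]
foo_values <- as.numeric(cells[nzchar(cells)])
foo_values
```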

The final mode of use is that we can provide a list/vector of table names for the results.

```{gs, saveTables=c("anovatbl", "meanstbl")}
IMPORT [PRINT=*] '%DATA%/WheatTrials.xlsx'; SHEET='Trial A1'
TREATMENTS Cultivar
BLOCKS Rep
ANOVA [FPROB=yes; PSE=LSD] Mean_Population_m2
```

str(anovatbl)
#> 'data.frame':    10 obs. of  6 variables:
#>  $ X1: chr  "Source of variation" "" "Rep stratum" "" ...
#>  $ X2: chr  "d.f." "" "3" "" ...
#>  $ X3: chr  "s.s." "" "344.2" "" ...
#>  $ X4: chr  "m.s." "" "114.7" "" ...
#>  $ X5: chr  "v.r." "" "0.40" "" ...
#>  $ X6: chr  "F pr." "" "" "" ...
#>  - attr(*, "gen_head_major")= chr "Analysis of variance"
#>  - attr(*, "gen_head_minor")= logi NA
#>  - attr(*, "gen_table_id")= chr "gs-table-20"
str(meanstbl)
#> 'data.frame':    6 obs. of  8 variables:
#>  $ X1: chr  "Cultivar" "" "" "Cultivar" ...
#>  $ X2: chr  "Blaze" "200.0" "" "Nelson" ...
#>  $ X3: chr  "CAW722" "191.2" "" "Resolve" ...
#>  $ X4: chr  "Encore" "211.2" "" "Sirus" ...
#>  $ X5: chr  "Envy" "196.2" "" "Victor" ...
#>  $ X6: chr  "Excess" "191.2" "" "Wanaka" ...
#>  $ X7: chr  "Flame" "187.5" "" "WWT1695" ...
#>  $ X8: chr  "Monarch" "198.8" "" "" ...
#>  - attr(*, "gen_head_major")= chr "Tables of means"
#>  - attr(*, "gen_head_minor")= logi NA
#>  - attr(*, "gen_table_id")= chr "gs-table-22"
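The gen_head_major and gen_head_minor attributes visible in the str() output can also be used to pick tables out of a saved list by heading rather than by position. The sketch below assumes only those attribute names, which appear in the output above. The helper tables_with_heading is hypothetical, and the list is a hand-built stand-in so the code runs on its own.

```r
# Hand-built stand-in for a list of saved tables, carrying the same
# attribute names that appear in the str() output above.
tables <- list(
  structure(data.frame(X1 = "a"), gen_head_major = "Analysis of variance",
            gen_head_minor = NA),
  structure(data.frame(X1 = "b"), gen_head_major = "Tables of means",
            gen_head_minor = NA),
  structure(data.frame(X1 = "c"), gen_head_major = "Tables of means",
            gen_head_minor = "Least significant differences of means (5% level)")
)

# Hypothetical helper: keep the tables whose major heading matches exactly.
tables_with_heading <- function(tables, heading) {
  majors <- vapply(tables, function(tbl) {
    h <- attr(tbl, "gen_head_major", exact = TRUE)
    if (is.null(h) || is.na(h[1])) NA_character_ else as.character(h[1])
  }, character(1))
  tables[!is.na(majors) & majors == heading]
}

length(tables_with_heading(tables, "Tables of means"))
```

Selecting by heading is less fragile than indexing by position when a chunk is later edited to produce more or fewer tables.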

One last piece of magic

Whilst we can see all the information in an ANOVA table that comes from Genstat, it might be nice to convert this into something R understands as an ANOVA table.

anovatbl = gsengine:::genstatAnovaToAnova(anovatbl)
anovatbl
#> Analysis of Variance Table
#>             Df  Sum Sq Mean Sq F value Pr(>F)
#> Rep stratum  3   344.2   114.7    0.40       
#> Cultivar    12  4354.8   362.9    1.26  0.283
#> Residual    36 10368.3   288.0               
#> Total       51 15067.3

This is useful for more than just pretty printing. We can actually extract elements from it to use in our text. For example, we can get the \(P\)-value:

pval = anovatbl$'Pr(>F)'[2] ## We choose the second element, because this is where it is stored

And then we can embed that in our text by including

`r pval`

in our text. For example the R Markdown sentence

The *P*-value is `r pval`.

gets rendered into HTML as:

The P-value is 0.283.
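Inline values are printed however R happens to format them, so it can be worth controlling the rounding explicitly. The following is a small sketch in base R. The helper fmt_p is hypothetical, and the value of pval is typed in by hand from the table above so the example runs on its own.

```r
# Hypothetical helper: format a P-value to a fixed number of decimals,
# reporting very small values as "< 0.001" (for digits = 3).
fmt_p <- function(p, digits = 3) {
  threshold <- 10^(-digits)
  if (p < threshold) {
    paste0("< ", formatC(threshold, format = "f", digits = digits))
  } else {
    formatC(p, format = "f", digits = digits)
  }
}

pval <- 0.283  # typed in from the ANOVA table above
fmt_p(pval)
```

In the document text this would be used as `r fmt_p(pval)` in place of `r pval`.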

Troubleshooting

If you experience connection issues:

  1. Ensure your Genstat socket server is running and listening on the correct port.

  2. Test the connection using ‘telnet’ or a similar tool.
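The connection can also be checked from R itself. The sketch below uses base R's socketConnection and only tests whether a TCP connection can be opened. It does not speak the Genstat protocol. The helper can_connect is hypothetical, and the defaults are the localhost and 8085 values mentioned earlier.

```r
# Hypothetical helper (not part of gsengine): TRUE if a TCP connection to
# host:port can be opened within `timeout` seconds, FALSE otherwise.
can_connect <- function(host = "localhost", port = 8085, timeout = 2) {
  con <- tryCatch(
    suppressWarnings(
      socketConnection(host = host, port = port, open = "r+",
                       blocking = TRUE, timeout = timeout)
    ),
    error = function(e) NULL
  )
  if (is.null(con)) {
    return(FALSE)
  }
  close(con)
  TRUE
}

can_connect()
```

A result of FALSE suggests that the server is not running, is listening on a different port, or is blocked by a firewall.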

Session Info

sessionInfo()
#> R version 4.5.1 (2025-06-13)
#> Platform: aarch64-apple-darwin20
#> Running under: macOS Tahoe 26.1
#> 
#> Matrix products: default
#> BLAS:   /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRblas.0.dylib 
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.1
#> 
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#> 
#> time zone: Pacific/Auckland
#> tzcode source: internal
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> loaded via a namespace (and not attached):
#>  [1] vctrs_0.6.5       httr_1.4.7        cli_3.6.5         knitr_1.50       
#>  [5] rlang_1.1.6       xfun_0.54         jsonlite_2.0.0    glue_1.8.0       
#>  [9] gsengine_2.3-0    htmltools_0.5.8.1 sass_0.4.10       rmarkdown_2.30   
#> [13] evaluate_1.0.5    jquerylib_0.1.4   tibble_3.3.0      fastmap_1.2.0    
#> [17] base64enc_0.1-3   yaml_2.3.10       lifecycle_1.0.4   compiler_4.5.1   
#> [21] rvest_1.0.5       pkgconfig_2.0.3   rstudioapi_0.17.1 digest_0.6.37    
#> [25] R6_2.6.1          pillar_1.11.1     magrittr_2.0.4    bslib_0.9.0      
#> [29] tools_4.5.1       xml2_1.4.1        cachem_1.1.0