Title: Calling C code from R: an introduction
1. Calling C code from R: an introduction
- Sigal Blay
- Dept. of Statistics and Actuarial Science
- Simon Fraser University
- October 2004
2. Motivation
- Speed
- Efficient memory management
- Using existing C libraries
3. The following functions provide a standard interface to compiled code that has been linked into R:
- .C
- .Call
- .External
4. We will explore using .C and .Call with 7 code examples.
Using .C:
I. Calling C with an integer vector
II. Calling C with different vector types
Using .Call:
III. Sending R integer vectors to C
IV. Sending R character vectors to C
V. Getting an integer vector from C
VI. Getting a character vector from C
VII. Getting a list from C
And lastly, tips on creating an R package with compiled code.
5. I. Calling C with an integer vector using .C
6.
    /* useC1.c */
    void useC(int *i) {
        i[0] = 11;
    }
The C function should be of type void. The compiled code should not return anything except through its arguments.
7. To compile the C code, type at the command prompt:
    R CMD SHLIB useC1.c
The compiled code file name is useC1.so.
8. In R:
    > dyn.load("useC1.so")
    > a <- 1:10   # integer vector
    > a
     [1]  1  2  3  4  5  6  7  8  9 10
    > out <- .C("useC", b = as.integer(a))
    > a
     [1]  1  2  3  4  5  6  7  8  9 10
    > out$b
     [1] 11  2  3  4  5  6  7  8  9 10
9.
- You have to allocate memory for the vectors passed to .C in R, by creating vectors of the right length.
- The first argument to .C is a character string giving the name of the C function.
- The remaining arguments are R objects to be passed to the C function.
10.
- All arguments should be coerced to the correct R storage mode to prevent type mismatches that can lead to errors.
- .C returns a list object.
- The second .C argument is given the name b. This name is used for the corresponding component of the returned list object (but is not passed to the compiled code).
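A C function called through .C receives only pointers, so it cannot tell how long the vector behind a pointer is. A common convention is to pass the length as an additional argument. The sketch below assumes a hypothetical function addTen (it is not part of the original slides) and is written as plain C so it compiles without R headers. From R it would presumably be called as .C("addTen", x = as.integer(a), n = as.integer(length(a))).

```c
/* addTen.c: hypothetical example, not from the original slides */

/* Every argument arrives as a pointer, including the scalar length n.
   The function returns void: results travel back through x. */
void addTen(int *x, int *n)
{
    int k;
    for (k = 0; k < *n; k++)
        x[k] += 10;
}
```

Passing the length explicitly keeps the C side from reading or writing past the end of the vector that R allocated.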
11. II. Calling C with different vector types using .C
12.
    /* useC2.c */
    void useC(int *i, double *d, char **c, int *l) {
        i[0] = 11;
        d[0] = 2.333;
        c[1] = "g";
        l[0] = 0;
    }
13. To compile the C code, type at the command prompt:
    R CMD SHLIB useC2.c
to get useC2.so. To compile more than one C file:
    R CMD SHLIB file1.c file2.c file3.c
to get file1.so.
14. In R:
    > dyn.load("useC2.so")
    > i <- 1:10   # integer vector
    > d <- seq(length=3, from=1, to=2)   # real number vector
    > c <- c("a", "b", "c")   # string vector
    > l <- c("TRUE", "FALSE")   # logical values as strings, coerced with as.logical() in the call
    > i
     [1]  1  2  3  4  5  6  7  8  9 10
    > d
    [1] 1.0 1.5 2.0
    > c
    [1] "a" "b" "c"
    > l
    [1] "TRUE"  "FALSE"
15.
    > out <- .C("useC", i1 = as.integer(i), d1 = as.numeric(d),
    +           c1 = as.character(c), l1 = as.logical(l))
    > out
    $i1
     [1] 11  2  3  4  5  6  7  8  9 10
    $d1
    [1] 2.333 1.500 2.000
    $c1
    [1] "a" "g" "c"
    $l1
    [1] FALSE FALSE
16.
- Other R objects can be passed to .C, but it is better to use one of the other interfaces.
- With .C, the R objects are copied before being passed to the C code, and copied again to an R list object when the compiled code returns.
- Neither .Call nor .External copies its arguments.
- You should treat arguments you receive through these interfaces as read-only.
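The copying done by .C explains why, in example I, a is unchanged after the call while out$b holds the modified values. The sketch below is a toy model in plain C, not R's actual implementation: callWithCopy is a hypothetical helper that copies the argument, hands the copy to the C function, and returns the copy.

```c
#include <stdlib.h>
#include <string.h>

/* Same shape as the useC function from example I */
void useC(int *i)
{
    i[0] = 11;
}

/* Toy stand-in for what .C does around the call (hypothetical helper,
   not R's implementation): copy the argument, pass the copy to fn,
   and return the copy. The caller must free() the result. */
int *callWithCopy(void (*fn)(int *), const int *arg, size_t len)
{
    int *copy = malloc(len * sizeof(int));
    if (copy == NULL)
        return NULL;
    memcpy(copy, arg, len * sizeof(int));
    fn(copy);
    return copy;
}
```

With .Call there is no such copy: the C code works on the R object itself, which is why the arguments should be treated as read-only.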
17. Advantages to using .Call() instead of .C() (posted by Prof Brian Ripley on R-help, Jun 2004):
1) A lot less copying.
2) The ability to dimension the answer in the C code.
3) Access to other types, e.g. expressions and the raw type, and the ability to easily execute R code (call_R is a pain).
4) Access to the attributes of the vectors, for example the names.
5) The ability to handle missing values easily.
18. III. Sending R integer vectors to C using .Call
19.
    /* useCall1.c */
    #include <R.h>
    #include <Rdefines.h>

    SEXP getInt(SEXP myint, SEXP myintVar) {
        int Imyint, n;   // declare integer variables
        int *Pmyint;     // pointer to an integer vector
        PROTECT(myint = AS_INTEGER(myint));
20.
- Rdefines.h is somewhat higher level than Rinternals.h, and is preferred if the code might be shared with S at any stage.
- SEXP stands for Simple EXPression.
- myint is of type SEXP, which is a general type, hence coercion to the right type is needed.
- R objects created in the C code have to be reported using the PROTECT macro on a pointer to the object. This tells R that the object is in use so it is not destroyed.
21.
        Imyint = INTEGER_POINTER(myint)[0];
        Pmyint = INTEGER_POINTER(myint);
        n = INTEGER_VALUE(myintVar);
        printf(" Printed from C: \n");
        printf(" Imyint: %d \n", Imyint);
        printf(" n: %d \n", n);
        printf(" Pmyint[0], Pmyint[1]: %d %d \n", Pmyint[0], Pmyint[1]);
        UNPROTECT(1);
        return(R_NilValue);
    }
22.
- The protection mechanism is stack-based, so UNPROTECT(n) unprotects the last n objects which were protected. The calls to PROTECT and UNPROTECT must balance when the user's code returns.
- To work with real numbers, replace int with double and INTEGER with NUMERIC.
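To make "stack-based" concrete, here is a toy model in plain C. It is only an illustration under simplifying assumptions, not R's implementation: toyProtect, toyUnprotect, toyDepth and toyIsProtected are hypothetical names, and the stack has a small fixed size.

```c
#include <stddef.h>

#define TOY_STACK_SIZE 100

static const void *toyStack[TOY_STACK_SIZE];
static int toyTop = 0;   /* number of pointers currently protected */

/* Push a pointer onto the stack; returns it so the call can wrap
   an expression. Pointers beyond the fixed capacity are ignored. */
const void *toyProtect(const void *obj)
{
    if (toyTop < TOY_STACK_SIZE)
        toyStack[toyTop++] = obj;
    return obj;
}

/* Pop the last n pointers, whichever objects they refer to. */
void toyUnprotect(int n)
{
    toyTop = (n > toyTop) ? 0 : toyTop - n;
}

/* Current stack depth; 0 means protect/unprotect calls are balanced. */
int toyDepth(void)
{
    return toyTop;
}

/* Returns 1 if obj is still on the stack, 0 otherwise. */
int toyIsProtected(const void *obj)
{
    int k;
    for (k = 0; k < toyTop; k++)
        if (toyStack[k] == obj)
            return 1;
    return 0;
}
```

In this model, protecting a, b and c and then unprotecting 2 releases c and b but leaves a protected. A function is balanced when the depth on return equals the depth on entry.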
23. In R:
    > dyn.load("useCall1.so")
    > myint <- c(1, 2, 3)
    > out <- .Call("getInt", myint, 5)
     Printed from C:
     Imyint: 1
     n: 5
     Pmyint[0], Pmyint[1]: 1 2
    > out
    NULL
24. IV. Reading an R character vector from C using .Call
25.
    /* useCall2.c */
    #include <R.h>
    #include <Rdefines.h>

    SEXP getChar(SEXP mychar) {
        char *Pmychar[5];   // array of 5 pointers to character strings
        PROTECT(mychar = AS_CHARACTER(mychar));
26.
        // allocate memory (one extra byte for the terminating '\0'):
        Pmychar[0] = R_alloc(strlen(CHAR(STRING_ELT(mychar, 0))) + 1, sizeof(char));
        Pmychar[1] = R_alloc(strlen(CHAR(STRING_ELT(mychar, 1))) + 1, sizeof(char));
        // ... and copy mychar to Pmychar:
        strcpy(Pmychar[0], CHAR(STRING_ELT(mychar, 0)));
        strcpy(Pmychar[1], CHAR(STRING_ELT(mychar, 1)));
        printf(" Printed from C:");
        printf(" %s %s \n", Pmychar[0], Pmychar[1]);
        UNPROTECT(1);
        return(R_NilValue);
    }
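A detail worth spelling out when copying strings: strlen does not count the terminating '\0', but strcpy writes it, so the destination buffer needs strlen + 1 bytes. The sketch below shows the pattern in plain C. copyString is a hypothetical helper, and it uses malloc instead of R_alloc so that it compiles without R headers. Memory from R_alloc is reclaimed by R when the .Call returns, whereas the malloc version must be freed by the caller.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a freshly allocated copy of src, or NULL
   if allocation fails. The caller must free() the result. */
char *copyString(const char *src)
{
    size_t len = strlen(src);      /* does not count the '\0' */
    char *dst = malloc(len + 1);   /* so reserve one extra byte */
    if (dst == NULL)
        return NULL;
    memcpy(dst, src, len + 1);     /* copies the terminator too */
    return dst;
}
```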
27. In R:
    > dyn.load("useCall2.so")
    > mychar <- c("do", "re", "mi", "fa", "so")
    > out <- .Call("getChar", mychar)
     Printed from C: do re
28. V. Getting an integer vector from C using .Call
29.
    /* useCall3.c */
    #include <R.h>
    #include <Rdefines.h>

    SEXP setInt() {
        SEXP myint;
        int *p_myint;
        int len = 5;
        // Allocating storage space:
        PROTECT(myint = NEW_INTEGER(len));
30.
        p_myint = INTEGER_POINTER(myint);
        p_myint[0] = 7;
        UNPROTECT(1);
        return myint;
    }
    // to work with real numbers, replace
    // int with double and INTEGER with NUMERIC
31. In R:
    > dyn.load("useCall3.so")
    > out <- .Call("setInt")
    > out
    [1] 7 0 0 0 0
32. VI. Getting a character vector from C using .Call
33.
    /* useCall4.c */
    #include <R.h>
    #include <Rdefines.h>

    SEXP setChar() {
        SEXP mychar;
        PROTECT(mychar = allocVector(STRSXP, 5));
        SET_STRING_ELT(mychar, 0, mkChar("A"));
        UNPROTECT(1);
        return mychar;
    }
34. In R:
    > dyn.load("useCall4.so")
    > out <- .Call("setChar")
    > out
    [1] "A" ""  ""  ""  ""
35. VII. Getting a list from C using .Call
36.
    /* useCall5.c */
    #include <R.h>
    #include <Rdefines.h>

    SEXP setList() {
        int *p_myint, i;
        double *p_double;
        SEXP mydouble, myint, list, list_names;
        char *names[2] = {"integer", "numeric"};
37.
        // creating an integer vector:
        PROTECT(myint = NEW_INTEGER(5));
        p_myint = INTEGER_POINTER(myint);
        // ... and a vector of real numbers:
        PROTECT(mydouble = NEW_NUMERIC(5));
        p_double = NUMERIC_POINTER(mydouble);
        for (i = 0; i < 5; i++) {
            p_double[i] = 1 / (double)(i + 1);
            p_myint[i] = i + 1;
        }
38.
        // Creating a character string vector
        // of the "names" attribute of the
        // objects in our list:
        PROTECT(list_names = allocVector(STRSXP, 2));
        for (i = 0; i < 2; i++)
            SET_STRING_ELT(list_names, i, mkChar(names[i]));
39.
        // Creating a list with 2 vector elements:
        PROTECT(list = allocVector(VECSXP, 2));
        // attaching myint vector to list:
        SET_VECTOR_ELT(list, 0, myint);
        // attaching mydouble vector to list:
        SET_VECTOR_ELT(list, 1, mydouble);
        // and attaching the vector names:
        setAttrib(list, R_NamesSymbol, list_names);
        UNPROTECT(4);
        return list;
    }
SET_VECTOR_ELT stands for Set Vector Element.
40. In R:
    > dyn.load("useCall5.so")
    > out <- .Call("setList")
    > out
    $integer
    [1] 1 2 3 4 5

    $numeric
    [1] 1.00000 0.50000 0.33333 0.25000 0.20000
41. If you are developing an R package:
- Copy useC.c to myPackage/src/.
- The user of the package should not have to load the compiled C code manually with dyn.load(), so add a zzz.R file to myPackage/R.
- zzz.R should contain the following code:
    .First.lib <- function(lib, pkg) {
        library.dynam("myPackage", pkg, lib)
    }
42. If you are developing an R package (cont.):
- Modify the .C call: after the argument list to the C function, add PACKAGE = "compiled_file". For example, if your compiled C code file name is useC1.so, type:
    .C("useC", b = as.integer(a), PACKAGE = "useC1")
- If you are using a Makefile, look at the output from R CMD SHLIB myfile.c for flags that you may need to incorporate in the Makefile.
43. Even if your R package passes 'R CMD check' perfectly:
- Try to compile your C code with 'gcc -pedantic -Wall' (you should get only warnings that you have reasons not to eliminate).
- Check the R code with 'R CMD check --use-gct' (it uses 'gctorture(TRUE)' when running examples/tests, and it's slow).
- If you don't, CRAN will do that for you and will send you back to the drawing board.
44. This work has been made possible by the
Statistical Genetics Working Group at the
Department of Statistics and Actuarial Science,
SFU.