The recently added RcppArmadillo::sample() functionality provides the same
algorithm used in R’s sample() to Rcpp-level code. Because R’s own sample()
is written in C with minimal work done in R, writing a wrapper around
RcppArmadillo::sample() to then call in R won’t get you much of a performance
boost. However, if you need to repeatedly call sample(), then calling a single
function which performs everything in Rcpp-land (including multiple calls to
sample()) before returning to R can produce a noticeable speedup over a purely
R-based solution.

Accept-Reject Sampler Example

One place where this situation arises is in an accept-reject sampler where the
candidate “draw” is the output of a call to sample(). Concretely, let’s
suppose we want to sample 20 integers (without replacement) from 1 to 50 such
that the sum of the 20 integers is less than 400. Far fewer than 10% of randomly
drawn samples will meet this constraint.
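A rough check on that figure, using a normal approximation (an assumption added here for illustration, not an exact calculation): let $S$ be the sum of $n = 20$ draws without replacement from $1, \dots, N$ with $N = 50$.

```latex
\begin{aligned}
\mathbb{E}[S] &= \frac{n(N+1)}{2} = \frac{20 \cdot 51}{2} = 510, \\
\operatorname{Var}[S] &= \frac{n(N+1)(N-n)}{12} = \frac{20 \cdot 51 \cdot 30}{12} = 2550, \\
\sqrt{\operatorname{Var}[S]} &\approx 50.5, \\
P(S < 400) &\approx \Phi\!\left(\frac{399.5 - 510}{50.5}\right) \approx \Phi(-2.19) \approx 0.014.
\end{aligned}
```

Under this approximation the acceptance rate is on the order of 1 to 2%, so the sampler should expect to propose several dozen candidates for every draw it keeps.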

The examples use the RcppArmadillo, Rcpp, and rbenchmark packages.

The R code is straightforward. It has been written to mirror the logic of the
C++ code, which costs it little in performance.

Although it is a bit longer, the logic of the C++ code is similar.
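As a minimal sketch of that logic, here is an accept-reject loop in standard C++. This is not the post's code: `std::shuffle` stands in for RcppArmadillo::sample() so the block compiles without R, and the names `draw_without_replacement` and `accept_reject` are hypothetical. It will not reproduce R's random number stream.

```cpp
#include <algorithm>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

// Draw `size` distinct integers from 1..n by shuffling the full range and
// keeping a prefix. ASSUMPTION: a stand-in for RcppArmadillo::sample(),
// not the same algorithm or RNG stream.
std::vector<int> draw_without_replacement(int n, int size, std::mt19937& rng) {
    std::vector<int> pool(n);
    std::iota(pool.begin(), pool.end(), 1);        // fill with 1, 2, ..., n
    std::shuffle(pool.begin(), pool.end(), rng);
    pool.resize(size);                             // keep the first `size` values
    return pool;
}

// Keep proposing candidates until `n_keep` of them have a sum below `limit`.
// The whole loop stays in C++, so there is no per-candidate round trip to R.
std::vector<std::vector<int>> accept_reject(int n_keep, int n, int size,
                                            int limit, std::mt19937& rng) {
    std::vector<std::vector<int>> accepted;
    accepted.reserve(n_keep);
    while (static_cast<int>(accepted.size()) < n_keep) {
        std::vector<int> candidate = draw_without_replacement(n, size, rng);
        int total = std::accumulate(candidate.begin(), candidate.end(), 0);
        if (total < limit) {
            accepted.push_back(std::move(candidate));   // accept
        }                                               // otherwise reject and retry
    }
    return accepted;
}
```

Calling `accept_reject(10, 50, 20, 400, rng)` with a seeded `std::mt19937` would return ten accepted draws for the problem described above. In an Rcpp version, the draw would presumably come from RcppArmadillo::sample() and the function would be exported to R.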

Performance

The Rcpp code tends to be about 7-9 times faster (roughly 12 times in the run
shown below), and this advantage grows as the constraint becomes more
complicated (and necessarily more costly in R).

  test replications relative elapsed
2  cpp           10     1.00   0.036
1    r           10    11.97   0.431

In the Real World …

Where might the structure of this problem arise in practice? One set of
instances is those where “space” matters:

sampling US cities such that no more than two are in any one state

sampling cellphone towers such that no two are closer than X miles apart

sampling nodes in a graph/network such that no sampled node has more than K edges

In these situations, R code to check the acceptance condition will likely be
less efficient than the corresponding C++ code, so even larger speed-ups
can be expected.
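To make the first of these cases concrete, here is a hedged sketch of the acceptance check for "no more than two per state", again in standard C++ with `std::shuffle` standing in for RcppArmadillo::sample(). The function names and the `group` vector (one state label per city) are hypothetical, introduced only for illustration.

```cpp
#include <algorithm>
#include <numeric>
#include <random>
#include <string>
#include <unordered_map>
#include <vector>

// True when no group label occurs more than `max_per_group` times among the
// picked indices. Returns early on the first violation.
bool within_group_limit(const std::vector<int>& picks,
                        const std::vector<std::string>& group,
                        int max_per_group) {
    std::unordered_map<std::string, int> counts;
    for (int idx : picks) {
        if (++counts[group[idx]] > max_per_group) return false;
    }
    return true;
}

// Accept-reject: reshuffle the indices until the first `size` of them pass
// the group check. ASSUMPTIONS: size <= group.size() and the constraint is
// feasible; otherwise this loop never terminates.
std::vector<int> sample_with_group_limit(const std::vector<std::string>& group,
                                         int size, int max_per_group,
                                         std::mt19937& rng) {
    std::vector<int> idx(group.size());
    std::iota(idx.begin(), idx.end(), 0);          // 0-based positions
    while (true) {
        std::shuffle(idx.begin(), idx.end(), rng);
        std::vector<int> picks(idx.begin(), idx.begin() + size);
        if (within_group_limit(picks, group, max_per_group)) return picks;
    }
}
```

The check exits on the first violation and needs only one hash-map update per sampled item. Distance or edge-count constraints would replace `within_group_limit` with a different predicate while leaving the loop unchanged.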