How can you implement F#’s forward pipe operator in R? The operator makes it possible to easily chain a sequence of calculations. For example, when you have an input data and want to call functions foo and bar in sequence, you can write:
data |> foo |> bar
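To make the question concrete, here is a minimal sketch of what such an operator could look like in base R. The operator name %|>% and the bodies of foo and bar are assumptions made purely for illustration; this is not the answer that was posted to the question.

```r
# Hypothetical infix operator (the name %|>% is an assumption for this sketch).
# It takes a value on the left and a function on the right,
# and calls the function on the value.
`%|>%` <- function(lhs, rhs) rhs(lhs)

foo <- function(x) x + 1   # placeholder for the question's foo
bar <- function(x) x * 2   # placeholder for the question's bar
data <- c(1, 2, 3)

# User-defined %...% operators associate left to right,
# so this is evaluated as bar(foo(data)).
result <- data %|>% foo %|>% bar
```

A sketch this small only handles bare function names on the right-hand side; passing extra arguments to foo or bar would need more machinery.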
Nine months later, on October 28th, 2012, Hadley Wickham started the dplyr project on GitHub as an evolution of his data analysis package plyr (initially the package was indeed called ‘plyr2’). As he showed in three consecutive presentations of dplyr during summer 2013 in Dublin, Albacete, and London, ddply, a function from data frames to data frames following the Split-Apply-Combine philosophy, was the most popular function in plyr. dplyr was designed to focus on data frames as well, but with the aim of being more efficient than plyr. The initial functions of dplyr in 2012 included arrange, mutate, summarise, and subset, but the package soon evolved toward its main verbs:
- select: subset variables
- filter: subset rows
- mutate: add new columns
- summarise: reduce to a single row
- arrange: re-order the rows
Then, almost one year later, on October 9th, 2013, the first pipe appeared in dplyr. It took the form of a function named chain, but the package also introduced its first pipe operator: %.%. The idea behind chain was to simplify the notation for applying several functions to a data frame. Without the chain function, the verbs have to be read from the inside out.
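As an illustration of the inside-out reading, here is a small sketch using current dplyr verbs on the built-in mtcars data. The dataset, the column names, and the particular sequence of verbs are assumptions chosen for the example, not the code from the original post.

```r
library(dplyr)

# Nested form: the first verb applied (filter) is the innermost call,
# and the last one applied (arrange) is the outermost.
nested <- arrange(
  summarise(
    group_by(
      filter(mtcars, hp > 100),
      cyl
    ),
    mean_mpg = mean(mpg)
  ),
  desc(mean_mpg)
)

# The same computation written one step at a time, in the order
# in which the verbs are actually applied.
step1 <- filter(mtcars, hp > 100)
step2 <- group_by(step1, cyl)
step3 <- summarise(step2, mean_mpg = mean(mpg))
stepwise <- arrange(step3, desc(mean_mpg))
```

Both versions produce the same summary, but the nested one has to be read from the middle outwards, while the stepwise one needs three throwaway variables. A pipe aims to keep the reading order of the second form without the intermediate names.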
With the chain function, the same verbs are written in the order in which they are applied, and the %.% operator expresses the same chain in infix form.
The %.% pipe would not stay in the dplyr package for long. On December 29th, 2013, Stefan Bache revisited the old Stack Overflow question, proposing an alternative to the original answer that allows functions to be chained.
Stefan continued working on this pipe operator, and on December 30th, 2013, he published on GitHub the plumbr package, which included the %>% operator. Two days later, plumbr was renamed magrittr, its current name, in a clear reference to the famous painting “The Treachery of Images” by the Belgian painter René Magritte.
The dplyr package was being developed in parallel, but the two developments were intended to converge. On March 19th, 2014, the chain function was deprecated in dplyr, and finally, on April 14th, 2014, dplyr incorporated the %>% operator of magrittr, recommending it as a replacement for the original %.% because the former is easier to type while holding down the Shift key. Both operators are still available in dplyr, although %.% was deprecated on August 1st, 2014.
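For readers who have not used it, a minimal sketch of the magrittr operator follows. The input values and the choice of functions are assumptions made for illustration only.

```r
library(magrittr)

# x %>% f(y) is evaluated as f(x, y): the left-hand side
# becomes the first argument of the call on the right.
root_total <- c(1, 4, 9) %>% sqrt() %>% sum()          # same as sum(sqrt(c(1, 4, 9)))

# The same idea on a data frame, chaining base R functions.
n_four_cyl <- mtcars %>% subset(cyl == 4) %>% nrow()   # same as nrow(subset(mtcars, cyl == 4))
```

Each step reads left to right in the order it is applied, which is the property that made the operator attractive for chaining dplyr verbs.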
Two weeks later, on August 14th, 2014, RStudio IDE version v0.98.1028 added a keyboard shortcut for the magrittr pipe operator %>% to make it even easier to use (Shift+Alt+.), although it is possible that in the near future the shortcut will be changed to Ctrl+Shift+M.
The latest iteration of the pipe implementation in R started on April 7th, 2014, when Kun Ren published on GitHub the pipeR package, incorporating a different pipe operator, %>>%, to add more flexibility to the piping process.
The package webpage gives examples of the operator's several uses:
- As first argument to a function:
- As argument in an expression (Using
- By using a formula (To avoid confusion with
- To save intermediate results (using
- Or to extract element by names (using
Pipes in R are here to stay, and they completely change the way we code in R, making it simpler and more readable. Simpler and more readable code means that our daily work in R will be easier, and it can also encourage new people to use our favourite language. Have you tried R piping?
Edited September 08, 2014
Notes (thanks to Hadley Wickham's comments):
On April 5, 2012, Peter Meilstrup started the package ptools as a way to collect “various data manipulation and programming utilities”. One of those utilities was the function chain, which implemented a way to pipe the arguments. The current version of the chain function can be found in the vadr package by the same author.
In this post, Stefan Bache gives his account of how he created magrittr and of the convergence of the pipe operators.
In the first comment on the RStudio blog post announcing dplyr, Stefan Bache let Hadley Wickham know about the magrittr operator.
The %>% operator should be pronounced “then”.