/usr/lib/R/site-library/plyr/NEWS is in r-cran-plyr 1.8-1build1.
This file is owned by root:root, with mode 0o644.
Version 1.8
------------------------------------------------------------------------------
NEW FEATURES AND FUNCTIONS
* `**ply` gain a `.inform` argument (previously only available in `llply`) - this gives more useful debugging information at the cost of some speed. (Thanks to Brian Diggs, #57)
* if `.dims = TRUE` `alply`'s output gains dimensions and dimnames, similar to `apply`. Sequential indexing of a list produced by `alply` should be unaffected. (Peter Meilstrup)
* `colwise`, `numcolwise` and `catcolwise` now all accept additional arguments in `...`. (Thanks to Stavros Macrakis, #62)
* `here` makes it possible to use `**ply` + a function that uses non-standard evaluation (e.g. `summarise`, `mutate`, `subset`, `arrange`) inside a function. (Thanks to Peter Meilstrup, #3)
* `join_all` recursively joins a list of data frames. (Fixes #29)
* `name_rows` provides a convenient way of saving and then restoring row names so that you can preserve them if you need to. (#61)
* `progress_time` (used with `.progress = "time"`) estimates the amount of time remaining before the job is completed. (Thanks to Mike Lawrence, #78)
* `summarise` now works iteratively so that later columns can refer to earlier. (Thanks to Jim Hester, #44)
* `take` makes it easy to subset along an arbitrary dimension.
* Improved documentation thanks to patches from Tim Bates.
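A small illustrative sketch of three of the additions above: `summarise` referring to earlier columns, `here`, and `join_all`. The data frames and the helper `add_label` are invented for this example and are not part of plyr.

```r
library(plyr)

# Hypothetical data, invented for illustration.
df <- data.frame(g = rep(c("a", "b"), each = 3), x = 1:6)

# `summarise` works iteratively: `half` refers to `total`, defined just before.
totals <- summarise(df, total = sum(x), half = total / 2)

# `here()` lets a non-standard-evaluation function such as `mutate` see
# variables that are local to the calling function (`label` in this case).
add_label <- function(label) {
  ddply(df, "g", here(mutate), tag = paste(label, x))
}
labelled <- add_label("row")

# `join_all` joins a list of data frames one after another.
parts <- list(
  data.frame(k = 1:2, a = c("p", "q")),
  data.frame(k = 1:2, b = c("r", "s")),
  data.frame(k = 1:2, c = c("t", "u"))
)
joined <- join_all(parts, by = "k")
```

Without `here()`, the call inside `add_label` would be expected to fail because `mutate` cannot find `label`.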
PARALLEL PLYR
* `**ply` gains a `.paropts` argument, a list of options that is passed onto `foreach` for controlling parallel computation.
* `*_ply` now accepts `.parallel` argument to enable parallel processing. (Fixes #60)
* Progress bars are disabled when using parallel plyr (Fixes #32)
PERFORMANCE IMPROVEMENTS
* `a*ply`: 25x speedup when indexing array objects, 3x speedup when indexing data frames. This should substantially reduce the overhead of using `a*ply`
* `d*ply` subsetting has been considerably optimised: this will have a small impact unless you have a very large number of groups, in which case it will be considerably faster.
* `idata.frame`: Subsetting immutable data frames with `[.idf` is now
faster (Peter Meilstrup)
* `quickdf` is around 20% faster
* `split_indices`, which powers much internal splitting code (like `vaggregate`, `join` and `d*ply`) is about 2x faster. It was already incredibly fast (~0.2s for 1,000,000 obs), so this won't have much impact on overall performance.
BUG FIXES
* `*aply` functions now bind list mode results into a list-array (Peter Meilstrup)
* `*aply` now accepts 0-dimension arrays as inputs. (#88)
* `*dply` now deals better with matrix results, converting them to data frames, rather than vectors. (Fixes #12)
* `d*ply` will now preserve the factor levels of the input if `drop = FALSE` (#81)
* `join` works correctly when there are no common rows (Fixes #74), or when one input has no rows (Fixes #48). It also consistently orders the columns: common columns, then x cols, then y cols (Fixes #40).
* `quickdf` correctly handles NA variable names. (Fixes #66. Thanks to Scott Kostyshak)
* `rbind.fill` and `rbind.fill.matrix` work consistently with matrices and data frames with zero rows. Fixes #79. (Peter Meilstrup)
* `rbind.fill` now stops if inputs are not data frames. (Fixes #51)
* `rbind.fill` now works consistently with 0 column data frames
* `round_any` now works with `POSIXct` objects, thanks to Jean-Olivier Irisson (#76)
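As a rough sketch of two of the fixes above, the following shows `round_any` on numeric and `POSIXct` input and `rbind.fill` on data frames with different columns. The values are made up for illustration.

```r
library(plyr)

# `round_any` rounds to a multiple of `accuracy`; `f` picks the rounding rule.
nearest_ten <- round_any(137, 10)             # nearest multiple of 10
floor_ten   <- round_any(137, 10, f = floor)  # next lower multiple of 10

# With POSIXct input the accuracy is given in seconds (here: 15 minutes).
stamp   <- as.POSIXct("2013-01-01 10:07:00", tz = "UTC")
quarter <- round_any(stamp, 15 * 60)

# `rbind.fill` fills columns that are missing from an input with NA.
filled <- rbind.fill(data.frame(a = 1:2), data.frame(b = "x"))
```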
Version 1.7.1
------------------------------------------------------------------------------
* Fix bug in id, using numeric instead of integer
Version 1.7
------------------------------------------------------------------------------
* `rbind.fill`: if a column contains both factors and characters (in different
inputs), the resulting column will be coerced to character
* When there are more than 2^31 distinct combinations, `id` switches to a
slower fallback strategy using strings (inspired by `merge`) that guarantees
correct results. This fixes problems with `join` when joining across many
columns. (Fixes #63)
* `split_indices` checks input more aggressively to prevent segfaults.
Fixes #43.
* fix small bug in `loop_apply` which led to segfaults in certain
circumstances. (Thanks to Pål Westermark for patch)
* `itertools` and `iterators` moved to suggests from imports so that plyr now
only depends on base R.
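A minimal sketch of the `rbind.fill` coercion rule described in this version, assuming one input stores the column as a factor and the other as a character vector. The data frames are hypothetical.

```r
library(plyr)

# The same column arrives as a factor in one input and as character in another.
with_factor <- data.frame(id = 1, colour = factor("red"))
with_chars  <- data.frame(id = 2, colour = "blue", stringsAsFactors = FALSE)

combined <- rbind.fill(with_factor, with_chars)

# Per the entry above, the mixed column should come back as character.
class(combined$colour)
```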
Version 1.6
------------------------------------------------------------------------------
* documentation improved using new features of `roxygen2`
* fixed namespacing issue which led to lost labels when subsetting the
results of `*lply`
* `colwise` automatically strips off split variables.
* `rlply` now correctly deals with `rlply(4, NULL)` (thanks to bug report from
Eric Goldlust)
* `rbind.fill` tries harder to keep attributes, retaining the attributes from
the first occurrence of each column it finds. It also now works with
variables of class `POSIXlt` and preserves the ordered status of factors.
* `arrange` now works with one column data frames
Version 1.5.2
------------------------------------------------------------------------------
* `d*ply` returns correct number of rows when function returns vector
* fix NAMESPACE bug which was causing problems with ggplot2
Version 1.5.1
------------------------------------------------------------------------------
* `rbind.fill` now treats 1d arrays in the same way as `rbind` (i.e. it turns
them into ordinary vectors)
* fix bug in rename when renaming multiple columns
Version 1.5 (2011-03-02)
------------------------------------------------------------------------------
NEW FEATURES
* new `strip_splits` function removes splitting variables from the data frames
returned by `ddply`.
* `rename` moved in from reshape, and rewritten.
* new `match_df` function makes it easy to subset a data frame to only contain
values matching another data frame. Inspired by
http://stackoverflow.com/questions/4693849.
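To illustrate `rename` and `match_df` from the list above, here is a short sketch. The `scores` and `keep` data frames are invented for the example.

```r
library(plyr)

scores <- data.frame(id = 1:4, val = c(10, 20, 30, 40))

# `rename` takes a named character vector: names are old, values are new.
renamed <- rename(scores, c("val" = "score"))

# `match_df` keeps only the rows of `scores` whose `id` also occurs in `keep`.
keep    <- data.frame(id = c(2L, 4L))
matched <- match_df(scores, keep, on = "id")
```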
BUG FIXES
* `**ply` now works when passed a list of functions
* `*dply` now correctly names output even when some output combinations are
missing (NULL) (Thanks to bug report from Karl Ove Hufthammer)
* `*dply` preserves the class of many more object types.
* `a*ply` now correctly works with zero length margins, operating on the
entire object (Thanks to bug report from Stavros Macrakis)
* `join` now implements joins in a more SQL like way, returning all possible
matches, not just the first one. It is still (a little) faster than `merge`.
The previous behaviour is accessible with `match = "first"`.
* `join` is now more symmetric so that `join(x, y, "left")` is closer to
`join(y, x, "right")`, modulo column ordering
* `names.quoted` failed when quoted expressions were longer than 50
characters. (Thanks to bug report from Eric Goldlust)
* `rbind.fill` now correctly maintains POSIXct tzone attributes and preserves
missing factor levels
* `split_labels` correctly preserves empty factor levels, which means that
`drop = FALSE` should work in more places. Use `base::droplevels` to remove
levels that don't occur in the data, and `drop = T` to remove combinations
of levels that don't occur.
* `vaggregate` now passes `...` to the aggregation function when working out
the output type (thanks to bug report by Pavan Racherla)
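The change to `join` matching described above can be sketched as follows. The `left` and `right` data frames are hypothetical; key `1` occurs twice in `right` so that the two matching modes give different row counts.

```r
library(plyr)

left  <- data.frame(k = c(1, 2), a = c("p", "q"))
right <- data.frame(k = c(1, 1, 2), b = c("u", "v", "w"))

# Default (match = "all"): every matching row of `right` is returned.
all_matches <- join(left, right, by = "k")

# Previous behaviour: only the first match for each row of `left`.
first_matches <- join(left, right, by = "k", match = "first")

nrow(all_matches)
nrow(first_matches)
```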
Version 1.4.1 (2011-04-05)
------------------------------------------------------------------------------
* Add citation to JSS article
Version 1.4 (2011-01-03)
------------------------------------------------------------------------------
* `count` now takes an additional parameter `wt_var` which allows you to
compute weighted sums. This is as fast, or faster than, `tapply` or `xtabs`.
* Really fix bug in `names.quoted`
* `.` now captures the environment in which it was evaluated. This should fix
an esoteric class of bugs which no-one probably ever encountered, but will
form the basis for an improved version of `ggplot2::aes`.
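A brief sketch of `count` with and without `wt_var`, using a made-up data frame in which `w` plays the role of the weight column:

```r
library(plyr)

df <- data.frame(g = c("a", "a", "b"), w = c(1, 2, 5))

# Plain counts: number of rows per value of `g`.
plain <- count(df, "g")

# Weighted: the `freq` column holds the sum of `w` within each value of `g`.
weighted <- count(df, "g", wt_var = "w")
```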
Version 1.3.1 (2010-12-30)
------------------------------------------------------------------------------
* Fix bug in `names.quoted` that interfered with ggplot2
Version 1.3 (2010-12-28)
------------------------------------------------------------------------------
NEW FEATURES
* new function `mutate` that works like transform to add new columns or
overwrite existing columns, but computes new columns iteratively so later
transformations can use columns created by earlier transformations. (It's
also about 10x faster) (Fixes #21)
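For illustration, a minimal `mutate` call in which a later column uses an earlier one. The data frame is invented.

```r
library(plyr)

df <- data.frame(x = 1:3)

# `z` is computed from `y`, which is created earlier in the same call.
# A single call to base `transform` could not refer to `y` this way.
out <- mutate(df, y = x * 2, z = y + 1)
```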
BUG FIXES
* split column names are no longer coerced to valid R names.
* `quickdf` now adds names if missing
* `summarise` preserves variable names if explicit names not provided (Fixes
#17)
* `arrays` with names should be sorted correctly once again (also fixed a bug
in the test case that prevented me from catching this automatically)
* `m_ply` no longer possesses .parallel argument (mistakenly added)
* `ldply` (and hence `adply` and `ddply`) now correctly passes on .parallel
argument (Fixes #16)
* `id` uses a better strategy for converting to integers, making it possible
to use for cases with larger potential numbers of combinations
Version 1.2.1 (2010-09-10)
------------------------------------------------------------------------------
* Fix bug in llply fast path that causes problems with ggplot2.
Version 1.2 (2010-09-09)
------------------------------------------------------------------------------
NEW FEATURES
* l*ply, d*ply, a*ply and m*ply all gain a .parallel argument that when TRUE,
applies functions in parallel using a parallel backend registered with the
foreach package:
    x <- seq_len(20)
    wait <- function(i) Sys.sleep(0.1)
    system.time(llply(x, wait))
    #  user  system elapsed
    # 0.007   0.005   2.005

    library(doMC)
    registerDoMC(2)
    system.time(llply(x, wait, .parallel = TRUE))
    #  user  system elapsed
    # 0.020   0.011   1.038
This work has been generously supported by BD (Becton Dickinson).
MINOR CHANGES
* a*ply and m*ply gain an .expand argument that controls whether data frames
produce a single output dimension (one element for each row), or an output
dimension for each variable.
* new vaggregate (vector aggregate) function, which is equivalent to tapply,
but much faster (~ 10x), since it avoids copying the data.
* llply: for simple lists and vectors, with no progress bar, no extra info,
and no parallelisation, llply calls lapply directly to avoid all the
overhead associated with those unused extra features.
* llply: in serial case, for loop replaced with custom C function that takes
about 40% less time (or about 20% less time than lapply). Note that as a
whole, llply still has much more overhead than lapply.
* round_any now lives in plyr instead of reshape
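As a rough comparison of `vaggregate` and `tapply` on made-up data, the sketch below computes group sums both ways. Only the values are compared, since the two functions may label their results differently.

```r
library(plyr)

value <- c(1, 2, 3, 4, 5)
group <- factor(c("a", "b", "a", "b", "a"))

by_vagg   <- vaggregate(value, group, sum)
by_tapply <- tapply(value, group, sum)

# Compare the numbers only, ignoring names and dimensions.
all.equal(as.numeric(by_vagg), as.numeric(by_tapply))
```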
BUG FIXES
* list_to_array works correctly even when there are missing values in the array.
This is particularly important for daply.
Version 1.1 (2010-07-19)
------------------------------------------------------------------------------
* *dply deals more gracefully with the case when all results are NULL
(fixes #10)
* *aply correctly orders output regardless of dimension names
(fixes #11)
* join gains type = "full" which preserves all x and y rows
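A small sketch of a full join, assuming two hypothetical data frames that share only the key `2`:

```r
library(plyr)

x <- data.frame(k = c(1, 2), a = c("p", "q"))
y <- data.frame(k = c(2, 3), b = c("u", "v"))

# type = "full" keeps every row of x and every row of y;
# columns with no counterpart on the other side are filled with NA.
full <- join(x, y, by = "k", type = "full")
```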
Version 1.0 (2010-07-02)
------------------------------------------------------------------------------
New functions:
* arrange, a new helper method for reordering a data frame.
* count, a version of table that returns data frames immediately and that is
much much faster for high-dimensional data.
* desc makes it easy to sort any vector in descending order
* join, works like merge but can be much faster and has a somewhat simpler
syntax drawing from SQL terminology
* rbind.fill.matrix is like rbind.fill but works for matrices, code
contributed by C. Beleites
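For example, `arrange` and `desc` can be combined as in this sketch (the data frame is made up):

```r
library(plyr)

df <- data.frame(g = c("b", "a", "b", "a"), v = c(1, 4, 3, 2))

# Sort by `g` ascending, then by `v` descending within each value of `g`.
sorted <- arrange(df, g, desc(v))
```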
Speed improvements
* experimental immutable data frame (idata.frame) that vastly speeds up
subsetting - for large datasets with large numbers of groups, this can yield
10-fold speed ups. See examples in ?idata.frame to see how to use it.
* rbind.fill rewritten again to increase speed and work with more data types
* d*ply now much faster with nested groups
This work has been generously supported by BD (Becton Dickinson).
New features:
* d*ply now accepts NULL for splitting variables, indicating that the data
should not be split
* plyr no longer exports internal functions, many of which were causing
clashes with other packages
* rbind.fill now works with data frame columns that are lists or matrices
* test suite ensures that plyr behaviour is correct and will remain correct
as I make future improvements.
Bug fixes:
* **ply: if zero splits, empty list(), data.frame() or logical() returned,
as appropriate for the output type
* **ply: leaving .fun as NULL now always returns list
(thanks to Stavros Macrakis for the bug report)
* a*ply: labels now respect options(stringsAsFactors)
* each: scoping bug fixed, thanks to Yasuhisa Yoshida for the bug report
* list_to_dataframe is more consistent when processing a single data frame
* NAs preserved in more places
* progress bars: guaranteed to terminate even if **ply prematurely terminates
* progress bars: misspelling gives informative warning, instead of
uninformative error
* splitter_d: fixed ordering bug when .drop = FALSE
Version 0.1.9 (2009-06-23)
------------------------------------------------------------------------------
* fix bug in rbind.fill when NULLs present in list
* improve each to recognise when all elements are numeric
* fix labelling bug in d*ply when .drop = FALSE
* additional methods for quoted objects
* add summarise helper - this function is like transform, but creates a new data frame rather than reusing the old (thanks to Brendan O'Connor for the neat idea)
Version 0.1.8 (2009-04-20)
------------------------------------------------------------------------------
* made rbind a little faster (~20%) using an idea from Richard Raubertas
* daply now works correctly when splitting variables that contain empty factor levels
Version 0.1.7 (2009-04-15)
------------------------------------------------------------------------------
* Ensure that rbind.fill copies attributes.
Version 0.1.6 (2009-04-15)
------------------------------------------------------------------------------
Improvements:
* all ply functions deal more elegantly when given function names: can supply a vector of function names, and name is used as label in output
* failwith and each now work with function names as well as functions (i.e. "nrow" instead of nrow)
* each now accepts a list of functions or a vector of function names
* l*ply will use list names where present
* if .inform is TRUE, error messages will give you information about where errors occurred within your data - hopefully this will make problems easier to track down
* d*ply no longer converts splitting variables to factors when drop = T (thanks to bug report from Charlotte Wickham)
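A short sketch of `each` and `failwith` accepting either functions or function names. The helper `safe_log` is hypothetical, and the `quiet` argument is assumed to be available in the installed version of plyr.

```r
library(plyr)

# `each` combines several summary functions into one;
# functions can be given directly or by name.
rng_by_fun  <- each(min, max)(1:10)
rng_by_name <- each("min", "max")(1:10)

# `failwith` wraps a function so that an error yields a default value instead.
safe_log <- failwith(NA, function(x) {
  if (x < 0) stop("negative input")
  log(x)
}, quiet = TRUE)

bad  <- safe_log(-1)  # the error is caught and the default is returned
good <- safe_log(1)
```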
Speed-ups
* massive speed ups for splitting large arrays
* fixed typo that was causing a 50% speed penalty for d*ply
* rewritten rbind.fill is considerably (> 4x) faster for many data frames
* colwise about twice as fast
Bug fixes:
* daply: now works when the data frame is split by multiple variables
* aaply: now works with vectors
* ddply: first variable now varies slowest as you'd expect
Version 0.1.5 (2009-02-23)
------------------------------------------------------------------------------
* colwise now accepts a quoted list as its second argument. This allows you to specify the names of columns to work on: colwise(mean, .(lat, long))
* d_ply and a_ply now correctly pass ... to the function
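The `colwise` usage above can be sketched on a made-up data frame. Only the `lat` and `long` columns are summarised and the `name` column is ignored.

```r
library(plyr)

places <- data.frame(
  name = c("x", "y"),
  lat  = c(10, 20),
  long = c(100, 120)
)

# Build a column-wise version of `mean` restricted to `lat` and `long`.
col_means <- colwise(mean, .(lat, long))
res <- col_means(places)
```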
Version 0.1.4 (2008-12-12)
------------------------------------------------------------------------------
* Greatly improved speed (> 10x faster) and memory usage (50%) for splitting data frames with many combinations
* Splitting variables containing missing values now handled consistently
Version 0.1.3 (2008-11-19)
------------------------------------------------------------------------------
* Fixed problem where when splitting by a variable that contained missing values, missing combinations would be dropped, and labels wouldn't match up
Version 0.1.2 (2008-11-18)
------------------------------------------------------------------------------
* a*ply now works correctly with array-lists
* argument drop. renamed to .drop
* r*ply now works with ...
* use inherits instead of is so methods package doesn't need to be loaded
* fix bug with using formulas
Version 0.1.1 (2008-10-08)
------------------------------------------------------------------------------
* argument names now start with . (instead of ending with it) - this should prevent name clashes with arguments of the called function
* return informative error if .fun is not a function
* use full names in all internal calls to avoid argument name clashes
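To see why the leading dot helps, consider this sketch. The function `add_offset` and its `progress` argument are hypothetical and chosen to resemble plyr's own `.progress` argument.

```r
library(plyr)

# `progress` is an argument of our own function, not of plyr.
add_offset <- function(x, progress) x + progress

# Because plyr's argument is called `.progress`, `progress = 10` is passed
# through `...` to `add_offset` rather than being captured by laply.
shifted <- laply(1:3, add_offset, progress = 10)
```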