This file is indexed.

/usr/lib/R/site-library/stringi/NEWS is in r-cran-stringi 1.0-1-1.

This file is owned by root:root, with mode 0o644.

The actual contents of the file can be viewed below.

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
                   stringi package NEWS and CHANGELOG
===============================================================================

## Known bugs in ICU

* [REGEX] #147: regex look-behind assertions may fail to find a multibyte
Unicode search pattern [scheduled for ICU4C 56.1, see
http://bugs.icu-project.org/trac/ticket/11554]

-------------------------------------------------------------------------------

## 1.0-1 (2015-10-22) **CRAN**

* [GENERAL] #88: C++ API is now available for use in, e.g., Rcpp packages, see
https://github.com/Rexamine/ExampleRcppStringi for an exemple.

* [BUGFIX] #183: Floating point exception raised in `stri_sub()` and
`stri_sub<-()` when `to` or `length` was a zero-length numeric vector.

* [BUGFIX] #180: `stri_c()` warned incorrectly (recycling rule) when using more
than 2 elements.

-------------------------------------------------------------------------------

## 0.5-5 (2015-06-28) **CRAN**

* [BACKWARD INCOMPATIBILITY] `stri_install_check()` and `stri_install_icudt()`
are now deprecated. From now on they are supposed to be used only
by the `stringi` installer.

* [BUGFIX] #176: a patch for `sys/feature_tests.h` no longer included
(the original file was copyrighted by Sun Microsystems); fixed the *Compiler
or options invalid for pre-UNIX 03 X/Open applications and pre-2001 POSIX
applications* error by forcing (conditionally) `_XPG6` conformance.

* [BUGFIX] #174: `stri_paste()` did not generate any warning when
the recycling rule is violated and `sep==""`.

* [BUGFIX] #170: `icu::setDataDirectory` no longer called if our ICU source bundle
is not used (this used to cause build problems on openSUSE).

* [BUILD TIME] #169: `./configure` now tries to switch to the "standard"
C++ compiler if a C++11 one is not properly configured.

* [BUILD TIME] `configure.win` (`Biarch: TRUE`) now mimics `autoconf`'s
`AC_SUBST` and `AC_CONFIG_FILES` so that the build process is now
more similar across different platforms.

* [NEW FEATURE] `stri_info()` now also gives information on which ICU4C is
used (system or bundle).

-------------------------------------------------------------------------------

## 0.5-2 (2015-06-21) **CRAN**

* [BACKWARD INCOMPATIBILITY] The second argument to `stri_pad_*()` has
been renamed `width`.

* [GENERAL] #69: `stringi` is now bundled with ICU4C 55.1.

* [NEW FUNCTIONS] `stri_extract_*_boundaries()` extract text between text
boundaries.

* [NEW FUNCTION] #46: `stri_trans_char()` is a `stringi`-flavoured
`chartr()` equivalent.

* [NEW FUNCTION] #8: `stri_width()` approximates the *width* of a string
in a more Unicodish fashion than `nchar(..., "width")`

* [NEW FEATURE] #149: `stri_pad()` and `stri_wrap()` now by default bases on
code point widths instead of the number of code points. Moreover, the default
behavior of `stri_wrap()` is now such that it does not get rid
of non-breaking, zero width, etc. spaces

* [NEW FEATURE] #133: `stri_wrap()` silently allows for `width <= 0`
(for compatibility with `strwrap()`).

* [NEW FEATURE] #139: `stri_wrap()` gained a new argument: `whitespace_only`.

* [NEW FUNCTIONS] #137: date-time formatting/parsing:
     * `stri_timezone_list()` - lists all known time zone identifiers
     * `stri_timezone_set()`, `stri_timezone_get()` - manage current default time zone
     * `stri_timezone_info()` - basic information on a given time zone
     * `stri_datetime_symbols()` - localizable date-time formatting data
     * `stri_datetime_fstr()` - convert a `strptime`-like format string
            to an ICU date/time format string
     * `stri_datetime_format()` - convert date/time to string
     * `stri_datetime_parse()` - convert string to date/time object
     * `stri_datetime_create()` - construct date-time objects
            from numeric representations
     * `stri_datetime_now()` - return current date-time
     * `stri_datetime_fields()` - get values for date-time fields
     * `stri_datetime_add()` - add specific number of date-time units
            to a date-time object

* [GENERAL] #144: Performance improvements in handling ASCII strings
(these affect `stri_sub()`, `stri_locate()` and other string index-based
 operations)

* [GENERAL] #143: Searching for short fixed patterns (`stri_*_fixed()`) now
relies on the current `libC`'s implementation of `strchr()` and `strstr()`.
This is very fast e.g. on `glibc` utilizing the `SSE2/3/4` instruction set.

* [BUILD TIME] #141: a local copy of `icudt*.zip` may be used on package
install; see the `INSTALL` file for more information.

* [BUILD TIME] #165: the `./configure` option `--disable-icu-bundle`
forces the use of system ICU when building the package.

* [BUGFIX] locale specifiers are now normalized in a more intelligent way:
e.g. `@calendar=gregorian` expands to `DEFAULT_LOCALE@calendar=gregorian`.

* [BUGFIX] #134: `stri_extract_all_words()` did not accept `simplify=NA`.

* [BUGFIX] #132: incorrect behavior in `stri_locate_regex()` for matches
of zero lengths

* [BUGFIX] stringr/#73: `stri_wrap()` returned `CHARSXP` instead of `STRSXP`
on empty string input with `simplify=FALSE` argument.

* [BUGFIX] #164: `libicu-dev` usage used to fail on Ubuntu
(`LIBS` shall be passed after `LDFLAGS` and the list of `.o` files).

* [BUGFIX] #168: Build now fails if `icudt` is not available.

* [BUGFIX] #135: C++11 is now used by default (see the `INSTALL` file,
however) to build `stringi` from sources. This is because ICU4C uses the
`long long` type which is not part of the C++98 standard.

* [BUGFIX] #154: Dates and other objects with a custom class attribute
were not coerced to the character type correctly.

* [BUGFIX] Force ICU `u_init()` call on `stringi` dynlib load.

* [BUGFIX] #157: many overfull hboxes in the package PDF manual has been
corrected.

-------------------------------------------------------------------------------

## 0.4-1 (2014-12-11) **CRAN**

* [IMPORTANT CHANGE] `n_max` argument in `stri_split_*()` has been renamed `n`.

* [IMPORTANT CHANGE] `simplify=FALSE` in `stri_extract_all_*()` and
`stri_split_*()` now calls `stri_list2matrix()` with `fill=""`.
`fill=NA_character_` may be obtained by using `simplify=NA`.

* [IMPORTANT CHANGE, NEW FUNCTIONS] #120: `stri_extract_words` has been
renamed `stri_extract_all_words` and `stri_locate_boundaries` -
`stri_locate_all_boundaries` as well as `stri_locate_words` -
`stri_locate_all_words`. New functions are now available:
`stri_locate_first_boundaries`, `stri_locate_last_boundaries`,
`stri_locate_first_words`, `stri_locate_last_words`,
`stri_extract_first_words`, `stri_extract_last_words`.

* [IMPORTANT CHANGE] #111: `opts_regex`, `opts_collator`, `opts_fixed`, and
`opts_brkiter` can now be supplied individually via `...`.
In other words, you may now simply call e.g.
`stri_detect_regex(str, pattern, case_insensitive=TRUE)` instead of
`stri_detect_regex(str, pattern, opts_regex=stri_opts_regex(case_insensitive=TRUE))`.

* [NEW FEATURE] #110: Fixed pattern search engine's settings can
now be supplied via `opts_fixed` argument in `stri_*_fixed()`,
see `stri_opts_fixed()`. A simple (not suitable for natural language
processing) yet very fast `case_insensitive` pattern matching can be
performed now. `stri_extract_*_fixed` is again available.

* [NEW FEATURE] #23: `stri_extract_all_fixed`, `stri_count`, and
`stri_locate_all_fixed` may now also look for overlapping pattern
matches, see `?stri_opts_fixed`.

* [NEW FEATURE] #129: `stri_match_*_regex` gained a `cg_missing` argument.

* [NEW FEATURE] #117: `stri_extract_all_*()`, `stri_locate_all_*()`,
`stri_match_all_*()` gained a new argument: `omit_no_match`.
Setting it to `TRUE` makes these functions compatible with their
`stringr` equivalents.

* [NEW FEATURE] #118: `stri_wrap()` gained `indent`, `exdent`, `initial`,
and `prefix` arguments. Moreover Knuth's dynamic word wrapping algorithm
now assumes that the cost of printing the last line is zero, see #128.

* [NEW FEATURE] #122: `stri_subset()` gained an `omit_na` argument.

* [NEW FEATURE] `stri_list2matrix()` gained an `n_min` argument.

* [NEW FEATURE] #126: `stri_split()` now is also able to act
just like `stringr::str_split_fixed()`.

* [NEW FEATURE] #119:  `stri_split_boundaries()` now have
`n`, `tokens_only`, and `simplify` arguments. Additionally,
`stri_extract_all_words()` is now equipped with `simplify` arg.

* [NEW FEATURE] #116: `stri_paste()` gained a new argument:
`ignore_null`. Setting it to `TRUE` makes this function more compatible
with `paste()`.

* [OTHER] #123: `useDynLib` is used to speed up symbol look-up in
the compiled dynamic library.

* [BUGFIX] #114: `stri_paste()`: could return result in an incorrect order.

* [BUGFIX]  #94: Run-time errors on Solaris caused by setting
`-DU_DISABLE_RENAMING=1` -- memory allocation errors in i.a. ICU's
UnicodeString. This setting also caused some ABSan sanity check
failures within ICU code.

-------------------------------------------------------------------------------

## 0.3-1 (2014-11-06) **CRAN**

* [IMPORTANT CHANGE] #87: `%>%` overlapped with the pipe operator from
the `magrittr` package; now each operator like `%>%` has been renamed `%s>%`.

* [IMPORTANT CHANGE] #108: Now the BreakIterator (for text boundary analysis)
may be better controlled via `stri_opts_brkiter()` (see options `type`
and `locale` which aim to replace now-removed `boundary` and `locale` parameters
to `stri_locate_boundaries`, `stri_split_boundaries`, `stri_trans_totitle`,
`stri_extract_words`, `stri_locate_words`).

* [NEW FUNCTIONS] #109: `stri_count_boundaries()` and `stri_count_words()`
count the number of text boundaries in a string.

* [NEW FUNCTIONS] #41: `stri_startswith_*()` and `stri_endswith_*()`
determine whether a string starts or ends with a given pattern.

* [NEW FEATURE] #102: `stri_replace_all_*()` gained a `vectorize_all()` parameter,
which defaults to TRUE for backward compatibility.

* [NEW FUNCTION] #91: `stri_subset_*()`, a convenient and more efficient
substitute for `str[stri_detect_*(str, ...)]`, added.

* [NEW FEATURE] #100: `stri_split_fixed()`, `stri_split_charclass()`,
`stri_split_regex()`, `stri_split_coll()` gained a `tokens_only` parameter,
which defaults to `FALSE` for backward compatibility.

* [NEW FUNCTION] #105: `stri_list2matrix()` converts lists of atomic vectors
to character matrices, useful in connection with `stri_split`
and `stri_extract`.

* [NEW FEATURE] #107: `stri_split_*()` now allow setting an `omit_empty=NA` argument.

* [NEW FEATURE] #106: `stri_split()` and `stri_extract_all()` gained a `simplify`
argument (if `TRUE`, then `stri_list2matrix(..., byrow=TRUE)`
is called on the resulting list.

* [NEW FUNCTION] #77: `stri_rand_lipsum()` generates
(pseudo)random dummy *lorem ipsum* text.

* [NEW FEATURE] #98: `stri_trans_totitle()` gained a `opts_brkiter`
parameter; it indicates which ICU BreakIterator should be used when
performing case mapping.

* [NEW FEATURE] `stri_wrap()` gained a new parameter: `normalize`.

* [BUGFIX] #86: `stri_*_fixed()`, `stri_*_coll()`, and `stri_*_regex()` could
give incorrect results if one of search strings were of length 0.

* [BUGFIX] #99: `stri_replace_all()` did not use the `replacement` arg.

* [BUGFIX] #112: Some of the objects were not PROTECTed from
being garbage collected, which might have caused spontaneous SEGFAULTS.

* [BUGFIX] Some collator's options were not passed correctly to ICU services.

* [BUGFIX] Memory leaks causes as detected by
`valgrind --tool=memcheck --leak-check=full` have been removed.

* [DOCUMENTATION] Significant extensions/clean ups in the `stringi` manual.

-------------------------------------------------------------------------------

## 0.2-5 (2014-05-16) **CRAN**

* ~~icudt-dependent examples are no longer run if `icudt` is not available.~~

-------------------------------------------------------------------------------

## 0.2-4 (2014-05-15) **CRAN**

* [BUGFIX] Issues with loading of misaligned addresses in `stri_*_fixed()`.

-------------------------------------------------------------------------------

## 0.2-3 (2014-05-14) **CRAN**

* [IMPORTANT CHANGE] `stri_cmp*()` now do not allow for passing
 `opts_collator=NA`. From now on, `stri_cmp_eq`, `stri_cmp_neq`,
 and the new operators `%===%`, `%!==%`, `%stri===%`, and `%stri!==%`
 are locale-independent operations, which base on code point comparisons.
 New functions `stri_cmp_equiv` and `stri_cmp_nequiv`
 (and from now on also `%==%`, `%!=%`, `%stri==%`, and `%stri!=%`)
 test for canonical equivalence.

* [IMPORTANT CHANGE] `stri_*_fixed()` search functions now perform
 a locale-independent exact (byte-wise, of course after conversion to UTF-8)
 pattern search. All the Collator-based, locale-dependent search routines
 are now available via `stri_*_coll()`. The reason for this is that
 ICU USearch has currently very poor performance and in many search tasks
 in fact it is sufficient to do exact pattern matching.

* [GENERAL] `stri_*_fixed` now use a tweaked Knuth-Morris-Pratt search
 algorithm, which improves the search performance drastically.

* [IMPORTANT CHANGE] `stri_enc_nf*()` and `stri_enc_isnf*()` function families
have been renamed to `stri_trans_nf*()` and `stri_trans_isnf*()`,
respectively. This is because they deal with text transforming,
and not with character encoding. Moreover, all such operations may
also be performed by ICU's Transliterator (see below).

* [NEW FUNCTION] `stri_trans_general()` and `stri_trans_list()` give access
to ICU's Transliterator: may be used to perform very general
text transforms.

* [NEW FUNCTION `stri_split_boundaries()` utilizes ICU's BreakIterator
to split strings at specific text boundaries. Moreover,
stri_locate_boundaries indicates positions of these boundaries.

* [NEW FUNCTION] `stri_extract_words()` uses ICU's BreakIterator to
extract all words from a text. Additionally, `stri_locate_words`
locates start and end positions of words in a text.

* [NEW FUNCTION] `stri_pad()`, `stri_pad_left()`, `stri_pad_right()`,
and `stri_pad_both` pad a string with a specific code point.

* [NEW FUNCTION] `stri_wrap()` breaks paragraphs of text into lines.
Two algorithms (greedy and minimal raggedness) are available.

* [IMPORTANT CHANGE] `stri_*_charclass()` search functions now
rely solely on ICU's UnicodeSet patterns. All previously accepted
charclass identifiers became invalid. However, new patterns
should now be more familiar to the users (they are regex-like).
Moreover, we observe a very nice performance gain.

* [IMPORTANT CHANGE] `stri_sort()` now does not include `NA`s
in output vectors by default, for compatibility with `sort()`.
Moreover, currently none of the input vector's attributes are preserved.

* [NEW FUNCTION] `stri_unique()` extracts unique elements from
a character vector.

* [NEW FUNCTIONS] `stri_duplicated()` and `stri_duplicated_any()`
determine duplicate elements in a character vector.

* [NEW FUNCTION] `stri_replace_na()` replaces `NA`s in a character vector
with a given string, useful for emulating e.g. R's `paste()` behavior.

* [NEW FUNCTION] `stri_rand_shuffle()` generates a random permutation
of code points in a string.

* [NEW FUNCTION] `stri_rand_strings()` generates random strings.

* [NEW FUNCTIONS] New functions and binary operators for string comparison:
`stri_cmp_eq()`, `stri_cmp_neq()`, `stri_cmp_lt()`, `stri_cmp_le()`,
`stri_cmp_gt()`, `stri_cmp_ge()`, `%==%`, `%!=%`, `%<%`, `%<=%`, `%>%`, `%>=%`.

* [NEW FUNCTION] `stri_enc_mark()` reads declared encodings of character
strings as seen by `stringi`.

* [NEW FUNCTION] `stri_enc_tonative(str)` is an alias to
`stri_encode(str, NULL, NULL)`.

* [NEW FEATURE] `stri_order()` and `stri_sort()` now have an additional argument
`na_last` (defaults to `TRUE` and `NA`, respectively).

* [NEW FEATURE] `stri_replace_all_charclass()`, `stri_extract_all_charclass()`,
and `stri_locate_all_charclass()` now have a new arg, `merge`
(defaults to `FALSE` for backward-compatibility). It may be used
to e.g. replace sequences of white spaces with a single space.

* [NEW FEATURE] `stri_enc_toutf8()` now has a new `validate` arg (defaults
to FALSE for backward-compatibility). It may be used in a (rare) case
in which a user wants to fix an invalid UTF-8 byte sequence.
stri_length (among others) now detect invalid UTF-8 byte sequences.

* [NEW FEATURE] All binary operators `%???%` now also have aliases `%stri???%`.

* [GENERAL] Performance improvements in `StriContainerUTF8`
and `StriContainerUTF16` (they affect most other functions).

* [GENERAL] Significant performance improvements in `stri_join()`,
`stri_flatten()`, `stri_cmp()`, `stri_trans_to*()`, and others.

* [GENERAL] Added 3rd mirror site for our `icudt` binary distribution.

* `U_MISSING_RESOURCE_ERROR` message in `StriException` now suggests
calling `stri_install_check()`.

* [BUGFIX] UTF-8 BOMs are now silently removed from input strings.

* [BUGFIX] no more attempts to re-encode UTF-8 encoded strings
if native encoding=UTF-8 in `StriContainerUTF8`.

* [BUGFIX] possible memory leaks when throwing errors via `Rf_error()`.

* [BUGFIX] `stri_order()` and `stri_cmp()` could return incorrect results
for `opts_collator=NA`.

* [BUGFIX] `stri_sort()` did not guarantee to return strings in UTF-8.

-------------------------------------------------------------------------------

## 0.1-25 (2014-03-12) **CRAN**

* LICENSE tweaks.

* Initial CRAN release.

-------------------------------------------------------------------------------

## 0.1-24 (2014-03-11) **devel**

* Fixed bugs detected with `ASan` and `UBSan`,
   e.g. fixed `CharClass::gcmask` type (`enum` -> `uint32_t`)
   (reported by `UBSan`).

* Fixed array over-runs detected with `valgrind` in `string8.h`.

* Fixed unitialized class fields in `StriContainerUTF8`
   (reported by `valgrind`).

-------------------------------------------------------------------------------

## 0.1-23 (2014-03-11) **devel**

* License changed to BSD-3-clause, COPYRIGHTS updated.

* `icudt` is not shipped with `stringi` anymore;
   it is now downloaded in `install.libs.R` from one of our servers.

* New functions: `stri_install_check()`, `stri_install_icudt()`.

-------------------------------------------------------------------------------

## 0.1-22 (2014-02-20) **devel**

* System ICU is used on systems which do have one (version >= 50 needed).
  ICU is autodetected with `pkg-config` in `./configure`.
  Pass `'--disable-pkg-config'` to `./configure` to force building
  ICU from sources.

* `icudt52b` (custom subset) is now shipped with `stringi`
  (for big-endian, ASCII systems).

-------------------------------------------------------------------------------

## 0.1-21 (2014-02-19) **devel**

* Fixed some Solaris-related issues while preparing `stringi`
  for CRAN submission.

-------------------------------------------------------------------------------

## 0.1-20 (2014-02-17) **devel**

* ICU4C 52.1 sources included (common, i18n, stubdata + icu52dt.dat
  loaded dynamically). Compilation via Makevars.

* `stringi` now does not depend on any external libraries.

-------------------------------------------------------------------------------

## 0.1-11 (2013-11-16) **devel**

* ICU4C is now statically linked on Windows.

* First OS X binary build.

* The package is being intensively tested by our students @ FMIS WUT.

-------------------------------------------------------------------------------

## 0.1-10 (2013-11-13) **devel**

* Using `pkg-config` via `./configure` to look for ICU4C libs.

-------------------------------------------------------------------------------

## 0.1-6 (2013-07-05) **devel**

* First Windows binary build.

* Compilation passed on Oracle Sun Studio compiler collection.

* By now we have implemented most of the functionality
  scheduled for milestone 0.1.

-------------------------------------------------------------------------------

## 0.1-1 (2013-01-05) **devel**

* The `stringi` project has been established on GitHub.

-------------------------------------------------------------------------------