/usr/include/xapian-1.3/xapian/database.h is in libxapian-1.3-dev 1.3.4-0ubuntu6.
This file is owned by root:root, with mode 0o644.
The actual contents of the file can be viewed below.
/** @file database.h
* @brief API for working with Xapian databases
*/
/* Copyright 1999,2000,2001 BrightStation PLC
* Copyright 2002 Ananova Ltd
* Copyright 2002,2003,2004,2005,2006,2007,2008,2009,2011,2012,2013,2014,2015 Olly Betts
* Copyright 2006,2008 Lemur Consulting Ltd
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License as
* published by the Free Software Foundation; either version 2 of the
* License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301
* USA
*/
#ifndef XAPIAN_INCLUDED_DATABASE_H
#define XAPIAN_INCLUDED_DATABASE_H
#if !defined XAPIAN_IN_XAPIAN_H && !defined XAPIAN_LIB_BUILD
# error "Never use <xapian/database.h> directly; include <xapian.h> instead."
#endif
#include <iosfwd>
#include <string>
#include <vector>
#include <xapian/attributes.h>
#include <xapian/intrusive_ptr.h>
#include <xapian/types.h>
#include <xapian/positioniterator.h>
#include <xapian/postingiterator.h>
#include <xapian/termiterator.h>
#include <xapian/valueiterator.h>
#include <xapian/visibility.h>
namespace Xapian {
class Compactor;
class Document;
/** This class is used to access a database, or a group of databases.
*
* For searching, this class is used in conjunction with an Enquire object.
*
* @exception InvalidArgumentError will be thrown if an invalid
* argument is supplied, for example, an unknown database type.
*
* @exception DatabaseOpeningError may be thrown if the database cannot
* be opened (for example, a required file cannot be found).
*
* @exception DatabaseVersionError may be thrown if the database is in an
* unsupported format (for example, created by a newer version of Xapian
* which uses an incompatible format).
*/
class XAPIAN_VISIBILITY_DEFAULT Database {
/// @internal Implementation behind check() static methods.
static size_t check_(const std::string * path_ptr, int fd, int opts,
std::ostream *out);
/// Internal helper behind public compact() methods.
void compact_(const std::string * output_ptr,
int fd,
unsigned flags,
int block_size,
Xapian::Compactor * compactor) const;
public:
class Internal;
/// @private @internal Reference counted internals.
std::vector<Xapian::Internal::intrusive_ptr<Internal> > internal;
/** Add an existing database (or group of databases) to those
* accessed by this object.
*
* @param database the database(s) to add.
*/
void add_database(const Database & database);
/** Create a Database containing no databases.
*/
Database();
/** Open a Database, automatically determining the database
* backend to use.
*
* @param path directory that the database is stored in.
* @param flags Bitwise-or of Xapian::DB_* constants.
*/
explicit Database(const std::string &path, int flags = 0);
/** Open a single-file Database.
*
* This method opens a single-file Database given a file descriptor
* open on it. Xapian starts reading at the current file offset,
* allowing a single-file database to be easily embedded within
* another file.
*
* @param fd file descriptor for the file. Xapian takes ownership of
* this and will close it when the database is closed.
* @param flags Bitwise-or of Xapian::DB_* constants.
*/
explicit Database(int fd, int flags = 0);
/** @private @internal Create a Database from its internals.
*/
explicit Database(Internal *internal);
/** Destroy this handle on the database.
*
* If there are no copies of this object remaining, the database(s)
* will be closed.
*/
virtual ~Database();
/** Copying is allowed. The internals are reference counted, so
* copying is cheap.
*
* @param other The object to copy.
*/
Database(const Database &other);
/** Assignment is allowed. The internals are reference counted,
* so assignment is cheap.
*
* @param other The object to copy.
*/
void operator=(const Database &other);
/** Re-open the database.
*
* This re-opens the database(s) to the latest available version(s).
* It can be used either to make sure the latest results are returned,
* or to recover from a Xapian::DatabaseModifiedError.
*
* Calling reopen() on a database which has been closed (with @a
* close()) will always raise a Xapian::DatabaseError.
*
* @return true if the database might have been reopened (if false
* is returned, the database definitely hasn't been
* reopened, which applications may find useful when
* caching results, etc). In Xapian < 1.3.0, this method
* did not return a value.
*/
bool reopen();
/** Close the database.
*
* This closes the database and closes all its file handles.
*
* For a WritableDatabase, if a transaction is active it will be
* aborted, while if no transaction is active commit() will be
* implicitly called. Also the write lock is released.
*
* Closing a database cannot be undone - in particular, calling
* reopen() after close() will not reopen it, but will instead throw a
* Xapian::DatabaseError exception.
*
* Calling close() again on a database which has already been closed
* has no effect (and doesn't raise an exception).
*
* After close() has been called, calls to other methods of the
* database, and to methods of other objects associated with the
* database, will either:
*
* - behave exactly as they would have done if the database had not
* been closed (this can only happen if all the required data is
* cached)
*
* - raise a Xapian::DatabaseError exception indicating that the
* database is closed.
*
* The reason for this behaviour is that otherwise we'd have to check
* that the database is still open on every method call on every
* object associated with a Database, when in many cases they are
* working on data which has already been loaded and so they are able
* to just behave correctly.
*
* This method was added in Xapian 1.1.0.
*/
virtual void close();
/// Return a string describing this object.
virtual std::string get_description() const;
/** An iterator pointing to the start of the postlist
* for a given term.
*
* @param tname The termname to iterate postings for. If the
* term name is the empty string, the iterator
* returned will list all the documents in the
* database. Such an iterator will always return
* a WDF value of 1, since there is no obvious
* meaning for this quantity in this case.
*/
PostingIterator postlist_begin(const std::string &tname) const;
/** Corresponding end iterator to postlist_begin().
*/
PostingIterator XAPIAN_NOTHROW(postlist_end(const std::string &) const) {
return PostingIterator();
}
/** An iterator pointing to the start of the termlist
* for a given document.
*
* @param did The document id of the document to iterate terms for.
*/
TermIterator termlist_begin(Xapian::docid did) const;
/** Corresponding end iterator to termlist_begin().
*/
TermIterator XAPIAN_NOTHROW(termlist_end(Xapian::docid) const) {
return TermIterator();
}
/** Does this database have any positional information? */
bool has_positions() const;
/** An iterator pointing to the start of the position list
* for a given term in a given document.
*/
PositionIterator positionlist_begin(Xapian::docid did, const std::string &tname) const;
/** Corresponding end iterator to positionlist_begin().
*/
PositionIterator XAPIAN_NOTHROW(positionlist_end(Xapian::docid, const std::string &) const) {
return PositionIterator();
}
/** An iterator which runs across all terms with a given prefix.
*
* @param prefix The prefix to restrict the returned terms to (default:
* iterate all terms)
*/
TermIterator allterms_begin(const std::string & prefix = std::string()) const;
/** Corresponding end iterator to allterms_begin(prefix).
*/
TermIterator XAPIAN_NOTHROW(allterms_end(const std::string & = std::string()) const) {
return TermIterator();
}
/// Get the number of documents in the database.
Xapian::doccount get_doccount() const;
/// Get the highest document id which has been used in the database.
Xapian::docid get_lastdocid() const;
/// Get the average length of the documents in the database.
Xapian::doclength get_avlength() const;
/// Get the number of documents in the database indexed by a given term.
Xapian::doccount get_termfreq(const std::string & tname) const;
/** Check if a given term exists in the database.
*
* @param tname The term to test the existence of.
*
* @return true if and only if the term exists in the database.
* This is the same as (get_termfreq(tname) != 0), but
* will often be more efficient.
*/
bool term_exists(const std::string & tname) const;
/** Return the total number of occurrences of the given term.
*
* This is the sum of the number of occurrences of the term in each
* document it indexes: i.e., the sum of the within-document
* frequencies of the term.
*
* @param tname The term whose collection frequency is being
* requested.
*/
Xapian::termcount get_collection_freq(const std::string & tname) const;
/** Return the frequency of a given value slot.
*
* This is the number of documents which have a (non-empty) value
* stored in the slot.
*
* @param slot The value slot to examine.
*
* @exception UnimplementedError The frequency of the value isn't
* available for this database type.
*/
Xapian::doccount get_value_freq(Xapian::valueno slot) const;
/** Get a lower bound on the values stored in the given value slot.
*
* If there are no values stored in the given value slot, this will
* return an empty string.
*
* If the lower bound isn't available for the given database type,
* this will return the lowest possible bound - the empty string.
*
* @param slot The value slot to examine.
*/
std::string get_value_lower_bound(Xapian::valueno slot) const;
/** Get an upper bound on the values stored in the given value slot.
*
* If there are no values stored in the given value slot, this will
* return an empty string.
*
* @param slot The value slot to examine.
*
* @exception UnimplementedError The upper bound of the values isn't
* available for this database type.
*/
std::string get_value_upper_bound(Xapian::valueno slot) const;
/** Get a lower bound on the length of a document in this DB.
*
* This bound does not include any zero-length documents.
*/
Xapian::termcount get_doclength_lower_bound() const;
/// Get an upper bound on the length of a document in this DB.
Xapian::termcount get_doclength_upper_bound() const;
/// Get an upper bound on the wdf of term @a term.
Xapian::termcount get_wdf_upper_bound(const std::string & term) const;
/// Return an iterator over the value in slot @a slot for each document.
ValueIterator valuestream_begin(Xapian::valueno slot) const;
/// Return end iterator corresponding to valuestream_begin().
ValueIterator XAPIAN_NOTHROW(valuestream_end(Xapian::valueno) const) {
return ValueIterator();
}
/// Get the length of a document.
Xapian::termcount get_doclength(Xapian::docid did) const;
/// Get the number of unique terms in a document.
Xapian::termcount get_unique_terms(Xapian::docid did) const;
/** Send a "keep-alive" to remote databases to stop them timing out.
*
* Has no effect on non-remote databases.
*/
void keep_alive();
/** Get a document from the database, given its document id.
*
* This method returns a Xapian::Document object which provides the
* information about a document.
*
* @param did The document id of the document to retrieve.
*
* @return A Xapian::Document object containing the document data
*
* @exception Xapian::DocNotFoundError The document specified
* could not be found in the database.
*
* @exception Xapian::InvalidArgumentError did was 0, which is not
* a valid document id.
*/
Xapian::Document get_document(Xapian::docid did) const;
/** Suggest a spelling correction.
*
* @param word The potentially misspelled word.
* @param max_edit_distance Only consider words which are at most
* @a max_edit_distance edits from @a word. An edit is a
* character insertion, deletion, or the transposition of two
* adjacent characters (default is 2).
*/
std::string get_spelling_suggestion(const std::string &word,
unsigned max_edit_distance = 2) const;
/** An iterator which returns all the spelling correction targets.
*
* This returns all the words which are considered as targets for the
* spelling correction algorithm. The frequency of each word is
* available as the term frequency of each entry in the returned
* iterator.
*/
Xapian::TermIterator spellings_begin() const;
/// Corresponding end iterator to spellings_begin().
Xapian::TermIterator XAPIAN_NOTHROW(spellings_end() const) {
return Xapian::TermIterator();
}
/** An iterator which returns all the synonyms for a given term.
*
* @param term The term to return synonyms for.
*/
Xapian::TermIterator synonyms_begin(const std::string &term) const;
/// Corresponding end iterator to synonyms_begin(term).
Xapian::TermIterator XAPIAN_NOTHROW(synonyms_end(const std::string &) const) {
return Xapian::TermIterator();
}
/** An iterator which returns all terms which have synonyms.
*
* @param prefix If non-empty, only terms with this prefix are
* returned.
*/
Xapian::TermIterator synonym_keys_begin(const std::string &prefix = std::string()) const;
/// Corresponding end iterator to synonym_keys_begin(prefix).
Xapian::TermIterator XAPIAN_NOTHROW(synonym_keys_end(const std::string & = std::string()) const) {
return Xapian::TermIterator();
}
/** Get the user-specified metadata associated with a given key.
*
* User-specified metadata allows you to store arbitrary information
* in the form of (key,tag) pairs. See @a
* WritableDatabase::set_metadata() for more information.
*
* When invoked on a Xapian::Database object representing multiple
* databases, currently only the metadata for the first is considered
* but this behaviour may change in the future.
*
* If there is no piece of metadata associated with the specified
* key, an empty string is returned (this applies even for backends
* which don't support metadata).
*
* Empty keys are not valid, and specifying one will cause an
* exception.
*
* @param key The key of the metadata item to access.
*
* @return The retrieved metadata item's value.
*
* @exception Xapian::InvalidArgumentError will be thrown if the
* key supplied is empty.
*/
std::string get_metadata(const std::string & key) const;
/** An iterator which returns all user-specified metadata keys.
*
* When invoked on a Xapian::Database object representing multiple
* databases, currently only the metadata for the first is considered
* but this behaviour may change in the future.
*
* If the backend doesn't support metadata, then this method returns
* an iterator which compares equal to that returned by
* metadata_keys_end().
*
* @param prefix If non-empty, only keys with this prefix are
* returned.
*
* @exception Xapian::UnimplementedError will be thrown if the
* backend implements user-specified metadata, but
* doesn't implement iterating its keys (currently
* this happens for the InMemory backend).
*/
Xapian::TermIterator metadata_keys_begin(const std::string &prefix = std::string()) const;
/// Corresponding end iterator to metadata_keys_begin().
Xapian::TermIterator XAPIAN_NOTHROW(metadata_keys_end(const std::string & = std::string()) const) {
return Xapian::TermIterator();
}
/** Get a UUID for the database.
*
* The UUID will persist for the lifetime of the database.
*
* Replicas (e.g., made with the replication protocol, or by copying all
* the database files) will have the same UUID. However, copies (made
* with copydatabase, or xapian-compact) will have different UUIDs.
*
* If the backend does not support UUIDs or this database has no
* subdatabases, the UUID will be empty.
*
* If this database has multiple sub-databases, the UUID string will
* contain the UUIDs of all the sub-databases.
*/
std::string get_uuid() const;
/** Check the integrity of a database or database table.
*
* This method is currently experimental, and may change incompatibly
* or possibly even be removed. Feedback on how well it works and
* how it might be improved is welcome.
*
* @param path Path to database or table
* @param opts Options to use for check
* @param out std::ostream to write output to (NULL for no output)
*/
static size_t check(const std::string & path, int opts = 0,
std::ostream *out = NULL) {
return check_(&path, 0, opts, out);
}
/** Check the integrity of a single file database.
*
* This method is currently experimental, and may change incompatibly
* or possibly even be removed. Feedback on how well it works and
* how it might be improved is welcome.
*
* @param fd file descriptor for the database. The current file
* offset is used, which allows checking a single-file
* database embedded within another file. Xapian
* takes ownership of the file descriptor and will close
* it before returning.
* @param opts Options to use for check
* @param out std::ostream to write output to (NULL for no output)
*/
static size_t check(int fd, int opts = 0, std::ostream *out = NULL) {
return check_(NULL, fd, opts, out);
}
/** Produce a compact version of this database.
*
* New in 1.3.4. Various methods of the Compactor class were deprecated
* in 1.3.4.
*
* @param output Path to write the compact version to.
* This can be the same as an input if that input is a
* stub database (in which case the database(s) listed
* in the stub will be compacted to a new database and
* then the stub will be atomically updated to point to
* this new database).
*
* @param flags Any of the following combined using bitwise-or (| in
* C++):
* - Xapian::DBCOMPACT_NO_RENUMBER By default the document ids will
* be renumbered in the output - currently by applying the
* same offset to all the document ids in a particular
* source database. If this flag is specified, then this
* renumbering doesn't happen, but all the document ids
* must be unique over all source databases. Currently
* the ranges of document ids in each source must not
* overlap either, though this restriction may be removed
* in the future.
* - Xapian::DBCOMPACT_MULTIPASS
* If merging more than 3 databases, merge the postlists
* in multiple passes, which is generally faster but
* requires more disk space for temporary files.
* - Xapian::DBCOMPACT_SINGLE_FILE
* Produce a single-file database (only supported for
* glass currently).
*
* @param block_size This specifies the block size (in bytes)
* to use for the output. For glass, the block size must
* be a power of 2 between 2048 and 65536 (inclusive), and
* the default (also used if an invalid value is passed)
* is 8192 bytes.
*/
void compact(const std::string & output,
unsigned flags = 0,
int block_size = 0) {
compact_(&output, 0, flags, block_size, NULL);
}
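/* Illustrative sketch only, not part of the Xapian API. The block size rule
documented for compact() above (for glass: a power of 2 between 2048 and
65536 inclusive, with 8192 used when an invalid value is passed) could be
expressed roughly as follows. The helper names effective_block_size() and
effective_block_size_examples() are hypothetical, and this is an assumption
about the documented behaviour rather than Xapian's actual implementation.

```cpp
#include <cassert>

// Hypothetical helper (not a Xapian function): returns the block size that
// the documented rule would select for a requested value.
constexpr int effective_block_size(int requested) {
    return (requested >= 2048 && requested <= 65536 &&
            (requested & (requested - 1)) == 0)   // power-of-2 test
               ? requested
               : 8192;                            // documented default
}

// Example checks; call this from a test or from main().
inline void effective_block_size_examples() {
    assert(effective_block_size(0) == 8192);       // default argument value
    assert(effective_block_size(4096) == 4096);    // valid power of 2
    assert(effective_block_size(3000) == 8192);    // not a power of 2
    assert(effective_block_size(131072) == 8192);  // above the maximum
}
```

Under this reading, passing the default block_size of 0 to compact() simply
selects the 8192-byte default.
*/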
/** Produce a compact version of this database.
*
* New in 1.3.4. Various methods of the Compactor class were deprecated
* in 1.3.4.
*
* This variant writes a single-file database to the specified file
* descriptor. Only the glass backend supports such databases, so
* this form is only supported for this backend.
*
* @param fd File descriptor to write the compact version to. The
* descriptor needs to be readable and writable (open with
* O_RDWR) and seekable. The current file offset is used,
* allowing compacting to a single-file database embedded
* within another file. Xapian takes ownership of the
* file descriptor and will close it before returning.
*
* @param flags Any of the following combined using bitwise-or (| in
* C++):
* - Xapian::DBCOMPACT_NO_RENUMBER By default the document ids will
* be renumbered in the output - currently by applying the
* same offset to all the document ids in a particular
* source database. If this flag is specified, then this
* renumbering doesn't happen, but all the document ids
* must be unique over all source databases. Currently
* the ranges of document ids in each source must not
* overlap either, though this restriction may be removed
* in the future.
* - Xapian::DBCOMPACT_MULTIPASS
* If merging more than 3 databases, merge the postlists
* in multiple passes, which is generally faster but
* requires more disk space for temporary files.
* - Xapian::DBCOMPACT_SINGLE_FILE
* Produce a single-file database (only supported for
* glass currently) - this flag is implied in this form
* and need not be specified explicitly.
*
* @param block_size This specifies the block size (in bytes)
* to use for the output. For glass, the block size must
* be a power of 2 between 2048 and 65536 (inclusive), and
* the default (also used if an invalid value is passed)
* is 8192 bytes.
*/
void compact(int fd,
unsigned flags = 0,
int block_size = 0) {
compact_(NULL, fd, flags, block_size, NULL);
}
/** Produce a compact version of this database.
*
* New in 1.3.4. Various methods of the Compactor class were deprecated
* in 1.3.4.
*
* The @a compactor functor allows handling progress output and
* specifying how user metadata is merged.
*
* @param output Path to write the compact version to.
* This can be the same as an input if that input is a
* stub database (in which case the database(s) listed
* in the stub will be compacted to a new database and
* then the stub will be atomically updated to point to
* this new database).
*
* @param flags Any of the following combined using bitwise-or (| in
* C++):
* - Xapian::DBCOMPACT_NO_RENUMBER By default the document ids will
* be renumbered in the output - currently by applying the
* same offset to all the document ids in a particular
* source database. If this flag is specified, then this
* renumbering doesn't happen, but all the document ids
* must be unique over all source databases. Currently
* the ranges of document ids in each source must not
* overlap either, though this restriction may be removed
* in the future.
* - Xapian::DBCOMPACT_MULTIPASS
* If merging more than 3 databases, merge the postlists
* in multiple passes, which is generally faster but
* requires more disk space for temporary files.
* - Xapian::DBCOMPACT_SINGLE_FILE
* Produce a single-file database (only supported for
* glass currently).
*
* @param block_size This specifies the block size (in bytes)
* to use for the output. For glass, the block size must
* be a power of 2 between 2048 and 65536 (inclusive), and
* the default (also used if an invalid value is passed)
* is 8192 bytes.
*
* @param compactor Functor which handles progress output and specifies
* how user metadata is merged.
*/
void compact(const std::string & output,
unsigned flags,
int block_size,
Xapian::Compactor & compactor)
{
compact_(&output, 0, flags, block_size, &compactor);
}
/** Produce a compact version of this database.
*
* New in 1.3.4. Various methods of the Compactor class were deprecated
* in 1.3.4.
*
* The @a compactor functor allows handling progress output and
* specifying how user metadata is merged.
*
* This variant writes a single-file database to the specified file
* descriptor. Only the glass backend supports such databases, so
* this form is only supported for this backend.
*
* @param fd File descriptor to write the compact version to. The
* descriptor needs to be readable and writable (open with
* O_RDWR) and seekable. The current file offset is used,
* allowing compacting to a single-file database embedded
* within another file. Xapian takes ownership of the
* file descriptor and will close it before returning.
*
* @param flags Any of the following combined using bitwise-or (| in
* C++):
* - Xapian::DBCOMPACT_NO_RENUMBER By default the document ids will
* be renumbered in the output - currently by applying the
* same offset to all the document ids in a particular
* source database. If this flag is specified, then this
* renumbering doesn't happen, but all the document ids
* must be unique over all source databases. Currently
* the ranges of document ids in each source must not
* overlap either, though this restriction may be removed
* in the future.
* - Xapian::DBCOMPACT_MULTIPASS
* If merging more than 3 databases, merge the postlists
* in multiple passes, which is generally faster but
* requires more disk space for temporary files.
* - Xapian::DBCOMPACT_SINGLE_FILE
* Produce a single-file database (only supported for
* glass currently) - this flag is implied in this form
* and need not be specified explicitly.
*
* @param block_size This specifies the block size (in bytes)
* to use for the output. For glass, the block size must
* be a power of 2 between 2048 and 65536 (inclusive), and
* the default (also used if an invalid value is passed)
* is 8192 bytes.
*
* @param compactor Functor which handles progress output and specifies
* how user metadata is merged.
*/
void compact(int fd,
unsigned flags,
int block_size,
Xapian::Compactor & compactor)
{
compact_(NULL, fd, flags, block_size, &compactor);
}
};
/** This class provides read/write access to a database.
*/
class XAPIAN_VISIBILITY_DEFAULT WritableDatabase : public Database {
public:
/** Destroy this handle on the database.
*
* If no other handles to this database remain, the database will be
* closed.
*
* If a transaction is active cancel_transaction() will be implicitly
* called; if no transaction is active commit() will be implicitly
* called, but any exception will be swallowed (because throwing
* exceptions in C++ destructors is problematic). If you aren't using
* transactions and want to know about any failure to commit changes,
* call commit() explicitly before the destructor gets called.
*/
virtual ~WritableDatabase();
/** Create a WritableDatabase with no subdatabases.
*
* The created object isn't very useful in this state - it's intended
* as a placeholder value.
*/
WritableDatabase();
/** Open a database for update, automatically determining the database
* backend to use.
*
* If the database is to be created, Xapian will try
* to create the directory indicated by path if it doesn't already
* exist (but only the leaf directory, not recursively).
*
* @param path directory that the database is stored in.
* @param flags one of:
* - Xapian::DB_CREATE_OR_OPEN open for read/write; create if no db
* exists (the default if flags isn't specified)
* - Xapian::DB_CREATE create new database; fail if db exists
* - Xapian::DB_CREATE_OR_OVERWRITE overwrite existing db; create if
* none exists
* - Xapian::DB_OPEN open for read/write; fail if no db exists
*
* Additionally, the following flags can be combined with the action
* flag using bitwise-or (| in C++):
*
* - Xapian::DB_NO_SYNC don't call fsync() or similar
* - Xapian::DB_DANGEROUS don't be crash-safe, no concurrent readers
* - Xapian::DB_RETRY_LOCK to wait to get a write lock
*
* @param block_size If a new database is created, this specifies
* the block size (in bytes) for backends which
* have such a concept. For chert and glass, the
* block size must be a power of 2 between 2048 and
* 65536 (inclusive), and the default (also used if
* an invalid value is passed) is 8192 bytes.
*
* @exception Xapian::DatabaseCorruptError will be thrown if the
* database is in a corrupt state.
*
* @exception Xapian::DatabaseLockError will be thrown if a lock
* couldn't be acquired on the database.
*/
explicit WritableDatabase(const std::string &path,
int flags = 0,
int block_size = 0);
/** @private @internal Create a WritableDatabase given its internals.
*/
explicit WritableDatabase(Database::Internal *internal);
/** Copying is allowed. The internals are reference counted, so
* copying is cheap.
*
* @param other The object to copy.
*/
WritableDatabase(const WritableDatabase &other);
/** Assignment is allowed. The internals are reference counted,
* so assignment is cheap.
*
* Note that only a WritableDatabase may be assigned to a
* WritableDatabase: an attempt to assign a Database is caught
* at compile-time.
*
* @param other The object to copy.
*/
void operator=(const WritableDatabase &other);
/** Commit any pending modifications made to the database.
*
* For efficiency reasons, when performing multiple updates to a
* database it is best (indeed, almost essential) to make as many
* modifications as memory will permit in a single pass through
* the database. To ensure this, Xapian batches up modifications.
*
* This method may be called at any time to commit any pending
* modifications to the database.
*
* If any of the modifications fail, an exception will be thrown and
* the database will be left in a state in which each separate
* addition, replacement or deletion operation has either been fully
* performed or not performed at all: it is then up to the
* application to work out which operations need to be repeated.
*
* It's not valid to call commit() within a transaction.
*
* Beware of calling commit() too frequently: this will make indexing
* take much longer.
*
* Note that commit() need not be called explicitly: it will be called
* automatically when the database is closed, or when a sufficient
* number of modifications have been made. By default, this is every
* 10000 documents added, deleted, or modified. This value is rather
* conservative, and if you have a machine with plenty of memory,
* you can improve indexing throughput dramatically by setting
* XAPIAN_FLUSH_THRESHOLD in the environment to a larger value.
*
* This method was new in Xapian 1.1.0 - in earlier versions it was
* called flush().
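*
* A minimal sketch of batching updates and committing explicitly so
* that any failure is reported to the caller (the path, the batch size
* and the document contents are hypothetical; std::cerr needs
* <iostream>):
*
* @code
* Xapian::WritableDatabase db("example.db", Xapian::DB_CREATE_OR_OPEN);
* try {
*     for (int i = 0; i < 1000; ++i) {
*         Xapian::Document doc;
*         doc.set_data("document body");
*         db.add_document(doc);
*     }
*     // One commit for the whole batch, not one per document.
*     db.commit();
* } catch (const Xapian::Error &e) {
*     // Some operations may not have been applied; the application
*     // has to work out which ones need to be repeated.
*     std::cerr << e.get_description() << std::endl;
* }
* @endcode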
*
* @exception Xapian::DatabaseError will be thrown if a problem occurs
* while modifying the database.
*
* @exception Xapian::DatabaseCorruptError will be thrown if the
* database is in a corrupt state.
*/
void commit();
/** Pre-1.1.0 name for commit().
*
* Use commit() instead in new code. This alias may be deprecated in
* the future.
*/
void flush() { commit(); }
/** Begin a transaction.
*
* In Xapian a transaction is a group of modifications to the database
* which are linked such that either all will be applied
* simultaneously or none will be applied at all. Even in the case of
* a power failure, this characteristic should be preserved (as long
* as the filesystem isn't corrupted, etc.).
*
* A transaction is started with begin_transaction() and can
* either be committed by calling commit_transaction() or aborted
* by calling cancel_transaction().
*
* By default, a transaction implicitly calls commit() before and
* after, so that the modifications in the transaction stand or fall
* independently of modifications made before or after it.
*
* The downside of these implicit calls to commit() is that small
* transactions can harm indexing performance in the same way that
* explicitly calling commit() frequently can.
*
* If you're applying atomic groups of changes and only wish to
* ensure that each group is either applied or not applied, then
* you can prevent the automatic commit() before and after the
* transaction by starting the transaction with
* begin_transaction(false). However, if cancel_transaction() is
* called (or if commit_transaction() isn't called before the
* WritableDatabase object is destroyed) then any changes which
* were pending before the transaction began will also be discarded.
*
* Transactions aren't currently supported by the InMemory backend.
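*
* A minimal sketch of applying a group of changes atomically (db is
* assumed to be an open WritableDatabase; doc1 and doc2 are
* hypothetical Xapian::Document objects):
*
* @code
* db.begin_transaction();
* try {
*     db.add_document(doc1);
*     db.add_document(doc2);
* } catch (...) {
*     // Discard both additions, then let the caller see the error.
*     db.cancel_transaction();
*     throw;
* }
* // If this throws, the transaction is no longer in progress, so
* // cancel_transaction() must not be called afterwards.
* db.commit_transaction();
* @endcode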
*
* @param flushed Is this a flushed transaction? By default
* transactions are "flushed", which means that
* committing a transaction will ensure those
* changes are permanently written to the
* database. By contrast, unflushed transactions
* only ensure that changes within the transaction
* are either all applied or none are applied.
*
* @exception Xapian::UnimplementedError will be thrown if transactions
* are not available for this database type.
*
* @exception Xapian::InvalidOperationError will be thrown if this is
* called at an invalid time, such as when a transaction
* is already in progress.
*/
void begin_transaction(bool flushed=true);
/** Complete the transaction currently in progress.
*
* If this method completes successfully and this is a flushed
* transaction, all the database modifications
* made during the transaction will have been committed to the
* database.
*
* If an error occurs, an exception will be thrown, and none of
* the modifications made to the database during the transaction
* will have been applied to the database.
*
* In all cases the transaction will no longer be in progress.
*
* @exception Xapian::DatabaseError will be thrown if a problem occurs
* while modifying the database.
*
* @exception Xapian::DatabaseCorruptError will be thrown if the
* database is in a corrupt state.
*
* @exception Xapian::InvalidOperationError will be thrown if a
* transaction is not currently in progress.
*
* @exception Xapian::UnimplementedError will be thrown if transactions
* are not available for this database type.
*/
void commit_transaction();
/** Abort the transaction currently in progress, discarding the
* pending modifications made to the database.
*
* If an error occurs in this method, an exception will be thrown,
* but the transaction will be cancelled anyway.
*
* @exception Xapian::DatabaseError will be thrown if a problem occurs
* while modifying the database.
*
* @exception Xapian::DatabaseCorruptError will be thrown if the
* database is in a corrupt state.
*
* @exception Xapian::InvalidOperationError will be thrown if a
* transaction is not currently in progress.
*
* @exception Xapian::UnimplementedError will be thrown if transactions
* are not available for this database type.
*/
void cancel_transaction();
/** Add a new document to the database.
*
* This method adds the specified document to the database,
* returning a newly allocated document ID. Automatically allocated
* document IDs come from a per-database monotonically increasing
* counter, so IDs from deleted documents won't be reused.
*
* If you want to specify the document ID to be used, you should
* call replace_document() instead.
*
* Note that changes to the database won't be immediately committed to
* disk; see commit() for more details.
*
* As with all database modification operations, the effect is
* atomic: the document will either be fully added, or the document
* fails to be added and an exception is thrown (possibly at a
* later time when commit() is called or the database is closed).
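*
* A minimal sketch (db is assumed to be an open WritableDatabase; the
* document data and the term are hypothetical):
*
* @code
* Xapian::Document doc;
* doc.set_data("Hello, world");
* doc.add_term("hello");
* // The returned ID is allocated automatically by the database.
* Xapian::docid did = db.add_document(doc);
* @endcode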
*
* @param document The new document to be added.
*
* @return The document ID of the newly added document.
*
* @exception Xapian::DatabaseError will be thrown if a problem occurs
* while writing to the database.
*
* @exception Xapian::DatabaseCorruptError will be thrown if the
* database is in a corrupt state.
*/
Xapian::docid add_document(const Xapian::Document & document);
/** Delete a document from the database.
*
* This method removes the document with the specified document ID
* from the database.
*
* Note that changes to the database won't be immediately committed to
* disk; see commit() for more details.
*
* As with all database modification operations, the effect is
* atomic: the document will either be fully removed, or the document
* fails to be removed and an exception is thrown (possibly at a
* later time when commit() is called or the database is closed).
*
* @param did The document ID of the document to be removed.
*
* @exception Xapian::DatabaseError will be thrown if a problem occurs
* while writing to the database.
*
* @exception Xapian::DatabaseCorruptError will be thrown if the
* database is in a corrupt state.
*/
void delete_document(Xapian::docid did);
/** Delete any documents indexed by a term from the database.
*
* This method removes any documents indexed by the specified term
* from the database.
*
* A major use is for convenience when UIDs from another system are
* mapped to terms in Xapian, although this method has other uses
* (for example, you could add a "deletion date" term to documents at
* index time and use this method to delete all documents due for
* deletion on a particular date).
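*
* A minimal sketch, assuming each document was indexed with a term
* formed from a "Q" prefix plus the external UID (the prefix and the
* UID "12345" are illustrative choices; db is an open
* WritableDatabase):
*
* @code
* const std::string idterm = "Q12345";
* // Removes every document indexed by idterm; nothing is deleted if
* // no document is indexed by it.
* db.delete_document(idterm);
* @endcode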
*
* @param unique_term The term to remove references to.
*
* @exception Xapian::DatabaseError will be thrown if a problem occurs
* while writing to the database.
*
* @exception Xapian::DatabaseCorruptError will be thrown if the
* database is in a corrupt state.
*/
void delete_document(const std::string & unique_term);
/** Replace a given document in the database.
*
* This method replaces the document with the specified document ID.
* If document ID @a did isn't currently used, the document will be
* added with document ID @a did.
*
* The monotonic counter used for automatically allocating document
* IDs is increased so that the next automatically allocated document
* ID will be did + 1. Be aware that if you use this method to
* specify a high document ID for a new document, and also use
* WritableDatabase::add_document(), Xapian may get to a state where
* this counter wraps around and will be unable to automatically
* allocate document IDs!
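*
* A minimal sketch of storing a document under a chosen ID (db is
* assumed to be an open WritableDatabase; the ID 42 and the data are
* hypothetical):
*
* @code
* Xapian::Document doc;
* doc.set_data("contents stored under ID 42");
* // Replaces document 42 if it exists, otherwise adds it with that ID.
* db.replace_document(42, doc);
* @endcode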
*
* Note that changes to the database won't be immediately committed to
* disk; see commit() for more details.
*
* As with all database modification operations, the effect is
* atomic: the document will either be fully replaced, or the document
* fails to be replaced and an exception is thrown (possibly at a
* later time when commit() is called or the database is closed).
*
* @param did The document ID of the document to be replaced.
* @param document The new document.
*
* @exception Xapian::DatabaseError will be thrown if a problem occurs
* while writing to the database.
*
* @exception Xapian::DatabaseCorruptError will be thrown if the
* database is in a corrupt state.
*/
void replace_document(Xapian::docid did,
const Xapian::Document & document);
/** Replace any documents matching a term.
*
* This method replaces any documents indexed by the specified term
* with the specified document. If any documents are indexed by the
* term, the lowest document ID will be used for the document,
* otherwise a new document ID will be generated as for add_document().
*
* One common use is to allow UIDs from another system to easily be
* mapped to terms in Xapian. Note that this method doesn't
* automatically add unique_term as a term, so you'll need to call
* document.add_term(unique_term) first when using replace_document()
* in this way.
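*
* A minimal sketch of this use (db is assumed to be an open
* WritableDatabase; the "Q" prefix and the UID "12345" are
* illustrative choices):
*
* @code
* const std::string idterm = "Q12345";
* Xapian::Document doc;
* doc.set_data("updated contents");
* // The unique term isn't added automatically, so add it here.
* doc.add_term(idterm);
* Xapian::docid did = db.replace_document(idterm, doc);
* @endcode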
*
* Note that changes to the database won't be immediately committed to
* disk; see commit() for more details.
*
* As with all database modification operations, the effect is
* atomic: the document(s) will either be fully replaced, or the
* document(s) fail to be replaced and an exception is thrown
* (possibly at a later time when commit() is called or the database
* is closed).
*
* @param unique_term The "unique" term.
* @param document The new document.
*
* @return The document ID that document was given.
*
* @exception Xapian::DatabaseError will be thrown if a problem occurs
* while writing to the database.
*
* @exception Xapian::DatabaseCorruptError will be thrown if the
* database is in a corrupt state.
*/
Xapian::docid replace_document(const std::string & unique_term,
const Xapian::Document & document);
/** Add a word to the spelling dictionary.
*
* If the word is already present, its frequency is increased.
*
* @param word The word to add.
* @param freqinc How much to increase its frequency by (default 1).
*/
void add_spelling(const std::string & word,
Xapian::termcount freqinc = 1) const;
/** Remove a word from the spelling dictionary.
*
* The word's frequency is decreased, and if it would become zero or
* less then the word is removed completely.
*
* @param word The word to remove.
* @param freqdec How much to decrease its frequency by (default 1).
*/
void remove_spelling(const std::string & word,
Xapian::termcount freqdec = 1) const;
/** Add a synonym for a term.
*
* @param term The term to add a synonym for.
* @param synonym The synonym to add. If this is already a
* synonym for @a term, then no action is taken.
*/
void add_synonym(const std::string & term,
const std::string & synonym) const;
/** Remove a synonym for a term.
*
* @param term The term to remove a synonym for.
* @param synonym The synonym to remove. If this isn't currently
* a synonym for @a term, then no action is taken.
*/
void remove_synonym(const std::string & term,
const std::string & synonym) const;
/** Remove all synonyms for a term.
*
* @param term The term to remove all synonyms for. If the
* term has no synonyms, no action is taken.
*/
void clear_synonyms(const std::string & term) const;
/** Set the user-specified metadata associated with a given key.
*
* This method sets the metadata value associated with a given key.
* If there is already a metadata value stored in the database with
* the same key, the old value is replaced. If you want to delete an
* existing item of metadata, just set its value to the empty string.
*
* User-specified metadata allows you to store arbitrary information
* in the form of (key, value) pairs.
*
* There's no hard limit on the number of metadata items, or the size
* of the metadata values. Metadata keys have a limited length, which
* depends on the backend. We recommend limiting them to 200 bytes.
* Empty keys are not valid, and specifying one will cause an
* exception.
*
* Metadata modifications are committed to disk in the same way as
* modifications to the documents in the database are: i.e.,
* modifications are atomic, and won't be committed to disk
* immediately (see commit() for more details). This allows metadata
* to be used to link databases with versioned external resources
* by storing the appropriate version number in a metadata item.
*
* You can also use the metadata to store arbitrary extra information
* associated with terms, documents, or postings by encoding the
* termname and/or document id into the metadata key.
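*
* A minimal sketch of recording the version of an external resource
* (db is assumed to be an open WritableDatabase; the key
* "source_version" and its value are hypothetical):
*
* @code
* db.set_metadata("source_version", "42");
* // Database::get_metadata() reads the value back.
* std::string version = db.get_metadata("source_version");
* // Setting the value to the empty string deletes the item.
* db.set_metadata("source_version", "");
* @endcode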
*
* @param key The key of the metadata item to set.
*
* @param value The value of the metadata item to set.
*
* @exception Xapian::DatabaseError will be thrown if a problem occurs
* while writing to the database.
*
* @exception Xapian::DatabaseCorruptError will be thrown if the
* database is in a corrupt state.
*
* @exception Xapian::InvalidArgumentError will be thrown if the
* key supplied is empty.
*
* @exception Xapian::UnimplementedError will be thrown if the
* database backend in use doesn't support user-specified
* metadata.
*/
void set_metadata(const std::string & key, const std::string & value);
/// Return a string describing this object.
std::string get_description() const;
};
}
#endif /* XAPIAN_INCLUDED_DATABASE_H */