contentWEB documentation – version 3.8

7.1. Search query language

The search query language is a computer language used to retrieve documents from the database according to the specified properties. The search query language used in any user interface of contentACCESS can be divided into the following categories:

Source specification
The searching user can specify where to search on different levels: tenant, model, entity
Tenant:(string) – select a tenant by name; search in tenants having the specified string in name
MTID:(string) – select a model by type identifier (EmailArchive, FileSystemArchive, SharePointArchive)
Source:(string) – select a model by keyword; search in models having the specified string as a keyword (email, file, sharepoint). This is similar to MTID above, but accepts a looser model specification. Possible values are:

  • For FileSystemArchive: file, fs, filesystem, archive
  • For EmailArchive: archive, email, mail, mailarchive, emailarchive

Examples:

  • source:file
  • source:mail

Entity:(string) – select one or more entities by name; search in entities having the specified string in name. The entity name is the mailbox address in Email archive and the root folder path in File system archive.
Examples:

  • entity:abal@tech-arrow.com – search in ABAL’s mailbox
  • entity:c:\temp – search in c:\temp folder
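
The source terms above can be assembled programmatically. The following is a minimal sketch, not part of the product: `build_source_filter` is a hypothetical helper, and it assumes that several source terms may be combined with a blank space (which this section describes as equivalent to AND for keywords).

```python
def build_source_filter(tenant=None, mtid=None, source=None, entity=None):
    """Join the given source-specification terms with spaces.

    Assumption: terms separated by a blank space are combined with AND.
    """
    parts = []
    for key, value in (("tenant", tenant), ("mtid", mtid),
                       ("source", source), ("entity", entity)):
        if value:  # skip levels the caller did not specify
            parts.append(f"{key}:{value}")
    return " ".join(parts)


print(build_source_filter(source="mail", entity="abal@tech-arrow.com"))
```

Here the call produces a query fragment restricting the search to ABAL's mailbox in the email archive.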

Property value specification
The following properties can be used to specify conditions on documents to be returned as result when searching the archive:

Date
Applicable only for properties of “date” type. Exact date specification has to be in format YYYY-MM-DD (no hours, minutes, seconds can be specified).
Example:

  • date:(2016-12-05)

Available placeholders: now – means this hour; today, yesterday, this week, last week, this month, last month, this year, last year
Example:

  • date:(now), date:(last week)

Number
Numbers are written as usual (1, 2, 3…). For size conditions, units can also be specified:
K | KB – size in kilobytes
M | MB – size in megabytes
G | GB – size in gigabytes
T | TB – size in terabytes

Example:

  • size:(>1K) – files or emails (depending on the archive) larger than 1 KB
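
To illustrate how the unit suffixes relate to plain byte counts, here is a small sketch. The helper `size_to_bytes` is hypothetical, and it assumes binary multiples (1 K = 1024 bytes); the documentation does not state whether 1000 or 1024 is used.

```python
import re

# Assumption: binary multiples; the product may use decimal ones instead.
_UNIT_FACTORS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def size_to_bytes(text):
    """Convert a value such as '1K' or '2MB' to a number of bytes."""
    match = re.fullmatch(r"\s*(\d+)\s*([KMGT]?)B?\s*", text, re.IGNORECASE)
    if match is None:
        raise ValueError(f"not a size value: {text!r}")
    number, unit = match.groups()
    return int(number) * _UNIT_FACTORS[unit.upper()]


print(size_to_bytes("1K"), size_to_bytes("2MB"))
```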

Range
Two types of ranges can be specified: numerical and date ranges. Ranges can be upper bound, lower bound or an interval. A range can be specified as a value for all properties of type “date” and “number”.
Prop:(>value) – the value of property “Prop” is greater than “value”
Prop:(<value) – the value of property “Prop” is less than “value”
Prop:(value1, value2) – the value of property “Prop” is greater than “value1” and less than “value2”
Examples:

  • size:(1K, 1M) – files/emails (depending on the archive) larger than 1KB and smaller than 1MB
  • date:(2016-10, 2016-12) – files created/modified or emails sent (depending on the archive) in the last quarter of 2016
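
The three range forms can be sketched for plain numbers as follows. `parse_range` and `in_range` are hypothetical helpers written for illustration; they handle numeric values only (no units or dates) and treat both bounds as exclusive, following the "greater than" and "less than" wording above.

```python
def parse_range(expr):
    """Return (lower, upper) exclusive bounds; None means unbounded."""
    expr = expr.strip()
    if expr.startswith(">"):
        return float(expr[1:]), None
    if expr.startswith("<"):
        return None, float(expr[1:])
    low, high = (part.strip() for part in expr.split(","))
    return float(low), float(high)


def in_range(value, expr):
    """Check a numeric value against a range expression."""
    low, high = parse_range(expr)
    return (low is None or value > low) and (high is None or value < high)


print(in_range(10, ">5"), in_range(7, "1, 10"), in_range(10, "1, 10"))
```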

Filename
Finds items by attachment name (Email archive) or file name (File archive). Wildcard characters can be used for filename pattern specification (* or ?). They have the same meaning as when searching for files in Windows.
Filename:(*.txt) – this will find all attachments and files having the extension .txt
Filename:(file) – this will find attachments and files having the exact name “file”
Filename:(file.*) – this will find attachments and files named “file” of any type (extension)
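
The wildcard behaviour can be approximated with Python's standard `fnmatch` module, as in the sketch below. This is only an approximation: `fnmatch` also gives special meaning to square brackets, which the text above does not mention, and the lower-casing assumes that file name matching ignores case.

```python
from fnmatch import fnmatchcase


def filename_matches(name, pattern):
    """Match a file or attachment name against a * / ? pattern."""
    # Assumption: matching is case-insensitive, so compare lower-cased.
    return fnmatchcase(name.lower(), pattern.lower())


print(filename_matches("report.TXT", "*.txt"))   # True
print(filename_matches("file.txt", "file"))      # False
print(filename_matches("file.docx", "file.*"))   # True
```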

Boolean queries
A Boolean query is a search type that allows you to combine desired keywords with operators like AND and OR to get more specific results.

Operator AND
This operator narrows your search down to items containing all of the words it connects. A blank space between words has the same meaning as the AND operator.
Example (both will do the same):

  • content AND access AND email AND archive
  • content access email archive

Operator OR
This operator, on the other hand, expands your search by connecting multiple phrases. At least one of the entered phrases must be present, so the search returns results containing one, two, three or even all of the selected phrases.
Example:

  • content OR access – finds all items containing “content” or “access” or “content access”

Grouping
Multiple terms or clauses can be grouped together by using parentheses “( )” to form sub-queries, for example:

  • (email OR file) AND archive – the returned results must contain at least one of the following: email archive, file archive
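
The way AND, OR, blank spaces and parentheses combine can be sketched with a tiny evaluator that checks a query against the words of one document. This is an illustration only, not the product's parser: the function names are hypothetical, and it assumes that AND binds more tightly than OR and that matching is case-insensitive on whole words.

```python
import re


def tokenize(query):
    """Split a query into parentheses and words/operators."""
    return re.findall(r"\(|\)|[^\s()]+", query)


def matches(query, text):
    """Evaluate a Boolean keyword query against the words in text."""
    words = set(text.lower().split())
    tokens = tokenize(query)
    pos = 0

    def parse_or():
        nonlocal pos
        result = parse_and()
        while pos < len(tokens) and tokens[pos] == "OR":
            pos += 1
            right = parse_and()  # parse first, then combine
            result = result or right
        return result

    def parse_and():
        nonlocal pos
        result = parse_term()
        while pos < len(tokens) and tokens[pos] not in ("OR", ")"):
            if tokens[pos] == "AND":  # explicit AND; a blank means the same
                pos += 1
            right = parse_term()
            result = result and right
        return result

    def parse_term():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "(":
            result = parse_or()
            pos += 1  # skip the closing parenthesis
            return result
        return token.lower() in words

    return parse_or()


print(matches("(email OR file) AND archive", "file archive"))  # True
print(matches("content access", "content only"))               # False
```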

Regular expressions
A regular expression (regexp) is a sequence of characters defining a search pattern. This pattern is then often used to “find” or to “find and replace” strings. Regular expressions can be specified in a search query by using the double asterisk prefix:
**

Regular expressions can be used for property queries, but also for free text queries.

Standard operators
Anchoring
In many regular expression engines a pattern can match any part of a string, and the start or end must be anchored explicitly: the symbol ^ indicates the beginning, while the $ symbol indicates the end.

Here, patterns are always anchored by default, so the provided pattern must match the entire string. For example, for string “abcde”:

  • ab.* = match
  • abcd = no match
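
The effect of anchoring can be reproduced with Python's `re` module, which is a different engine and is used here only as an analogy: `re.fullmatch` behaves like an always-anchored pattern, while `re.search` accepts a match anywhere in the string.

```python
import re


def anchored_match(pattern, text):
    """True only if the pattern matches the entire string."""
    return re.fullmatch(pattern, text) is not None


print(anchored_match("ab.*", "abcde"))          # True
print(anchored_match("abcd", "abcde"))          # False
print(re.search("abcd", "abcde") is not None)   # True: unanchored search
```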

Allowed characters
Any Unicode character may be used in the pattern, but there are some exceptions that are reserved and must be escaped. The standard reserved characters are:

  • . ? + * | { } [ ] ( ) " \

If you enable optional features (described in this section), then the following characters may also be reserved:

  • # @ & < > ~
Note: Any reserved character can be escaped using a backslash “\*”, including a backslash character itself: “\\”.

Any character (except double quotes) is interpreted literally when bounded by double quotes:

  • john"@smith.com"
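
When a literal value has to be placed inside a pattern, the reserved characters can be escaped mechanically. The sketch below uses a hypothetical helper, `escape_reserved`, built from the two character lists above; whether the optional characters need escaping depends on which optional features are enabled.

```python
STANDARD_RESERVED = set('.?+*|{}[]()"\\')
OPTIONAL_RESERVED = set("#@&<>~")


def escape_reserved(text, optional_features=True):
    """Prefix every reserved character with a backslash."""
    reserved = set(STANDARD_RESERVED)
    if optional_features:
        reserved |= OPTIONAL_RESERVED
    return "".join("\\" + ch if ch in reserved else ch for ch in text)


print(escape_reserved("john@smith.com"))
```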

Match any character
The period symbol “.” can be used to represent any character. The string “abcde” can be found like this:

  • ab... = match
  • a.c.e = match

Once or more
The plus symbol “+” can be used to repeat the preceding pattern once or multiple times. The string “aaabbb” can be found like this:

  • a+b+ = match
  • aa+bb+ = match
  • a+.+ = match
  • aa+bbb+ = match

Zero or more times
The asterisk symbol “*” can be used to match the preceding pattern zero or more times. The string “aaabbb” can be found like this:

  • a*b* = match
  • a*b*c* = match
  • .*bbb.* = match
  • aaa*bbb* = match

Zero times or once
The question mark “?” makes the preceding pattern optional, so it matches zero times or once. The string “aaabbb” can be found like this:

  • aaa?bbb? = match
  • aaaa?bbbb? = match
  • .....?.? = match
  • aa?bb? = no match

Minimum to maximum
Curly brackets “{}” can be used to specify a minimum and also maximum number of times the preceding shortest pattern can be repeated. The allowed forms are:

  • {5} – the pattern repeats exactly 5 times
  • {2,5} – the pattern repeats 2 to 5 times
  • {2,} – the pattern repeats at least twice

For string “aaabbb”, the following applies:

  • a{3}b{3} = match
  • a{2,4}b{2,4} = match
  • a{2,}b{2,} = match
  • .{3}.{3} = match
  • a{4}b{4} = no match
  • a{4,6}b{4,6} = no match
  • a{4,}b{4,} = no match

Grouping
By using parentheses “()”, it is possible to form sub-patterns. The quantity operators listed above operate on the shortest previous pattern, which can also be a group. For string “ababab”, the following applies:

  • (ab)+ = match
  • ab(ab)+ = match
  • (..)+ = match
  • (...)+ = match
  • (ab)* = match
  • abab(ab)? = match
  • ab(ab)? = no match
  • (ab){3} = match
  • (ab){1,2} = no match

Alternation
The pipe symbol “|” works the same as the OR operator mentioned above in this section. The match will be successful if the pattern on either the left side OR the right side matches. Alternation applies to the longest pattern. For string “aabb”, the following applies:

  • aabb|bbaa = match
  • aacc|bb = no match
  • aa(cc|bb) = match
  • a+|b+ = no match
  • a+b+|b+a+ = match
  • a+(b|c)+ = match

Character classes
Ranges of characters may be specified as character classes, by being enclosed in square brackets “[]”. A leading ^ symbol negates the character class. The following forms are allowed:

  • [abc] = ‘a’ or ‘b’ or ‘c’
  • [a-c] = ‘a’ or ‘b’ or ‘c’
  • [-abc] = ‘-’ or ‘a’ or ‘b’ or ‘c’
  • [abc\-] = ‘-’ or ‘a’ or ‘b’ or ‘c’
  • [^abc] = any character except ‘a’ or ‘b’ or ‘c’
  • [^a-c] = any character except ‘a’ or ‘b’ or ‘c’
  • [^-abc] = any character except ‘-’ or ‘a’ or ‘b’ or ‘c’
  • [^abc\-] = any character except ‘-’ or ‘a’ or ‘b’ or ‘c’
Note: The dash “-” indicates a range of characters, except when it is the first character or when it is escaped with a backslash.

For string “abcd”, the following applies:

  • ab[cd]+ = match
  • [a-d]+ = match
  • [^a-d]+ = no match
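
A selection of the standard operator examples can be checked mechanically. The sketch below uses Python's `re.fullmatch` as a stand-in for an always-anchored engine; it is a different implementation, so it only confirms that these particular patterns behave as listed, not that every pattern will behave the same in the search.

```python
import re

# (pattern, string, expected result) taken from the examples in this section
EXAMPLES = [
    ("a+b+", "aaabbb", True),
    ("aa+bbb+", "aaabbb", True),
    ("a*b*c*", "aaabbb", True),
    ("aa?bb?", "aaabbb", False),
    ("a{2,4}b{2,4}", "aaabbb", True),
    ("a{4,}b{4,}", "aaabbb", False),
    ("ab(ab)?", "ababab", False),
    ("(ab){1,2}", "ababab", False),
    ("a+|b+", "aabb", False),
    ("a+(b|c)+", "aabb", True),
    ("ab[cd]+", "abcd", True),
    ("[^a-d]+", "abcd", False),
]


def mismatches(examples):
    """Return the examples whose actual result differs from the expected one."""
    return [(pattern, text) for pattern, text, expected in examples
            if (re.fullmatch(pattern, text) is not None) != expected]


print(mismatches(EXAMPLES))  # [] when every example behaves as listed
```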

Optional operators
Complement
Complement is probably the most used and helpful option. The shortest pattern that comes after a tilde symbol “~” is negated. For example, “ab~cd” means:

  • Starts with a
  • a is followed by b
  • b is followed by a string of any length, which can be anything except c
  • Ends with d

For the string “abcdef”, the following applies:

  • ab~df = match
  • ab~cf = match
  • ab~cdef = no match
  • a~(cb)def = match
  • a~(bc)def = no match
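
Python's `re` module has no complement operator, but the examples above can be emulated for the simple case of a literal prefix, one negated pattern and a literal suffix. `complement_match` is a hypothetical helper written under that restriction: it cuts out the middle part of the string and requires that the negated pattern does not match it in full.

```python
import re


def complement_match(prefix, negated, suffix, text):
    """Emulate prefix~(negated)suffix for literal prefix and suffix."""
    if len(text) < len(prefix) + len(suffix):
        return False
    if not (text.startswith(prefix) and text.endswith(suffix)):
        return False
    middle = text[len(prefix):len(text) - len(suffix)]
    # The middle may be anything except a full match of the negated pattern.
    return re.fullmatch(negated, middle) is None


print(complement_match("ab", "d", "f", "abcdef"))    # ab~df      -> True
print(complement_match("ab", "c", "def", "abcdef"))  # ab~cdef    -> False
print(complement_match("a", "bc", "def", "abcdef"))  # a~(bc)def  -> False
```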

Interval
The interval option enables the use of numeric ranges. The ranges must always be enclosed in angle brackets “< >”. For string “access90”, the following applies:

  • access<1-100> = match
  • access<01-100> = match
  • access<001-100> = no match
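
The rule behind the leading zeros is not spelled out here. One reading that reproduces the three results above is sketched below; it is an assumption, not a documented rule: when both bounds are written with the same number of digits, the matched number must have exactly that many digits, otherwise any number of digits is accepted. `interval_match` is a hypothetical helper that checks only the numeric part.

```python
def interval_match(bounds, digits_text):
    """Check the numeric part of a string against bounds such as '01-100'."""
    low_text, high_text = bounds.split("-")
    if not digits_text.isdecimal():
        return False
    # Assumed rule: equally long bounds fix the number of digits.
    if len(low_text) == len(high_text) and len(digits_text) != len(low_text):
        return False
    return int(low_text) <= int(digits_text) <= int(high_text)


print(interval_match("1-100", "90"))    # True
print(interval_match("01-100", "90"))   # True
print(interval_match("001-100", "90"))  # False
```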

Intersection
The ampersand symbol “&” joins two patterns, both of which have to match the string. For string “aaabbb”, the following applies:

  • aaa.+&.+bbb = match
  • aaa&bbb = no match

Any string
The at sign “@” matches any string in its entire length. This can be combined with intersection and complement (mentioned above) in cases when you want to search for “everything except something”. For example:

  • @&~(content.+) finds everything, except strings beginning with “content”
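
Intersection and the "everything except" idiom can be emulated in Python by running two whole-string matches. The helpers below are hypothetical and use `re.fullmatch` only as an analogy for the always-anchored behaviour described in this section.

```python
import re


def intersection_match(left, right, text):
    """Emulate left&right: both patterns must match the whole string."""
    return (re.fullmatch(left, text) is not None
            and re.fullmatch(right, text) is not None)


def everything_except(pattern, text):
    """Emulate @&~(pattern): any string that the pattern does not match."""
    return re.fullmatch(pattern, text) is None


print(intersection_match("aaa.+", ".+bbb", "aaabbb"))   # True
print(intersection_match("aaa", "bbb", "aaabbb"))       # False
print(everything_except("content.+", "contentACCESS"))  # False
```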

Properties in different archives
When specifying a boolean value for a property in query, the following notations can be used:

  • true | yes | y stand for True
  • false | no | n stand for False

Property names and values are not case sensitive. Wildcard characters (* and ?) can be used everywhere.
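
A client that builds or validates queries could normalize the Boolean notations as sketched below. `parse_bool` is a hypothetical helper; it relies only on the notations listed above and on values not being case sensitive.

```python
_TRUE_VALUES = {"true", "yes", "y"}
_FALSE_VALUES = {"false", "no", "n"}


def parse_bool(value):
    """Map any accepted Boolean notation to True or False."""
    normalized = value.strip().lower()
    if normalized in _TRUE_VALUES:
        return True
    if normalized in _FALSE_VALUES:
        return False
    raise ValueError(f"not a Boolean value: {value!r}")


print(parse_bool("Yes"), parse_bool("N"))  # True False
```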

The character ‘|’ means an option or alternative (in cases if multiple property names and values can be used).

If the value is specified in quotes (e.g. “value”), it is considered as a phrase.
Example:

  • “brown fox” will find all documents that contain the word “brown” followed by the word “fox”

Email properties
The properties below are applicable when searching in Email archive

Property: Specificity – Description
HasAttachment: true | false – if true, finds emails having one or more attachments; if false, finds emails having no attachments
Importance: Low | Normal | High – finds emails with the specified importance level
Sensitivity: Normal | Personal | Private | Confidential – finds emails with the specified sensitivity level
Flag: true | false – finds emails having a flag set (true) or not set (false)
AttachmentCount: (number) – finds emails with the specified attachment count
Bcc: (string) – condition on addresses in BCC tag of the email
Category: (string) – condition on category
Cc: (string) – condition on addresses in CC tag of the email
Folder: (string) – condition on folder path; possible to find emails only in the specified folder (backslash is used as path separator, e.g. Inbox\Important)
ReceivedDate: (date) – condition on receiving date
RetentionTime: (number) – condition on retention time (in months)
Sender | From: (string) – condition on email sender
Date | SentDate: (date) – condition on email’s sent date
Size: (number) – condition on email’s size in bytes
Title | Subject: (string) – condition on email subject
To: (string) – condition on email’s recipient
Body: (string) – search in the mail’s body text
Attachment: (string) – search in mail’s attachment text

File properties
The properties below are applicable when searching in File archive

Property: Specificity – Description
CreationDate: (date) – condition on file’s creation date
Title | Filename: (string) – condition on file’s name
Folder: (string) – condition on file’s path (\ is the path separator as in Windows, e.g. c:\documents\rfa)
Date | ModifiedDate: (date) – condition on file’s modification date
Size: (number) – condition on file’s size in bytes

SharePoint document properties
The properties below are applicable when searching in SharePoint archive

Property: Specificity – Description
CreatedBy: (string) – condition on user who created the file
CreationDate: (date) – condition on creation date
FileSize: (number) – condition on file size
Date | ModificationDate: (date) – condition on modification date
ModifiedBy: (string) – condition on user who modified the document
Name: (string) – condition on document name
Title: (string) – condition on document title
VersionNum: (number) – condition on document’s version number
