7.1. Search query language
The search query language is used to retrieve documents from the database that match the specified properties. The search query language available in every contentACCESS user interface can be divided into the following categories:
Source specification
The user can specify where to search on different levels: tenant, model, and entity.
Tenant:(string) – select a tenant by name; search in tenants having the specified string in name
MTID:(string) – select a model by type identifier (EmailArchive, FileSystemArchive, SharePointArchive)
Source:(string) – select a model by keyword; search in models having the specified string as a keyword (email, file, sharepoint). This is similar to MTID, but accepts a looser model specification. Possible values are:
- For FileSystemArchive: file, fs, filesystem, archive
- For EmailArchive: archive, email, mail, mailarchive, emailarchive
Examples:
- source:file
- source:mail
Entity:(string) – select one or more entities by name; search in entities having the specified string in name. The entity name is the mailbox address in Email archive and the root folder path in File system archive.
Examples:
- entity:abal@tech-arrow.com – search in ABAL’s mailbox
- entity:c:\temp – search in c:\temp folder
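As an illustration, several source specifiers can be assembled into one query string. The following is a minimal Python sketch; the helper name `source_filter` is hypothetical and not part of contentACCESS, and it assumes that specifiers are separated by blank spaces, which the Boolean queries part of this section describes as equivalent to AND.

```python
def source_filter(tenant=None, mtid=None, source=None, entity=None):
    """Join the given source specifiers into one query string (hypothetical helper)."""
    parts = []
    for keyword, value in (("tenant", tenant), ("mtid", mtid),
                           ("source", source), ("entity", entity)):
        if value:
            parts.append(f"{keyword}:{value}")
    # Assumption: specifiers are combined with blank spaces (implicit AND).
    return " ".join(parts)


print(source_filter(source="mail", entity="abal@tech-arrow.com"))
# source:mail entity:abal@tech-arrow.com
```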
Property value specification
The following properties can be used to specify conditions on documents to be returned as result when searching the archive:
Date
Applicable only to properties of type “date”. An exact date has to be written in the format YYYY-MM-DD (hours, minutes, and seconds cannot be specified).
Example:
- date:(2016-12-05)
Available placeholders: now (the current hour), today, yesterday, this week, last week, this month, last month, this year, last year
Example:
- date:(now), date:(last week)
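Placeholders are resolved relative to the moment of the search. The following minimal Python sketch shows one plausible reading for a few of them. The function name `resolve_placeholder`, the Monday-to-Sunday week, and the inclusive day boundaries are assumptions, not behavior confirmed for contentACCESS. The placeholder now and the month placeholders are left out for brevity.

```python
from datetime import date, timedelta


def resolve_placeholder(name, today):
    """Return (first_day, last_day) for a date placeholder (illustrative only)."""
    if name == "today":
        return today, today
    if name == "yesterday":
        day = today - timedelta(days=1)
        return day, day
    # Assumption: a week runs from Monday to Sunday.
    monday = today - timedelta(days=today.weekday())
    if name == "this week":
        return monday, monday + timedelta(days=6)
    if name == "last week":
        return monday - timedelta(days=7), monday - timedelta(days=1)
    if name == "this year":
        return date(today.year, 1, 1), date(today.year, 12, 31)
    if name == "last year":
        return date(today.year - 1, 1, 1), date(today.year - 1, 12, 31)
    raise ValueError(f"unsupported placeholder: {name}")


# 2016-12-05 was a Monday, so "last week" is the seven days before it.
print(resolve_placeholder("last week", date(2016, 12, 5)))
```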
Number
Numbers are written in the usual way (1, 2, 3…). For size conditions, units can also be specified:
K | KB – size in kilobytes
M | MB – size in megabytes
G | GB – size in gigabytes
T | TB – size in terabytes
Example:
- size:(>1K) – files or emails (depending on the archive) larger than 1 KB
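To make the unit suffixes concrete, the following minimal Python sketch converts a size value into bytes. The function name `size_to_bytes` is hypothetical, and the binary factors (1K = 1024 bytes) are an assumption: the text above does not state whether the units are binary or decimal.

```python
import re

# Assumption: binary multiples (1K = 1024 bytes).
UNIT_FACTORS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def size_to_bytes(text):
    """Convert a size such as '1K' or '20MB' into a number of bytes."""
    match = re.fullmatch(r"(\d+)\s*(?:([KMGT])B?)?", text.strip().upper())
    if match is None:
        raise ValueError(f"not a size value: {text!r}")
    number, unit = match.groups()
    # A plain number without a unit is taken as a byte count.
    return int(number) * UNIT_FACTORS.get(unit, 1)


print(size_to_bytes("1K"))   # 1024
print(size_to_bytes("2MB"))  # 2097152
```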
Range
Two types of ranges can be specified: numerical and date ranges. A range can have a lower bound, an upper bound, or both (an interval). A range can be specified as a value for any property of type “date” or “number”.
Prop:(>value) – the value of property “Prop” is greater than “value”
Prop:(<value) – the value of property “Prop” is less than “value”
Prop:(value1, value2) – the value of property “Prop” is greater than “value1” and less than “value2”
Examples:
- size:(1K, 1M) – files/emails (depending on the archive) larger than 1KB and smaller than 1MB
- date:(2016-10, 2016-12) – files created/modified or emails sent (depending on the archive) in the last quarter of 2016
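The three range forms can be read as a pair of optional bounds. The following minimal Python sketch parses the text between the parentheses; the function name `parse_range` is hypothetical, and treating a single value without an operator as an exact match is an assumption.

```python
def parse_range(value):
    """Parse the text inside Prop:( ... ) into (lower, upper) bounds.

    A bound of None means "unbounded". Values are kept as strings.
    """
    value = value.strip()
    if value.startswith(">"):
        return value[1:].strip(), None
    if value.startswith("<"):
        return None, value[1:].strip()
    if "," in value:
        lower, upper = (part.strip() for part in value.split(",", 1))
        return lower, upper
    # Assumption: a single value without an operator is an exact match.
    return value, value


print(parse_range(">1K"))     # ('1K', None)
print(parse_range("1K, 1M"))  # ('1K', '1M')
```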
Filename
Finds items by attachment name (Email archive) or file name (File archive). Wildcard characters can be used for filename pattern specification (* or ?). They have the same meaning as when searching for files in Windows.
Filename:(*.txt) – this will find all attachments and files having the extension .txt
Filename:(file) – this will find attachments and files having the exact name “file”
Filename:(file.*) – this will find attachments and files named “file” of any type (extension)
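The wildcard behavior can be approximated with Python's standard `fnmatch` module, as in the following minimal sketch. This is only an analogy: `fnmatch` follows Unix shell rules (it also treats square brackets as special), so edge cases may differ from the Windows semantics mentioned above. The function name `filename_matches` is hypothetical.

```python
from fnmatch import fnmatchcase


def filename_matches(pattern, name):
    """Check a file name against a * / ? wildcard pattern, ignoring case."""
    return fnmatchcase(name.lower(), pattern.lower())


print(filename_matches("*.txt", "Report.TXT"))   # True
print(filename_matches("file", "file.txt"))      # False
print(filename_matches("file.*", "file.docx"))   # True
```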
Boolean queries
A Boolean query is a type of search that lets you combine keywords with operators such as AND and OR to get more specific results.
Operator AND
This operator narrows your search down to items containing all of the words it separates. A blank space between words has the same meaning as the AND operator.
Examples (both do the same):
- content AND access AND email AND archive
- content access email archive
Operator OR
This operator, by contrast, expands your search by connecting multiple phrases. At least one of the entered phrases must be present, so the search returns results containing one, several, or all of them.
Example:
- content OR access – finds all items containing “content”, “access”, or both
Grouping
Multiple terms or clauses can be grouped together by using parentheses “( )” to form sub-queries, for example:
- (email OR file) AND archive – the returned results must contain “archive” together with at least one of “email” or “file”
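The logic of the grouped query can be sketched in Python by testing each document's set of words. This only illustrates the Boolean semantics and says nothing about how contentACCESS evaluates queries internally; the sample documents and the function names are made up for the example.

```python
def words(text):
    """Split a document into a set of lowercase words."""
    return set(text.lower().split())


def matches(text):
    """Evaluate (email OR file) AND archive against one document."""
    found = words(text)
    return bool(found & {"email", "file"}) and "archive" in found


documents = ["Email archive", "File archive", "SharePoint archive", "Email client"]
print([doc for doc in documents if matches(doc)])
# ['Email archive', 'File archive']
```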
Regular expressions
A regular expression (regexp) is a sequence of characters that defines a search pattern. Such a pattern is often used to “find” or to “find and replace” strings. Regular expressions can be specified in a search query by using a double asterisk prefix:
**<regular-expression-pattern>
Regular expressions can be used for property queries, but also for free text queries.
Standard operators
Anchoring
Many regular expression engines match a pattern anywhere inside a string unless the pattern is anchored explicitly, with the symbol ^ indicating the beginning of the string and the symbol $ indicating the end.
Here, patterns are always anchored: the provided pattern must match the entire string. For example, for the string “abcde”:
- ab.* = match
- abcd = no match
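The difference between anchored and unanchored matching can be tried out in Python. The `re` module is a different engine from the one behind the search, so this is only an analogy: `re.fullmatch` behaves like an always-anchored pattern, while `re.search` matches anywhere in the string.

```python
import re


def anchored_match(pattern, text):
    """True only if the pattern covers the whole string."""
    return re.fullmatch(pattern, text) is not None


def unanchored_match(pattern, text):
    """True if the pattern occurs anywhere in the string."""
    return re.search(pattern, text) is not None


print(anchored_match("ab.*", "abcde"))    # True
print(anchored_match("abcd", "abcde"))    # False
print(unanchored_match("abcd", "abcde"))  # True
```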
Allowed characters
Any Unicode character may be used in the pattern, but there are some exceptions that are reserved and must be escaped. The standard reserved characters are:
- . ? + * | { } [ ] ( ) " \
If you enable optional features (described in this section), then the following characters may also be reserved:
- # @ & < > ~
Any character (except double quotes) is interpreted literally when bounded by double quotes:
- john"@smith.com"
Match any character
The period symbol “.” can be used to represent any character. The string “abcde” can be found like this:
- ab... = match
- a.c.e = match
Once or more
The plus symbol “+” can be used to repeat the preceding pattern once or multiple times. The string “aaabbb” can be found like this:
- a+b+ = match
- aa+bb+ = match
- a+.+ = match
- aa+bbb+ = match
Zero or more times
The asterisk symbol “*” can be used to match the preceding pattern zero or more times. The string “aaabbb” can be found like this:
- a*b* = match
- a*b*c* = match
- .*bbb.* = match
- aaa*bbb* = match
Zero times or once
The question mark “?” makes the preceding pattern optional, so it matches zero times or once. The string “aaabbb” can be found like this:
- aaa?bbb? = match
- aaaa?bbbb? = match
- .....?.? = match
- aa?bb? = no match
Minimum to maximum
Curly brackets “{}” can be used to specify the minimum and maximum number of times the shortest preceding pattern is repeated. The allowed forms are:
- {5} – the pattern repeats exactly 5 times
- {2,5} – the pattern repeats 2 to 5 times
- {2,} – the pattern repeats at least twice
For string “aaabbb”, the following applies:
- a{3}b{3} = match
- a{2,4}b{2,4} = match
- a{2,}b{2,} = match
- .{3}.{3} = match
- a{4}b{4} = no match
- a{4,6}b{4,6} = no match
- a{4,}b{4,} = no match
Grouping
By using parentheses “()”, it is possible to form sub-patterns. The quantity operators listed above operate on the shortest previous pattern, which can also be a group. For string “ababab”, the following applies:
- (ab)+ = match
- ab(ab)+ = match
- (..)+ = match
- (...)+ = match
- (ab)* = match
- abab(ab)? = match
- ab(ab)? = no match
- (ab){3} = match
- (ab){1,2} = no match
Alternation
The pipe symbol “|” works the same as the OR operator mentioned above in this section. The match will be successful if the pattern on either the left side OR the right side matches. Alternation applies to the longest pattern. For string “aabb”, the following applies:
- aabb|bbaa = match
- aacc|bb = no match
- aa(cc|bb) = match
- a+|b+ = no match
- a+b+|b+a+ = match
- a+(b|c)+ = match
Character classes
Ranges of characters may be specified as character classes, by being enclosed in square brackets “[]”. A leading ^ symbol negates the character class. The following forms are allowed:
- [abc] = ‘a’ or ‘b’ or ‘c’
- [a-c] = ‘a’ or ‘b’ or ‘c’
- [-abc] = ‘-’ or ‘a’ or ‘b’ or ‘c’
- [abc\-] = ‘-’ or ‘a’ or ‘b’ or ‘c’
- [^abc] = any character except ‘a’ or ‘b’ or ‘c’
- [^a-c] = any character except ‘a’ or ‘b’ or ‘c’
- [^-abc] = any character except ‘-’ or ‘a’ or ‘b’ or ‘c’
- [^abc\-] = any character except ‘-’ or ‘a’ or ‘b’ or ‘c’
For string “abcd”, the following applies:
- ab[cd]+ = match
- [a-d]+ = match
- [^a-d]+ = no match
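A selection of the standard-operator examples can be checked mechanically. The following minimal sketch uses Python's `re.fullmatch` as a stand-in for the anchored matching described under Anchoring. Python's `re` is a different engine, so this only covers the operators both share and does not cover the optional operators; the helper name `check` is hypothetical.

```python
import re

# (string, pattern, expected result) triples taken from the examples in this section.
EXAMPLES = [
    ("aaabbb", "a+b+", True),
    ("aaabbb", "a*b*c*", True),
    ("aaabbb", "aa?bb?", False),
    ("aaabbb", "a{2,4}b{2,4}", True),
    ("aaabbb", "a{4,}b{4,}", False),
    ("ababab", "(ab){3}", True),
    ("ababab", "(ab){1,2}", False),
    ("aabb", "aa(cc|bb)", True),
    ("aabb", "a+|b+", False),
    ("abcd", "ab[cd]+", True),
    ("abcd", "[^a-d]+", False),
]


def check(examples):
    """Return the examples whose full-match result differs from the expectation."""
    return [(text, pattern) for text, pattern, expected in examples
            if (re.fullmatch(pattern, text) is not None) != expected]


print(check(EXAMPLES))  # []
```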
Optional operators
Complement
Complement is probably the most useful option. The shortest pattern that comes after a tilde symbol “~” is negated. For example, “ab~cd” means:
- Starts with a
- a is followed by b
- b is followed by a string of any length that is anything but c
- Ends with d
For the string “abcdef”, the following applies:
- ab~df = match
- ab~cf = match
- ab~cdef = no match
- a~(cb)def = match
- a~(bc)def = no match
Interval
The interval option enables the use of numeric ranges. Ranges must always be enclosed in angle brackets “< >”. For the string “access90”, the following applies:
- access<1-100> = match
- access<01-100> = match
- access<001-100> = no match
Intersection
The ampersand symbol “&” joins two patterns, both of which have to match the string. For the string “aaabbb”, the following applies:
- aaa.+&.+bbb = match
- aaa&bbb = no match
Any string
The at sign “@” matches any string in its entire length. This can be combined with intersection and complement (mentioned above) in cases when you want to search for “everything except something”. For example:
- @&~(content.+) finds everything, except strings beginning with “content”
Properties in different archives
When specifying a boolean value for a property in query, the following notations can be used:
- true | yes | y stand for True
- false | no | n stand for False
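The accepted boolean notations can be normalized as in the following minimal Python sketch. The function name `parse_bool` is hypothetical, and the lowercasing reflects the statement in this section that property values are not case sensitive.

```python
TRUE_WORDS = {"true", "yes", "y"}
FALSE_WORDS = {"false", "no", "n"}


def parse_bool(value):
    """Map a boolean notation from the query language to a Python bool."""
    word = value.strip().lower()  # values are not case sensitive
    if word in TRUE_WORDS:
        return True
    if word in FALSE_WORDS:
        return False
    raise ValueError(f"not a boolean value: {value!r}")


print(parse_bool("Yes"))  # True
print(parse_bool("n"))    # False
```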
Property names and values are not case sensitive. Wildcard characters (* and ?) can be used everywhere.
The character ‘|’ denotes an option or alternative (where multiple property names or values can be used).
If the value is specified in quotes (e.g. “value”), it is considered as a phrase.
Example:
- “brown fox” will find all documents that contain the word “brown” followed by the word “fox”
Email properties
The properties below are applicable when searching in Email archive.
Property | Specificity | Description |
HasAttachment: | true | false | if true, finds emails having one or more attachments; if false, finds emails having no attachments |
Importance: | Low | Normal | High | finds emails with the specified importance level |
Sensitivity: | Normal | Personal | Private | Confidential | finds emails with the specified sensitivity level |
Flag: | true | false | finds emails having a flag set (true) or not set (false) |
AttachmentCount: | (number) | finds emails with the specified attachment count |
Bcc: | (string) | condition on addresses in the BCC field of the email |
Category: | (string) | condition on category |
Cc: | (string) | condition on addresses in the CC field of the email |
Folder: | (string) | condition on folder path; possible to find emails only in the specified folder (backslash is used as path separator, e.g. Inbox\Important) |
ReceivedDate: | (date) | condition on receiving date |
RetentionTime: | (number) | condition on retention time (in months) |
Sender | From: | (string) | condition on email sender |
Date | SentDate: | (date) | condition on email’s sent date |
Size: | (number) | condition on email’s size in bytes |
Title | Subject: | (string) | condition on email subject |
To: | (string) | condition on email’s recipient |
Body: | (string) | search in the mail’s body text |
Attachment: | (string) | search in mail’s attachment text |
File properties
The properties below are applicable when searching in File archive.
Property | Specificity | Description |
CreationDate: | (date) | condition on file’s creation date |
Title | Filename: | (string) | condition on file’s name |
Folder: | (string) | condition on file’s path (\ is the path separator as in Windows, e.g. c:\documents\rfa) |
Date | ModifiedDate: | (date) | condition on file’s modification date |
Size: | (number) | condition on file’s size in bytes |
SharePoint document properties
The properties below are applicable when searching in SharePoint archive.
Property | Specificity | Description |
CreatedBy: | (string) | condition on user who created the file |
CreationDate: | (date) | condition on creation date |
FileSize: | (number) | condition on file size |
Date | ModificationDate: | (date) | condition on modification date |
ModifiedBy: | (string) | condition on user who modified the document |
Name: | (string) | condition on document name |
Title: | (string) | condition on document title |
VersionNum: | (number) | condition on document’s version number |