7.1. Search query language
The search query language is used to specify the conditions that documents must meet to be returned as results when searching the archive.
All search conditions can be negated using the NOT keyword. The keyword is case sensitive, so it must always be written in UPPERCASE. The exclamation mark “!” is a shorter version of the NOT keyword and has exactly the same function. The syntax looks as follows:
For freetext search:
- NOT value
- !value
For property search:
- Subject:(NOT test)
- Subject:(!test)
You can also combine the NOT operator with:
- phrase queries: NOT "this is a test"
- regex queries: filename:(NOT **[0-9]{3}\.txt)
- wildcard queries: NOT test*
- range queries: Size:(NOT 1M,2M) or Date:(NOT 2020-01-01,2020-12-31)
The search query language used in any user interface of contentACCESS can be divided into the following categories:
Source specification
The user can specify where to search on different levels: tenant, model, entity.
Tenant:(string) – select a tenant by name; search in tenants having the specified string in name
MTID:(string) – select a model by type identifier (EmailArchive, FileSystemArchive, SharePointArchive, TeamsArchive, TeamsChatArchive)
Source:(string) – select a model by keyword; search in models having the specified string as a keyword (email, file, sharepoint). This is similar to MTID above, but accepts a looser model specification. Possible values are:
- For FileSystemArchive: archive, file, fs, filesystem
- For EmailArchive: archive, email, mail, mailarchive, emailarchive
- For SharePointArchive: archive, sharepoint, sharepointarchive, sp
- For TeamsArchive: archive, teams, teamsarchive, tea
- For TeamsChatArchive: archive, teams, chat, teamschat, teamschatarchive, tca
Examples:
- source:file
- source:mail
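The keyword lists above can be read as a lookup from keyword to model type. The following Python sketch illustrates that lookup. The names SOURCE_KEYWORDS and models_for_keyword are hypothetical and not part of contentACCESS, and the exact, case-insensitive keyword comparison is an assumption.

```python
from typing import List

# Keywords accepted by Source: for each model type, copied from the list above.
SOURCE_KEYWORDS = {
    "FileSystemArchive": {"archive", "file", "fs", "filesystem"},
    "EmailArchive": {"archive", "email", "mail", "mailarchive", "emailarchive"},
    "SharePointArchive": {"archive", "sharepoint", "sharepointarchive", "sp"},
    "TeamsArchive": {"archive", "teams", "teamsarchive", "tea"},
    "TeamsChatArchive": {"archive", "teams", "chat", "teamschat",
                         "teamschatarchive", "tca"},
}


def models_for_keyword(keyword: str) -> List[str]:
    """Return the model types whose keyword list contains the keyword."""
    key = keyword.lower()  # assumption: keywords are compared case-insensitively
    return sorted(model for model, words in SOURCE_KEYWORDS.items() if key in words)


print(models_for_keyword("mail"))   # ['EmailArchive']
print(models_for_keyword("teams"))  # ['TeamsArchive', 'TeamsChatArchive']
```

Note that the keyword archive appears in every list, so in this sketch it selects all five model types.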
Entity:(string) – select one or more entities by name; search in entities having the specified string in name. The entity name is the mailbox address in the Email archive and the root folder path in the File system archive.
Examples:
- entity:john.smith@tech-arrow.com – search in John Smith’s mailbox
- entity:c:\temp – search in c:\temp folder
Property value specification
The following properties can be used to specify conditions on documents to be returned as result when searching the archive:
Date
Applicable only for properties of “date” type. An exact date must be specified in the format YYYY-MM-DD (hours, minutes and seconds cannot be specified).
Examples:
- date:(2016-12-05) (year-month-day; searches for items only from that specified day, in this case 5th of December 2016)
- date:(2016-12) (year-month; searches only for items from that specified month, in this case December 2016)
- date:(2016) (year; searches for items from whole year 2016)
Date intervals can be specified too, for example:
- date:(>2016-12) (searches for items newer than December 2016)
- date:(<2019-11-10) (searches for items older than 10th of November 2019)
- date:(2017-10, 2017-12) (searches for items created from October 2017 to December 2017)
Available placeholders: now (means this hour), today, yesterday, this week, last week, this month, last month, this year, last year
Examples:
- date:(now), date:(last week)
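To make the meaning of partial dates concrete, the Python sketch below expands a YYYY, YYYY-MM or YYYY-MM-DD value into the first and last day it covers. It is an illustration only: the function name expand_date is hypothetical, and the actual matching is done by the search engine.

```python
import calendar
from datetime import date
from typing import Tuple


def expand_date(value: str) -> Tuple[date, date]:
    """Expand 'YYYY', 'YYYY-MM' or 'YYYY-MM-DD' into (first_day, last_day)."""
    parts = [int(p) for p in value.split("-")]
    if len(parts) == 1:  # whole year
        return date(parts[0], 1, 1), date(parts[0], 12, 31)
    if len(parts) == 2:  # whole month
        year, month = parts
        last = calendar.monthrange(year, month)[1]  # number of days in the month
        return date(year, month, 1), date(year, month, last)
    if len(parts) == 3:  # single day
        day = date(*parts)
        return day, day
    raise ValueError(f"unsupported date value: {value!r}")


first, last = expand_date("2016-12")
print(first, last)  # 2016-12-01 2016-12-31
```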
Number
Numbers are written as usual (1, 2, 3…). For size conditions, units can also be specified:
K | KB – size in kilobytes
M | MB – size in megabytes
G | GB – size in gigabytes
T | TB – size in terabytes
Example:
- size:(>1K) – files or emails (depending on the archive) larger than 1 KB
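As an illustration of the unit suffixes, the Python sketch below converts a size value into bytes. The function name size_to_bytes is hypothetical, and the use of binary multiples (1 K = 1024 bytes) is an assumption, because the text above does not say whether the units are binary or decimal.

```python
import re

# Assumption: binary multiples (1 K = 1024 bytes); the documentation does not
# state whether the units are binary or decimal.
UNIT_FACTORS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def size_to_bytes(value: str) -> int:
    """Convert a size such as '1K', '2MB' or '512' into a number of bytes."""
    match = re.fullmatch(r"(\d+)\s*([KMGT]?)B?", value.strip().upper())
    if match is None:
        raise ValueError(f"unsupported size value: {value!r}")
    number, unit = match.groups()
    return int(number) * UNIT_FACTORS[unit]


print(size_to_bytes("1K"))   # 1024
print(size_to_bytes("2MB"))  # 2097152
```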
Range
Two types of ranges can be specified: numerical and date ranges. Ranges can be upper bound, lower bound or an interval. A range can be specified as a value for all properties of type “date” and “number”.
Prop:(>value) – the value of property “Prop” is greater than “value”
Prop:(<value) – the value of property “Prop” is less than “value”
Prop:(value1, value2) – the value of property “Prop” is greater than “value1” and less than “value2”
Examples:
- size:(1K, 1M) – files/emails (depending on the archive) larger than 1KB and smaller than 1MB
- date:(2016-10, 2016-12) – files created/modified or emails sent (depending on the archive) in the last quarter of 2016
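The three range forms can be told apart mechanically. The following Python sketch splits a range value into a lower and an upper bound; parse_range is a hypothetical helper used for illustration and does not interpret the bounds themselves.

```python
from typing import Optional, Tuple


def parse_range(spec: str) -> Tuple[Optional[str], Optional[str]]:
    """Split a range value into (lower, upper); None means 'unbounded'."""
    spec = spec.strip()
    if spec.startswith(">"):
        return spec[1:].strip(), None  # lower bound only
    if spec.startswith("<"):
        return None, spec[1:].strip()  # upper bound only
    if "," in spec:
        lower, upper = (part.strip() for part in spec.split(",", 1))
        return lower, upper  # interval
    raise ValueError(f"not a range: {spec!r}")


print(parse_range(">1K"))               # ('1K', None)
print(parse_range("2016-10, 2016-12"))  # ('2016-10', '2016-12')
```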
Filename
Finds items by attachment name (Email archive) or file name (File archive). Wildcard characters can be used for filename pattern specification (* or ?). They have the same meaning as when searching for files in Windows.
Filename:(*.txt) – this will find all attachments and files having the extension .txt
Filename:(file) – this will find attachments and files having the exact name “file”
Filename:(file.*) – this will find attachments and files named “file” of any type (extension)
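The behaviour of the * and ? wildcards can be tried out with Python's fnmatch module, which is used here only as a stand-in for the search engine. The file names are made up for the example. Unlike the query language, fnmatchcase is case sensitive, so the sketch uses lowercase names throughout.

```python
from fnmatch import fnmatchcase

# Hypothetical file and attachment names.
names = ["file", "file.txt", "file.docx", "notes.txt"]


def matching(pattern):
    """Return the names that match a pattern using * and ? wildcards."""
    return [name for name in names if fnmatchcase(name, pattern)]


print(matching("*.txt"))   # ['file.txt', 'notes.txt']
print(matching("file"))    # ['file']
print(matching("file.*"))  # ['file.txt', 'file.docx']
```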
Text
Properties with text values can be searched for a single term and phrase or multiple terms and phrases. Results of a single-term search will contain documents having the specified term in the text value of the specified property. Results of a multi-term search will contain documents having all of the specified terms in the text value of the specified property in any order.
Example:
- Subject:(lemon) – finds emails having the term “lemon” in their subject
- Subject:(lemon orange) – finds emails having the terms “lemon” and “orange” anywhere in their subject
Phrases must be enclosed in double quotes. Results of a phrase search will contain items having all of the specified terms in the text value of the specified property in the specified order.
Example:
- Subject:("John Smith") – finds emails having the words “John Smith” in their subject in the specified order
Character escaping
The following characters have special meaning in the query syntax:
- ( ) " : \
They cannot be used directly in search terms. They have to be escaped using a backslash ‘\’, otherwise the search query will be ambiguous and will produce unexpected results.
Example:
- Subject:(apple \(pear\)) – finds emails having the terms “apple” and “pear” in their subject
- 8\:00 – finds documents containing the text 8:00
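When queries are built programmatically, the escaping rule above can be applied with a small helper. The Python sketch below is one way to do it; the name escape_term is hypothetical.

```python
# The characters listed above as having special meaning in the query syntax.
SPECIAL_CHARS = '()":\\'


def escape_term(term: str) -> str:
    """Prefix every special character in the term with a backslash."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in term)


print(escape_term("8:00"))          # 8\:00
print(escape_term("apple (pear)"))  # apple \(pear\)
```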
Boolean queries
A Boolean query is a search type that allows you to combine desired keywords with operators like AND and OR to get more specific results.
Operator AND
This operator narrows your search down to items containing all of the words it joins. A blank space between terms has the same meaning as the AND operator.
Example (both will do the same):
- content AND access AND email AND archive
- content access email archive
Operator OR
This operator, on the other hand, expands your search by connecting multiple phrases. At least one of the entered phrases must be present, so the search returns results containing one, several, or all of the specified phrases.
Example:
- content OR access – finds all items containing “content”, “access”, or both
Grouping
Multiple terms or clauses can be grouped together by using parentheses “( )” to form sub-queries, for example:
- (email OR file) AND archive – the returned results must contain “archive” together with at least one of “email” or “file”
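The logic of the grouped query can be checked with ordinary Boolean expressions. In the Python sketch below, a document is modelled as a set of terms, which is a simplification made for illustration; the function name matches is hypothetical.

```python
def matches(terms):
    """Evaluate '(email OR file) AND archive' against a set of terms."""
    return ("email" in terms or "file" in terms) and "archive" in terms


print(matches({"email", "archive"}))  # True
print(matches({"file", "archive"}))   # True
print(matches({"email", "file"}))     # False: "archive" is missing
print(matches({"archive"}))           # False: neither "email" nor "file"
```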
Regular expressions
A regular expression (regexp) is a sequence of characters defining a search pattern. This pattern is then often used to “find” or to “find and replace” strings. Regular expressions can be specified in a search query by using a double asterisk prefix:
**<regular-expression-pattern>
Regular expressions can be used for property queries, but also for free text queries.
Standard operators
Anchoring
Many regular expression engines match any part of a string, so a pattern has to be anchored explicitly to the start or end of the string: the symbol ^ indicates the beginning, while the $ symbol indicates the end.
Here, patterns are always anchored by default, so the provided pattern must match the entire string. For example, for string “abcde”:
- ab.* = match
- abcd = no match
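The difference between anchored and unanchored matching can be reproduced in Python, used here only as a stand-in for the search engine: re.search looks for the pattern anywhere in the string, while re.fullmatch requires the pattern to cover the entire string, which corresponds to the anchored behaviour described above.

```python
import re

text = "abcde"

# re.search finds the pattern anywhere in the string (unanchored).
print(bool(re.search(r"abcd", text)))     # True
# re.fullmatch requires the pattern to cover the entire string (anchored).
print(bool(re.fullmatch(r"abcd", text)))  # False
print(bool(re.fullmatch(r"ab.*", text)))  # True
```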
Allowed characters
Any Unicode character may be used in the pattern, but there are some exceptions that are reserved and must be escaped. The standard reserved characters are:
- . ? + * | { } [ ] ( ) " \
If you enable optional features (described in this section), then the following characters may also be reserved:
- # @ & < > ~
Any character (except double quotes) is interpreted literally when bounded by double quotes:
- john"@smith.com"
Match any character
The period symbol “.” can be used to represent any character. The string “abcde” can be found like this:
- ab... = match
- a.c.e = match
Once or more
The plus symbol “+” can be used to repeat the preceding pattern once or multiple times. The string “aaabbb” can be found like this:
- a+b+ = match
- aa+bb+ = match
- a+.+ = match
- aa+bbb+ = match
Zero or more times
The asterisk symbol “*” can be used to match the preceding pattern zero or more times. The string “aaabbb” can be found like this:
- a*b* = match
- a*b*c* = match
- .*bbb.* = match
- aaa*bbb* = match
Zero times or once
The question mark “?” makes the preceding pattern optional, so it can match zero times or once. The string “aaabbb” can be found like this:
- aaa?bbb? = match
- aaaa?bbbb? = match
- .....?.? = match
- aa?bb? = no match
Minimum to maximum
Curly brackets “{}” can be used to specify a minimum and also maximum number of times the preceding shortest pattern can be repeated. The allowed forms are:
- {5} – the pattern repeats exactly 5 times
- {2,5} – the pattern repeats 2 to 5 times
- {2,} – the pattern repeats at least twice
For string “aaabbb”, the following applies:
- a{3}b{3} = match
- a{2,4}b{2,4} = match
- a{2,}b{2,} = match
- .{3}.{3} = match
- a{4}b{4} = no match
- a{4,6}b{4,6} = no match
- a{4,}b{4,} = no match
Grouping
By using parentheses “()”, it is possible to form sub-patterns. The quantity operators listed above operate on the shortest previous pattern, which can also be a group. For string “ababab”, the following applies:
- (ab)+ = match
- ab(ab)+ = match
- (..)+ = match
- (...)+ = match
- (ab)* = match
- abab(ab)? = match
- ab(ab)? = no match
- (ab){3} = match
- (ab){1,2} = no match
Alternation
The pipe symbol “|” works the same as the OR operator mentioned above in this section. The match will be successful if the pattern on either the left side OR the right side matches. Alternation applies to the longest pattern. For string “aabb”, the following applies:
- aabb|bbaa = match
- aacc|bb = no match
- aa(cc|bb) = match
- a+|b+ = no match
- a+b+|b+a+ = match
- a+(b|c)+ = match
Character classes
Ranges of characters may be specified as character classes, by being enclosed in square brackets “[]”. A leading ^ symbol negates the character class. The following forms are allowed:
- [abc] = ‘a’ or ‘b’ or ‘c’
- [a-c] = ‘a’ or ‘b’ or ‘c’
- [-abc] = ‘-’ or ‘a’ or ‘b’ or ‘c’
- [abc\-] = ‘-’ or ‘a’ or ‘b’ or ‘c’
- [^abc] = any character except ‘a’ or ‘b’ or ‘c’
- [^a-c] = any character except ‘a’ or ‘b’ or ‘c’
- [^-abc] = any character except ‘-’ or ‘a’ or ‘b’ or ‘c’
- [^abc\-] = any character except ‘-’ or ‘a’ or ‘b’ or ‘c’
For string “abcd”, the following applies:
- ab[cd]+ = match
- [a-d]+ = match
- [^a-d]+ = no match
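The standard operators behave the same way in most regular expression dialects, so a selection of the examples from this section can be verified with Python's re module. This is a stand-in for the search engine, not its actual implementation; re.fullmatch is used because patterns must match the entire string. The helper name fully_matches is hypothetical.

```python
import re

# (pattern, string, expected result) taken from the examples in this section.
EXAMPLES = [
    ("a+b+", "aaabbb", True),
    ("aaa*bbb*", "aaabbb", True),
    ("aa?bb?", "aaabbb", False),
    ("a{2,4}b{2,4}", "aaabbb", True),
    ("a{4,}b{4,}", "aaabbb", False),
    ("ab(ab)?", "ababab", False),
    ("(ab){3}", "ababab", True),
    ("a+|b+", "aabb", False),
    ("aa(cc|bb)", "aabb", True),
    ("[^a-d]+", "abcd", False),
]


def fully_matches(pattern, text):
    """Return True if the pattern matches the entire string."""
    return re.fullmatch(pattern, text) is not None


for pattern, text, expected in EXAMPLES:
    assert fully_matches(pattern, text) == expected, pattern
print("all examples behave as documented")
```

The optional operators described next (~, < >, & and @) have no direct equivalent in Python's re syntax, so they are not covered by this check.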
Optional operators
Complement
Complement is probably the most used and helpful option. The shortest pattern that comes after a tilde symbol “~” is negated. For example, “ab~cd” means:
- Starts with a
- a is followed by b
- b is followed by a string of any length that is anything, except c
- Ends with d
For the string “abcdef”, the following applies:
- ab~df = match
- ab~cf = match
- ab~cdef = no match
- a~(cb)def = match
- a~(bc)def = no match
Interval
The interval option enables the use of numeric ranges. The ranges must always be enclosed in angle brackets “< >”. For string “access90”, the following applies:
- access<1-100> = match
- access<01-100> = match
- access<001-100> = no match
Intersection
The ampersand symbol “&” joins two patterns. Both of them have to match the string. For string “aaabbb”, the following applies:
- aaa.+&.+bbb = match
- aaa&bbb = no match
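Python's re module has no intersection operator, but the effect can be emulated by requiring two separate full matches. The sketch below does this for the two examples above; intersect_match is a hypothetical helper and only approximates the engine's behaviour.

```python
import re


def intersect_match(left, right, text):
    """Emulate 'left&right': both patterns must match the whole string."""
    return (re.fullmatch(left, text) is not None
            and re.fullmatch(right, text) is not None)


print(intersect_match(r"aaa.+", r".+bbb", "aaabbb"))  # True
print(intersect_match(r"aaa", r"bbb", "aaabbb"))      # False
```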
Any string
The at sign “@” matches any string in its entire length. This can be combined with intersection and complement (mentioned above) in cases when you want to search for “everything except something”. For example:
- @&~(content.+) finds everything, except strings beginning with “content”
Properties in different archives
When specifying a boolean value for a property in query, the following notations can be used:
- true | yes | y stand for True
- false | no | n stand for False
Property names and values are not case sensitive. Wildcard characters (* and ?) can be used everywhere.
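The accepted boolean notations can be summarised in a small lookup. The Python sketch below normalises a value to True or False, ignoring case as stated above; the name parse_bool is hypothetical.

```python
TRUE_VALUES = {"true", "yes", "y"}
FALSE_VALUES = {"false", "no", "n"}


def parse_bool(value: str) -> bool:
    """Map true/yes/y to True and false/no/n to False, ignoring case."""
    normalized = value.strip().lower()
    if normalized in TRUE_VALUES:
        return True
    if normalized in FALSE_VALUES:
        return False
    raise ValueError(f"not a boolean value: {value!r}")


print(parse_bool("Yes"))  # True
print(parse_bool("n"))    # False
```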
The character ‘|’ denotes an alternative where multiple property names or values can be used.
If a value is enclosed in quotes (e.g. "value"), it is treated as a phrase.
Example:
- "brown fox" will find all documents that contain the word “brown” followed by the word “fox”
Email properties
The properties below are applicable when searching in the Email archive:
Property | Specificity | Description |
HasAttachment: | true | false | if true, finds emails having one or more attachments; if false, finds emails having no attachments |
Importance: | Low | Normal | High | finds emails with the specified importance level |
Sensitivity: | Normal | Personal | Private | Confidential | finds emails with the specified sensitivity level |
Flag: | true | false | finds emails having a flag set (true) or not set (false) |
AttachmentCount: | (number) | finds emails with the specified attachment count |
Bcc: | (string) | condition on addresses in BCC tag of the email |
Category: | (string) | condition on category |
Cc: | (string) | condition on addresses in CC tag of the email |
Folder: | (string) | condition on folder path; possible to find emails only in the specified folder (backslash is used as path separator, e.g. Inbox\Important) |
ReceivedDate: | (date) | condition on receiving date |
RetentionTime: | (number) | condition on retention time (in months) |
Sender | From: | (string) | condition on email sender |
Date | SentDate: | (date) | condition on email’s sent date |
Size: | (number) | condition on email’s size in bytes |
Title | Subject: | (string) | condition on email subject |
To: | (string) | condition on email’s recipient |
Body: | (string) | search in the mail’s body text |
Attachment: | (string) | search in mail’s attachment text |
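Several property conditions can be combined into one query. The Python sketch below joins them in the Property:(value) notation used throughout this section and separates the clauses with spaces, which act like AND. The helper build_query and the sample values are hypothetical; values are assumed to be already escaped where necessary.

```python
def build_query(conditions):
    """Join (property, value) pairs as Property:(value) clauses.

    Clauses are separated by spaces, which have the same meaning as AND.
    """
    return " ".join(f"{name}:({value})" for name, value in conditions)


query = build_query([
    ("Subject", "lemon"),
    ("HasAttachment", "true"),
    ("Size", ">1M"),
    ("SentDate", "2016-12"),
])
print(query)
# Subject:(lemon) HasAttachment:(true) Size:(>1M) SentDate:(2016-12)
```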
File properties
The properties below are applicable when searching in the File archive:
Property | Specificity | Description |
CreationDate: | (date) | condition on file’s creation date |
Title | Filename: | (string) | condition on file’s name |
Folder: | (string) | condition on file’s path (\ is the path separator as in Windows, e.g. c:\documents\rfa) |
Date | ModifiedDate: | (date) | condition on file’s modification date |
Size: | (number) | condition on file’s size in bytes |
SharePoint document properties
The properties below are applicable when searching in the SharePoint archive:
Property | Specificity | Description |
CreatedBy: | (string) | condition on user who created the file |
CreationDate: | (date) | condition on creation date |
FileSize: | (number) | condition on file size |
Date | ModificationDate: | (date) | condition on modification date |
ModifiedBy: | (string) | condition on user who modified the document |
Name: | (string) | condition on document name |
Title: | (string) | condition on document title |
VersionNum: | (number) | condition on document’s version number |
Teams archive properties
The properties below are applicable when searching in the Teams archive:
Property | Specificity | Description |
Title: | (string) | message title |
Date, CreationDate: | (date) | message’s sent date |
Size: | (number) | message size, including attachments |
Folder, Location, Path, Url: | (string) | specifies the channel name or attachment location (SharePoint document URL) |
FileName: | (string) | attachment name |
Subject: | (string) | subject of the email message posted to a channel |
Author, Sender: | (string) | the user who sent the message |
Channel, ChannelName: | (string) | the Teams Channel the message was sent to |
Mentioned: | (string) | name of the mentioned user |
Reacted: | (string) | name of the user who sent a reaction |
ReactedOn: | (date) | the date when a reaction was sent |
Reaction: | (string) | type of the reaction; possible values are: Like, Angry, Sad, Laugh, Heart, Surprised |
Attachment: | (string) | attachment name and content |
HasAttachment: | (boolean) | message has attachment or not |
Type: | (string) | the type of the item, possible values are: Message (normal chat message), Reply (reply on a message), File (attachment file or file on Teams-related SharePoint sites) |
Teams chat archive properties
The properties below are applicable when searching in the Teams chat archive:
Property | Specificity | Description |
Title: | (string) | message title |
Date, CreationDate: | (date) | message’s sent date |
Size: | (number) | message size, including attachments |
Folder, Category: | (string) | specifies the message category, valid values: Chats, Group chats or Meetings |
FileName: | (string) | attachment name |
Author, Sender: | (string) | the user who sent the message |
Mentioned: | (string) | name of the mentioned user |
Reacted: | (string) | name of the user who sent a reaction |
ReactedOn: | (date) | the date when a reaction was sent |
Reaction: | (string) | type of the reaction; possible values are: Like, Angry, Sad, Laugh, Heart, Surprised |
Attachment: | (string) | attachment name and content |
HasAttachment: | (boolean) | message has attachment or not |
Member: | (string) | name of the user who is member of a chat |
Topic: | (string) | topic of a meeting |