HubStor Inc.

How to Detect and Tag Private/Sensitive Data in HubStor


This article walks through creating a RegEx and, optionally, associating DLP Tags with it so that matching items are tagged.

To understand what can be achieved with Tags in HubStor, see UNDERSTANDING TAGGING IN HUBSTOR.

For examples/templates of regular expressions and queries, see REGEX AND QUERY EXAMPLES FOR PII DETECTION.

To create a RegEx, follow these steps:

Step 1 -- Navigate to RegEx

In the HubStor Admin Portal, select 'Tagging' and then the 'RegEx' tab.


Step 2 -- Create RegEx

Click 'Add RegEx'.



When prompted, enter a name for the RegEx and click 'Apply'. This creates the RegEx, which must then be configured.

Step 3 -- Configure RegEx

Click to expand the RegEx you wish to configure.

Next, select the 'RegEx Type' from the drop-down.



It is important to understand the different RegEx Types and how they are used:
  • Single Term Regexp -- Allows you to compose a regular expression that is matched against a single term, such as an unbroken 16-digit number (e.g. 4543934039222343).
  • Query String -- Allows you to query for strings and supports features such as wildcards. Operates on terms (i.e. individual words). For more information and examples of Query Strings, see SEARCH SYNTAX IN HUBSTOR.
  • Advanced ElasticSearch Query -- Allows a raw query to be composed. Here you can use span_near to match across multiple terms (e.g. credit card numbers written with dashes or spaces).
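
To illustrate what "a single term" means in practice, here is a minimal Python sketch. It is not HubStor code: the whitespace split is a simplified stand-in for how an index breaks text into terms, and the 16-digit pattern and the helper name matching_terms are assumptions chosen for illustration.

```python
import re

# Assumption: a plain 16-digit pattern stands in for a card-number expression.
CARD_TERM = re.compile(r"[0-9]{16}")


def matching_terms(text, pattern=CARD_TERM):
    """Return the whitespace-separated terms that the pattern matches in full."""
    # Splitting on whitespace is a simplification of index-time tokenization;
    # a real analyzer may also split on punctuation such as dashes.
    return [term for term in text.split() if pattern.fullmatch(term)]


# An unbroken number is one term, so a term-level pattern can match it.
print(matching_terms("card 4543934039222343 on file"))

# The same number written with spaces becomes four separate terms,
# none of which is 16 digits long, so nothing matches.
print(matching_terms("card 4543 9340 3922 2343 on file"))
```

The first call finds the number and the second finds nothing, which is why values written with separators call for a multi-term query type.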

NOTE: HubStor uses Elasticsearch to identify patterns. Elasticsearch evaluates a regular expression against one term at a time, so a single expression cannot match a value that is broken up by spaces or separator characters. Hence, to find the variations of a credit card number, for example, you will need to use a combination of the Single Term Regexp type (for unbroken numbers) and the Advanced ElasticSearch Query type (for numbers split by spaces or dashes).
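
As a rough sketch of the multi-term approach, the Python snippet below builds a span_near query body using standard Elasticsearch query DSL (span_near, span_multi, regexp). Several things here are assumptions rather than documented HubStor behavior: the field name "content", the helper name card_span_query, the four-groups-of-four-digits layout, and whether the portal expects the query object on its own or wrapped in an outer element. Check the examples article before using anything like this.

```python
import json


def card_span_query(field="content", groups=4, digits=4):
    """Build a span_near query matching `groups` adjacent terms of `digits` digits each."""
    # Each clause matches one term consisting of exactly `digits` digits.
    # span_multi lets a regexp query take part in a span query.
    clauses = [
        {"span_multi": {"match": {"regexp": {field: f"[0-9]{{{digits}}}"}}}}
        for _ in range(groups)
    ]
    return {
        "span_near": {
            "clauses": clauses,
            "slop": 0,          # no other terms allowed between the groups
            "in_order": True,   # groups must appear in the order listed
        }
    }


# "content" is a hypothetical field name used only for this illustration.
print(json.dumps(card_span_query(), indent=2))
```

This pattern relies on the index splitting a value such as 4543-9340-3922-2343 into four separate terms. An unbroken 16-digit number remains one term, so it still needs a Single Term Regexp.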

For examples of Single Term Regexp and Advanced ElasticSearch Query, see REGEX AND QUERY EXAMPLES FOR PII DETECTION.

Step 4 -- Optionally Associate DLP Tags

Next, select the 'Output DLP Tags' tab if you wish to associate one or more DLP Tags with the query or regular expression.

Click 'Add Tag' to see the list of DLP Tags. Select the DLP Tag you wish to associate with the RegEx, and click 'Apply'. Repeat if you wish to associate multiple DLP Tags with the RegEx.

NOTE: If the DLP Tag has associated tag behaviors, these behaviors will apply to the items identified by your RegEx.

When finished, click 'Apply' to save your changes.


Additional Information

Your RegEx is now active and will be evaluated during the next policy interval, using the same scope as your indexing policies. For instructions on viewing or modifying the policy interval, see HOW TO CONFIGURE THE POLICY EVALUATION INTERVAL IN HUBSTOR.

For more information on content indexing scope, see HOW TO DEFINE THE SCOPE OF CONTENT INDEXING.

Unlike Tagging Policies, which are deployed per Stor, RegExs are deployed globally at the Hub level and apply across all Stors that are PII-enabled. To check whether a Stor is PII-enabled, see HOW TO ENABLE A STOR FOR PII DETECTION.


