HubStor Inc.

How to Create Connectors for Files

This article describes how to configure settings specific to File connectors within the General and Advanced tabs of the connector properties. 

The connector must be created before these settings can be applied. For information on creating a connector, refer to the article:  How To Create Connectors In HubStor

Configure Connector's General Settings 

In the Connector Properties window, complete the following settings: 
  1. Get Location ACLs -- During ingestion, the connector synchronizes unique ACLs at the folder level. Generally, this option should be enabled.
  2. Get Item ACLs -- During ingestion, the connector synchronizes unique ACLs at the item level. Note that syncing item-level ACLs adds one call per item to the crawl, which slows the crawl and increases load on the file server.

Configure Connector Type Specific Settings

  1. Root Path -- Provide the path to the top-level share you want this connector to crawl. The connector crawls this top-level folder and everything below it. It is recommended to click the 'Validate' button to verify that HCS can connect to the path. If the entire directory structure is later moved to a new location, update the existing connector to point to the new location. If the directory structure changed during the move, contact HubStor support for guidance.
  2. Manage Folder Exclusions -- This option should be used when there is a need to exclude data from specific folders.  
  3. Stubbing Format -- Select 'HTML', 'Link-based', or 'Seamless'. More information on these options is provided below related to 'Stub Policy' in Step 4.
  4. Validate Stubs During Next Crawl -- Use this setting to force the validation of seamless stubs during the next crawl.  This setting will be reset after the next crawl.  Typically, evaluation of an item is skipped during a crawl if the last modified date has not changed since the last crawl.  This setting overrides that behavior during the next crawl.  This option is useful if stubs have been replaced or updated in such a way that the last modified date has not changed.  Stub tracking information in HubStor will be updated as a result.
  5. Reinitialize Stubs During Next Crawl -- This setting causes seamless stubs to be evaluated and rebuilt if necessary on the next crawl.  Each stub signature is evaluated to ensure it matches the latest stub format (version 2).  If a stub signature is out of sync or out of date, then it will be updated on the file system.
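
The interaction between the last modified check and 'Validate Stubs During Next Crawl' can be pictured with a minimal sketch. This is not HubStor's implementation: the function name `should_evaluate` and its parameters are hypothetical, and the code only illustrates the rule described above, where an item is skipped when its last modified date has not changed since the previous crawl unless validation is forced.

```python
from datetime import datetime
from typing import Optional


def should_evaluate(last_modified: datetime,
                    modified_at_last_crawl: Optional[datetime],
                    validate_stubs: bool = False) -> bool:
    """Return True if an item should be evaluated during this crawl.

    Hypothetical illustration only; these names are not a HubStor API.
    """
    if validate_stubs:
        # Forced validation overrides the last-modified shortcut.
        return True
    if modified_at_last_crawl is None:
        # The item has not been seen by a previous crawl.
        return True
    # Normal behavior: evaluate only when the timestamp has changed.
    return last_modified != modified_at_last_crawl
```

In this sketch, a stub that was replaced without a change to its last modified date is only re-examined when `validate_stubs` is set, which mirrors why the setting exists.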

  1. Skip PSTs -- If you wish to skip PSTs (usually because you want to archive them with a PST Connector type), enable this option. 
  2. Populate Data Owner -- Enable this option if you want HubStor to sync the data owner on files. 
  3. Preserve Directory Dates -- This option tells HubStor to preserve the original directory dates when it crawls folders. 
  4. Delete Empty Folders --   This option tells HubStor to delete an empty folder whenever the connector has a 'Delete Policy' configured and all the items in the folder have been deleted because of the policy. 
  5. Use .hsmetadata.xml Files -- There are often requirements for files ingested via the file connector to be augmented with additional metadata fields. This option is often used with integration scenarios or for decommissioning data from legacy applications. See HOW TO USE THE .HSMETADATA XML FILES OPTION
  6. Output .hscapture.xml Files -- This option is used for integration scenarios where an application places data in a directory expecting HubStor to capture it, and the writing application wants to receive confirmation when HubStor has successfully written the content. The application may also want to provide its users with a retrieval link to the item. With this option enabled, any file captured by HubStor (somefilename.extension) will have an XML reply file (somefilename.extension.hscapture.xml) created in the source directory as a commit. The XML file contains information the application can use to request the file.
  7. Use .hubstorinfo File -- The .hubstorinfo file contains a GUID that uniquely identifies its parent folder. When a folder is renamed, this GUID allows subsequent crawls to recognize it as the same folder. Without the .hubstorinfo file, renaming a folder and then running a crawl would cause the folder to be treated as new, and its content would be re-ingested (though it would be deduplicated). If the renamed folder is far up the folder hierarchy, this could result in a lot of content being re-ingested. Therefore, it is best practice to always use the .hubstorinfo file unless you are sure the source folders will not be renamed. This option creates one .hubstorinfo file per folder/location where at least one item was captured. By default, the file is a hidden system file.
  8. Restrict .hubstorinfo File Permissions -- When this option is enabled, only the HubStor service account is added to the file's permissions when the file is created. Otherwise, the file inherits permissions from the folder level. This option is useful to prevent users from tampering with the file.
  9. Log Diagnostics -- Use this option only for troubleshooting, and enable it only when instructed by a HubStor representative.
  10. Case Sensitive (UNIX) -- Enable this option when a Unix system has folders whose names differ only by case, for example Folder1 and folder1. When using this option, the NFS Compatibility option (see below) must also be enabled.
    1. Note: This option is not available on existing connectors. A new connector must be created in order to use this feature.  Contact HubStor support for further assistance if this applies to your scenario.  
  11. Capture Locked Files -- Enable this option to capture files that are currently open and are not locked exclusively. If enabled, the connector captures only the data it can access at the time it scans the file. Because the captured file may not be in a consistent state, it is recommended to leave this option unchecked.
  12. Capture System Files -- Enable this option to capture system files (e.g., thumbs.db or desktop.ini).
  13. Suppress Inheritance Source Population -- Enable this option when the volume root is not accessible (shared), and you are not using deny permissions in your file system.  If you are using deny permissions and this option is enabled, then the administrator should be comfortable with the possibility that the deny permission behavior in HubStor may be more restrictive than the source system.  The hierarchy of precedence for NTFS permissions can be summarized as follows, with the higher precedence permissions listed at the top of the list.
    1. Explicit Deny
    2. Explicit Allow
    3. Inherited Deny
    4. Inherited Allow

Permissions inherited from near relatives take precedence over permissions inherited from distant predecessors. So permissions inherited from the object's parent folder take precedence over permissions inherited from the object's "grandparent" folder, and so on.  Due to this behavior, it may be necessary to traverse the folder hierarchy to the root to correctly determine how a deny is applied when this option is not enabled. When this option is enabled all inherited permissions from all predecessors are effectively collapsed into the parent folder.  In other words, deny permissions from the grandparent level and above will always take priority over inherited allow permissions. 
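
The precedence rules above, and the effect of collapsing inherited permissions, can be illustrated with a small sketch. It is a simplified model, not HubStor or Windows code: the `Ace` class, the `effective_access` function, and the `collapse_inherited` flag are hypothetical names, and every entry is assumed to apply to the same user and the same right.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Ace:
    allow: bool  # True = allow entry, False = deny entry
    depth: int   # 0 = explicit on the object, 1 = parent, 2 = grandparent, ...


def effective_access(aces: Iterable[Ace], collapse_inherited: bool = False) -> bool:
    """Return True if the highest-precedence entry grants access."""
    def precedence(ace: Ace):
        depth = ace.depth
        if collapse_inherited and depth > 1:
            # Model the option: all inherited entries are treated as
            # if they came from the parent folder.
            depth = 1
        # Nearer entries come first; at equal depth, deny (False)
        # sorts before allow (True).
        return (depth, ace.allow)

    ordered = sorted(aces, key=precedence)
    # With no entries at all, nothing grants access.
    return ordered[0].allow if ordered else False


# An allow inherited from the parent and a deny inherited from the grandparent.
aces = [Ace(allow=True, depth=1), Ace(allow=False, depth=2)]
source_result = effective_access(aces)
collapsed_result = effective_access(aces, collapse_inherited=True)
```

Here `source_result` is `True` because the parent's allow outranks the grandparent's deny, while `collapsed_result` is `False` because the deny is now evaluated at the same level as the allow. This is the sense in which the behavior may be more restrictive than the source system.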

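For the 'Output .hscapture.xml Files' option described above, a writing application might wait for the reply file as in the following sketch. Only the file naming (somefilename.extension.hscapture.xml in the source directory) comes from this article; the function names, the polling approach, and the timeout values are assumptions, and the XML contents are deliberately not parsed because their schema is not described here.

```python
import time
from pathlib import Path


def capture_reply_path(source_file: Path) -> Path:
    """Return the expected reply file path next to the captured file."""
    return source_file.with_name(source_file.name + ".hscapture.xml")


def wait_for_capture(source_file: Path,
                     timeout_s: float = 60.0,
                     poll_s: float = 1.0) -> bool:
    """Poll until the reply file appears or the timeout expires.

    Hypothetical helper; timeout and polling interval are illustrative.
    """
    reply = capture_reply_path(source_file)
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if reply.exists():
            return True
        time.sleep(poll_s)
    # Final check in case the file appeared during the last sleep.
    return reply.exists()
```

An application would call `wait_for_capture(Path(...))` after dropping a file into the crawled directory and treat a `True` result as the commit confirmation.
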
  1. Windows Data Deduplication Drive -- Enable this option if you plan on using seamless stubs and the underlying Windows drive has Windows Data Deduplication enabled. This option forces Windows to garbage collect storage for items that have been stubbed, and it changes how stubs are created.
  2. Read-only Drive -- This option can override the behavior of other options. When enabled, it prevents stubbing and the use of .hubstorinfo files, and last accessed times on files will change after archival.
  3. Traverse Symbolic Link Folders -- This option controls whether the HCS File Connector traverses into symbolic link folders. The default is ON, which means the crawl will traverse any symbolic link folder encountered. Disable it to skip these folders, for example when they point to widely different sources that have their own connectors configured with different crawl/stubbing settings.
  4. Suppress Symbolic Link Errors -- When this option is enabled, errors related to symbolic links are not logged.
  5. Continuous Data Protection -- Changes to the files under the root folder or sub-folders will trigger a crawl to occur after a specified delay.  The delay allows a consolidation of file system changes to occur before kicking off a crawl.  The default delay is 30 seconds.  Reducing this delay is not recommended. For detailed information on this topic, see the following:  HOW TO CONFIGURE CONTINUOUS DATA PROTECTION
  6. NFS Compatibility -- Enable this option when capturing data from an NFS share. This option will not preserve the last accessed date on HubStor captured files. When using this option, disable the Preserve Directory Dates option.
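
The way a delay consolidates file system changes for Continuous Data Protection can be sketched as follows. This is only a model of the idea: the `CrawlScheduler` class is hypothetical, and it assumes the delay is measured from the first change in a batch, which this article does not specify. Only the 30-second default comes from the text above.

```python
from typing import Optional, Set


class CrawlScheduler:
    """Collect changed paths and release them as one batch after a delay."""

    def __init__(self, delay_s: float = 30.0) -> None:
        self.delay_s = delay_s
        self._pending: Set[str] = set()
        self._due_at: Optional[float] = None

    def on_change(self, path: str, now: float) -> None:
        """Record a change; the first change in a batch starts the timer."""
        self._pending.add(path)
        if self._due_at is None:
            self._due_at = now + self.delay_s

    def poll(self, now: float) -> Optional[Set[str]]:
        """Return the batch of changed paths once the delay has elapsed."""
        if self._due_at is None or now < self._due_at:
            return None
        batch, self._pending, self._due_at = self._pending, set(), None
        return batch
```

With a very short delay, each change would tend to produce its own crawl, which is one way to understand why reducing the delay is not recommended.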

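To judge whether the Case Sensitive (UNIX) option, together with NFS Compatibility, is needed for a share, it can help to check for folder names that differ only by case. The sketch below is an independent helper script, not a HubStor tool; the function names are hypothetical, and it must be run on a system that sees the share with case-sensitive names.

```python
import os
from collections import defaultdict
from typing import Dict, Iterable, List


def colliding_names(names: Iterable[str]) -> List[List[str]]:
    """Group names that are identical when case is ignored."""
    groups = defaultdict(list)
    for name in names:
        groups[name.casefold()].append(name)
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)


def find_case_collisions(root: str) -> Dict[str, List[List[str]]]:
    """Walk a directory tree and report sibling folders that collide by case."""
    report: Dict[str, List[List[str]]] = {}
    for dirpath, dirnames, _filenames in os.walk(root):
        collisions = colliding_names(dirnames)
        if collisions:
            report[dirpath] = collisions
    return report
```

An empty result from `find_case_collisions` suggests that no sibling folders under the given root differ only by case.
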