ProcessorSettings

Schema for ProcessorSettings in Nuix REST API

Forensics, eDiscovery, Investigations, Compliance, Data Processing, Legal Technology, Intelligence

Properties

Name	Type	Description
processText	boolean	If true, stores and indexes the text of data items. Default is true.
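traversalScope	string	Selects the traversal scope for ingestion. One of: `full_traversal` - processes all items and their descendants, except where specific MIME type settings apply; `loose_files` - processes loose files but not their contents; `loose_files_and_forensic_images` - processes loose files and forensic images but not their contents, which can be used to 'explode' forensic images without processing all the files inside. Note: `identifyPhysicalFiles` must be set to true to explode forensic images.
processLooseFileContents	boolean	If true, the contents of loose files will be extracted and processed. If false, metadata about loose files will be extracted but their contents will not be processed. Default is true. Deprecated in Nuix 7.0, replaced with traversalScope.
processForensicImages	boolean	If true, the contents of forensic images will be exposed. If false, metadata about forensic images will be extracted but their contents will not be processed. This setting can be used in combination with processLooseFileContents to explode forensic images but not process their contents. Default is true. Deprecated in Nuix 7.0, replaced with traversalScope.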
analysisLanguage	string	Specifies the language to use for text analysis when indexing. A supported language code. Example codes are: "en" - English, "ja" - Japanese. Default is "en".
stopWords	boolean	If true, removes English stop words ("a", "and", "the", etc.) from the text index. If false, no stop words are removed. Default is false.
stemming	boolean	If true, stems words using English rules before indexing (e.g. "fishing" -> "fish"). If false, no stemming is performed. Default is false.
enableExactQueries	boolean	If true, enables search using "exact" queries. Default is false.
extractNamedEntities	boolean	If true, extracts named entities from the text of a document. Default is false. Deprecated in Nuix 6.0 and will be removed in a future release.
extractNamedEntitiesFromText	boolean	If true, extracts named entities from the text of a document. Default is false.
extractNamedEntitiesFromProperties	boolean	If true, extracts named entities from the properties of a document. Default is false.
extractNamedEntitiesFromTextStripped	boolean	If true, extracts named entities from the text of text-stripped items, if and only if 'extractNamedEntitiesFromText' is true. The 'extractNamedEntitiesFromProperties' setting is independent of this property. Default is false.
extractNamedEntitiesFromTextCommunications	boolean	If true, extracts named entities from the communication metadata of a document. Default is false.
extractShingles	boolean	If true, extract shingles from item text. Enabling this setting enables near deduplication. Default is true.
processTextSummaries	boolean	If true, process item text and summarise. Default is true.
calculateSSDeepFuzzyHash	boolean	If true, calculates SSDeep fuzzy hash values for items. Default is false.
calculatePhotoDNARobustHash	boolean	If true, calculates PhotoDNA robust hash values for image items. Default is false.
detectFaces	boolean	If true, detect faces in photographic items. Default is false.
classifyImagesWithDeepLearning	boolean	If true, classify images using Deep Learning. Requires additional following settings: imageClassificationModelUrl. Default is false.
imageClassificationModelUrl	string	URL pointing to Deep Learning model - can be a local file too, but in URL format (file://path/to/model).
extractFromSlackSpace	boolean	If true, extract deleted data from mailbox file formats and slack space from the end of file records in file system disk images. Default is false.
carveFileSystemUnallocatedSpace	boolean	If true, carve data out of file system unallocated space for disk images. Default is false.
carveUnidentifiedData	boolean	If true, carve data out of unidentified data items. Default is false.
carvingBlockSize	integer	If null, the block size of the file system is used. Otherwise the given block size is used. File identification is attempted at the start of each block, so the smaller the value, the longer processing will take. Avoid values smaller than 512 bytes except in specific cases. Default is null.
recoverDeletedFiles	boolean	If true, recover deleted file records from disk images. Default is true.
extractEndOfFileSlackSpace	boolean	If true, extract the slack space from the end of file records in disk images. Default is false.
smartProcessRegistry	boolean	If true, only process sections of the Registry that have decoders or have been explicitly selected. Default is false.
identifyPhysicalFiles	boolean	If false, only file system metadata is extracted for physical files on disk. Default is true.
createThumbnails	boolean	If true, create and store thumbnails of image data items. Default is true.
skinToneAnalysis	boolean	If true, perform analysis on images to detect skintones. Default is false.
calculateAuditedSize	boolean	If true, calculates audited size. Default is false.
storeBinary	boolean	If true, store the binary of data items. Default is false.
maxStoredBinarySize	integer	Specifies the maximum size of binary which will be stored into the binary store, in bytes. Default is 250000000 (250 MB).
maxDigestSize	integer	Specifies the maximum size of binary which will be digested, in bytes. Default is 250000000 (250 MB).
digests	array	A list of digests to calculate. Valid values "MD5", "SHA-1" or "SHA-256". Default is [ "MD5" ].
addBccToEmailDigests	boolean	If true, adds the Bcc field when computing email digests. Using the Bcc field in email digests may prevent the sender's and recipients' digests from matching, because only the sender's copy will have the Bcc field, if it is present. Default is false.
addCommunicationDateToEmailDigests	boolean	If true, adds the communication date when computing email digests. Using the communication date in email digests may prevent the sender's and recipients' digests from matching, because the sender's and recipients' communication dates and times can differ slightly for the same email. Default is false.
reuseEvidenceStores	boolean	If true, existing evidence stores are used to add any additional data into. Default is false.
processFamilyFields	boolean	If true, top-level items will contain search fields containing text from their family. Default is false.
hideEmbeddedImmaterialData	boolean	If true, hides embedded immaterial data items such as embedded images in documents. Default is false.
reportProcessingStatus	string	If "physical_files", then the total evidence physical file size is calculated before processing starts. If "none", then no up-front calculation is performed. Non-file data will always be treated as "none". Default is "none".
enableCustomProcessing	array	Which aspects of the data to expose in the callback from whenItemProcessed(ItemProcessedCallback). They are accessible from ProcessedItem.getProperties(), ProcessedItem.getTextFile() and ProcessedItem.getBinaryFile() respectively. All other processing options will be skipped when this option is enabled.
workerItemCallback	string	A string prefixed with "java:" followed by the name of a class with a no-argument constructor which implements Consumer<WorkerItem>, or the name of a scripting engine followed by a colon character followed by the script to execute. Example scripting engine names include "ruby", "python" and "ecmascript". For scripting languages, this string is the actual script code.
workerItemCallbacks	array	A list of strings specifying worker scripts. See workerItemCallback for script details. Scripts are processed in list order.
performOcr	boolean	If true, performs the OCR on the items based on the selected OCR profile. Default is false.
ocrProfileName	string	This must be set to an OCR profile name that exists.
createPrintedImage	boolean	If true, creates a printed image for items based on an imaging profile. Default is false.
imagingProfileName	string	If createPrintedImage is set to true, this specifies the name of the imaging profile to use during processing. This must be set to the name of an existing imaging profile that can be reached within the current context.
exportMetadata	boolean	If true, exports an item's metadata using a given metadata export profile. Default is false.
metadataExportProfileName	string	If exportMetadata is set to true, this specifies the name of the metadata export profile to use during processing. This must be set to the name of a metadata export profile that can be reached within the current context.
namedEntities	array	Passes in the list of named entities to be used in processing. If null, the default named entity settings are used.
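
Several properties above only make sense in combination with others. The sketch below collects those stated dependencies into a client-side pre-flight check. It is an illustration only: the function name `check_setting_dependencies` is hypothetical, the rules are read off the descriptions in the table, and nothing here is claimed to mirror how the Nuix server itself validates a request.

```python
from typing import Any

# Properties marked as deprecated in the table above.
DEPRECATED = ("processLooseFileContents", "processForensicImages", "extractNamedEntities")


def check_setting_dependencies(settings: dict[str, Any]) -> list[str]:
    """Return warnings for combinations the property descriptions flag as dependent."""
    warnings: list[str] = []

    # classifyImagesWithDeepLearning "requires" imageClassificationModelUrl.
    if settings.get("classifyImagesWithDeepLearning") and not settings.get("imageClassificationModelUrl"):
        warnings.append("classifyImagesWithDeepLearning requires imageClassificationModelUrl")

    # Exploding forensic images needs identifyPhysicalFiles to be true (its default).
    if (settings.get("traversalScope") == "loose_files_and_forensic_images"
            and settings.get("identifyPhysicalFiles", True) is False):
        warnings.append("loose_files_and_forensic_images needs identifyPhysicalFiles=true")

    # extractNamedEntitiesFromTextStripped only applies when extractNamedEntitiesFromText is true.
    if (settings.get("extractNamedEntitiesFromTextStripped")
            and not settings.get("extractNamedEntitiesFromText", False)):
        warnings.append("extractNamedEntitiesFromTextStripped has no effect without extractNamedEntitiesFromText")

    # exportMetadata relies on a named profile; the schema gives that name no default.
    if settings.get("exportMetadata") and not settings.get("metadataExportProfileName"):
        warnings.append("exportMetadata needs metadataExportProfileName")

    # Flag deprecated properties so callers can migrate away from them.
    for name in DEPRECATED:
        if name in settings:
            warnings.append(f"{name} is deprecated")

    return warnings


example = {
    "traversalScope": "loose_files_and_forensic_images",
    "identifyPhysicalFiles": False,
    "classifyImagesWithDeepLearning": True,
}
for message in check_setting_dependencies(example):
    print(message)
```

Returning a list of messages rather than raising keeps the check advisory, which suits rules inferred from prose descriptions rather than from the schema itself.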

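The `workerItemCallback` string format described above (a `java:` prefix plus a class name, or a scripting engine name, a colon, and the script code) can be assembled with two small helpers. This is a minimal sketch: the helper names, the class name `com.example.TagItems`, and the placeholder script body are all hypothetical, and the engine names are only the examples the description lists.

```python
import json


def java_callback(class_name: str) -> str:
    """Reference a Java class (no-argument constructor, implements Consumer<WorkerItem>)."""
    return f"java:{class_name}"


def script_callback(engine: str, script: str) -> str:
    """Prefix inline script code with the name of its scripting engine."""
    if not engine or ":" in engine:
        raise ValueError("engine name must be non-empty and contain no colon")
    return f"{engine}:{script}"


# workerItemCallbacks takes a list of such strings, processed in list order.
settings = {
    "workerItemCallbacks": [
        java_callback("com.example.TagItems"),                 # hypothetical class name
        script_callback("ruby", "# placeholder script body"),  # hypothetical script
    ]
}
print(json.dumps(settings, indent=2))
```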

JSON Schema

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "$id": "https://raw.githubusercontent.com/api-evangelist/nuix/refs/heads/main/json-schema/nuix-rest-processorsettings.json",
  "title": "ProcessorSettings",
  "description": "Schema for ProcessorSettings in Nuix REST API",
  "type": "object",
  "properties": {
    "processText": {
      "type": "boolean",
      "description": "If true, stores and indexes the text of data items.  Default is true.",
      "default": true
    },
    "traversalScope": {
      "type": "string",
      "description": "Selects the traversal scope for ingestion.\n* `full_traversal` - Processes all items and their descendants during processing except where specific MIME type settings apply.\n* `loose_files` - Processes loose files but not their contents.\n* `loose_files_and_forensic_images` - Processes loose files and forensic images but not their contents. This can be used to 'explode' forensic\n  images without processing all the files inside. Note: `identifyPhysicalFiles` must be set to true to explode forensic images.\n",
      "enum": [
        "full_traversal",
        "loose_files",
        "loose_files_and_forensic_images"
      ]
    },
    "processLooseFileContents": {
      "type": "boolean",
      "description": "If true, the contents of loose files will be extracted and processed. If false, metadata about loose files will be extracted but their contents will not be processed. Default is true. Deprecated in Nuix 7.0, replaced with traversalScope.",
      "deprecated": true,
      "default": true
    },
    "processForensicImages": {
      "type": "boolean",
      "description": "If true, the contents of forensic images will be exposed. If false, metadata about forensic images will be extracted but their contents will not be processed. This setting can be used in combination with processLooseFileContents to explode forensic images but not process their contents. Default is true. Deprecated in Nuix 7.0, replaced with traversalScope.",
      "deprecated": true,
      "default": true
    },
    "analysisLanguage": {
      "type": "string",
      "description": "Specifies the language to use for text analysis when indexing. A supported language code.  Example codes are: \"en\" - English, \"ja\" - Japanese. Default is \"en\".",
      "default": "en"
    },
    "stopWords": {
      "type": "boolean",
      "description": "If true, removes English stop words (\"a\", \"and\", \"the\", etc.) from the text index. If false, no stop words are removed. Default is false.",
      "default": false
    },
    "stemming": {
      "type": "boolean",
      "description": "If true, stems words using English rules before indexing (e.g. \"fishing\" -> \"fish\"). If false, no stemming is performed. Default is false.",
      "default": false
    },
    "enableExactQueries": {
      "type": "boolean",
      "description": "If true, enables search using \"exact\" queries.  Default is false.",
      "default": false
    },
    "extractNamedEntities": {
      "type": "boolean",
      "description": "If true, extracts named entities from the text of a document. Default is false. Deprecated in Nuix 6.0 and will be removed in a future release.",
      "deprecated": true,
      "default": false
    },
    "extractNamedEntitiesFromText": {
      "type": "boolean",
      "description": "If true, extracts named entities from the text of a document.  Default is false",
      "default": false
    },
    "extractNamedEntitiesFromProperties": {
      "type": "boolean",
      "description": "If true, extracts named entities from the properties of a document.  Default is false",
      "default": false
    },
    "extractNamedEntitiesFromTextStripped": {
      "type": "boolean",
      "description": "If true, extracts named entities from the text of text-stripped items, if and only if 'extractNamedEntitiesFromText' is true. The 'extractNamedEntitiesFromProperties' setting is independent of this property. Default is false.",
      "default": false
    },
    "extractNamedEntitiesFromTextCommunications": {
      "type": "boolean",
      "description": "If true, extract named entities from the communication metadata of a document.  Default is false.",
      "default": false
    },
    "extractShingles": {
      "type": "boolean",
      "description": "If true, extract shingles from item text. Enabling this setting enables near deduplication.  Default is true.",
      "default": true
    },
    "processTextSummaries": {
      "type": "boolean",
      "description": "If true, process item text and summarise.  Default is true.",
      "default": true
    },
    "calculateSSDeepFuzzyHash": {
      "type": "boolean",
      "description": "If true, calculate SSDeep fuzzy hash values for item. Default is false.",
      "default": false
    },
    "calculatePhotoDNARobustHash": {
      "type": "boolean",
      "description": "If true, calculate PhotoDNA robust hash values for image items. Default is false.",
      "default": false
    },
    "detectFaces": {
      "type": "boolean",
      "description": "If true, detect faces in photographic items. Default is false.",
      "default": false
    },
    "classifyImagesWithDeepLearning": {
      "type": "boolean",
      "description": "If true, classify images using Deep Learning. Requires additional following settings: imageClassificationModelUrl. Default is false.",
      "default": false
    },
    "imageClassificationModelUrl": {
      "type": "string",
      "description": "URL pointing to Deep Learning model - can be a local file too, but in URL format (file://path/to/model)."
    },
    "extractFromSlackSpace": {
      "type": "boolean",
      "description": "If true, extract deleted data from mailbox file formats and slack space from the end of file records in file system disk images.  Default is false.",
      "default": false
    },
    "carveFileSystemUnallocatedSpace": {
      "type": "boolean",
      "description": "If true, carve data out of file system unallocated space for disk images.  Default is false.",
      "default": false
    },
    "carveUnidentifiedData": {
      "type": "boolean",
      "description": "If true, carve data out of unidentified data items.  Default is false.",
      "default": false
    },
    "carvingBlockSize": {
      "type": "integer",
      "description": "If null, the block size of the file system is used. Otherwise the given block size is used. File identification is attempted at start of each block, so the smaller the value the longer processing will take. Avoid values smaller than 512 bytes except in specific cases. Default is null.",
      "format": "int32"
    },
    "recoverDeletedFiles": {
      "type": "boolean",
      "description": "If true, recover deleted file records from disk images.  Default is true.",
      "default": true
    },
    "extractEndOfFileSlackSpace": {
      "type": "boolean",
      "description": "If true, extract the slack space from the end of file records in disk images.  Default is false.",
      "default": false
    },
    "smartProcessRegistry": {
      "type": "boolean",
      "description": "If true, only process sections of the Registry that have decoders or have been explicitly selected. Default is false.",
      "default": false
    },
    "identifyPhysicalFiles": {
      "type": "boolean",
      "description": "If false, only file system metadata is extracted for physical files on disk.  Default is true.",
      "default": true
    },
    "createThumbnails": {
      "type": "boolean",
      "description": "If true, create and store thumbnails of image data items.  Default is true.",
      "default": true
    },
    "skinToneAnalysis": {
      "type": "boolean",
      "description": "If true, perform analysis on images to detect skintones.  Default is false.",
      "default": false
    },
    "calculateAuditedSize": {
      "type": "boolean",
      "description": "If true, calculates audited size.  Default is false.",
      "default": false
    },
    "storeBinary": {
      "type": "boolean",
      "description": "If true, store the binary of data items.  Default is false.",
      "default": false
    },
    "maxStoredBinarySize": {
      "type": "integer",
      "description": "Specifies the maximum size of binary which will be stored into the binary store, in bytes. Default is 250000000 (250 MB).",
      "default": 250000000,
      "format": "int32"
    },
    "maxDigestSize": {
      "type": "integer",
      "description": "Specifies the maximum size of binary which will be digested, in bytes. Default is 250000000 (250 MB).",
      "default": 250000000,
      "format": "int64"
    },
    "digests": {
      "type": "array",
      "description": "A list of digests to calculate.  Valid values \"MD5\", \"SHA-1\" or \"SHA-256\".  Default is [ \"MD5\" ].",
      "default": [
        "MD5"
      ],
      "items": {
        "type": "string"
      }
    },
    "addBccToEmailDigests": {
      "type": "boolean",
      "description": "If true, adds the Bcc field when computing email digests. Using the Bcc field in email digests may prevent the sender's and recipients' digests from matching, because only the sender's copy will have the Bcc field, if it is present. Default is false.",
      "default": false
    },
    "addCommunicationDateToEmailDigests": {
      "type": "boolean",
      "description": "If true, adds the communication date when computing email digests. Using the communication date in email digests may prevent the sender's and recipients' digests from matching, because the sender's and recipients' communication dates and times can differ slightly for the same email. Default is false.",
      "default": false
    },
    "reuseEvidenceStores": {
      "type": "boolean",
      "description": "If true, existing evidence stores are used to add any additional data into.  Default is false.",
      "default": false
    },
    "processFamilyFields": {
      "type": "boolean",
      "description": "If true, top-level items will contain search fields containing text from their family.  Default is false.",
      "default": false
    },
    "hideEmbeddedImmaterialData": {
      "type": "boolean",
      "description": "If true, hides embedded immaterial data items such as embedded images in documents.  Default is false.",
      "default": false
    },
    "reportProcessingStatus": {
      "type": "string",
      "description": "If \"physical_files\", then the total evidence physical file size is calculated before processing starts. If \"none\", then no up-front calculation is performed. Non-file data will always be treated as \"none\". Default is \"none\".",
      "default": "none"
    },
    "enableCustomProcessing": {
      "type": "array",
      "description": "Which aspects of the data to expose in the callback from whenItemProcessed(ItemProcessedCallback). They are accessible from ProcessedItem.getProperties(), ProcessedItem.getTextFile() and ProcessedItem.getBinaryFile() respectively. All other processing options will be skipped when this option is enabled.",
      "items": {
        "type": "string",
        "enum": [
          "properties",
          "binary",
          "text"
        ]
      }
    },
    "workerItemCallback": {
      "type": "string",
      "description": "A string prefixed with \"java:\" followed by the name of a class with a no-argument constructor which implements Consumer<WorkerItem>, or the name of a scripting engine followed by a colon character followed by the script to execute. Example scripting engine names include \"ruby\", \"python\" and \"ecmascript\". For scripting languages, this string is the actual script code."
    },
    "workerItemCallbacks": {
      "type": "array",
      "description": "A list of strings specifying worker scripts. See workerItemCallback for script details. Scripts are processed in list order.",
      "items": {
        "type": "string"
      }
    },
    "performOcr": {
      "type": "boolean",
      "description": "If true, performs the OCR on the items based on the selected OCR profile.  Default is false.",
      "default": false
    },
    "ocrProfileName": {
      "type": "string",
      "description": "This must be set to an OCR profile name that exists.",
      "default": "Default"
    },
    "createPrintedImage": {
      "type": "boolean",
      "description": "If true, creates a printed image for items based on an imaging profile.  Default is false.",
      "default": false
    },
    "imagingProfileName": {
      "type": "string",
      "description": "If createPrintedImage is set to true, this specifies the name of the imaging profile to use during processing. This must be set to the name of an existing imaging profile that can be reached within the current context.",
      "default": "Processing Default"
    },
    "exportMetadata": {
      "type": "boolean",
      "description": "If true, exports an item's metadata using a given metadata export profile.  Default is false.",
      "default": false
    },
    "metadataExportProfileName": {
      "type": "string",
      "description": "If exportMetadata is set to true, this specifies the name of the metadata export profile to use during processing. This must be set to the name of a metadata export profile that can be reached within the current context."
    },
    "namedEntities": {
      "type": "array",
      "description": "Passes in the list of named entities to be used in processing. If null, the default named entity settings are used.",
      "items": {
        "$ref": "#/components/schemas/NamedEntity"
      }
    }
  }
}
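
As a rough illustration of how the schema above constrains a settings object, the following sketch checks a dictionary against a hand-copied excerpt of the `properties` block using only the Python standard library. It covers just the `type`, `enum`, `items` and `deprecated` keywords. The function names are hypothetical, unknown keys and null values are skipped by assumption, and a full JSON Schema validator would be the better choice for real use.

```python
from typing import Any

# Excerpt of the schema above, reduced to the keywords this sketch understands.
SCHEMA_EXCERPT: dict[str, Any] = {
    "properties": {
        "processText": {"type": "boolean"},
        "traversalScope": {
            "type": "string",
            "enum": ["full_traversal", "loose_files", "loose_files_and_forensic_images"],
        },
        "processLooseFileContents": {"type": "boolean", "deprecated": True},
        "maxDigestSize": {"type": "integer"},
        "digests": {"type": "array", "items": {"type": "string"}},
        "enableCustomProcessing": {
            "type": "array",
            "items": {"type": "string", "enum": ["properties", "binary", "text"]},
        },
    }
}


def _matches(value: Any, type_name: Any) -> bool:
    """Map a JSON Schema type name to a Python isinstance check."""
    if type_name == "boolean":
        return isinstance(value, bool)
    if type_name == "integer":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, int) and not isinstance(value, bool)
    if type_name == "string":
        return isinstance(value, str)
    if type_name == "array":
        return isinstance(value, list)
    return True  # type names outside this sketch are not checked


def _check(path: str, value: Any, spec: dict[str, Any]) -> list[str]:
    """Check one value against one property (or items) definition."""
    if not _matches(value, spec.get("type")):
        return [f"{path}: expected {spec['type']}"]
    errors: list[str] = []
    if "enum" in spec and value not in spec["enum"]:
        errors.append(f"{path}: {value!r} not in {spec['enum']}")
    if spec.get("type") == "array" and "items" in spec:
        for index, item in enumerate(value):
            errors.extend(_check(f"{path}[{index}]", item, spec["items"]))
    return errors


def validate(settings: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means nothing was flagged."""
    problems: list[str] = []
    for name, value in settings.items():
        spec = SCHEMA_EXCERPT["properties"].get(name)
        if spec is None or value is None:
            continue  # assumption: unknown keys and nulls are out of scope here
        problems.extend(_check(name, value, spec))
        if spec.get("deprecated"):
            problems.append(f"{name}: deprecated")
    return problems


print(validate({"traversalScope": "everything", "digests": ["MD5", "SHA-256"]}))
```

Note that the schema declares `digests` items only as strings, so the valid values named in its description ("MD5", "SHA-1", "SHA-256") are not enforced by a schema-driven check like this one.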