1. **Access the VPC dashboard**:
a. In the AWS Management Console, in the top menu bar, click **Services > Networking & Content Delivery > VPC**.
3. **Create the subnets**:
a. After creating the VPC, in the sidebar, click **Subnets**.
4. **Create the internet gateway (for the public subnets)**:
a. In the sidebar, click **Internet gateways**.
5. **Set up route tables (for the public subnets)**:
AWS automatically created a default route table in *Step 3 - Create the subnets*. To tailor your network architecture, you will create a new route table specifically for your public subnets, which will include a route to the internet gateway from *Step 4 - Create the internet gateway (for the public subnets)*.
a. In the sidebar, click **Route tables**.
b. Click **Create route table**.
c. Enter a **Name**.
d. Select the **VPC** from *Step 2 - Create the VPC*.
e. Click **Create route table**.
6. **Associate public subnets to the route table and internet gateway**:
a. Connect the **public subnets** to the **route table** from *Step 5 - Set up route tables (for the public subnets)*:
7. **Inspect the VPC resource map**:
To check the configuration, view the resource map on the VPC details page: click **Your VPCs** in the sidebar, click the **VPC ID** for your VPC, and then click the **Resource map** tab.
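
To see why Step 5 gives the public subnets their own route table, it can help to model how a route table picks a target. The following is a minimal sketch, not an AWS API call: the CIDR blocks and the `igw-example` target are hypothetical placeholders, and the rule it models is that the most specific matching route is used.

```python
import ipaddress

# Hypothetical routes for a public route table: the VPC's local route plus
# a default route that sends all other traffic to the internet gateway.
PUBLIC_ROUTES = {
    "10.0.0.0/16": "local",
    "0.0.0.0/0": "igw-example",
}


def pick_target(destination, routes):
    """Return the target of the most specific route that matches destination."""
    address = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in routes.items()
        if address in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None
    # The longest prefix is the most specific match.
    return max(matches, key=lambda item: item[0].prefixlen)[1]


print(pick_target("10.0.1.25", PUBLIC_ROUTES))      # traffic that stays inside the VPC
print(pick_target("93.184.216.34", PUBLIC_ROUTES))  # traffic bound for the internet
```

A route table without the `0.0.0.0/0` entry would return no target for the second address, which is the situation a subnet is in when it has no route to an internet gateway.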
## Part II: Deploying the Unstructured API from the AWS Marketplace
8. **Go to the Unstructured API page on AWS Marketplace**:
a. Leaving the VPC dashboard from Part I open, in a separate web browser tab, go to the [Unstructured API](http://aws.amazon.com/marketplace/pp/prodview-fuvslrofyuato) product page in the AWS Marketplace.
b. Click **Continue to Subscribe**.
c. Review the terms and conditions.
d. Click **Continue to Configuration**.
9. **Configure the CloudFormation template**:
a. In the **Fulfillment option** dropdown list, select **CloudFormation Template**.
b. For **Fulfillment option** and **Software version**, leave the default `UnstructuredAPI` template and software version.
c. In the **Region** dropdown list, select the Region that corresponds to the VPC from Part I.
* *Note: You must select the same Region where you set up the VPC in Part I. To find the Region, on the VPC dashboard tab from Part I that you left open, with your VPC displayed, find the VPC's Region name next to your username in the top navigation bar.*
d. Click **Continue to Launch**.
e. In the **Choose Action** dropdown list, select **Launch CloudFormation**.
f. Click **Launch**.
10. **Create the CloudFormation stack**:
After you click **Launch**, the **Create stack** page appears in CloudFormation.
**Step 1: Create the stack**
a. Leave **Choose an existing template** selected.
b. Leave **Amazon S3 URL** selected and the default **Amazon S3 URL** value unchanged.
c. Click **Next**.
**Step 2: Specify the stack's details**
a. Enter a unique **Stack name**.
b. In the **Parameters** section, in the **InstanceType** dropdown list, select **m5.xlarge**.
c. In the **KeyName** dropdown list, select the name of the SSH key pair from the beginning of this article.
d. In the **LoadBalancerScheme** dropdown list, select **internet-facing**.
e. For **SSHLocation**, enter `0.0.0.0/0`, but only if you allow public access on the internet.
* **Note**: It is generally recommended to limit SSH access to a specific IP range for enhanced security. This can be done by setting the `SSHLocation` to the IP address or range associated with your organization. Please consult your IT department or VPN vendor to obtain the correct IP information for these settings.
* AWS provides `AWS Client VPN`, which is a managed client-based VPN service that enables secure access to AWS resources and resources in your on-premises network. To learn more, see [Getting started with AWS Client VPN](https://docs.aws.amazon.com/vpn/latest/clientvpn-admin/cvpn-getting-started.html).
f. In the **Subnets** dropdown multiselect list, select the two public subnets and the private subnet from Part I.
g. In the **VPC** dropdown list, select the VPC from Part I.
h. You can leave the default values for all of the other **Parameters** fields.
i. Click **Next**.
**Step 3: Configure the stack's options**
a. You can leave the default values, or specify any non-default stack options.
b. Click **Next**.
**Step 4: Review**
a. Review the stack's settings.
b. Click **Submit**.
11. **Get the Unstructured API endpoint**:
a. The CloudFormation details page for the stack appears. If you do not see it, on the sidebar, click **Stacks**, and then click the name of your stack.
b. Check the status of the CloudFormation stack. A successful deployment will show a **CREATE\_COMPLETE** value for the **Status** field on the **Stack Info** tab on this stack's details page. The deployment can take several minutes.
c. After a successful deployment, click the **Resources** tab on this stack's details page. Then click the **Physical ID** link next to **ApplicationLoadBalancer** on this tab.
d. On the **EC2 > Load balancers > (Load balancer ID)** page that appears, copy the **DNS Name** value, which is shown as an **(A Record)** and ends with `.elb.amazonaws.com`.
* Note: You will use this **DNS Name** to replace the `
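
Returning to the **SSHLocation** parameter from Step 2 of the stack creation: before entering a CIDR range, it can be useful to check how many addresses that range would admit. The following is a minimal sketch using Python's standard library; the example ranges are placeholders from a documentation-only address block, not values to use in your deployment.

```python
import ipaddress


def describe_ssh_location(cidr):
    """Summarize how many IPv4 addresses a CIDR block would allow."""
    # strict=True rejects values such as 203.0.113.7/24, where host bits are set.
    network = ipaddress.ip_network(cidr, strict=True)
    if network.prefixlen == 0:
        return f"{cidr} allows SSH from any address on the internet"
    return f"{cidr} allows SSH from {network.num_addresses} address(es)"


print(describe_ssh_location("0.0.0.0/0"))
# 203.0.113.0/24 is reserved for documentation and is used here as a placeholder.
print(describe_ssh_location("203.0.113.0/24"))
print(describe_ssh_location("203.0.113.7/32"))
```

A `/32` range admits a single address, which is the narrowest setting when only one fixed IP needs SSH access.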
## Healthcheck
Perform a health check by running this [curl](https://curl.se/) command from a terminal on your local machine, replacing `
## Data processing
For example, run one of the following. To make your code more portable, first set the following environment variables:
* Set `UNSTRUCTURED_API_URL` to `http://`, followed by your load balancer's DNS name, followed by `/general/v0/general`.
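
The URL described above can be assembled as follows. This is a small sketch: the `build_api_url` helper and the DNS name are hypothetical, and only the `http://` prefix and the `/general/v0/general` path come from the instructions above.

```python
def build_api_url(load_balancer_dns_name):
    """Build the UNSTRUCTURED_API_URL value from a load balancer DNS name."""
    host = load_balancer_dns_name.strip().rstrip("/")
    if "://" in host:
        raise ValueError("Pass only the DNS name, without a scheme such as http://")
    return f"http://{host}/general/v0/general"


# Hypothetical DNS name; use the value you copied from the load balancer page.
print(build_api_url("my-stack-alb-1234567890.us-east-1.elb.amazonaws.com"))
```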
5. Click **Create**. The function is created, and the **Code + Test** page appears.
## Step 3: Customize the function for your workflow
1. With the **Code + Test** page open from the previous step, on the **Code + Test** tab, replace the content of the `index.js` file with the following code:
```javascript theme={null}
module.exports = async function (context, myBlob) {
    context.log("JavaScript blob trigger function processed blob \n Blob:", context.bindingData.blobTrigger, "\n Blob Size:", myBlob.length, "Bytes");

    // Read the API key and URL from the function app's environment variables.
    const apiKey = process.env.UNSTRUCTURED_API_KEY;
    const apiUrl = process.env.UNSTRUCTURED_API_URL;

    const headers = {
        "accept": "application/json",
        "unstructured-api-key": apiKey
    };

    try {
        // Call the API and log the JSON response.
        const response = await fetch(apiUrl, {
            method: "POST",
            headers: headers
        });
        const data = await response.json();
        context.log("POST response:", data);
    } catch (error) {
        context.log.error("Error calling external API:", error);
    }
};
```
2. Click **Save**.
3. In the navigation breadcrumb toward the top of the page, click your function app's name. The function app's settings page appears.
4. In the sidebar, expand **Settings**, and then click **Environment variables**.
5. Click **+ Add**.
6. For **Name**, enter `UNSTRUCTURED_API_URL`.
7. For **Value**, enter your `
3. In the **Instance details** section, enter a name in the **Virtual machine name** field. Note this name, as you will need it in later steps.
4. Select a **Region** from the dropdown menu.
5. For **Image**, select **Unstructured Customer Hosted API Hourly - x64 Gen2** (*default*).
6. For **Size**, select a VM size from the dropdown menu, or leave the default VM size selection. To learn more, see [Azure VM comparisons](https://azure.microsoft.com/en-us/pricing/details/virtual-machines/linux/).
7. In the **Administrator account** section, for **Authentication type**, select **SSH public key** or **Password**.
8. Enter the credential settings, depending on the authentication type.
4. Click **Create**.
4. The deployed endpoint URL is **http\://\

```python theme={null}
Hello 😀
"""
elements = partition_html(text=text)
elements[0].apply(bytes_string_to_string)
# The output should be "Hello 😀"
elements[0].text
```

For more information about the `bytes_string_to_string` function, you can check the [source code here](https://github.com/Unstructured-IO/unstructured/blob/main/unstructured/cleaners/core.py).

## `clean`

Cleans a section of text with options including removing bullets, extra whitespace, dashes and trailing punctuation. Optionally, you can choose to lowercase the output.

Options:

* Applies `clean_bullets` if `bullets=True`.
* Applies `clean_extra_whitespace` if `extra_whitespace=True`.
* Applies `clean_dashes` if `dashes=True`.
* Applies `clean_trailing_punctuation` if `trailing_punctuation=True`.
* Lowercases the output if `lowercase=True`.

Examples:

```python theme={null}
from unstructured.cleaners.core import clean

# Returns "an excellent point!"
clean("● An excellent point!", bullets=True, lowercase=True)

# Returns "ITEM 1A: RISK FACTORS"
clean("ITEM 1A: RISK-FACTORS", extra_whitespace=True, dashes=True)
```

For more information about the `clean` function, you can check the [source code here](https://github.com/Unstructured-IO/unstructured/blob/main/unstructured/cleaners/core.py).

## `clean_bullets`

Removes bullets from the beginning of text. Bullets that do not appear at the beginning of the text are not removed.

Examples:

```python theme={null}
from unstructured.cleaners.core import clean_bullets

# Returns "An excellent point!"
clean_bullets("● An excellent point!")

# Returns "I love Morse Code! ●●●"
clean_bullets("I love Morse Code! ●●●")
```

For more information about the `clean_bullets` function, you can check the [source code here](https://github.com/Unstructured-IO/unstructured/blob/main/unstructured/cleaners/core.py).

## `clean_dashes`

Removes dashes from a section of text. Also handles special characters such as `\u2013`.

Examples:

```python theme={null}
from unstructured.cleaners.core import clean_dashes

# Returns "ITEM 1A: RISK FACTORS"
clean_dashes("ITEM 1A: RISK-FACTORS\u2013")
```

For more information about the `clean_dashes` function, you can check the [source code here](https://github.com/Unstructured-IO/unstructured/blob/main/unstructured/cleaners/core.py).

## `clean_non_ascii_chars`

Removes non-ascii characters from a string.

Examples:

```python theme={null}
from unstructured.cleaners.core import clean_non_ascii_chars

text = "\x88This text contains ®non-ascii characters!●"

# Returns "This text contains non-ascii characters!"
clean_non_ascii_chars(text)
```

For more information about the `clean_non_ascii_chars` function, you can check the [source code here](https://github.com/Unstructured-IO/unstructured/blob/main/unstructured/cleaners/core.py).

## `clean_ordered_bullets`

Removes alphanumeric bullets from the beginning of text up to three “sub-section” levels.

Examples:

```python theme={null}
from unstructured.cleaners.core import clean_ordered_bullets

# Returns "This is a very important point"
clean_ordered_bullets("1.1 This is a very important point")

# Returns "This is a very important point ●"
clean_ordered_bullets("a.b This is a very important point ●")
```

For more information about the `clean_ordered_bullets` function, you can check the [source code here](https://github.com/Unstructured-IO/unstructured/blob/main/unstructured/cleaners/core.py).

## `clean_postfix`

Removes the postfix from a string if it matches a specified pattern.

Options:

* Ignores case if `ignore_case` is set to `True`. The default is `False`.
* Strips trailing whitespace if `strip` is set to `True`. The default is `True`.

Examples:

```python theme={null}
from unstructured.cleaners.core import clean_postfix

text = "The end! END"

# Returns "The end!"
clean_postfix(text, r"(END|STOP)", ignore_case=True)
```

For more information about the `clean_postfix` function, you can check the [source code here](https://github.com/Unstructured-IO/unstructured/blob/main/unstructured/cleaners/core.py).

## `clean_prefix`

Removes the prefix from a string if it matches a specified pattern.

Options:

* Ignores case if `ignore_case` is set to `True`. The default is `False`.
* Strips leading whitespace if `strip` is set to `True`. The default is `True`.

Examples:

```python theme={null}
from unstructured.cleaners.core import clean_prefix

text = "SUMMARY: This is the best summary of all time!"

# Returns "This is the best summary of all time!"
clean_prefix(text, r"(SUMMARY|DESCRIPTION):", ignore_case=True)
```

For more information about the `clean_prefix` function, you can check the [source code here](https://github.com/Unstructured-IO/unstructured/blob/main/unstructured/cleaners/core.py).

## `clean_trailing_punctuation`

Removes trailing punctuation from a section of text.

Examples:

```python theme={null}
from unstructured.cleaners.core import clean_trailing_punctuation

# Returns "ITEM 1A: RISK FACTORS"
clean_trailing_punctuation("ITEM 1A: RISK FACTORS.")
```

For more information about the `clean_trailing_punctuation` function, you can check the [source code here](https://github.com/Unstructured-IO/unstructured/blob/main/unstructured/cleaners/core.py).

## `group_broken_paragraphs`

Groups together paragraphs that are broken up with line breaks for visual or formatting purposes. This is common in `.txt` files. By default, `group_broken_paragraphs` groups together lines split by `\n`. You can change that behavior with the `line_split` kwarg. The function considers `\n\n` to be a paragraph break by default. You can change that behavior with the `paragraph_split` kwarg.

Examples:

```python theme={null}
from unstructured.cleaners.core import group_broken_paragraphs

text = """The big brown fox
was walking down the lane.

At the end of the lane, the
fox met a bear."""

group_broken_paragraphs(text)
```

```python theme={null}
import re
from unstructured.cleaners.core import group_broken_paragraphs

para_split_re = re.compile(r"(\s*\n\s*){3}")

text = """The big brown fox

was walking down the lane.


At the end of the lane, the

fox met a bear."""

group_broken_paragraphs(text, paragraph_split=para_split_re)
```

For more information about the `group_broken_paragraphs` function, you can check the [source code here](https://github.com/Unstructured-IO/unstructured/blob/main/unstructured/cleaners/core.py).

## `remove_punctuation`

Removes ASCII and unicode punctuation from a string.

Examples:

```python theme={null}
from unstructured.cleaners.core import remove_punctuation

# Returns "A lovely quote"
remove_punctuation("“A lovely quote!”")
```

For more information about the `remove_punctuation` function, you can check the [source code here](https://github.com/Unstructured-IO/unstructured/blob/main/unstructured/cleaners/core.py).

## `replace_unicode_quotes`

Replaces unicode quote characters such as `\x91` in strings.

Examples:

```python theme={null}
from unstructured.cleaners.core import replace_unicode_quotes

# Returns "“A lovely quote!”"
replace_unicode_quotes("\x93A lovely quote!\x94")

# Returns "‘A lovely quote!’"
replace_unicode_quotes("\x91A lovely quote!\x92")
```

For more information about the `replace_unicode_quotes` function, you can check the [source code here](https://github.com/Unstructured-IO/unstructured/blob/main/unstructured/cleaners/core.py).

## `translate_text`

The `translate_text` cleaning function translates text between languages. `translate_text` uses the [Helsinki NLP MT models](https://huggingface.co/Helsinki-NLP) from `transformers` for machine translation. Works for Russian, Chinese, Arabic, and many other languages.

Parameters:

* `text`: the input string to translate.
* `source_lang`: the two letter language code for the source language of the text.
  If `source_lang` is not specified, the language will be detected using `langdetect`.
* `target_lang`: the two letter language code for the target language for translation. Defaults to `"en"`.

Examples:

```python theme={null}
from unstructured.cleaners.translate import translate_text

# Output is "I'm a Berliner!"
translate_text("Ich bin ein Berliner!")

# Output is "I can also translate Russian!"
translate_text("Я тоже можно переводать русский язык!", "ru", "en")
```

For more information about the `translate_text` function, you can check the [source code here](https://github.com/Unstructured-IO/unstructured/blob/main/unstructured/cleaners/translate.py).

---

# Source: https://docs.unstructured.io/support/issues/configuration-resource.md

> ## Documentation Index
> Fetch the complete documentation index at: https://docs.unstructured.io/llms.txt
> Use this file to discover all available pages before exploring further.

# Configuration and resource issues

## Issues

When you try to connect Unstructured to a specific source or destination, you get one of the following error types:

* `FileNotFoundError`: For example, for Amazon S3, a `path not found` error.
* `ClientRequestException`: For example, for OneDrive, an `itemNotFound` error.
* `ValueError`: For example, for Google Drive, a `File not found` error. For Amazon S3, an `Invalid endpoint` error.
* `UserError`: For example, for Azure Blob Storage, a `DeploymentNotFound` error.
* `ParamValidationError`: For example, for Amazon S3, an `Invalid bucket name` error.
* `EndpointResolutionError`: For example, for Amazon S3, a `Custom endpoint not valid URI` error.
* `ProgrammingError`: For example, for Snowflake, a `No active warehouse selected` error.
* `UnboundLocalError`: For example, for SharePoint, a `cannot access 'site_drive_item'` error.
* `HTTPError`: For example, for Confluence, a `404 Not Found` error for a specific page or attachment URL.
* `KeyError`: For example, for Jira, an error containing the word `total`. For Amazon S3, an error containing the word `Key`.

## Possible causes

* Unstructured is configured to interact with a resource (such as a file, path, deployment, endpoint, or database object) that doesn't exist or is misnamed, or the configuration itself is invalid.
* There is a typo in a bucket name, folder path, file ID, deployment name, hostname, site path, or database name.
* The specified resource has been deleted or moved.
* An endpoint URL is incorrectly formatted, for example, is missing `https://` or contains invalid characters.
* An Amazon S3 bucket name is not formatted correctly.
* A required configuration is missing in the source or destination connector, for example, there is no active Snowflake warehouse specified.
* An Azure OpenAI deployment name is mismatched, failed, or does not exist.
* An invalid URL is specified for an attachment or a link within a source document.
* There is a specific configuration issue with a connector, for example, the specified SharePoint path does not lead to a valid drive.

## Possible solutions

* **Verify names and paths**: Carefully check all configured names, IDs, and paths (such as for buckets, folders, files, sites, deployments, and endpoint URLs) for accuracy. Ensure they exist in the source and destination system. Case sensitivity often matters.
* **Check formatting**: Ensure that URLs, bucket names, and other parameters adhere to the required format.
* **Verify that the resource exists**: Confirm that the target file, folder, deployment, or other resource exists and has not been moved or deleted.
* **Check configuration dependencies**: Ensure the necessary configurations are set in the source or destination, for example, select and start a Snowflake warehouse by running the `USE WAREHOUSE` command first.
* For **Azure OpenAI**: Double-check that the Deployment Name matches a successful deployment in your Azure portal.
* For **Confluence**: If a `404` error occurs during download, check if the page or attachment link is valid within Confluence itself.
* For **SharePoint**: Verify the Site Path leads to a valid location containing document libraries.
* **Reconfigure the connector**: Review and, as needed, correct any misconfigured settings in the source or destination connector.

## Additional resources

To ask questions or get additional help with this issue, see [requesting support](/support/request).

---

# Source: https://docs.unstructured.io/ui/sources/confluence.md
# Source: https://docs.unstructured.io/open-source/ingestion/source-connectors/confluence.md
# Source: https://docs.unstructured.io/api-reference/workflow/sources/confluence.md

# Confluence

| Dataset | Base Model | Notes |
|---|---|---|
| PubLayNet [38] | F/M | Layouts of modern scientific documents |
| PRImA [3] | M | Layouts of scanned modern magazines and scientific reports |
| Newspaper | F | Layouts of scanned US newspapers from the 20th century |
| TableBank | F | Table region on modern scientific and business document |
| HJDataset [31] | F/M | Layouts of history Japanese documents |
## Step 2: Create a Dropbox app
In this step, you create a Dropbox app in your Dropbox account. Unstructured will use this app to access your Dropbox account.
1. From a new tab in your web browser, open the Dropbox Developers page, at [https://www.dropbox.com/developers](https://www.dropbox.com/developers).
2. Click **Create apps**.
8. On the Dropbox app's **Permissions** tab, under **Files and folders**, check the box labelled **files.content.read**, and then click **Submit**.
9. On the app's **Settings** tab, note the value of the **App folder name** field. This is the name of the subfolder that Dropbox will create under the `Apps` top-level folder in your Dropbox account. Your new Dropbox app will use this subfolder for access.
10. With the app's **Settings** tab still showing, scroll down to **App key**.
11. Next to **App secret**, click **Show**.
12. Note the values of **App key** and **App secret**, as you will need them later for Steps 3 and 5.
## Step 3: Get a refresh token for your Dropbox app
In this step, you get a refresh token for your Dropbox app. Unstructured needs this refresh token, along with the
**App key** and **App secret** from the previous step, to be able to use your Dropbox app to connect to your Dropbox account.
1. In a new tab in your web browser, enter the following address. In this address,
replace `
3. Click **Allow**.
4. Note the value in the **Access Code Generated** box.
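
The access code noted above is typically exchanged for a refresh token by calling Dropbox's OAuth 2 token endpoint. The following sketch only builds that request and does not send it. The endpoint URL and form fields are assumptions based on Dropbox's standard authorization-code flow (they do not appear in the steps above), and the key, secret, and code values are placeholders.

```python
import base64
import urllib.parse
import urllib.request

# Assumption: Dropbox's standard OAuth 2 token endpoint.
TOKEN_URL = "https://api.dropbox.com/oauth2/token"


def build_token_request(app_key, app_secret, access_code):
    """Build (but do not send) the request that trades an access code for tokens."""
    form = urllib.parse.urlencode(
        {"code": access_code, "grant_type": "authorization_code"}
    ).encode("ascii")
    # The app key and app secret are sent as HTTP Basic credentials.
    credentials = base64.b64encode(
        f"{app_key}:{app_secret}".encode("utf-8")
    ).decode("ascii")
    return urllib.request.Request(
        TOKEN_URL,
        data=form,
        headers={"Authorization": f"Basic {credentials}"},
        method="POST",
    )


request = build_token_request("APP_KEY", "APP_SECRET", "ACCESS_CODE")
print(request.get_method(), request.full_url)
```

Sending the request (for example with `urllib.request.urlopen(request)`) should return a JSON document. Under the assumption that offline access was requested during authorization, that document is where the refresh token would appear.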
3. Give the subfolder a name, and then click **Create**.
4. Click **Upload or drop** (or **Upload > Files** or **Upload > Folder**), and then follow the on-screen instructions to upload some documents to this subfolder in your Dropbox app folder.
For a Dropbox Basic account, the total size of all of the files you upload and store in your Dropbox account (not just this subfolder) cannot exceed 2 GB.
## Step 5: Create the Dropbox source connector
In this step, you create a Dropbox source connector in your Unstructured account. This source connector
is used by Unstructured to connect to your Dropbox account and then process the documents in the specified folder.
1. If you do not already have an Unstructured account, [sign up for free](https://unstructured.io/?modal=try-for-free).
After you sign up, you are automatically signed in to your new Unstructured **Let's Go** account, at [https://platform.unstructured.io](https://platform.unstructured.io).
4. Click **+ New**.
5. Enter a unique name for this connector, for example `dropbox-source-connector`.
6. For **Type**, click **Source**.
7. For **Provider**, click **Dropbox**.
8. Click **Continue**.
9. For **Data URL**, enter `dropbox://`, followed by the name of the subfolder you created in the previous step. For example,
if the name of the subfolder is `my-folder`, then the data URL would be `dropbox://my-folder`.
10. For **App key**, enter the **App key** you noted in Step 2.
11. For **App secret**, enter the **App secret** you noted in Step 2.
12. For **Refresh token**, enter the **Refresh token** you noted in Step 3.
13. Click **Save and Test**, and wait while Unstructured tests the connector.
14. If a green **Successful** message appears, then you have successfully created the connector.
If, however, a red error message appears, fix the issue, and try this step again.
If you cannot fix the issue, contact Unstructured Support at [support@unstructured.io](mailto:support@unstructured.io).
Congratulations! You have successfully created a Dropbox source connector in your Unstructured account.
If you are not able to complete these steps, contact Unstructured Support at [support@unstructured.io](mailto:support@unstructured.io).
## Next steps
* If you do not have a destination connector in your Unstructured account, then complete the [Pinecone destination connector quickstart](/ui/destinations/pinecone-destination-quickstart).
If you're not sure if you have a destination connector, click **Connectors** in your Unstructured account's sidebar, and then click **Destinations** to see if there are any listed.
* If you already have a destination connector, then you can add this Dropbox source connector as well as your destination connector to a workflow in your Unstructured account. To do this:
1. Click **Workflows** in your Unstructured account's sidebar.
2. Click **New Workflow +**.
3. With **Build it Myself** already selected, click **Continue**.
4. Click the **Source** node. (Do not click **Drop file to test**.)
5. On the **Details** tab, click **Connectors**, and then click the name of your Dropbox source connector.
6. Click the **Destination** node.
7. On the **Details** tab, click the name of your destination connector.
8. Switch **Active** to on, and then click **Save**.
10. Click **Jobs** in your Unstructured account's sidebar.
11. After the job shows **Finished** with a green checkmark, go to your destination's location to see Unstructured's processed data output.
If you are not able to complete these steps, contact Unstructured Support at [support@unstructured.io](mailto:support@unstructured.io).
---
# Source: https://docs.unstructured.io/ui/sources/dropbox.md
# Source: https://docs.unstructured.io/open-source/ingestion/source-connectors/dropbox.md
# Source: https://docs.unstructured.io/open-source/ingestion/destination-connectors/dropbox.md
# Source: https://docs.unstructured.io/api-reference/workflow/sources/dropbox.md
# Dropbox
## Replace an expired access token
Dropbox app access tokens are valid for **only four hours**. After this time, you can no longer use the expired access token.
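
As a rough illustration of the four-hour limit, the following sketch checks whether a token issued at a known time has outlived that lifetime. The `is_expired` helper and the timestamps are hypothetical; Dropbox and Unstructured do not expose this function.

```python
from datetime import datetime, timedelta, timezone

# Lifetime stated above for Dropbox app access tokens.
ACCESS_TOKEN_LIFETIME = timedelta(hours=4)


def is_expired(issued_at, now=None):
    """Return True if a token issued at issued_at is past its four-hour lifetime."""
    now = now or datetime.now(timezone.utc)
    return now - issued_at >= ACCESS_TOKEN_LIFETIME


issued = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
print(is_expired(issued, now=datetime(2024, 1, 1, 12, 59, tzinfo=timezone.utc)))  # False
print(is_expired(issued, now=datetime(2024, 1, 1, 13, 0, tzinfo=timezone.utc)))   # True
```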
To have Unstructured automatically replace expired access tokens on your behalf, do the following:
1. Get the app key and app secret values for your Dropbox app. To do this:
a) Sign in to the [Dropbox Developers](https://www.dropbox.com/developers) portal with the same credentials as your Dropbox account.
Before (vertical watermarked text, represented incorrectly):
```json theme={null}
{
"...": "...",
"text": "3 2 0 2 t c O 9 2 ] V C . s c [ 2 v 9 0 8 6 1 . 0 1 3 2 : v i X r",
"...": "..."
}
```
After (vertical watermarked text, now represented correctly from the original content):
```json theme={null}
{
"...": "...",
"text": "arXiv:2310.16809v2 [cs.CV] 29 Oct 2023",
"...": "..."
}
```
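
One way to see what went wrong in the "before" output: its characters are the correct text read in reverse order, one character at a time. The following sketch only illustrates that relationship between the two strings above; it is not a description of how generative OCR works internally.

```python
before = "3 2 0 2 t c O 9 2 ] V C . s c [ 2 v 9 0 8 6 1 . 0 1 3 2 : v i X r"
after = "arXiv:2310.16809v2 [cs.CV] 29 Oct 2023"

# Drop the separating spaces and reverse the character order.
recovered = "".join(reversed(before.split()))
print(recovered)

# The reversed text matches the corrected text without its spaces, except
# for the leading "a", which is missing from the "before" output.
print(after.replace(" ", "").endswith(recovered))
```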
Example 2: Hyperlink
Before (hyperlink, represented incorrectly):
```json theme={null}
{
"...": "...",
"text": "con/Yuliang-Liu/MultinodalOCR|",
"...": "..."
}
```
After (hyperlink, now represented correctly from the original content):
```json theme={null}
{
"...": "...",
"text": "https://github.com/Yuliang-Liu/MultimodalOCR",
"...": "..."
}
```
Example 3: Chinese characters
Before (Chinese characters, represented incorrectly):
```json theme={null}
{
"...": "...",
"text": "GT SHE GPT4-V: EHES",
"...": "..."
}
```
After (Chinese characters, now represented correctly from the original content, expressed as Unicode):
```json theme={null}
{
"...": "...",
"text": "GT : \u91d1\u724c\u70e7\u814a GPT4-V: \u6587\u9759\u5019\u9e1f",
"...": "..."
}
```
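
The `\u` sequences in the "after" output are standard JSON escapes for the original Chinese characters. As a minimal sketch using Python's standard library, parsing the JSON recovers the characters themselves:

```python
import json

# The "text" value from the example above, as it appears in the JSON output.
raw = r'{"text": "GT : \u91d1\u724c\u70e7\u814a GPT4-V: \u6587\u9759\u5019\u9e1f"}'

element = json.loads(raw)
print(element["text"])  # GT : 金牌烧腊 GPT4-V: 文静候鸟

# Writing JSON with ensure_ascii=False keeps the characters readable.
print(json.dumps(element, ensure_ascii=False))
```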
## Improve text fidelity with generative OCR
To produce generative OCR optimizations, in an **Enrichment** node in a workflow, click the following
in the node's settings pane's **Details** tab:
* **Image** under **Input Type**.
* One of the following providers and models:
* **Anthropic** under **Provider** and any choice under **Model**
* **OpenAI** under **Provider** and any choice under **Model**
* **Generative OCR** under **Task**.
| Gels and karyotypes | 600 dpi (8 bit grayscale depth) |
| High pressure liquid chromatography | 300 |