Data Sources
List Data Sources for a Knowledge Base
Add Data Source to a Knowledge Base
Delete a Data Source from a Knowledge Base
Create Presigned URLs for Data Source File Upload
Models
APIFileUploadDataSource = object { original_file_name, size_in_bytes, stored_object_key } File to upload as data source for knowledge base.
original_file_name: The original file name
size_in_bytes: The size of the file in bytes
stored_object_key: The object key the file was stored as
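As an illustration, the three fields above could be mirrored on the client side as follows. This is a minimal sketch, not part of the API: the `FileUploadDataSource` class and its `from_api` helper are hypothetical names, and the sketch assumes `size_in_bytes` may arrive either as a JSON number or as a numeric string, which this reference does not specify.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class FileUploadDataSource:
    """Client-side mirror of APIFileUploadDataSource (hypothetical helper)."""

    original_file_name: str
    size_in_bytes: int
    stored_object_key: str

    @classmethod
    def from_api(cls, payload: Mapping[str, Any]) -> "FileUploadDataSource":
        # Assumption: missing fields fall back to empty/zero values, since
        # the reference does not mark these fields as required.
        return cls(
            original_file_name=str(payload.get("original_file_name", "")),
            # int() accepts both 2048 and "2048".
            size_in_bytes=int(payload.get("size_in_bytes", 0)),
            stored_object_key=str(payload.get("stored_object_key", "")),
        )


# Placeholder values for illustration only.
example = FileUploadDataSource.from_api(
    {
        "original_file_name": "guide.pdf",
        "size_in_bytes": "2048",
        "stored_object_key": "uploads/abc123",
    }
)
```

Normalizing the size to an `int` at the boundary keeps later arithmetic (for example, summing upload sizes) free of string handling.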
APIKnowledgeBaseDataSource = object { aws_data_source, bucket_name, created_at, 10 more } Data Source configuration for Knowledge Bases
aws_data_source: optional object { bucket_name, item_path, region } AWS S3 Data Source for Display
bucket_name: Spaces bucket name
region: Region of bucket
bucket_name: Name of storage bucket. Deprecated; moved to data_source_details.
created_at: Creation date and time
dropbox_data_source: optional object { folder } Dropbox Data Source for Display
file_upload_data_source: optional APIFileUploadDataSource { original_file_name, size_in_bytes, stored_object_key } File to upload as data source for knowledge base.
original_file_name: The original file name
size_in_bytes: The size of the file in bytes
stored_object_key: The object key the file was stored as
google_drive_data_source: optional object { folder_id, folder_name } Google Drive Data Source for Display
folder_name: Name of the selected folder, if available
Path of folder or object in bucket - Deprecated, moved to data_source_details
last_datasource_indexing_job: optional APIIndexedDataSource { completed_at, data_source_uuid, error_details, 11 more }
completed_at: Timestamp when the data source completed indexing
data_source_uuid: UUID of the indexed data source
error_details: A detailed error description
A string code providing a hint as to which part of the system experienced an error
Total count of files that have failed
Total count of files that have been indexed
Total count of files that have been indexed
Total count of files that have been removed
Total count of files that have been skipped
Timestamp when data source started indexing
status: optional "DATA_SOURCE_STATUS_UNKNOWN" or "DATA_SOURCE_STATUS_IN_PROGRESS" or "DATA_SOURCE_STATUS_UPDATED" or 4 more
Total size of files in data source in bytes
Total size of files in data source in bytes that have been indexed
Total file count in the data source
Region code - Deprecated, moved to data_source_details
Spaces Bucket Data Source
Spaces bucket name
Region of bucket
Last modified
Unique id of knowledge base
web_crawler_data_source: optional APIWebCrawlerDataSource { base_url, crawling_option, embed_media, exclude_tags } WebCrawlerDataSource
base_url: The base URL to crawl.
crawling_option: optional "UNKNOWN" or "SCOPED" or "PATH" or "DOMAIN" or "SUBDOMAINS" or "SITEMAP". Options for specifying how URLs found on pages should be handled.
- UNKNOWN: Default unknown value
- SCOPED: Only include the base URL.
- PATH: Crawl the base URL and linked pages within the URL path.
- DOMAIN: Crawl the base URL and linked pages within the same domain.
- SUBDOMAINS: Crawl the base URL and linked pages for any subdomain.
- SITEMAP: Crawl URLs discovered in the sitemap.
embed_media: Whether to ingest and index media (images, etc.) on web pages.
exclude_tags: Tags to exclude from web pages while crawling.
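Because every source object on APIKnowledgeBaseDataSource is optional, a client usually has to work out which one is populated and whether indexing is still running. The sketch below shows one way to do that on a decoded JSON payload. The helper names `source_kind` and `is_indexing` are hypothetical, the sketch assumes at most one source object is set per data source (the reference does not state this), and it only covers the source fields named in this section.

```python
from typing import Any, Mapping, Optional

# Optional source objects named in the model above, in the order listed.
SOURCE_FIELDS = (
    "aws_data_source",
    "dropbox_data_source",
    "file_upload_data_source",
    "google_drive_data_source",
    "web_crawler_data_source",
)


def source_kind(data_source: Mapping[str, Any]) -> Optional[str]:
    """Return the name of the first populated source field, or None."""
    for field in SOURCE_FIELDS:
        if data_source.get(field):
            return field
    return None


def is_indexing(data_source: Mapping[str, Any]) -> bool:
    """True if the last indexing job reports DATA_SOURCE_STATUS_IN_PROGRESS."""
    job = data_source.get("last_datasource_indexing_job") or {}
    return job.get("status") == "DATA_SOURCE_STATUS_IN_PROGRESS"


# Placeholder payload for illustration only.
sample = {
    "web_crawler_data_source": {
        "base_url": "https://example.com",
        "crawling_option": "PATH",
    },
    "last_datasource_indexing_job": {"status": "DATA_SOURCE_STATUS_UPDATED"},
}
```

With the sample payload, `source_kind(sample)` returns `"web_crawler_data_source"` and `is_indexing(sample)` returns `False`.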
APISpacesDataSource = object { bucket_name, item_path, region } Spaces Bucket Data Source
bucket_name: Spaces bucket name
region: Region of bucket
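For illustration, the three APISpacesDataSource fields could be assembled into a JSON-ready dictionary like this. It is a sketch under assumptions: `SpacesDataSource` and `to_payload` are hypothetical names, `item_path` is treated as omittable, and the bucket and region values are placeholders rather than values taken from this reference.

```python
from dataclasses import asdict, dataclass
from typing import Dict, Optional


@dataclass
class SpacesDataSource:
    """Client-side mirror of APISpacesDataSource (hypothetical helper)."""

    bucket_name: str
    region: str
    item_path: Optional[str] = None  # assumption: the path may be omitted

    def to_payload(self) -> Dict[str, str]:
        # Drop unset fields so they are not serialized as null.
        return {k: v for k, v in asdict(self).items() if v is not None}


# Placeholder values for illustration only.
payload = SpacesDataSource(bucket_name="docs-bucket", region="nyc3").to_payload()
```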
APIWebCrawlerDataSource = object { base_url, crawling_option, embed_media, exclude_tags } WebCrawlerDataSource
base_url: The base URL to crawl.
crawling_option: optional "UNKNOWN" or "SCOPED" or "PATH" or "DOMAIN" or "SUBDOMAINS" or "SITEMAP". Options for specifying how URLs found on pages should be handled.
- UNKNOWN: Default unknown value
- SCOPED: Only include the base URL.
- PATH: Crawl the base URL and linked pages within the URL path.
- DOMAIN: Crawl the base URL and linked pages within the same domain.
- SUBDOMAINS: Crawl the base URL and linked pages for any subdomain.
- SITEMAP: Crawl URLs discovered in the sitemap.
embed_media: Whether to ingest and index media (images, etc.) on web pages.
exclude_tags: Tags to exclude from web pages while crawling.
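The six `crawling_option` values listed above map naturally onto an enumeration. The following is a minimal client-side sketch: the `CrawlingOption` class and `parse_crawling_option` function are hypothetical names, and falling back to `UNKNOWN` for unrecognized input is a design choice of this sketch rather than documented API behavior.

```python
from enum import Enum


class CrawlingOption(str, Enum):
    """The crawling_option values listed in this reference."""

    UNKNOWN = "UNKNOWN"
    SCOPED = "SCOPED"
    PATH = "PATH"
    DOMAIN = "DOMAIN"
    SUBDOMAINS = "SUBDOMAINS"
    SITEMAP = "SITEMAP"


def parse_crawling_option(raw: object) -> CrawlingOption:
    """Map a raw API value to CrawlingOption, falling back to UNKNOWN."""
    try:
        return CrawlingOption(raw)
    except ValueError:
        # Unrecognized or missing values are treated as UNKNOWN.
        return CrawlingOption.UNKNOWN
```

Parsing through a single function means a value added to the API later degrades to `UNKNOWN` instead of raising in client code.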
AwsDataSource = object { bucket_name, item_path, key_id, 2 more } AWS S3 Data Source
bucket_name: Spaces bucket name
key_id: The AWS Key ID
Region of bucket
The AWS Secret Key