Data Sources
Add Data Source to a Knowledge Base
Create Presigned URLs for Data Source File Upload
Delete a Data Source from a Knowledge Base
List Data Sources for a Knowledge Base
Models
APIFileUploadDataSource = object { original_file_name, size_in_bytes, stored_object_key } File to upload as data source for knowledge base.
original_file_name: The original file name
size_in_bytes: The size of the file in bytes
stored_object_key: The object key the file was stored as
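As a minimal sketch, the three fields of this model could be assembled for a local file as shown below. The helper name `build_file_upload_data_source` is hypothetical, the `stored_object_key` value is assumed to come from an earlier upload step, and sending `size_in_bytes` as an integer is an assumption; check the API schema for the expected type.

```python
import os


def build_file_upload_data_source(local_path: str, stored_object_key: str) -> dict:
    """Assemble an APIFileUploadDataSource-shaped dict for a local file.

    Hypothetical helper: stored_object_key is whatever key the upload step
    reported; the integer type of size_in_bytes is an assumption.
    """
    return {
        # Only the file name is kept, not the local directory path.
        "original_file_name": os.path.basename(local_path),
        "size_in_bytes": os.path.getsize(local_path),
        "stored_object_key": stored_object_key,
    }
```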
APIKnowledgeBaseDataSource = object { aws_data_source, bucket_name, created_at, 10 more } Data Source configuration for Knowledge Bases
aws_data_source: optional object { bucket_name, item_path, region } AWS S3 Data Source for Display
bucket_name: Spaces bucket name
region: Region of bucket
Name of storage bucket - Deprecated, moved to data_source_details
Creation date / time
dropbox_data_source: optional object { folder } Dropbox Data Source for Display
file_upload_data_source: optional APIFileUploadDataSource { original_file_name, size_in_bytes, stored_object_key } File to upload as data source for knowledge base.
original_file_name: The original file name
size_in_bytes: The size of the file in bytes
stored_object_key: The object key the file was stored as
Path of folder or object in bucket - Deprecated, moved to data_source_details
last_datasource_indexing_job: optional APIIndexedDataSource { completed_at, data_source_uuid, error_details, 11 more }
completed_at: Timestamp when data source completed indexing
data_source_uuid: UUID of the indexed data source
error_details: A detailed error description
A string code providing a hint about which part of the system experienced an error
Total count of files that have failed
Total count of files that have been indexed
Total count of files that have been indexed
Total count of files that have been removed
Total count of files that have been skipped
Timestamp when data source started indexing
status: optional "DATA_SOURCE_STATUS_UNKNOWN" or "DATA_SOURCE_STATUS_IN_PROGRESS" or "DATA_SOURCE_STATUS_UPDATED" or 3 more
Total size of files in data source in bytes
Total size of files in data source in bytes that have been indexed
Total file count in the data source
last_indexing_job: optional APIIndexingJob { completed_datasources, created_at, data_source_uuids, 12 more } IndexingJob description
completed_datasources: Number of data sources that have completed indexing
Creation date / time
Knowledge base id
phase: optional "BATCH_JOB_PHASE_UNKNOWN" or "BATCH_JOB_PHASE_PENDING" or "BATCH_JOB_PHASE_RUNNING" or 4 more
status: optional "INDEX_JOB_STATUS_UNKNOWN" or "INDEX_JOB_STATUS_PARTIAL" or "INDEX_JOB_STATUS_IN_PROGRESS" or 4 more
Number of tokens
Number of datasources being indexed
Total Items Failed
Total Items Indexed
Total Items Skipped
Last modified
Unique id
Region code - Deprecated, moved to data_source_details
Spaces Bucket Data Source
bucket_name: Spaces bucket name
region: Region of bucket
Last modified
Unique id of knowledge base
web_crawler_data_source: optional APIWebCrawlerDataSource { base_url, crawling_option, embed_media } WebCrawlerDataSource
base_url: The base URL to crawl.
crawling_option: optional "UNKNOWN" or "SCOPED" or "PATH" or 2 more
Options for specifying how URLs found on pages should be handled.
- UNKNOWN: Default unknown value
- SCOPED: Only include the base URL.
- PATH: Crawl the base URL and linked pages within the URL path.
- DOMAIN: Crawl the base URL and linked pages within the same domain.
- SUBDOMAINS: Crawl the base URL and linked pages for any subdomain.
embed_media: Whether to ingest and index media (images, etc.) on web pages.
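Because the source-specific sub-objects of APIKnowledgeBaseDataSource are all optional, a client usually has to check which one is populated. The following is a rough sketch of that check over a response already parsed into a Python dict. The function name `describe_data_source` is hypothetical, and only the four sub-object fields named explicitly in this listing are inspected; any source field hidden behind "10 more" is not covered.

```python
from typing import Optional, Tuple

# Optional sub-objects named in the model listing. Whichever one is
# populated indicates where the data source's content comes from.
SOURCE_FIELDS = (
    "aws_data_source",
    "dropbox_data_source",
    "file_upload_data_source",
    "web_crawler_data_source",
)


def describe_data_source(ds: dict) -> Tuple[Optional[str], Optional[str]]:
    """Return (populated source field, last indexing status) for one data source."""
    # First source field that is present and non-empty, or None if none is set.
    kind = next((name for name in SOURCE_FIELDS if ds.get(name)), None)
    # last_datasource_indexing_job is optional, so fall back to an empty dict.
    job = ds.get("last_datasource_indexing_job") or {}
    return kind, job.get("status")


example = {
    "web_crawler_data_source": {
        "base_url": "https://example.com",
        "crawling_option": "PATH",
    },
    "last_datasource_indexing_job": {"status": "DATA_SOURCE_STATUS_UPDATED"},
}
print(describe_data_source(example))
# ('web_crawler_data_source', 'DATA_SOURCE_STATUS_UPDATED')
```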
APISpacesDataSource = object { bucket_name, item_path, region } Spaces Bucket Data Source
bucket_name: Spaces bucket name
region: Region of bucket
APIWebCrawlerDataSource = object { base_url, crawling_option, embed_media } WebCrawlerDataSource
base_url: The base URL to crawl.
crawling_option: optional "UNKNOWN" or "SCOPED" or "PATH" or 2 more
Options for specifying how URLs found on pages should be handled.
- UNKNOWN: Default unknown value
- SCOPED: Only include the base URL.
- PATH: Crawl the base URL and linked pages within the URL path.
- DOMAIN: Crawl the base URL and linked pages within the same domain.
- SUBDOMAINS: Crawl the base URL and linked pages for any subdomain.
embed_media: Whether to ingest and index media (images, etc.) on web pages.
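To make the difference between the `crawling_option` values concrete, here is one possible reading of them as a URL filter. This is an illustrative interpretation of the descriptions in this model, not the crawler's actual logic: the function name `in_crawl_scope` is hypothetical, and the exact matching rules (host comparison, path prefix handling) are assumptions.

```python
from urllib.parse import urlparse


def in_crawl_scope(base_url: str, candidate_url: str, crawling_option: str) -> bool:
    """Decide whether candidate_url falls within the scope implied by crawling_option."""
    base, cand = urlparse(base_url), urlparse(candidate_url)
    base_host = (base.hostname or "").lower()
    cand_host = (cand.hostname or "").lower()

    if crawling_option == "SCOPED":
        # Only the base URL itself.
        return candidate_url == base_url
    if crawling_option == "PATH":
        # Same host, and the path is the base path or sits below it.
        prefix = base.path.rstrip("/")
        same_path = cand.path == prefix or cand.path.startswith(prefix + "/")
        return cand_host == base_host and same_path
    if crawling_option == "DOMAIN":
        # Any page on exactly the same host.
        return cand_host == base_host
    if crawling_option == "SUBDOMAINS":
        # The base host or any host nested under it.
        return cand_host == base_host or cand_host.endswith("." + base_host)
    # UNKNOWN or unrecognized values match nothing in this sketch.
    return False
```

Under these assumptions, with a base URL of `https://example.com/docs`, a link to `https://example.com/blog` is out of scope for PATH but in scope for DOMAIN, and a link to `https://api.example.com/` is in scope only for SUBDOMAINS.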
AwsDataSource = object { bucket_name, item_path, key_id, 2 more } AWS S3 Data Source
Spaces bucket name
The AWS Key ID
Region of bucket
The AWS Secret Key