Create a Knowledge Base
knowledge_bases.create(KnowledgeBaseCreateParams **kwargs) -> KnowledgeBaseCreateResponse
POST /v2/gen-ai/knowledge_bases

To create a knowledge base, send a POST request to /v2/gen-ai/knowledge_bases.

Parameters
database_id: Optional[str]

Optional identifier of the DigitalOcean OpenSearch database this knowledge base will use. If omitted, a new database is created for the knowledge base in the same region as the knowledge base.

datasources: Optional[Iterable[Datasource]]

The data sources to use for this knowledge base. See Organize Data Sources for best practices on structuring data sources.

aws_data_source: Optional[AwsDataSourceParam]

AWS S3 Data Source

bucket_name: Optional[str]

S3 bucket name

item_path: Optional[str]
key_id: Optional[str]

The AWS Key ID

region: Optional[str]

Region of bucket

secret_key: Optional[str]

The AWS Secret Key

bucket_name: Optional[str]

Deprecated, moved to data_source_details

bucket_region: Optional[str]

Deprecated, moved to data_source_details

dropbox_data_source: Optional[DatasourceDropboxDataSource]

Dropbox Data Source

folder: Optional[str]
refresh_token: Optional[str]

Refresh token. You can obtain a refresh token by following the OAuth2 flow. See /v2/gen-ai/oauth2/dropbox/tokens for reference.

file_upload_data_source: Optional[APIFileUploadDataSourceParam]

File to upload as data source for knowledge base.

original_file_name: Optional[str]

The original file name

size_in_bytes: Optional[str]

The size of the file in bytes

format: uint64
stored_object_key: Optional[str]

The object key the file was stored as

item_path: Optional[str]
spaces_data_source: Optional[APISpacesDataSourceParam]

Spaces Bucket Data Source

bucket_name: Optional[str]

Spaces bucket name

item_path: Optional[str]
region: Optional[str]

Region of bucket

web_crawler_data_source: Optional[APIWebCrawlerDataSourceParam]

Web Crawler Data Source

base_url: Optional[str]

The base URL to crawl.

crawling_option: Optional[Literal["UNKNOWN", "SCOPED", "PATH", "DOMAIN", "SUBDOMAINS"]]

Options for specifying how URLs found on pages should be handled.

  • UNKNOWN: Default unknown value
  • SCOPED: Only include the base URL.
  • PATH: Crawl the base URL and linked pages within the URL path.
  • DOMAIN: Crawl the base URL and linked pages within the same domain.
  • SUBDOMAINS: Crawl the base URL and linked pages for any subdomain.
Accepts one of the following:
"UNKNOWN"
"SCOPED"
"PATH"
"DOMAIN"
"SUBDOMAINS"
embed_media: Optional[bool]

Whether to ingest and index media (images, etc.) on web pages.
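
To make the crawling_option values above more concrete, here is a minimal sketch of one plausible reading of their descriptions. It is not the service's implementation: the function name in_scope is hypothetical, and the matching rules (exact host match for DOMAIN, host suffix match for SUBDOMAINS, path prefix match for PATH) are assumptions drawn only from the wording of each option.

```python
from urllib.parse import urlsplit


def in_scope(base_url: str, candidate_url: str, crawling_option: str) -> bool:
    """Guess whether candidate_url would be crawled for a given option.

    Hypothetical helper; the rules below are assumptions, not documented behavior.
    """
    base = urlsplit(base_url)
    cand = urlsplit(candidate_url)
    base_host = (base.hostname or "").lower()
    cand_host = (cand.hostname or "").lower()
    # Ignore trailing slashes so "/docs" and "/docs/" compare equal.
    base_path = base.path.rstrip("/")
    cand_path = cand.path.rstrip("/")

    if crawling_option == "SCOPED":
        # Only the base URL itself.
        return cand_host == base_host and cand_path == base_path
    if crawling_option == "PATH":
        # Same host, and the candidate sits at or below the base path.
        return cand_host == base_host and (
            cand_path == base_path or cand_path.startswith(base_path + "/")
        )
    if crawling_option == "DOMAIN":
        # Any page on exactly the same host.
        return cand_host == base_host
    if crawling_option == "SUBDOMAINS":
        # The same host or any host nested under it.
        return cand_host == base_host or cand_host.endswith("." + base_host)
    # UNKNOWN or an unrecognized value: treat nothing as in scope.
    return False
```

Under these assumptions, with a base URL of https://example.com/docs, the page https://example.com/docs/api is in scope for PATH, https://example.com/blog only from DOMAIN onward, and https://api.example.com/v1 only for SUBDOMAINS.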

embedding_model_uuid: Optional[str]

Identifier for the embedding model.

name: Optional[str]

Name of the knowledge base.

project_id: Optional[str]

Identifier of the DigitalOcean project this knowledge base will belong to.

region: Optional[str]

The datacenter region to deploy the knowledge base in.

tags: Optional[SequenceNotStr[str]]

Tags to organize your knowledge base.

vpc_uuid: Optional[str]

The VPC to deploy the knowledge base database in
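
The parameters above can be combined into a single request. The sketch below assembles keyword arguments for a knowledge base backed by one web crawler data source. The helper name build_create_params and all values are hypothetical placeholders; only the field names come from the parameter list.

```python
from typing import Any, Dict, List, Optional

# Values listed for crawling_option in the parameter reference.
CRAWLING_OPTIONS = {"UNKNOWN", "SCOPED", "PATH", "DOMAIN", "SUBDOMAINS"}


def build_create_params(
    name: str,
    region: str,
    project_id: str,
    embedding_model_uuid: str,
    base_url: str,
    crawling_option: str = "PATH",
    embed_media: bool = False,
    tags: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments for knowledge_bases.create().

    Hypothetical helper, not part of the SDK.
    """
    if crawling_option not in CRAWLING_OPTIONS:
        raise ValueError(f"unsupported crawling_option: {crawling_option!r}")

    params: Dict[str, Any] = {
        "name": name,
        "region": region,
        "project_id": project_id,
        "embedding_model_uuid": embedding_model_uuid,
        # database_id is omitted on purpose; per the reference, a new
        # database is then created in the knowledge base's region.
        "datasources": [
            {
                "web_crawler_data_source": {
                    "base_url": base_url,
                    "crawling_option": crawling_option,
                    "embed_media": embed_media,
                }
            }
        ],
    }
    if tags:
        params["tags"] = list(tags)
    return params


# Placeholder values only; substitute real identifiers from your account.
params = build_create_params(
    name="example name",
    region="example-region",
    project_id="123e4567-e89b-12d3-a456-426614174000",
    embedding_model_uuid="123e4567-e89b-12d3-a456-426614174000",
    base_url="https://example.com/docs",
    tags=["example"],
)
```

The resulting dictionary can be unpacked into the call documented here, for example client.knowledge_bases.create(**params). Validating crawling_option locally is a design choice of this sketch so that a typo fails before any request is sent.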

Returns
class KnowledgeBaseCreateResponse:

Information about a newly created knowledge base

knowledge_base: Optional[APIKnowledgeBase]

Knowledge base description

added_to_agent_at: Optional[datetime]

Time when the knowledge base was added to the agent

format: date-time
created_at: Optional[datetime]

Creation date / time

format: date-time
database_id: Optional[str]
embedding_model_uuid: Optional[str]
is_public: Optional[bool]

Whether the knowledge base is public or not

last_indexing_job: Optional[APIIndexingJob]

Indexing job description

completed_datasources: Optional[int]

Number of data sources whose indexing has completed

format: int64
created_at: Optional[datetime]

Creation date / time

format: date-time
data_source_uuids: Optional[List[str]]
finished_at: Optional[datetime]
format: date-time
knowledge_base_uuid: Optional[str]

Knowledge base id

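phase: Optional[Literal["BATCH_JOB_PHASE_UNKNOWN", "BATCH_JOB_PHASE_PENDING", "BATCH_JOB_PHASE_RUNNING", "BATCH_JOB_PHASE_SUCCEEDED", "BATCH_JOB_PHASE_FAILED", "BATCH_JOB_PHASE_ERROR", "BATCH_JOB_PHASE_CANCELLED"]]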
Accepts one of the following:
"BATCH_JOB_PHASE_UNKNOWN"
"BATCH_JOB_PHASE_PENDING"
"BATCH_JOB_PHASE_RUNNING"
"BATCH_JOB_PHASE_SUCCEEDED"
"BATCH_JOB_PHASE_FAILED"
"BATCH_JOB_PHASE_ERROR"
"BATCH_JOB_PHASE_CANCELLED"
started_at: Optional[datetime]
format: date-time
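status: Optional[Literal["INDEX_JOB_STATUS_UNKNOWN", "INDEX_JOB_STATUS_PARTIAL", "INDEX_JOB_STATUS_IN_PROGRESS", "INDEX_JOB_STATUS_COMPLETED", "INDEX_JOB_STATUS_FAILED", "INDEX_JOB_STATUS_NO_CHANGES", "INDEX_JOB_STATUS_PENDING"]]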
Accepts one of the following:
"INDEX_JOB_STATUS_UNKNOWN"
"INDEX_JOB_STATUS_PARTIAL"
"INDEX_JOB_STATUS_IN_PROGRESS"
"INDEX_JOB_STATUS_COMPLETED"
"INDEX_JOB_STATUS_FAILED"
"INDEX_JOB_STATUS_NO_CHANGES"
"INDEX_JOB_STATUS_PENDING"
tokens: Optional[int]

Number of tokens

format: int64
total_datasources: Optional[int]

Number of datasources being indexed

format: int64
total_items_failed: Optional[str]

Total Items Failed

format: uint64
total_items_indexed: Optional[str]

Total Items Indexed

format: uint64
total_items_skipped: Optional[str]

Total Items Skipped

format: uint64
updated_at: Optional[datetime]

Last modified

format: date-time
uuid: Optional[str]

Unique id

name: Optional[str]

Name of knowledge base

project_id: Optional[str]
region: Optional[str]

Region code

tags: Optional[List[str]]

Tags to organize related resources

updated_at: Optional[datetime]

Last modified

format: date-time
user_id: Optional[str]

ID of the user who created the knowledge base

format: int64
uuid: Optional[str]

Unique id for knowledge base
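
The response fields above mix integers with uint64 counters that arrive as strings, so some normalization is useful before reporting on an indexing job. The following is a minimal sketch that works on the JSON form of the response. The helper name summarize_indexing_job is hypothetical, and treating the SUCCEEDED, FAILED, ERROR, and CANCELLED phases as terminal is an assumption based only on their names.

```python
from typing import Any, Dict, Optional

# Assumption: these phases mean the job will not progress further.
TERMINAL_PHASES = {
    "BATCH_JOB_PHASE_SUCCEEDED",
    "BATCH_JOB_PHASE_FAILED",
    "BATCH_JOB_PHASE_ERROR",
    "BATCH_JOB_PHASE_CANCELLED",
}


def _to_int(value: Optional[str]) -> int:
    """Convert a uint64 counter delivered as a string; missing values count as 0."""
    return int(value) if value not in (None, "") else 0


def summarize_indexing_job(response: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten the last_indexing_job block of a create response (hypothetical helper)."""
    kb = response.get("knowledge_base") or {}
    job = kb.get("last_indexing_job") or {}
    phase = job.get("phase") or "BATCH_JOB_PHASE_UNKNOWN"
    return {
        "knowledge_base_uuid": kb.get("uuid"),
        "phase": phase,
        "status": job.get("status") or "INDEX_JOB_STATUS_UNKNOWN",
        "finished": phase in TERMINAL_PHASES,
        "items_indexed": _to_int(job.get("total_items_indexed")),
        "items_failed": _to_int(job.get("total_items_failed")),
        "items_skipped": _to_int(job.get("total_items_skipped")),
    }


# Trimmed version of the example response shown in this reference.
sample = {
    "knowledge_base": {
        "uuid": "123e4567-e89b-12d3-a456-426614174000",
        "last_indexing_job": {
            "phase": "BATCH_JOB_PHASE_UNKNOWN",
            "status": "INDEX_JOB_STATUS_UNKNOWN",
            "total_items_failed": "12345",
            "total_items_indexed": "12345",
            "total_items_skipped": "12345",
        },
    }
}
summary = summarize_indexing_job(sample)
```

Because every field is Optional, the helper falls back to the UNKNOWN enum values and zero counts rather than raising when a newly created knowledge base has no indexing job yet.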

Create a Knowledge Base
from gradient import Gradient

client = Gradient(
    access_token="My Access Token",
)
knowledge_base = client.knowledge_bases.create()
print(knowledge_base.knowledge_base)
Returns Examples
{
  "knowledge_base": {
    "added_to_agent_at": "2023-01-01T00:00:00Z",
    "created_at": "2023-01-01T00:00:00Z",
    "database_id": "123e4567-e89b-12d3-a456-426614174000",
    "embedding_model_uuid": "123e4567-e89b-12d3-a456-426614174000",
    "is_public": true,
    "last_indexing_job": {
      "completed_datasources": 123,
      "created_at": "2023-01-01T00:00:00Z",
      "data_source_uuids": [
        "example string"
      ],
      "finished_at": "2023-01-01T00:00:00Z",
      "knowledge_base_uuid": "123e4567-e89b-12d3-a456-426614174000",
      "phase": "BATCH_JOB_PHASE_UNKNOWN",
      "started_at": "2023-01-01T00:00:00Z",
      "status": "INDEX_JOB_STATUS_UNKNOWN",
      "tokens": 123,
      "total_datasources": 123,
      "total_items_failed": "12345",
      "total_items_indexed": "12345",
      "total_items_skipped": "12345",
      "updated_at": "2023-01-01T00:00:00Z",
      "uuid": "123e4567-e89b-12d3-a456-426614174000"
    },
    "name": "example name",
    "project_id": "123e4567-e89b-12d3-a456-426614174000",
    "region": "example string",
    "tags": [
      "example string"
    ],
    "updated_at": "2023-01-01T00:00:00Z",
    "user_id": "user_id",
    "uuid": "123e4567-e89b-12d3-a456-426614174000"
  }
}