Setting chunk size in knowledge databases
Note: This feature is currently available upon request and can be activated as an option for your company. To do so, please contact us at support@nele.ai.
In modern data processing, the term "chunks" is becoming increasingly important, particularly in the context of artificial intelligence (AI) and text analysis applications. A "chunk" is a piece or segment of data, in our case text, that is used for various analysis and processing tasks. The "chunk size" determines how large these individual pieces of data are.
What is the chunk size?
The chunk size is a parameter that determines how many characters, words, or phrases are contained in a chunk. This setting influences how text is divided into smaller units that can be processed by AI models. Adjusting the chunk size can have a significant impact on the efficiency and accuracy of text processing.
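To make the idea concrete, here is a minimal character-based sketch of how a text might be split into chunks. This is purely illustrative; the `chunk_text` function and its values are assumptions, not nele.ai's actual implementation, which may split on words, sentences, or other boundaries.

```python
def chunk_text(text: str, chunk_size: int) -> list[str]:
    """Split a text into consecutive pieces of at most chunk_size characters."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

document = "Chunks are segments of text used for AI analysis and retrieval."
pieces = chunk_text(document, 20)
print(pieces)  # four pieces of up to 20 characters each
```

A smaller `chunk_size` produces more, shorter pieces; a larger one produces fewer, longer pieces.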
Importance of chunk size in knowledge bases
In nele.ai, administrators create knowledge databases that contain collections of documents from which the AI extracts targeted information. Chunk size plays an important role in setting up these knowledge databases, as it influences how documents are divided into processable units.
Creating knowledge databases
When adding new documents to a knowledge base, the administrator can set the chunk size. This setting is critical, as it influences the subsequent efficiency of information retrieval by the AI.
Adjustments
Note that the chunk size is fixed for each document at the time of upload. Changing the setting therefore only affects newly uploaded documents; existing documents in a knowledge base retain the chunk size that was set when they were uploaded.
Effects of chunk size
“Detailed analysis” type
If a smaller chunk size is chosen, the document is split into more chunks, each of which can be analyzed in finer detail. This is useful for tasks that require precise recognition of patterns or keywords.
“Understanding the full picture” type
A larger chunk size preserves more context within each chunk and allows a more comprehensive view, which is suitable for applications such as text summarization.
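The two types above can be illustrated with a small sketch: the same document split with a small versus a large chunk size. The splitting function and the document length are illustrative assumptions, not nele.ai internals.

```python
def chunk_text(text: str, chunk_size: int) -> list[str]:
    """Split a text into consecutive pieces of at most chunk_size characters."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

document = "x" * 10_000  # stand-in for a 10,000-character document

small = chunk_text(document, 500)    # "Detailed analysis": many fine-grained chunks
large = chunk_text(document, 2_000)  # "Full picture": fewer chunks with more context each

print(len(small), len(large))  # 20 and 5 chunks, respectively
```

The small-chunk variant lets retrieval pinpoint narrow passages; the large-chunk variant keeps related sentences together in a single unit.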
Performance and efficiency vs. cost considerations
Larger chunks can speed up processing, but they can also increase costs, since more data is processed and transferred per query. Keep this trade-off in mind when choosing larger chunk sizes.
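A rough back-of-the-envelope calculation shows why larger chunks can cost more: if a fixed number of chunks is retrieved per question, the amount of text sent to the model grows with the chunk size. The retrieval count `top_k` and the sizes below are assumed values for illustration only.

```python
top_k = 4  # assumed number of chunks retrieved per question

# Approximate context size sent to the model for different chunk sizes.
for chunk_size in (500, 1_000, 2_000):
    context_chars = top_k * chunk_size
    print(f"chunk size {chunk_size}: ~{context_chars} characters per query")
```

Quadrupling the chunk size here quadruples the text processed per query, which is the source of the higher costs mentioned above.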
Conclusion
Understanding chunk size is critical to effectively using and managing knowledge databases in nele.ai. Be sure to set the chunk size carefully when creating new documents in your knowledge bases. This allows you to make optimal use of the resources of your AI application and increase productivity in your organization.