WeKnora

Mirror of https://github.com/Tencent/WeKnora.git, synced 2026-06-04 13:30:32 +08:00

Author	SHA1	Message	Date
Windfarer	c1816fe6d6	add oidc	2026-03-30 11:13:44 +08:00
wizardchen	72dfb9ce75	feat(prompt): enhance context and intent classification in templates - Updated context template to include runtime metadata such as current time and week, improving contextual awareness in user queries. - Enhanced rewrite template with critical instructions for intent classification, ensuring that rewritten questions preserve essential entities and keywords. - Refined intent classification logic to prioritize user intents more effectively, improving the accuracy of responses based on user queries. - Added examples to clarify expected input and output formats for intent classification, enhancing usability for developers.	2026-03-26 18:57:12 +08:00
ChenRussell	27d4525ba9	fix(app): prevent UTF-8 truncation in summary fallback When LLM summary generation fails, the fallback logic truncates the first chunk content to 500 characters. Previously used byte-based slicing (summary[:500]) which could cut multi-byte UTF-8 characters in half, causing PostgreSQL to reject invalid byte sequences. Changed to rune-based truncation to ensure valid UTF-8 after truncation. Fixes: Documents stuck in "Generating summary" status when containing Chinese/emoji characters and summary generation fails.	2026-03-26 18:52:08 +08:00
PR Bot	d0811898b0	feat: upgrade MiniMax models from M2.1 to M2.7 Update MiniMax provider description and model references from the legacy MiniMax-M2.1/M2.1-lightning to the latest MiniMax-M2.7/M2.7-highspeed models. Also update Novita AI's MiniMax model reference from M2.5 to M2.7. Changes: - Update MiniMax provider description in minimax.go - Update Novita provider MiniMax model reference in novita.go - Update all i18n locale files (en-US, zh-CN, ko-KR, ru-RU) - Add MiniMax provider validation tests	2026-03-26 18:51:25 +08:00
wizardchen	fd5278d62d	feat(security): enhance SSRF protection in RemoteAPIChat - Replaced the default DialContext with SSRFSafeDialContext in the raw HTTP client to improve security against DNS rebinding attacks. - Added SSRF validation for BaseURL and endpoint in NewRemoteAPIChat and chat methods, ensuring safer URL handling. - Updated security utility functions to provide consistent SSRF checks across the application.	2026-03-26 11:37:07 +08:00
wizardchen	41fcdf1a83	chore(go.mod): update dependencies to include tiktoken-go and golang.org/x/sys - Added github.com/tiktoken-go/tokenizer v0.7.0 to the required dependencies for improved tokenization support. - Included golang.org/x/sys v0.40.0 to enhance system call functionalities. - Removed indirect references to these packages to streamline the dependency management.	2026-03-25 22:12:50 +08:00
wizardchen	ac0eaf2b3d	feat(agent): add GetSuggestedQuestions API for retrieving agent-specific question suggestions - Implemented a new API method to fetch suggested questions based on an agent's knowledge bases, enhancing user interaction in the chat interface. - Introduced data structures for request and response handling, including options for limiting results and specifying knowledge base IDs. - Enhanced query construction to support flexible filtering of suggested questions, improving the overall chat experience.	2026-03-25 22:08:29 +08:00
wizardchen	23290efcec	feat(image-multimodal): add task enqueuing for question generation based on image knowledge - Introduced a new task enqueuer in the ImageMultimodalService to handle question generation for image knowledge. - Implemented logic to enqueue question generation tasks when valid captions or OCR content are available, enhancing the knowledge base's interactivity. - Updated image reference handling to ensure markdown URLs are correctly formatted, accommodating filenames with spaces. - Enhanced regex pattern in image resolver to support filenames with spaces, improving image processing accuracy.	2026-03-25 22:08:29 +08:00
wizardchen	1960fc2d90	feat(suggested-questions): implement suggested questions feature for chat interface - Added a new API endpoint to fetch suggested questions based on the agent's associated knowledge bases. - Implemented frontend components to display suggested questions in both chat and create chat views, enhancing user interaction. - Updated localization files to include translations for suggested questions in English, Korean, Russian, and Chinese. - Introduced debounce logic for fetching suggestions to optimize performance during knowledge base changes. - Enhanced backend services to support fetching and returning suggested questions, improving the overall chat experience.	2026-03-25 22:08:29 +08:00
wizardchen	1416217a0a	refactor(prompt): streamline context handling in templates - Removed redundant instructions from the context template to simplify user guidance. - Added context retrieval information to the system prompt, ensuring consistency in how context is presented across templates. - Enhanced parameter casting to allow single strings to be treated as string arrays, improving flexibility in parameter handling. - Adjusted minimum thought requirements in sequential thinking to allow for more varied input scenarios.	2026-03-25 22:08:29 +08:00
wizardchen	ae89871ac5	feat(knowledge): enhance search functionality to include title filtering and URL handling - Updated search queries to filter by both file name and title when a keyword is provided, improving search accuracy. - Added support for filtering by URL and HTML file types in search queries, enhancing the flexibility of the search functionality. - Refactored query construction for file types to streamline the inclusion of additional conditions, ensuring robust handling of various file types.	2026-03-25 22:08:29 +08:00
wizardchen	ac04a8efcb	feat(agent): enhance chunk search functionality to include total chunk counts - Updated the searchChunks method to calculate and include the total chunk count for each knowledge ID in the results. - Introduced logic to fetch chunk counts based on unique knowledge IDs, improving the efficiency of data retrieval. - Enhanced error handling for chunk count fetching, ensuring robust logging in case of failures.	2026-03-25 22:08:29 +08:00
wizardchen	3daaabfc4a	refactor(elasticsearch): remove ".keyword" suffix from ID fields in Elasticsearch queries - Updated Elasticsearch repository methods to use ID fields without the ".keyword" suffix, aligning with the new index mapping requirements. - Enhanced documentation for the VectorEmbedding struct to clarify expected index mapping and usage guidelines for ID fields. - Ensured consistency across all repository methods that interact with ID fields, improving query performance and maintainability.	2026-03-25 22:08:29 +08:00
wizardchen	164c39444d	feat(knowledge): add optional channel parameter to CreateKnowledgeFromFile - Updated the CreateKnowledgeFromFile function to include a new optional `channel` parameter, allowing for better tracking of knowledge entry sources. - Enhanced documentation to reflect the addition of the channel parameter, improving clarity for future developers.	2026-03-25 22:08:29 +08:00
wizardchen	5245475e0f	feat(knowledge): add channel support to knowledge entries and related components - Introduced a new `channel` field in the Knowledge struct and associated request types to track the source channel (e.g., "web", "api", "browser_extension"). - Updated various frontend components to display channel information and enhance user experience with channel labels. - Enhanced localization files to support channel labels in English and Chinese. - Modified backend services and database migrations to accommodate the new channel feature, ensuring consistent tracking across knowledge entries. - Refactored related functions to integrate channel handling, improving overall knowledge management and context.	2026-03-25 22:08:29 +08:00
wizardchen	aedaea274d	feat(agent): improve logging of LLM call messages to enhance clarity and reduce redundancy - Updated logging in the callLLMWithRetry function to summarize message details, limiting the output to the last few messages to avoid repetition. - Introduced a constant to define the maximum number of detailed messages logged, improving the readability of logs during LLM calls.	2026-03-25 22:08:29 +08:00
wizardchen	5b2ebd1f44	feat(channel): add channel support across various components and locales - Introduced a new `channel` field in multiple request and message structures to track the source channel (e.g., "web", "api", "im"). - Updated frontend API calls and chat components to include the `channel` parameter, ensuring consistent channel tracking in user messages. - Enhanced localization files to support channel labels in English, Korean, Russian, and Chinese. - Added database migration scripts to incorporate the `channel` column in the messages table, facilitating the storage of channel information. - Refactored related functions and components to accommodate the new channel feature, improving overall message context and tracking.	2026-03-25 22:08:29 +08:00
wizardchen	de0169d7c0	feat(chat): add ParallelToolCalls option to ChatOptions and update related functionality - Introduced a new ParallelToolCalls field in ChatOptions to control parallel execution of tool calls. - Updated the BuildChatCompletionRequest method to handle the new ParallelToolCalls option, ensuring it is correctly propagated in requests. - Enhanced the streamThinkingToEventBus function to utilize the ParallelToolCalls setting. - Added comprehensive tests to validate the behavior of ParallelToolCalls in various scenarios, ensuring correct propagation and default handling.	2026-03-25 22:08:29 +08:00
wizardchen	af2956fdf1	feat(tests): add comprehensive tests for MCPTool functionality - Introduced a new test file for MCPTool, covering various aspects such as name sanitization, name generation based on service and tool names, and description handling. - Implemented tests to ensure consistent tool name generation across different UUIDs and validate the maximum length constraints for tool names. - Added tests for tool registration behavior, ensuring that the first registered tool wins in case of name collisions. - Enhanced parameter handling tests to verify schema retrieval and default behavior when no schema is provided.	2026-03-25 22:08:29 +08:00
wizardchen	a8924e301e	feat(agent): enhance message consolidation and context management - Improved the message consolidation logic to preserve the current turn's user query and all subsequent assistant/tool messages, ensuring better context retention. - Updated the CompressContext function to reflect the new consolidation strategy, maintaining the system prompt and relevant recent messages. - Refactored the context manager to support optional message repository for improved context rebuilding from persistent storage. - Added comprehensive tests to validate the new consolidation behavior and ensure correct message handling across various scenarios.	2026-03-25 22:08:29 +08:00
wizardchen	b980d4619d	feat(knowledge): add ClearKnowledgeBaseContents API to delete all entries in a knowledge base - Implemented ClearKnowledgeBaseContents function in the Client to asynchronously delete all knowledge entries while preserving the knowledge base. - Added corresponding ClearKnowledgeBaseContents handler in the KnowledgeHandler to validate access and enqueue deletion tasks. - Updated router to include a new DELETE endpoint for clearing knowledge base contents.	2026-03-25 22:08:29 +08:00
wizardchen	2b56c4fbac	feat(agent): enhance streaming diagnostics and event tracking - Added diagnostics for streaming events in the AgentEngine, including chunk counts and response type distributions. - Implemented logging for emitted event types during the thinking process to aid in debugging and performance analysis. - Updated the session handler to provide more informative warnings when fallback answers are emitted due to non-streaming behavior. - Introduced a dedicated HTTP client for Ollama to improve connection handling and timeout management during streaming calls.	2026-03-25 22:08:29 +08:00
wizardchen	1c9503b063	feat(docparser): enhance image resolution capabilities in markdown and HTML - Updated regex patterns in MarkdownImageUtil to support alt text containing brackets and handle MIME types with hyphens. - Implemented new functions in ImageResolver for resolving HTML <img> tags with data URIs and bare base64 content, improving image handling in markdown. - Added comprehensive tests for various image scenarios, ensuring robust handling of data URIs and base64 images.	2026-03-25 22:08:29 +08:00
wizardchen	2b4e614dc2	refactor(agent): enhanced the checkFAQQuestionDuplicate function in knowledge.go by adding structured comments to outline the validation steps for FAQ questions, improving readability and maintainability.	2026-03-25 22:08:29 +08:00
wizardchen	4fb08f3ec8	feat(docparser): enhance sortedKeys function for numeric string sorting - Modified the sortedKeys function to sort keys numerically when all keys are numeric strings, ensuring correct order for keys like "2" and "10". - Updated the NeedsKBRetrieval method to include IntentSummarize as a valid case for knowledge base retrieval. - Added a new test data file for chat import to support testing of document parsing and processing functionalities.	2026-03-25 22:08:29 +08:00
wizardchen	e7c21c0733	feat(chat): add Volcengine support and refactor thinking configuration - Introduced Volcengine as a new provider in the chat provider specifications. - Refactored the thinking configuration to a more generic format, applicable to both LKEAP and Volcengine. - Updated request customizers to handle the new thinking configuration structure for both LKEAP and Volcengine providers.	2026-03-25 22:08:29 +08:00
wizardchen	3e3ffa4b0b	feat(i18n): add JSON file type support across multiple locales - Introduced support for JSON file types in English, Korean, Russian, and Chinese locale files. - Updated the file type acceptance criteria in the document upload component to include JSON. - Enhanced the knowledge base parser settings to recognize JSON as a valid file type. - Implemented JSON conversion functionality in the document parser, allowing for JSON content to be processed and converted to markdown. - Added comprehensive tests for JSON to markdown conversion to ensure functionality and reliability.	2026-03-25 22:08:29 +08:00
wizardchen	649c9d3dc9	refactor(chat_pipeline): improve regex patterns for table parsing - Updated regex patterns for table separator and row matching to use [ \t] instead of \s, preventing newline consumption across rows. - Enhanced comments for clarity on the changes made to the regex expressions.	2026-03-25 22:08:29 +08:00
wizardchen	47397464ef	refactor(chat_pipeline): enhance merging logic and add partial overlap removal - Updated the merging process in the chat pipeline to include a new step for re-merging overlapping ranges introduced by context expansion. - Implemented a new function, removePartialOverlaps, to eliminate chunks that are largely contained within higher-scored chunks, improving deduplication accuracy. - Enhanced normalization and overlap checking utilities to support the new deduplication logic.	2026-03-25 22:08:29 +08:00
wizardchen	15ebfe0b4c	refactor(model): streamline model filtering and SSRF validation - Simplified model filtering logic by removing the deduplication function and directly filtering models based on type. - Updated SSRF validation to preserve backend-managed fields when parameters are not provided by the frontend. - Clarified comments and improved code readability in the ModelEditorDialog and ModelSettings components.	2026-03-25 22:08:29 +08:00
wizardchen	68dd6aa778	refactor(knowledge): increase page size for tag retrieval in buildTagMap function - Updated the page size from 100 to 1000 in the buildTagMap function to enhance performance and efficiency when retrieving tags from the knowledge base.	2026-03-25 22:08:29 +08:00
wizardchen	a6c38123b6	refactor(knowledge): optimize tag retrieval in buildTagMap function - Changed the tag retrieval process to support pagination, allowing for more efficient handling of large sets of tags. - Adjusted the page size to 100 for better performance and reduced memory usage. - Ensured that the function continues fetching tags until all are retrieved, improving the completeness of the tag map.	2026-03-25 22:08:29 +08:00
wizardchen	f368b987bb	feat(agent): refactor agent engine creation and enhance MCP tool registration - Improved the CreateAgentEngine method by breaking down the process into distinct steps for better readability and maintainability. - Added a new registerMCPTools method to streamline the registration of MCP tools based on the agent configuration. - Enhanced error handling and logging for MCP service registration, ensuring clearer feedback during the process. - Introduced resolveKBAndDocInfos to load knowledge base metadata and selected document information for prompt generation.	2026-03-25 22:08:29 +08:00
wizardchen	b02e3f1dbb	feat(knowledge): implement knowledge item rebuild confirmation dialog and enhance chunk processing options - Added a confirmation dialog for rebuilding knowledge items in the KnowledgeBase component. - Introduced new state variables to manage the rebuild process and item selection. - Enhanced the processChunks function to support question generation options based on configuration. - Updated the knowledge service to handle question generation settings during document processing.	2026-03-25 22:08:29 +08:00
wizardchen	0e01a8ca64	feat(server): enhance server startup and shutdown process - Added support for SO_REUSEPORT in the listenWithRetry function to improve port binding during hot-reloads. - Implemented graceful shutdown by closing the listener immediately upon receiving a shutdown signal, allowing for quicker port release. - Updated logging to provide clearer feedback during server shutdown and error handling.	2026-03-25 22:08:29 +08:00
wizardchen	39816f2756	feat(security): add validation for file path to prevent path traversal attacks - Implemented a check to reject file paths containing "..", enhancing security against path traversal vulnerabilities in the file serving functionality.	2026-03-25 22:08:29 +08:00
wizardchen	b2c009e61c	feat(security): implement SSRF and path traversal protections in image handling and file downloads - Added SSRF protection by stripping client-supplied URL and Caption fields from image attachments in the QA request handler. - Introduced a validation function to ensure Feishu API path parameters contain only safe characters, preventing path traversal attacks. - Enhanced the WeCom webhook adapter to reject internal/private URLs unless they are on an allowlist, improving security during file downloads. - Implemented a mechanism to bypass SSRF checks for trusted IM platform API hosts.	2026-03-25 22:08:29 +08:00
wizardchen	052be53c42	feat(prompts): update assistant descriptions and add intent-specific templates - Updated the assistant descriptions in various prompt templates to specify that they are developed by Tencent. - Added new intent-specific system prompt templates for greeting, chitchat, follow-up, image analysis, summarization, and web search unavailability scenarios. - Enhanced citation rules in the system prompts to ensure proper inline citation formatting. - Refactored the rewrite prompt to classify user intent more accurately and output structured JSON responses.	2026-03-25 22:08:29 +08:00
wizardchen	109393dd16	fix(server): reduce kill delay and implement port retry mechanism - Changed the kill delay from 30 seconds to 2 seconds in the configuration. - Refactored the server startup process to include a retry mechanism for binding to the port, addressing potential issues during hot-reload scenarios. - Added a new function, listenWithRetry, to handle port binding with exponential backoff, improving server reliability during restarts.	2026-03-25 22:08:29 +08:00
wizardchen	0bb1a581de	feat(agent): enhance token management and context handling in AgentEngine - Added token usage tracking to the AgentEngine, allowing for better estimation of current context token counts. - Implemented a new method to estimate current tokens based on previous usage and newly appended messages, improving context management. - Updated context window management to utilize estimated token counts for more efficient message consolidation. - Enhanced streaming methods to include token usage information, providing better insights into LLM interactions. - Refactored the token estimator to support BPE tokenization, ensuring accurate token counts for messages. - Added unit tests to validate new token estimation and context management functionalities.	2026-03-25 22:08:29 +08:00
wizardchen	0d4ceb6c96	feat(image): enhance image resolution handling for data URIs and remote images - Added functionality to resolve embedded data:image;base64 images in markdown, storing them and replacing references with stable URLs. - Updated the knowledge service to process data URIs before handling remote images, ensuring all images are resolved consistently. - Improved logging for resolved images, providing better insights during manual processing. - Added unit tests to validate the new data URI resolution functionality and ensure reliability.	2026-03-25 22:08:29 +08:00
wizardchen	9650188a35	feat(chunk): add FindFAQChunkWithDuplicateQuestion method for enhanced duplicate question checking - Implemented a new method in the chunk repository to find FAQ chunks with overlapping standard or similar questions. - Updated the knowledge service to utilize this method for checking duplicates during FAQ question creation. - Improved error handling for duplicate questions, providing clearer feedback on conflicts.	2026-03-25 22:08:29 +08:00
wizardchen	3a8bd36d8a	feat(env): add SSRF whitelist configuration to .env.example and docker-compose.yml	2026-03-25 22:08:29 +08:00
wizardchen	d7fd3b5eee	feat(types): add language field to QuestionGenerationPayload for locale support	2026-03-25 22:08:29 +08:00
wizardchen	59d2f6b54c	feat(graph): enhance graph extraction prompt rendering and language context handling - Added a new method to render graph extraction prompts with shared placeholders for language. - Updated entity and relationship extraction methods to utilize the new rendering function, improving prompt customization. - Enhanced question generation task to include language context, ensuring prompts are generated in the correct language. - Improved error handling for empty prompt configurations in question generation, enhancing robustness.	2026-03-25 22:08:29 +08:00
wizardchen	08ac3937ac	fix(agent): update tool hint comment for clarity in UI progress display	2026-03-25 22:08:29 +08:00
wizardchen	a167886aac	chore(docker): update PostgreSQL image version to v0.22.2-pg17 in development and production configurations	2026-03-25 22:08:29 +08:00
wizardchen	e138e0c81a	refactor(agent): remove query display from DatabaseQuery component and update tool result structure - Removed the query display section from the DatabaseQuery component to streamline the UI. - Updated the DatabaseQueryData interface to eliminate the query field, reflecting changes in tool result handling. - Enhanced the DatabaseQuery tool execution to focus on returning structured results without exposing raw SQL queries. - Added a new utility to strip <think> blocks from LLM outputs, improving content clarity and user experience.	2026-03-25 22:08:29 +08:00
wizardchen	637376428e	feat(agent): implement core agent functionalities and context management - Introduced the AgentEngine with methods for executing tool calls, managing context windows, and handling final answer generation. - Added functionality to stream LLM responses and tool results through the EventBus, enhancing real-time interaction. - Implemented context management to consolidate messages and prevent exceeding token limits, ensuring efficient use of context. - Enhanced error handling and logging for tool execution and response analysis, improving robustness and traceability. - Added unit tests to validate the new functionalities and ensure reliability in agent operations.	2026-03-25 22:08:29 +08:00
wizardchen	2f7e3a8ac1	feat(agent): enhance tool execution and message sanitization - Introduced a default timeout for tool execution to prevent indefinite hangs during long-running tasks. - Implemented context cancellation handling in the execution loop to gracefully manage timeouts and user cancellations. - Added a JSON repair utility to fix common malformations in tool arguments, improving robustness in argument parsing. - Enhanced message sanitization to merge consecutive user messages and handle orphaned tool results, ensuring compatibility with LLM providers. - Added unit tests for the new JSON repair and message sanitization functionalities to validate their effectiveness.	2026-03-25 22:08:29 +08:00


990 Commits