WeKnora

mirror of https://github.com/Tencent/WeKnora.git synced 2026-06-04 13:30:32 +08:00

Author	SHA1	Message	Date
wizardchen	ef1047bf67	feat(parser): add OpenDataLoader, PaddleOCR-VL engines, and parser improvements Introduce opendataloader and PaddleOCR-VL parser engines with tenant-level settings UI, replace liteparse, and harden Excel/PPT/Markdown parsing. Optional odl-hybrid sidecar stays local-build only and is excluded from default dev-start and full profiles.	2026-06-03 12:29:13 +08:00
wizardchen	bd68a0c377	feat(cloud-image): support apt-based docker install for restricted-egress hosts Mainland China cloud VMs (Tencent Lighthouse, Aliyun, etc.) frequently cannot reach get.docker.com, github.com, or even community GitHub mirrors like gh-proxy.com. The cloud-image bootstrap previously had no escape hatch for this and failed at the very first curl. This adds a new DOCKER_INSTALL_MIRROR env var to prepare.sh. When set, it skips get.docker.com and installs docker-ce + compose-plugin from an apt mirror of Docker's official repo (e.g. mirrors.tencent.com, mirrors.aliyun.com). README.md also gets: - A GH_PROXY env var threaded through bootstrap methods A and B so the initial script pull can route through gh-proxy / ghfast. - An explicit recommendation to prefer method C (scp from local) on mainland China VMs. - A consolidated "三件套" (three-piece set) table mapping WEKNORA_GH_PROXY / DOCKER_INSTALL_MIRROR / DOCKER_REGISTRY_MIRROR to per-cloud endpoints, so users hit one place to copy the full env.	2026-05-11 21:14:08 +08:00
wizardchen	0cfbad7f97	fix(cloud-image): enhance firstboot and cleanup scripts for improved security and functionality - Updated cleanup.sh to avoid recreating .env during cleanup, preventing exposure of default passwords before firstboot. - Modified firstboot.sh to create .env from .env.example only if it doesn't exist, ensuring no sensitive data is present before initialization. - Added support for Docker Hub and GitHub tarball download acceleration via new environment variables WEKNORA_GH_PROXY and DOCKER_REGISTRY_MIRROR. - Implemented a mechanism to prune old WeKnora images based on the current version, reducing image size and maintaining a clean environment. - Enhanced README.md with instructions for using the new acceleration features and image pruning options.	2026-05-11 15:51:11 +08:00
wizardchen	cdfbf05524	fix(cloud-image): make firstboot idempotent and pin image versions Address review feedback on PR #1249: - prepare.sh: when WEKNORA_REF looks like a version tag (v*), write the matching WEKNORA_VERSION into .env so docker compose pulls images that match the compose YAML's git ref (previously stuck on :latest). - prepare.sh: detect docker binary path via `command -v docker` and template it into weknora.service (replacing hardcoded /usr/bin/docker that fails when docker lives in /usr/local/bin). - firstboot.sh: write a /opt/WeKnora/.firstboot.done marker immediately after rewriting .env, before `docker compose up -d`. If compose fails mid-run, the next boot is gated by ConditionPathExists=!marker so we never regenerate DB_PASSWORD against an already-initialized postgres volume (which previously bricked the database). - firstboot.sh: stop deleting its own unit file / script while the oneshot is still executing; rely on the marker + `systemctl disable` instead, avoiding "job failed" markings from systemd. - firstboot.sh: use detected docker path instead of /usr/bin/docker; add note in credentials file that .env is the source of truth. - weknora-firstboot.service: add ConditionPathExists=!.firstboot.done. - cleanup.sh: scope docker volume deletion to compose project label (com.docker.compose.project=<name>) instead of fuzzy substring match that could nuke unrelated postgres/redis volumes. - cleanup.sh: also remove .firstboot.done marker, firstboot log, and any leftover /root/weknora-credentials.txt so the image is clean. - README.md: clarify how to actually disable registration (edit the `replace` call list in firstboot.sh, not run that command in shell).	2026-05-11 12:25:19 +08:00
wizardchen	155f3b3e72	docs(cloud-image): clarify sudo + redirection pitfall in setup steps Recommend `sudo -i` to avoid the classic `sudo cmd >> file` failure where the shell redirection runs as the unprivileged user. Also document the `sudo tee -a` workaround and add a scp option C.	2026-05-11 12:25:19 +08:00
wizardchen	afd7d1fdf8	docs(cloud-image): add cloud-agnostic image packaging scripts Add scripts and docs for packaging WeKnora into cloud images (AMI, custom images, snapshots) so users can distribute one-click deployable templates on any cloud provider. - scripts/cloud-image/: cloud-agnostic prepare/cleanup/firstboot scripts plus systemd units. Downloads only the 4 runtime files needed by the compose stack (~100KB) instead of cloning the full repo, and pins to any git ref via WEKNORA_REF for reproducible builds. - firstboot.sh randomizes DB/Redis/JWT/AES secrets on first boot, writes credentials to /root/weknora-credentials.txt and self-removes. - docs/cloud-image/: per-platform packaging guides. Includes a guide for Tencent Cloud Lighthouse / CVM covering image creation, sharing, and marketplace listing. Default-on services match the unprofiled compose stack (frontend, app, docreader, postgres, redis); optional services (qdrant, milvus, neo4j, langfuse, etc.) remain opt-in via compose profiles to keep the image size small.	2026-05-11 12:25:19 +08:00
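
The marker-gated first boot described in commit cdfbf05524 can be sketched roughly as follows. This is a minimal illustration and not the repository's firstboot.sh: the function name `firstboot_once`, its arguments, and the use of `od` for secret generation are assumptions. Only the `.firstboot.done` marker name, the `DB_PASSWORD` key, and the ordering (marker written after `.env`, before the stack is started) come from the commit message.

```shell
# Hypothetical sketch of a marker-gated first-boot step.
# $1:   working directory (stands in for /opt/WeKnora)
# $2..: command that starts the stack (stands in for `docker compose up -d`)
firstboot_once() {
    local dir="$1"; shift
    local marker="$dir/.firstboot.done"
    local env_file="$dir/.env"

    # Gate: plays the role of ConditionPathExists=!marker in the unit file.
    if [ -e "$marker" ]; then
        echo "already initialized; leaving $env_file untouched"
        return 0
    fi

    # Generate a random secret once (32 hex characters) and write it to .env.
    local secret
    secret="$(od -An -N16 -tx1 /dev/urandom | tr -d ' \n')"
    printf 'DB_PASSWORD=%s\n' "$secret" > "$env_file"

    # Write the marker BEFORE starting services, so that a failed start
    # cannot cause the password to be regenerated on the next boot.
    : > "$marker"

    # Start the stack; its exit status becomes the function's status.
    "$@"
}
```

Calling `firstboot_once /some/dir false` simulates a failed start: the call returns non-zero, but the marker already exists, so a second call leaves `.env` as it was.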

6 Commits
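
The sudo and redirection pitfall mentioned in commit 155f3b3e72 can be illustrated with a small helper. This is a sketch under assumptions: the name `append_line` and the `SUDO` override variable are hypothetical and do not come from the repository. With `sudo cmd >> file`, the calling unprivileged shell opens the file, so the append fails on root-owned files; piping into `sudo tee -a` lets the privileged process open it instead.

```shell
# Hypothetical helper: append one line to a file that may be root-owned.
# SUDO defaults to "sudo"; set SUDO="" to run unprivileged (e.g. in a test).
append_line() {
    # $1: target file, $2: line to append
    # tee runs under sudo, so the privileged process opens the file.
    # With `sudo cmd >> file`, the calling shell would open it instead.
    printf '%s\n' "$2" | ${SUDO-sudo} tee -a "$1" > /dev/null
}
```

For example, `append_line /path/to/.env 'KEY=value'` appends a single line, where both the path and the key are placeholders.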