Files
ocups-kafka/job_crawler/config/config.yml.docker
李顺东 ae681575b9 feat(job_crawler): initialize job crawler service with kafka integration
- Add technical documentation (技术方案.md) with system architecture and design details
- Create FastAPI application structure with modular organization (api, core, models, services, utils)
- Implement job data crawler service with incremental collection from third-party API
- Add Kafka service integration with Docker Compose configuration for message queue
- Create data models for job listings, progress tracking, and API responses
- Implement REST API endpoints for data consumption (/consume, /status) and task management
- Add progress persistence layer using SQLite for tracking collection offsets
- Implement date filtering logic to extract data published within 7 days
- Create API client service for third-party data source integration
- Add configuration management with environment-based settings
- Include Docker support with Dockerfile and docker-compose.yml for containerized deployment
- Add logging configuration and utility functions for date parsing
- Include requirements.txt with all Python dependencies and README documentation
2026-01-15 17:09:43 +08:00

40 lines
766 B
Docker
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# Docker环境配置文件
# 复制此文件为 config.yml 并修改账号密码
# 应用配置
app:
name: job-crawler
version: 1.0.0
debug: false
# 八爪鱼API配置
api:
base_url: https://openapi.bazhuayu.com
username: "your_username"
password: "your_password"
batch_size: 100
# 多任务配置
tasks:
- id: "00f3b445-d8ec-44e8-88b2-4b971a228b1e"
name: "青岛招聘数据"
enabled: true
- id: "task-id-2"
name: "任务2"
enabled: false
# Kafka配置Docker内部网络
kafka:
bootstrap_servers: kafka:29092
topic: job_data
consumer_group: job_consumer_group
# 采集配置
crawler:
interval: 300
filter_days: 7
max_workers: 5
# 数据库配置
database:
path: /app/data/crawl_progress.db