feat(job_crawler): initialize job crawler service with kafka integration

- Add technical documentation (技术方案.md) with system architecture and design details
- Create FastAPI application structure with modular organization (api, core, models, services, utils)
- Implement job data crawler service with incremental collection from third-party API
- Add Kafka service integration with Docker Compose configuration for message queue
- Create data models for job listings, progress tracking, and API responses
- Implement REST API endpoints for data consumption (/consume, /status) and task management
- Add progress persistence layer using SQLite for tracking collection offsets
- Implement date filtering logic to keep only data published within the last 7 days
- Create API client service for third-party data source integration
- Add configuration management with environment-based settings
- Include Docker support with Dockerfile and docker-compose.yml for containerized deployment
- Add logging configuration and utility functions for date parsing
- Include requirements.txt with all Python dependencies and README documentation
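
For illustration, the offset persistence and 7-day date filter described above could look roughly like the sketch below. This is not the code from this commit: the `ProgressStore` class, the `progress` table, its column names, and the `is_recent` helper are hypothetical names chosen for the example, and only the Python standard library is used.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def is_recent(published_at: datetime, now: datetime, days: int = 7) -> bool:
    """Return True if published_at lies within the last `days` days before `now`."""
    age = now - published_at
    # Reject future timestamps (negative age) and anything older than the window.
    return timedelta(0) <= age <= timedelta(days=days)


class ProgressStore:
    """Hypothetical SQLite-backed store for per-task collection offsets."""

    def __init__(self, path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS progress ("
            "task_id TEXT PRIMARY KEY, offset_value INTEGER NOT NULL)"
        )

    def get_offset(self, task_id: str) -> int:
        """Return the saved offset for task_id, or 0 if none has been stored."""
        row = self.conn.execute(
            "SELECT offset_value FROM progress WHERE task_id = ?", (task_id,)
        ).fetchone()
        return row[0] if row else 0

    def save_offset(self, task_id: str, offset: int) -> None:
        """Insert or overwrite the offset for task_id in a single transaction."""
        with self.conn:
            self.conn.execute(
                "INSERT OR REPLACE INTO progress (task_id, offset_value) VALUES (?, ?)",
                (task_id, offset),
            )


if __name__ == "__main__":
    store = ProgressStore()
    store.save_offset("demo-task", 100)
    now = datetime.now(timezone.utc)
    print(store.get_offset("demo-task"), is_recent(now - timedelta(days=3), now))
```

Persisting the offset after each batch would let an interrupted crawl resume where it stopped instead of re-fetching from the beginning, which is presumably the intent of the incremental collection described in the list.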

李顺东

2026-01-15 17:09:43 +08:00

commit ae681575b9

26 changed files with 1898 additions and 0 deletions

job_crawler/requirements.txt Normal file

@@ -0,0 +1,8 @@
 fastapi==0.109.0
 uvicorn==0.27.0
 httpx==0.27.0
 kafka-python==2.0.2
 apscheduler==3.10.4
 pydantic==2.5.3
 python-dotenv==1.0.0
 PyYAML==6.0.1
