A modular data scraping project that supports automated monitoring and scraping of multiple data sources.
cronjob-ziroom/
├── .github/workflows/          # GitHub Actions configuration
│   ├── ziroom-runner.yml       # Ziroom monitoring workflow
│   └── 99-runner.yml           # 99.com scraping workflow
├── modules/                    # Functional modules directory
│   ├── ziroom/                 # Ziroom monitoring module
│   │   ├── scraper.py          # Scraping script
│   │   ├── cronjob.sh          # Cron job script
│   │   ├── data.html           # Output file
│   │   └── README.md           # Module documentation
│   └── 99/                     # 99.com scraping module
│       ├── scraper.py          # Basic scraping script
│       ├── cronjob.sh          # Cron job script
│       ├── data.json           # JSON output
│       ├── data.html           # HTML output
│       └── README.md           # Module documentation
├── core/                       # Common core code
│   ├── requirements.txt        # Dependencies
│   └── utils.py                # Common utility functions
├── README.md                   # Project overview
├── EMAIL_SETUP.md              # Email configuration guide
└── LICENSE
Ziroom module:
- Function: Monitor Ziroom rental listings for specific keywords
- Frequency: Daily at 12:00 PM
- Output: data.html
- Email Subject: "OpenKikCoc: cronjob-ziroom"
99.com module:
- Function: Scrape game leaderboard data (number, fwq, player, hkzs)
- Frequency: Every 30 minutes
- Output: data.json and data.html
- Email Subject: "99.com Data Update Notification"
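
As an illustration of how the 99.com module's two outputs could be produced from the same records, here is a minimal sketch. The field names (number, fwq, player, hkzs) come from the description above; the helper names `to_json` and `to_html` and the sample record are hypothetical and are not taken from `modules/99/scraper.py`.

```python
import html
import json

FIELDS = ["number", "fwq", "player", "hkzs"]  # field names from the module description


def to_json(records):
    """Serialize records as pretty-printed JSON, keeping non-ASCII text readable."""
    return json.dumps(records, ensure_ascii=False, indent=2)


def to_html(records):
    """Render records as a simple HTML table, escaping every cell value."""
    header = "".join(f"<th>{html.escape(name)}</th>" for name in FIELDS)
    rows = []
    for record in records:
        cells = "".join(
            f"<td>{html.escape(str(record.get(name, '')))}</td>" for name in FIELDS
        )
        rows.append(f"<tr>{cells}</tr>")
    return f"<table><tr>{header}</tr>{''.join(rows)}</table>"


# Hypothetical sample record, for illustration only.
sample = [{"number": 1, "fwq": "s1", "player": "<demo>", "hkzs": 100}]
json_text = to_json(sample)
html_text = to_html(sample)
```

Writing both files from one list of records keeps data.json and data.html consistent with each other.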
pip3 install -r ./core/requirements.txt
bash ./modules/ziroom/cronjob.sh
bash ./modules/99/cronjob.sh

- QQEMAIL_USERNAME: QQ email username
- QQEMAIL_TOKEN: QQ email authorization token
- QQEMAIL_RECIPIENTS: Comma-separated list of recipient email addresses

- URI: Target website URL
- KEYWORD: Search keyword
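
The KEYWORD setting could drive a keyword check roughly as follows. This is a sketch under assumptions, not the logic of `modules/ziroom/scraper.py`: the `TextCollector` class, the `find_matches` function, the fallback keyword, and the sample page are all hypothetical, and the page is an inline stand-in rather than a document fetched from URI.

```python
import os
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect the non-empty text nodes of an HTML document."""

    def __init__(self):
        super().__init__()
        self.texts = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.texts.append(text)


def find_matches(page_html, keyword):
    """Return the text nodes of page_html that contain keyword."""
    collector = TextCollector()
    collector.feed(page_html)
    collector.close()
    return [text for text in collector.texts if keyword in text]


# KEYWORD is read from the environment; the fallback value is hypothetical.
keyword = os.environ.get("KEYWORD", "balcony")
# Stand-in for a page fetched from URI; no network request is made here.
page = "<ul><li>Room with balcony</li><li>Shared studio</li></ul>"
matches = find_matches(page, keyword)
```

Keeping the matching step separate from the download makes it possible to test the keyword logic locally without network access.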
The project supports sending notifications to multiple recipients using the QQEMAIL_RECIPIENTS secret.
Setup:
- Go to your GitHub repository → Settings → Secrets and variables → Actions
- Create a new secret named QQEMAIL_RECIPIENTS
- Value: [email protected],[email protected],[email protected]
Features:
- All workflows send emails to the same recipient list
- Easy to add/remove recipients without code changes
- Secure storage using GitHub Secrets
For detailed email configuration, see EMAIL_SETUP.md.
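
A script that consumes the recipient list might split it as sketched below. The `parse_recipients` function is hypothetical, and reading the secret through an environment variable of the same name is an assumption about how the workflows expose it.

```python
import os


def parse_recipients(raw):
    """Split a comma-separated address list, dropping blanks and duplicates."""
    recipients = []
    for part in raw.split(","):
        address = part.strip()
        if address and address not in recipients:
            recipients.append(address)
    return recipients


# In a workflow the value would come from the QQEMAIL_RECIPIENTS secret,
# exposed to the script as an environment variable (an assumption here).
raw_value = os.environ.get("QQEMAIL_RECIPIENTS", "")
recipients = parse_recipients(raw_value)
```

Stripping whitespace and ignoring empty entries means a stray space or trailing comma in the secret value does not produce an invalid address.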
GitHub Actions will automatically run:
- Ziroom: Daily at 12:00 PM
- 99.com: Every 30 minutes
- Clear Separation: Each functional module is independent and non-interfering
- Easy Maintenance: Code is organized by functionality, making maintenance and updates easier
- Flexible Extension: New data source modules can be easily added
- Independent Configuration: Each module can have its own configuration and dependencies
- Multi-Recipient Notifications: Support for email lists and team notifications
To add a new data source module:
- Create a new directory under modules/
- Add scraping scripts and cron job scripts
- Create corresponding workflow files
- Update project documentation
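
The first three steps could be scaffolded with a small helper like the one below. It is only a sketch: `scaffold_module` is a hypothetical function, and the file names simply mirror the existing modules and the `<name>-runner.yml` workflow naming seen in the project tree. The created files are empty placeholders.

```python
from pathlib import Path


def scaffold_module(repo_root, name):
    """Create empty placeholder files for a new module and return their paths."""
    root = Path(repo_root)
    targets = [
        root / "modules" / name / "scraper.py",
        root / "modules" / name / "cronjob.sh",
        root / "modules" / name / "README.md",
        root / ".github" / "workflows" / f"{name}-runner.yml",
    ]
    for target in targets:
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch(exist_ok=True)  # never overwrites existing content
    return targets
```

For example, `scaffold_module(".", "example")` run from the repository root would create `modules/example/` and an empty `example-runner.yml` to fill in.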
Ziroom workflow:
- Trigger: Daily at 12:00 PM
- Script: ./modules/ziroom/cronjob.sh
- Output: modules/ziroom/data.html
- Email: Sent to all recipients in QQEMAIL_RECIPIENTS when data changes are detected
99.com workflow:
- Trigger: Every 30 minutes
- Script: ./modules/99/cronjob.sh
- Output: modules/99/data.json and modules/99/data.html
- Email: Sent to all recipients in QQEMAIL_RECIPIENTS when data changes are detected
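
This document does not say how a data change is detected. One possible approach, shown here purely as an assumption, is to compare a content hash of the output file before and after a run. The `file_digest` and `has_changed` functions are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file, or None if it does not exist."""
    file_path = Path(path)
    if not file_path.is_file():
        return None
    return hashlib.sha256(file_path.read_bytes()).hexdigest()


def has_changed(path, previous_digest):
    """Report whether the file's current digest differs from the previous one."""
    return file_digest(path) != previous_digest
```

A run would record `file_digest("modules/99/data.json")` before scraping and send the email only if `has_changed` returns `True` afterwards.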
- Network Connection: Ensure stable internet connection
- GitHub Secrets: Verify QQ email credentials and recipient list are properly configured
- Python Dependencies: Check if all required packages are installed
- Target Website: Confirm target websites are accessible
- Check GitHub Actions logs for error details
- Verify environment variables and secrets configuration
- Test scripts locally before pushing to GitHub
- Monitor email notifications for execution status
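
When testing locally, a quick check for missing settings can save a failed run. The sketch below assumes the three email secrets are passed to the scripts as environment variables with the same names; `missing_settings` is a hypothetical helper.

```python
import os

REQUIRED = ["QQEMAIL_USERNAME", "QQEMAIL_TOKEN", "QQEMAIL_RECIPIENTS"]


def missing_settings(names, environ=None):
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_settings(REQUIRED)
    if missing:
        print("Missing settings: " + ", ".join(missing))
    else:
        print("All required settings are present.")
```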
We welcome Issue submissions and Pull Requests to improve the project!
This project is licensed under the MIT License - see the LICENSE file for details.