Automated pharmaceutical intelligence platform that monitors Reddit discussions and clinical trial changes for drug development insights.
CTP Tracker automates the monitoring of drug-related discussions and clinical trial developments, giving pharmaceutical researchers and investors timely input for their decisions.
Problem: Manual monitoring of drug mentions across social media and clinical trial databases is time-consuming, error-prone, and often misses critical developments that could impact investment decisions or regulatory timelines.
Solution: An automated monitoring system that continuously scans Reddit for drug discussions and tracks clinical trial protocol changes, providing stakeholders with timely, actionable intelligence through email alerts.
Outcomes:
- Real-time detection of drug mentions across 50+ medical subreddits
- Automated tracking of clinical trial protocol version changes
- Consolidated email alerts with direct links to source material
- Persistent tracking to avoid duplicate notifications
- Configurable drug watchlists with date-based filtering
Subject: Reddit Monitoring Alerts (3 new posts)
Syfovre: New treatment for macular degeneration - https://reddit.com/r/medicine/comments/...
Ozempic: Weight loss discussion in diabetes community - https://reddit.com/r/diabetes/comments/...
Keytruda: Cancer treatment updates - https://reddit.com/r/cancer/comments/...
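A consolidated alert like the sample above could be assembled along these lines. This is a minimal sketch, not the project's actual code: the `Mention` dataclass and the `build_reddit_alert` function are hypothetical names introduced for illustration, and the URLs in the demo are placeholders.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Mention:
    """One Reddit post that matched a watchlist drug (hypothetical structure)."""
    drug: str
    title: str
    url: str


def build_reddit_alert(mentions: List[Mention]) -> Tuple[str, str]:
    """Return (subject, body) for one consolidated alert email."""
    count = len(mentions)
    noun = "post" if count == 1 else "posts"
    subject = f"Reddit Monitoring Alerts ({count} new {noun})"
    # One line per post: drug name, post title, direct link to the source.
    lines = [f"{m.drug}: {m.title} - {m.url}" for m in mentions]
    return subject, "\n".join(lines)


if __name__ == "__main__":
    # Placeholder data for illustration only.
    demo = [
        Mention("Syfovre", "New treatment for macular degeneration", "https://example.com/post-1"),
        Mention("Ozempic", "Weight loss discussion in diabetes community", "https://example.com/post-2"),
    ]
    subject, body = build_reddit_alert(demo)
    print(f"Subject: {subject}")
    print(body)
```

Keeping the formatting in one pure function makes it easy to check the email text without sending anything through Brevo.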
Subject: Daily New Trials & Trial Changes Alert
| Drug | NCT ID | Title | Changes | Compare Link |
|------|--------|-------|---------|--------------|
| Syfovre | NCT06394674 | Study of SYFOVRE... | Updated inclusion criteria | [Compare changes] |
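A row in the trial alert is triggered when a trial's protocol version is higher than the last one recorded. The sketch below shows one way that comparison could work, assuming the stored state maps each NCT ID to an integer version number. The function name `find_version_changes` and the version numbers in the demo are illustrative assumptions, not taken from the project.

```python
from typing import Dict, List, Tuple


def find_version_changes(
    stored: Dict[str, int], latest: Dict[str, int]
) -> Tuple[List[Tuple[str, int, int]], Dict[str, int]]:
    """Compare the latest protocol versions against the stored highest versions.

    Returns (changes, updated_state), where each change is
    (nct_id, previous_version, new_version). A trial that is not yet in
    `stored` is reported with previous_version 0.
    """
    changes: List[Tuple[str, int, int]] = []
    updated = dict(stored)  # copy so the caller's state is not mutated
    for nct_id, version in sorted(latest.items()):
        previous = stored.get(nct_id, 0)
        if version > previous:
            changes.append((nct_id, previous, version))
            updated[nct_id] = version
    return changes, updated


if __name__ == "__main__":
    # Version numbers here are made up for illustration.
    stored_state = {"NCT06394674": 3}
    latest_seen = {"NCT06394674": 4}
    changes, new_state = find_version_changes(stored_state, latest_seen)
    for nct_id, old, new in changes:
        print(f"{nct_id}: version {old} -> {new}")
```

Returning the updated state separately lets the caller persist it only after the alert email has gone out, so a failed send does not silently swallow a change.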
flowchart LR
User[Stakeholders] --> Email[Email Alerts]
Email --> Brevo[Brevo API]
subgraph "Monitoring System"
Reddit[Reddit Monitor] --> PRAW[PRAW API]
Trials[Trial Monitor] --> CT[ClinicalTrials.gov API]
Trials --> Playwright[Web Scraping]
end
subgraph "Data Sources"
PRAW --> RedditData[Reddit Posts]
CT --> TrialData[Trial Metadata]
Playwright --> VersionData[Protocol Versions]
end
subgraph "Storage"
RedditData --> JSON[redditIDs.json]
VersionData --> State[trial_versions.json]
Config[watchlist.xlsx] --> Drugs[Drug Watchlist]
end
subgraph "Processing"
Drugs --> Reddit
Drugs --> Trials
Reddit --> Email
Trials --> Email
end
- Python 3.8+
- Reddit API credentials (client_id, client_secret, user_agent)
- Brevo API key for email notifications
- Playwright for web scraping (installed in the dependency step below, together with its Chromium browser)
- Clone the repository
    git clone https://github.com/username/ctp-tracker.git
    cd ctp-tracker
- Install dependencies
    pip install praw pandas sib_api_v3_sdk python-dotenv playwright beautifulsoup4 requests
    playwright install chromium
- Set up environment variables
    cp .env.example .env
    # Edit .env with your API credentials
- Configure watchlist
    # Create watchlist.xlsx with required sheets:
    # - SocialMedia: Drug_Name, Date_added
    # - TrialChange: Drug_Name, Date_added
- Run monitoring
    # Monitor Reddit mentions
    python redditMonitoring.py
    # Monitor trial changes
    python TrialAlert.py
    # Test email functionality
    python test_email.py
from redditMonitoring import RedditMonitor
# Initialize monitor
monitor = RedditMonitor()
# Run monitoring for past week
results = monitor.monitor_all_drugs(time_filter='week', limit_per_search=10)
# Print results
monitor.print_results(results)

# Reddit API (read-only access)
REDDIT_CLIENT_ID=your_client_id
REDDIT_CLIENT_SECRET=your_client_secret
REDDIT_USER_AGENT=DrugMonitor/1.0 by YourUsername
# Email notifications
BREVO_API_KEY=your_brevo_api_key

SocialMedia Sheet - For Reddit monitoring:
| Drug_Name | Date_added |
|---|---|
| Syfovre | 2024-01-15 |
| Ozempic | 2024-01-10 |
TrialChange Sheet - For clinical trial monitoring:
| Drug_Name | Date_added |
|---|---|
| Syfovre | 2024-01-15 |
| Keytruda | 2024-01-10 |
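The `Date_added` column suggests that content older than the date a drug joined the watchlist is ignored. That reading is an assumption, and the sketch below illustrates it with plain dictionaries standing in for rows read from the Excel sheets. The function names are hypothetical. PRAW submissions expose a `created_utc` Unix timestamp, which is the kind of value `is_after_added` expects.

```python
from datetime import date, datetime, timezone
from typing import Dict, List


def parse_watchlist_rows(rows: List[Dict[str, str]]) -> Dict[str, date]:
    """Map each Drug_Name to its Date_added, parsed from YYYY-MM-DD strings."""
    return {
        row["Drug_Name"].strip(): date.fromisoformat(row["Date_added"].strip())
        for row in rows
    }


def is_after_added(watchlist: Dict[str, date], drug: str, created_utc: float) -> bool:
    """True if the item was created on or after the drug's Date_added.

    Drugs that are not on the watchlist are never reported.
    """
    added = watchlist.get(drug)
    if added is None:
        return False
    created = datetime.fromtimestamp(created_utc, tz=timezone.utc).date()
    return created >= added


if __name__ == "__main__":
    watchlist = parse_watchlist_rows([
        {"Drug_Name": "Syfovre", "Date_added": "2024-01-15"},
        {"Drug_Name": "Ozempic", "Date_added": "2024-01-10"},
    ])
    post_time = datetime(2024, 1, 20, tzinfo=timezone.utc).timestamp()
    print(is_after_added(watchlist, "Syfovre", post_time))
```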
- redditIDs.json - Persistent storage of seen Reddit post IDs
- trial_versions.json - Highest protocol version seen for each NCT ID
- watchlist.xlsx - Drug watchlist with date filters
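Duplicate notifications are avoided by remembering which post IDs have already been reported. Here is a minimal sketch of that pattern, assuming the state file holds a JSON list of ID strings; the real layout of redditIDs.json may differ, and the helper names are hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Iterable, List, Set


def load_seen_ids(path: Path) -> Set[str]:
    """Read previously seen post IDs; a missing file means nothing was seen yet."""
    if not path.exists():
        return set()
    with path.open("r", encoding="utf-8") as fh:
        return set(json.load(fh))


def save_seen_ids(path: Path, seen: Set[str]) -> None:
    """Write the seen IDs back as a sorted JSON list."""
    with path.open("w", encoding="utf-8") as fh:
        json.dump(sorted(seen), fh, indent=2)


def filter_new(post_ids: Iterable[str], seen: Set[str]) -> List[str]:
    """Return IDs not seen before, in order, and record them in `seen`."""
    fresh: List[str] = []
    for post_id in post_ids:
        if post_id not in seen:
            seen.add(post_id)
            fresh.append(post_id)
    return fresh


if __name__ == "__main__":
    # Use a temporary directory so the demo never touches real state files.
    with tempfile.TemporaryDirectory() as tmp:
        state_file = Path(tmp) / "redditIDs.json"
        seen = load_seen_ids(state_file)
        print(filter_new(["abc123", "def456", "abc123"], seen))
        save_seen_ids(state_file, seen)
        print(filter_new(["def456", "ghi789"], load_seen_ids(state_file)))
```

Saving only after the alert has been sent means a crash mid-run leads to a repeated alert rather than a missed one.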
ctp-tracker/
├── redditMonitoring.py     # Reddit mention monitoring
├── TrialAlert.py           # Clinical trial change tracking
├── watchlist.py            # Excel watchlist parser
├── test_email.py           # Email functionality testing
├── watchlist.xlsx          # Drug configuration (not in repo)
├── redditIDs.json          # Reddit post tracking (auto-generated)
├── trial_versions.json     # Trial version tracking (auto-generated)
├── .env                    # Environment variables (not in repo)
├── .gitignore              # Git ignore rules
├── docs/
│   └── img/                # Documentation images
└── README.md               # This file
- PRAW - Reddit API wrapper for read-only access
- Pandas - Excel file processing and data manipulation
- Playwright - Web scraping for clinical trial version pages
- BeautifulSoup4 - HTML parsing for trial data extraction
- Requests - HTTP client for ClinicalTrials.gov API
- Reddit API - Social media monitoring (read-only)
- ClinicalTrials.gov API - Trial metadata retrieval
- Brevo (Sendinblue) - Transactional email delivery
- JSON - Persistent state tracking
- Excel - Configurable drug watchlists
- CSV - Archival data export
# Test email functionality
python test_email.py
# Test Reddit monitoring (dry run)
python redditMonitoring.py
# Test trial monitoring
python TrialAlert.py

- Logging - Comprehensive logging throughout all modules
- Error Handling - Graceful degradation for API failures
- Rate Limiting - Respectful API usage with built-in delays
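The three points above (logging, graceful degradation, and delays between calls) can be combined in a small retry helper. This is a sketch under stated assumptions rather than the project's implementation: `call_with_retry`, its parameters, and the logger name are all hypothetical.

```python
import logging
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")
logger = logging.getLogger("ctp_tracker.sketch")  # hypothetical logger name


def call_with_retry(
    func: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[T]:
    """Call `func`, retrying with exponential backoff.

    Returns the result of the first successful call, or None if every
    attempt raised, so the caller can skip this source and carry on.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception as exc:  # broad on purpose: any API failure is retried
            logger.warning("Attempt %d/%d failed: %s", attempt, attempts, exc)
            if attempt < attempts:
                # Wait 1x, 2x, 4x ... base_delay between attempts.
                sleep(base_delay * 2 ** (attempt - 1))
    logger.error("Giving up after %d attempts", attempts)
    return None


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)

    def unreliable() -> str:
        raise ConnectionError("simulated outage")

    result = call_with_retry(unreliable, attempts=2, base_delay=0.1)
    print("result:", result)
```

Passing `sleep` as a parameter keeps the helper testable without real waiting. A `None` result lets a monitoring run continue with the remaining drugs instead of aborting.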
# Set up virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Run monitoring scripts
python redditMonitoring.py

- Cron Jobs - Schedule regular monitoring runs
- Docker - Containerize for consistent deployment
- Monitoring - Add health checks and alerting
- Backup - Regular backup of state files
- Google Sheets Integration - Replace Excel with cloud-based watchlists
- Web Dashboard - Real-time monitoring interface
- Advanced Filtering - Sentiment analysis and relevance scoring
- Multiple Social Platforms - Twitter, LinkedIn monitoring
- Machine Learning - Automated drug mention classification
- API Endpoints - REST API for external integrations
- Database Backend - Replace JSON files with proper database
- Alert Customization - Configurable notification preferences
- Async Processing - Improve performance with async/await
- Caching Layer - Reduce API calls with intelligent caching
- Unit Tests - Comprehensive test coverage
- CI/CD Pipeline - Automated testing and deployment
- Documentation - API documentation and user guides
I welcome contributions! Please follow these guidelines:
- Fork the repository
- Create a feature branch: git checkout -b feature/amazing-feature
- Make your changes with proper logging and error handling
- Test thoroughly: python test_email.py
- Commit with descriptive messages: git commit -m 'Add amazing feature'
- Push to your branch: git push origin feature/amazing-feature
- Open a Pull Request
- Follow PEP 8 Python style guidelines
- Add comprehensive docstrings for all functions
- Include logging statements for debugging
- Handle exceptions gracefully
- Add type hints where appropriate
- Use descriptive issue titles
- Include error logs and stack traces
- Specify Python version and environment
- Provide steps to reproduce issues
This project is licensed under the MIT License - see the LICENSE file for details.
- Reddit - Social media platform for drug discussions
- ClinicalTrials.gov - Clinical trial registry and database
- Brevo - Email delivery service
- PRAW - Reddit API wrapper
- Pandas - Data manipulation library
- Playwright - Web automation framework
- BeautifulSoup4 - HTML parsing library
- Octagon Invest - Pharmaceutical investment research
- Medical Community - Reddit medical subreddits and contributors
Note: This system is designed for research and investment intelligence purposes. Always verify information through official sources before making any decisions.