Skip to content

Code Review #33

@4rthurCai

Description

@4rthurCai

Code Review Report

Date: 2025-12-01
Generated by Claude Code.


High Severity Issues

1. CSRF Protection Disabled for Authentication Endpoints

File: apps/auth/views.py:27-29, 41, 147, 472, 512
Severity: High
Type: Authentication Bypass / CSRF

Issue:

class CsrfExemptSessionAuthentication(SessionAuthentication):
    def enforce_csrf(self, request):
        return  # CSRF protection completely disabled

@api_view(["POST"])
@authentication_classes([CsrfExemptSessionAuthentication])  # Used on auth endpoints
@permission_classes([AllowAny])
def auth_initiate_api(request):

Impact:

  • Attackers can perform CSRF attacks against authentication endpoints
  • Users can be tricked into initiating auth flows, logging in, or logging out
  • While Turnstile provides some protection, CSRF tokens are the standard defense

Recommendation:
Use Django's standard CSRF protection with proper CSRF token handling in the frontend. If you must exempt CSRF for specific technical reasons, document why and add compensating controls:

# Only exempt CSRF if absolutely necessary and add alternative protections
# Consider using custom headers or origin validation instead

2. Missing Authorization Check on User Review Endpoints

File: apps/web/views.py:346-359
Severity: High
Type: Broken Authorization

Issue:

def get_queryset(self):
    """Only reviews belonging to the authenticated user with vote annotations."""
    return Review.objects.with_votes(
        request_user=self.request.user, user=self.request.user
    )

The queryset filtering is correct, but Django REST Framework's RetrieveModelMixin could potentially be vulnerable if the lookup is manipulated.

Impact:

  • While current implementation looks safe due to queryset filtering, it relies entirely on the ORM
  • No explicit authorization check in retrieve/update/delete operations
  • Potential for IDOR (Insecure Direct Object Reference) vulnerabilities

Recommendation:
Add explicit permission checks:

def get_object(self):
    obj = super().get_object()
    # Explicit authorization check
    if obj.user != self.request.user:
        raise PermissionDenied("You don't have permission to access this review")
    return obj

Medium Severity Issues

3. Race Condition in Vote System

File: apps/web/models/vote.py:10-41
Severity: Medium
Type: Race Condition / Data Integrity

Issue:

@transaction.atomic
def vote(self, value, course_id, category, user):
    # No select_for_update lock
    course = Course.objects.get(id=course_id)
    vote, created = self.get_or_create(course=course, category=category, user=user)

    # Race condition: concurrent votes can corrupt scores
    if not created:
        if category == Vote.CATEGORIES.QUALITY:
            course.quality_score -= vote.value  # Not atomic with read

Impact:

  • Under concurrent voting, scores can become incorrect
  • Two simultaneous votes could:
    1. Both read the same initial score
    2. Both modify it independently
    3. Last write wins, losing one vote's contribution
  • Data integrity violation

Recommendation:

@transaction.atomic
def vote(self, value, course_id, category, user):
    # Lock the course row to prevent concurrent modifications
    course = Course.objects.select_for_update().get(id=course_id)
    vote, created = self.get_or_create(course=course, category=category, user=user)
    # ... rest of logic

4. N+1 Query Problem in Course Serializer

File: apps/web/serializers.py:237-243, 138, 246, 256-274
Severity: Medium
Type: Performance / N+1 Queries

Issue:

def get_review_set(self, obj):
    request = self.context.get("request")
    if request and request.user.is_authenticated:
        return ReviewSerializer(
            obj.review_set.all(), many=True, context=self.context
        ).data  # Triggers N+1 queries for each review's votes
    return []

def get_review_count(self, obj):
    return obj.review_set.count()  # Separate query instead of annotation

Impact:

  • For a course with 50 reviews, this generates 50+ additional queries
  • Severe performance degradation as review counts grow
  • API response times increase linearly with review count

Recommendation:

# In the view's get_queryset:
queryset = Course.objects.annotate(
    review_count=Count('review')
).prefetch_related(
    Prefetch('review_set', queryset=Review.objects.with_votes(request.user))
)

# In serializer:
def get_review_count(self, obj):
    return obj.review_count  # Use annotation

5. Inefficient Average Calculation

File: apps/web/models/vote.py:43-52
Severity: Medium
Type: Performance

Issue:

def _calculate_average_score(self, course, category):
    votes = self.filter(course=course, category=category)
    if not votes.exists():
        return 0

    total_score = sum(vote.value for vote in votes)  # Loads all votes into memory
    vote_count = votes.count()  # Separate query
    return round(total_score / vote_count, 1)

Impact:

  • Loads all vote objects into memory
  • For popular courses with 1000+ votes, this is wasteful
  • Makes two database queries instead of one

Recommendation:

from django.db.models import Avg

def _calculate_average_score(self, course, category):
    result = self.filter(course=course, category=category).aggregate(
        avg_score=Avg('value')
    )
    avg = result['avg_score']
    return round(avg, 1) if avg is not None else 0

6. Missing Input Validation on Vote Value

File: apps/web/views.py:499-514
Severity: Medium
Type: Input Validation

Issue:

@api_view(["POST"])
@permission_classes([IsAuthenticated])
def course_vote_api(request, course_id):
    try:
        value = request.data["value"]  # No validation here
        forLayup = request.data["forLayup"]
    except KeyError:
        # ...

    # int() can raise ValueError
    # Value validation happens later in vote() which silently returns None
    new_score, is_unvote, new_vote_count = Vote.objects.vote(
        int(value), course_id, category, request.user
    )

Impact:

  • Non-integer values cause ValueError
  • Out-of-range values (e.g., 1000) fail silently
  • Poor error messages for API consumers

Recommendation:

def course_vote_api(request, course_id):
    try:
        value = int(request.data["value"])
        forLayup = request.data["forLayup"]
    except (KeyError, ValueError, TypeError):
        return Response(
            {"detail": "Invalid request. 'value' must be an integer between 1 and 5, 'forLayup' must be boolean"},
            status=400
        )

    if not 1 <= value <= 5:
        return Response({"detail": "Vote value must be between 1 and 5"}, status=400)

    # ... rest of code

7. Denial of Service - No Timeout on HTTP Requests

File: apps/spider/utils.py:36
Severity: Medium
Type: Denial of Service

Issue:

def retrieve_soup(url, data=None, preprocess=lambda x: x):
    print(url)
    if data is not None:
        data = data.encode("utf-8")
    with urllib_request.urlopen(url, data=data) as response:  # No timeout!
        return BeautifulSoup(preprocess(response.read().decode("utf-8")), "html.parser")

Impact:

  • If the remote server doesn't respond, this will hang indefinitely
  • Can block Celery workers permanently
  • Potential for resource exhaustion

Recommendation:

with urllib_request.urlopen(url, data=data, timeout=30) as response:
    return BeautifulSoup(preprocess(response.read().decode("utf-8")), "html.parser")

8. Crash on Missing Hardcoded User

File: apps/analytics/views.py:24
Severity: Medium
Type: Error Handling

Issue:

@staff_member_required
def home(request):
    course_picker = User.objects.get(username="CoursePicker")  # Crashes if not found!

Impact:

  • If "CoursePicker" user doesn't exist, the entire analytics dashboard crashes
  • User.DoesNotExist exception will be raised
  • Hardcoded dependency on specific database state

Recommendation:

course_picker = User.objects.filter(username="CoursePicker").first()
if course_picker:
    # filter out course_picker from queries
else:
    # handle case where user doesn't exist

9. Division by Zero / ValueError in Random Selection

File: apps/analytics/views.py:201-203
Severity: Medium
Type: Error Handling

Issue:

unlabeled_reviews = models.Review.objects.filter(...).exclude(...)
count = unlabeled_reviews.count()
random_index = randint(0, count - 1)  # Crashes if count == 0!
review = unlabeled_reviews[random_index]

Impact:

  • If no unlabeled reviews exist, randint(0, -1) raises ValueError
  • Crashes the sentiment labeler page
  • No graceful handling of empty state

Recommendation:

count = unlabeled_reviews.count()
if count == 0:
    return render(request, "sentiment_labeler.html", {"no_reviews": True})
random_index = randint(0, count - 1)

10. N+1 Query in Recommendations Task

File: apps/recommendations/tasks.py:95, 113
Severity: Medium
Type: Performance

Issue:

for i in range(psarray.shape[0]):
    current_class = Course.objects.get(id=course_ids[i])  # Query inside loop!
    # ...
    for other_i in np.argpartition(...):
        course_id = course_ids[other_i]
        # Another potential query

Impact:

  • For 1000 courses, this runs 1000+ individual queries
  • Significantly slows down recommendation generation
  • Inefficient database usage

Recommendation:

# Pre-fetch all courses into a dictionary
courses_dict = {c.id: c for c in Course.objects.filter(id__in=course_ids)}

for i in range(psarray.shape[0]):
    current_class = courses_dict[course_ids[i]]  # O(1) lookup

Low Severity Issues

11. Missing Permission Checks on Read-Only Endpoints

File: apps/web/views.py:396, 436, 459
Severity: Low
Type: Missing Access Control

Issue:

@api_view(["GET"])
def medians(request, course_id):  # No @permission_classes decorator
    # ...

@api_view(["GET"])
def course_professors(request, course_id):  # No permission check
    # ...

@api_view(["GET"])
def course_instructors(request, course_id):  # No permission check
    # ...

Impact:

  • These endpoints are accessible to unauthenticated users
  • May expose information you want to restrict
  • Inconsistent with other endpoints that require authentication

Recommendation:
Decide on intended access level and add decorators:

@api_view(["GET"])
@permission_classes([AllowAny])  # Explicit is better than implicit
def medians(request, course_id):
    # ...

Or restrict to authenticated users if appropriate:

@api_view(["GET"])
@permission_classes([IsAuthenticated])
def course_professors(request, course_id):
    # ...

12. Broad Exception Handling

File: apps/web/views.py:575-579
Severity: Low
Type: Error Handling

Issue:

except Exception:
    return Response(
        {"detail": "An error occurred processing your request"},
        status=500,
    )

Impact:

  • Catches all exceptions, including KeyboardInterrupt, SystemExit
  • Masks programming errors
  • Makes debugging difficult
  • Generic error message provides no useful information

Recommendation:

except (DatabaseError, ValueError) as e:
    logger.exception("Error processing review vote")
    return Response(
        {"detail": "An error occurred processing your request"},
        status=500,
    )

13. Potential N+1 in Course.get_instructors()

File: apps/web/models/course.py:185-208
Severity: Low
Type: Performance

Issue:

def get_instructors(self, term=CURRENT_TERM):
    instructors = []
    offerings = self.courseoffering_set.all()  # Prefetch needed

    if term:
        offerings = offerings.filter(term=term)

    for offering in offerings:
        for instructor in offering.instructors.all():  # N+1 query
            instructors.append(instructor)

Impact:

  • If not prefetched, generates one query per offering
  • Called in serializers which could amplify the problem

Recommendation:

def get_instructors(self, term=CURRENT_TERM):
    instructors = []
    offerings = self.courseoffering_set.prefetch_related('instructors').all()

    if term:
        offerings = offerings.filter(term=term)

    # Use set for O(1) lookups instead of list
    seen_ids = set()
    unique_instructors = []
    for offering in offerings:
        for instructor in offering.instructors.all():
            if instructor.id not in seen_ids:
                seen_ids.add(instructor.id)
                unique_instructors.append(instructor)

    return unique_instructors

14. Log Injection Vulnerability

File: apps/web/views.py:564
Severity: Low (Now Fixed)
Type: Log Injection

Recommendation:

logger.warning("Review %s not found for voting", str(review_id).replace('\n', '').replace('\r', ''))

Already posed by copilot check.


Additional Recommendations

Code Quality Improvements

  1. Use Database Indexes: Ensure frequently queried fields have indexes

    • Review.professor (already indexed ✓)
    • Review.term (already indexed ✓)
    • Consider composite indexes for common query patterns
  2. Add Rate Limiting: Implement rate limiting on voting endpoints

    # Add throttling classes to views
    from rest_framework.throttling import UserRateThrottle
    
    class VoteThrottle(UserRateThrottle):
        rate = '100/hour'
  3. Add Monitoring: Implement logging and monitoring for:

    • Failed authentication attempts
    • Suspicious voting patterns
    • Slow queries
    • Exception rates
  4. Security Headers: Add security headers in settings

    SECURE_BROWSER_XSS_FILTER = True
    SECURE_CONTENT_TYPE_NOSNIFF = True
    X_FRAME_OPTIONS = 'DENY'
  5. Input Sanitization: While Django ORM prevents SQL injection, ensure:

    • User-generated content is properly escaped in templates
    • External API responses are validated
    • File uploads (if any) are properly validated

Summary Statistics

  • High Severity Issues: 2
  • Medium Severity Issues: 8
  • Low Severity Issues: 4
  • Total Issues: 14

Metadata

Metadata

Labels

bugSomething isn't workinghelp wantedExtra attention is needed

Type

No fields configured for Bug.

Projects

Status

Done

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions