A Node.js-based ETL (Extract, Transform, Load) system that processes user revenue events through a client-server architecture and updates a PostgreSQL database.
The project consists of three main components:
- Client: Reads events from a JSONL file and sends them to the server
- Server: Receives events via HTTP, writes them to server events file
- Data Processor: Processes server events files and updates the PostgreSQL database
- Node.js
- PostgreSQL database
- npm package manager
- Clone the repository
- Install dependencies:
npm install- Run the
db.sqlfile in your PostgreSQL database to create theusers_revenuetable.
npm run serverThe server will:
- Listen on port 5000
- Accept POST requests at
/liveEvent - Write events to
server_events_YYYYMMDDHHMM.jsonlfiles - Append to a new file every minute (this is done so data processor can potentially run on different server files instead of only one)
- Provide user revenue data via GET
/userEvents/:userid
npm run clientThe client will:
- Read events from
events.jsonl - Send events to server's
/liveEventendpoint - Handle concurrent requests (limit: 25)
- Authenticate using the secret key
npm run processor -- <filename>Where <filename> is a server event file (e.g., server_events_202501251430.jsonl)
The processor will:
- Read events from the specified file
- Process events in batches (1000 events per batch)
- Update user revenue in the database using transactions
- Handle concurrent updates with retry mechanism
The data processor has been tested with:
- 3 simultaneous data processor instances running on different server files
- ~200 unique users
- ~120MB server files
- Local PostgreSQL database