AI-Powered Stock News Sentiment Analyzer

Full-stack AI pipeline that scrapes real-time S&P 500 press releases, runs them through multi-model GPT sentiment analysis, and produces actionable buy/sell signals on a color-coded dashboard.

Timeline: ~4 months — Oct 2023 to Jan 2024

Key Results

500+ companies

S&P 500 Coverage

Full S&P 500 index loaded from curated JSON dataset for symbol selection

3 models

AI Models Integrated

GPT-4o (sentiment), GPT-4 (filtering), GPT-3.5-turbo (decisions) — plus experimental Gemini support

Every 1–2 min

Automation Frequency

bridge:run polls every minute, scrap:engine processes every two minutes with overlap protection

3 stages

Pipeline Stages

Scrape → Filter (relevance check) → Analyze (sentiment scoring) — fully automated per stock symbol

The Problem

Retail investors and traders monitoring S&P 500 stocks lacked a fast, automated way to consume breaking press releases and understand their likely impact on stock prices — manually reading dozens of articles per symbol per day was slow, inconsistent, and error-prone.

  • Traders wasted hours manually browsing Yahoo Finance press releases across multiple stock symbols every day.
  • Human sentiment interpretation of news is subjective and inconsistent — the same article might get different readings on different days.
  • No existing tool combined real-time press release scraping with AI-driven sentiment scoring into a single actionable dashboard.
  • Irrelevant articles (commentary, analysis pieces, tangentially related news) polluted the signal, requiring manual filtering before analysis.

The Solution

Built an end-to-end pipeline that uses headless browser automation to scrape Yahoo Finance press releases, filters them for relevance via GPT-4, sends each article through GPT-4o for structured sentiment analysis (positive/negative/neutral), and presents results in a color-coded web dashboard with buy/sell signals.

  1. Built a headless browser scraping engine using Laravel Dusk (Selenium/ChromeDriver) to navigate Yahoo Finance press release pages for each stock symbol, extracting article titles, links, publish timestamps, and current stock prices.
  2. Integrated the OpenAI GPT-4o API via a dedicated AiService with custom system prompts — the AI returns structured JSON with impact (+1/0/−1), illustration, and deep_thinking fields for each article.
  3. Added a GPT-4-powered article relevance filter to pre-screen articles, discarding commentary and tangentially related content so that only breaking news passes through to sentiment analysis.
  4. Designed a relational schema (reports → stocks → cards) with status tracking and a draft table to deduplicate previously processed article links.
  5. Created two Artisan commands: bridge:run (syncs unprocessed reports from production every minute) and scrap:engine (processes one stock per run every two minutes), both scheduled with overlap protection.
  6. Implemented a split local-processing / remote-serving architecture — scraping and AI analysis run locally, then results are pushed to the production server via a webhook endpoint (api/webhook/receive-data).
  7. Built a Blade-based dashboard with Select2 multi-select for S&P 500 company selection, a reports table showing processing status, and a news report view with color-coded impact cards (green/orange/red).
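The two Artisan commands above can be registered in Laravel's scheduler roughly as follows — a sketch assuming the default `App\Console\Kernel`; the command names and intervals come from the project, everything else is illustrative:

```php
<?php
// app/Console/Kernel.php (sketch) — schedules bridge:run and scrap:engine
// with overlap protection, as described in the steps above.

namespace App\Console;

use Illuminate\Console\Scheduling\Schedule;
use Illuminate\Foundation\Console\Kernel as ConsoleKernel;

class Kernel extends ConsoleKernel
{
    protected function schedule(Schedule $schedule): void
    {
        // Sync unprocessed reports from production every minute.
        $schedule->command('bridge:run')
            ->everyMinute()
            ->withoutOverlapping();

        // Process one stock per run every two minutes.
        $schedule->command('scrap:engine')
            ->everyTwoMinutes()
            ->withoutOverlapping();
    }
}
```

`withoutOverlapping()` is what prevents a slow scrape from stacking up concurrent runs of the same command.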

Architecture Decisions

Key technical decisions made during the project and the reasoning behind them.

Local Processing + Production Sync via Webhooks

Reasoning

The scraping engine uses headless Chrome (Laravel Dusk), which is resource-intensive and unreliable on shared hosting. By running scraping and AI analysis locally and syncing results to production via webhooks, the system avoids server limitations while keeping the public-facing app lightweight.

Outcome

Enabled heavy browser automation and AI API calls without production server constraints. The bridge:run command polls for new reports every minute and scrap:engine processes them locally.
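The local-to-production push can be sketched with Laravel's HTTP client — the endpoint path (api/webhook/receive-data) is from the project, while the host, token source, and payload shape are assumptions:

```php
<?php
// Sketch: push locally processed results to the production webhook.
// The endpoint path is from the project; the host, token lookup, and
// payload fields are illustrative assumptions.

use Illuminate\Support\Facades\Http;

$payload = [
    'report_id' => $report->id,
    'cards'     => $cards, // analyzed articles with impact scores
];

$response = Http::withToken(config('services.bridge.token'))
    ->post('https://example.com/api/webhook/receive-data', $payload);

if (! $response->successful()) {
    logger()->error('Webhook sync failed', ['status' => $response->status()]);
}
```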

Multi-Model GPT Pipeline (GPT-4o + GPT-4 + GPT-3.5-turbo)

Reasoning

Different tasks required different accuracy/cost tradeoffs. GPT-4o provided the best structured output for sentiment analysis, GPT-4 was accurate enough for binary relevance filtering, and GPT-3.5-turbo handled lighter decision tasks cost-effectively.

Outcome

AI returns structured impact scores that drive buy/sell signals and color-coded cards. Custom prompts are stored in a database prompts table, making them configurable per report without code changes.
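The structured impact score maps directly onto the dashboard's signals and card colors. A minimal sketch of that mapping — the +1/0/−1 values and green/orange/red colors are from the project; the function name and the "hold" label for neutral are assumptions:

```php
<?php
// Sketch: map the model's structured impact score (+1/0/-1) to a
// trading signal and the dashboard card color. The function name and
// the "hold" label are hypothetical; values come from the write-up.

function impactToSignal(int $impact): array
{
    return match ($impact) {
        1  => ['signal' => 'buy',  'color' => 'green'],
        0  => ['signal' => 'hold', 'color' => 'orange'],
        -1 => ['signal' => 'sell', 'color' => 'red'],
        default => throw new InvalidArgumentException("Unexpected impact: $impact"),
    };
}
```

Keeping this mapping in one pure function means the Blade cards and any future alerting logic read from the same source of truth.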

Laravel Dusk over HTTP-Based Scraping

Reasoning

Yahoo Finance press release pages render content dynamically via JavaScript. Traditional HTTP + DOM parsing couldn't access JS-rendered content. Laravel Dusk with headless Chrome handles SPAs and dynamic content natively.

Outcome

Reliable extraction of press release titles, links, datetime stamps, and stock prices from JavaScript-rendered Yahoo Finance pages.
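A sketch of the Dusk extraction step, written in the test-case style Dusk provides (`$this->browse`); the Yahoo Finance URL pattern is real, but the CSS selectors and wait target are illustrative assumptions, not the project's exact ones:

```php
<?php
// Sketch: scrape a symbol's press-release list with Laravel Dusk.
// Selectors and the waitFor target are assumptions; the scrap:engine
// command may construct the Browser instance directly instead.

use Laravel\Dusk\Browser;

$articles = [];

$this->browse(function (Browser $browser) use (&$articles) {
    $browser->visit('https://finance.yahoo.com/quote/AAPL/press-releases')
        ->waitFor('li.stream-item', 15); // wait for the JS-rendered list

    foreach ($browser->elements('li.stream-item h3 a') as $link) {
        $articles[] = [
            'title' => $link->getText(),
            'url'   => $link->getAttribute('href'),
        ];
    }
});
```

The `waitFor` call is the key difference from HTTP scraping: it blocks until the JavaScript-rendered list actually exists in the DOM.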

Configurable Prompt System via Database

Reasoning

Different analysis contexts require different AI prompts. Rather than hardcoding prompts, a prompts table stores primary and secondary prompts linked to reports via foreign key, allowing per-report prompt customization.

Outcome

Prompts can be updated in the database without redeployment — supports A/B testing different analysis strategies per stock batch.
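Per-report prompt resolution can be sketched as below — the relation and column names (`prompt`, `primary_prompt`, `secondary_prompt`) are assumptions based on the schema described:

```php
<?php
// Sketch: resolve the system prompt for a report from the prompts table.
// Relation and column names are assumptions based on the description
// above (primary and secondary prompts linked to reports via FK).

$prompt = $report->prompt; // belongsTo via the prompts foreign key

// Prefer the secondary prompt when one is set; fall back to primary.
$systemPrompt = $prompt->secondary_prompt ?? $prompt->primary_prompt;

$analysis = $aiService->analyze($article->title, $systemPrompt);
```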

The Tech Stack

Laravel 9

PHP 8.0+ — Artisan commands for scheduled scraping, Eloquent ORM for report/stock/prompt models, Blade templating, Laravel Sanctum API auth, task scheduling with overlap protection

OpenAI GPT-4o

Multi-model AI pipeline — GPT-4o for primary sentiment analysis returning structured JSON, GPT-4 for article relevance filtering, GPT-3.5-turbo for lighter decision tasks
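The AiService call to GPT-4o can be sketched as a raw POST to OpenAI's Chat Completions endpoint; the impact/illustration/deep_thinking field names are from the project, while the exact prompt text and config keys are assumptions:

```php
<?php
// Sketch: request structured JSON sentiment from GPT-4o via the
// Chat Completions API. Config key and prompt wording are assumptions;
// the expected response fields come from the project write-up.

use Illuminate\Support\Facades\Http;

$response = Http::withToken(config('services.openai.key'))
    ->post('https://api.openai.com/v1/chat/completions', [
        'model'           => 'gpt-4o',
        'response_format' => ['type' => 'json_object'], // force valid JSON
        'messages'        => [
            ['role' => 'system', 'content' => $systemPrompt],
            ['role' => 'user',   'content' => $article->title],
        ],
    ]);

$result = json_decode($response->json('choices.0.message.content'), true);
// Expected shape: ['impact' => 1|0|-1, 'illustration' => ..., 'deep_thinking' => ...]
```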

Laravel Dusk

Headless browser automation (Selenium/ChromeDriver) for scraping Yahoo Finance press release pages and extracting JS-rendered stock data

Selenium

ChromeDriver-based headless browsing for navigating dynamically rendered Yahoo Finance pages

MySQL

Relational database — reports, stocks, prompts, and draft tables with foreign key relationships and cascading deletes
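The cascading relationships can be sketched as a Laravel migration — the reports → stocks link and cascading deletes are from the project; column names beyond those listed are assumptions:

```php
<?php
// Sketch: the reports → stocks foreign key with cascading deletes.
// Column names beyond those described in the write-up are assumptions.

use Illuminate\Database\Migrations\Migration;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;

return new class extends Migration {
    public function up(): void
    {
        Schema::create('stocks', function (Blueprint $table) {
            $table->id();
            $table->foreignId('report_id')
                ->constrained()
                ->cascadeOnDelete(); // deleting a report removes its stocks
            $table->string('symbol', 10);
            $table->string('status')->default('pending'); // status tracking
            $table->timestamps();
        });
    }

    public function down(): void
    {
        Schema::dropIfExists('stocks');
    }
};
```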

Blade

Server-side templating for the dashboard — Select2 multi-select dropdowns, color-coded impact cards, reports table

Bootstrap 4

Responsive UI framework for the dashboard layout, report tables, and news card components

Markets.sh API

Financial data API integration for fetching portfolios, stock quotes, watchlists, symbol searches, and news articles with date range filtering

The Impact

A production AI-powered stock analysis pipeline covering all 500+ S&P 500 companies with a 3-stage automated pipeline (scrape → filter → analyze) running every 1–2 minutes.
