In the rapid-fire world of the Indian stock market, information isn't power — filtered intelligence is. Every day, thousands of corporate filings hit the NSE, but 95% of them are routine noise. As an engineer, the challenge I set out to solve was:
How do we separate a "Board Meeting Intimation" from a "Strategic Pharma FDA Approval" in under 5 seconds?
Enter Bulkbeat TV — a high-concurrency, AI-driven intelligence system designed to process market data with the precision of a Tier-1 Hedge Fund desk.
Retail traders often fail not because they lack data, but because they are "pinged" to death. When a high-impact news item drops, every second counts. If your alert arrives 2 minutes late, the alpha (profit potential) has already vanished — absorbed by the faster hands.
Most existing bots rely on naive keyword matching. Search for "Order win" and alert. Simple. But consider this scenario:
In institutional terms, that's 0.01% of market cap — pure noise. Yet a legacy scraper would fire an alert, creating a false breakout signal and eroding trader trust over time.
My goal was to build a system that thinks before it alerts.
The heart of Bulkbeat TV is a custom Deterministic Intelligence Engine. Instead of just summarizing news, the system runs every filing through a rigorous 22-rule checklist before a single Telegram message is sent.
1. Quantitative Context: The engine doesn't just read a number; it scales it. A ₹100 Crore order is transformational for a ₹400 Crore small-cap (25% of market cap) but negligible for a large-cap like HDFC. The system automatically computes the deal-size-to-market-cap ratio and adjusts the impact score accordingly.
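The scaling step can be sketched roughly as follows. The band thresholds, the score adjustments, and the names `RATIO_BANDS`, `deal_ratio`, and `adjust_score` are illustrative assumptions, not the engine's actual rules.

```python
# Hypothetical bands, checked top-down: (minimum ratio, score adjustment).
# These numbers are illustrative assumptions, not the engine's real rules.
RATIO_BANDS = [
    (0.10, +2),   # deal is at least 10% of market cap: boost
    (0.01, 0),    # 1% to 10%: leave the score alone
    (0.00, -3),   # under 1%: treat as noise and penalise
]


def deal_ratio(deal_cr: float, market_cap_cr: float) -> float:
    """Deal size as a fraction of market cap (both in ₹ Crore)."""
    if market_cap_cr <= 0:
        raise ValueError("market cap must be positive")
    return deal_cr / market_cap_cr


def adjust_score(base_score: int, ratio: float) -> int:
    """Apply the first matching band and clamp the result to 1..10."""
    for min_ratio, delta in RATIO_BANDS:
        if ratio >= min_ratio:
            return max(1, min(10, base_score + delta))
    return max(1, min(10, base_score))  # only reached for negative ratios


print(deal_ratio(100, 400))      # 0.25
print(adjust_score(7, 0.25))     # 9
print(adjust_score(7, 0.0001))   # 4
```

With these assumed bands, the ₹100 Crore order on a ₹400 Crore company gets boosted, while a deal worth 0.01% of market cap is pushed well below any alert threshold.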
2. Source-Aware Skepticism: Not all news is created equal. The engine assigns a different trust tier to each source:
| Source | Trust Tier | Reason |
|---|---|---|
| NSE Direct Filing | HIGH | Regulatory obligation, verified |
| SME Platform | HIGH | Exchange-verified, lower coverage |
| Moneycontrol | MEDIUM | Speed-first, sometimes unverified |
| Economic Times | MEDIUM | Aggregated, risk of recycled news |
A media article about a merger floats through at Trust: MEDIUM. The moment the NSE official filing confirms it, the system upranks the event and fires the high-priority alert. No duplicate noise; just a single, confident signal.
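A minimal sketch of that confirm-then-alert behaviour is shown here. It assumes that a MEDIUM-trust report alone never fires the high-priority alert, and that reports about the same event share a key; the `Event` class, the `ingest` function, and the key format are all hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum


class Trust(IntEnum):
    MEDIUM = 1
    HIGH = 2


# Tiers taken from the table above; the dictionary keys are made-up identifiers.
SOURCE_TRUST = {
    "nse": Trust.HIGH,
    "sme": Trust.HIGH,
    "moneycontrol": Trust.MEDIUM,
    "economic_times": Trust.MEDIUM,
}


@dataclass
class Event:
    key: str             # hypothetical event identity, e.g. "ACME:merger"
    trust: Trust
    alerted: bool = False


def ingest(events: dict, key: str, source: str) -> bool:
    """Record a report and return True when the high-priority alert should fire."""
    trust = SOURCE_TRUST[source]
    event = events.setdefault(key, Event(key, trust))
    # Uprank: an event keeps the highest trust tier seen so far.
    event.trust = max(event.trust, trust)
    # Fire once, the first time the event reaches HIGH trust.
    if event.trust == Trust.HIGH and not event.alerted:
        event.alerted = True
        return True
    return False


events = {}
print(ingest(events, "ACME:merger", "moneycontrol"))  # False
print(ingest(events, "ACME:merger", "nse"))           # True
print(ingest(events, "ACME:merger", "nse"))           # False
```

The `alerted` flag is what keeps a later media recap or a repeated filing from producing a second alert for the same event.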
3. Corporate Action Logic: The system understands temporal relevance. A dividend announcement means nothing if the Ex-Date has already passed. The engine parses the Ex-Date, compares it to the current market clock, and suppresses stale corporate actions automatically.
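The staleness check might look something like this. The ISO date format and the function name `is_stale` are assumptions for illustration, and "passed" is read strictly as "before today in Indian Standard Time".

```python
from datetime import date, datetime, timedelta, timezone

# IST is a fixed UTC+05:30 offset with no daylight saving.
IST = timezone(timedelta(hours=5, minutes=30))


def is_stale(ex_date_iso: str, now: datetime) -> bool:
    """True if the Ex-Date (assumed 'YYYY-MM-DD') is before today's date in IST.

    `now` should be timezone-aware; it is converted to IST before comparing.
    """
    ex_date = date.fromisoformat(ex_date_iso)
    today_ist = now.astimezone(IST).date()
    return ex_date < today_ist


now = datetime(2026, 4, 17, 10, 0, tzinfo=IST)
print(is_stale("2026-04-16", now))  # True
print(is_stale("2026-04-17", now))  # False
```

Converting to IST first matters on a server running in UTC: late evening UTC is already the next calendar day in India.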
4. Noise Suppression at the Ingestion Layer: The very first filter is the most aggressive. Routine filings are categorically discarded in milliseconds, before they ever touch the AI engine:
```python
NOISE_CATEGORIES = [
    "Postal Ballot", "ESOP Grant", "Board Meeting Intimation",
    "Secretarial Audit", "Annual Report Submission",
    "Change of Registered Office", "Investor Presentation",
]


def is_noise(filing_category: str) -> bool:
    """Return True if the category contains any routine-noise label."""
    return any(noise in filing_category for noise in NOISE_CATEGORIES)
```
This single filter eliminates ~60% of all incoming filings before any compute is spent.
The constraint was stark: run this on a resource-constrained 1GB RAM VPS, without losing a single packet of data, even during AI engine overload. Every architectural decision was made under that lens.
The bot scans four independent sources simultaneously: NSE direct filings, the SME platform, Moneycontrol, and Economic Times.
A synchronous scraper would process these sequentially (~8-12 seconds per cycle). The async engine handles all four simultaneously in < 2 seconds, achieving true real-time coverage.
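```python
import asyncio


async def run_all_scrapers(session):
    # One coroutine per source, all sharing the same HTTP session.
    tasks = [
        scrape_nse(session),
        scrape_sme(session),
        scrape_moneycontrol(session),
        scrape_economic_times(session),
    ]
    # return_exceptions=True returns failures as results instead of raising the first one.
    results = await asyncio.gather(*tasks, return_exceptions=True)
    # Keep only the successful results; failed sources are skipped for this cycle.
    return [r for r in results if not isinstance(r, Exception)]
```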
NSE's API is notoriously restrictive. A bot that hammers it with identical headers gets rate-limited within minutes. The solution:
When the API returns 429 Too Many Requests, the engine backs off with exponential delay, then resumes without human intervention.

SQLite is the right choice for a single-node deployment because it requires zero infrastructure overhead. But with default settings it copes poorly with concurrent access: in the default rollback-journal mode, a writer blocks readers. The solution was precision-tuned PRAGMA settings:
```python
# Write-Ahead Logging: Non-blocking concurrent reads during writes
self.conn.execute("PRAGMA journal_mode=WAL")
# Full sync: Every write is durable, even on power loss
self.conn.execute("PRAGMA synchronous=FULL")
# Memory Cap: Lock the page cache at 2MB for our 1GB VPS
self.conn.execute("PRAGMA cache_size = -2000")
# 30-second busy timeout: No crash on lock contention
self.conn.execute("PRAGMA busy_timeout = 30000")
```
This turns SQLite into a crash-safe data store that serves concurrent readers alongside a single writer, without a single external dependency.
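For readers who want to try these settings outside the class, here is a self-contained sketch using Python's standard `sqlite3` module. The helper names `open_tuned` and `read_settings` and the file name `queue.db` are made up for illustration.

```python
import os
import sqlite3
import tempfile


def open_tuned(path: str) -> sqlite3.Connection:
    """Open a database file and apply the four PRAGMAs discussed above."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("PRAGMA synchronous=FULL")
    conn.execute("PRAGMA cache_size = -2000")   # negative value = size in KiB
    conn.execute("PRAGMA busy_timeout = 30000")  # milliseconds
    return conn


def read_settings(conn: sqlite3.Connection) -> dict:
    """Read the effective values back from the connection."""
    names = ("journal_mode", "synchronous", "cache_size", "busy_timeout")
    return {n: conn.execute(f"PRAGMA {n}").fetchone()[0] for n in names}


with tempfile.TemporaryDirectory() as tmp:
    conn = open_tuned(os.path.join(tmp, "queue.db"))
    print(read_settings(conn))
    conn.close()
```

One detail worth knowing: WAL mode is stored in the database file and persists across connections, while `synchronous`, `cache_size`, and `busy_timeout` apply per connection, so they need to be set every time a connection is opened.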
The most critical architectural pattern in the system is the Zero-Loss, Self-Healing Pipeline. The problem it solves: what happens if the AI engine is busy scoring News Item #45 when News Items #46, #47, and #48 arrive from the scrapers?
In a naive system, they'd be dropped. In Bulkbeat TV, every incoming item is immediately persisted to the SQLite queue with processing_status = 0 (Pending). The AI worker picks items off the queue in order, regardless of how long it takes.
```
Scraper → [content_hash dedup] → SQLite Queue (status=0)
                                        ↓
                                 AI Engine Worker
                                        ↓
                        [Score computed] → status=1 (Analyzed)
                                        ↓
                        [Score >= 8] → status=2 (Alerted) → Telegram
```
Deduplication is handled by a dual-layer SHA-256 content hash check — both on the URL and the full content body — preventing the same filing from being processed twice across different scraper runs.
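The queue and the dual hash check could be sketched along these lines. The table name `news_queue`, its columns other than `processing_status`, and the helper functions are assumptions; the scoring function is passed in as a stand-in for the AI engine, and Telegram delivery is left out.

```python
import hashlib
import sqlite3

PENDING, ANALYZED, ALERTED = 0, 1, 2


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def init_queue(conn: sqlite3.Connection) -> None:
    # UNIQUE on both hashes gives the two dedup layers: URL and content body.
    conn.execute("""
        CREATE TABLE IF NOT EXISTS news_queue (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            url_hash TEXT NOT NULL UNIQUE,
            content_hash TEXT NOT NULL UNIQUE,
            url TEXT NOT NULL,
            body TEXT NOT NULL,
            score INTEGER,
            processing_status INTEGER NOT NULL DEFAULT 0
        )""")


def enqueue(conn: sqlite3.Connection, url: str, body: str) -> bool:
    """Persist an item as Pending; return False if either hash was seen before."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO news_queue (url_hash, content_hash, url, body) "
        "VALUES (?, ?, ?, ?)",
        (sha256_hex(url), sha256_hex(body), url, body),
    )
    conn.commit()
    return cur.rowcount == 1


def process_next(conn: sqlite3.Connection, score_fn, threshold: int = 8):
    """Score the oldest Pending item; return (score, status) or None if idle."""
    row = conn.execute(
        "SELECT id, body FROM news_queue WHERE processing_status = ? "
        "ORDER BY id LIMIT 1",
        (PENDING,),
    ).fetchone()
    if row is None:
        return None
    item_id, body = row
    score = score_fn(body)
    # Items at or above the threshold go straight to Alerted in this sketch.
    status = ALERTED if score >= threshold else ANALYZED
    conn.execute(
        "UPDATE news_queue SET score = ?, processing_status = ? WHERE id = ?",
        (score, status, item_id),
    )
    conn.commit()
    return score, status
```

Because every item is committed before scoring begins, a slow or crashed worker leaves rows at status 0, and the next worker run simply picks them up in insertion order.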
The bot operates on a discipline of intentional silence. During market hours (9:15 AM – 3:30 PM), it sends no alerts unless the computed Impact Score ≥ 8 on a scale of 1-10. This is the institutional model: your attention is a finite resource, and every false alert degrades it.
A score of 7 (High-Impact Pharma CDMO contract) gets queued. A score of 9 (US FDA Approval for a key drug) fires immediately.
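As a rough sketch, the gate reduces to one time check and one threshold check. The function name `route` is hypothetical, the close time is treated as exclusive, and weekends and exchange holidays are ignored here for brevity.

```python
from datetime import datetime, time, timedelta, timezone

# IST is a fixed UTC+05:30 offset with no daylight saving.
IST = timezone(timedelta(hours=5, minutes=30))
MARKET_OPEN = time(9, 15)
MARKET_CLOSE = time(15, 30)
ALERT_THRESHOLD = 8


def route(score: int, now: datetime) -> str:
    """Return 'alert' for high scores during market hours, otherwise 'queue'."""
    local = now.astimezone(IST).time()
    in_session = MARKET_OPEN <= local < MARKET_CLOSE
    if in_session and score >= ALERT_THRESHOLD:
        return "alert"
    return "queue"


mid_session = datetime(2026, 4, 17, 10, 0, tzinfo=IST)
print(route(9, mid_session))  # alert
print(route(7, mid_session))  # queue
```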
Post-market (3:30 PM – 8:30 AM), the system pivots to data collection mode. All scored news is queued. At 8:30 AM, 45 minutes before the opening bell, the system synthesizes all overnight intelligence into a single Morning Brief:
📊 BULKBEAT TV | MORNING BRIEF | 17 Apr 2026
🔴 HIGH IMPACT (Score 9)
SUVEN PHARMA | US FDA Approval | +₹XXCr Revenue Pipeline
🟡 WATCHLIST (Score 7-8)
IRCTC | Major Order Win | ₹450Cr (3.2% of Mkt Cap)
ZYDUS | ANDA Approval | Generic Drug #34
📋 MARKET SETUP: 3 Bullish triggers. Bias: POSITIVE.
Traders get institutional-class intelligence before the first tick.
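Assembling the brief from the overnight queue might look roughly like this. The tuple layout, the function name `build_brief`, and the exact band boundaries (9 and above for high impact, 7 to 8 for the watchlist) are assumptions inferred from the sample above; emoji and the market-setup line are omitted.

```python
def build_brief(items, brief_date: str) -> str:
    """Group scored items into bands.

    `items` is an iterable of (symbol, headline, score) tuples.
    """
    items = list(items)
    high = [i for i in items if i[2] >= 9]
    watch = [i for i in items if 7 <= i[2] <= 8]

    lines = [f"BULKBEAT TV | MORNING BRIEF | {brief_date}"]
    if high:
        lines.append("HIGH IMPACT (Score 9+)")
        lines += [f"{symbol} | {headline}" for symbol, headline, _ in high]
    if watch:
        lines.append("WATCHLIST (Score 7-8)")
        lines += [f"{symbol} | {headline}" for symbol, headline, _ in watch]
    return "\n".join(lines)


overnight = [
    ("SUVEN PHARMA", "US FDA Approval", 9),
    ("IRCTC", "Major Order Win", 8),
    ("ZYDUS", "ANDA Approval", 7),
    ("ACME", "Routine Update", 4),   # below both bands, left out of the brief
]
print(build_brief(overnight, "17 Apr 2026"))
```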
Beyond market intelligence, Bulkbeat TV is also a subscription product. The billing engine, internally called "Hisab" (accounting in Hindi), tracks users in Market Days, not calendar days. A user with 30 days doesn't lose a day on Saturday; they lose one only on actual NSE trading sessions.
Every credit and debit is immutably logged in a user_billing_log table — a full audit trail that both users and admins can query at any time.
```
BILLING HUB AUDIT
─────────────────────────────────────────
Event  | Days | Reason      | Balance
─────────────────────────────────────────
CREDIT | +30  | UPI Payment | 30
DEBIT  | -1   | Trading Day | 29
DEBIT  | -1   | Trading Day | 28
```
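The market-day accounting above can be sketched as an append-only ledger. The `Ledger` class and its method names are hypothetical, the holiday set is a caller-supplied assumption, and the sketch treats every weekday that is not in that set as a trading session.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Ledger:
    balance: int = 0
    # Append-only audit trail of (event, days, reason, balance_after) tuples.
    log: list = field(default_factory=list)

    def credit(self, days: int, reason: str) -> None:
        self.balance += days
        self.log.append(("CREDIT", days, reason, self.balance))

    def debit_for(self, day: date, holidays=frozenset()) -> bool:
        """Charge one Market Day; skip weekends, listed holidays, and empty balances."""
        is_trading_day = day.weekday() < 5 and day not in holidays
        if not is_trading_day or self.balance <= 0:
            return False
        self.balance -= 1
        self.log.append(("DEBIT", -1, "Trading Day", self.balance))
        return True


ledger = Ledger()
ledger.credit(30, "UPI Payment")
ledger.debit_for(date(2026, 4, 17))  # Friday: charged
ledger.debit_for(date(2026, 4, 18))  # Saturday: skipped
ledger.debit_for(date(2026, 4, 20))  # Monday: charged
print(ledger.balance)  # 28
```

The resulting log holds one credit and two debits with balances 30, 29, and 28, matching the audit view above.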
This project started as a personal gap I identified in the Indian retail trading ecosystem. The tools available to independent traders are years behind what institutional desks use. Yet the talent and engineering horsepower to bridge that gap exist right here.
Bulkbeat TV is proof that with the right combination of async engineering, deterministic AI rules, and disciplined system design, we can deliver Tier-1 Hedge Fund intelligence to the phones of independent traders — running on a ₹800/month VPS.
The next phase includes:
This project was a masterclass in building for constraints — where reliability, latency, and signal precision matter more than fancy ML models.
#FinTech #NSE #ArtificialIntelligence #Python #EngineeringExcellence #MadeInIndia