Batch Operation
🔹 What is a Batch Operation?
👉 Batch operation = performing many operations together in one go
In simple terms:
Instead of sending separate requests one after another, you group them and send them together
1️⃣ Simple Example (Real Life)
Suppose you are uploading 10 files:
❌ Without Batch
You have to press the upload button 10 times 😫
✅ With Batch
Select all 10 files at once → upload 😎
👉 That is a batch operation
2️⃣ Programming Example
❌ Without Batch
insert user1
insert user2
insert user3
👉 3 separate operations
✅ With Batch
insert [user1, user2, user3]
👉 Everything in one operation
3️⃣ SQL Example
INSERT INTO Customers (id, name)
VALUES
(1, 'Ali'),
(2, 'John'),
(3, 'Marie');
👉 One query → multiple rows inserted
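👉 To try the multi-row INSERT from code, here is a minimal sketch using Python's built-in sqlite3 module. The in-memory database and the column types are assumptions made for illustration; only the table name, columns, and rows come from the example above.

```python
import sqlite3

# In-memory database, so the example leaves nothing behind.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (id INTEGER PRIMARY KEY, name TEXT)")

customers = [(1, "Ali"), (2, "John"), (3, "Marie")]

# Build one INSERT with a "(?, ?)" group per row, then flatten the values.
placeholders = ", ".join(["(?, ?)"] * len(customers))
params = [value for row in customers for value in row]

with conn:  # commits once for the whole batch
    conn.execute(
        f"INSERT INTO Customers (id, name) VALUES {placeholders}", params
    )

rows = conn.execute("SELECT id, name FROM Customers ORDER BY id").fetchall()
print(rows)
```

Using placeholders instead of pasting values into the SQL string keeps the batch safe from SQL injection.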
4️⃣ OpenSearch Example
POST _bulk
{ "index": { "_index": "customers", "_id": 1 } }
{ "name": "Ali" }
{ "index": { "_index": "customers", "_id": 2 } }
{ "name": "John" }
{ "index": { "_index": "customers", "_id": 3 } }
{ "name": "Marie" }
👉 One API call → multiple documents indexed
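👉 The _bulk body is NDJSON: one action line, then one document line, each on its own line, with a newline at the end. A minimal sketch of building that body in Python follows. The helper name build_bulk_body is hypothetical, and sending the body over HTTP is left out on purpose.

```python
import json


def build_bulk_body(index_name, docs):
    """Return an NDJSON string: one action line plus one source line per doc."""
    lines = []
    for doc_id, source in docs:
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        lines.append(json.dumps(source))
    # The bulk body must end with a newline character.
    return "\n".join(lines) + "\n"


body = build_bulk_body(
    "customers",
    [(1, {"name": "Ali"}), (2, {"name": "John"}), (3, {"name": "Marie"})],
)
print(body)
```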
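5️⃣ Why use Batch Operations?
🚀 1. Fast
1000 requests → 1 request
The per-request overhead is paid once instead of 1000 times
⚡ 2. Less Network Cost
Fewer API calls → fewer round trips → lower total latency
🔄 3. Efficient Processing
The server can process the work as a group instead of handling each item separately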
6️⃣ Where are Batch Operations used?
Database insert/update/delete
File upload
Email sending (bulk email)
Data migration
Logging systems
🔚 Final Summary
Batch operation = doing multiple tasks in one go
*********************************************************************************
🔹 1️⃣ Normal API vs Bulk API (Problem)
❌ Without Bulk
Suppose you are inserting 1000 documents:
Client → Server (request 1)
Client → Server (request 2)
Client → Server (request 3)
...
1000 times 😵
👉 Problem:
1000 HTTP requests
1000 network round trips (latency paid each time)
1000 times the per-request connection overhead
✅ With Bulk
Client → Server (1 request containing 1000 docs) 😎
👉 Huge improvement!
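👉 A rough back-of-the-envelope model shows where the gain comes from. The numbers below (5 ms fixed overhead per request, 0.25 ms of work per document) are made-up assumptions, not measurements, and the function name total_time_ms is hypothetical.

```python
def total_time_ms(num_docs, docs_per_request, overhead_ms, per_doc_ms):
    """Toy cost model: fixed overhead per request plus work per document."""
    num_requests = -(-num_docs // docs_per_request)  # ceiling division
    return num_requests * overhead_ms + num_docs * per_doc_ms


# Assumed numbers, for illustration only.
one_by_one = total_time_ms(1000, 1, overhead_ms=5.0, per_doc_ms=0.25)
bulk = total_time_ms(1000, 1000, overhead_ms=5.0, per_doc_ms=0.25)
print(one_by_one, bulk)  # prints: 5250.0 255.0
```

The per-document work stays the same in both cases. Only the fixed overhead shrinks, from 1000 payments to 1.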
🔹 2️⃣ Bulk API Internal Flow
Internally, the Bulk API follows a few steps:
🧩 Step 1: Receive Request
Client → OpenSearch Node (Coordinator Node)
All the data arrives together (NDJSON format)
The coordinator node handles the request
🧩 Step 2: Parse & Split
Coordinator node:
Bulk request → split into individual operations
Example:
Doc1 → shard A
Doc2 → shard B
Doc3 → shard A
👉 Operations are grouped by the shard each document routes to (by default, chosen from a hash of the document _id)
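👉 A simplified sketch of the split step in Python. This is not OpenSearch's code: zlib.crc32 is a stand-in for the real routing hash (OpenSearch uses its own hash function on the routing value), and the names shard_for and split_by_shard are hypothetical.

```python
import zlib
from collections import defaultdict


def shard_for(doc_id, num_shards):
    """Pick a shard from a hash of the id (crc32 is only a stand-in hash)."""
    return zlib.crc32(str(doc_id).encode("utf-8")) % num_shards


def split_by_shard(operations, num_shards):
    """Group (doc_id, source) pairs by the shard they route to."""
    groups = defaultdict(list)
    for doc_id, source in operations:
        groups[shard_for(doc_id, num_shards)].append((doc_id, source))
    return dict(groups)


ops = [(1, {"name": "Ali"}), (2, {"name": "John"}), (3, {"name": "Marie"})]
print(split_by_shard(ops, num_shards=2))
```

The key property is that the same id always lands on the same shard, so later reads and updates know where to look.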
🧩 Step 3: Parallel Processing ⚡
The most powerful part:
Shard A → process doc1, doc3
Shard B → process doc2
👉 All shards work in parallel at the same time
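👉 The idea of per-shard parallel work can be sketched with a thread pool. This only illustrates the pattern: process_shard is a hypothetical placeholder for the real indexing work, and the shard names and documents are taken from the example above.

```python
from concurrent.futures import ThreadPoolExecutor


def process_shard(shard_id, docs):
    """Placeholder for per-shard indexing: just report which ids were handled."""
    return shard_id, [doc_id for doc_id, _ in docs]


shard_groups = {
    "A": [(1, {"name": "Ali"}), (3, {"name": "Marie"})],
    "B": [(2, {"name": "John"})],
}

# One task per shard group; the groups are processed concurrently.
with ThreadPoolExecutor(max_workers=len(shard_groups)) as pool:
    futures = [
        pool.submit(process_shard, shard_id, docs)
        for shard_id, docs in shard_groups.items()
    ]
    results = dict(future.result() for future in futures)

print(results)  # prints: {'A': [1, 3], 'B': [2]}
```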
🧩 Step 4: Write to Segment
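Documents are first written to an in-memory buffer (and recorded in the translog for durability)
Later they are written out as Lucene segments (a refresh makes them searchable, a flush commits them to disk)
👉 A lot of data is written at once → fast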
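👉 A toy model of "buffer first, write in batches" in Python. This is not how Lucene is implemented: the class name SegmentBuffer and the count-based threshold are assumptions chosen to show the pattern.

```python
class SegmentBuffer:
    """Toy model: collect docs in memory, write them out in groups."""

    def __init__(self, flush_threshold):
        self.flush_threshold = flush_threshold
        self.buffer = []    # docs waiting in memory
        self.segments = []  # each flush produces one "segment"

    def add(self, doc):
        self.buffer.append(doc)
        if len(self.buffer) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Write everything buffered so far as one segment, then reset.
        if self.buffer:
            self.segments.append(list(self.buffer))
            self.buffer.clear()


writer = SegmentBuffer(flush_threshold=3)
for i in range(7):
    writer.add({"id": i})
writer.flush()  # write out whatever is left
print([len(segment) for segment in writer.segments])  # prints: [3, 3, 1]
```

Seven documents cause only three writes instead of seven, which is the saving batching gives on the disk side.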
🧩 Step 5: Response Return
OpenSearch → Client (single response)
The results of all operations come back in one response, with a separate status for each item
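👉 Because a bulk response reports a result per item, a request can succeed as a whole while some documents fail. A minimal sketch of picking out the failures follows. The helper name failed_items is hypothetical, and sample_response is a hand-written, shortened illustration of the response shape (top-level errors flag plus an items list), not real server output.

```python
def failed_items(bulk_response):
    """Return (action, _id, status, error) for every item that failed."""
    if not bulk_response.get("errors"):
        return []
    failures = []
    for item in bulk_response.get("items", []):
        # Each item has one key naming the action: index, create, update, delete.
        action, result = next(iter(item.items()))
        if "error" in result:
            failures.append(
                (action, result.get("_id"), result.get("status"), result["error"])
            )
    return failures


# Hand-written sample for illustration only.
sample_response = {
    "took": 30,
    "errors": True,
    "items": [
        {"index": {"_index": "customers", "_id": "1", "status": 201}},
        {
            "index": {
                "_index": "customers",
                "_id": "2",
                "status": 400,
                "error": {"type": "mapper_parsing_exception", "reason": "sample"},
            }
        },
    ],
}

for action, doc_id, status, error in failed_items(sample_response):
    print(action, doc_id, status, error["type"])
```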
🔹 3️⃣ Why is Bulk Fast? (Core Reasons)
🚀 1. Less Network Overhead
1000 requests → 1 request
⚡ 2. Parallel Execution
Multiple shards process their operations simultaneously
💾 3. Efficient Disk Write
Batched writes instead of one write per document
Disk I/O optimized
🔄 4. Less Thread Switching
Less context switching → better CPU usage
🔹 4️⃣ Internal Architecture Visualization
Client
↓
Coordinator Node
↓
Split → Shard A | Shard B | Shard C
↓ ↓ ↓
Parallel Processing
↓
Segments (Lucene)
↓
Response
🔥 Mentor Level Insight (Important)
👉 OpenSearch is fast because of:
Inverted Index (fast search)
Sharding (parallelism)
Bulk API (fast writes)
👉 These 3 combine to give massive performance
🔚 Final Summary
Bulk API is fast because:
- 1 request instead of many
- Parallel shard processing
- Batch disk write
💡 Real Engineer Tip
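In production:
1000–5000 docs per bulk → a common starting point; tune it by payload size and measure
Very large bulk → memory pressure on the node
Implement retry logic (for example, for rejected or failed requests)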
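👉 A minimal sketch of both tips in Python: split the documents into fixed-size chunks, and retry a failed send with exponential backoff. The names chunked and send_with_retry are hypothetical, send stands for whatever function actually posts a batch, and retrying only on ConnectionError is an assumption made for illustration.

```python
import time


def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def send_with_retry(send, batch, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call send(batch); on ConnectionError wait and retry, doubling the delay."""
    for attempt in range(1, max_attempts + 1):
        try:
            return send(batch)
        except ConnectionError:
            if attempt == max_attempts:
                raise  # give up after the last attempt
            sleep(base_delay * 2 ** (attempt - 1))


docs = list(range(2500))
print([len(batch) for batch in chunked(docs, 1000)])  # prints: [1000, 1000, 500]
```

Retrying a whole batch is safest when every document carries an explicit _id, since indexing the same id again overwrites the document instead of creating a duplicate.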