
Batch Operation

🔹 What is a Batch Operation?

👉 Batch operation = performing many operations at once, as a single group

In simple terms:

instead of sending one separate request after another, you group them together and do them in one go


1️⃣ Simple Example (Real Life)

Suppose you are uploading 10 files:

❌ Without Batch

  1. You have to click the upload button 10 times 😫

✅ With Batch

  1. Select all 10 files at once → upload 😎

👉 This is what a batch operation is


2️⃣ Programming Example

❌ Without Batch

insert user1
insert user2
insert user3

👉 3 separate operations


✅ With Batch

insert [user1, user2, user3]

👉 Everything in one go
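The difference above can be sketched in Python with a hypothetical in-memory `Store` (the class and its methods are made up purely for illustration):

```python
# Hypothetical in-memory store, for illustration only.
class Store:
    def __init__(self):
        self.rows = []

    def insert(self, row):
        # one operation per call
        self.rows.append(row)

    def insert_batch(self, rows):
        # one operation for the whole group
        self.rows.extend(rows)

db = Store()
db.insert("user1")   # 3 separate calls...
db.insert("user2")
db.insert("user3")

db2 = Store()
db2.insert_batch(["user1", "user2", "user3"])  # ...or 1 batched call
```

Both stores end up with the same rows; the batched version just gets there with a single call.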


3️⃣ SQL Example

INSERT INTO Customers (id, name)
VALUES 
(1, 'Ali'),
(2, 'John'),
(3, 'Marie');

👉 One query → multiple rows inserted
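The same multi-row insert can be run from Python with the standard-library `sqlite3` module, whose `executemany` pushes the whole batch through one prepared statement:

```python
import sqlite3

# In-memory database so the example is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (id INTEGER, name TEXT)")

# One call, many rows: executemany binds each tuple to the same statement.
conn.executemany(
    "INSERT INTO Customers (id, name) VALUES (?, ?)",
    [(1, "Ali"), (2, "John"), (3, "Marie")],
)

rows = conn.execute("SELECT id, name FROM Customers ORDER BY id").fetchall()
# rows == [(1, 'Ali'), (2, 'John'), (3, 'Marie')]
```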


4️⃣ OpenSearch Example

POST _bulk
{ "index": { "_index": "customers", "_id": 1 } }
{ "name": "Ali" }
{ "index": { "_index": "customers", "_id": 2 } }
{ "name": "John" }
{ "index": { "_index": "customers", "_id": 3 } }
{ "name": "Marie" }

👉 One API call → multiple documents inserted
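For illustration, here is one way to assemble that NDJSON bulk body in plain Python (just string building, no client library assumed):

```python
import json

docs = [(1, "Ali"), (2, "John"), (3, "Marie")]

# Each document becomes two lines: an action line and a source line.
lines = []
for _id, name in docs:
    lines.append(json.dumps({"index": {"_index": "customers", "_id": _id}}))
    lines.append(json.dumps({"name": name}))

# The bulk body must end with a trailing newline.
body = "\n".join(lines) + "\n"

# `body` can then be POSTed to /_bulk
# with Content-Type: application/x-ndjson
```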


5️⃣ Why Use Batch Operations?

🚀 1. Fast

  1. 1000 requests → 1 request

  2. Huge performance boost


⚡ 2. Less Network Cost

  1. Fewer API calls → less latency


🔄 3. Efficient Processing

  1. The server can process the work efficiently


6️⃣ Where Are Batch Operations Used?

  1. Database insert/update/delete

  2. File upload

  3. Email sending (bulk email)

  4. Data migration

  5. Logging systems


🔚 Final Summary

Batch operation = doing multiple pieces of work at once


*********************************************************************************

 

🔹 1️⃣ Normal API vs Bulk API (Problem)

❌ Without Bulk

Suppose you are inserting 1000 documents:

Client → Server (request 1)
Client → Server (request 2)
Client → Server (request 3)
...
1000 times 😵

👉 Problems:

  • 1000 HTTP request

  • 1000 network latency

  • 1000 connection overhead


✅ With Bulk

Client → Server (1 request containing 1000 docs) 😎

👉 Huge improvement!


🔹 2️⃣ Bulk API Internal Flow

Internally, the Bulk API follows a few steps:


🧩 Step 1: Receive Request

Client → OpenSearch Node (Coordinator Node)
  • All the data arrives together (NDJSON format)

  • The coordinator node handles the request


🧩 Step 2: Parse & Split

Coordinator node:

Bulk request → split into individual operations

Example:

Doc1 → shard A
Doc2 → shard B
Doc3 → shard A

👉 The documents are split up by shard
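The routing step can be sketched roughly like this. OpenSearch actually uses a murmur3 hash of the routing value (the `_id` by default) modulo the number of primary shards; `crc32` below is only a stand-in to show the shape of the formula:

```python
import zlib

NUM_SHARDS = 2  # number of primary shards (example value)

def pick_shard(doc_id: str) -> int:
    # Sketch of: shard = hash(_routing) % number_of_primary_shards
    # (real OpenSearch uses murmur3, not crc32)
    return zlib.crc32(doc_id.encode()) % NUM_SHARDS

for doc_id in ["doc1", "doc2", "doc3"]:
    print(doc_id, "→ shard", pick_shard(doc_id))
```

Because the hash of a given `_id` never changes, the same document always lands on the same shard.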


🧩 Step 3: Parallel Processing ⚡

The most powerful part:

Shard A → process doc1, doc3
Shard B → process doc2

👉 All the shards work in parallel, at the same time


🧩 Step 4: Write to Segment

Documents are first written to an in-memory buffer

Later they are flushed to disk (as Lucene segments)

👉 A large amount of data is written in one go → fast
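A toy sketch of the buffer-then-flush idea (a plain list stands in for a Lucene segment here; real segment writing is far more involved):

```python
class BufferedWriter:
    """Writes accumulate in memory and hit 'disk' in groups."""

    def __init__(self, flush_size=3):
        self.buffer = []
        self.segments = []        # each flush produces one "segment"
        self.flush_size = flush_size

    def write(self, doc):
        self.buffer.append(doc)
        if len(self.buffer) >= self.flush_size:
            self.flush()

    def flush(self):
        if self.buffer:
            self.segments.append(list(self.buffer))  # one big write
            self.buffer.clear()

w = BufferedWriter(flush_size=3)
for d in ["doc1", "doc2", "doc3", "doc4"]:
    w.write(d)
# 3 docs were flushed as one segment; doc4 still sits in the buffer
```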


🧩 Step 5: Response Return

OpenSearch → Client (single response)

The results of all the operations come back in a single response


🔹 3️⃣ Why Is Bulk Fast? (Core Reasons)

🚀 1. Less Network Overhead

1000 request → 1 request


⚡ 2. Parallel Execution

Multiple shards process the data simultaneously


💾 3. Efficient Disk Write

Batch writes instead of individual writes

Disk I/O is optimized


🔄 4. Less Thread Switching

Less context switching → better CPU usage


🔹 4️⃣ Internal Architecture Visualization

Client
  ↓
Coordinator Node
  ↓
Split → Shard A | Shard B | Shard C
  ↓         ↓         ↓
Parallel Processing
  ↓
Segments (Lucene)
  ↓
Response





🔥 Mentor Level Insight (Important)

👉 OpenSearch is fast because of:

  1. Inverted index (fast search)

  2. Sharding (parallelism)

  3. Bulk API (fast writes)

👉 These three combine to deliver massive performance


🔚 Final Summary

The Bulk API is fast because of:
- 1 request instead of many
- Parallel shard processing
- Batch disk write

💡 Real Engineer Tip

In production:

  1. 1000–5000 docs per bulk request → best

  2. A bulk request that is too large → memory issues

  3. Implement retry logic
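The three tips above can be sketched as a small helper. `send_bulk` is a hypothetical callable standing in for your actual OpenSearch client call:

```python
import time

def chunked(docs, size=1000):
    """Split a list of documents into bulks of at most `size`."""
    for i in range(0, len(docs), size):
        yield docs[i:i + size]

def index_all(docs, send_bulk, retries=3, size=1000):
    for chunk in chunked(docs, size):
        for attempt in range(retries):
            try:
                send_bulk(chunk)              # one bulk request per chunk
                break
            except Exception:
                if attempt == retries - 1:
                    raise                      # give up after the last retry
                time.sleep(2 ** attempt)       # simple exponential backoff

# Example: 2500 docs → 3 bulk requests (1000 + 1000 + 500)
sent = []
index_all(list(range(2500)), send_bulk=sent.append, size=1000)
```

Keeping the chunk size bounded protects both the client and the cluster from oversized requests, and the backoff gives an overloaded cluster time to recover between retries.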




