
[MongoDB: The Definitive Guide] 7. The Aggregation Framework

noahkim_ 2025. 4. 30. 06:38

This post summarizes the book "MongoDB: The Definitive Guide" by Kristina Chodorow, Shannon Bradshaw, and Eoin Brazil.

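1. Pipelines, Stages, and Tunables

The aggregation framework

Item	Description
Definition	A set of pipeline-based tools for analyzing and transforming data
Input	A single collection
Output	A stream of transformed documents
Stage	The building block of a pipeline; each stage takes a stream of documents as input and emits a stream of transformed documents
Tunables (knobs)	Parameters that each stage exposes so you can adjust how it processes the data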
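
To make the "stream in, stream out" idea concrete, here is a minimal in-memory sketch in plain JavaScript. The helpers `match`, `limit`, `project`, and `runPipeline` are hypothetical stand-ins invented for illustration; they are not MongoDB's API, and the sample documents are made up.

```javascript
// Hypothetical in-memory stand-ins for pipeline stages (not MongoDB's API).
// Each stage is a function: array of documents in, array of documents out.
const match = (predicate) => (docs) => docs.filter(predicate);
const limit = (n) => (docs) => docs.slice(0, n);
const project = (shape) => (docs) => docs.map(shape);

// A pipeline feeds the output of each stage into the next one.
const runPipeline = (docs, stages) =>
  stages.reduce((stream, stage) => stage(stream), docs);

// Made-up sample documents.
const sampleMovies = [
  { title: "A", year: 1914 },
  { title: "B", year: 1915 },
  { title: "C", year: 1914 },
];

// The arguments passed to each stage play the role of its tunables.
const pipelineResult = runPipeline(sampleMovies, [
  match((doc) => doc.year === 1914),
  limit(1),
  project((doc) => ({ title: doc.title })),
]);

console.log(pipelineResult); // [ { title: 'A' } ]
```

Modeling each stage as a function over the whole stream keeps the sketch short; a real server streams documents through the stages rather than materializing arrays.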

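2. Getting Started with Stages: Familiar Operations

aggregate()

Operator	Purpose
$match	Filters documents by a condition
$skip, $limit	Paging
$project	Selects and reshapes fields
$sort	Sorting

Example

db.movies.aggregate([
    {$match: {year: 1914}},
    {$skip: 1},
    {$limit: 2},
    {$project: {_id: 0, title: 1, year: 1}},
    // Stage order matters: this $sort only orders the two documents left after $skip and $limit
    {$sort: {title: 1}}
])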
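
Because stages run in the order they are listed, paging before sorting and sorting before paging can return different documents. The sketch below emulates both orders on a made-up in-memory array; the titles and variable names are assumptions for illustration only.

```javascript
// Made-up documents, deliberately stored out of alphabetical order.
const unsortedTitles = ["D", "B", "A", "C"].map((title) => ({ title }));
const byTitle = (a, b) => a.title.localeCompare(b.title);

// Order 1: skip 1, limit 2, then sort. Only the page itself gets sorted.
const pageThenSort = unsortedTitles.slice(1, 3).sort(byTitle);

// Order 2: sort, then skip 1, limit 2. The page is cut from the sorted stream.
const sortThenPage = [...unsortedTitles].sort(byTitle).slice(1, 3);

console.log(pageThenSort.map((d) => d.title)); // [ 'A', 'B' ]
console.log(sortThenPage.map((d) => d.title)); // [ 'B', 'C' ]
```

For stable paging over a sorted result, the sort usually belongs before `$skip` and `$limit`.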

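3. Expressions

Category	Expression	Description
🔹 Boolean	$and	True when every condition is true
	$or	True when at least one condition is true
	$not	Negates a condition
	$not + $or	True when every condition is false ($nor is a query operator, not an aggregation expression)
🔸 Set	$in	True when the array contains the value
	$not + $in	True when the array does not contain the value ($nin is a query operator, not an aggregation expression)
	$setEquals	True when two arrays contain the same distinct elements
	$setIntersection	Returns the intersection
	$setUnion	Returns the union
	$setDifference	Returns the difference
	$anyElementTrue	True when at least one element is true
	$allElementsTrue	True when every element is true
🔸 Comparison	$eq	Equal
	$ne	Not equal
	$gt, $gte	Greater than / greater than or equal
	$lt, $lte	Less than / less than or equal
🔸 Arithmetic	$add	Addition
	$subtract	Subtraction
	$multiply	Multiplication
	$divide	Division
	$mod	Remainder
🔸 String	$concat	Concatenates strings
	$substr (legacy)	Substring (deprecated alias of $substrBytes)
	$substrBytes, $substrCP	Substring by byte / by code point
	$toLower, $toUpper	Converts to lowercase / uppercase
	$strLenBytes, $strLenCP	String length in bytes / code points
	$indexOfBytes, $indexOfCP	Index of a substring in bytes / code points
🔸 Array	$size	Array length
	$arrayElemAt	Element at a given index
	$slice	Subset of an array
	$filter	Elements that match a condition
🔸 Variable / Conditional	$cond	if-then-else
	$ifNull	Fallback value when the input is null or missing
	$switch	Multi-branch condition
🔸 Accumulators (aggregation only)	$sum	Sum
	$avg	Average
	$min, $max	Minimum / maximum
	$push	Accumulates values into an array
	$addToSet	Accumulates distinct values into an array
	$first, $last	First / last value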

Example) Boolean

{ $and: [ { $gt: ["$age", 20] }, { $lt: ["$age", 50] } ] }
{ $or:  [ { $eq: ["$role", "admin"] }, { $eq: ["$role", "user"] } ] }
{ $not: [ { $eq: ["$active", true] } ] }
// $nor exists only as a query operator; inside an expression, negate an $or instead
{ $not: [ { $or: [ { $eq: ["$status", "active"] }, { $eq: ["$status", "pending"] } ] } ] }

Example) Set

{ $in: ["apple", "$fruits"] }
// $nin exists only as a query operator; inside an expression, negate an $in instead
{ $not: [ { $in: ["banana", "$fruits"] } ] }
{ $setEquals: [["a", "b"], ["b", "a"]] }
{ $setIntersection: [["a", "b"], ["b", "c"]] }
{ $setUnion: [["a"], ["b", "c"]] }
{ $setDifference: [["a", "b"], ["b"]] }
{ $anyElementTrue: [[true, false]] }
{ $allElementsTrue: [[true, true]] }
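
To show what the set expressions above evaluate to, here is a rough emulation using JavaScript `Set`. The helper names mirror the operators but are hypothetical functions written for this post, and the sketch assumes arrays of primitive values. MongoDB does not guarantee the element order of set results, so the order printed here is only what this sketch happens to produce.

```javascript
// Hypothetical emulations of the set expressions, for arrays of primitives.
const setEquals = (a, b) => {
  const sa = new Set(a);
  const sb = new Set(b);
  return sa.size === sb.size && [...sa].every((x) => sb.has(x));
};
const setIntersection = (a, b) => {
  const sb = new Set(b);
  return [...new Set(a)].filter((x) => sb.has(x));
};
const setUnion = (a, b) => [...new Set([...a, ...b])];
const setDifference = (a, b) => {
  const sb = new Set(b);
  return [...new Set(a)].filter((x) => !sb.has(x));
};

// Same inputs as the expressions listed above.
const setResults = {
  equals: setEquals(["a", "b"], ["b", "a"]),
  intersection: setIntersection(["a", "b"], ["b", "c"]),
  union: setUnion(["a"], ["b", "c"]),
  difference: setDifference(["a", "b"], ["b"]),
  anyTrue: [true, false].some(Boolean),
  allTrue: [true, true].every(Boolean),
};

console.log(setResults);
```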

Example) Comparison

{ $eq: ["$age", 30] }
{ $ne: ["$status", "inactive"] }
{ $gt: ["$score", 80] }
{ $lte: ["$price", 1000] }

Example) Arithmetic

{ $add: ["$qty", "$bonus"] }
{ $subtract: [100, "$used"] }
{ $multiply: ["$qty", "$price"] }
{ $divide: ["$total", "$count"] }
{ $mod: ["$score", 2] }

Example) String

{ $concat: ["$firstName", " ", "$lastName"] }
{ $substr: ["$title", 0, 5] }  // deprecated alias of $substrBytes
{ $substrCP: ["$text", 0, 5] }
{ $toUpper: "$name" }
{ $strLenBytes: "$title" }
{ $indexOfBytes: ["$name", "kim"] }

Example) Array

{ $size: "$tags" }
{ $arrayElemAt: ["$tags", 0] }
{ $slice: ["$comments", 5] }
{ $filter: { input: "$items", as: "item", cond: { $gt: ["$$item.price", 1000] } } }

Example) Conditional

{ $cond: { if: <condition>, then: <A>, else: <B> } }
{ $ifNull: ["$nickname", "Anonymous"] }
{ $switch: { branches: [...], default: ... } }
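
The `$cond` and `$switch` forms above are templates. As a filled-in sketch, the following builds a `$project` stage that grades a score; the field name `score`, the thresholds, and the labels are all assumptions made up for illustration.

```javascript
// Hypothetical field ("score") and grade labels, for illustration only.
// Each branch pairs a boolean `case` expression with a `then` value;
// the first branch whose case is true wins, otherwise `default` is used.
const gradeExpr = {
  $switch: {
    branches: [
      { case: { $gte: ["$score", 90] }, then: "A" },
      { case: { $gte: ["$score", 80] }, then: "B" },
    ],
    default: "C",
  },
};

// The same idea with a single condition, written with $cond.
const passedExpr = { $cond: { if: { $gte: ["$score", 80] }, then: true, else: false } };

const gradePipeline = [{ $project: { _id: 0, grade: gradeExpr, passed: passedExpr } }];

console.log(JSON.stringify(gradePipeline, null, 2));
```

In the shell, a pipeline like this would be passed to `aggregate()` on whichever collection holds the scored documents.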

Example) Accumulators

{ $sum: "$price" }
{ $avg: "$score" }
{ $min: "$date" }
{ $push: "$tag" }
{ $addToSet: "$category" }
{ $first: "$status" }

4. $project

A nested field can be promoted to a top-level output field by referencing it with a dotted field path

Example

db.movies.aggregate([
    {$match: {year: 1914}},
    {$project: {_id: 0, title: 1, year: 1, rating: "$tomatoes.viewer.rating"}}
]).pretty()
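
A dotted field path such as `"$tomatoes.viewer.rating"` walks into the nested document one key at a time. The sketch below imitates that lookup in plain JavaScript; `resolvePath` is a hypothetical helper and the sample document is made up. It returns `undefined` for a missing path, whereas MongoDB simply omits the projected field.

```javascript
// Hypothetical helper: resolve a "$a.b.c" field path against one document.
const resolvePath = (doc, path) =>
  path
    .replace(/^\$/, "")
    .split(".")
    .reduce((value, key) => (value == null ? undefined : value[key]), doc);

// Made-up document with the same nesting as the example above.
const nestedMovie = { title: "X", year: 1914, tomatoes: { viewer: { rating: 3.5 } } };

const projectedMovie = {
  title: nestedMovie.title,
  year: nestedMovie.year,
  rating: resolvePath(nestedMovie, "$tomatoes.viewer.rating"),
};

console.log(projectedMovie); // { title: 'X', year: 1914, rating: 3.5 }
```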

5. $unwind

Produces one output document for each element of an array field
A pipeline that needs one document per array element must include at least one $unwind stage

Example

db.movies.aggregate([
    {$match: {year: 1914, title: "The Perils of Pauline"}},
    {$unwind: "$writers"},
    {$project: {_id: 0, title: 1, year: 1, writer: "$writers"}}
]).pretty()

db.movies.aggregate([
    {$match: {year: 1914, title: "The Perils of Pauline"}},
    {$unwind: "$writers"},
    {$match: {writers: "George B. Seitz"}},
    {$project: {_id: 0, title: 1, year: 1, writer: "$writers"}}
]).pretty()

After unwinding, a $match stage can filter on the individual array elements
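
The effect of `$unwind` can be sketched in memory with `flatMap`. The `unwind` function and the writer names below are hypothetical. The sketch follows the default behavior: a document whose field is missing, null, or an empty array produces no output, and a non-array value is passed through as a single element.

```javascript
// Hypothetical emulation of $unwind's default behavior.
const unwind = (docs, field) =>
  docs.flatMap((doc) => {
    const value = doc[field];
    if (Array.isArray(value)) {
      // One copy of the document per element; an empty array yields nothing.
      return value.map((element) => ({ ...doc, [field]: element }));
    }
    // Missing or null fields drop the document; other values pass through.
    return value == null ? [] : [doc];
  });

// Made-up documents with placeholder writer names.
const moviesWithWriters = [
  { title: "X", writers: ["writer one", "writer two"] },
  { title: "Y", writers: [] },
  { title: "Z" },
];

const unwound = unwind(moviesWithWriters, "writers");
// After unwinding, the field holds a single value, so it can be matched directly.
const onlyWriterTwo = unwound.filter((doc) => doc.writers === "writer two");

console.log(unwound);
console.log(onlyWriterTwo);
```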

6. Array Expressions

$filter

$arrayElemAt

$slice

Example

db.movies.aggregate([
    {$match: {$expr: {$and: [{ $isArray: "$writers" }, { $gt: [{ $size: "$writers" }, 0] }]}}},    
    {$project: {_id: 0, writer: {$filter: {input: "$writers",as: "writer", cond: {$regexMatch: {input: "$$writer",regex: "George"}}}}}},  
    {$match: {writer: { $ne: [] }}}
]).pretty()

db.movies.aggregate([
    {$match: {year: 1914}},
    {$project: {_id: 0, first_writer: {$arrayElemAt: ["$writers", 0]}, last_writer: {$arrayElemAt: ["$writers", -1]}}}
]).pretty()

db.movies.aggregate([
    {$match: {year: 1914}},
    {$project: {_id: 0, early_writer: {$slice: ["$writers", 0, 2]}}}
]).pretty()

db.movies.aggregate([
    {$match: {year: 1914}},
    {$project: {_id: 0, writer_count: {$size: "$writers"}}}
]).pretty()
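
For readers more at home in JavaScript, the array expressions above correspond roughly to the following array operations. This is only an analogy on a made-up array; `arrayElemAt` is a hypothetical helper that mimics the negative-index behavior of `$arrayElemAt`.

```javascript
// Hypothetical helper: a negative index counts from the end, like $arrayElemAt.
const arrayElemAt = (arr, index) => arr[index < 0 ? arr.length + index : index];

// Placeholder values standing in for a document's writers array.
const writerList = ["w1", "w2", "w3"];

const firstWriter = arrayElemAt(writerList, 0);   // like {$arrayElemAt: ["$writers", 0]}
const lastWriter = arrayElemAt(writerList, -1);   // like {$arrayElemAt: ["$writers", -1]}
const earlyWriters = writerList.slice(0, 0 + 2);  // like {$slice: ["$writers", 0, 2]}: position 0, 2 elements
const writerCount = writerList.length;            // like {$size: "$writers"}
const matchingWriters = writerList.filter((w) => /3/.test(w)); // like $filter with $regexMatch

console.log({ firstWriter, lastWriter, earlyWriters, writerCount, matchingWriters });
```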

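7. Accumulators

In a $project stage an accumulator works over an array inside a single document; in a $group stage it accumulates values across documents

$max

$sum

Example

db.transactions.aggregate([
    {$match: {account_id: 443178}},
    // If total is stored as a string, $max compares lexicographically rather than numerically
    {$project: {_id: 0, transaction_count: 1, max_total_tx: {$max: "$transactions.total"}}}
])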

db.transactions.aggregate([
    {$match: {account_id: 443178}},
    {$unwind: "$transactions"},
    {$group: {_id: "$account_id", transaction_count: {$sum: 1}, sum_total_tx: {$sum: {$toDouble: "$transactions.total"}}}},
    {$project: {_id: 1, transaction_count: 1, sum_total_tx: 1}}	
])

8. Grouping

Works much like GROUP BY in SQL

Example

db.orders.aggregate([{ $group: {_id: "$customer", totalSpent: { $sum: "$total" }}}])
db.orders.aggregate([{ $group: {_id: "$status", count: { $sum: 1 }}}])
db.orders.aggregate([{ $group: {_id: "$customer", avgOrder: { $avg: "$total" }}}])
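
What `$group` with `$sum` does can be sketched with a `Map` keyed by the grouping value. The order documents below are made up, and the variable names are assumptions for illustration.

```javascript
// Made-up orders; "customer" plays the role of the _id grouping key.
const sampleOrders = [
  { customer: "a", total: 10 },
  { customer: "b", total: 5 },
  { customer: "a", total: 7 },
];

// Equivalent in spirit to {$group: {_id: "$customer", totalSpent: {$sum: "$total"}}}
const runningTotals = new Map();
for (const { customer, total } of sampleOrders) {
  runningTotals.set(customer, (runningTotals.get(customer) ?? 0) + total);
}
const groupedOrders = [...runningTotals].map(([_id, totalSpent]) => ({ _id, totalSpent }));

console.log(groupedOrders); // [ { _id: 'a', totalSpent: 17 }, { _id: 'b', totalSpent: 5 } ]
```

Replacing `+ total` with `+ 1` gives the counting variant, `{$sum: 1}`. MongoDB does not promise any particular order for the groups it returns; the order here comes from `Map` insertion order.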

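9. Writing Aggregation Pipeline Results to a Collection

Feature	$out	$merge
Primary use	Writes the aggregation results to a collection, creating it if needed	Merges the aggregation results into an existing collection
Existing data	Replaced entirely	Kept and combined with the results (documents can be inserted, updated, or replaced, but not deleted)
Flexibility	Limited (write to a new collection or overwrite an existing one)	High (whenMatched and whenNotMatched control how each result document is handled)
Typical use	Storing the results of a large analysis in a separate collection	Combining or updating existing data with freshly aggregated results

Example

db.orders.aggregate([
  { $match: { status: "completed" } },
  { $group: { _id: "$customer_id", totalAmount: { $sum: "$amount" } } },
  { $out: "completed_orders_summary" }  // write the results to the "completed_orders_summary" collection
])

db.orders.aggregate([
  { $match: { status: "completed" } },
  { $group: { _id: "$customer_id", totalAmount: { $sum: "$amount" } } },
  { $merge: {
    into: "completed_orders_summary",  // merge into the existing collection
    whenMatched: "merge",  // merge into the existing document when one matches
    whenNotMatched: "insert"  // insert a new document when none matches
  }}
])

With whenMatched: "merge", fields that conflict are overwritten with the new values, and the other existing fields are kept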
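
The `whenMatched: "merge"` / `whenNotMatched: "insert"` combination can be sketched in memory as follows. `mergeInto` is a hypothetical function, the documents are made up, and matching is done on `_id`, which is the default for `$merge`.

```javascript
// Hypothetical emulation of $merge with whenMatched "merge" and whenNotMatched "insert".
const mergeInto = (target, results) => {
  const byId = new Map(target.map((doc) => [doc._id, doc]));
  for (const doc of results) {
    const existing = byId.get(doc._id);
    // Matched: new values win on conflicting fields, other fields are kept.
    // Not matched: the result document is inserted as-is.
    byId.set(doc._id, existing ? { ...existing, ...doc } : doc);
  }
  return [...byId.values()];
};

// Made-up existing collection contents and aggregation results.
const existingSummary = [{ _id: 1, totalAmount: 10, tier: "gold" }];
const newResults = [
  { _id: 1, totalAmount: 25 },
  { _id: 2, totalAmount: 5 },
];

const mergedSummary = mergeInto(existingSummary, newResults);
console.log(mergedSummary);
```

Document 1 keeps its `tier` field while `totalAmount` takes the new value, and document 2 is added. With `$out`, the existing document's `tier` would be lost because the whole collection is replaced.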
