Kotlin - Sequences
Overview
Sequences in Kotlin provide lazy evaluation for collection operations. Regular collection operations run eagerly and build a new intermediate collection at each step. Sequences instead process elements on demand, which can make them more efficient for large datasets and long chains of operations.
🎯 Learning Objectives:
- Understand the difference between eager and lazy evaluation
- Learn to create and work with sequences
- Master sequence operations and transformations
- Apply sequences for performance optimization
- Choose between collections and sequences appropriately
Collections vs Sequences
Understanding the fundamental difference between collections and sequences is crucial for choosing the right approach.
Eager Evaluation (Collections)
fun main() {
val numbers = listOf(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
// Each operation creates a new intermediate collection
val result = numbers
.filter {
println("Filtering $it")
it % 2 == 0
}
.map {
println("Mapping $it")
it * it
}
.take(2)
println("Result: $result")
// Output shows all filtering happens first, then all mapping
// Filtering 1, Filtering 2, Filtering 3, ... Filtering 10
// Mapping 2, Mapping 4, Mapping 6, Mapping 8, Mapping 10
// Result: [4, 16]
}
Lazy Evaluation (Sequences)
fun main() {
val numbers = listOf(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
// Operations are applied element by element
val result = numbers.asSequence()
.filter {
println("Filtering $it")
it % 2 == 0
}
.map {
println("Mapping $it")
it * it
}
.take(2)
.toList() // Terminal operation triggers evaluation
println("Result: $result")
// Output shows element-by-element processing
// Filtering 1, Filtering 2, Mapping 2, Filtering 3, Filtering 4, Mapping 4
// Result: [4, 16]
}
Key Insight: Sequences process elements one by one through the entire chain, while collections process all elements through each operation before moving to the next.
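One way to make this insight concrete is to count how often each lambda runs in both pipelines. The sketch below is only an illustration: `CallCounts`, `eagerCounts`, and `lazyCounts` are hypothetical names introduced here, and the pipeline mirrors the filter/map/take(2) chain from the two examples above.

```kotlin
// Hypothetical holder for the number of lambda invocations and the final result.
data class CallCounts(val filterCalls: Int, val mapCalls: Int, val result: List<Int>)

// Eager pipeline: every element passes through filter, then every survivor through map.
fun eagerCounts(numbers: List<Int>): CallCounts {
    var filterCalls = 0
    var mapCalls = 0
    val result = numbers
        .filter { filterCalls++; it % 2 == 0 }
        .map { mapCalls++; it * it }
        .take(2)
    return CallCounts(filterCalls, mapCalls, result)
}

// Lazy pipeline: elements flow one at a time and processing stops once take(2) is satisfied.
fun lazyCounts(numbers: List<Int>): CallCounts {
    var filterCalls = 0
    var mapCalls = 0
    val result = numbers.asSequence()
        .filter { filterCalls++; it % 2 == 0 }
        .map { mapCalls++; it * it }
        .take(2)
        .toList()
    return CallCounts(filterCalls, mapCalls, result)
}

fun main() {
    val numbers = (1..10).toList()
    println("Eager: ${eagerCounts(numbers)}") // 10 filter calls, 5 map calls
    println("Lazy:  ${lazyCounts(numbers)}")  // 4 filter calls, 2 map calls
}
```

Both pipelines return the same list, but the lazy one stops after the fourth input element because take(2) already has what it needs.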
Creating Sequences
From Collections
fun main() {
val list = listOf(1, 2, 3, 4, 5)
val sequence = list.asSequence()
val array = arrayOf("a", "b", "c", "d")
val arraySequence = array.asSequence()
val map = mapOf("one" to 1, "two" to 2, "three" to 3)
val mapSequence = map.asSequence()
println(sequence.toList())
println(arraySequence.toList())
println(mapSequence.toList())
}
Sequence Builders
fun main() {
// sequenceOf - create from elements
val sequence1 = sequenceOf(1, 2, 3, 4, 5)
// generateSequence - infinite sequence with generator
val fibonacci = generateSequence(1 to 1) { (a, b) -> b to (a + b) }
.map { it.first }
.take(10)
println("Fibonacci: ${fibonacci.toList()}")
// generateSequence with seed and next function
val powers = generateSequence(1) { it * 2 }.take(8)
println("Powers of 2: ${powers.toList()}")
// Empty sequence
val empty = emptySequence<Int>() // Element type must be given explicitly
println("Empty: ${empty.toList()}")
}
Custom Sequence Building
fun main() {
// Using sequence builder
val customSequence = sequence {
yield(1)
yield(2)
yieldAll(listOf(3, 4, 5))
yieldAll(generateSequence(6) { it + 1 }.take(3))
}
println("Custom sequence: ${customSequence.toList()}")
// Conditional yielding
val conditionalSequence = sequence {
for (i in 1..10) {
if (i % 2 == 0) {
yield(i)
}
}
}
println("Even numbers: ${conditionalSequence.toList()}")
}
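The sequence builder is itself lazy: its body runs only as far as the consumer asks, pausing at each yield. The following sketch records when the body actually executes. `tracedSequence` and the log messages are hypothetical and exist only for this illustration.

```kotlin
// Hypothetical helper: appends a log entry each time the builder body advances.
fun tracedSequence(log: MutableList<String>): Sequence<Int> = sequence {
    log += "start"
    yield(1)
    log += "after 1"
    yield(2)
    log += "after 2"
    yield(3)
    log += "end"
}

fun main() {
    val log = mutableListOf<String>()
    val seq = tracedSequence(log)
    println("After creation: $log")                    // [] - the body has not run yet
    println("First two: ${seq.take(2).toList()}")      // [1, 2]
    println("After take(2): $log")                     // [start, after 1]
}
```

Because take(2) stops asking after the second element, the code after yield(2) never runs in this call.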
Sequence Operations
Intermediate Operations (Lazy)
fun main() {
val numbers = (1..1000).asSequence()
// These operations are lazy - no processing happens yet
val processed = numbers
.filter { it % 2 == 0 }
.map { it * it }
.filter { it > 100 }
.take(5)
println("Sequence created, no processing yet")
// Terminal operation triggers processing
val result = processed.toList()
println("Result: $result")
// Other intermediate operations
val transformed = (1..10).asSequence()
.drop(3) // Skip first 3 elements
.dropWhile { it < 6 } // Skip while condition is true
.takeWhile { it < 9 } // Take while condition is true
.distinct() // Remove duplicates
.sorted() // Sort elements
println("Transformed: ${transformed.toList()}")
}
Terminal Operations (Eager)
fun main() {
val numbers = (1..10).asSequence()
.filter { it % 2 == 0 }
.map { it * it }
// Collection terminal operations
val list = numbers.toList()
val set = numbers.toSet()
val array = numbers.toList().toTypedArray()
// Aggregation operations
val sum = numbers.sum()
val count = numbers.count()
val average = numbers.average()
val max = numbers.maxOrNull()
val min = numbers.minOrNull()
// Find operations
val first = numbers.first() // Throws if empty
val firstOrNull = numbers.firstOrNull()
val any = numbers.any { it > 50 }
val all = numbers.all { it > 0 }
val none = numbers.none { it < 0 }
println("Sum: $sum, Count: $count, Average: $average")
println("Max: $max, Min: $min")
println("First: $first, Any > 50: $any")
}
Performance Comparison
Large Dataset Processing
import kotlin.system.measureTimeMillis
fun main() {
val largeList = (1..1_000_000).toList()
// Collections approach - creates intermediate collections
val collectionTime = measureTimeMillis {
val result = largeList
.filter { it % 2 == 0 }
.map { it.toLong() * it } // Long avoids Int overflow for large inputs
.filter { it > 1000 }
.take(100)
println("Collection result size: ${result.size}")
}
// Sequence approach - no intermediate collections
val sequenceTime = measureTimeMillis {
val result = largeList.asSequence()
.filter { it % 2 == 0 }
.map { it.toLong() * it } // Long avoids Int overflow for large inputs
.filter { it > 1000 }
.take(100)
.toList()
println("Sequence result size: ${result.size}")
}
println("Collection time: ${collectionTime}ms")
println("Sequence time: ${sequenceTime}ms")
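if (sequenceTime > 0) {
println("Sequence is ${collectionTime.toDouble() / sequenceTime}x faster")
} else {
println("Sequence run took under 1 ms; ratio not measurable at this resolution")
}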
}
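A single timed run like the one above includes JVM warm-up, so its numbers can vary widely between runs. As a rough sketch (not a substitute for a proper benchmarking tool), one option is to warm up first and report the median of several rounds. `viaCollections`, `viaSequence`, and `medianNanos` are hypothetical helpers, and the warm-up and round counts are arbitrary assumptions.

```kotlin
import kotlin.system.measureNanoTime

// Eager version of the pipeline; Long is used so that squaring cannot overflow.
fun viaCollections(data: List<Int>): List<Long> = data
    .filter { it % 2 == 0 }
    .map { it.toLong() * it }
    .filter { it > 1000 }
    .take(100)

// Lazy version of the same pipeline.
fun viaSequence(data: List<Int>): List<Long> = data.asSequence()
    .filter { it % 2 == 0 }
    .map { it.toLong() * it }
    .filter { it > 1000 }
    .take(100)
    .toList()

// Hypothetical helper: runs the block a few times untimed, then returns the median of timed rounds.
fun medianNanos(warmups: Int = 5, rounds: Int = 11, block: () -> Unit): Long {
    repeat(warmups) { block() }
    val samples = List(rounds) { measureNanoTime(block) }.sorted()
    return samples[samples.size / 2]
}

fun main() {
    val data = (1..1_000_000).toList()
    check(viaCollections(data) == viaSequence(data)) // Both approaches must agree
    val collectionNanos = medianNanos { viaCollections(data) }
    val sequenceNanos = medianNanos { viaSequence(data) }
    println("Collections median: ${collectionNanos / 1_000} µs")
    println("Sequence median:    ${sequenceNanos / 1_000} µs")
}
```

The absolute figures depend on the machine and JVM, so treat them as a relative comparison only.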
Memory Usage Comparison
fun main() {
val numbers = (1..1_000_000).toList()
// Collections - creates multiple intermediate lists
println("Collections approach:")
val collectionsResult = numbers
.onEach { if (it % 100_000 == 0) println("Processing $it") }
.filter { it % 2 == 0 }
.map { it * 2 }
.take(10)
println("Collection result: $collectionsResult")
// Sequences - processes elements one by one
println("\nSequence approach:")
val sequenceResult = numbers.asSequence()
.onEach { if (it % 100_000 == 0) println("Processing $it") }
.filter { it % 2 == 0 }
.map { it * 2 }
.take(10)
.toList()
println("Sequence result: $sequenceResult")
}
Real-World Examples
File Processing
import java.io.File
fun processLargeFile(filename: String): List<String> {
// Note: readLines() loads the whole file into memory before the sequence starts
return File(filename).readLines().asSequence()
.filter { line -> line.isNotBlank() }
.map { line -> line.trim() }
.filter { line -> !line.startsWith("#") } // Skip comments
.map { line -> line.uppercase() }
.distinct()
.sorted()
.take(1000) // Limit results
.toList()
}
// Simulated example
fun main() {
val lines = listOf(
" apple ",
"# comment",
"banana",
"",
" APPLE ",
"cherry",
"# another comment",
"date"
)
val processed = lines.asSequence()
.filter { it.isNotBlank() }
.map { it.trim() }
.filter { !it.startsWith("#") }
.map { it.uppercase() }
.distinct()
.sorted()
.toList()
println("Processed lines: $processed")
}
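When the file itself is too large to load at once, one option is `File.useLines`, which hands the lines to a lambda as a sequence and closes the file afterwards. The sketch below is a minimal illustration under some assumptions: `cleanLines` and its `limit` parameter are hypothetical, and the sorted() step is left out on purpose so that take() can stop reading early.

```kotlin
import java.io.File

// Hypothetical helper: streams the file line by line instead of loading it whole.
fun cleanLines(file: File, limit: Int = 1000): List<String> =
    file.useLines { lines ->
        lines
            .map { it.trim() }
            .filter { it.isNotEmpty() && !it.startsWith("#") } // Skip blanks and comments
            .map { it.uppercase() }
            .distinct()
            .take(limit)
            .toList() // Must be consumed inside useLines, before the file is closed
    }

fun main() {
    val file = File.createTempFile("sequence-demo", ".txt")
    try {
        file.writeText(
            listOf("  apple  ", "# comment", "banana", "", " APPLE ", "cherry").joinToString("\n")
        )
        println(cleanLines(file)) // [APPLE, BANANA, CHERRY]
    } finally {
        file.delete()
    }
}
```

The terminal operation has to run inside the lambda, because the sequence is no longer usable once useLines returns and the underlying reader is closed.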
Data Pipeline
data class Person(val name: String, val age: Int, val city: String)
fun main() {
val people = listOf(
Person("Alice", 25, "New York"),
Person("Bob", 30, "Los Angeles"),
Person("Charlie", 35, "New York"),
Person("Diana", 28, "Chicago"),
Person("Eve", 32, "Los Angeles"),
Person("Frank", 27, "New York")
)
// Data processing pipeline using sequences
val result = people.asSequence()
.filter { it.age >= 25 }
.groupingBy { it.city }
.eachCount()
.asSequence()
.sortedByDescending { it.value }
.take(2)
.toList()
println("Top cities with people aged 25+: $result")
// Average age by city
val avgAgeByCity = people.asSequence()
.groupBy { it.city }
.mapValues { (_, residents) -> // Renamed to avoid shadowing the outer `people`
residents.asSequence().map { it.age }.average()
}
println("Average age by city: $avgAgeByCity")
}
Infinite Sequences
fun main() {
// Prime numbers generator
fun isPrime(n: Int): Boolean {
if (n < 2) return false
return (2..kotlin.math.sqrt(n.toDouble()).toInt()).none { n % it == 0 }
}
val primes = generateSequence(2) { it + 1 }
.filter { isPrime(it) }
.take(10)
println("First 10 primes: ${primes.toList()}")
// Random number sequence with conditions
val randomSequence = generateSequence { kotlin.random.Random.nextInt(1, 100) }
.distinct()
.filter { it % 5 == 0 }
.take(5)
println("Random multiples of 5: ${randomSequence.toList()}")
// Collatz sequence
fun collatz(n: Int) = generateSequence(n) { current ->
when {
current == 1 -> null
current % 2 == 0 -> current / 2
else -> current * 3 + 1
}
}
println("Collatz sequence for 13: ${collatz(13).toList()}")
}
Advanced Sequence Patterns
Windowed Operations
fun main() {
val numbers = (1..10).asSequence()
// Windowed - sliding window
val windows = numbers.windowed(size = 3, step = 1)
println("Windows of size 3:")
windows.forEach { println(it) }
// Windowed with transformation
val movingAverages = (1..10).asSequence()
.windowed(size = 3, step = 1) { window ->
window.average()
}
println("Moving averages: ${movingAverages.toList()}")
// Chunked - non-overlapping windows
val chunks = (1..10).asSequence().chunked(3)
println("Chunks of size 3:")
chunks.forEach { println(it) }
}
Sequence Composition
fun main() {
// Combining multiple sequences
val seq1 = sequenceOf(1, 2, 3)
val seq2 = sequenceOf(4, 5, 6)
val seq3 = sequenceOf(7, 8, 9)
val combined = seq1 + seq2 + seq3
println("Combined: ${combined.toList()}")
// Zip sequences
val letters = sequenceOf("a", "b", "c", "d")
val numbers = sequenceOf(1, 2, 3)
val zipped = letters.zip(numbers) { letter, number -> "$letter$number" }
println("Zipped: ${zipped.toList()}")
// Flatten a sequence of sequences
val nestedSequence = sequenceOf(
sequenceOf(1, 2),
sequenceOf(3, 4, 5),
sequenceOf(6)
)
val flattened = nestedSequence.flatten()
println("Flattened: ${flattened.toList()}")
}
When to Use Sequences
Use Sequences When:
- Large datasets: Processing millions of elements
- Multiple operations: Chaining many transformations
- Early termination: Using take(), first(), any(), etc.
- Memory constraints: Cannot afford intermediate collections
- Infinite data: Working with potentially infinite streams
Use Collections When:
- Small datasets: Roughly under a thousand elements, where sequence overhead can outweigh the savings
- Single operations: Only one transformation
- Multiple access: Need to iterate multiple times
- Random access: Need indexing or size information
- Sorting/grouping: Operations that need all elements
Performance Guidelines
fun main() {
val data = (1..1000).toList()
// ✅ Good use of sequence - multiple operations, early termination
val sequenceResult = data.asSequence()
.filter { it % 2 == 0 }
.map { it * it }
.filter { it > 100 }
.take(5)
.toList()
// ❌ Poor use of sequence - single operation
val poorSequence = data.asSequence()
.map { it * 2 }
.toList()
// ✅ Better for single operation
val betterCollection = data.map { it * 2 }
// ❌ Little benefit from a sequence - count() must visit every element anyway
val badSize = data.asSequence()
.filter { it % 2 == 0 }
.count() // No early termination, so laziness saves nothing here
// ✅ Better approach
val goodSize = data.count { it % 2 == 0 }
}
Common Pitfalls
Repeated Terminal Operations
fun main() {
val sequence = (1..5).asSequence()
.onEach { println("Processing $it") }
.map { it * 2 }
// ❌ This will process the sequence twice!
println("First access: ${sequence.toList()}")
println("Second access: ${sequence.toList()}")
// ✅ Store the result if you need it multiple times
val result = sequence.toList()
println("First access: $result")
println("Second access: $result")
}
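If accidental re-evaluation is a concern, the standard library's `constrainOnce()` wraps a sequence so that a second iteration throws an IllegalStateException instead of silently repeating the work. The sketch below shows that behavior. `failsOnSecondPass` is a hypothetical helper written for this illustration.

```kotlin
// Hypothetical helper: returns true if iterating the sequence a second time throws.
fun failsOnSecondPass(seq: Sequence<Int>): Boolean {
    seq.toList() // First pass always succeeds
    return try {
        seq.toList() // Second pass re-runs the pipeline, or throws if constrained
        false
    } catch (e: IllegalStateException) {
        true
    }
}

fun main() {
    val pipeline = (1..5).asSequence().map { it * 2 }
    println("Plain sequence fails on second pass: ${failsOnSecondPass(pipeline)}")                  // false
    println("constrainOnce fails on second pass: ${failsOnSecondPass(pipeline.constrainOnce())}")  // true
}
```

This turns a hidden performance problem into an immediate error, which can be useful while developing or testing a pipeline.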
Stateful Operations
fun main() {
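// ❌ Stateful operations reduce the benefit of lazy evaluation
val numbers = (1..10).asSequence()
.sorted() // Needs all elements - not truly lazy
.take(3)
// ✅ Limit before sorting when you only need the first elements in sorted order
// (same result here because the input is already ordered; in general
// take(3).sorted() and sorted().take(3) return different elements)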
val betterApproach = (1..10).asSequence()
.take(3)
.sorted()
println("Numbers: ${numbers.toList()}")
println("Better: ${betterApproach.toList()}")
}
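Reordering take() and sorted() is only safe when the two orders mean the same thing for your data. The sketch below uses an unsorted list to show the difference, and counts how many elements sorted() pulls from upstream. The function names and the sample data are hypothetical.

```kotlin
// The three smallest elements: sorted() has to see the whole input first.
fun smallestThree(data: List<Int>): List<Int> =
    data.asSequence().sorted().take(3).toList()

// The first three elements, put in order: only three elements are ever read.
fun firstThreeSorted(data: List<Int>): List<Int> =
    data.asSequence().take(3).sorted().toList()

// Counts how many upstream elements are consumed when sorted() precedes take(3).
fun pulledBeforeSort(data: List<Int>): Int {
    var pulled = 0
    data.asSequence().onEach { pulled++ }.sorted().take(3).toList()
    return pulled
}

fun main() {
    val data = listOf(9, 4, 7, 1, 8, 2)
    println("sorted().take(3): ${smallestThree(data)}")     // [1, 2, 4]
    println("take(3).sorted(): ${firstThreeSorted(data)}")  // [4, 7, 9]
    println("Elements pulled by sorted(): ${pulledBeforeSort(data)}") // 6
}
```

If the three smallest values are what you need, sorted() before take() is the correct order, even though it gives up the early-termination benefit.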
Key Takeaways
- Sequences provide lazy evaluation, processing elements on-demand
- Use sequences for large datasets and multiple chained operations
- Terminal operations trigger evaluation; intermediate operations are lazy
- Sequences can be infinite, created with generators
- Choose sequences vs collections based on data size and usage patterns
- Be aware that stateful operations such as sorted() must consume the whole input, which limits the benefit of lazy evaluation
Practice Exercises
- Create a sequence that generates prime numbers and find the first 50 primes
- Process a large dataset using sequences and compare performance with collections
- Build a data processing pipeline using sequences for log file analysis
- Implement a custom sequence builder for generating test data
Quiz
- What's the main difference between collections and sequences in terms of evaluation?
- When should you prefer sequences over collections?
- What happens when you call a terminal operation on a sequence multiple times?
Answers
- Collections use eager evaluation (process all elements through each operation), while sequences use lazy evaluation (process elements one by one through the entire chain).
- Use sequences for large datasets, multiple chained operations, early termination scenarios, or when memory is constrained.
- The sequence is re-evaluated from the beginning each time - the computation is not cached.