Kotlin - Sequences

Overview

Sequences in Kotlin provide lazy evaluation for collection operations. Unlike regular collection operations, which process every element eagerly at each step, sequences process elements on demand, which often makes them more efficient for large datasets and long chains of operations.

🎯 Learning Objectives:
  • Understand the difference between eager and lazy evaluation
  • Learn to create and work with sequences
  • Master sequence operations and transformations
  • Apply sequences for performance optimization
  • Choose between collections and sequences appropriately

Collections vs Sequences

Understanding the fundamental difference between collections and sequences is crucial for choosing the right approach.

Eager Evaluation (Collections)

fun main() {
    val numbers = listOf(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
    
    // Each operation creates a new intermediate collection
    val result = numbers
        .filter { 
            println("Filtering $it")
            it % 2 == 0 
        }
        .map { 
            println("Mapping $it")
            it * it 
        }
        .take(2)
    
    println("Result: $result")
    
    // Output shows all filtering happens first, then all mapping
    // Filtering 1, Filtering 2, Filtering 3, ... Filtering 10
    // Mapping 2, Mapping 4, Mapping 6, Mapping 8, Mapping 10
    // Result: [4, 16]
}

Lazy Evaluation (Sequences)

fun main() {
    val numbers = listOf(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
    
    // Operations are applied element by element
    val result = numbers.asSequence()
        .filter { 
            println("Filtering $it")
            it % 2 == 0 
        }
        .map { 
            println("Mapping $it")
            it * it 
        }
        .take(2)
        .toList() // Terminal operation triggers evaluation
    
    println("Result: $result")
    
    // Output shows element-by-element processing
    // Filtering 1, Filtering 2, Mapping 2, Filtering 3, Filtering 4, Mapping 4
    // Result: [4, 16]
}

Key Insight: Sequences process elements one by one through the entire chain, while collections process all elements through each operation before moving to the next.
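
To make this insight measurable, the sketch below counts how often each lambda runs in the two pipelines. The names CallCounts, eagerCounts, and lazyCounts are hypothetical helpers introduced only for this illustration; they are not part of the Kotlin standard library.

```kotlin
// Hypothetical holder for the number of lambda invocations and the pipeline result
data class CallCounts(val filterCalls: Int, val mapCalls: Int, val result: List<Int>)

// Eager pipeline: every operation runs over the whole list before the next one starts
fun eagerCounts(input: List<Int>): CallCounts {
    var filterCalls = 0
    var mapCalls = 0
    val result = input
        .filter { filterCalls++; it % 2 == 0 }
        .map { mapCalls++; it * it }
        .take(2)
    return CallCounts(filterCalls, mapCalls, result)
}

// Lazy pipeline: each element travels through the whole chain, and take(2) stops early
fun lazyCounts(input: List<Int>): CallCounts {
    var filterCalls = 0
    var mapCalls = 0
    val result = input.asSequence()
        .filter { filterCalls++; it % 2 == 0 }
        .map { mapCalls++; it * it }
        .take(2)
        .toList()
    return CallCounts(filterCalls, mapCalls, result)
}

fun main() {
    val numbers = (1..10).toList()
    println("Eager: ${eagerCounts(numbers)}")
    println("Lazy:  ${lazyCounts(numbers)}")
}
```

For the list 1..10, the eager version reports 10 filter calls and 5 map calls, while the lazy version stops after 4 filter calls and 2 map calls. Both produce [4, 16].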

Creating Sequences

From Collections

fun main() {
    val list = listOf(1, 2, 3, 4, 5)
    val sequence = list.asSequence()
    
    val array = arrayOf("a", "b", "c", "d")
    val arraySequence = array.asSequence()
    
    val map = mapOf("one" to 1, "two" to 2, "three" to 3)
    val mapSequence = map.asSequence()
    
    println(sequence.toList())
    println(arraySequence.toList())
    println(mapSequence.toList())
}

Sequence Builders

fun main() {
    // sequenceOf - create from elements
    val sequence1 = sequenceOf(1, 2, 3, 4, 5)
    
    // generateSequence - infinite sequence with generator
    val fibonacci = generateSequence(1 to 1) { (a, b) -> b to (a + b) }
        .map { it.first }
        .take(10)
    
    println("Fibonacci: ${fibonacci.toList()}")
    
    // generateSequence with seed and next function
    val powers = generateSequence(1) { it * 2 }.take(8)
    println("Powers of 2: ${powers.toList()}")
    
    // Empty sequence
    val empty = emptySequence<Int>() // Element type must be stated; it cannot be inferred here
    println("Empty: ${empty.toList()}")
}

Custom Sequence Building

fun main() {
    // Using sequence builder
    val customSequence = sequence {
        yield(1)
        yield(2)
        yieldAll(listOf(3, 4, 5))
        yieldAll(generateSequence(6) { it + 1 }.take(3))
    }
    
    println("Custom sequence: ${customSequence.toList()}")
    
    // Conditional yielding
    val conditionalSequence = sequence {
        for (i in 1..10) {
            if (i % 2 == 0) {
                yield(i)
            }
        }
    }
    
    println("Even numbers: ${conditionalSequence.toList()}")
}
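
The sequence builder is itself lazy: the body runs only as far as the next yield each time the consumer asks for a value, so it can describe an infinite sequence. The following is a minimal sketch of that behavior; the function name fibonacci is a hypothetical helper chosen for this example.

```kotlin
// Hypothetical helper: an infinite Fibonacci sequence built with sequence { }
fun fibonacci(): Sequence<Long> = sequence {
    var a = 1L
    var b = 1L
    while (true) {
        yield(a) // Execution pauses here until the consumer requests the next value
        val next = a + b
        a = b
        b = next
    }
}

fun main() {
    // Only the first 10 values are ever computed
    println(fibonacci().take(10).toList())
    // The loop runs just far enough to find the first value above 1000
    println(fibonacci().first { it > 1000 })
}
```

Because the loop never terminates on its own, a limiting operation such as take() or first() is needed before any terminal operation that would otherwise consume the whole sequence.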

Sequence Operations

Intermediate Operations (Lazy)

fun main() {
    val numbers = (1..1000).asSequence()
    
    // These operations are lazy - no processing happens yet
    val processed = numbers
        .filter { it % 2 == 0 }
        .map { it * it }
        .filter { it > 100 }
        .take(5)
    
    println("Sequence created, no processing yet")
    
    // Terminal operation triggers processing
    val result = processed.toList()
    println("Result: $result")
    
    // Other intermediate operations
    val transformed = (1..10).asSequence()
        .drop(3)           // Skip first 3 elements
        .dropWhile { it < 6 } // Skip while condition is true
        .takeWhile { it < 9 } // Take while condition is true
        .distinct()        // Remove duplicates
        .sorted()          // Sort elements
    
    println("Transformed: ${transformed.toList()}")
}

Terminal Operations (Eager)

fun main() {
    val numbers = (1..10).asSequence()
        .filter { it % 2 == 0 }
        .map { it * it }
    
    // Collection terminal operations
    val list = numbers.toList()
    val set = numbers.toSet()
    val array = numbers.toList().toTypedArray()
    
    // Aggregation operations
    val sum = numbers.sum()
    val count = numbers.count()
    val average = numbers.average()
    val max = numbers.maxOrNull()
    val min = numbers.minOrNull()
    
    // Find operations
    val first = numbers.first() // Throws if empty
    val firstOrNull = numbers.firstOrNull()
    val any = numbers.any { it > 50 }
    val all = numbers.all { it > 0 }
    val none = numbers.none { it < 0 }
    
    println("Sum: $sum, Count: $count, Average: $average")
    println("Max: $max, Min: $min")
    println("First: $first, Any > 50: $any")
}
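
Not every terminal operation consumes the whole sequence. Operations such as first() and any() stop as soon as the answer is known, while count() has to visit every element. The sketch below illustrates this by counting how many elements are pulled from the source; inspectedBy is a hypothetical helper, not a standard library function.

```kotlin
// Hypothetical helper: reports how many source elements a terminal operation pulled
fun inspectedBy(terminal: (Sequence<Int>) -> Unit): Int {
    var inspected = 0
    val source = (1..1_000).asSequence().onEach { inspected++ }
    terminal(source)
    return inspected
}

fun main() {
    // first() stops at the first match (7)
    println("first: ${inspectedBy { it.first { n -> n % 7 == 0 } }}")
    // any() stops as soon as the predicate holds (at 11)
    println("any:   ${inspectedBy { it.any { n -> n > 10 } }}")
    // count() must walk the entire sequence
    println("count: ${inspectedBy { it.count() }}")
}
```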

Performance Comparison

Large Dataset Processing

import kotlin.system.measureTimeMillis

fun main() {
    val largeList = (1..1_000_000).toList()
    
    // Collections approach - creates intermediate collections
    val collectionTime = measureTimeMillis {
        val result = largeList
            .filter { it % 2 == 0 }
            .map { it * it }
            .filter { it > 1000 }
            .take(100)
        println("Collection result size: ${result.size}")
    }
    
    // Sequence approach - no intermediate collections
    val sequenceTime = measureTimeMillis {
        val result = largeList.asSequence()
            .filter { it % 2 == 0 }
            .map { it * it }
            .filter { it > 1000 }
            .take(100)
            .toList()
        println("Sequence result size: ${result.size}")
    }
    
    println("Collection time: ${collectionTime}ms")
    println("Sequence time: ${sequenceTime}ms")
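    // Guard against a 0 ms sequence run, which would otherwise print "Infinity"
    if (sequenceTime > 0) {
        println("Sequence is ${collectionTime.toDouble() / sequenceTime}x faster")
    }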
}
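
A single timed run on the JVM is noisy, because JIT compilation and garbage collection can dominate the first measurement. The sketch below is one rough way to reduce that noise, assuming a few untimed warm-up runs followed by an average over several timed runs. The helpers averageMillis, eagerPipeline, and lazyPipeline are hypothetical names, and squaring as Long is a choice made here to avoid Int overflow for large inputs. For serious measurements a dedicated benchmarking harness is the safer option.

```kotlin
import kotlin.system.measureNanoTime

// Hypothetical helper: runs the block untimed a few times, then averages the timed runs
fun averageMillis(warmups: Int = 5, runs: Int = 10, block: () -> Unit): Double {
    repeat(warmups) { block() }
    var totalNanos = 0L
    repeat(runs) { totalNanos += measureNanoTime(block) }
    return totalNanos / runs / 1_000_000.0
}

// Eager version: builds a full intermediate list at every step
fun eagerPipeline(data: List<Int>): List<Long> =
    data.filter { it % 2 == 0 }
        .map { it.toLong() * it } // Long avoids Int overflow for large inputs
        .filter { it > 1000 }
        .take(100)

// Lazy version: stops once 100 results have been produced
fun lazyPipeline(data: List<Int>): List<Long> =
    data.asSequence()
        .filter { it % 2 == 0 }
        .map { it.toLong() * it }
        .filter { it > 1000 }
        .take(100)
        .toList()

fun main() {
    val data = (1..1_000_000).toList()
    val eagerMs = averageMillis { eagerPipeline(data) }
    val lazyMs = averageMillis { lazyPipeline(data) }
    println("Collections: %.3f ms, sequences: %.3f ms".format(eagerMs, lazyMs))
}
```

The actual numbers depend on the machine and JVM, so none are shown here; the point of the sketch is only that both pipelines are measured the same way.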

Memory Usage Comparison

fun main() {
    val numbers = (1..1_000_000).toList()
    
    // Collections - creates multiple intermediate lists
    println("Collections approach:")
    val collectionsResult = numbers
        .onEach { if (it % 100_000 == 0) println("Processing $it") }
        .filter { it % 2 == 0 }
        .map { it * 2 }
        .take(10)
    
    println("Collection result: $collectionsResult")
    
    // Sequences - processes elements one by one
    println("\nSequence approach:")
    val sequenceResult = numbers.asSequence()
        .onEach { if (it % 100_000 == 0) println("Processing $it") }
        .filter { it % 2 == 0 }
        .map { it * 2 }
        .take(10)
        .toList()
    
    println("Sequence result: $sequenceResult")
}

Real-World Examples

File Processing

import java.io.File

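fun processLargeFile(filename: String): List<String> {
    // Note: readLines() loads the whole file into memory before the sequence starts;
    // File.useLines streams lines lazily and is usually a better fit for very large files
    return File(filename).readLines().asSequence()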
        .filter { line -> line.isNotBlank() }
        .map { line -> line.trim() }
        .filter { line -> !line.startsWith("#") } // Skip comments
        .map { line -> line.uppercase() }
        .distinct()
        .sorted()
        .take(1000) // Limit results
        .toList()
}

// Simulated example
fun main() {
    val lines = listOf(
        "  apple  ",
        "# comment",
        "banana",
        "",
        "  APPLE  ",
        "cherry",
        "# another comment",
        "date"
    )
    
    val processed = lines.asSequence()
        .filter { it.isNotBlank() }
        .map { it.trim() }
        .filter { !it.startsWith("#") }
        .map { it.uppercase() }
        .distinct()
        .sorted()
        .toList()
    
    println("Processed lines: $processed")
}
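
For files that are too large to load at once, the standard library's File.useLines exposes the lines as a sequence and closes the file afterwards. The following is a minimal sketch under a few assumptions: processLargeFileLazily and its limit parameter are hypothetical names, and the sorted() step is deliberately left out because sorting would force every line to be read and held in memory before take() could apply.

```kotlin
import java.io.File

// Hypothetical helper: streams the file line by line instead of loading it all at once
fun processLargeFileLazily(file: File, limit: Int = 1000): List<String> =
    file.useLines { lines ->
        lines
            .map { it.trim() }
            .filter { it.isNotEmpty() && !it.startsWith("#") } // Skip blanks and comments
            .map { it.uppercase() }
            .distinct()
            .take(limit) // Reading stops once enough lines have been collected
            .toList()    // Must be consumed inside the block, before the file is closed
    }

fun main() {
    val file = File.createTempFile("sequences-demo", ".txt")
    try {
        file.writeText("  apple  \n# comment\nbanana\n\n  APPLE  \ncherry\n")
        println(processLargeFileLazily(file)) // [APPLE, BANANA, CHERRY]
    } finally {
        file.delete()
    }
}
```

Note that distinct() still remembers every unique line it has seen, so memory use grows with the number of distinct lines rather than with the file size.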

Data Pipeline

data class Person(val name: String, val age: Int, val city: String)

fun main() {
    val people = listOf(
        Person("Alice", 25, "New York"),
        Person("Bob", 30, "Los Angeles"),
        Person("Charlie", 35, "New York"),
        Person("Diana", 28, "Chicago"),
        Person("Eve", 32, "Los Angeles"),
        Person("Frank", 27, "New York")
    )
    
    // Data processing pipeline using sequences
    val result = people.asSequence()
        .filter { it.age >= 25 }
        .groupingBy { it.city }
        .eachCount()
        .asSequence()
        .sortedByDescending { it.value }
        .take(2)
        .toList()
    
    println("Top cities with people aged 25+: $result")
    
    // Average age by city
    val avgAgeByCity = people.asSequence()
        .groupBy { it.city }
        .mapValues { (_, people) -> 
            people.asSequence().map { it.age }.average() 
        }
    
    println("Average age by city: $avgAgeByCity")
}

Infinite Sequences

fun main() {
    // Prime numbers generator
    fun isPrime(n: Int): Boolean {
        if (n < 2) return false
        return (2..kotlin.math.sqrt(n.toDouble()).toInt()).none { n % it == 0 }
    }
    
    val primes = generateSequence(2) { it + 1 }
        .filter { isPrime(it) }
        .take(10)
    
    println("First 10 primes: ${primes.toList()}")
    
    // Random number sequence with conditions
    val randomSequence = generateSequence { kotlin.random.Random.nextInt(1, 100) }
        .distinct()
        .filter { it % 5 == 0 }
        .take(5)
    
    println("Random multiples of 5: ${randomSequence.toList()}")
    
    // Collatz sequence
    fun collatz(n: Int) = generateSequence(n) { current ->
        when {
            current == 1 -> null
            current % 2 == 0 -> current / 2
            else -> current * 3 + 1
        }
    }
    
    println("Collatz sequence for 13: ${collatz(13).toList()}")
}

Advanced Sequence Patterns

Windowed Operations

fun main() {
    val numbers = (1..10).asSequence()
    
    // Windowed - sliding window
    val windows = numbers.windowed(size = 3, step = 1)
    println("Windows of size 3:")
    windows.forEach { println(it) }
    
    // Windowed with transformation
    val movingAverages = (1..10).asSequence()
        .windowed(size = 3, step = 1) { window ->
            window.average()
        }
    
    println("Moving averages: ${movingAverages.toList()}")
    
    // Chunked - non-overlapping windows
    val chunks = (1..10).asSequence().chunked(3)
    println("Chunks of size 3:")
    chunks.forEach { println(it) }
}

Sequence Composition

fun main() {
    // Combining multiple sequences
    val seq1 = sequenceOf(1, 2, 3)
    val seq2 = sequenceOf(4, 5, 6)
    val seq3 = sequenceOf(7, 8, 9)
    
    val combined = seq1 + seq2 + seq3
    println("Combined: ${combined.toList()}")
    
    // Zip sequences
    val letters = sequenceOf("a", "b", "c", "d")
    val numbers = sequenceOf(1, 2, 3)
    
    val zipped = letters.zip(numbers) { letter, number -> "$letter$number" }
    println("Zipped: ${zipped.toList()}")
    
    // Flatten nested sequences
    val nestedSequence = sequenceOf(
        sequenceOf(1, 2),
        sequenceOf(3, 4, 5),
        sequenceOf(6)
    )
    
    val flattened = nestedSequence.flatten()
    println("Flattened: ${flattened.toList()}")
}

When to Use Sequences

Use Sequences When:

  • Large datasets: Processing millions of elements
  • Multiple operations: Chaining many transformations
  • Early termination: Using take(), first(), any(), etc.
  • Memory constraints: Cannot afford intermediate collections
  • Infinite data: Working with potentially infinite streams

Use Collections When:

  • Small datasets: Fewer than roughly 1,000 elements (a rule of thumb; measure when it matters)
  • Single operations: Only one transformation
  • Multiple access: Need to iterate multiple times
  • Random access: Need indexing or size information
  • Sorting/grouping: Operations that need all elements

Performance Guidelines

fun main() {
    val data = (1..1000).toList()
    
    // ✅ Good use of sequence - multiple operations, early termination
    val sequenceResult = data.asSequence()
        .filter { it % 2 == 0 }
        .map { it * it }
        .filter { it > 100 }
        .take(5)
        .toList()
    
    // ❌ Poor use of sequence - single operation
    val poorSequence = data.asSequence()
        .map { it * 2 }
        .toList()
    
    // ✅ Better for single operation
    val betterCollection = data.map { it * 2 }
    
    // ❌ Unnecessary sequence - count() must visit every element, so the wrapper only adds overhead
    val badSize = data.asSequence()
        .filter { it % 2 == 0 }
        .count() // No early termination is possible here
    
    // ✅ Simpler - count with a predicate directly on the collection
    val goodSize = data.count { it % 2 == 0 }
}

Common Pitfalls

Repeated Terminal Operations

fun main() {
    val sequence = (1..5).asSequence()
        .onEach { println("Processing $it") }
        .map { it * 2 }
    
    // ❌ This will process the sequence twice!
    println("First access: ${sequence.toList()}")
    println("Second access: ${sequence.toList()}")
    
    // ✅ Store the result if you need it multiple times
    val result = sequence.toList()
    println("First access: $result")
    println("Second access: $result")
}
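
A related pitfall is that some sequences cannot be re-evaluated at all. According to the standard library documentation, generateSequence without a seed returns a sequence that can be iterated only once, and a second terminal operation throws IllegalStateException. The sketch below illustrates the difference; consumeTwice is a hypothetical helper written for this example.

```kotlin
// Hypothetical helper: tries to run a terminal operation on the same sequence twice
fun consumeTwice(sequence: Sequence<Int>): String {
    val first = sequence.toList()
    return try {
        val second = sequence.toList()
        "re-evaluated: $first then $second"
    } catch (e: IllegalStateException) {
        "second pass failed: single-use sequence"
    }
}

fun main() {
    var next = 0
    // generateSequence without a seed is constrained to a single iteration
    val singleUse = generateSequence { if (next < 3) ++next else null }
    println(consumeTwice(singleUse))

    // A sequence backed by a list can be iterated again from the start
    val reusable = listOf(1, 2, 3).asSequence()
    println(consumeTwice(reusable))
}
```

Storing the result of toList() once, as in the example above, avoids both the repeated work and the single-use failure.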

Stateful Operations

fun main() {
    // ❌ sorted() is stateful: it must consume every element before emitting the first one
    val numbers = (1..10).asSequence()
        .sorted() // Needs all elements - not truly lazy
        .take(3)
    
    // ✅ Limiting first is cheaper, but only equivalent when the source is already ordered (as here)
    val betterApproach = (1..10).asSequence()
        .take(3)
        .sorted()
    
    println("Numbers: ${numbers.toList()}")
    println("Better: ${betterApproach.toList()}")
}
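
Reordering sorted() and take() is only safe when the input is already in order. On unsorted data the two pipelines answer different questions, as the sketch below shows; smallestThree and firstThreeSorted are hypothetical names used for illustration.

```kotlin
// Sorts everything, then keeps the three smallest values
fun smallestThree(values: List<Int>): List<Int> =
    values.asSequence().sorted().take(3).toList()

// Keeps the first three values in source order, then sorts only those
fun firstThreeSorted(values: List<Int>): List<Int> =
    values.asSequence().take(3).sorted().toList()

fun main() {
    val unsorted = listOf(9, 4, 7, 1, 8, 2)
    println(smallestThree(unsorted))    // [1, 2, 4]
    println(firstThreeSorted(unsorted)) // [4, 7, 9]
}
```

If the three smallest values are what is needed, the sort cannot be skipped; the cheaper ordering applies only when taking the first elements of the source is the actual goal.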

Key Takeaways

  • Sequences provide lazy evaluation, processing elements on-demand
  • Use sequences for large datasets and multiple chained operations
  • Terminal operations trigger evaluation; intermediate operations are lazy
  • Sequences can be infinite, created with generators
  • Choose sequences vs collections based on data size and usage patterns
  • Be aware of stateful operations that defeat lazy evaluation

Practice Exercises

  1. Create a sequence that generates prime numbers and find the first 50 primes
  2. Process a large dataset using sequences and compare performance with collections
  3. Build a data processing pipeline using sequences for log file analysis
  4. Implement a custom sequence builder for generating test data

Quiz

  1. What's the main difference between collections and sequences in terms of evaluation?
  2. When should you prefer sequences over collections?
  3. What happens when you call a terminal operation on a sequence multiple times?

Answers

  1. Collections use eager evaluation (process all elements through each operation), while sequences use lazy evaluation (process elements one by one through the entire chain).
  2. Use sequences for large datasets, multiple chained operations, early termination scenarios, or when memory is constrained.
  3. The sequence is re-evaluated from the beginning each time - the computation is not cached.