API Reference: LazyListMapper
LazyListMapper[T] represents a lazy evaluation pipeline. Operations are recorded but not executed until you explicitly materialize the results.
Class Signature
Key Characteristics
- Deferred Execution: Operations build a plan without executing
- Memory Efficient: No intermediate lists created
- Single Pass: Generator-based, typically consumed once
- Optimizable: Entire pipeline can be optimized before execution
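The recording behaviour listed above can be illustrated with a minimal plain-Python sketch. `LazySketch` is a hypothetical stand-in written for this page, not the library's LazyListMapper; it assumes operations are stored as a tuple of steps and applied through chained iterators on materialization.

```python
from typing import Any, Callable, Iterable, Iterator, List, Tuple


class LazySketch:
    """Hypothetical stand-in for a lazy pipeline (not the real LazyListMapper)."""

    def __init__(
        self,
        source: Iterable[Any],
        steps: Tuple[Tuple[str, Callable], ...] = (),
    ) -> None:
        self._source = source
        self._steps = steps

    def map(self, fn: Callable[[Any], Any]) -> "LazySketch":
        # Record the step and return a new pipeline; nothing runs here.
        return LazySketch(self._source, self._steps + (("map", fn),))

    def filter(self, fn: Callable[[Any], bool]) -> "LazySketch":
        # Same idea: only the intention to filter is stored.
        return LazySketch(self._source, self._steps + (("filter", fn),))

    def __iter__(self) -> Iterator[Any]:
        stream: Iterator[Any] = iter(self._source)
        for kind, fn in self._steps:
            # Inside a method, map/filter refer to the lazy builtins.
            stream = map(fn, stream) if kind == "map" else filter(fn, stream)
        return stream

    def to_list(self) -> List[Any]:
        # Materialization: pull every element through the recorded steps.
        return list(self)


calls: List[int] = []


def square(x: int) -> int:
    calls.append(x)  # track when the function actually runs
    return x * x


pipeline = LazySketch(range(5)).map(square).filter(lambda x: x > 3)
calls_before = len(calls)    # 0: building the pipeline ran nothing
result = pipeline.to_list()  # [4, 9, 16]
calls_after = len(calls)     # 5: each element was squared on materialization
```

Because each step wraps the previous iterator, no intermediate list is built between `map` and `filter` in this sketch.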
Construction
From ListMapper
The primary way to create a LazyListMapper is from a ListMapper:
from functional_list import ListMapper
# Convert eager to lazy
lazy = ListMapper[int](1, 2, 3, 4, 5).lazy()
Chaining Immediately
from functional_list import ListMapper
# Start eager, switch to lazy, build pipeline
lazy_pipeline = (
    ListMapper[int](*range(1000))
    .lazy()                     # Switch to lazy mode
    .map(lambda x: x * x)       # Not executed yet
    .filter(lambda x: x > 100)  # Not executed yet
)
# Nothing has executed at this point!
Lazy Transformation Methods
These methods record operations without executing them:
map
Record a mapping operation.
Parameters:
- fn: Function to apply to each element
Returns: New LazyListMapper with recorded map operation
Example:
lazy = (
    ListMapper[int](1, 2, 3)
    .lazy()
    .map(lambda x: x * 2)  # Recorded, not executed
    .map(lambda x: x + 1)  # Also recorded
)
result = lazy.to_list() # Now executes: [3, 5, 7]
filter
Record a filtering operation.
Parameters:
- fn: Predicate function
Returns: New LazyListMapper with recorded filter operation
Example:
lazy = (
    ListMapper[int](1, 2, 3, 4, 5, 6)
    .lazy()
    .filter(lambda x: x % 2 == 0)  # Recorded
)
result = lazy.to_list() # Executes: [2, 4, 6]
flat_map
Record a flat-mapping operation.
Parameters:
- fn: Function returning an iterable
Returns: New LazyListMapper with recorded flat_map operation
Example:
lazy = (
    ListMapper[str]("hello world", "foo bar")
    .lazy()
    .flat_map(lambda s: s.split())  # Recorded
)
result = lazy.to_list() # Executes: ['hello', 'world', 'foo', 'bar']
distinct
Record a distinct operation to remove duplicates.
Returns: New LazyListMapper with recorded distinct operation
Example:
lazy = (
    ListMapper[int](1, 2, 2, 3, 1, 4)
    .lazy()
    .distinct()  # Recorded, not executed
)
result = lazy.to_list() # Executes: [1, 2, 3, 4]
# Chained with other operations
lazy = (
    ListMapper[int](1, 2, 3, 2, 1, 4, 5, 6)
    .lazy()
    .map(lambda x: x * 2)      # [2, 4, 6, 4, 2, 8, 10, 12]
    .distinct()                # [2, 4, 6, 8, 10, 12]
    .filter(lambda x: x > 4)   # [6, 8, 10, 12]
)
result = lazy.to_list() # [6, 8, 10, 12]
Order Preservation
distinct() preserves the order of first occurrence, making it deterministic and predictable.
Performance
In lazy mode, distinct() uses streaming deduplication with a set for hashable types, making it memory-efficient for large datasets.
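The two notes above (first-occurrence order and set-based streaming deduplication) can be sketched in plain Python. `streaming_distinct` is a hypothetical helper for illustration; the library's internal implementation may differ.

```python
from typing import Hashable, Iterable, Iterator, Set, TypeVar

T = TypeVar("T", bound=Hashable)


def streaming_distinct(items: Iterable[T]) -> Iterator[T]:
    """Yield each element the first time it appears (illustrative only)."""
    seen: Set[T] = set()
    for item in items:
        if item not in seen:
            seen.add(item)
            yield item  # emitted immediately, so order of first occurrence is kept


deduplicated = list(streaming_distinct([1, 2, 2, 3, 1, 4]))  # [1, 2, 3, 4]
```

In this sketch the input is never copied, but the `seen` set still grows with the number of distinct elements.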
union
Lazily combine two LazyListMapper pipelines.
Parameters:
- other: Another LazyListMapper to union with
Returns: New LazyListMapper that will yield elements from both pipelines
Raises:
- TypeError: If other is not a LazyListMapper instance
Important Notes:
- The union is lazy - no execution until materialization
- Elements from self are yielded first, then elements from other
- Does not remove duplicates - use .union(other).distinct() for that
- Streaming implementation - memory efficient for large datasets
- Type compatibility checking is deferred to materialization time
Examples:
# Basic lazy union
lazy1 = ListMapper[int](1, 2, 3).lazy()
lazy2 = ListMapper[int](4, 5, 6).lazy()
result = lazy1.union(lazy2).to_list()
# Result: [1, 2, 3, 4, 5, 6]
# Union after transformations
lazy1 = ListMapper[int](1, 2, 3).lazy().map(lambda x: x * 2)
lazy2 = ListMapper[int](4, 5, 6).lazy().map(lambda x: x * 3)
result = lazy1.union(lazy2).to_list()
# Result: [2, 4, 6, 12, 15, 18]
# Transformations after union
lazy1 = ListMapper[int](1, 2, 3).lazy()
lazy2 = ListMapper[int](4, 5, 6).lazy()
result = lazy1.union(lazy2).map(lambda x: x * 2).to_list()
# Result: [2, 4, 6, 8, 10, 12]
# Union with distinct for deduplication
lazy1 = ListMapper[int](1, 2, 3).lazy()
lazy2 = ListMapper[int](3, 4, 5).lazy()
result = lazy1.union(lazy2).distinct().to_list()
# Result: [1, 2, 3, 4, 5]
# Multiple unions (chaining)
lazy1 = ListMapper[int](1, 2).lazy()
lazy2 = ListMapper[int](3, 4).lazy()
lazy3 = ListMapper[int](5, 6).lazy()
result = lazy1.union(lazy2).union(lazy3).to_list()
# Result: [1, 2, 3, 4, 5, 6]
# Complex pipeline with union
result = (
    ListMapper[int](1, 2, 3)
    .lazy()
    .map(lambda x: x * 2)
    .union(ListMapper[int](4, 5, 6).lazy().map(lambda x: x * 3))
    .filter(lambda x: x > 5)
    .distinct()
    .to_list()
)
# Result: [6, 12, 15, 18]
Type Safety
union() checks at call time that other is a LazyListMapper, but it does not check that the element types are compatible. Element type compatibility is checked only during iteration or materialization.
Memory Efficiency
Lazy union uses streaming iteration, so it is well suited to combining large datasets without loading everything into memory.
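As a plain-Python analogy, and assuming the union behaves like `itertools.chain` (elements of the first source, then the second, duplicates kept), two large generators can be combined without building either one in memory. The `squares` helper is hypothetical.

```python
from itertools import chain, islice
from typing import Iterator


def squares(start: int, stop: int) -> Iterator[int]:
    """Generate squares one at a time instead of building a list."""
    for x in range(start, stop):
        yield x * x


# Two large sources; neither is materialized.
left = squares(0, 10_000_000)
right = squares(10_000_000, 20_000_000)

combined = chain(left, right)            # streams left first, then right
first_five = list(islice(combined, 5))   # [0, 1, 4, 9, 16]

# Duplicates are kept, matching the "does not remove duplicates" note.
with_duplicates = list(chain([1, 2, 3], [3, 4, 5]))  # [1, 2, 3, 3, 4, 5]
```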
Laziness Preserved
union() does not force evaluation; the combined pipeline remains lazy until you materialize it.
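The deferral can be observed with a traced generator. This is a plain-Python sketch that uses `itertools.chain` as a stand-in for a lazy union; `traced` is a hypothetical helper, not part of the library.

```python
from itertools import chain
from typing import Iterable, Iterator, List

evaluated: List[int] = []


def traced(values: Iterable[int]) -> Iterator[int]:
    """Record each element at the moment it is actually produced."""
    for v in values:
        evaluated.append(v)
        yield v


combined = chain(traced([1, 2, 3]), traced([4, 5, 6]))
evaluated_before = list(evaluated)  # []: combining produced nothing

materialized = list(combined)       # [1, 2, 3, 4, 5, 6]
evaluated_after = list(evaluated)   # [1, 2, 3, 4, 5, 6]
```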
Materialization Methods
These methods trigger execution and return concrete results:
to_list
Execute pipeline and return a Python list.
Returns: Python list with results
Example:
Single Consumption
After calling to_list(), the underlying generator may be exhausted, so materializing the same pipeline a second time may return no elements.
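Assuming the pipeline is backed by an ordinary Python generator, as the note suggests, exhaustion looks like this in plain Python. The `build` function is a hypothetical illustration of the usual workaround: construct a fresh generator for each pass.

```python
from typing import Iterator, List

doubled = (x * 2 for x in [1, 2, 3])
first_pass: List[int] = list(doubled)   # [2, 4, 6]
second_pass: List[int] = list(doubled)  # []: the generator is already consumed


def build() -> Iterator[int]:
    """Return a fresh generator each time it is called."""
    return (x * 2 for x in [1, 2, 3])


fresh_a = list(build())  # [2, 4, 6]
fresh_b = list(build())  # [2, 4, 6]
```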
collect
Execute pipeline and return a ListMapper.
Parameters:
- backend: Optional execution backend
Returns: ListMapper with materialized results
Example:
lazy = ListMapper[int](1, 2, 3).lazy().map(lambda x: x * 2)
eager = lazy.collect() # ListMapper[2, 4, 6]
# Can now use eager operations
result = eager.sort().take(2)
With Backend:
from functional_list import ListMapper, LocalBackend
# expensive_fn stands for any costly per-element function defined elsewhere
lazy = ListMapper[int](*range(10000)).lazy().map(expensive_fn)
# Execute the pipeline using multiprocessing
result = lazy.collect(backend=LocalBackend(mode="processes", workers=4))
Lazy Operations That Require Materialization
These operations record transformations that need to see all elements. The operations are not executed immediately - they're recorded and will be applied when an action method (to_list(), collect(), foreach()) is called.
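A plain-Python sketch of how an operation that needs every element can still be deferred: wrapping `sorted()` in a generator function delays the sort until the first element is requested. `deferred_sort` and `traced` are hypothetical helpers, not the library's implementation.

```python
from typing import Iterable, Iterator, List

consumed: List[int] = []


def traced(values: Iterable[int]) -> Iterator[int]:
    """Record each input element when it is pulled from the source."""
    for v in values:
        consumed.append(v)
        yield v


def deferred_sort(items: Iterable[int], reverse: bool = False) -> Iterator[int]:
    # Generator function: the body, including sorted(), runs only once
    # the first element is requested.
    yield from sorted(items, reverse=reverse)


pipeline = (x * 2 for x in deferred_sort(traced([3, 1, 4, 1, 5, 9])))
consumed_before = len(consumed)       # 0: nothing has been sorted yet

first = next(pipeline)                # 2
consumed_after_first = len(consumed)  # 6: the sort had to read every input
rest = list(pipeline)                 # [2, 6, 8, 10, 18]
```

The sort is lazy in when it starts, but once started it must read the whole input before yielding anything.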
sort
Lazily record a sort operation.
def sort(
    self,
    *,
    key: Optional[Callable[[T], Any]] = None,
    reverse: bool = False,
    backend: Optional[BaseBackend] = None,
) -> LazyListMapper[T]
Parameters:
- key: Optional function to extract comparison key from each element
- reverse: If True, sort in descending order
- backend: Optional backend for execution (used during materialization)
Returns: New LazyListMapper with the sort operation recorded (not executed)
Important: The sort is NOT executed when you call this method. It's recorded as an operation and will be executed only when you call an action method like to_list(), collect(), or foreach().
Examples:
# Sort operation is recorded, NOT executed
lazy = ListMapper(3, 1, 4, 1, 5, 9).lazy()
sorted_lazy = lazy.sort() # NO execution yet!
print("Sort recorded")
# Can continue building the pipeline
mapped = sorted_lazy.map(lambda x: x * 2) # Still no execution
print("Map recorded")
# NOW it executes: sort first, then map
result = mapped.to_list() # Execution happens here!
# Result: [2, 2, 6, 8, 10, 18]
# Sort with key function - also lazy
lazy = ListMapper("apple", "pie", "a", "cherry").lazy()
by_length = lazy.sort(key=lambda x: len(x)) # Recorded
result = by_length.map(str.upper).to_list() # Executed
# Result: ['A', 'PIE', 'APPLE', 'CHERRY']
# Complex pipeline - all lazy until to_list()
result = (
    ListMapper(*range(100))
    .lazy()
    .filter(lambda x: x % 2 == 0)  # Recorded
    .map(lambda x: x ** 2)         # Recorded
    .sort(reverse=True)            # Recorded
    .filter(lambda x: x > 1000)    # Recorded
    .map(lambda x: x // 100)       # Recorded
    .to_list()                     # NOW all operations execute!
)
Truly Lazy
Unlike the previous implementation, sort() now behaves like other lazy operations (map(), filter()). It only records the intention to sort and defers execution until an action method is called.
Execution on Action
Operations are executed when you call:
- to_list() - Materializes to Python list
- collect() - Materializes to ListMapper
- foreach() - Executes for side effects
- Direct iteration - for x in lazy_mapper:
Performance
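Building a lazy pipeline is cheap: each call only records an operation. The recorded operations run when the pipeline is materialized. Streaming steps such as map() and filter() are applied element by element, while sort() must first see all elements before it can yield any.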
order_by_key
Lazily record a sort by natural ordering.
Returns: New LazyListMapper with the sort operation recorded (not executed)
Example:
lazy = ListMapper(3, 1, 4, 1, 5, 9).lazy()
sorted_lazy = lazy.order_by_key() # Operation recorded, NOT executed
result = sorted_lazy.map(lambda x: x * 2).to_list() # NOW executes
# Result: [2, 2, 6, 8, 10, 18]
Terminal Operations
These methods execute the pipeline and return a final value:
reduce
Execute and reduce to a single value.
Parameters:
- fn: Binary reduction function
Returns: Single reduced value
Example:
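As a plain-Python analogy (not the library's code), reducing a lazy pipeline resembles calling `functools.reduce` on a generator: elements are pulled one at a time and folded into a single value.

```python
from functools import reduce

# A lazy stream of squares; nothing is computed until reduce pulls from it.
squares = (x * x for x in [1, 2, 3, 4])

total = reduce(lambda acc, x: acc + x, squares)  # 1 + 4 + 9 + 16 = 30
```

Note that `functools.reduce` raises `TypeError` on an empty input when no initial value is given; whether LazyListMapper.reduce behaves the same way is not stated here.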
foreach
Execute the pipeline for side effects.
Parameters:
- fn: Function to execute (return value ignored)
Returns: None
Example:
lazy = ListMapper[int](1, 2, 3).lazy().map(lambda x: x * 2)
lazy.foreach(lambda x: print(f"Value: {x}"))
# Prints:
# Value: 2
# Value: 4
# Value: 6
take
Execute pipeline and take first n results.
Parameters:
- n: Number of elements to take
Returns: ListMapper with first n results
Example:
# Efficient: stops once 10 elements have passed the filter
lazy = (
    ListMapper[int](*range(1_000_000))
    .lazy()
    .map(lambda x: x * x)
    .filter(lambda x: x > 100)
)
first_10 = lazy.take(10)  # Only computes what's needed!
Early Termination
take(n) is very efficient with lazy pipelines - it stops processing after getting n results.
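Early termination can be demonstrated in plain Python with `itertools.islice`, used here as a stand-in for take(n) under the assumption that take(n) stops pulling elements once n results are available.

```python
from itertools import islice
from typing import List

processed: List[int] = []


def square(x: int) -> int:
    processed.append(x)  # count how many inputs are actually evaluated
    return x * x


# map and filter are lazy builtins, so this line computes nothing.
pipeline = filter(lambda y: y > 100, map(square, range(1_000_000)))

first_10 = list(islice(pipeline, 10))
inputs_evaluated = len(processed)  # 21: inputs 0..20, not one million
```

The first ten squares above 100 come from inputs 11 to 20, so only inputs 0 through 20 are ever squared.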
Operations That Force Materialization
Some operations require seeing all data and return ListMapper:
order_by_key
Sort by key (forces materialization).
Parameters:
- reverse: Sort in descending order if True
Returns: Sorted ListMapper
Example:
lazy = (
    ListMapper(("b", 2), ("a", 1), ("c", 3))
    .lazy()
    .filter(lambda pair: pair[1] > 0)
)
sorted_result = lazy.order_by_key() # Materializes and sorts
# Result: ListMapper[('a', 1), ('b', 2), ('c', 3)]
Iteration Support
You can iterate over a lazy pipeline directly:
lazy = ListMapper[int](1, 2, 3).lazy().map(lambda x: x * 2)
for item in lazy:
    print(item)
# Prints: 2, 4, 6
Single Iteration
Iterating consumes the generator, so a second loop over the same pipeline may yield nothing.
Working with Backends
When you collect() a lazy pipeline, you can specify a backend:
from functional_list import ListMapper, LocalBackend, RayBackend
# Build pipeline (expensive_function and some_predicate are placeholders)
lazy = (
    ListMapper[int](*range(100000))
    .lazy()
    .map(expensive_function)
    .filter(some_predicate)
)
# Execute with threading
result1 = lazy.collect(backend=LocalBackend(mode="threads", workers=10))
# Or execute with Ray instead; rebuild the pipeline first if it was already consumed
result2 = lazy.collect(backend=RayBackend(num_cpus=8))
See the Backends Guide for more details.
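For intuition only, the following standard-library sketch shows what running a map step on a thread pool can look like. It does not use LocalBackend or RayBackend and makes no claim about how those backends are implemented; `slow_double` is a hypothetical placeholder for an expensive function.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def slow_double(x: int) -> int:
    """Placeholder for an expensive per-element function."""
    return x * 2


# Executor.map runs the calls on worker threads and yields results
# in the same order as the inputs.
with ThreadPoolExecutor(max_workers=4) as pool:
    doubled: List[int] = list(pool.map(slow_double, range(10)))

# The filter step is applied afterwards on the collected results.
kept = [x for x in doubled if x > 5]
```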
Best Practices
Use for Large Datasets
Lazy evaluation shines with large data:
# Good: Memory efficient
# (transform is a placeholder for your own row-level function)
lazy = (
    ListMapper.from_parquet("huge_file.parquet")  # Could be GBs
    .lazy()
    .filter(lambda row: row["active"])
    .map(transform)
    .filter(lambda row: row["score"] > 0.8)
)
# Only materialize the first 100 matching rows
first_100 = lazy.take(100)
Combine with Eager for Best Results
Start lazy, materialize when data is reduced:
# Filter down lazily
filtered = (
    ListMapper[int](*range(1_000_000))
    .lazy()
    .filter(lambda x: x % 1000 == 0)  # Reduces to 1,000 items
    .collect()                        # Materialize (now small)
)
# Use eager operations on small result
result = filtered.sort().take(10)
Reuse Pipelines
Build reusable pipeline templates:
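from functional_list import ListMapper

def build_pipeline(threshold: int):
    """Return a function that applies the same lazy pipeline to any ListMapper."""
    def pipeline(data):
        return (
            data.lazy()
            .filter(lambda x: x > threshold)
            .map(lambda x: x * 2)
        )
    return pipeline

# Use it: each call builds a fresh lazy pipeline over the given data
pipeline = build_pipeline(100)
result1 = pipeline(ListMapper[int](*range(1000))).collect()
result2 = pipeline(ListMapper[int](*range(500))).collect()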
Comparison: Eager vs Lazy
| Operation | Eager (ListMapper) | Lazy (LazyListMapper) |
|---|---|---|
| map() | ✅ Executes immediately | ⏸️ Records for later |
| filter() | ✅ Executes immediately | ⏸️ Records for later |
| collect() | N/A | ✅ Triggers execution |
| to_list() | Copies to list | ✅ Triggers execution |
| Indexing [i] | ✅ Works | ❌ Not supported |
| Multiple iterations | ✅ Yes | ⚠️ Once (generator) |
| Memory usage | Higher (intermediates) | Lower (streaming) |
| Early termination | ❌ No | ✅ Yes (take) |
Common Patterns
Pattern 1: Filter-Map-Reduce
result = (
    ListMapper[int](*range(1000))
    .lazy()
    .filter(lambda x: x % 2 == 0)
    .map(lambda x: x * x)
    .reduce(lambda x, y: x + y)
)
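For reference, the same filter-map-reduce computation can be written with standard-library generators only. This is an equivalent plain-Python sketch, not library code; it computes the sum of the squares of the even numbers below 1000.

```python
from functools import reduce

evens = (x for x in range(1000) if x % 2 == 0)  # lazy filter
squares = (x * x for x in evens)                # lazy map
total = reduce(lambda x, y: x + y, squares)     # terminal step: 166167000
```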
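Pattern 2: Take the First N After Complex Processing
# parse_row, is_valid and calculate_score are placeholders for your own functions
first_10 = (
    ListMapper.from_csv("large_file.csv")
    .lazy()
    .map(parse_row)
    .filter(is_valid)
    .map(calculate_score)
    .filter(lambda score: score > 0)
    .take(10)  # Efficient: stops after 10 results
)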
Pattern 3: Lazy → Eager → Lazy
result = (
    large_data
    .lazy()
    .filter(expensive_filter)  # Lazy
    .collect()                 # Materialize (now smaller)
    .sort()                    # Eager operation
    .lazy()                    # Back to lazy
    .map(transform)            # Lazy again
    .take(100)                 # Efficient
)
See Also
- ListMapper API - Eager evaluation mode
- Eager vs Lazy Concepts - When to use each
- Lazy Pipelines Guide - Detailed lazy evaluation guide
- Backends Guide - Parallel execution