API Reference: LazyListMapper

LazyListMapper[T] represents a lazy evaluation pipeline. Operations are recorded but not executed until you explicitly materialize the results.

Class Signature

class LazyListMapper(Generic[T]):
    """Lazy evaluation pipeline for functional transformations."""

Key Characteristics

  • Deferred Execution: Operations build a plan without executing
  • Memory Efficient: No intermediate lists created
  • Single Pass: Generator-based, typically consumed once
  • Optimizable: Entire pipeline can be optimized before execution
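The characteristics above can be illustrated with a minimal sketch of a record-then-execute pipeline. This is not the library's implementation: `MiniLazy`, `square`, and `calls` are hypothetical names used only to show how operations might be recorded as a plan and run when the result is materialized.

```python
from typing import Any, Callable, Iterable, Iterator, List, Tuple

Op = Tuple[str, Callable[[Any], Any]]


class MiniLazy:
    """Hypothetical stand-in, NOT the real LazyListMapper."""

    def __init__(self, source: Iterable[Any], ops: Tuple[Op, ...] = ()) -> None:
        self._source = source
        self._ops = ops  # recorded plan: ("map" | "filter", fn) pairs

    def map(self, fn: Callable[[Any], Any]) -> "MiniLazy":
        # Only extends the plan; fn is not called here.
        return MiniLazy(self._source, self._ops + (("map", fn),))

    def filter(self, fn: Callable[[Any], bool]) -> "MiniLazy":
        return MiniLazy(self._source, self._ops + (("filter", fn),))

    def __iter__(self) -> Iterator[Any]:
        # Chain built-in lazy iterators; no intermediate lists are built.
        stream: Iterator[Any] = iter(self._source)
        for kind, fn in self._ops:
            stream = map(fn, stream) if kind == "map" else filter(fn, stream)
        return stream

    def to_list(self) -> List[Any]:
        return list(self)


calls: List[int] = []


def square(x: int) -> int:
    calls.append(x)  # track when the function actually runs
    return x * x


pipeline = MiniLazy(range(5)).map(square).filter(lambda x: x > 3)
calls_before = list(calls)   # [] : nothing has run yet
result = pipeline.to_list()  # [4, 9, 16]
```

Here `square` is first called inside `to_list()`, which is the behavior "Deferred Execution" describes.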

Construction

From ListMapper

The primary way to create a LazyListMapper is from a ListMapper:

from functional_list import ListMapper

# Convert eager to lazy
lazy = ListMapper[int](1, 2, 3, 4, 5).lazy()

Chaining Immediately

from functional_list import ListMapper

# Start eager, switch to lazy, build pipeline
lazy_pipeline = (
    ListMapper[int](*range(1000))
    .lazy()                         # Switch to lazy mode
    .map(lambda x: x * x)          # Not executed yet
    .filter(lambda x: x > 100)     # Not executed yet
)

# Nothing has executed at this point!

Lazy Transformation Methods

These methods record operations without executing them:

map

Record a mapping operation.

def map(self, fn: Callable[[T], U]) -> LazyListMapper[U]

Parameters:

  • fn: Function to apply to each element

Returns: New LazyListMapper with recorded map operation

Example:

lazy = (
    ListMapper[int](1, 2, 3)
    .lazy()
    .map(lambda x: x * 2)      # Recorded, not executed
    .map(lambda x: x + 1)      # Also recorded
)

result = lazy.to_list()  # Now executes: [3, 5, 7]

filter

Record a filtering operation.

def filter(self, fn: Callable[[T], bool]) -> LazyListMapper[T]

Parameters:

  • fn: Predicate function

Returns: New LazyListMapper with recorded filter operation

Example:

lazy = (
    ListMapper[int](1, 2, 3, 4, 5, 6)
    .lazy()
    .filter(lambda x: x % 2 == 0)  # Recorded
)

result = lazy.to_list()  # Executes: [2, 4, 6]

flat_map

Record a flat-mapping operation.

def flat_map(self, fn: Callable[[T], Iterable[U]]) -> LazyListMapper[U]

Parameters:

  • fn: Function returning an iterable

Returns: New LazyListMapper with recorded flat_map operation

Example:

lazy = (
    ListMapper[str]("hello world", "foo bar")
    .lazy()
    .flat_map(lambda s: s.split())  # Recorded
)

result = lazy.to_list()  # Executes: ['hello', 'world', 'foo', 'bar']
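As a rough plain-Python equivalent, a lazy flat-map can be sketched with the standard library. The helper name `lazy_flat_map` is hypothetical, and this is not necessarily how the library implements the operation.

```python
from itertools import chain
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def lazy_flat_map(fn: Callable[[T], Iterable[U]], items: Iterable[T]) -> Iterator[U]:
    """Apply fn to each item and flatten the results one element at a time."""
    return chain.from_iterable(map(fn, items))


words = lazy_flat_map(str.split, ["hello world", "foo bar"])
flat_words = list(words)  # ['hello', 'world', 'foo', 'bar']
```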

distinct

Record a distinct operation to remove duplicates.

def distinct(self) -> LazyListMapper[T]

Returns: New LazyListMapper with recorded distinct operation

Example:

lazy = (
    ListMapper[int](1, 2, 2, 3, 1, 4)
    .lazy()
    .distinct()  # Recorded, not executed
)

result = lazy.to_list()  # Executes: [1, 2, 3, 4]

# Chained with other operations
lazy = (
    ListMapper[int](1, 2, 3, 2, 1, 4, 5, 6)
    .lazy()
    .map(lambda x: x * 2)      # [2, 4, 6, 4, 2, 8, 10, 12]
    .distinct()                # [2, 4, 6, 8, 10, 12]
    .filter(lambda x: x > 4)   # [6, 8, 10, 12]
)

result = lazy.to_list()  # [6, 8, 10, 12]

Order Preservation

distinct() preserves the order of first occurrence, making it deterministic and predictable.

Performance

In lazy mode, distinct() uses streaming deduplication with a set for hashable types, making it memory-efficient for large datasets.
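Streaming deduplication with a set can be sketched as a generator. This is an illustration of the technique for hashable elements, not the library's code; `streaming_distinct` is a hypothetical name.

```python
from typing import Hashable, Iterable, Iterator, Set, TypeVar

H = TypeVar("H", bound=Hashable)


def streaming_distinct(items: Iterable[H]) -> Iterator[H]:
    """Yield each element the first time it appears, preserving order."""
    seen: Set[H] = set()
    for item in items:
        if item not in seen:
            seen.add(item)
            yield item


deduped = list(streaming_distinct([1, 2, 2, 3, 1, 4]))  # [1, 2, 3, 4]
```

In a sketch like this the set grows with the number of distinct elements, not with the total input size, so memory use depends on how many unique values the data contains.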

union

Lazily combine two LazyListMapper pipelines.

def union(self, other: LazyListMapper[T]) -> LazyListMapper[T]

Parameters:

  • other: Another LazyListMapper to union with

Returns: New LazyListMapper that will yield elements from both pipelines

Raises:

  • TypeError: If other is not a LazyListMapper instance

Important Notes:

  • The union is lazy: nothing executes until materialization
  • Elements from self are yielded first, then elements from other
  • Does not remove duplicates; use .union(other).distinct() for that
  • Streaming implementation, memory efficient for large datasets
  • Type compatibility checking is deferred to materialization time
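The ordering and laziness described in these notes match what `itertools.chain` does for plain iterables. The sketch below assumes that analogy; `lazy_union`, `tracked`, and `pulled` are hypothetical names, not part of functional_list.

```python
from itertools import chain
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def lazy_union(first: Iterable[T], second: Iterable[T]) -> Iterator[T]:
    """Yield everything from first, then everything from second."""
    return chain(first, second)


pulled: List[str] = []


def tracked(label: str, items: Iterable[int]) -> Iterator[int]:
    for item in items:
        pulled.append(f"{label}:{item}")  # record when an element is read
        yield item


combined = lazy_union(tracked("a", [1, 2, 3]), tracked("b", [3, 4, 5]))
pulled_before = list(pulled)   # [] : building the union reads nothing
union_result = list(combined)  # [1, 2, 3, 3, 4, 5] : duplicates are kept
```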

Examples:

# Basic lazy union
lazy1 = ListMapper[int](1, 2, 3).lazy()
lazy2 = ListMapper[int](4, 5, 6).lazy()
result = lazy1.union(lazy2).to_list()
# Result: [1, 2, 3, 4, 5, 6]

# Union after transformations
lazy1 = ListMapper[int](1, 2, 3).lazy().map(lambda x: x * 2)
lazy2 = ListMapper[int](4, 5, 6).lazy().map(lambda x: x * 3)
result = lazy1.union(lazy2).to_list()
# Result: [2, 4, 6, 12, 15, 18]

# Transformations after union
lazy1 = ListMapper[int](1, 2, 3).lazy()
lazy2 = ListMapper[int](4, 5, 6).lazy()
result = lazy1.union(lazy2).map(lambda x: x * 2).to_list()
# Result: [2, 4, 6, 8, 10, 12]

# Union with distinct for deduplication
lazy1 = ListMapper[int](1, 2, 3).lazy()
lazy2 = ListMapper[int](3, 4, 5).lazy()
result = lazy1.union(lazy2).distinct().to_list()
# Result: [1, 2, 3, 4, 5]

# Multiple unions (chaining)
lazy1 = ListMapper[int](1, 2).lazy()
lazy2 = ListMapper[int](3, 4).lazy()
lazy3 = ListMapper[int](5, 6).lazy()
result = lazy1.union(lazy2).union(lazy3).to_list()
# Result: [1, 2, 3, 4, 5, 6]

# Complex pipeline with union
result = (
    ListMapper[int](1, 2, 3)
    .lazy()
    .map(lambda x: x * 2)
    .union(ListMapper[int](4, 5, 6).lazy().map(lambda x: x * 3))
    .filter(lambda x: x > 5)
    .distinct()
    .to_list()
)
# Result: [6, 12, 15, 18]

Type Safety

Lazy union validates that the other parameter is a LazyListMapper when the method is called, but it does not check element type compatibility at that point. Element types are checked during iteration/materialization.

lazy1 = ListMapper(1, 2, 3).lazy()
eager = ListMapper(4, 5, 6)
lazy1.union(eager)  # TypeError: requires another LazyListMapper

Memory Efficiency

Lazy union uses streaming iteration, so it can combine large datasets without loading everything into memory:

source1 = ListMapper.from_parquet("large1.parquet").lazy()
source2 = ListMapper.from_parquet("large2.parquet").lazy()

result = (
    source1
    .union(source2)
    .filter(lambda row: row["active"])
    .take(1000)  # Only processes what's needed!
)

Laziness Preserved

Union doesn't force evaluation - it remains lazy until you materialize:

lazy1 = ListMapper(1, 2, 3).lazy()
lazy2 = ListMapper(4, 5, 6).lazy()

# Union doesn't execute
union_lazy = lazy1.union(lazy2)

# Still lazy - can add more operations
result = union_lazy.map(lambda x: x * 2).to_list()

Materialization Methods

These methods trigger execution and return concrete results:

to_list

Execute pipeline and return a Python list.

def to_list(self) -> List[T]

Returns: Python list with results

Example:

lazy = ListMapper[int](1, 2, 3).lazy().map(lambda x: x * 2)
result = lazy.to_list()  # [2, 4, 6]

Single Consumption

After calling to_list(), the generator may be exhausted:

lazy = ListMapper[int](1, 2).lazy().map(lambda x: x * 2)
list1 = lazy.to_list()  # [2, 4]
list2 = lazy.to_list()  # [] - generator exhausted!
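Plain Python generators show the same single-use behavior, which is a useful mental model if the pipeline is generator-backed as described. In this sketch `build_doubled` is a hypothetical factory; rebuilding the pipeline through a function is one way to get a fresh pass.

```python
from typing import Iterator


def build_doubled() -> Iterator[int]:
    """Return a fresh generator each time it is called."""
    return (x * 2 for x in [1, 2])


doubled = build_doubled()
first_pass = list(doubled)          # [2, 4]
second_pass = list(doubled)         # [] : the generator is exhausted
fresh_pass = list(build_doubled())  # [2, 4] : a new generator starts over
```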

collect

Execute pipeline and return a ListMapper.

def collect(self, *, backend: Optional[Backend] = None) -> ListMapper[T]

Parameters:

  • backend: Optional execution backend

Returns: ListMapper with materialized results

Example:

lazy = ListMapper[int](1, 2, 3).lazy().map(lambda x: x * 2)
eager = lazy.collect()  # ListMapper[2, 4, 6]

# Can now use eager operations
result = eager.sort().take(2)

With Backend:

from functional_list import ListMapper, LocalBackend

def expensive_fn(x: int) -> int:
    """Placeholder for a CPU-heavy computation."""
    return x * x

lazy = ListMapper[int](*range(10000)).lazy().map(expensive_fn)

# Execute the pipeline using multiprocessing
result = lazy.collect(backend=LocalBackend(mode="processes", workers=4))

Lazy Operations That Require Materialization

These operations record transformations that need to see all elements. The operations are not executed immediately - they're recorded and will be applied when an action method (to_list(), collect(), foreach()) is called.

sort

Lazily record a sort operation.

def sort(
    self,
    *,
    key: Optional[Callable[[T], Any]] = None,
    reverse: bool = False,
    backend: Optional[BaseBackend] = None,
) -> LazyListMapper[T]

Parameters:

  • key: Optional function to extract a comparison key from each element
  • reverse: If True, sort in descending order
  • backend: Optional backend for execution (used during materialization)

Returns: New LazyListMapper with the sort operation recorded (not executed)

Important: The sort is NOT executed when you call this method. It's recorded as an operation and will be executed only when you call an action method like to_list(), collect(), or foreach().

Examples:

# Sort operation is recorded, NOT executed
lazy = ListMapper(3, 1, 4, 1, 5, 9).lazy()
sorted_lazy = lazy.sort()  # NO execution yet!
print("Sort recorded")

# Can continue building the pipeline
mapped = sorted_lazy.map(lambda x: x * 2)  # Still no execution
print("Map recorded")

# NOW it executes: sort first, then map
result = mapped.to_list()  # Execution happens here!
# Result: [2, 2, 6, 8, 10, 18]

# Sort with key function - also lazy
lazy = ListMapper("apple", "pie", "a", "cherry").lazy()
by_length = lazy.sort(key=lambda x: len(x))  # Recorded
result = by_length.map(str.upper).to_list()   # Executed
# Result: ['A', 'PIE', 'APPLE', 'CHERRY']

# Complex pipeline - all lazy until to_list()
result = (
    ListMapper(*range(100))
    .lazy()
    .filter(lambda x: x % 2 == 0)       # Recorded
    .map(lambda x: x ** 2)              # Recorded
    .sort(reverse=True)                 # Recorded
    .filter(lambda x: x > 1000)         # Recorded
    .map(lambda x: x // 100)            # Recorded
    .to_list()                          # NOW all operations execute!
)

Truly Lazy

Unlike the previous implementation, sort() now behaves like other lazy operations (map(), filter()). It only records the intention to sort and defers execution until an action method is called.
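One way to picture a deferred sort is a generator that buffers and sorts its input only when the first element is requested. The sketch below is an assumption about the general technique, not the library's implementation; `deferred_sort`, `source`, and `events` are hypothetical names.

```python
from typing import Any, Callable, Iterable, Iterator, List, Optional


def deferred_sort(
    items: Iterable[Any],
    *,
    key: Optional[Callable[[Any], Any]] = None,
    reverse: bool = False,
) -> Iterator[Any]:
    # Generator function: the body does not run until the first next() call.
    yield from sorted(items, key=key, reverse=reverse)


events: List[str] = []


def source() -> Iterator[int]:
    for x in [3, 1, 4, 1, 5, 9]:
        events.append(f"read {x}")  # record when the input is consumed
        yield x


sorted_stage = deferred_sort(source())
doubled = (x * 2 for x in sorted_stage)
events_before = list(events)  # [] : nothing read, nothing sorted
first = next(doubled)         # 2 : all six inputs are read before this returns
rest = list(doubled)          # [2, 6, 8, 10, 18]
```

Sorting still has to see every element, so the sort step buffers the whole input when it finally runs; what is deferred is the moment that happens.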

Execution on Action

Operations are executed when you call:

  • to_list(): materializes to a Python list
  • collect(): materializes to a ListMapper
  • foreach(): executes for side effects
  • Direct iteration: for x in lazy_mapper:

Performance

Building a lazy pipeline is cheap: each call only records an operation. The recorded operations run together when the pipeline is materialized.

order_by_key

Lazily record a sort by natural ordering.

def order_by_key(self) -> LazyListMapper[T]

Returns: New LazyListMapper with the sort operation recorded (not executed)

Example:

lazy = ListMapper(3, 1, 4, 1, 5, 9).lazy()
sorted_lazy = lazy.order_by_key()  # Operation recorded, NOT executed
result = sorted_lazy.map(lambda x: x * 2).to_list()  # NOW executes
# Result: [2, 2, 6, 8, 10, 18]

Terminal Operations

These methods execute the pipeline and return a final value:

reduce

Execute and reduce to a single value.

def reduce(self, fn: Callable[[T, T], T]) -> T

Parameters:

  • fn: Binary reduction function

Returns: Single reduced value

Example:

lazy = ListMapper[int](1, 2, 3, 4, 5).lazy()
total = lazy.reduce(lambda x, y: x + y)  # 15

foreach

Execute the pipeline for side effects.

def foreach(self, fn: Callable[[T], Any]) -> None

Parameters:

  • fn: Function to execute (return value ignored)

Returns: None

Example:

lazy = ListMapper[int](1, 2, 3).lazy().map(lambda x: x * 2)
lazy.foreach(lambda x: print(f"Value: {x}"))
# Prints:
# Value: 2
# Value: 4
# Value: 6

take

Execute pipeline and take first n results.

def take(self, n: int) -> ListMapper[T]

Parameters:

  • n: Number of elements to take

Returns: ListMapper with first n results

Example:

# Efficient: stops as soon as 10 results have passed the filter
lazy = (
    ListMapper[int](*range(1_000_000))
    .lazy()
    .map(lambda x: x * x)
    .filter(lambda x: x > 100)
)

top_10 = lazy.take(10)  # Only computes what's needed!

Early Termination

take(n) stops pulling elements through the pipeline once it has n results, so the rest of the input is never processed.
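Early termination can be observed with standard-library iterators. The sketch below mirrors the example pipeline using `map`, `filter`, and `itertools.islice` in place of the library calls; `square` and `evaluated` are hypothetical names used to count how many inputs are actually processed.

```python
from itertools import islice
from typing import List

evaluated: List[int] = []


def square(x: int) -> int:
    evaluated.append(x)  # record every input that is actually processed
    return x * x


squares = map(square, range(1_000_000))
large = filter(lambda x: x > 100, squares)
top_10 = list(islice(large, 10))  # stop after ten results

# Inputs 0..10 are squared but rejected by the filter, and inputs 11..20
# produce the ten results, so only 21 of the 1,000,000 inputs are touched.
inputs_processed = len(evaluated)  # 21
```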

Operations That Force Materialization

Some operations require seeing all data and return ListMapper:

order_by_key

Sort by key (forces materialization).

def order_by_key(self, *, reverse: bool = False) -> ListMapper[Tuple[K, V]]

Parameters:

  • reverse: Sort in descending order if True

Returns: Sorted ListMapper

Example:

lazy = (
    ListMapper(("b", 2), ("a", 1), ("c", 3))
    .lazy()
    .filter(lambda pair: pair[1] > 0)
)

sorted_result = lazy.order_by_key()  # Materializes and sorts
# Result: ListMapper[('a', 1), ('b', 2), ('c', 3)]

Iteration Support

You can iterate over a lazy pipeline directly:

lazy = ListMapper[int](1, 2, 3).lazy().map(lambda x: x * 2)

for item in lazy:
    print(item)
# Prints 2, 4 and 6, one per line

Single Iteration

Iterating consumes the generator:

lazy = ListMapper[int](1, 2).lazy().map(lambda x: x * 2)

for item in lazy:
    print(item)  # 2, 4

for item in lazy:
    print(item)  # Nothing - already consumed!

Working with Backends

When you collect() a lazy pipeline, you can specify a backend:

from functional_list import ListMapper, LocalBackend, RayBackend

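# Build pipeline (expensive_function and some_predicate stand for
# your own callables)
lazy = (
    ListMapper[int](*range(100000))
    .lazy()
    .map(expensive_function)
    .filter(some_predicate)
)

# Execute with threading
result1 = lazy.collect(backend=LocalBackend(mode="threads", workers=10))

# Or execute with Ray instead. A lazy pipeline is typically consumed once
# (see Single Consumption), so rebuild it before collecting a second time.
result2 = lazy.collect(backend=RayBackend(num_cpus=8))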

See the Backends Guide for more details.

Best Practices

Use for Large Datasets

Lazy evaluation shines with large data:

# Good: Memory efficient
lazy = (
    ListMapper.from_parquet("huge_file.parquet")  # Could be GBs
    .lazy()
    .filter(lambda row: row["active"])
    .map(transform)
    .filter(lambda row: row["score"] > 0.8)
)

# Only materialize top results
top_100 = lazy.take(100)

Combine with Eager for Best Results

Start lazy, materialize when data is reduced:

# Filter down lazily
filtered = (
    ListMapper[int](*range(1_000_000))
    .lazy()
    .filter(lambda x: x % 1000 == 0)  # Reduces to ~1000 items
    .collect()                         # Materialize (now small)
)

# Use eager operations on small result
result = filtered.sort().take(10)

Reuse Pipelines

Build reusable pipeline templates:

def build_pipeline(threshold: int):
    """Return a function that applies the same lazy steps to any ListMapper."""
    def apply(data):
        return (
            data.lazy()
            .filter(lambda x: x > threshold)
            .map(lambda x: x * 2)
        )
    return apply

# Use it
pipeline = build_pipeline(100)
result1 = pipeline(ListMapper[int](*range(1000))).collect()
result2 = pipeline(ListMapper[int](*range(500))).collect()

Comparison: Eager vs Lazy

| Operation | Eager (ListMapper) | Lazy (LazyListMapper) |
|---|---|---|
| map() | ✅ Executes immediately | ⏸️ Records for later |
| filter() | ✅ Executes immediately | ⏸️ Records for later |
| collect() | N/A | ✅ Triggers execution |
| to_list() | Copies to list | ✅ Triggers execution |
| Indexing [i] | ✅ Works | ❌ Not supported |
| Multiple iterations | ✅ Yes | ⚠️ Once (generator) |
| Memory usage | Higher (intermediates) | Lower (streaming) |
| Early termination | ❌ No | ✅ Yes (take) |

Common Patterns

Pattern 1: Filter-Map-Reduce

result = (
    ListMapper[int](*range(1000))
    .lazy()
    .filter(lambda x: x % 2 == 0)
    .map(lambda x: x * x)
    .reduce(lambda x, y: x + y)
)
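For comparison, the same filter-map-reduce computation can be written with plain generators and `functools.reduce`. This is a standard-library sketch of the pattern, not the library's internals, and it gives a reference value for the result.

```python
from functools import reduce

# Each stage is a generator expression, so no intermediate list is built.
evens = (x for x in range(1000) if x % 2 == 0)
squares = (x * x for x in evens)

# Sum of the squares of the even numbers below 1000.
total = reduce(lambda x, y: x + y, squares)  # 166167000
```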

Pattern 2: Take Top N After Complex Processing

top_10 = (
    ListMapper.from_csv("large_file.csv")
    .lazy()
    .map(parse_row)
    .filter(is_valid)
    .map(calculate_score)
    .filter(lambda score: score > 0)
    .take(10)  # Efficient!
)

Pattern 3: Lazy → Eager → Lazy

result = (
    large_data
    .lazy()
    .filter(expensive_filter)   # Lazy
    .collect()                   # Materialize (now smaller)
    .sort()                      # Eager operation
    .lazy()                      # Back to lazy
    .map(transform)              # Lazy again
    .take(100)                   # Efficient
)

See Also