Class BaseDocumentTransformer<RunInput, RunOutput> (Abstract)

Abstract base class for document transformation systems.

A document transformation system takes an array of Documents and returns an array of transformed Documents. The input and output arrays need not have the same length.

One example of this is a text splitter that splits a large document into many smaller documents.
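The text-splitter case can be sketched with local stand-in types. This is a minimal illustration, not the library's implementation: `Doc`, `SketchDocumentTransformer`, and `FixedSizeSplitter` are hypothetical names, and fixed-size character chunking is an assumption chosen for brevity.

```typescript
// Local stand-in for a document; the library's own Document type is not used here.
interface Doc {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// Hypothetical base class mirroring the shape described on this page:
// invoke() delegates to transformDocuments().
abstract class SketchDocumentTransformer {
  abstract transformDocuments(documents: Doc[]): Promise<Doc[]>;

  invoke(input: Doc[]): Promise<Doc[]> {
    return this.transformDocuments(input);
  }
}

// Splits each document into fixed-size character chunks (illustrative only).
class FixedSizeSplitter extends SketchDocumentTransformer {
  private readonly chunkSize: number;

  constructor(chunkSize: number) {
    super();
    if (chunkSize <= 0) throw new Error("chunkSize must be positive");
    this.chunkSize = chunkSize;
  }

  async transformDocuments(documents: Doc[]): Promise<Doc[]> {
    const out: Doc[] = [];
    for (const doc of documents) {
      for (let i = 0; i < doc.pageContent.length; i += this.chunkSize) {
        out.push({
          pageContent: doc.pageContent.slice(i, i + this.chunkSize),
          // Keep the original metadata and record where the chunk started.
          metadata: { ...doc.metadata, chunkStart: i },
        });
      }
    }
    return out;
  }
}
```

One input document can yield several output documents, which is why the input and output arrays may differ in length.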

Type Parameters

Hierarchy

Constructors

Properties

name?: string

Methods

  • invoke: Invokes the document transformation by calling the transformDocuments method with the provided input.

    Parameters

    • input: RunInput

      The input documents to be transformed.

    • _options: BaseCallbackConfig

      Optional configuration object to customize the behavior of callbacks.

    Returns Promise<RunOutput>

    A Promise that resolves to the transformed documents.

  • pipe: Create a new runnable sequence that runs each individual runnable in series, piping the output of one runnable into another runnable or runnable-like.

    Type Parameters

    • NewRunOutput

    Parameters

    • coerceable: RunnableLike<RunOutput, NewRunOutput>

      A runnable, function, or object whose values are functions or runnables.

    Returns Runnable<RunInput, Exclude<NewRunOutput, Error>, RunnableConfig>

    A new runnable sequence.
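The idea of running steps in series, with each output feeding the next input, can be sketched with plain async functions. This is an illustration of the composition pattern only; `Step`, `pipeSteps`, `splitWords`, and `countItems` are hypothetical names and do not belong to the library.

```typescript
// A step is any async function from input to output (a stand-in for a runnable).
type Step<I, O> = (input: I) => Promise<O>;

// Compose two steps in series: the output of `first` becomes the input of `second`.
function pipeSteps<A, B, C>(first: Step<A, B>, second: Step<B, C>): Step<A, C> {
  return async (input: A) => second(await first(input));
}

// Example steps (hypothetical): split text into words, then count them.
const splitWords: Step<string, string[]> = async (text) =>
  text.split(/\s+/).filter((w) => w.length > 0);
const countItems: Step<string[], number> = async (items) => items.length;

// The composed step takes a string and resolves to a number.
const countWords = pipeSteps(splitWords, countItems);
```

The type parameters make the chaining constraint explicit: the second step must accept whatever the first step produces.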

  • streamEvents: Generate a stream of events emitted by the internal steps of the runnable.

    Use this method to create an iterator over StreamEvents that provide real-time information about the progress of the runnable, including StreamEvents from intermediate results.

    A StreamEvent is a dictionary with the following schema:

    • event: string - Event names are of the format: on_[runnable_type]_(start|stream|end).
    • name: string - The name of the runnable that generated the event.
    • run_id: string - Randomly generated ID associated with the given execution of the runnable that emitted the event. A child runnable that gets invoked as part of the execution of a parent runnable is assigned its own unique ID.
    • tags: string[] - The tags of the runnable that generated the event.
    • metadata: Record<string, any> - The metadata of the runnable that generated the event.
    • data: Record<string, any>

    The table below illustrates some events that might be emitted by various chains. Metadata fields have been omitted from the table for brevity.

    | event | name | chunk | input | output |
    |---|---|---|---|---|
    | on_llm_start | [model name] | | {'input': 'hello'} | |
    | on_llm_stream | [model name] | 'Hello' OR AIMessageChunk("hello") | | |
    | on_llm_end | [model name] | | 'Hello human!' | |
    | on_chain_start | format_docs | | | |
    | on_chain_stream | format_docs | "hello world!, goodbye world!" | | |
    | on_chain_end | format_docs | | [Document(...)] | "hello world!, goodbye world!" |
    | on_tool_start | some_tool | | {"x": 1, "y": "2"} | |
    | on_tool_stream | some_tool | {"x": 1, "y": "2"} | | |
    | on_tool_end | some_tool | | | {"x": 1, "y": "2"} |
    | on_retriever_start | [retriever name] | | {"query": "hello"} | |
    | on_retriever_chunk | [retriever name] | {documents: [...]} | | |
    | on_retriever_end | [retriever name] | | {"query": "hello"} | {documents: [...]} |
    | on_prompt_start | [template_name] | | {"question": "hello"} | |
    | on_prompt_end | [template_name] | | {"question": "hello"} | ChatPromptValue(messages: [SystemMessage, ...]) |

    Parameters

    Returns AsyncGenerator<StreamEvent, any, unknown>
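Consuming such an event stream usually means iterating with `for await` and filtering on the event name. The sketch below assumes a local type that follows the schema listed above and a hand-written event source; `SketchStreamEvent`, `fakeEvents`, and `collectChunks` are hypothetical, and the `chunk` and `output` keys inside `data` are assumptions based on the table's column names.

```typescript
// Local type following the StreamEvent schema listed above.
interface SketchStreamEvent {
  event: string;
  name: string;
  run_id: string;
  tags: string[];
  metadata: Record<string, unknown>;
  data: Record<string, unknown>;
}

// Hypothetical event source standing in for a real runnable's event stream.
async function* fakeEvents(): AsyncGenerator<SketchStreamEvent> {
  const base = {
    name: "format_docs",
    run_id: "run-1",
    tags: [] as string[],
    metadata: {},
  };
  yield { ...base, event: "on_chain_start", data: {} };
  yield { ...base, event: "on_chain_stream", data: { chunk: "hello world!" } };
  yield { ...base, event: "on_chain_stream", data: { chunk: ", goodbye world!" } };
  yield {
    ...base,
    event: "on_chain_end",
    data: { output: "hello world!, goodbye world!" },
  };
}

// Collect only the streamed chunks, ignoring start and end events.
async function collectChunks(
  events: AsyncIterable<SketchStreamEvent>,
): Promise<string[]> {
  const chunks: string[] = [];
  for await (const e of events) {
    const chunk = e.data["chunk"];
    if (e.event.endsWith("_stream") && typeof chunk === "string") {
      chunks.push(chunk);
    }
  }
  return chunks;
}
```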

  • streamLog: Stream all output from a runnable, as reported to the callback system. This includes all inner runs of LLMs, Retrievers, Tools, etc. Output is streamed as Log objects, which include a list of jsonpatch ops that describe how the state of the run has changed in each step, and the final state of the run. The jsonpatch ops can be applied in order to construct state.

    Parameters

    Returns AsyncGenerator<RunLogPatch, any, unknown>
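To make "applied in order to construct state" concrete, here is a minimal sketch of folding patch operations into a state object. It assumes a heavily reduced patch format: only `add` and `replace` on object keys are handled, both are treated as plain assignment, and array indices, the `-` append token, and JSON Pointer escaping are not supported. `PatchOp`, `applyOp`, and `applyPatches` are hypothetical names; the ops carried by a real RunLogPatch cover more of JSON Patch than this.

```typescript
// A reduced JSON Patch operation (assumption: only "add" and "replace" on object keys).
interface PatchOp {
  op: "add" | "replace";
  path: string; // e.g. "/logs/step1"
  value: unknown;
}

type State = { [key: string]: unknown };

// Apply one operation to the state, mutating it in place.
function applyOp(state: State, op: PatchOp): void {
  const keys = op.path.split("/").slice(1);
  const last = keys.pop();
  if (last === undefined) throw new Error(`unsupported path: ${op.path}`);
  let target: State = state;
  // Walk down to the parent object of the key being written.
  for (const key of keys) {
    const next = target[key];
    if (typeof next !== "object" || next === null || Array.isArray(next)) {
      throw new Error(`path does not exist: ${op.path}`);
    }
    target = next as State;
  }
  target[last] = op.value;
}

// Fold a sequence of patches (each a list of ops) into a state, in order.
function applyPatches(patches: PatchOp[][]): State {
  const state: State = {};
  for (const ops of patches) {
    for (const op of ops) applyOp(state, op);
  }
  return state;
}
```

Order matters: an op that writes under `/logs` only succeeds after an earlier op has created `/logs`.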

  • transform: Default implementation, which buffers the entire input and then calls stream. Subclasses should override this method if they can start producing output while input is still being generated.

    Parameters

    • generator: AsyncGenerator<RunInput, any, unknown>
    • options: Partial<RunnableConfig>

    Returns AsyncGenerator<RunOutput, any, unknown>
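The buffering behaviour described for the default implementation can be sketched as an async generator that drains its input before emitting anything. This is an illustration under assumptions, not the library's code: `bufferedTransform` is a hypothetical name, input chunks are assumed to be arrays that are concatenated, and the `stream` callback stands in for the runnable's own stream method.

```typescript
// Buffer every chunk from the input generator, then process the combined input once.
async function* bufferedTransform<T, O>(
  generator: AsyncIterable<T[]>,
  stream: (input: T[]) => AsyncIterable<O>,
): AsyncGenerator<O> {
  const buffered: T[] = [];
  for await (const chunk of generator) {
    buffered.push(...chunk);
  }
  // No output is produced until the input generator is exhausted.
  yield* stream(buffered);
}
```

A subclass that can work incrementally would instead yield results inside the `for await` loop, which is the override the description recommends.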

  • transformDocuments: Transform a list of documents.

    Parameters

    • documents: RunInput

      A sequence of documents to be transformed.

    Returns Promise<RunOutput>

    A list of transformed documents.

  • withListeners: Bind lifecycle listeners to a Runnable, returning a new Runnable. Each listener receives the Run object, which contains information about the run, including its id, type, input, output, error, startTime, endTime, and any tags or metadata added to the run.

    Parameters

    • params: {
          onEnd?: ((run, config?) => void | Promise<void>);
          onError?: ((run, config?) => void | Promise<void>);
          onStart?: ((run, config?) => void | Promise<void>);
      }

      The object containing the callback functions.

      • Optional onEnd?: ((run, config?) => void | Promise<void>)
          • (run, config?): void | Promise<void>
          • Called after the runnable finishes running, with the Run object.

            Parameters

            Returns void | Promise<void>

      • Optional onError?: ((run, config?) => void | Promise<void>)
          • (run, config?): void | Promise<void>
          • Called if the runnable throws an error, with the Run object.

            Parameters

            Returns void | Promise<void>

      • Optional onStart?: ((run, config?) => void | Promise<void>)
          • (run, config?): void | Promise<void>
          • Called before the runnable starts running, with the Run object.

            Parameters

            Returns void | Promise<void>

    Returns Runnable<RunInput, RunOutput, RunnableConfig>
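The listener lifecycle can be sketched by wrapping an async function so that callbacks fire before the call, after success, and on failure. This is a simplified illustration: `SketchRun`, `Listeners`, and `withListenersSketch` are hypothetical names, the run object carries only a few of the fields named above, and the optional `config` argument is omitted.

```typescript
// Reduced stand-in for the Run object: only a few of the fields named above.
interface SketchRun<I, O> {
  id: string;
  input: I;
  output?: O;
  error?: unknown;
  startTime: number;
  endTime?: number;
}

interface Listeners<I, O> {
  onStart?: (run: SketchRun<I, O>) => void | Promise<void>;
  onEnd?: (run: SketchRun<I, O>) => void | Promise<void>;
  onError?: (run: SketchRun<I, O>) => void | Promise<void>;
}

let nextRunId = 0;

// Return a new function with listeners attached; the original function is not modified.
function withListenersSketch<I, O>(
  fn: (input: I) => Promise<O>,
  listeners: Listeners<I, O>,
): (input: I) => Promise<O> {
  return async (input: I) => {
    nextRunId += 1;
    const run: SketchRun<I, O> = {
      id: `run-${nextRunId}`,
      input,
      startTime: Date.now(),
    };
    await listeners.onStart?.(run);
    let output: O;
    try {
      output = await fn(input);
    } catch (err) {
      run.error = err;
      run.endTime = Date.now();
      await listeners.onError?.(run);
      throw err; // The error still reaches the caller.
    }
    run.output = output;
    run.endTime = Date.now();
    await listeners.onEnd?.(run);
    return output;
  };
}
```

Returning a new function rather than mutating the original mirrors the description above, where binding listeners yields a new Runnable.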
